Error Handling and Logging Conventions¶
This guide documents the standard patterns for error handling and logging across all agentd service crates.
Error Handling¶
Principles¶
- Use
thiserrorfor domain errors — all service crates define theirApiErrorenum with#[derive(Debug, thiserror::Error)] - Use
anyhowfor internal propagation — internal helpers and storage operations returnanyhow::Result - Map errors to HTTP status codes — every
ApiErrorvariant has a defined HTTP status viaIntoResponse - Error responses are JSON — all errors return
{"error": "message"}format
Standard ApiError Pattern¶
Every HTTP service crate defines an ApiError enum in its api.rs with this pattern:
#[derive(Debug, thiserror::Error)]
pub enum ApiError {
/// Maps to HTTP 500
#[error("internal error: {0}")]
Internal(#[from] anyhow::Error),
/// Maps to HTTP 404
#[error("not found: {0}")]
NotFound(String),
/// Maps to HTTP 400
#[error("invalid input: {0}")]
InvalidInput(String),
}
Additional domain-specific variants can be added as needed:
| Crate | Extra Variants | Status Code |
|---|---|---|
| ask | QuestionNotFound |
404 |
| ask | QuestionNotActionable |
410 |
| ask | TmuxError |
500 |
| ask | NotificationError |
502 |
| orchestrator | AgentNotRunning |
409 |
IntoResponse Implementation¶
All ApiError enums implement axum::response::IntoResponse:
impl IntoResponse for ApiError {
fn into_response(self) -> axum::response::Response {
let (status, message) = match &self {
ApiError::Internal(_) => (StatusCode::INTERNAL_SERVER_ERROR, self.to_string()),
ApiError::NotFound(_) => (StatusCode::NOT_FOUND, self.to_string()),
// ... other variants
};
(status, Json(serde_json::json!({ "error": message }))).into_response()
}
}
Error Propagation¶
- Use
?operator withanyhow::Errorin handlers — it automatically converts via#[from] - For cross-service errors, wrap the upstream error with context:
state.notification_client
.create_notification(request)
.await
.map_err(|e| ApiError::NotificationError(e))?;
Adding New Error Types¶
When adding a new error type to a service:
- Add a variant to the crate's
ApiErrorenum with#[error("...")] - Choose an appropriate HTTP status code
- Add the variant to the
IntoResponsematch - If wrapping another error, use
#[from]for automatic conversion
Logging¶
Environment Variables¶
| Variable | Values | Default | Description |
|---|---|---|---|
RUST_LOG |
trace, debug, info, warn, error |
info |
Log level filter |
LOG_FORMAT |
json, (unset) |
(unset = human-readable) | Output format |
Standard Initialization¶
All services initialize logging the same way in main():
let env_filter = tracing_subscriber::EnvFilter::try_from_default_env()
.unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("info"));
if std::env::var("LOG_FORMAT").as_deref() == Ok("json") {
tracing_subscriber::fmt().json().with_env_filter(env_filter).init();
} else {
tracing_subscriber::fmt().with_env_filter(env_filter).init();
}
JSON Logging¶
For production environments or log aggregation pipelines, enable structured JSON output:
Output format:
{"timestamp":"2026-03-02T12:00:00.000Z","level":"INFO","fields":{"message":"Starting agentd-notify service..."},"target":"agentd_notify"}
Request Tracing¶
All HTTP services include tower-http TraceLayer middleware that automatically logs:
- Request method and path
- Response status code
- Request duration
2026-03-02T12:00:00.000Z INFO request{method=GET uri=/health} started
2026-03-02T12:00:00.001Z INFO request{method=GET uri=/health} completed status=200 latency=1ms
Logging Levels¶
Follow these conventions:
| Level | Use For |
|---|---|
error! |
Unrecoverable failures, data loss, service-breaking issues |
warn! |
Recoverable issues, degraded operation, retries |
info! |
Service lifecycle (start/stop), request summaries, important state changes |
debug! |
Detailed operation traces, request/response bodies, internal state |
trace! |
Very verbose debugging, message-level protocol traces |
Structured Fields¶
Use structured fields in log messages:
// Good — structured
info!(agent_id = %id, status = %agent.status, "Agent state changed");
// Avoid — unstructured
info!("Agent {} changed to {}", id, agent.status);
Dependencies¶
All crates use workspace-level dependency versions:
# Cargo.toml (workspace)
[workspace.dependencies]
anyhow = "1.0"
thiserror = "2.0"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
Service crates also include: