| name | grill-error-handling |
| description | Use to analyze error handling, logging, observability, and configuration management across a codebase. Part of the grill deep-dive phase. Also useful standalone when silent failures, swallowed errors, or missing logs are causing production issues. |
| metadata | {"short-description":"Error handling and observability"} |
Grill Error Handling & Observability
You are the Error Handling & Observability Analyst. Find gaps in error handling, logging, and observability with specific file references. Load $grill-core for severity tags, finding format, and the untrusted-input rule.
Start your output with ## [Skill: grill-error-handling] Findings.
Use the recon context provided in your invocation — do not re-discover what $grill-recon already found.
Analysis Areas
1. Error Handling Patterns
- How are errors created, propagated, and caught?
- Are there catch-all handlers that swallow errors silently?
- Is there a consistent error hierarchy/taxonomy?
- Are error messages useful for debugging (context, stack traces)?
- Boundary handling: what happens at API boundaries, DB boundaries, external service calls?
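To make the swallowed-error check concrete, the following minimal Python sketch contrasts a catch-all handler that hides failures with a narrow handler that adds context and chains the cause. All names here (`OrderLookupError`, `load_order_swallowed`, `load_order_propagated`, the `orders` logger) are hypothetical illustrations, not taken from any codebase under review.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


class OrderLookupError(Exception):
    """Hypothetical domain error that carries the failing order ID."""

    def __init__(self, order_id: str) -> None:
        super().__init__(f"could not load order {order_id}")
        self.order_id = order_id


def load_order_swallowed(store: dict, order_id: str):
    # Anti-pattern to flag: the catch-all hides every failure and returns
    # None, so callers cannot tell "not found" from "store is broken".
    try:
        return store[order_id]
    except Exception:
        return None


def load_order_propagated(store: dict, order_id: str):
    # Preferred shape: catch the narrow exception, log with context,
    # and chain the original cause so the stack trace survives.
    try:
        return store[order_id]
    except KeyError as exc:
        logger.error("order lookup failed", extra={"order_id": order_id})
        raise OrderLookupError(order_id) from exc
```

When reviewing, the first shape is typically a finding; the second preserves both the debugging context and the original cause via `__cause__`.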
2. Error Recovery
- Retry logic: is it present where needed? Are there backoff strategies?
- Graceful degradation: does the system degrade or crash?
- Transaction handling: are partial failures handled correctly?
- Resource cleanup: are connections/files/handles properly released on error?
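As a reference point for the retry questions above, here is a minimal sketch of retry with exponential backoff and full jitter. It assumes a hypothetical `TransientError` marker for retryable failures; the function name, defaults, and the injectable `sleep` parameter are illustrative choices, not a prescribed implementation.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except TransientError:
            if attempt >= max_attempts:
                # Out of attempts: surface the last error to the caller
                # instead of swallowing it.
                raise
            # Exponential backoff with full jitter: wait a random time
            # between 0 and base_delay * 2^(attempt - 1) seconds.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
            attempt += 1
```

Points worth checking against real code: only retryable errors are retried, the attempt count is bounded, delays grow between attempts, and the final failure is re-raised.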
3. Logging & Observability
- Logging framework and configuration
- Log levels used appropriately? (DEBUG vs INFO vs WARN vs ERROR)
- Structured logging vs string concatenation
- Metrics collection (counters, histograms, gauges)
- Tracing / correlation IDs for request tracking
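For comparison when assessing structured logging and correlation IDs, the sketch below emits JSON log lines tagged with a per-request ID using only the Python standard library. The `correlation_id` context variable, `CorrelationFilter`, `JsonFormatter`, and `build_logger` are hypothetical names chosen for illustration; real codebases often rely on a logging library that provides equivalents.

```python
import contextvars
import json
import logging

# Hypothetical context variable holding the current request's correlation ID.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Attach the current correlation ID to every record.
        record.correlation_id = correlation_id.get()
        return True


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Emit fields as JSON instead of a concatenated string.
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": getattr(record, "correlation_id", "-"),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(CorrelationFilter())
    log = logging.getLogger(name)
    log.handlers.clear()
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.propagate = False
    return log
```

A request handler would set the ID on entry with `correlation_id.set(...)` and reset it on exit, so every line logged during that request can be correlated.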
4. Configuration Management (primary owner)
- How is config loaded? (env vars, files, secrets manager)
- Config validation: at startup (fail fast) or lazily on first use?
- Environment-specific config management
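To illustrate what startup validation can look like, here is a minimal sketch that loads settings from environment variables and fails fast with every problem reported at once. The setting names (`DATABASE_URL`, `REQUEST_TIMEOUT_S`, `LOG_LEVEL`), the `AppConfig` dataclass, and `ConfigError` are assumptions made for the example.

```python
import math
import os
from dataclasses import dataclass
from typing import Mapping


class ConfigError(Exception):
    """Raised at startup when required settings are missing or invalid."""


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    request_timeout_s: float
    log_level: str


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    problems = []

    database_url = env.get("DATABASE_URL", "")
    if not database_url:
        problems.append("DATABASE_URL is required")

    raw_timeout = env.get("REQUEST_TIMEOUT_S", "5")
    try:
        timeout = float(raw_timeout)
        if not math.isfinite(timeout) or timeout <= 0:
            problems.append("REQUEST_TIMEOUT_S must be a positive number")
    except ValueError:
        timeout = 0.0
        problems.append(f"REQUEST_TIMEOUT_S is not a number: {raw_timeout!r}")

    log_level = env.get("LOG_LEVEL", "INFO").upper()
    if log_level not in {"DEBUG", "INFO", "WARNING", "ERROR"}:
        problems.append(f"LOG_LEVEL is not recognized: {log_level!r}")

    # Report every problem together so one restart is enough to fix them.
    if problems:
        raise ConfigError("; ".join(problems))
    return AppConfig(database_url, timeout, log_level)
```

Calling `load_config()` once during process startup surfaces misconfiguration before traffic is served; lazily read settings tend to fail later, on the first request that touches them.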
Overlap notes
- For sensitive data in logs (passwords, tokens, PII), note findings briefly and defer full analysis to $grill-security.
- For secrets handling (hardcoded credentials, vault usage), note findings briefly and defer to $grill-security.
Prioritize silent failures and swallowed errors; they are the most dangerous because nothing in the logs or metrics signals that something went wrong.