triage
// Analyze system logs and DB error data, classify problems, and take action — GitHub issue for code bugs, Telegram alert for operational issues, skip noise
| name | triage |
| description | Analyze system logs and DB error data, classify problems, and take action — GitHub issue for code bugs, Telegram alert for operational issues, skip noise |
| triggers | ["observer triage","run triage","triage errors","check system health"] |
Analyze system logs and database error data, classify problems, and take appropriate action: create GitHub issues for code bugs, send Telegram alerts for operational issues, or skip noise.
Run every heartbeat cycle (120 minutes). The observer-cycle cron job activates this skill.
Read the system log for recent errors:
tail -200 /tmp/openclaw/system.log
If system.log does not exist or the command fails, that is normal after rotation or container restart — proceed to database queries.
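The tolerant log read described above can be sketched in Node as follows. This is a minimal illustration, not the skill's actual implementation: the helper name `readLogTail` is hypothetical, and it reads the whole file rather than streaming, which is only reasonable for modest log sizes.

```javascript
const fs = require("node:fs");

// Return the last `maxLines` non-empty lines of the log, or [] when the
// file is missing or unreadable (normal after rotation or a restart).
function readLogTail(path, maxLines = 200) {
  let text;
  try {
    text = fs.readFileSync(path, "utf8");
  } catch {
    return []; // No log available: carry on with the database queries.
  }
  const lines = text.split("\n").filter((line) => line.length > 0);
  return lines.slice(-maxLines);
}

const recent = readLogTail("/tmp/openclaw/system.log");
console.log(`read ${recent.length} log lines`);
```

An empty result is treated the same as a quiet log, so the cycle continues with the structured data from the database.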
Query the database for structured error data:
node scripts/db-query.js get-receipts --status tx_failed --limit 20
node scripts/db-query.js get-receipts --status validation_failed --limit 30
node scripts/db-query.js get-receipts --status reverted --limit 10
node scripts/db-query.js get-orders --status failed --limit 20
node scripts/db-query.js get-executor-log --limit 30
node scripts/db-query.js get-sentinel-log --limit 30
node scripts/db-query.js get-research-log --limit 30
node scripts/db-query.js get-alerts
node scripts/db-query.js get-heartbeats
node scripts/db-query.js get-orders --status approved --limit 20
node scripts/db-query.js get-orders --status queued_in_safe --limit 20
node scripts/db-query.js get-orders --status queued_in_squads --limit 20
node scripts/db-query.js get-meta --key last_activity_wallets_bg_at
node scripts/db-query.js get-meta --key last_score_wallets_bg_at
node scripts/db-query.js get-smart-money-signals --since 2h --limit 1
node scripts/check-signer-balances.js
If anyBelowThreshold is true, send an alert for each chain that is below threshold:
node scripts/send-alert.js --type signer_low_balance --agent observer --message "Signer on <chain> has <balance> <symbol> — below threshold <threshold>. Refill needed to prevent silent execution failures."
This is always an operational issue (not a code bug). Do not create a GitHub issue for it.
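As a rough sketch, the per-chain alert rule could be expressed like this. Only `anyBelowThreshold` is named in the text; the `chains` array and its fields (`chain`, `balance`, `symbol`, `threshold`, `belowThreshold`) are assumptions about what check-signer-balances.js prints, and `signerAlertArgs` is a hypothetical helper.

```javascript
// Hypothetical report shape; only `anyBelowThreshold` comes from the text.
const report = {
  anyBelowThreshold: true,
  chains: [
    { chain: "base", balance: "0.002", symbol: "ETH", threshold: "0.01", belowThreshold: true },
    { chain: "solana", balance: "1.5", symbol: "SOL", threshold: "0.1", belowThreshold: false },
  ],
};

// Build one send-alert.js argument list per chain that is below threshold.
function signerAlertArgs(r) {
  if (!r.anyBelowThreshold) return [];
  return r.chains
    .filter((c) => c.belowThreshold)
    .map((c) => [
      "scripts/send-alert.js",
      "--type", "signer_low_balance",
      "--agent", "observer",
      "--message",
      `Signer on ${c.chain} has ${c.balance} ${c.symbol}, below threshold ${c.threshold}. ` +
        "Refill needed to prevent silent execution failures.",
    ]);
}

for (const args of signerAlertArgs(report)) {
  console.log(["node", ...args].join(" "));
}
```

Each argument list can be handed to `child_process.execFileSync("node", args)`, which avoids shell quoting problems in the message text.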
For each error or failure found, run through the full signal catalogue below and decide what to do. The goal is to catch every category of suspicious behavior — not just tx failures.
A. Code bugs (→ GitHub issue via create-gh-issue skill)
1. tx_failed, validation_failed, or reverted receipts with a reproducible cause (not just a transient 429).
2. A log row with status: "error" but no matching send-alert.js call in system.log near that timestamp. The agent tried to record a failure but the alerting path itself broke.
3. A research_log row shows trades_proposed: N but fewer than N matching orders rows were created in the following 10 min: the handoff dropped trades.
4. Repeated validation_failed receipts for the same token in the last 2 h: discovery/dedup is stuck re-proposing a bad token.
B. Operational issues (→ Telegram alert via send-alert.js)
1. Orders stuck in a status: approved > 15 min, queued_in_safe/queued_in_squads > 30 min, pending > 2 h. Use system_health.
2. get-heartbeats shows seconds_since > 2 × expected_cadence_seconds AND idle_ok is false. Use emergency_mode. Skip rows where idle_ok: true (executor/sentinel are demand-driven and idle on purpose when there are no approved orders / open positions).
3. system/memory-backup heartbeat stale > 30 min. Use system_health.
4. sentinel_alerts has >3 identical symbol + alert_type entries in 10 min. Use system_health.
5. last_activity_wallets_bg_at missing or older than 90 min (3× 30-min cadence) → signal feed stalled; last_score_wallets_bg_at missing or older than 30 min (3× 10-min cadence) → proposed-wallet queue not draining. Use system_health.
6. get-smart-money-signals --since 2h returns [] AND last_activity_wallets_bg_at is fresh (loop running but producing zero swaps, a possible upstream API regression). Skip if Step B.5 already fired on last_activity_wallets_bg_at. Use system_health.
C. Transient noise (→ Skip)
A [warn] entry without a pattern.
D. Redaction failure (→ Stop, do NOT file; log + alert)
Write an observer_log entry with status: "error" and send a send-alert.js --type system_health alert describing the redaction failure. Fixing the leak takes priority over reporting the original bug.
For each code bug identified in Step 3, use the create-gh-issue skill. That skill handles duplicate checking automatically: it fetches all open issues and compares before creating.
Provide the skill with:
A title that starts with fix: followed by a concise description of the problem
The .js file that produced the error
Security: Never include wallet addresses, private keys, API keys, or transaction hashes. Replace with [REDACTED].
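The security rule above could be enforced with a small redaction pass before any text reaches the issue. The sketch below is illustrative only: the patterns are assumptions that cover EVM-style hex values (0x plus 64 hex characters for transaction hashes or raw keys, 0x plus 40 for addresses), and the helper names `redact` and `isClean` are hypothetical. API keys and non-EVM address formats would need their own rules.

```javascript
// Assumed patterns: EVM-style hashes/keys (0x + 64 hex) and addresses (0x + 40 hex).
const SENSITIVE_PATTERNS = [
  /0x[0-9a-fA-F]{64}\b/g, // transaction hash or raw private key
  /0x[0-9a-fA-F]{40}\b/g, // wallet address
];

// Replace every match of every pattern with the [REDACTED] marker.
function redact(text) {
  return SENSITIVE_PATTERNS.reduce((out, re) => out.replace(re, "[REDACTED]"), text);
}

// True when none of the known sensitive patterns remains in the text.
function isClean(text) {
  // Rebuild each regex without the global flag so test() carries no state.
  return SENSITIVE_PATTERNS.every((re) => !new RegExp(re.source).test(text));
}

const sample = "transfer to 0x" + "ab".repeat(20) + " failed";
console.log(redact(sample));
```

If `isClean` still returns false after redaction, that corresponds to the redaction-failure case (D) in the catalogue: stop, log, and alert instead of filing.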
For operational issues:
node scripts/send-alert.js --type system_health --agent observer --message "<concise description>"
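Before an alert is sent, the stuck-order and heartbeat rules from section B of the catalogue can be written as small predicates. This is a sketch under assumptions: `ageMin` (minutes since the order entered its current status) is a hypothetical field, while `seconds_since`, `expected_cadence_seconds`, and `idle_ok` are the heartbeat fields named in the catalogue.

```javascript
// Maximum age in minutes per order status, taken from section B.
const STUCK_LIMIT_MIN = {
  approved: 15,
  queued_in_safe: 30,
  queued_in_squads: 30,
  pending: 120,
};

// `ageMin` is a hypothetical field: minutes since the order entered its status.
function isStuck(order) {
  const limit = STUCK_LIMIT_MIN[order.status];
  return limit !== undefined && order.ageMin > limit;
}

// Returns the alert type for a heartbeat row, or null when no alert is due.
function heartbeatAlertType(row) {
  if (row.idle_ok) return null; // demand-driven agents may idle on purpose
  const stale = row.seconds_since > 2 * row.expected_cadence_seconds;
  return stale ? "emergency_mode" : null;
}

console.log(isStuck({ status: "approved", ageMin: 16 }));
console.log(heartbeatAlertType({ idle_ok: false, seconds_since: 15000, expected_cadence_seconds: 7200 }));
```

Keeping the limits in one table makes it easy to compare the code against the catalogue when a threshold changes.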
node scripts/db-query.js add-observer-log --json '{"errors_analyzed": <N>, "issues_created": <N>, "alerts_sent": <N>, "summary": "<one line>", "status": "ok"}'
node scripts/db-query.js update-heartbeat --agent observer --check triage
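Building the JSON payload in code rather than by hand avoids quoting mistakes, for instance when the summary contains an apostrophe that would end a single-quoted shell string. A minimal sketch, with `observerLogArgs` as a hypothetical helper and the payload keys taken from the command above:

```javascript
// Build the argument list for the add-observer-log command.
function observerLogArgs({ errorsAnalyzed, issuesCreated, alertsSent, summary, status = "ok" }) {
  const payload = JSON.stringify({
    errors_analyzed: errorsAnalyzed,
    issues_created: issuesCreated,
    alerts_sent: alertsSent,
    summary,
    status,
  });
  return ["scripts/db-query.js", "add-observer-log", "--json", payload];
}

const logArgs = observerLogArgs({
  errorsAnalyzed: 12,
  issuesCreated: 1,
  alertsSent: 2,
  summary: "1 code bug filed, 2 operational alerts; executor's queue healthy",
});
console.log(logArgs[3]);
```

Passing the array to `child_process.execFileSync("node", logArgs)` skips the shell entirely, so the payload arrives as a single argument regardless of its contents.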
If a failure mode (e.g., a recurring API error, retry-exhaustion signature, or systemic timeout pattern) recurs 3+ times across cycles, promote via scripts/promote-pattern.js. Never edit MEMORY.md directly — manual edits are rejected by pre-commit (PR 3.1). The script validates the pattern's provenance against trusted DB tables (Observer's source is observer_log:<id>).
node scripts/promote-pattern.js \
--name "<Failure Mode Name>" \
--description "<what fails and why it matters>" \
--signal "<log/alert pattern that triggers it>" \
--action "<what the operator/agent should do>" \
--seen 3 \
--attestation-source observer \
--derived-from "observer_log:<id>,observer_log:<id>,observer_log:<id>"
--derived-from IDs must exist in trusted DB tables — see Observer AGENTS.md § Core Principle #6. The script REFUSES to write if any ID can't be resolved, so invented patterns (hallucination, prompt injection from log/issue text) cannot land. Observer writes to the same MEMORY.md as Research — the workspace is symlinked across all four agents, so a successful promotion is visible everywhere.
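The 3+ recurrence rule and the --derived-from format can be sketched as a grouping step over past observer_log rows. The row shape here is an assumption: `id` stands for the observer_log row ID and `signature` is a hypothetical label for the failure mode, with one row per cycle. The helper `promotionCandidates` is likewise illustrative.

```javascript
// Group rows by failure-mode signature and keep those seen at least minSeen times.
function promotionCandidates(rows, minSeen = 3) {
  const bySignature = new Map();
  for (const row of rows) {
    const ids = bySignature.get(row.signature) ?? [];
    ids.push(row.id);
    bySignature.set(row.signature, ids);
  }
  return [...bySignature]
    .filter(([, ids]) => ids.length >= minSeen)
    .map(([signature, ids]) => ({
      signature,
      seen: ids.length,
      // Matches the observer_log:<id> list expected by --derived-from.
      derivedFrom: ids.map((id) => `observer_log:${id}`).join(","),
    }));
}

const candidates = promotionCandidates([
  { id: 11, signature: "safe_429" },
  { id: 12, signature: "rpc_timeout" },
  { id: 14, signature: "safe_429" },
  { id: 19, signature: "safe_429" },
]);
console.log(candidates);
```

Because every ID in `derivedFrom` comes from a row that was actually read, the list only ever names entries the script should be able to resolve.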
Good title: fix: Safe proposeTransaction fails with 429 after 3 retries on base chain
Bad title: There was an error with transactions
Good body: Specific error message, which script, which chain, how often, where to look in the code.
Bad body: Vague description without actionable details.