doctor
// Diagnose babysitter run health - journal integrity, state cache, effects, locks, sessions, logs, and disk usage
| name | doctor |
| description | Diagnose babysitter run health - journal integrity, state cache, effects, locks, sessions, logs, and disk usage |
You are a diagnostic agent for the babysitter runtime. Your job is to perform a comprehensive health check across 14 areas and produce a structured diagnostic report. Follow each section methodically. Track results as you go and produce the final summary at the end.
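Initialize a results tracker with these 14 checks, all starting as PENDING:
1. Run Discovery
2. Journal Integrity
3. State Cache Consistency
4. Effect Status
5. Lock Status
6. Session State
7. Log Analysis
8. Disk Usage
9. Process Validation
10. Hook Execution Health
11. Session-ID Provenance
12. Ancestor Liveness
13. Concurrent Session Detection
14. Windows Ancestor-Walk Strategy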
Goal: Identify the target run and display its metadata.
ls -lt .a5c/runs/
.a5c/runs/<runId>
npx babysitter run:status .a5c/runs/<runId> --json

Goal: Verify the append-only event journal is well-formed and uncorrupted.
npx babysitter run:events .a5c/runs/<runId> --json
.a5c/runs/<runId>/journal/ sorted by name.
For each journal file (named <seq>.<ulid>.json):
Sequential numbering check:
000001 from 000001.01JAXYZ.json).
Checksum verification:
The SDK computes checksums as follows: it first builds the event payload without the checksum field ({ type, recordedAt, data }), serializes it with JSON.stringify(payload, null, 2) + "\n" (pretty-printed with a trailing newline), then computes SHA256 of that string. To verify:
Remove the checksum field from the parsed object.
Serialize the remainder with JSON.stringify(remaining, null, 2) + "\n"; this must use 2-space indentation and a trailing newline to match the SDK.
Example bash one-liner for a single file:
node -e "const fs=require('fs'); const f=process.argv[1]; const obj=JSON.parse(fs.readFileSync(f,'utf8')); const stored=obj.checksum; delete obj.checksum; const expected=require('crypto').createHash('sha256').update(JSON.stringify(obj,null,2)+'\n').digest('hex'); console.log(stored===expected?'OK':'MISMATCH',f)" <file>
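The same verification can be written as a small reusable function. This is a minimal sketch, assuming only the serialization rule described above; the helper names `computeChecksum` and `verifyEvent` and the sample event values are hypothetical and not part of the SDK.

```javascript
const crypto = require("crypto");

// Recompute the checksum as described above: drop `checksum`, pretty-print
// with 2-space indentation, append "\n", then take the SHA256 hex digest.
function computeChecksum(event) {
  const { checksum, ...remaining } = event; // remaining keys keep their order
  const serialized = JSON.stringify(remaining, null, 2) + "\n";
  return crypto.createHash("sha256").update(serialized).digest("hex");
}

// Returns true when the stored checksum matches the recomputed one.
function verifyEvent(rawJson) {
  const event = JSON.parse(rawJson);
  return event.checksum === computeChecksum(event);
}

// Illustrative event with made-up values, stamped with a matching checksum.
const sample = {
  type: "EFFECT_RESOLVED",
  recordedAt: "2025-01-01T00:00:00.000Z",
  data: {},
};
const stamped = JSON.stringify({ ...sample, checksum: computeChecksum(sample) });
const tampered = stamped.replace("EFFECT_RESOLVED", "TAMPERED");

console.log(verifyEvent(stamped));  // true
console.log(verifyEvent(tampered)); // false
```

Because the checksum covers the exact serialized bytes, any change to key order, indentation, or the trailing newline produces a mismatch, which is why the serialization must follow the SDK's format exactly.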
Timestamp monotonicity check:
recordedAt from each event.
Event type summary:
Orphan detection:
<seq>.<ulid>.json naming pattern.
If all sub-checks pass, mark as PASS. If any sub-check is WARN, mark as WARN. If any sub-check is FAIL, mark as FAIL.
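The sequence, timestamp, and orphan sub-checks can be sketched as pure functions over a list of file names and parsed events. This is a rough illustration, not the SDK's implementation: the regular expression for `<seq>.<ulid>.json` (digits, then an alphanumeric ULID) and the helper names are assumptions.

```javascript
// Assumed shape of a journal file name: <seq>.<ulid>.json
const JOURNAL_NAME = /^(\d+)\.([0-9A-Za-z]+)\.json$/;

// Splits names into orphans (not matching the pattern) and finds missing
// sequence numbers between the lowest and highest seq that are present.
function checkJournalNames(fileNames) {
  const orphans = [];
  const seqs = [];
  for (const name of fileNames) {
    const match = JOURNAL_NAME.exec(name);
    if (match) seqs.push(Number(match[1]));
    else orphans.push(name);
  }
  seqs.sort((a, b) => a - b);
  const gaps = [];
  for (let i = 1; i < seqs.length; i++) {
    for (let missing = seqs[i - 1] + 1; missing < seqs[i]; missing++) {
      gaps.push(missing);
    }
  }
  return { orphans, gaps };
}

// Returns the indexes of events whose recordedAt is earlier than the
// previous event's recordedAt (events must be given in sequence order).
function findTimestampRegressions(events) {
  const regressions = [];
  for (let i = 1; i < events.length; i++) {
    const previous = Date.parse(events[i - 1].recordedAt);
    const current = Date.parse(events[i].recordedAt);
    if (current < previous) regressions.push(i);
  }
  return regressions;
}

const names = ["000001.01JAXYZ.json", "000002.01JAXZA.json", "000004.01JAXZB.json", "notes.tmp"];
const nameReport = checkJournalNames(names);
console.log(nameReport.gaps, nameReport.orphans); // gaps: [3], orphans: ["notes.tmp"]

const events = [
  { recordedAt: "2025-01-01T00:00:00.000Z" },
  { recordedAt: "2025-01-01T00:00:05.000Z" },
  { recordedAt: "2025-01-01T00:00:03.000Z" },
];
const regressions = findTimestampRegressions(events);
console.log(regressions); // [2]
```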
Goal: Verify the derived state cache matches the current journal.
Check whether .a5c/runs/<runId>/state/state.json exists.
npx babysitter run:rebuild-state .a5c/runs/<runId>
If it exists:
Read state.json and extract the journalHead field (contains seq, ulid, and checksum).
.a5c/runs/<runId>/journal/ (highest sequence number).
journalHead.seq should match the last journal file's sequence number.
journalHead.ulid should match the last journal file's ULID.
journalHead.checksum should match the last journal file's checksum.
npx babysitter run:rebuild-state .a5c/runs/<runId>
schemaVersion field is present and report its value.

Goal: Identify stuck, errored, or pending effects.
npx babysitter task:list .a5c/runs/<runId> --json
npx babysitter task:list .a5c/runs/<runId> --pending --json
All effects summary:
kind (node, breakpoint, orchestrator_task, sleep, etc.).
Stuck effect detection:
requestedAt timestamp.
Error detection:
Pending summary:
Mark as PASS if no stuck or errored effects. Mark as WARN if there are pending effects older than 30 minutes. Mark as FAIL if there are errored effects.
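The marking rule above can be sketched as a small classifier. The `requestedAt` field comes from the text; the `status` field and its `"pending"` and `"error"` values are assumptions about the `task:list --json` output made for illustration only.

```javascript
const STUCK_AFTER_MS = 30 * 60 * 1000; // pending longer than 30 minutes

// Classifies a list of effects into PASS / WARN / FAIL.
// `status` values are hypothetical; adjust to the real task:list output.
function classifyEffects(effects, now = Date.now()) {
  const errored = effects.filter((effect) => effect.status === "error");
  const stuck = effects.filter(
    (effect) =>
      effect.status === "pending" &&
      now - Date.parse(effect.requestedAt) > STUCK_AFTER_MS
  );
  let status = "PASS";
  if (errored.length > 0) status = "FAIL";
  else if (stuck.length > 0) status = "WARN";
  return { status, errored, stuck };
}

const now = Date.parse("2025-01-01T12:00:00.000Z");
const effects = [
  { kind: "node", status: "resolved", requestedAt: "2025-01-01T10:00:00.000Z" },
  { kind: "breakpoint", status: "pending", requestedAt: "2025-01-01T11:00:00.000Z" },
  { kind: "sleep", status: "pending", requestedAt: "2025-01-01T11:50:00.000Z" },
];
const effectReport = classifyEffects(effects, now);
console.log(effectReport.status, effectReport.stuck.length); // WARN 1
```

Passing `now` explicitly keeps the classification deterministic, which makes the 30-minute threshold easy to test.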
Goal: Detect stale or orphaned run locks.
.a5c/runs/<runId>/run.lock exists.
If it exists:
pid, owner, acquiredAt).
kill -0 <pid> 2>/dev/null; echo $? (exit code 0 means alive, non-zero means dead). On Windows/MINGW, use tasklist //FI "PID eq <pid>" 2>/dev/null or equivalent.
rm .a5c/runs/<runId>/run.lock

Goal: Inspect babysitter session files for health and detect runaway loops.
plugins/babysitter/skills/babysit/state/*.md
.a5c/state/*.md
.a5c/state/*.json
Runaway loop detection:
Session classification:
Mark as PASS if no issues. Mark as WARN if runaway loops or stale sessions detected.
Goal: Analyze babysitter log files for errors, warnings, and stop hook decisions.
Read the last 50 lines of each of these log files (if they exist):
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/hooks.log
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/babysitter-stop-hook.log
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/babysitter-stop-hook-stderr.log
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/babysitter-session-start-hook.log
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/babysitter-session-start-hook-stderr.log
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/babysitter.log
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/ and relevant run/session specific logs there
For each log file:
Stop hook analysis (babysitter-stop-hook.log):
Stderr analysis (babysitter-stop-hook-stderr.log, babysitter-session-start-hook-stderr.log):
Error/Warning detection (all logs):
Mark as PASS if no ERROR lines found and stderr logs are empty. Mark as WARN if WARN lines found or stderr has content but no ERROR. Mark as FAIL if ERROR lines found.
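A minimal sketch of this marking rule follows. It assumes log levels appear as the standalone words ERROR and WARN (or WARNING) somewhere in each line; the actual log line format is not specified here, and the helper names are hypothetical.

```javascript
// Returns the last `count` non-empty lines of a log file's text.
function lastLines(text, count = 50) {
  return text
    .split(/\r?\n/)
    .filter((line) => line !== "")
    .slice(-count);
}

// Applies the PASS / WARN / FAIL rule to log lines plus stderr content.
function classifyLogs(logLines, stderrText) {
  const hasError = logLines.some((line) => /\bERROR\b/.test(line));
  const hasWarn = logLines.some((line) => /\bWARN(ING)?\b/.test(line));
  if (hasError) return "FAIL";
  if (hasWarn || stderrText.trim() !== "") return "WARN";
  return "PASS";
}

const logText = "INFO hook started\nWARN slow response\nINFO hook finished\n";
const logStatus = classifyLogs(lastLines(logText), "");
console.log(logStatus); // WARN
```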
Goal: Report disk consumption and identify oversized files.
Run du -sh .a5c/runs/<runId> for the total run directory size.
Run du -sh on each subdirectory:
.a5c/runs/<runId>/journal/
.a5c/runs/<runId>/tasks/
.a5c/runs/<runId>/blobs/
.a5c/runs/<runId>/state/
.a5c/runs/<runId>/process/ (if it exists)
Display results in a table: directory, size.
Large file detection:
Find individual files larger than 10MB within the run directory: find .a5c/runs/<runId> -type f -size +10M -exec ls -lh {} \;
If any found, list them with their paths and sizes.
Report the total run directory size prominently.
Mark as PASS if total size < 500MB and no files > 10MB. Mark as WARN if total size > 500MB or any files > 10MB. Mark as FAIL if total size > 2GB.
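The thresholds above can be expressed as a small function over sizes that have already been collected (for example from `du` and `find`). This sketch assumes binary megabytes (1 MB = 1024 × 1024 bytes) and treats a total of exactly 500MB as PASS, since the text leaves that boundary unspecified; the function name and input shape are hypothetical.

```javascript
const MB = 1024 * 1024;
const LARGE_FILE_BYTES = 10 * MB;
const WARN_TOTAL_BYTES = 500 * MB;
const FAIL_TOTAL_BYTES = 2048 * MB; // 2GB

// files: array of { path, bytes } for every file in the run directory.
function classifyDiskUsage(totalBytes, files) {
  const largeFiles = files.filter((file) => file.bytes > LARGE_FILE_BYTES);
  let status = "PASS";
  if (totalBytes > FAIL_TOTAL_BYTES) status = "FAIL";
  else if (totalBytes > WARN_TOTAL_BYTES || largeFiles.length > 0) status = "WARN";
  return { status, largeFiles };
}

const diskReport = classifyDiskUsage(100 * MB, [
  { path: "journal/000001.01JAXYZ.json", bytes: 4 * 1024 },
  { path: "blobs/large.bin", bytes: 25 * MB },
]);
console.log(diskReport.status, diskReport.largeFiles.map((f) => f.path)); // WARN [ 'blobs/large.bin' ]
```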
Goal: Verify the process entrypoint and SDK dependency are valid.
.a5c/runs/<runId>/run.json and extract the importPath (or entrypoint) field.
SDK dependency check:
.a5c/package.json (if it exists) or the project root package.json.
@a5c-ai/babysitter-sdk in dependencies or devDependencies.

Goal: Verify that the stop hook and session-start hook are properly configured, can execute, and have been running. If the stop hook has NOT been running, diagnose why.
CLAUDE_PLUGIN_ROOT env var, or search for plugins/babysitter/hooks/hooks.json by walking up from the current directory.
hooks.json and verify:
Stop hook entry exists with a command referencing babysitter-stop-hook.sh.
SessionStart hook entry exists with a command referencing babysitter-session-start-hook.sh.
If hooks.json is not found, mark as FAIL ("Hook registration file not found -- hooks are not registered with Claude Code").
hooks/babysitter-stop-hook.sh
hooks/babysitter-session-start-hook.sh
test -x <path>).
The hooks delegate to the babysitter CLI. Check if it is available:
command -v babysitter 2>/dev/null && babysitter --version 2>/dev/null
$HOME/.local/bin/babysitter --version 2>/dev/null
npm i -g @a5c-ai/babysitter-sdk").
Check whether the stop hook has actually been invoked during this run's lifetime:
From log files:
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/babysitter-stop-hook.log (if it exists).
From journal events:
STOP_HOOK_INVOKED type events (using the run:events output from section 2 if available).
From stderr:
${BABYSITTER_LOG_DIR:-$HOME/.a5c/logs}/babysitter-stop-hook-stderr.log.
If the stop hook shows NO evidence of execution (no log entries, no journal events, zero invocations):
Perform these diagnostic steps in order and report the first failure found:
Plugin not installed: Check if plugins/babysitter/ exists relative to the project root and if CLAUDE_PLUGIN_ROOT is set. If the plugin directory doesn't exist, report: "Plugin not installed — the babysitter plugin directory is missing."
Plugin not enabled: Check for Claude settings files:
~/.claude/settings.json: look for babysitter in enabledPlugins.
~/.claude/plugins/installed_plugins.json: look for babysitter in the plugins list.
hooks.json not registered: If hooks.json doesn't contain a Stop hook entry (checked in 10a), report: "Stop hook not registered in hooks.json."
Hook script missing or not executable: If the stop hook script doesn't exist or isn't executable (checked in 10b), report with the specific file path.
CLI not available: If babysitter CLI is not found (checked in 10c), report: "babysitter CLI not installed — hook script will fail silently."
Hook running but failing silently: If the log file exists but shows exit codes other than 0, or if stderr has content, report: "Stop hook is being invoked but failing — see stderr log for details."
No active session: If no session state files exist (from section 6), report: "No active babysitter session — the stop hook only activates when a session is bound to a run."
All checks pass but hook still not running: Report: "All prerequisites are met but the stop hook shows no evidence of execution. Possible causes: Claude Code may not be invoking plugin hooks (check Claude Code version), or the session may have ended before the hook could fire."
Mark as PASS if:
Mark as WARN if:
Mark as FAIL if:
Goal: Verify how the current babysitter session ID was resolved and flag stale or shadowed values.
npx babysitter session:whoami --json
resolvedFrom field. Classify as follows:
resolvedFrom: "pid-marker" → mark as PASS ("Session ID derives from the live Claude Code ancestor process -- authoritative").
resolvedFrom: "env-file" → mark as PASS with a note ("CLAUDE_ENV_FILE was used; typically healthy").
resolvedFrom: "env-var" → mark as WARN ("AGENT_SESSION_ID is set without a corroborating PID marker. Likely stale from a prior Claude Code session -- see GitHub issue #130").
babysitter session:cleanup and start a fresh Claude Code session, or unset AGENT_SESSION_ID before invoking babysitter.
resolvedFrom: "none" → mark as ERROR ("No session ID resolvable. Either no session-start hook fired, or the ancestor walk failed").
Env-var shadow check:
envVarPresent and envVarMatches in the output.
If envVarPresent && !envVarMatches, mark as WARN ("AGENT_SESSION_ID in env does not match the resolved session ID; a stale value is shadowing the authoritative one. Unset the env var").

Goal: Confirm the PID marker references a live Claude Code process.
session:whoami --json output from check 11.
ancestorAlive field.
If ancestorAlive === false, mark as ERROR ("The PID marker references a dead Claude Code process").
babysitter session:cleanup.

Goal: Surface multiple live harness sessions that may compete for the same session ID.
~/.a5c/ matching the pattern current-session-*-pid-*.
AGENT_SESSION_ID appropriately -- the PID marker handles this automatically").

Goal: Verify the ancestor-walk strategy works on Windows, where wmic is no longer guaranteed to be present.
process.platform === 'win32'. On other platforms, mark as PASS ("Not applicable -- non-Windows platform").
npx babysitter session:whoami --json (reuse output from check 11 if available).
resolvedFrom other than none), mark as PASS.
resolvedFrom: "none" on Windows:
wmic availability: where wmic via shell.
wmic; the fallback PowerShell CIM path should handle this.
powershell -NoProfile -Command "Get-CimInstance Win32_Process -Filter ProcessId=$PID" should work).

After completing all 14 checks, produce the diagnostic report in this format:
============================================
BABYSITTER DIAGNOSTIC REPORT
Run: <runId>
Time: <current timestamp>
============================================
OVERALL HEALTH: <HEALTHY | WARNING | CRITICAL>
--------------------------------------------
CHECK RESULTS
--------------------------------------------
| # | Check | Status |
|----|--------------------------|--------|
| 1 | Run Discovery | <status> |
| 2 | Journal Integrity | <status> |
| 3 | State Cache Consistency | <status> |
| 4 | Effect Status | <status> |
| 5 | Lock Status | <status> |
| 6 | Session State | <status> |
| 7 | Log Analysis | <status> |
| 8 | Disk Usage | <status> |
| 9 | Process Validation | <status> |
| 10 | Hook Execution Health | <status> |
| 11 | Session-ID Provenance | <status> |
| 12 | Ancestor Liveness | <status> |
| 13 | Concurrent Session Detection | <status> |
| 14 | Windows Ancestor-Walk Strategy | <status> |
--------------------------------------------
ISSUES & RECOMMENDATIONS
--------------------------------------------
<For each WARN or FAIL check, list:>
- [WARN|FAIL] <Check name>: <description of issue>
Fix: <specific actionable command or instruction>
--------------------------------------------
Overall health determination:
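A minimal sketch of one plausible determination, assuming that any FAIL or ERROR result makes the run CRITICAL, any WARN (with no FAIL or ERROR) makes it WARNING, and everything else is HEALTHY. This mapping is an assumption for illustration; the function name and the shape of the results tracker are hypothetical.

```javascript
// results: object mapping check name to PASS | WARN | FAIL | ERROR | PENDING.
// The severity mapping below is assumed, not taken from the SDK.
function overallHealth(results) {
  const statuses = Object.values(results);
  if (statuses.some((status) => status === "FAIL" || status === "ERROR")) {
    return "CRITICAL";
  }
  if (statuses.some((status) => status === "WARN")) {
    return "WARNING";
  }
  return "HEALTHY";
}

const tracker = {
  "Run Discovery": "PASS",
  "Journal Integrity": "PASS",
  "Lock Status": "WARN",
  "Disk Usage": "PASS",
};
const health = overallHealth(tracker);
console.log(health); // WARNING
```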
Present the full detailed findings for each check BEFORE the summary table, so the user can see the evidence. End with the summary table and recommendations. Also, create a single HTML report file with all the findings that uses the arwes UI framework and open it for the user in the browser.
If any check results in FAIL and the root cause is unclear after your own analysis -- especially for environment issues, hook execution failures, CLI availability problems, or permission errors that may relate to the Claude Code runtime itself -- invoke the built-in Claude /debug command to get additional diagnostic context from the Claude Code environment. This is particularly useful for:
Call /debug with a summary of the failing check and what you've already ruled out, so it can focus on environment-level causes.
After diagnosing issues, prompt the user to report or fix what was found -- they don't need to fix the bug themselves, just reporting it is valuable. Suggest /babysitter:contrib based on the situation:
/babysitter:contrib bug report: [what the doctor found, e.g. "state cache rebuild silently drops EFFECT_RESOLVED events when journal has duplicate invocation keys"]
/babysitter:contrib bugfix: [description of the fix]
/babysitter:contrib documentation question: [what was unclear or missing]
/babysitter:contrib bug report: [plugin-name] [description]
/babysitter:contrib library contribution: [description]
Example prompt after diagnosis:
"Diagnosis found a stale lock -- process 12847 crashed without cleanup. This is a known edge case in the orchestration loop. Even if you don't want to fix it yourself, reporting it helps: run
/babysitter:contrib bug report: orchestration loop doesn't release lock on unhandled rejection
to open an issue."
If any of checks 11-14 surface issues (stale env vars, dead ancestor PIDs, shadowed session IDs, or Windows ancestor-walk failures), suggest the following remediation sequence, in order. Present it as an actionable block:
# 1. Cleanup dead markers and orphaned state files
babysitter session:cleanup --dry-run # preview
babysitter session:cleanup # apply
# 2. Unset a stale env var
unset AGENT_SESSION_ID
# 3. Re-bind a run explicitly if needed
babysitter session:resume --session-id <fresh-id> --state-dir ~/.a5c --run-id <runId> --runs-dir .a5c/runs
# 4. Start a fresh Claude Code session (closes and reopens the session)
Run steps 1 and 2 first; re-run /babysitter:doctor after each step to confirm the session-provenance checks return to PASS. Step 3 is only needed when a specific run must be re-bound to the fresh session. If the issue persists after step 4, escalate via /debug or /babysitter:contrib.
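Whether the remediation sequence is needed can be decided from the `session:whoami --json` output using the rules of checks 11 and 12. The sketch below uses the field names mentioned in those checks (`resolvedFrom`, `envVarPresent`, `envVarMatches`, `ancestorAlive`); the function name, the result shape, and the WARN fallback for unrecognized `resolvedFrom` values are assumptions.

```javascript
// Status per resolvedFrom value, as listed in the provenance check.
const PROVENANCE_STATUS = new Map([
  ["pid-marker", "PASS"],
  ["env-file", "PASS"],
  ["env-var", "WARN"],
  ["none", "ERROR"],
]);

// whoami: parsed output of `session:whoami --json`.
function classifySession(whoami) {
  const findings = [];
  findings.push({
    check: "Session-ID Provenance",
    // Fallback to WARN for unknown values is an assumption of this sketch.
    status: PROVENANCE_STATUS.get(whoami.resolvedFrom) ?? "WARN",
  });
  if (whoami.envVarPresent && !whoami.envVarMatches) {
    findings.push({ check: "Env-var shadow", status: "WARN" });
  }
  if (whoami.ancestorAlive === false) {
    findings.push({ check: "Ancestor Liveness", status: "ERROR" });
  }
  return findings;
}

// True when any finding is not PASS, i.e. remediation should be suggested.
function needsRemediation(findings) {
  return findings.some((finding) => finding.status !== "PASS");
}

const stale = classifySession({
  resolvedFrom: "env-var",
  envVarPresent: true,
  envVarMatches: false,
  ancestorAlive: false,
});
console.log(stale.map((finding) => finding.status)); // [ 'WARN', 'WARN', 'ERROR' ]
console.log(needsRemediation(stale)); // true
```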