stop
// Stop all agent-spec processes, clear ports, remove sandboxes, verify clean state.
Compare two eval runs and report what changed. Reads both runs' events, transcripts, and produced artifacts. Writes a short markdown summary classifying differences as regression, improvement, or neutral.
Generalized recursive iteration loop. Runs parallel sub-agents against a target, scores deterministically, diagnoses instruction gaps, applies fixes, and recurses until the stop condition is met or max depth is reached.
Run an evaluation against an eval with a specific config
Write a handoff document so a new chat can continue the work
Show evaluation results and comparisons
Test-driven development of a Hono/Bun WebSocket application. Read requirements, read tests, build server, verify, iterate until all tests pass.
| name | stop |
| description | Stop all agent-spec processes, clear ports, remove sandboxes, verify clean state. |
| disable-model-invocation | true |
Run cleanup, then verify:
python3 scripts/cleanup.py
cleanup.py handles: tracked PIDs (with SIGTERM→SIGKILL escalation and process tree traversal), port clearing, orphaned process detection, sandbox removal, parallel log cleanup, and worktree pruning.
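The SIGTERM→SIGKILL escalation mentioned above can be sketched as follows. This is a minimal illustration, not the code in cleanup.py: the function name `terminate_with_escalation` and the `grace_seconds` parameter are hypothetical, and it works on a `subprocess.Popen` handle rather than a tracked PID so that the example is self-contained. It assumes a POSIX system and does not walk the process tree.

```python
import subprocess
import sys


def terminate_with_escalation(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a process to exit with SIGTERM; fall back to SIGKILL if it does not.

    Hypothetical helper for illustration; cleanup.py may be structured differently.
    """
    if proc.poll() is not None:
        return "already-exited"

    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(terminate_with_escalation(child))
```

The grace period gives a well-behaved process time to flush state and release ports before the forced kill.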
If processes remain after cleanup, escalate with --force, which also deletes the /tmp run logs:
python3 scripts/cleanup.py --force
Verify with:
python3 scripts/system_monitor.py
Active Runs, Sandboxes, Ports in use, and Tracked PIDs should all be 0.
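For an independent spot check of a single port, something like the sketch below may help. It is not part of system_monitor.py, whose internals are not shown here; the helper name `port_in_use` and its defaults are assumptions, and it only detects TCP listeners reachable on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper; connect_ex returns 0 on success and an errno otherwise.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0
```

A port that still reports as in use after cleanup suggests a process that was never tracked, which is the case the --force escalation is meant for.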