| name | afk |
| description | Autonomous loop that drains the `ready-for-agent` queue on the issue tracker. Each iteration claims an issue, runs it in an isolated worktree, executes with claude or codex, merges back to main, and closes the issue. Use when the user wants to run AFK execution, drain a PRD, hammer specific issues, or otherwise let agents grind through the backlog. |
| argument-hint | [--prd N | --issues N,N,N] [--runner claude|codex] [--alternate] [--fallback-runner] [--request TEXT] [-n N] [--once] [--boot-only] | fleet [N] | fleet stop | monitor | reap |
/afk
Drain the agent-ready backlog. Single skill that owns issue selection, worktree isolation, inner-agent execution, GitHub state coordination, merge-back, and runner-fallback.
Runtime & Invocation
Run this skill — do not read its code. This SKILL.md is the complete behavioural contract. The bin/ bundle and the scripts/ shell files are build/runtime artifacts, not documentation: opening them to "understand what /afk does" wastes context and is never required. Everything an agent needs to operate /afk is in this file.
The skill ships a single committed runtime bundle. Invoke it as:
RED_AFK_RUNNER=<claude|codex> node "$CLAUDE_PLUGIN_ROOT/skills/engineering/afk/bin/afk.mjs" <command> [params]
The invoking LLM is responsible for setting RED_AFK_RUNNER to its own host runner (codex from Codex, claude from Claude Code). Do not infer a different runner from binaries on PATH; use --runner only when the user explicitly pinned one.
afk.mjs is a dedicated forwarder (ADR 0039 entrypoint, build role run:dev): every argument is passed straight to the dev bundle, whose own command surface (run, monitor, fleet, …) is documented below. So … afk.mjs run --once, … afk.mjs monitor, and the bare … afk.mjs --issues 42 all reach the orchestrator. The generic entrypoint verbs (run <plugin> / fetch) belong to red-fetch.mjs, not to this launcher — they do not shadow the bundle's commands (#434).
Commands and their parameters are documented in When To Use below; that section is authoritative for the CLI surface. The commands are run (the default: a bare token routes here with argv preserved), monitor, fleet, reap, statusline (the Claude Code statusline aggregator; reads the payload on stdin and takes the project root as its one argument), and the hidden __supervise (the fleet supervisor entrypoint; never invoked by hand). run accepts --prd, --issues, --runner, --alternate, --fallback-runner, --request/-r, -n, --once, and --boot-only; monitor accepts --once; fleet accepts an optional numeric target N, the stop subcommand, --request/-r, and --runner; reap takes no flags; statusline takes the project-root path as $1.
The bundle is a dependency-free build (one file, no node_modules, no install step) and is the public entrypoint. Every command — orchestration, supervisor, statusline, and hooks — executes natively in the bundle; the legacy shell orchestrator under scripts/ has been removed (ADR 0032, ADR 0034). Treat this SKILL.md as the contract: run the bundle, don't read its source.
When To Use
- /afk: every issue currently labelled ready-for-agent.
- /afk --prd 42: only issues that reference PRD #42 (by a prd: #42 line in the body, a parent link, or a prd:42 label).
- /afk --issues 356,359,362: explicit list, in that order.
- /afk --runner codex: pin a backend (disables the detection cascade; mutually exclusive with --alternate).
- /afk --alternate: opt in to round-robin runner rotation between issues (claude → codex → claude → …).
- /afk --fallback-runner: opt in to swapping runners mid-issue when one returns RUNNER_EXHAUSTED. Without this flag, exhaustion exits with code 75.
- /afk --request "don't run cargo tests for this issue resolution" or /afk -r "...": add a special user request block to every inner-agent prompt for this run.
- /afk -n 5: cap the run at five issues. -n 0 (or omitting -n) drains the whole queue until it is empty; 0 means unlimited, not zero. For a no-agent dry-run use --boot-only instead.
- /afk --once: single supervised iteration. Use for debugging the prompt.
- /afk --boot-only: run the boot sweeps, then exit without claiming or spawning an agent; a safe dry-run to inspect bootstrap / orphan-cleanup / unblock-sweep / precheck.
- /afk monitor: read-only status board that aggregates every .red/tmp/workers/*/*/afk.state.json so you see all live workers from another terminal. Also (binding): it mirrors live workers onto the host runner's native task surface, meaning TaskCreate/TaskUpdate under Claude Code and the sub-agent surface under Codex when present (it falls back to the dashboard otherwise). See Task Mirror below. This is not optional: you must do it on every tick, even when the user only asked "how are we doing?".
- /afk fleet [N]: launch the supervisor maintaining N concurrent workers (default 2). See Fleet Mode below.
- /afk fleet stop: gracefully shut down a running fleet supervisor and cancel its auto-monitor cron.
- /afk reap: run branch hygiene without starting a worker. Prints one count line for remote afk/*, remote afk-attempts/*, and local afk/*, then runs the same safe reapers used at boot.
Parallelization
/afk is trivially parallel — just open another terminal and run /afk again. No flag, no coordination, no slot to manage.
/afk
/afk
/afk
Each invocation generates its own worker ID — literal w plus 4 random characters from [A-Z0-9] (e.g. wZ2R4, ~1.7M possible IDs) — and uses it as the prefix for every per-run file. The leading w makes the worker directory .red/tmp/workers/{id} an unambiguous live-worker anchor. The ID is printed on the first line of the run so you can tail or kill it later.
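As an illustration of the ID scheme described above, here is a minimal sketch. The function name and the use of Python's secrets module are assumptions; the bundle's actual random source is not documented here.

```python
import secrets
import string

# [A-Z0-9]: 26 letters + 10 digits = 36 symbols
ALPHABET = string.ascii_uppercase + string.digits
# 4 random positions give 36**4 = 1,679,616 IDs (the "~1.7M" quoted above)
ID_SPACE = len(ALPHABET) ** 4


def new_worker_id() -> str:
    """Return the literal 'w' followed by 4 random characters from [A-Z0-9]."""
    return "w" + "".join(secrets.choice(ALPHABET) for _ in range(4))


if __name__ == "__main__":
    # Mirrors the first line of a run: "worker: {id}"
    print(f"worker: {new_worker_id()}")
```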
Per-attempt files live under .red/tmp/workers/{id}/{N}-a{n}/ in the primary checkout, where {id} is the worker ID, {N} is the issue number, and {n} is the per-issue attempt counter (derived by the attempt-ledger — every retry, even by a different worker, gets a fresh a{n} directory). Everything for one (worker, issue, attempt) is in one directory — when the attempt ends successfully the whole directory is removed; when it blocks the whole directory is preserved. The worker also holds a single per-worker liveness anchor at .red/tmp/workers/{id}/worker.pid (see the worker.pid row below).
| Path | Purpose |
|---|---|
| .red/tmp/workers/{id}/worker.pid | Per-worker liveness anchor: the orchestrator's PID, written once at bootstrap and removed on the worker's EXIT trap (along with rmdir of the empty worker dir). The single liveness anchor for the worker; the fleet supervisor's slot matching keys off it. |
| .red/tmp/workers/{id}/{N}-a{n}/worktree/ | Git worktree for issue N on attempt n. Lives inside the gitignored .red/tmp/ so it never pollutes sibling directories. |
| .red/tmp/workers/{id}/{N}-a{n}/afk.log | Append-only plain log for this attempt (orchestrator output + inner-agent stdout + heartbeat lines). Per-attempt scope: each attempt gets a fresh log. |
| .red/tmp/workers/{id}/{N}-a{n}/agent.log.jsonl | Clean agent lane (issue #250): one type=agent JSONL record per assistant turn and nothing synthetic, so it is the true liveness signal and reads as a live transcript: tail -f … \| jq -r .msg. Single-writer. |
| .red/tmp/workers/{id}/{N}-a{n}/log.jsonl | The firehose (issue #250): every record of the attempt in the uniform JSONL envelope, covering agent turns, heartbeat vitals, hook dispatches, runner timings, and errors. Flock-serialised (many concurrent writers). |
| .red/tmp/workers/{id}/{N}-a{n}/afk.state.json | State snapshot for this attempt. Schema in State File below. |
| .red/tmp/workers/{id}/{N}-a{n}/handoff.md | Handoff file the inner agent reads: <issue-body> (issue body verbatim, including the ## Agent brief markdown section), <previous-attempts>, <human-guidance-thread> (one <human-guidance> per extracted directive), <thread-discussion> (advisory comments with no directive marker), <agent-notes>. Top-level XML wrappers make body/comments/notes unambiguous. Template in Handoff File Template below. |
Two workers in the same checkout cannot claim the same issue: a local mkdir lock at .red/tmp/claims/{N}/ serialises them, and a gh issue view pre-check runs before the edit. The gh edit itself is not atomic (see Issue Lifecycle below for the full three-layer scheme). The residual race surface is the brief window between two separate checkouts on the same host, which is acceptable for the intended scale.
Hard Preconditions
Refuse to start if any of these fail. The user fixes them, you don't.
- git remote -v shows only SSH remotes. Reject HTTPS; never auto-rewrite.
- gh auth status succeeds.
- Repo has a main branch and git -C primary log -1 main works.
- Issue tracker label ready-for-agent exists. If not, point at /triage.
- pnpm is on PATH (logger and tooling guidelines assume pnpm).
Bootstrap
Run before the first iteration:
1. Ensure .red/tmp/ exists. Create it.
2. Ensure .red/tmp/ is in .gitignore of the primary checkout. Append if missing.
3. Generate the worker ID. Literal w + 4 random characters from [A-Z0-9] (e.g. wZ2R4). On the astronomically unlikely chance the chosen ID already maps to a live worker directory .red/tmp/workers/{id} (its worker.pid alive), regenerate. Print the ID on the first line of the run: worker: {id}. All per-attempt paths interpolate {id}, the issue number {N}, and the attempt number {n}.
4. Resolve the runner via the detection cascade:
   - --runner X flag (pin): wins over everything, logged as detected via --runner pin.
   - Caller runner env: RED_AFK_RUNNER=claude|codex|hermes is the host-session identity and wins over ambient env/process/path sniffing. Under Codex, invoke the bundle as RED_AFK_RUNNER=codex ...; under Claude Code, invoke it as RED_AFK_RUNNER=claude .... Logged as detected via env-var.
   - Env-var sniff: CLAUDECODE, CLAUDE_CODE_ENTRYPOINT, or CLAUDE_CODE_SSE_PORT → claude; CODEX_HOME, CODEX_SANDBOX, CODEX_SANDBOX_NETWORK_DISABLED, or CODEX_MANAGED_BY_NPM → codex. Logged as detected via env-var.
   - Process-tree sniff: if the invoking process tree contains Claude Code, use claude; if it contains Codex, use codex. Logged as detected via process. This is the normal path for repo-local skill copies whose filesystem path is neutral.
   - $BASH_SOURCE path sniff: script lives under ~/.claude/... → claude; under ~/.codex/... → codex. Logged as detected via path.
   - Default fallback: claude. Logged as detected via env-fallback.

   The boot log prints one line per invocation: runner: <runner> (detected via <method>). Load runner-claude.md or runner-codex.md so the spawn command is ready.

   Do not probe command -v claude / command -v codex to choose a different runner after a transport failure. Installed binaries are capabilities, not caller identity. A runner swap is allowed only when the user explicitly passes --fallback-runner.
5. Read SAFETY.md. It is binding for every shell action the loop takes.
6. Write the per-worker pid file. Create .red/tmp/workers/{id}/ and write worker.pid (current $$) once; this is the worker's single liveness anchor for its whole lifetime.
7. Install signal handlers. SIGINT, SIGTERM, and normal exit all release any in-flight issue claim, preserve the active workers/{id}/{N}-a{n}/ attempt directory, and on the EXIT trap remove worker.pid (and rmdir the empty worker dir) before terminating.
The per-attempt workers/{id}/{N}-a{n}/ directory (log, state, handoff, worktree) is created in Per-Issue Loop step 1 below, not here — the worker has no attempt files until it claims an issue. Only worker.pid exists from bootstrap.
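The runner detection cascade from the bootstrap steps above can be sketched as a pure function. This is an illustrative sketch only: the function name, the parameter shapes, and the tie-break when both Claude and Codex variables are set (Claude is checked first here) are assumptions, not the bundle's documented behaviour. The process tree is passed in as a list of process names rather than inspected.

```python
from typing import Mapping, Optional, Sequence, Tuple

CLAUDE_VARS = ("CLAUDECODE", "CLAUDE_CODE_ENTRYPOINT", "CLAUDE_CODE_SSE_PORT")
CODEX_VARS = (
    "CODEX_HOME",
    "CODEX_SANDBOX",
    "CODEX_SANDBOX_NETWORK_DISABLED",
    "CODEX_MANAGED_BY_NPM",
)


def resolve_runner(
    pin: Optional[str],
    env: Mapping[str, str],
    process_names: Sequence[str] = (),
    script_path: str = "",
) -> Tuple[str, str]:
    """Return (runner, method) following the documented precedence."""
    if pin:  # --runner X wins over everything
        return pin, "--runner pin"
    caller = env.get("RED_AFK_RUNNER")
    if caller:  # host-session identity set by the invoking LLM
        return caller, "env-var"
    if any(name in env for name in CLAUDE_VARS):
        return "claude", "env-var"
    if any(name in env for name in CODEX_VARS):
        return "codex", "env-var"
    for name in process_names:  # assumption: substring match on process names
        lowered = name.lower()
        if "claude" in lowered:
            return "claude", "process"
        if "codex" in lowered:
            return "codex", "process"
    if "/.claude/" in script_path:
        return "claude", "path"
    if "/.codex/" in script_path:
        return "codex", "path"
    return "claude", "env-fallback"


if __name__ == "__main__":
    runner, method = resolve_runner(None, {"CODEX_HOME": "/tmp/codex"})
    print(f"runner: {runner} (detected via {method})")
```

Note that nothing in the sketch probes for installed binaries, matching the rule that binaries on PATH are capabilities and not caller identity.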
Orphan Cleanup (boot-time)
Right after bootstrap and before Straggler Check, /afk runs two passes. First it drain-wipes any leftover legacy flat .red/tmp/work-*/ dirs — these are never created under the nested scheme (the drain-first cutover, issue #252), so any survivor is a pre-cutover relic and is removed unconditionally. Then it sweeps the nested attempt dirs .red/tmp/workers/*/*/ whose parent worker's worker.pid is dead, and afterwards removes the dead worker.pid files and the now-empty worker dirs. For each orphaned attempt dir:
- (Slice D: heartbeat sub-shell retired.) No zombie reap step is needed; older state files may still carry a heartbeat_pid but it's vestigial and ignored.
- Decide fate from issue state. gh issue view N --json labels,state:
  - state == CLOSED → rm -rf. Work landed; nothing to inspect.
  - label ready-for-human → split TTL based on envelope.posted in the attempt state file (see Terminal-Event Envelope below):
    - envelope.posted == true → 1-day TTL. The issue thread already carries the canonical record; the local dir is pure redundancy.
    - envelope.posted == false or field missing → 7-day TTL. The envelope POST failed (or this dir predates the envelope writer), so the local notes/log are the only copy.
  - label running (orchestrator crashed mid-issue) → restore ready-for-agent, post a recovery comment, then rm -rf. Leaving the issue eternally running is worse than losing the dir.
  - any other state → rm -rf.
- Fallback on gh failure. Network / rate-limit error → fall back to mtime TTL: 7 days for dirs with a state file, 1 day for dirs without one. Conservative enough to survive transient outages without losing artefacts the human wanted.
This removes the manual "remember to clean .red/tmp/" discipline. Blocker dirs persist until their TTL expires; everything else self-collects on the next /afk run.
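The fate decision for one orphaned attempt dir can be sketched as a small pure function. The function name and the returned action strings are hypothetical, and the sketch assumes the rules are evaluated in the order they are listed above.

```python
from typing import Iterable, Optional


def decide_fate(state: str, labels: Iterable[str], envelope_posted: Optional[bool]) -> str:
    """Map issue state to an action for an orphaned attempt dir.

    state is the gh "state" field ("OPEN" or "CLOSED"); envelope_posted is
    envelope.posted from the attempt state file, or None when the field is missing.
    """
    label_set = set(labels)
    if state == "CLOSED":
        return "remove"  # work landed, nothing to inspect
    if "ready-for-human" in label_set:
        # The issue thread already carries the record only when the envelope was posted
        return "keep-1d" if envelope_posted is True else "keep-7d"
    if "running" in label_set:
        # Orchestrator crashed mid-issue: restore ready-for-agent, comment, then remove
        return "restore-and-remove"
    return "remove"


if __name__ == "__main__":
    print(decide_fate("OPEN", ["ready-for-human"], envelope_posted=False))
```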
Attempt Cap (boot-time, issue #257)
The Completion sweep (close step 11) only fires when an issue completes. Issues that never complete — blocked-forever work that accumulates retries — would otherwise leak attempt dirs indefinitely. Right after Orphan Cleanup, cap_issue_attempts walks every attempt dir across all workers, groups them by issue, and per issue prunes (newest attempt kept first) anything over either cap:
- Age cap: RED_AFK_ATTEMPT_TTL_S (default 14 days). An attempt dir whose mtime is older than this is reclaimed.
- Count cap: RED_AFK_ATTEMPT_KEEP (default 5). Only the newest KEEP attempts (by attempt number) for one issue are retained; older ones are reclaimed.
Both caps share the completion sweep's invariant: a live worker's active attempt (state file carrying a live pid) is never counted toward the cap nor removed. A non-numeric or zero env value falls back to the default so an operator typo can never disable a cap.
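A minimal sketch of the two caps for a single issue follows. The Attempt record and both function names are hypothetical stand-ins, not the runtime's cap_issue_attempts; treating negative env values like zero is an assumption, since the text only names non-numeric and zero.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

DEFAULT_TTL_S = 14 * 86_400  # RED_AFK_ATTEMPT_TTL_S default: 14 days
DEFAULT_KEEP = 5             # RED_AFK_ATTEMPT_KEEP default


def positive_int_or_default(raw: Optional[str], default: int) -> int:
    """Non-numeric or zero values fall back to the default (negatives too, by assumption)."""
    try:
        value = int(raw)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return default
    return value if value > 0 else default


@dataclass
class Attempt:
    number: int         # the {n} in {N}-a{n}
    mtime: float        # seconds since the epoch
    live: bool = False  # state file carries a live pid


def attempts_to_prune(
    attempts: Sequence[Attempt],
    now: float,
    ttl_s: int = DEFAULT_TTL_S,
    keep: int = DEFAULT_KEEP,
) -> List[int]:
    """Return attempt numbers to reclaim for one issue, newest first."""
    # Live attempts are neither counted toward the cap nor removed
    candidates = sorted((a for a in attempts if not a.live), key=lambda a: a.number, reverse=True)
    pruned = []
    for rank, attempt in enumerate(candidates):
        too_old = now - attempt.mtime > ttl_s
        over_count = rank >= keep
        if too_old or over_count:
            pruned.append(attempt.number)
    return pruned
```

With seven fresh attempts and the default caps, only the two oldest attempt numbers are reclaimed; marking one of them live removes it from both the count and the result.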
Snapshot Branch Grace Cleanup (boot-time, issue #258)
The Completion sweep and Attempt Cap reclaim local attempt dirs; the failure-push afk-attempts/{wid}/{N}-slug snapshot branches live on origin and are the canonical record a terminal-failure envelope links to. After Attempt Cap, prune_completed_attempt_branches reaps those remote branches for issues that have completed: it lists afk-attempts/* on origin, groups branches by the issue number in the ref, classifies each issue with gh issue view, and:
- still-open issues — every branch is left untouched;
- closed within the grace window — kept, so a reopened issue can still recover its prior attempts from origin;
- closed longer than the grace window ago — every snapshot branch for that issue is deleted from origin (cross-worker).
The grace window is RED_AFK_ATTEMPT_SNAPSHOT_GRACE_S (default 7 days), measured from the issue's GitHub closedAt. A non-numeric value falls back to the default so an operator typo can never disable the grace; 0 is honoured as "delete immediately on completion". The pass is best-effort and runs at boot, never on the close path — a slow or failing gh/git can never block a completion, and an issue it cannot classify is left strictly in place.
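The grace-window classification can be sketched as below. Function names and return values are hypothetical; the fallback for negative values is an assumption, since the text only specifies non-numeric values and 0. Unlike the attempt caps, 0 is honoured here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_GRACE_S = 7 * 86_400  # RED_AFK_ATTEMPT_SNAPSHOT_GRACE_S default: 7 days


def parse_grace(raw: Optional[str]) -> int:
    """Non-numeric falls back to the default; 0 means delete immediately on completion."""
    try:
        value = int(raw)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return DEFAULT_GRACE_S
    return value if value >= 0 else DEFAULT_GRACE_S  # negative handling is an assumption


def snapshot_action(
    state: Optional[str],
    closed_at: Optional[datetime],
    now: datetime,
    grace_s: int,
) -> str:
    """Return "keep" or "delete" for every snapshot branch of one issue."""
    if state != "CLOSED" or closed_at is None:
        return "keep"  # open or unclassifiable issues are left in place
    age = now - closed_at
    return "delete" if age >= timedelta(seconds=grace_s) else "keep"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(snapshot_action("CLOSED", now - timedelta(days=8), now, parse_grace(None)))
```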
On-Demand Branch Reaper (issue #275)
Run /afk reap (the bundle's reap command) to perform branch hygiene without starting a worker, claiming an issue, or firing lifecycle hooks. The command first prints one line:
afk branch counts: remote-afk=N remote-afk-attempts=N local-afk=N
It then applies the same three namespace reapers used during /afk boot: remote afk-attempts/*, remote afk/*, and local afk/*. Open issues and transiently unclassified issues are kept; local branches checked out by any worktree are kept. Each successful deletion logs the branch, issue number, and classification reason. Re-running is a natural no-op once stale refs are gone. Snapshot grace still comes from RED_AFK_ATTEMPT_SNAPSHOT_GRACE_S; live afk/* cleanup keeps the existing closed-vs-open policy.
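For illustration, the count line could be derived from branch names as follows. The function name is hypothetical, and the sketch assumes remote branch names arrive with the origin/ prefix already stripped.

```python
from typing import Iterable


def branch_counts_line(remote_branches: Iterable[str], local_branches: Iterable[str]) -> str:
    """Format the one-line summary printed before the reapers run."""
    remote = list(remote_branches)
    # "afk-attempts/..." does not start with "afk/", so the two namespaces never overlap
    remote_afk = sum(1 for name in remote if name.startswith("afk/"))
    remote_attempts = sum(1 for name in remote if name.startswith("afk-attempts/"))
    local_afk = sum(1 for name in local_branches if name.startswith("afk/"))
    return (
        f"afk branch counts: remote-afk={remote_afk} "
        f"remote-afk-attempts={remote_attempts} local-afk={local_afk}"
    )


if __name__ == "__main__":
    print(branch_counts_line(["main", "afk/wZ2R4/42-fix"], ["main"]))
```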
Dependency Unblock — req:N edges, close cascade + boot sweep
Dependencies are first-class req:N edge labels (one per blocker), and a dependency-blocked issue holds the blocked:dependency state — not ready-for-human (it is healthy, waiting, and never pages). Two mechanisms promote it to ready-for-agent:
1. Close cascade (event-driven, the fast path). Immediately after /afk closes an issue #N on the DONE path (after the completion sweep), it re-evaluates every dependent of #N:
   - gh issue list --label req:N --state open --json number,labels.
   - For each dependent, read its req:* labels and resolve each referenced issue's state (the just-closed #N is known closed; others via a cached lookup).
   - When every req:* of a dependent is now closed: gh issue edit --remove-label blocked:dependency --add-label ready-for-agent + post 🤖 /afk unblocked: all dependencies closed (#…).
Best-effort: a gh failure here logs a warn: and never fails the close — the boot sweep below catches anything the cascade missed.
2. Unblock Sweep (boot-time, the safety net). After Orphan Cleanup and before Straggler Check, /afk re-scans dependency-blocked issues by label and promotes any whose deps all closed:
   - gh issue list for open blocked:dependency issues with number,labels,body.
   - Deps come from the req:* labels (the source of truth); for pre-req:N issues with no such label, fall back to extracting #N refs under the literal ## Blocked by body heading (- [ ] #N) only when the issue is still labelled blocked:dependency.
   - Resolve each dep via gh issue view <N> --json state; promote only when every dep is CLOSED.
   - On promotion: remove the holding label (blocked:dependency), add ready-for-agent, post the audit comment, and log unblocked N issue(s): #A #B.
ready-for-human is a human gate, not dependency-wait. The boot sweep must not promote it from a legacy ## Blocked by body parse, because a closed blocker can still encode a failed measurement or a no-go decision. blocked:dependency issues do not have that ambiguity: the label means dependency-wait, which is the whole point of separating it from ready-for-human.
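The dependency resolution used by both mechanisms can be sketched as two pure functions. Names are hypothetical, labels are assumed to be plain strings, and refusing to promote an issue with no resolvable dependencies is an assumption the text does not spell out.

```python
import re
from typing import Dict, List, Sequence

REQ_LABEL = re.compile(r"^req:(\d+)$")
BLOCKED_BY_ITEM = re.compile(r"^- \[ \] #(\d+)", re.MULTILINE)


def dependencies(labels: Sequence[str], body: str = "") -> List[int]:
    """Collect blockers from req:N labels, else from the legacy body section."""
    deps = [int(m.group(1)) for label in labels if (m := REQ_LABEL.match(label))]
    if deps:
        return deps  # labels are the source of truth
    if "blocked:dependency" not in labels:
        return []  # never parse the body for ready-for-human issues
    parts = body.split("## Blocked by", 1)
    if len(parts) < 2:
        return []
    # Only look at the section itself: stop at the next level-2 heading
    section = re.split(r"^## ", parts[1], maxsplit=1, flags=re.MULTILINE)[0]
    return [int(number) for number in BLOCKED_BY_ITEM.findall(section)]


def should_promote(deps: Sequence[int], states: Dict[int, str]) -> bool:
    """Promote only when there is at least one dep and every dep is CLOSED."""
    return bool(deps) and all(states.get(dep) == "CLOSED" for dep in deps)
```

A dependent labelled req:12 and req:15 is promoted only once the state lookup reports both as CLOSED; an unknown state counts as not closed.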
Current Blocker State
Human gates are first-class issue-body state, not implicit thread archaeology. Before claiming an issue, /afk checks for an active ## Current blocker block:
## Current blocker
<!-- red:blocker-state v1 -->
status: blocked
kind: decision
ref: #856
summary: Phase 2 measured no columnar read win.
next: Human must decide whether to stop, redesign, or continue anyway.
<!-- /red:blocker-state -->
If this block is present with status: blocked, /afk does not create an attempt. It removes ready-for-agent, adds ready-for-human plus the typed blocker label, leaves the issue open, and waits for /hitl.
When an attempt reaches a terminal human page (BLOCKED, validation failure, merge conflict, or no-sentinel after retry budget exhaustion), the runtime writes or replaces this block so the next /hitl turn can start from the current blocker instead of re-reading every old envelope. /hitl clears the block to None, records it under ## Resolved blockers, refreshes ## Agent brief, and moves the issue back to ready-for-agent only when the next agent can continue without guessing.
Use ## Blocked by only for mechanical dependencies that should auto-promote on close. Use ## Current blocker / ## Human decision needed for gates, measurements, product calls, or any state where "the referenced issue closed" is not enough to prove the work is delegable.
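A minimal sketch of reading the blocker block from an issue body follows. The function names are hypothetical, and the sketch keys only on the red:blocker-state comment markers (it does not check for the ## Current blocker heading), which is a simplifying assumption.

```python
import re
from typing import Dict, Optional

BLOCK = re.compile(
    r"<!-- red:blocker-state v1 -->\n(.*?)\n<!-- /red:blocker-state -->",
    re.DOTALL,
)


def parse_blocker(body: str) -> Optional[Dict[str, str]]:
    """Return the key/value fields of the blocker block, or None when absent."""
    match = BLOCK.search(body)
    if match is None:
        return None
    fields: Dict[str, str] = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_blocked(body: str) -> bool:
    """True when the body carries a block whose status is 'blocked'."""
    fields = parse_blocker(body)
    return fields is not None and fields.get("status") == "blocked"
```

Splitting on the first colon keeps values such as the summary sentence intact even when they contain further colons.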
Straggler Check
Before issue selection, /afk counts open issues in states it cannot consume:
- unlabeled: never triaged
- needs-triage: triage in progress
- needs-info: waiting on reporter
If any of those counts is non-zero, print a warning and (on a TTY, not in --once) prompt to confirm before proceeding. This catches the "lost issue" case where a fresh report never made it through /triage and is silently invisible to /afk.
The systemic fix is the red-issues-needs-triage.yml workflow installed by /setup-red-skills, which auto-applies needs-triage to every fresh issue. The straggler check is the in-loop safety net for repos where the workflow isn't installed yet.
Issue Selection
Pull the candidate list with gh issue list --label ready-for-agent --state open --json number,title,labels,body --limit 100.
PRD exclusion (hard). Drop every issue carrying the type:prd label before any other filter. PRDs describe what to build, not an implementable slice — they must be split by /to-issues first. If a PRD is found in ready-for-agent (usually because someone labelled it manually), log a warning naming the issue numbers and the fix (/to-issues N), and continue with the remaining candidates. This defence is in addition to /to-prd never applying ready-for-agent in the first place.
Urgent prepend (hard, runs before any filter). Issues carrying priority:urgent always jump the head of the queue, ahead of --prd and --issues filters. Source: the /urgent skill files an issue with priority:urgent + ready-for-agent; every /afk invocation prepends those to the candidate list regardless of which selection flags were passed. Among urgents, oldest issue number first.
Apply filters to the non-urgent remainder in this order:
- If --issues was passed: keep only those numbers, in argument order. Error if any are missing or not labelled ready-for-agent. PRDs in the explicit list are still rejected; the user is told to slice them first.
- Else if --prd was passed: keep issues with prd: #N in the body, a parent link to issue N, or a prd:N label. The PRD itself (#N) is excluded by the type:prd filter above.
- Else: keep all remaining ready-for-agent issues. Sort by triage priority (priority:high before priority:low and unlabelled), then by issue number ascending.
The final queue is [urgent…] + [filtered non-urgent…], deduped by number (so an urgent issue that also matched the filter only appears once, at the front).
If the list is empty, print <promise>NO MORE TASKS</promise> and exit 0.
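The selection rules above can be sketched as one function over a simplified issue shape. Everything here is illustrative: the dict shape (labels as plain strings), the function name, and two choices the text leaves open, namely that priority:low and unlabelled issues share one rank and that --prd matches keep their input order. The parent-link match is omitted because it needs an API lookup.

```python
import re
from typing import Any, Dict, List, Optional, Sequence

Issue = Dict[str, Any]  # simplified: {"number": int, "labels": [str], "body": str}


def build_queue(
    issues: Sequence[Issue],
    only: Optional[Sequence[int]] = None,  # --issues
    prd: Optional[int] = None,             # --prd
) -> List[int]:
    """Return issue numbers in the order the loop would attempt them."""
    # PRD exclusion runs before any other filter
    candidates = [i for i in issues if "type:prd" not in i["labels"]]
    # Urgent issues always lead, oldest issue number first
    urgent = sorted(i["number"] for i in candidates if "priority:urgent" in i["labels"])
    rest = [i for i in candidates if "priority:urgent" not in i["labels"]]

    if only is not None:
        known = {i["number"] for i in candidates}
        missing = [n for n in only if n not in known]
        if missing:
            raise ValueError(f"not ready-for-agent, or a PRD: {missing}")
        filtered = list(only)  # argument order
    elif prd is not None:
        marker = re.compile(rf"^prd: #{prd}\b", re.MULTILINE)
        filtered = [
            i["number"]
            for i in rest
            if f"prd:{prd}" in i["labels"] or marker.search(i.get("body", ""))
        ]
    else:
        ranked = sorted(rest, key=lambda i: ("priority:high" not in i["labels"], i["number"]))
        filtered = [i["number"] for i in ranked]

    queue: List[int] = []
    for number in urgent + filtered:  # dedupe, keeping the first (front) occurrence
        if number not in queue:
            queue.append(number)
    return queue
```

An empty result corresponds to the NO MORE TASKS exit described above.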
Issue Lifecycle (the /afk slice)
Canonical state machine lives in setup-red-skills/triage-labels.md. The portion /afk touches:
ready-for-agent
│
(1) claim
remove ready-for-agent
add running
post start comment
│
▼
running
┌───┴───┐
│ │ inner agent works in worktree → emits DONE | BLOCKED
│ │ orchestrator runs feedback loops, then merges to main
│ │
│ ├──── DONE + green + merged + pushed
│ │ │
│ │ (4a) close
│ │ remove running
│ │ gh issue close --reason completed
│ │ │
│ │ ▼
│ │ closed
│ │
│ └──── BLOCKED, or merge conflict, or both runners exhausted
│ │
│ (4b) release
│ remove running
│ add ready-for-human
│ post blocker comment with worktree path
│ │
│ ▼
│ ready-for-human (worktree preserved at moment of blocker)
│
└──── orchestrator interrupted (SIGINT/SIGTERM)
│
(4c) release
remove running
restore ready-for-agent
post interruption comment
│
▼
ready-for-agent (next /afk run can pick it up)
Label transitions are not atomic at the gh level — gh issue edit --remove-label A --add-label B resolves the new label set client-side and submits the union, so a removed-but-no-longer-present label is a silent no-op and the edit returns 0. To prevent two parallel /afk runners from both thinking they claimed the same issue, the per-issue claim uses three layers:
- Local mkdir lock at .red/tmp/claims/{N}/ (POSIX-atomic). Workers in the same checkout race here, and the loser skips.
- Pre-check via gh issue view --json labels: if ready-for-agent is already gone or running is already present, abort before the edit. Cuts the cross-checkout race window to roughly one round-trip.
- Stale-lock sweep at boot, during orphan cleanup: any .red/tmp/claims/{N}/ whose recorded pid is dead gets reclaimed automatically.
Residual gap: two clones of the same repo on the same host (or different hosts) do not share .red/tmp/, so each holds its own mkdir lock and the gh edit race re-opens for the brief window the pre-check leaves uncovered. Acceptable for the intended scale (a few terminals, one checkout). If you need cross-host claim safety, gate /afk on a proper coordinator instead of GitHub labels.
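The first layer, the atomic mkdir lock, can be sketched as follows. The function name and the pid file name inside the lock directory are hypothetical; the text only says a pid is recorded, not where.

```python
import os
import tempfile


def try_claim(claims_root: str, issue: int) -> bool:
    """Try to take the per-issue lock; return False when another worker holds it."""
    os.makedirs(claims_root, exist_ok=True)
    lock_dir = os.path.join(claims_root, str(issue))
    try:
        os.mkdir(lock_dir)  # atomic: exactly one caller succeeds
    except FileExistsError:
        return False  # the loser skips this issue
    # Record the owner so a boot-time sweep can detect a dead holder
    with open(os.path.join(lock_dir, "pid"), "w") as handle:
        handle.write(str(os.getpid()))
    return True


if __name__ == "__main__":
    root = os.path.join(tempfile.mkdtemp(), "claims")
    print(try_claim(root, 42), try_claim(root, 42))
```

The lock only protects workers that share one .red/tmp/ directory, which is why the pre-check against GitHub is still needed.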
Per-Issue Loop
For each issue N:
- Claim.
gh issue edit N --remove-label ready-for-agent --add-label running. Then resolve the attempt number {n} from the attempt-ledger (per-issue across all workers), create the attempt directory .red/tmp/workers/{id}/{N}-a{n}/, open afk.log (tee target for orchestrator output), and initialise afk.state.json per State File below. The orchestrator PID is already recorded once in the per-worker worker.pid (written at bootstrap) and is also embedded in afk.state.json's .pid field — there is no per-attempt pid file. Comment a start line on the issue: ISO timestamp, runner identity, worktree path. If labelling fails because someone else already claimed it, abandon the attempt directory and skip to the next issue.
- Worktree. Resolve the base branch with precedence lock > pin > main (ADR 0031): the primary checkout's branch-lock value (
.red/tmp/branch-lock.yaml, written by the branch-lock skill) wins when set; else the pinned branch (ADR 0008 — the issue's own branch: line, else its parent PRD's); else main. ({pinned} below denotes this resolved base.) Then git -C primary fetch origin {pinned} and git worktree add .red/tmp/workers/{id}/{N}-a{n}/worktree -b afk/{id}/{N}-{slug} origin/{pinned} from the primary checkout. The worktree lives inside the gitignored .red/tmp/ tree so it never appears in git status for main. Immediately after worktree creation the runtime mirrors the new branch on origin (git push origin -u HEAD:refs/heads/afk/{id}/{N}-{slug} --force-with-lease) and installs a per-worktree post-commit hook that fire-and-forgets a git push origin HEAD --force-with-lease after every inner-agent commit. Both calls are best-effort: a network/auth failure logs a warn: line and the iteration continues — the afk-attempts/* failure-push net (see Terminal-Event Envelope) still fires on terminal failure. Net effect: afk/{id}/{N}-{slug} is a remote-tracked branch throughout the iteration, so a SIGKILL anywhere from here on preserves the diff on origin without manual recovery.
- Handoff file. Materialise the handoff into
.red/tmp/workers/{id}/{N}-a{n}/handoff.md using the template below — top-level XML wrappers (<issue-body>, <previous-attempts>, <prior-attempt-context>, <human-guidance-thread>, <agent-notes>) keep the issue body, orchestrator-authored prior attempts, the restart-informed retry block, human comments, and the inner-agent scratchpad unambiguous. <issue-body> carries the issue body verbatim (including the ## Agent brief section written by /triage). The handoff file lives one level above the worktree so the inner agent reads it via ../handoff.md from inside the worktree, and so it survives a worktree wipe on retry.
- Restart-informed retries (PRD #244, issue #255). On a terminal failure the orchestrator writes two marker files into the failing attempt dir:
snapshot-branch.ref (the afk-attempts/{id}/{N}-{slug} ref it pushed to) and failure.reason (the envelope summary). On the next attempt — the runtime reads those markers before the current attempt dir is created, so it sees the prior attempt's state — the handoff builder fetches that snapshot branch into the worktree under the local ref refs/afk/prior-attempt and emits a <prior-attempt-context> element carrying prev-snapshot-branch, the verbatim prev-failure-reason, and prev-fetched-ref. The retry still branches fresh off the base (step 2 is unchanged), so a wrong prior approach never compounds; the fetched ref is read-only history for the inner agent to inspect. First attempts skip all of this and are byte-for-byte unchanged.
- Local heartbeat marker. Write one
[heartbeat] iteration started for #N line to afk.log. Slice D retired the periodic GitHub-comment heartbeat (:one: :two: :three: :four:) — local liveness is now signalled by the inner-agent stdout stream tee'd into afk.log plus state-file mtime, both of which already exist.
- Inner agent. Invoke claude/codex per
runner-*.md with AGENT-PROMPT.md + the handoff file + last 5 commits of main + the optional --request/-r special user request block. Stream stdout into the loop's header tail. Detect stages by grep on the stream — see Stage Detection below.
- Inner result.
- Inner committed and emits
<promise>DONE</promise> → continue to feedback loops.
- Inner emits
<promise>BLOCKED</promise> plus notes appended to the handoff file → comment the blocker on the issue, re-label ready-for-human, drop the worktree, go to next issue.
- Inner emits
<promise>NO MORE TASKS</promise> from inside one iteration → ignored. That sentinel is for the outer loop.
- Runner-exhausted signal (rate limit / quota error string per runner) → keep the worktree, swap runner, retry the same issue once. If both runners exhaust, exit 75 (
EX_TEMPFAIL).
- Feedback loops. In the worktree, derive relevant package scopes from the worker branch diff against the pinned base, then run
test, typecheck, lint, and build with pnpm -C <scope> for each touched package that declares the script. Root-only repos keep using the root package. Any missing script is reported as an explicit per-scope skip in the validation section. Any failure blocks the merge and flips the issue to ready-for-human with the validation report in the blocker envelope.
- Merge. All steps target the base branch resolved in step 2 (
{pinned}, defaults to main). The integration prelude is shared; landing is lock-toggled by the branch-lock state (ADR 0030).
- Primary dirty? Auto-stage and commit
chore(afk): pre-merge snapshot for #N in primary. Never git stash. Never git checkout -- ..
git -C primary fetch origin {pinned}. The primary checkout is pinned to main by the precheck; when {pinned} is not main, switch the primary checkout onto it for the merge (creating the local branch from origin/{pinned} if needed) and restore it to main on every exit path.
- Integrate the fetched tip into local
{pinned} before merging: fast-forward when local is strictly behind, otherwise rebase local commits onto origin/{pinned}. Without this the worker branch merges onto the stale boot-time HEAD and the push is rejected non-fast-forward whenever origin moved mid-run. If integration fails (diverged history that won't rebase), abort the merge → ready-for-human.
- Capture the integrated tip (
pre_merge_sha) for rollback, then land per lock state:
- Locked (
.red/tmp/branch-lock.yaml present — {pinned} is the locked branch): git -C primary merge --no-ff afk/{id}/{N}-{slug} -m "merge: #{N} {title}" directly into the local locked branch, then git -C primary push origin {pinned}. Nothing reaches main — promoting the locked branch to main is the operator's call. Conflict → one-shot self-resolve; still-conflicting → git merge --abort → ready-for-human. Push rejected → roll back to pre_merge_sha → ready-for-human.
- Unlocked: land via an admin-merged PR. Force-push the attempt branch's final state to origin, open (or reuse) a PR
--base {pinned} --head afk/{id}/{N}-{slug}, then gh pr merge --admin --merge. The PR is the durable per-attempt history — it survives the branch deletion in step 11. No completed work reaches {pinned} except through this admin-merge. Any failure (push, create, or admin-merge) → ready-for-human. Then fast-forward local {pinned} to the PR merge commit so the closing envelope's merge_sha is correct.
- Push. Folded into step 8: the locked path pushes the locked branch over SSH (rollback on reject); the unlocked path's push is the admin-merge of the PR. Either way, do not retry-loop indefinitely.
- Close. Validation comment on the issue: tests pass/fail, lint, typecheck, build, commits added, files touched. Then gh issue close N --reason completed. Remove running label. Once the close succeeds, delete the live remote branch (git push origin --delete afk/{id}/{N}-{slug}) so the remote graveyard stays tidy — the merge commit on {pinned} already carries the diff. Best-effort: a failed delete (branch protection, network) logs a warn: line and the close still completes; the orphan afk/* branch can be cleaned up later.
- Cleanup (split teardown, issue #256). Every close path — success and failure/blocker — always drops the heavy worktree (git worktree remove .red/tmp/workers/{id}/{N}-a{n}/worktree) while retaining the cheap artifacts (the JSONL lanes log.jsonl / agent.log.jsonl and the handoff.md) in the attempt directory for post-mortem. On DONE the merged branch is also deleted (git branch -d afk/{id}/{N}-{slug}, after the worktree is gone). The retained attempt's state file is marked not-live (pid: 0) so monitor / mirror / statusline read it as finished. No worktree survives a close; the attempt dir itself is reclaimed later by the boot-time orphan sweep's TTL or, on DONE, immediately by the completion sweep below. The remote afk/{id}/{N}-{slug} ref was deleted in step 10 on DONE; failure paths leave the remote ref intact and instead push the canonical afk-attempts/{id}/{N}-{slug} ref (see Terminal-Event Envelope).
- Completion sweep (issue #257). Once an issue is closed and merged, the runtime reclaims every attempt dir for that issue across all workers via the canonical .red/tmp/workers/*/{N}-a* glob — not just the worker that completed it. The split-teardown retention only buys time for the orphan-sweep TTL; a completed issue needs none of it, so its retained dirs (including this worker's just-closed one) go now. A live worker's active attempt — one whose own state file still carries a live pid — is always skipped, though the claim lock makes a live duplicate of a just-completed issue unlikely.
- Tick. Update state file. Recompute ETA from rolling average of last 3 issue durations. Print one summary line: finished {done}/{total} ({pct}%) — next: #{next}.
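The Tick arithmetic can be sketched as follows. This is a minimal illustration, not the bundle's code: the function names are hypothetical, and both the rounding of the percentage and the "average duration times remaining issues" ETA formula are assumptions layered on the rolling-average rule stated above.

```python
def eta_seconds(durations_seconds, remaining, window=3):
    """Estimate remaining wall-clock seconds from the last `window` issue durations."""
    recent = durations_seconds[-window:]
    if not recent or remaining <= 0:
        return 0
    average = sum(recent) / len(recent)
    # Assumption: ETA = rolling average * issues still queued.
    return round(average * remaining)


def summary_line(done, total, next_issue):
    """Render the one-line tick summary in the format quoted in the Tick step."""
    pct = round(100 * done / total) if total else 0  # rounding mode is an assumption
    return f"finished {done}/{total} ({pct}%) — next: #{next_issue}"
```

With the durations from the State File example (820, 940, 760) the rolling average is 840 s per issue.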
Runner Fallback
Default behaviour is no rotation and no fallback — the runner resolved by the detection cascade (see Bootstrap step 4) is used for every issue in the run, and RUNNER_EXHAUSTED exits the loop with code 75 and a log line naming the dead runner. Both behaviours are opt-in:
- --alternate re-enables round-robin rotation between consecutive issues (claude → codex → claude → …). Mutually exclusive with --runner.
- --fallback-runner re-enables mid-issue swap when the active runner returns RUNNER_EXHAUSTED. Without it, exhaustion is terminal for the run.
Exhaustion detection lives in runner-claude.md and runner-codex.md — they own the per-runner error strings. The orchestrator only sees RUNNER_EXHAUSTED as a structured signal.
When swap happens mid-issue (only with --fallback-runner), the same worktree and handoff file are reused; the new runner sees the previous agent's Notes appended.
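The rotation and fallback rules above can be modelled in a few lines. This is a hedged sketch: `runner_for_issue` and `on_exhausted` are hypothetical names, starting the rotation from the resolved runner is an assumption, and the text does not say what happens when the fallback runner is exhausted too, so the sketch does not model that case.

```python
RUNNERS = ("claude", "codex")
EXIT_RUNNER_EXHAUSTED = 75  # exit code named in the Runner Fallback text


def runner_for_issue(index, base_runner, alternate=False):
    """Pick the runner for the index-th issue of the run (0-based)."""
    if not alternate:
        return base_runner  # default: no rotation
    start = RUNNERS.index(base_runner)
    return RUNNERS[(start + index) % len(RUNNERS)]


def on_exhausted(active_runner, fallback_enabled):
    """Decide what a RUNNER_EXHAUSTED signal means for the run."""
    if fallback_enabled:
        other = next(r for r in RUNNERS if r != active_runner)
        return ("swap", other)  # same worktree and handoff file are reused
    return ("exit", EXIT_RUNNER_EXHAUSTED)
```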
The attempt-exit reader (<promise> is canonical — ADR 0028)
The <promise>…</promise> sentinel the inner agent emits is the canonical "attempt is over" signal — not pipe EOF, not the child process exiting. Pipe EOF and process exit are demoted to crash detectors: they only matter when the agent never authored its own exit. This is the architecture fix flagged during the #216 bash-hang diagnosis ("we have to be more sensitive to the promise's result").
Failure mode it closes: the inner agent emits <promise>DONE</promise> (or BLOCKED) but then leaves a tool call / background subprocess running — typically run_in_background followed by a bash -c 'until grep "test result" $out; do sleep 5; done' polling loop without a timeout. The bg task holds the stream-json pipe open, and an orchestrator that waited for EOF would hang for hours.
The runtime owns the stream-read + sentinel-detection + bounded tear-down, watching the runner's output in the foreground instead of waiting on the bare pipe. It tails the runner's stream capture (line-anchored match, so the agent quoting the sentinel in planning prose does not false-positive) and, once it sees <promise>DONE|BLOCKED|NO MORE TASKS</promise>:
- records the normalized outcome (done / blocked / no_more_tasks) in ATTEMPT_READER_OUTCOME;
- gives the child RED_AFK_ATTEMPT_GRACE_S (default 30 s) to exit cleanly;
- if still alive, recursively SIGTERMs the pipeline pid and every descendant (claude / codex, jq, grep, tee, and any bash child stuck in a polling loop);
- waits RED_AFK_ATTEMPT_KILL_S (default 10 s), then SIGKILLs anything still alive.
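The grace, SIGTERM, SIGKILL escalation in the list above can be sketched with the standard library. This is a simplified illustration under stated assumptions: `bounded_teardown` is a hypothetical name, and it signals only the direct child, whereas the runtime described here signals the pipeline pid and every descendant.

```python
import subprocess


def bounded_teardown(proc, grace_s=30.0, kill_s=10.0):
    """Escalate from waiting, to SIGTERM, to SIGKILL, with a deadline at each step."""
    try:
        proc.wait(timeout=grace_s)  # grace window: let the child exit on its own
        return "clean"
    except subprocess.TimeoutExpired:
        pass
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=kill_s)
        return "terminated"
    except subprocess.TimeoutExpired:
        pass
    proc.kill()  # SIGKILL on POSIX
    proc.wait()
    return "killed"
```

Every wait carries a timeout, so the tear-down itself cannot hang on a child that ignores signals short of SIGKILL.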
The orchestrator proceeds with feedback loops / labelling regardless of how the tear-down resolves — the agent's commit work, sentinel, and result are all already on disk by the time tear-down fires, and afk.log still captures everything the runner printed after the sentinel (the tee/fan-out sits upstream of the kill). The legacy RED_AFK_WATCHDOG_GRACE_S is still honoured as a back-compat alias for the grace window. Setting the grace below ~5 s risks killing healthy pipelines that just haven't flushed jq's buffer; 30 s is conservative.
EOF without a sentinel is on_attempt_error. If the pipe closes before any <promise> is observed, the agent never declared the attempt over (crash, kill, or a daemon that closed the pipe without speaking): the attempt is recorded as errored (on_attempt_error fires, error class no-sentinel), the issue lands in ready-for-human, and post_attempt does not fire for that invocation. Runner exhaustion (RUNNER_EXHAUSTED) stays out of the sentinel channel — it keeps its own branch and the fallback-runner swap.
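A line-anchored sentinel match, including the EOF-without-sentinel case, might look like the sketch below. It assumes the stream has already been decoded into plain text lines and that "line-anchored" means the sentinel is the only content on its line; the regex and the `read_outcome` name are illustrative, not taken from the bundle.

```python
import re

# Anchored at both ends so a sentinel quoted inside prose does not match.
SENTINEL = re.compile(r"^\s*<promise>(DONE|BLOCKED|NO MORE TASKS)</promise>\s*$")


def read_outcome(lines):
    """Return 'done', 'blocked', 'no_more_tasks', or None when EOF arrives first."""
    for line in lines:
        match = SENTINEL.match(line)
        if match:
            return match.group(1).lower().replace(" ", "_")
    return None  # caller treats this as the no-sentinel error class
```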
The parsed outcome rides into the post_attempt mutable context as result.outcome and the RED_AFK_RESULT_OUTCOME env var, so hooks (and the Memory attempt.hooks record, #216) see the agent-authored exit, not just success/fail.
Preventive counterpart lives in AGENT-PROMPT.md under Background Tasks and Polling — inner agents are required to cap every polling loop with a deadline. The bounded tear-down is the safety net; the prompt rule is the design.
Heartbeat (local-only, post-Slice-D)
The issue-thread heartbeat (:one: / :two: / :three: / :four: cycling every 10 minutes via gh issue comment) was removed in Slice D. The issue thread is now timeline-only: boot stamp, attempt envelopes, human guidance, closing envelope. No periodic noise.
Local liveness is signalled by:
- Inner-agent stdout stream, continuously tee'd into the iteration's afk.log by run_inner — forensic inspection of a running worker tails this file.
- Clean agent lane + firehose (issue #250) — alongside afk.log, the runtime fans each assistant turn out to a clean single-writer agent.log.jsonl (one type=agent record per turn, nothing synthetic — the true liveness signal, readable as a live transcript with tail -f … | jq -r .msg) and to a log.jsonl firehose that also carries the heartbeat vitals, hook dispatches, runner timings, and errors in the uniform JSONL envelope. The heartbeat writes its vitals to the firehose as a type=heartbeat record but never to the agent lane, so the agent lane's silence is real silence (the masking that defeated stall/reaper detection in #243). afk.log is unchanged and still carries the tee'd stdout + heartbeat lines below.
- State-file mtime, bumped on every state update. The monitor combines orchestrator pid liveness with state-file freshness to render 🟢 live vs 🟡 stale.
- Iteration boundary markers — heartbeat_start / heartbeat_stop write a single [heartbeat] iteration started/stopped line each to afk.log so forensic readers can see when an iteration entered and left the inner-agent stage.
- Periodic orchestrator heartbeat (issue #194) — heartbeat_start also spawns a side-channel sub-shell that appends one line every RED_AFK_HEARTBEAT_S (default 60s) to afk.log:
[heartbeat] stage:tests t+00:14:02 last_stream_line="..." cpu=12% rss=420M
The loop re-reads current.stage and current.last_stream_line from afk.state.json on every tick (so a mid-iteration stage flip shows up in the next heartbeat) and reads cpu/rss from ps against the orchestrator pid. Because it lives in its own sub-shell — independent of the inner-agent stream tee that buffers inside the runner pipeline — a forcibly hung worker (kill -STOP on the inner agent, or a runner that never flushes) still produces one heartbeat line per minute with stage frozen and wall-clock advancing. The no-sentinel envelope's data-section=log carries these lines, so the issue thread alone is enough to diagnose where the hang occurred. Set RED_AFK_HEARTBEAT_S=0 to disable the periodic loop (boundary markers still fire).
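For readers who parse afk.log, the heartbeat line shape shown above can be reproduced as below. The field order follows the sample line; the `heartbeat_line` helper and its parameter names are hypothetical, and the runtime itself is described as a shell sub-shell reading ps, not Python.

```python
def heartbeat_line(stage, elapsed_s, last_stream_line, cpu_pct, rss_mb):
    """Format one periodic heartbeat line in the shape of the sample above."""
    hours, rem = divmod(int(elapsed_s), 3600)
    minutes, seconds = divmod(rem, 60)
    return (
        f"[heartbeat] stage:{stage} t+{hours:02d}:{minutes:02d}:{seconds:02d} "
        f'last_stream_line="{last_stream_line}" cpu={cpu_pct}% rss={rss_mb}M'
    )
```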
The terminal header has its own independent 3 s redraw tick — see Live Header below. It is unrelated to (and survives the removal of) the GitHub-thread heartbeat.
Deprecated state fields. current.heartbeat_glyph and current.heartbeat_pid are kept as null for one release window so older monitors don't error on read; they are no longer written meaningfully and may be removed in a future release.
Terminal-Event Envelope
Every terminal event of an iteration posts exactly one structured comment on the issue. The comment is the canonical record of what the worker saw and did, and a future Slice C parser will reconstruct iteration history by walking these envelopes in a thread.
Statuses (one per envelope, mutually exclusive):
| data-attempt-status | trigger |
|---|---|
| blocked | inner agent emitted <promise>BLOCKED</promise> |
| no-sentinel | inner agent exited without DONE or BLOCKED |
| merge-conflict | orchestrator could not merge to main |
| done | success — merged, closing envelope |
Schema (deterministic — Slice C depends on this shape):
<details data-attempt-status="blocked"><summary>worker `wZ2R4` · status: blocked · duration: 2m5s · diff: +42 -10 · attempt: 1</summary>
<details data-section="notes"><summary>notes</summary>
…handoff `<agent-notes>` body…
</details>
</details>
Per-status body sections:
- blocked → one data-section="notes" block carrying the handoff's <agent-notes> body (the inner agent's appended progress/blockers).
- no-sentinel → both data-section="notes" (handoff <agent-notes>, may be empty placeholder) and data-section="log" (last 50 lines of the captured inner-agent stdout, fenced).
- merge-conflict → one data-section="log" block carrying the merge-conflict diff tail (last 50 lines of git merge output), fenced. Mirrors the no-sentinel log shape.
- done → one data-section="validation" block carrying the package-aware feedback report. Summary carries diff: merged and merge: <sha> (GitHub auto-links bare SHAs to the commit on main). The merge commit on main is the diff — no need to duplicate it inline.
User-hook executions section (issue #215). Every terminal Envelope (any of the four statuses above) also carries a trailing data-section="hooks" block when at least one user-declared lifecycle hook ran during the issue's lifecycle. Built-in defaults (cargo, gradle, heartbeat, envelope, validation — see the Lifecycle Hooks table) are deliberately excluded; the block exists to surface the policy the operator wrote in .red/config.yaml, not the skill's own machinery. Each line has the deterministic shape <lifecycle_name> <command> exit=<rc>, in execution order across the entire lifecycle (pre_session → pre_pick → post_pick → pre_worktree → pre_attempt → post_attempt → pre_merge → post_merge → on_attempt_error → on_idle → post_session / on_session_error). Non-zero exits are listed with their exit code — never omitted — so a reviewer can see which user-declared policy guarded the merge or mutated the queue, and whether it failed. When no user hook ran (the common case for projects without an afk.hooks block in .red/config.yaml), the section is omitted entirely rather than rendered empty. The discarded supervisor envelope never carries this section: discards record a slot-park decision made above the per-issue lifecycle, so no per-issue hook chain exists to enumerate.
Branch namespaces — afk/* vs afk-attempts/* (issue #191). Two distinct remote namespaces, never overlapping:
- afk/{id}/{N}-{slug} is the live-iteration branch. It's pushed at worktree-create (push_initial), kept in sync after every inner-agent commit by a per-worktree post-commit hook (install_post_commit_hook), and deleted on DONE after gh issue close succeeds (delete_remote). On any terminal failure the live ref is not deleted — it survives on origin so a human can git fetch && git checkout afk/{id}/{N}-{slug} to inspect mid-iteration state.
- afk-attempts/{id}/{N}-{slug} is the failure-only marker that the terminal-failure envelope (BLOCKED / no-sentinel / merge-conflict) links to. It's pushed exactly once per terminal failure inside envelope_emit_attempt and never deleted by the orchestrator — it's the forensic record. This lifecycle is unchanged by #191.
Validation Sidecar
During feedback validation, AFK also writes a structured JSONL sidecar at
$ITER_DIR/validation.jsonl. It is not rendered into the issue comment; it is
the machine-readable source used by the optional Memory bridge.
Each line is one command/check execution:
{"schema":"red.afk.validation.v1","name":"test:plugins/memory","command":"pnpm -C /repo/plugins/memory test","status":"passed","durationMs":1234,"summary":"command exited 0"}
Fields:
- schema: literal red.afk.validation.v1.
- name: stable check name, usually {script}:{scope} such as typecheck:root or lint:plugins/memory.
- command: command string when a command ran; omitted for skipped checks.
- status: passed, failed, or skipped.
- durationMs: command duration when a command ran.
- summary: short relevant output/error summary, or a skip reason.
The Memory attempt writer only consumes this structured sidecar after parsing it
as JSON. It must not derive validation graph nodes by parsing free-form stdout,
Envelope notes, validation-summary prose, or <agent-notes>.
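A consumer that honours the "parse as JSON, never scrape prose" rule could look like this sketch. The field names and allowed values come from the Fields list; the `parse_sidecar` name and the choice to raise `ValueError` on a malformed record are assumptions for illustration.

```python
import json

SCHEMA = "red.afk.validation.v1"
STATUSES = {"passed", "failed", "skipped"}


def parse_sidecar(text):
    """Parse validation.jsonl content into a list of validated records."""
    records = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # tolerate blank lines
        record = json.loads(raw)
        if record.get("schema") != SCHEMA:
            raise ValueError(f"line {lineno}: unexpected schema {record.get('schema')!r}")
        if record.get("status") not in STATUSES:
            raise ValueError(f"line {lineno}: unexpected status {record.get('status')!r}")
        if "name" not in record:
            raise ValueError(f"line {lineno}: missing name")
        records.append(record)
    return records
```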
Summary line is always worker {id} · status: {status} · duration: NmSs · diff: {diff} · attempt: K [· merge: {sha}], where {diff} is +N -M against origin/main for non-DONE statuses and the literal merged for DONE.
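The summary-line rule can be sketched as a small formatter. `envelope_summary` and its parameters are hypothetical names; the sketch follows the prose format above and therefore omits the backticks that the schema example places around the worker id.

```python
def envelope_summary(worker_id, status, duration_s, attempt, added=0, removed=0, merge_sha=None):
    """Build the envelope summary line: diff is 'merged' for done, '+N -M' otherwise."""
    minutes, seconds = divmod(int(duration_s), 60)
    diff = "merged" if status == "done" else f"+{added} -{removed}"
    parts = [
        f"worker {worker_id}",
        f"status: {status}",
        f"duration: {minutes}m{seconds}s",
        f"diff: {diff}",
        f"attempt: {attempt}",
    ]
    if merge_sha:
        parts.append(f"merge: {merge_sha}")  # optional trailing field
    return " · ".join(parts)
```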
After a successful POST (any 2xx), the orchestrator sets envelope.posted: true in the iteration state file. The boot-time Orphan Cleanup reads that field to pick a TTL for preserved ready-for-human dirs: 1 day when the envelope made it to the issue (the thread carries the canonical record), 7 days when the POST failed (the local dir is the only copy of the notes/log). The field is initialised false at iteration start.
On any terminal failure (BLOCKED, no-sentinel, merge-conflict), the worker branch is pushed via SSH to origin/afk-attempts/{worker_id}/{issue}-{slug} before the envelope is posted. The envelope's data-section="diff" block then carries a clickable live-branch tree/afk/{id}/{N}-{slug} link (the live ref survives on origin after a terminal failure, so a human can git checkout it to inspect or continue, #443), the compare/main...afk-attempts/... link, and a +N -M files=K diffstat. If the push fails (network, auth, anything non-2xx), the iteration still completes — the diff section embeds only the diffstat plus the local worktree path, and a warn: line is logged. DONE iterations do not push to afk-attempts/ (the merge commit on main is the diff). Local branch cleanup (git branch -d) only deletes the local ref; the remote afk-attempts/ ref stays alive for forensics, with no retention policy in this slice (branch sprawl is acknowledged and deferred — see PRD #2 Out of Scope).
The Slice D heartbeat-glyph cleanup has landed — there is no periodic :one: :two: … traffic on the issue thread to defer or replace.
Stage Detection
Inner agent stages, detected from the runner's stdout stream:
| stage | signal |
|---|---|
| setup | first output line |
| explore | git ls-files, find, repeated Read |
| impl | first Edit/Write call |
| tests | pnpm test invocation |
| commit | git commit invocation |
| merge | orchestrator stage, post-inner |
| push | orchestrator stage |
| close | orchestrator stage |
Each transition writes to state file. The monitor renders the current stage.
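A stream classifier for the inner-agent stages in the table might be sketched as below. The regexes are illustrative guesses at what the signals look like in the stream, a single Read is treated as an explore signal (the table says "repeated Read"), and the forward-only rule is an assumption; none of this is taken from the bundle.

```python
import re

ORDER = ["setup", "explore", "impl", "tests", "commit"]

# Checked from the most advanced stage down, so the strongest signal wins.
SIGNALS = [
    ("commit", re.compile(r"\bgit commit\b")),
    ("tests", re.compile(r"\bpnpm test\b")),
    ("impl", re.compile(r"\b(Edit|Write)\b")),
    ("explore", re.compile(r"\bgit ls-files\b|\bfind\b|\bRead\b")),
]


def next_stage(current, line):
    """Return the stage after seeing `line`; the first output line always yields setup."""
    if current is None:
        return "setup"
    for stage, pattern in SIGNALS:
        if pattern.search(line):
            # Assumption: stages only move forward within one attempt.
            return stage if ORDER.index(stage) > ORDER.index(current) else current
    return current
```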
Live Header
Redraw every 3 s on the controlling TTY, top of the scroll buffer. Use tput sc; tput cup 0 0; …; tput rc so the inner agent's stream below stays intact.
┌─ /afk ────────────────────────────────────────────────────┐
│ runner: codex elapsed: 00:14:23 eta: ~01:20:00 │
│ done: 3 / 12 (25%) blocked: 0 merged: 3 │
│ │
│ ▶ #142 wire OAuth callback │
│ worktree: .red/tmp/workers/wZ2R4/142-a1/worktree │
│ stage: impl │
│ last: writing tests for callback handler │
│ │
│ queue: #143 #144 #145 #146 ... │
└────────────────────────────────────────────────────────────┘
If stdout is not a TTY (CI, piped log), skip header rendering and print one JSON line per state transition to stderr.
State File
Path: .red/tmp/workers/{id}/{N}-a{n}/afk.state.json — one snapshot per (worker, issue, attempt). Schema:
{
"version": 1,
"worker_id": "wZ2R4",
"pid": 12340,
"log": ".red/tmp/workers/wZ2R4/142-a1/afk.log",
"started_at": "2026-05-16T12:00:00-03:00",
"runner": "codex",
"filter": { "kind": "prd|issues|all", "value": "42" },
"total": 12,
"done": 3,
"failed": 0,
"blocked": 0,
"completed": [139, 140, 141],
"queue": [143, 144, 145, 146],
"current": {
"number": 142,
"title": "wire OAuth callback",
"slug": "wire-oauth-callback",
"worktree": ".red/tmp/workers/wZ2R4/142-a1/worktree",
"handoff": ".red/tmp/workers/wZ2R4/142-a1/handoff.md",
"started_at": "2026-05-16T12:14:00-03:00",
"stage": "impl",
"heartbeat_glyph": null,
"heartbeat_pid": null,
"runner": "codex",
"retries": 0,
"last_stream_line": "writing tests for callback handler"
},
"durations_seconds": [820, 940, 760],
"envelope": { "posted": false }
}
Atomic write: write to afk.state.json.tmp inside the attempt directory, mv over the original. /afk monitor and any other reader open it read-only. Between issues the worker has no live state file — monitor renders that as "idle".
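The write-to-temp-then-rename pattern described above can be sketched as follows. `write_state_atomic` is a hypothetical name; `os.replace` stands in for `mv` and is atomic when both paths are on the same filesystem, which holds here because the temp file lives inside the attempt directory.

```python
import json
import os


def write_state_atomic(attempt_dir, state):
    """Write afk.state.json so that readers never observe a half-written file."""
    final_path = os.path.join(attempt_dir, "afk.state.json")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, final_path)  # atomic rename over the original
    return final_path
```

A reader that opens the file at any moment sees either the previous snapshot or the new one, never a partial write.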
Auto-Monitor Loop (Claude Code only — binding)
When /afk is invoked to spawn a worker (i.e., not the monitor subcommand), the agent additionally schedules a recurring /dev:afk monitor cron inside the current Claude Code session so the user sees progress without re-typing. Death of every worker auto-cancels the cron.
Setup (runs immediately after the run worker is launched in the background):
1. Fetch CronCreate and CronList via ToolSearch if not already loaded (they are deferred tools).
2. CronList — if any existing job has prompt == "/dev:afk monitor", skip step 3 (don't double-schedule when the user runs a second parallel /afk in the same session).
3. CronCreate(cron="*/10 * * * *", prompt="/dev:afk monitor", recurring=true). The cron is session-only — it dies when the Claude Code session ends, so no risk of orphans across sessions. Auto-expires after 7 days regardless.
4. Tell the user one line: monitor loop scheduled (every 10 min) — auto-cancels when all workers exit.
The monitor invocation handles its own teardown — see Self-Cancel under the Monitor section below.
Skip the auto-loop when:
- The invocation is /afk monitor (not a worker spawn).
- The invocation is /afk --once (single supervised iteration; user is already watching).
- CronCreate is unavailable (not running under Claude Code — e.g. Codex). Print one line, "monitor loop unavailable in this runner; tail .red/tmp/workers/*/*/afk.log manually.", and continue.
Fleet Mode (runner-portable — binding)
/dev:afk fleet [N] and /dev:afk fleet stop are the user-facing fleet commands. They let one terminal command spin up (or shut down) N concurrent run workers on the current checkout, with the supervisor handling:
- Respawn.
- The circuit breaker.
- The passive stall detector. It samples each slot's per-attempt agent lane agent.log.jsonl mtime — the clean liveness signal — every RED_AFK_STALL_POLL_S=30s and flags any slot alive ≥ RED_AFK_STALL_THRESHOLD_S=600 whose agent lane has been idle ≥ the same; a flagged slot surfaces as ⏸️ stalled in /dev:afk monitor. It keys off the agent lane, never afk.log/log.jsonl, because the orchestrator heartbeat writes those every minute and would mask a real stall — the masking that defeated detection in #243.
- The hard stall reaper. A slot silent on the agent lane past RED_AFK_STALL_KILL_THRESHOLD_S=1800 is only a candidate: the irreversible kill is gated behind a reaper-signal predicate. A worker mid-build/test (an active vitest/tsc/cargo/… descendant under its tree, or non-trivial aggregate cpu) is busy and left alone. A genuinely stuck worker (idle past the threshold, no active descendant, flat cpu) is killed tree-wide, a data-attempt-status="no-sentinel" envelope is posted with the attempt-dir afk.log tail, the issue label is rotated back to ready-for-agent, the worktree + attempt dir are removed, and the slot is freed for the next health-check respawn. RED_AFK_STALL_KILL_THRESHOLD_S must be strictly greater than RED_AFK_STALL_THRESHOLD_S, validated at supervisor boot.
- Per-slot build isolation.
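The relationship between the two stall thresholds can be sketched as a pure decision function. The names are hypothetical, `busy` stands in for the reaper-signal predicate (active build/test descendant or non-trivial cpu), and reading "past" the kill threshold as strictly greater is an assumption.

```python
def validate_thresholds(stall_s=600, kill_s=1800):
    """Boot-time check: the kill threshold must be strictly greater than the stall threshold."""
    if kill_s <= stall_s:
        raise ValueError("RED_AFK_STALL_KILL_THRESHOLD_S must exceed RED_AFK_STALL_THRESHOLD_S")


def classify_slot(alive_s, idle_s, busy, stall_s=600, kill_s=1800):
    """Return 'ok', 'stalled' (flagged only), or 'reap' (kill candidate that is not busy)."""
    if idle_s > kill_s and not busy:
        return "reap"
    if alive_s >= stall_s and idle_s >= stall_s:
        return "stalled"
    return "ok"
```

A busy worker that has been silent past the kill threshold is still reported as stalled; only the irreversible kill is withheld.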
Worker env passthrough. Any RED_AFK_* variable exported in the operator's shell before /dev:afk fleet is auto-forwarded to every worker the supervisor spawns. Use this for worker-side toggles like RED_AFK_SKIP_PERF=1 or RED_AFK_SKIP_COMPETITIVE_BASELINE=1 without writing a hook. Internal supervisor knobs (RED_AFK_TARGET, RED_AFK_POLL_S, RED_AFK_STALL_*, RED_AFK_CIRCUIT_*, RED_AFK_RUNNER, RED_AFK_REQUEST, RED_AFK_PLUGIN_DIR) and the per-slot *_BASE build-isolation vars are excluded — they have dedicated wiring and the supervisor denylists them from passthrough. The supervisor re-pins RED_AFK_RUNNER=<runner> for each worker.
$ export RED_AFK_SKIP_PERF=1
$ export RED_AFK_SKIP_COMPETITIVE_BASELINE=1
$ /dev:afk fleet 1
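The passthrough rule amounts to an allow-by-prefix, deny-by-name filter. The sketch below is illustrative only: `passthrough_env` is a hypothetical name, and treating every RED_AFK_* variable whose name ends in _BASE as a build-isolation var is an assumption about how the *_BASE vars are named.

```python
DENY_EXACT = {
    "RED_AFK_TARGET",
    "RED_AFK_POLL_S",
    "RED_AFK_RUNNER",
    "RED_AFK_REQUEST",
    "RED_AFK_PLUGIN_DIR",
}
DENY_PREFIXES = ("RED_AFK_STALL_", "RED_AFK_CIRCUIT_")


def passthrough_env(env):
    """Select the operator's RED_AFK_* variables that are forwarded to workers."""
    forwarded = {}
    for key, value in env.items():
        if not key.startswith("RED_AFK_"):
            continue
        if key in DENY_EXACT or key.startswith(DENY_PREFIXES) or key.endswith("_BASE"):
            continue  # supervisor-internal knob with dedicated wiring
        forwarded[key] = value
    return forwarded
```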
Fleet mode is runner-portable: the supervisor is plain process orchestration, not a Claude Code primitive. Claude Code, Codex, and bare terminals may all launch and stop the supervisor when the normal AFK hard preconditions pass. Runner-specific observability degrades independently:
- Claude Code: schedule the auto-monitor cron when CronCreate/CronList are available; if not, launch fleet anyway and print monitor loop unavailable in this runner; run /dev:afk monitor or tail .red/tmp/afk-supervisor.log manually.
- Codex: launch fleet with RED_AFK_RUNNER=codex, skip cron, and spawn one read-only Codex monitor agent when a sub-agent primitive is available. If no sub-agent primitive is available, launch fleet anyway and print the same manual-monitor guidance.
- Bare terminal / unknown runner: launch fleet, skip cron/native monitor, and print the manual-monitor guidance.
/dev:afk fleet [N] — launch
N is optional and defaults to 2. Parse it as a non-negative integer; reject anything else (including stop, which is the other subcommand and routes below). Steps the agent must perform, in order:
1. Resolve runner. Determine the active runner using the same intent as the normal AFK cascade: explicit user --runner if present, else RED_AFK_RUNNER, else runner env/process/path signals, else claude. The resolved value is carried into the supervisor as RED_AFK_RUNNER=<runner> so detached workers do not fall through to the supervisor's historical claude fallback. Under Codex, this must resolve to codex.
2. PID-file pre-check. Read .red/tmp/afk-supervisor.pid. If it exists and kill -0 <pid> succeeds, refuse the launch:
   ✗ fleet already running (supervisor pid=<pid>, log .red/tmp/afk-supervisor.log).
   to stop it: /dev:afk fleet stop
   Do not touch the file or attempt to recover. A stale PID file (file exists but kill -0 fails) is left alone — the fleet command clears it itself when it acquires the supervisor lock.
3. Launch the fleet. From the project root, run the bundle's fleet command with the target and any flags:
   RED_AFK_RUNNER=<runner> node "$CLAUDE_PLUGIN_ROOT/skills/engineering/afk/bin/afk.mjs" fleet <N> [--request <text>]
   The command performs the PID-file pre-check from step 2 itself (refusing if a live supervisor already runs), detaches the supervisor, and forwards the resolved runner and the --request/-r text to every worker it spawns. It waits up to 3 s for .red/tmp/afk-supervisor.pid to appear and contain a live PID, then prints the launched supervisor PID and target; on failure it reports the tail of .red/tmp/afk-supervisor.log. Capture the reported PID for the Report back step. The launched supervisor is the native __supervise entrypoint of the same bundle.
4. Attach the best available monitor surface.
   - Claude Code: same flow as Auto-Monitor Loop — CronList first to deduplicate, then CronCreate(cron="*/10 * * * *", prompt="/dev:afk monitor", recurring=true). If cron tools are unavailable, skip and use the manual-monitor line.
   - Codex: fetch a sub-agent spawn primitive via ToolSearch (query: spawn agent background monitor). If available, spawn exactly one read-only Codex monitor agent for this newly-launched supervisor. Its task: from the project root, periodically run /dev:afk monitor --once (the bundle's monitor --once), report concise progress, and auto-close when .red/tmp/afk-supervisor.pid is missing/dead and no [live] workers remain. It must never edit files, claim issues, stop workers, or run merges. The user may close it manually; workers continue. If the primitive is unavailable, skip and use the manual-monitor line.
   - Bare/unknown: skip native monitor setup and use the manual-monitor line.
5. Report back. Print:
   🚀 fleet launched (supervisor pid=<pid>, target=<N>)
   log: .red/tmp/afk-supervisor.log
   stop: /dev:afk fleet stop
   <monitor-status-line>
   Monitor status line choices:
   - Claude cron scheduled: monitor loop scheduled (every 10 min) — auto-cancels when all workers exit.
   - Claude cron already existed: monitor loop already running (existing cron <id>).
   - Codex monitor agent spawned: Codex monitor agent spawned — auto-closes when fleet exits; manual monitor: /dev:afk monitor.
   - Native monitor unavailable: monitor loop unavailable in this runner; run /dev:afk monitor or tail .red/tmp/afk-supervisor.log manually.
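The PID-file pre-check boils down to a three-way classification of the PID file. The sketch below is a POSIX-only illustration: `supervisor_state` is a hypothetical helper, and `os.kill(pid, 0)` plays the role of `kill -0` (signal 0 checks for existence without delivering a signal).

```python
import os


def supervisor_state(pid_file):
    """Classify the supervisor PID file as ('missing' | 'stale' | 'live', pid)."""
    try:
        with open(pid_file, encoding="utf-8") as handle:
            pid = int(handle.read().strip())
    except FileNotFoundError:
        return ("missing", None)
    except ValueError:
        return ("stale", None)  # unreadable contents cannot name a live process
    if pid <= 0:
        return ("stale", None)  # 0 and negatives address process groups, not one pid
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return ("stale", pid)
    except PermissionError:
        return ("live", pid)  # the process exists but belongs to another user
    return ("live", pid)
```

A launch would refuse on "live" and leave a "stale" file untouched, as the pre-check step describes.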
/dev:afk fleet stop — graceful shutdown
Steps, in order:
1. Liveness check. Read .red/tmp/afk-supervisor.pid. The three cases:
   - File missing → print "no fleet running." and continue to step 3 (still try runner-specific monitor teardown).
   - File present but kill -0 fails → stale. Print "no fleet running (stale pid file at .red/tmp/afk-supervisor.pid — cleaning).", rm -f it, and continue to step 3.
   - File present and PID alive → continue to step 2.
2. Touch the stop file: touch .red/tmp/afk-supervisor.stop. The supervisor's health-check cycle (default RED_AFK_POLL_S=15s) picks it up and runs cleanup, which SIGTERMs every worker, removes the PID file, removes the stop file, and exits. Wait up to 30 s for the PID file to disappear (poll every 1 s, deadline-bounded — never a bare while loop). If it's gone, print "🛑 fleet stopped (supervisor pid=<pid> exited)."; if the deadline trips, print one warning line naming the PID and the log path, and continue to step 3 anyway — the stop file is still there and the supervisor will pick it up eventually.
3. Tear down runner-specific monitors.
   - Claude Code: CronList → find every job whose prompt == "/dev:afk monitor" (there will normally be one, possibly zero, occasionally more if the user manually /loop-ed). CronDelete each. Print one line: "auto-monitor cron cancelled (<count> entr{y,ies})." (or "no auto-monitor cron to cancel." when count is zero). If cron tools are unavailable, print "auto-monitor cron unavailable in this runner; skipped."
   - Codex: do not stop workers through the monitor agent. It auto-closes when it observes no supervisor/live workers, and the user may close it manually. Print "Codex monitor agent will self-close when it observes fleet stopped."
   - Bare/unknown: print "no native monitor teardown for this runner."
4. Idempotency. Re-running /dev:afk fleet stop after a successful stop just hits the "file missing" branch in step 1 and the runner-specific teardown no-op in step 3. Exit 0 either way.
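The deadline-bounded wait for the PID file to disappear can be sketched like this. `wait_until_gone` is a hypothetical name; the defaults mirror the 30 s deadline and 1 s poll interval from the stop steps.

```python
import os
import time


def wait_until_gone(path, deadline_s=30.0, poll_s=1.0):
    """Poll until `path` disappears or the deadline passes; return True when it is gone."""
    deadline = time.monotonic() + deadline_s
    while time.monotonic() < deadline:  # bounded: the loop cannot outlive the deadline
        if not os.path.exists(path):
            return True
        time.sleep(poll_s)
    return not os.path.exists(path)  # one final check at the deadline
```

Using `time.monotonic()` keeps the deadline immune to wall-clock adjustments during the wait.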
Circuit Trip Sweep
When the circuit breaker parks a slot (CIRCUIT_K fast deaths inside CIRCUIT_WINDOW_S) the supervisor — not a human — runs sweep_parked_slot to clean up after the burned workers. Three actions, in order, gated on the trip:
- Sweep affected attempt dirs. From the slot log (afk-supervisor-slot-{slot}.log) the supervisor parses every [afk] worker: w… boot stamp emitted while the slot was alive, globs .red/tmp/workers/{wid}/*/ for each ID, and reads afk.state.json's .current.number to identify the affected issues. Each attempt dir is rm -rf'd after its issue has been processed.
- Post a discard envelope on each affected issue. Same <details data-attempt-status="…"> schema as the per-issue terminal envelope, with status="discarded" and a summary line that names the runner and the trip cause (runner-broken, slot parked after K fast deaths). The envelope's data-section="summary" block carries the slot index, comma-joined worker IDs, fast-death count, and the supervisor log path. No notes, drop, or log sections — the attempts produced no usable artefacts.
- Restore label state on each affected issue. Single gh issue edit adds ready-for-agent and runner-error, removes ready-for-human and (defensively) running — covers both the "issue had already been promoted to ready-for-human" path and the "issue was still running at the moment of trip" path.
The runner-error label is created idempotently by /setup-red-skills (see triage-labels.md). The supervisor still calls gh label create runner-error on the fly during a trip so cleanup never fails just because the label is missing.
Idempotency: SLOT_SWEPT[slot]=1 blocks a second sweep within the same supervisor lifetime. Across restarts a new trip yields fresh worker IDs and fresh attempt dirs, so re-tripping never re-touches the previously swept issues. A trip that finds no claimed issues (all workers exited before claiming) parks the slot but posts no envelopes — the attempt-dir sweep is a no-op.
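The trip condition, K fast deaths inside a time window, is a sliding-window counter. The sketch below is a hedged illustration: the class and method names are hypothetical, the text gives no default values for CIRCUIT_K or CIRCUIT_WINDOW_S so both are required parameters, and what qualifies a death as "fast" is left to the caller.

```python
from collections import deque


class CircuitBreaker:
    """Trips when `k` fast deaths land inside a sliding window of `window_s` seconds."""

    def __init__(self, k, window_s):
        self.k = k
        self.window_s = window_s
        self.deaths = deque()

    def record_fast_death(self, now_s):
        """Record one fast death at time `now_s`; return True when the slot should be parked."""
        self.deaths.append(now_s)
        # Drop deaths that have aged out of the window.
        while self.deaths and now_s - self.deaths[0] > self.window_s:
            self.deaths.popleft()
        return len(self.deaths) >= self.k
```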
Refs
- The bundle's fleet / fleet stop commands — the entrypoints this section drives. Stop-file path, env contract, circuit breaker, and trip-sweep are part of the supervisor behaviour described above.
- Auto-Monitor Loop above — the cron lifecycle Fleet Mode hooks into.
- Self-Cancel under Monitor — the dual teardown path (cron tears itself down when no workers remain; fleet stop tears it down immediately).
Monitor
BINDING — every monitor tick must do BOTH of the following, in order. No shortcuts.
- Render the dashboard (the bundle's monitor --once).
- Mirror live workers onto the host runner's native task surface. Per-runner mapping:
  - Claude Code: pipe the tracked-task JSONL into the bundle's monitor --mirror-plan and apply the emitted call plan via TaskCreate (one task per live worker, titled #<n> w<id> — <title>) and TaskUpdate (description carries stage:<x>, terminal events flip state to completed/failed). See Task Mirror below for the full protocol.
  - Codex: run monitor --mirror-plan --runner codex. Today Codex exposes no native task surface, so the sink emits an empty plan and the mirror falls back to the dashboard plus a one-line notice — that is the mirror under Codex; do not silently skip. If Codex grows a native surface, the sink emits the same call-plan descriptors against it.
  - Bare terminal / unknown runner: skip the mirror silently — the monitor dashboard is the canonical view.
The mirror is the only way the user sees per-worker progress advance in their native UI. Skipping it (because "nothing changed" or "just answering a status question") is a bug, not a shortcut — monitor --mirror-plan is idempotent and emits zero descriptors when nothing changed.
/afk monitor is the read-only aggregated view across all live workers. Run the bundle's monitor command — do not reinvent the rendering in inline bash. It:
- Globs .red/tmp/workers/*/*/afk.state.json and renders one section per active attempt.
- Verifies liveness via the orchestrator PID recorded in afk.state.json (.pid field) — attempts whose PID is dead are flagged stale/gone and excluded, not counted as running.
- Optionally tails the sibling afk.log for the most recent line under each worker's header.
- Renders the 48h sparkline header (next subsection) on every refresh.
To invoke, from the project root:
RED_AFK_RUNNER=<runner> node "$CLAUDE_PLUGIN_ROOT/skills/engineering/afk/bin/afk.mjs" monitor
The command has two modes, auto-selected by stdout type:
- TTY (real terminal): full box-drawing layout, refreshes every 3 s, clear between frames. Ctrl-C to exit.
- Non-TTY (piped, captured by an agent, redirected): one-shot compact dashboard — one sparkline header + one line per worker, then exit 0. Force this with --once or RED_AFK_MONITOR_COMPACT=1 even from a TTY.
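The mode selection described above reduces to a small decision function. The sketch below is illustrative only; the function name and the "tty"/"compact" labels are hypothetical, and the precedence of the flag over the environment variable is an assumption (both force the same mode, so the order does not change the result).

```python
def select_monitor_mode(stdout_is_tty, argv, env):
    """Return "compact" for the one-shot dashboard, "tty" for the refreshing layout."""
    if "--once" in argv:
        return "compact"  # explicit flag always wins
    if env.get("RED_AFK_MONITOR_COMPACT") == "1":
        return "compact"  # env override forces compact even from a TTY
    # No explicit choice: infer from the stdout type.
    return "tty" if stdout_is_tty else "compact"
```

A caller would typically pass `sys.stdout.isatty()`, `sys.argv[1:]`, and `os.environ`.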
Compact output shape (≈3 lines total for 2 workers — fits inline without truncation in an agent transcript):
48h: ···············································█ (4 closed, peak 4/h, all workers) Δ fleet +382 -45
wZ2R4 [live] claude issues 4/5 #150 [blog/D] Agent SDK on RedDB stage:impl 00:23:01 +382 -45
wK7M2 [live] codex issues 0/16 idle +0 -0
The progress counter is issues <done>/<total> — issues closed over the queue total, not a completion percentage (the old (80%) form read as "no work done" while a worker had already committed thousands of lines). The real volume signal is the +A -R diff (committed + uncommitted, measured from the branch's merge-base with origin/main), which is rendered on every worker line unconditionally — idle and +0 -0 included — and summed across the fleet into the Δ fleet +A -R suffix on the sparkline header, so the total diff volume is always visible at a glance.
When invoking from inside another agent session (Claude Code, Codex), prefer --once even if stdout is already a pipe: explicit beats inference. Don't use the full TTY mode in agent transcripts; the 3 s refresh loop floods the captured stream and gets truncated to garbage.
Single-worker operation shows one section/line. Multi-worker adds one section/line per live worker, sorted by started_at. The sparkline aggregates all workers in this checkout's .red/state/afk-history.jsonl — not fractured per-worker; the Δ fleet diff total likewise sums every worker.
The header of every render shows a 48h sparkline of issues closed, one glyph per hour, scaled to the peak hour:
48h: ·▁··▁·▁·▁··█▁▁··▁·▁···▁·▁·▆▁▁··▁···▁▆·▁··▁▃▁·▃▁· (35 closed, peak 5/h)
Source data: .red/state/afk-history.jsonl, an append-only event log written by the orchestrator on every terminal event:
{"ts":"2026-05-17T12:14:00-03:00","epoch":1779030840,"worker":"wK7M2","issue":571,"event":"done","duration_s":816,"runner":"codex","merge_sha":"0936ba54"}
{"ts":"...","epoch":...,"worker":"wK7M2","issue":569,"event":"blocked","duration_s":120,"runner":"codex","reason":"merge-conflict"}
{"ts":"...","epoch":...,"worker":"wK7M2","issue":568,"event":"exhausted","duration_s":0,"runner":"claude","reason":"both-runners"}
.red/state/ is gitignored. The orchestrator creates it during bootstrap, parallel workers serialise appends via flock, and the boot-time orphan sweep truncates the file to the last 10000 lines if it grows past that cap.
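A minimal sketch of the serialised append and the cap-based truncation, assuming a POSIX host. The function names are hypothetical, and `fcntl.flock` stands in for whatever locking the orchestrator actually uses (the source only says appends are serialised via flock).

```python
import fcntl
import json
import os


def append_history_event(path, event):
    """Append one JSON line under an exclusive advisory lock."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    line = json.dumps(event, separators=(",", ":")) + "\n"
    with open(path, "a", encoding="utf-8") as fh:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)  # blocks until other writers finish
        try:
            fh.write(line)
            fh.flush()
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)


def truncate_history(path, cap=10000):
    """Keep only the last `cap` lines; return how many lines were dropped."""
    with open(path, "r+", encoding="utf-8") as fh:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)
        try:
            lines = fh.readlines()
            if len(lines) <= cap:
                return 0
            fh.seek(0)
            fh.writelines(lines[-cap:])
            fh.truncate()  # cut the file at the end of the rewritten tail
            return len(lines) - cap
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)
```

Holding the same lock during truncation keeps a concurrent append from landing in the middle of the rewrite.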
The sparkline only counts event == "done". Blockers and exhausted runs are recorded for forensics but excluded from the throughput view.
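The throughput view can be approximated as below. This is a sketch under stated assumptions: the eight-step glyph ramp and the rounding rule are guesses (the source only shows a few glyphs and says the scale follows the peak hour), and both function names are hypothetical.

```python
import json

GLYPHS = "▁▂▃▄▅▆▇█"  # assumed ramp; the source examples show only some of these


def hourly_done_counts(lines, now_epoch, hours=48):
    """Bucket `done` events into one-hour bins, oldest first."""
    counts = [0] * hours
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        event = json.loads(raw)
        if event.get("event") != "done":
            continue  # blocked / exhausted stay in the log for forensics only
        hours_ago = (now_epoch - event["epoch"]) // 3600
        if 0 <= hours_ago < hours:
            counts[hours - 1 - hours_ago] += 1
    return counts


def render_sparkline(counts):
    """Render one glyph per hour, scaled so the peak hour is the tallest glyph."""
    peak = max(counts, default=0)
    cells = []
    for n in counts:
        if n == 0:
            cells.append("·")
        else:
            cells.append(GLYPHS[(n * len(GLYPHS) - 1) // peak])
    return f"{len(counts)}h: {''.join(cells)} ({sum(counts)} closed, peak {peak}/h)"
```

Filtering on `event == "done"` before bucketing is what keeps blockers and exhausted runs out of the header.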
Self-Cancel (binding when invoked under Claude Code)
Every /afk monitor run — whether typed by the user or fired by the auto-monitor cron — is responsible for tearing down the cron once there's nothing left to watch.
After rendering the dashboard, the agent must:
- Count workers with status [live] in the rendered output (i.e., orchestrator pid alive, post-orphan-cleanup).
- If live_workers == 0:
  - Fetch CronList and CronDelete via ToolSearch if not already loaded.
  - CronList — find every job with prompt == "/dev:afk monitor". There will normally be exactly one; multiples can appear if the user manually invoked /loop 3m /dev:afk monitor on top of the auto-loop.
  - CronDelete each match.
  - Append one line to the user-facing output:
    🛑 no live workers — auto-cancelled monitor loop (cron <id>).
- If live_workers >= 1: do nothing. The cron continues firing every 10 minutes.
When CronList / CronDelete are unavailable (Codex runner, or /afk monitor invoked outside Claude Code), skip the teardown silently — the cron infrastructure isn't running there to begin with.
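The teardown decision can be sketched as a pure function. The cron-job record shape (`id`, `prompt`) is hypothetical, since the source does not describe what CronList returns, and counting `[live]` by substring is a simplification of reading the rendered dashboard.

```python
MONITOR_PROMPT = "/dev:afk monitor"


def count_live_workers(dashboard):
    """Count dashboard lines that carry the [live] status tag."""
    return sum(1 for line in dashboard.splitlines() if "[live]" in line)


def crons_to_cancel(dashboard, cron_jobs):
    """Return the ids of monitor crons to delete, or [] while any worker is live."""
    if count_live_workers(dashboard) >= 1:
        return []  # something is still running: leave the loop alone
    return [job["id"] for job in cron_jobs if job.get("prompt") == MONITOR_PROMPT]
```

Returning every matching id, rather than the first, covers the case where a manual loop was stacked on top of the auto-loop.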
Task Mirror And Codex Monitor Agent (binding)
Every /dev:afk monitor run also mirrors each live worker onto the runner's native task list when that runner exposes one, so a /afk session surfaces progress on the host's native UI — advancing through stages on its own, with no extra typing. This is a read-only reflection of afk.state.json; the mirror never writes state and never touches the orchestration.
The mirror is a pure diff: it reconciles the live worker state files against the tasks already on the native surface and emits a call plan. After rendering the dashboard, the agent (under Claude Code only) must:
1. Fetch TaskCreate, TaskUpdate, and TaskList via ToolSearch if not already loaded (deferred tools).
2. Build the tracked set. TaskList → keep the mirror-owned tasks (those whose title matches #<n> w<id> — …). For each, emit one JSONL line {"key":"<worker_id>:<issue>","stage":"<last stage>"}, reading the key from the title and the stage from the description (stage: <x>). Keep a key→task_id map for step 4.
3. Compute the plan. Pipe the tracked JSONL from step 2 into the bundle's monitor --mirror-plan subcommand:
printf '%s\n' "$tracked" | node "$CLAUDE_PLUGIN_ROOT/skills/engineering/afk/bin/afk.mjs" monitor --mirror-plan
The command globs the state files and reconciles them against the tracked set on stdin (keyed by worker_id:issue, so parallel workers each get exactly one task and re-runs never duplicate), then prints a JSONL call plan to stdout, one descriptor per harness call (empty stdin → cold reconcile; empty plan → no output):
{"call":"TaskCreate","key":"wAAAA:22","title":"#22 wAAAA — extract state.sh","description":"stage: impl","state":"in_progress"}
{"call":"TaskUpdate","key":"wAAAA:22","description":"stage: tests","state":"in_progress"}
{"call":"TaskUpdate","key":"wAAAA:22","state":"completed"}
4. Apply the plan. For each descriptor in order:
   - TaskCreate → create the task; record key → task_id.
   - TaskUpdate → resolve key to its task_id via the map and update. A state of completed/failed marks the worker's terminal event (done/blocked); the task drops off the active list and the mirror self-cleans. A descriptor whose key has no known task_id (e.g. a completed update for a task that was never created in this session) is skipped.
An empty plan means nothing changed since the last tick — apply no calls. Because the plan is keyed by worker_id:issue, an idempotent re-run with no stage advance emits zero descriptors.
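Steps 2 and 4 of the protocol above can be sketched as follows. This is an illustration, not the harness API: the task record shape (`id`, `title`, `description`) is hypothetical, and `task_create` / `task_update` are caller-supplied stand-ins for the real TaskCreate and TaskUpdate tools.

```python
import json
import re

# Mirror-owned titles look like "#22 wAAAA — extract state.sh".
TITLE_RE = re.compile(r"^#(?P<issue>\d+) (?P<worker>w\w+) — ")
STAGE_RE = re.compile(r"stage:\s*(?P<stage>\S+)")


def build_tracked(tasks):
    """Step 2: derive the tracked JSONL and the key -> task_id map."""
    lines, key_to_id = [], {}
    for task in tasks:
        m = TITLE_RE.match(task["title"])
        if not m:
            continue  # not a mirror-owned task
        key = f"{m['worker']}:{m['issue']}"
        stage = STAGE_RE.search(task.get("description", ""))
        lines.append(json.dumps({"key": key, "stage": stage["stage"] if stage else ""}))
        key_to_id[key] = task["id"]
    return "\n".join(lines), key_to_id


def apply_plan(plan_jsonl, key_to_id, task_create, task_update):
    """Step 4: apply descriptors in order; return the number of calls made."""
    calls = 0
    for raw in plan_jsonl.splitlines():
        if not raw.strip():
            continue
        d = json.loads(raw)
        if d["call"] == "TaskCreate":
            key_to_id[d["key"]] = task_create(d["title"], d.get("description", ""), d.get("state"))
            calls += 1
        elif d["call"] == "TaskUpdate":
            task_id = key_to_id.get(d["key"])
            if task_id is None:
                continue  # unknown key: skipped, as the protocol requires
            task_update(task_id, d.get("description"), d.get("state"))
            calls += 1
    return calls
```

An empty plan string yields zero calls, which matches the idempotent re-run case.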
Re-hydration on session reopen. A native task dies with the Claude Code session; the nohup AFK worker does not. When a session opens with workers still running, TaskList (step 2) returns no mirror-owned tasks, so the tracked set is empty and monitor --mirror-plan reconciles cold — emitting a TaskCreate for every live worker. The status bar recovers the per-worker tasks with no operator action. This is the same path as steady-state, not a new one: only workers whose orchestrator PID (the .pid field in afk.state.json, via state_is_live) is alive re-hydrate (dead workers are untracked-terminal on a cold tick → no ghost task), and the next tick is idempotent because the freshly-created tasks now form the tracked set.
When TaskCreate / TaskUpdate are unavailable because the session is outside any runner (a bare terminal), skip the mirror silently — there is no native surface to drive, and the monitor dashboard is already the canonical view.
Codex sink (runner-specific — binding). The mirror is per-runner, mirroring the runner-claude.md / runner-codex.md split (ADR 0003). Under Codex the state reader and plan reconciler are reused unchanged — only the sink differs. After rendering the dashboard, the Codex agent runs monitor --mirror-plan --runner codex instead of the Claude TaskCreate/TaskUpdate loop:
- If Codex grows a native background-task surface, the sink emits the same call-plan descriptors the Claude sink applies — apply them against the Codex primitive.
- Otherwise (today's reality), --runner codex emits an empty plan, so the mirror falls back to the monitor dashboard and a one-line notice. No native calls are emitted, so there is no half-rendered state, and a dashboard hiccup is swallowed so the tick never crashes.
Codex monitor agent (fleet-specific — binding). Codex has a native sub-agent UI even though it does not expose the Claude-style TaskCreate/TaskUpdate task API. When /dev:afk fleet N launches a new supervisor under Codex, the agent should spawn exactly one read-only Codex monitor agent when the sub-agent primitive is available. That monitor agent periodically runs /dev:afk monitor --once, reports concise progress, and exits once no supervisor or live workers remain. It is a presentation consumer only: it must not edit files, stop workers, claim issues, or merge anything. Closing it manually must not affect the fleet.
Do not invent a cross-runner task abstraction (rejected in ADR 0003) — keep the adapter explicitly per-runner.
Handoff File Template
.red/tmp/workers/{id}/{N}-a{n}/handoff.md:
Top-level content is XML elements (not markdown headers) so the inner agent
cannot confuse the issue body with comments, or human direction with
orchestrator audits. Markdown sections like ## Agent brief, ## Acceptance,
## Refs, and ## Suggested Skills live inside the <issue-body> element
(they are part of the issue body verbatim).
# Issue #{N} — {title} [AFK]
source: {gh-url}
prd: {prd-url-or-issue-ref} # omit if none
runner: {claude|codex}
started: {iso8601}
attempt: {1..}
<issue-body>
{issue body verbatim — includes the `## Agent brief`, `## Acceptance`, `## Refs`,
and `## Suggested Skills` markdown sections written by /triage}
</issue-body>
<previous-attempts> <!-- omitted when empty -->
<previous-attempt n="1" status="blocked" worker="wXXXX" duration="0m50s" branch="afk-attempts/wXXXX/N-slug">
<notes>
{inner agent's appended notes from prior attempt}
</notes>
<log>
{tail of prior attempt's stdout, if captured}
</log>
</previous-attempt>
</previous-attempts>
<prior-attempt-context> <!-- omitted on a first attempt -->
prev-attempt: 1
prev-snapshot-branch: afk-attempts/wXXXX/N-slug
prev-failure-reason:
{verbatim failure.reason from the previous attempt — the envelope summary}
prev-fetched-ref: refs/afk/prior-attempt
{inspect the prior failed approach with `git log refs/afk/prior-attempt`; you
branch fresh off the base — do NOT fix-forward on it}
</prior-attempt-context>
<human-guidance-thread> <!-- omitted when empty -->
<human-guidance author="@alice" at="{iso8601}">
{verbatim content of one extracted <details data-kind="directive"> marker — one
<human-guidance> element per directive, so a single comment carrying two markers
emits two siblings with identical author/at}
</human-guidance>
</human-guidance-thread>
<thread-discussion> <!-- omitted when empty -->
<thread-discussion-entry author="@alice" at="{iso8601}">
{human comment body verbatim that carried no directive marker — advisory only,
lowest authority; orchestrator audits already filtered out by body shape}
</thread-discussion-entry>
</thread-discussion>
<agent-notes>
<!-- inner agent appends progress/blockers here across attempts -->
</agent-notes>
The handoff file follows the same minimalism as the /handoff skill — reference artifacts by path, do not duplicate their content.
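How issue comments might be split between the <human-guidance-thread> and <thread-discussion> sections of the template is sketched below. The comment record shape (`author`, `at`, `body`) and the exact marker syntax matched by the regex are assumptions, and the filtering of orchestrator audits mentioned in the template is not modelled.

```python
import re

# Assumed marker form; the real marker may carry extra attributes or a <summary>.
DIRECTIVE_RE = re.compile(r'<details data-kind="directive">(.*?)</details>', re.DOTALL)


def split_thread(comments):
    """Return (guidance_section, discussion_section); each is "" when empty."""
    guidance, discussion = [], []
    for c in comments:
        directives = DIRECTIVE_RE.findall(c["body"])
        if directives:
            # One element per directive, so two markers in one comment
            # produce two siblings with identical author/at.
            for text in directives:
                guidance.append(
                    f'<human-guidance author="{c["author"]}" at="{c["at"]}">\n'
                    f"{text.strip()}\n</human-guidance>"
                )
        else:
            discussion.append(
                f'<thread-discussion-entry author="{c["author"]}" at="{c["at"]}">\n'
                f'{c["body"]}\n</thread-discussion-entry>'
            )

    def wrap(tag, items):
        # An empty section is omitted entirely rather than emitted as an empty element.
        return f"<{tag}>\n" + "\n".join(items) + f"\n</{tag}>" if items else ""

    return wrap("human-guidance-thread", guidance), wrap("thread-discussion", discussion)
```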
Stop Conditions
- Queue drained → <promise>NO MORE TASKS</promise> → exit 0.
- -n N reached → summary + exit 0.
- Both runners exhausted → exit 75.
- Uncaught error in orchestrator → leave worktree in place, exit 1, print recovery hint. (No heartbeat sub-shell to kill since Slice D.)
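A wrapper script that launches the loop could classify the exit codes listed above roughly like this. The labels and the function name are hypothetical. Exit code 75 happens to equal the conventional sysexits EX_TEMPFAIL value, though the source does not say whether that is intentional.

```python
EXIT_LABELS = {
    0: "ok",                  # queue drained, or -n N reached
    1: "orchestrator-crash",  # worktree left in place for recovery
    75: "runners-exhausted",  # both runners exhausted
}


def classify_exit(code):
    """Map an /afk exit code to a coarse label; anything else is unknown."""
    return EXIT_LABELS.get(code, "unknown")
```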
Reporting
After every issue, print:
✓ #142 wire OAuth callback 12m 14s tests:✓ lint:✓ types:✓ build:✓ merged b3f2a91
finished 4 / 12 (33%) — next: #143
After the loop, a final block:
/afk done.
runner : codex (3 issues), claude (1 issue)
duration : 01:14:22
processed : 4 closed, 0 blocked, 0 failed
remaining : 8 still ready-for-agent
Configuration
Scalar run settings live in .red/config.yaml under the afk: key (alongside the afk.hooks block documented below). Each one has a matching RED_AFK_* env override that wins over the config value, so an E2E/CI run can pick a setting without mutating the target repo's config.
| Config key | Env override | Default | Meaning |
|---|---|---|---|
| afk.default_runner | RED_AFK_RUNNER | claude | Caller runner identity/default backend consumed before ambient sniffing. |
| afk.model | — | runner-specific | Legacy global model override. Prefer afk.models.<runner> so Codex never receives a Claude-only model. |
| afk.models.claude | — | claude-opus-4-8 | Claude Code model id. |
| afk.models.codex | — | gpt-5.5 | Codex model id. |
| afk.sandbox | RED_AFK_SANDBOX | none | Isolation backend (none \| docker \| podman, ADR 0033). |
| afk.max_iterations | RED_AFK_MAX_ITERATIONS | 12 | Sandcastle re-invocation ceiling (issue #322) — the safety cap for "the agent never emits <promise>DONE</promise> or <promise>BLOCKED</promise>". The completion sentinel is the real terminator, so a normal issue finishes in 1–3 iterations; this leaves headroom without letting repeated no-sentinel failures run for too long. A non-numeric / zero / negative value in either the env or the config is ignored (falls through to the default) so a typo can never disable the cap or pin the agent to 1. |
| afk.backpressure | — | (empty) | Ordered list of shell commands run as an extra pre-merge gate on the DONE path (issue #430, PRD #429). |
| afk.merge.wait_for_review | — | false | Merge-gate policy (ADR 0048). When false (default), the unlocked admin-merge proceeds ignoring advisory review checks (e.g. CodeRabbit) — the binding gates are drift-guard (the pre_merge hook) + in-process backpressure/feedback. When true, the unlocked landing waits for the configured review check to conclude before merging, then merges regardless of its verdict (the review stays advisory). drift-guard is a hard gate either way. |
| afk.merge.review_check | — | CodeRabbit | Name (case-insensitive substring) of the advisory review check wait_for_review polls via gh pr checks. Only consulted when afk.merge.wait_for_review is true. |
afk:
  sandbox: none
  max_iterations: 12
  backpressure:
    - npm run test
    - npm run lint
  merge:
    wait_for_review: false
    review_check: CodeRabbit
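The override rule for afk.max_iterations can be sketched as below. The function names are hypothetical. One assumption is labelled in the code: an invalid env value is taken to fall through to the config value before the default, since the source only says invalid values are ignored.

```python
def _positive_int(value):
    """Return value as a positive int, or None if non-numeric, zero, or negative."""
    try:
        n = int(str(value).strip())
    except (TypeError, ValueError):
        return None
    return n if n > 0 else None


def resolve_max_iterations(env, config, default=12):
    """Env override wins over config; invalid values are ignored."""
    candidates = (
        env.get("RED_AFK_MAX_ITERATIONS"),
        config.get("afk", {}).get("max_iterations"),
    )
    for candidate in candidates:
        if candidate is None:
            continue
        n = _positive_int(candidate)
        if n is not None:
            return n  # assumption: a bad env value falls through to config first
    return default
```

Rejecting zero and negative values up front is what keeps a typo from disabling the cap.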
Backpressure gate
afk.backpressure is an operator-declared, ordered list of shell commands that supplements the auto-derived feedback gate (it does not replace it). On a successful DONE attempt — after the scope-derived test/typecheck/lint/build feedback gate passes, before landing — AFK runs each backpressure command in order (sh -c <command>) against a checkout of the worker branch. If any command exits non-zero the merge is blocked and the issue is parked to ready-for-human with blocked:validation, exactly like a feedback failure: the failing command and its output tail land in the terminal envelope and in the red.afk.validation.v1 validation sidecar (records named backpressure:<command>). An absent or empty block is a no-op (today's behaviour). The namespaced plugins.dev.afk.backpressure location is honoured with the legacy bare afk.backpressure fallback (ADR 0042).
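A minimal sketch of the gate loop, assuming it stops at the first failing command (the source says any failure blocks the merge but does not say whether later commands still run). The function name, the "pass" status label, and the record fields other than the backpressure:<command> name are hypothetical.

```python
import subprocess


def run_backpressure(commands, cwd, tail_lines=20):
    """Run each command via `sh -c` in order; block on the first non-zero exit."""
    records = []
    for command in commands:
        proc = subprocess.run(
            ["sh", "-c", command],
            cwd=cwd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stderr in the captured tail
            text=True,
        )
        tail = "\n".join(proc.stdout.splitlines()[-tail_lines:])
        records.append({"name": f"backpressure:{command}", "rc": proc.returncode, "tail": tail})
        if proc.returncode != 0:
            return {"status": "blocked:validation", "records": records}
    return {"status": "pass", "records": records}  # an empty list is a no-op
```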
Merge-gate policy
The unlocked admin-merge (gh pr merge --admin --merge, ADR 0030) ignores advisory review checks by default — this is intentional, not an oversight. The binding gates on a landing are:
- drift-guard: the pre_merge hook, a hard gate that aborts the merge for this issue (→ ready-for-human).
- In-process backpressure / feedback: the pre-merge feedback-validation step (typecheck/tests). Per ADR 0008, only mechanism can refuse a merge.
External advisory reviewers (CodeRabbit and the like) are not binding: the worker is autonomous, and gating an autonomous loop on a human-paced external reviewer would stall the queue (ADR 0048). Opt into waiting with afk.merge.wait_for_review: true — the unlocked landing then polls afk.merge.review_check until it concludes and merges regardless of the verdict (so its comments are posted before the merge, but the review never blocks the land). The wait is fail-open: a reviewer that never registers or never concludes within the poll budget does not wedge the landing.
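The fail-open wait can be sketched as a bounded poll. Everything here is illustrative: the budget and interval values, the function names, and the `check_concluded` callable (a stand-in for inspecting the review check) are hypothetical. Both return values lead to the same action, a merge.

```python
import time


def matches_review_check(check_name, needle):
    """Case-insensitive substring match, as described for afk.merge.review_check."""
    return needle.lower() in check_name.lower()


def wait_for_review(check_concluded, budget_s=600.0, interval_s=15.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until the review concludes or the budget runs out; never raise."""
    deadline = clock() + budget_s
    while True:
        if check_concluded():
            return "concluded"  # verdict is ignored: the review stays advisory
        if clock() >= deadline:
            return "timed-out"  # fail-open: the caller merges anyway
        sleep(interval_s)
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting.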
Lifecycle Hooks
/afk exposes a fixed set of lifecycle points declared in .red/config.yaml under afk.hooks and resolved as ordered lists of shell commands. Every hook follows a single interceptor contract:
- Input: documented RED_AFK_* env vars (unset — not empty-string — when the field is irrelevant to the current point) plus the full mutable context as JSON on stdin.
- Output: empty stdout → context unchanged; JSON object on stdout → AFK replaces the documented mutable slice with the returned value. Non-JSON stdout is treated as a parse failure.
- Exit code: 0 continues the chain; non-zero is routed through a per-hook policy table — pre_* aborts the step, post_* / on_idle / on_*_error log and continue so a broken notifier never wedges AFK.
Within a single hook list, built-in defaults run first, user-declared commands run after, and declaration order is preserved inside each group. A bare string is shorthand for a one-element list. An unknown hook name in .red/config.yaml is a hard error at session boot. Disable a built-in default with afk.hooks.defaults.<name>: false — reordering is not supported.
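The interceptor contract can be sketched as a small dispatcher. This is an illustration under assumptions, not AFK's dispatcher: the function names and the `mutable_keys` parameter are hypothetical, the policy table is reduced to "pre_* aborts, everything else continues", and a parse failure simply leaves the context unchanged.

```python
import json
import os
import subprocess


def aborts_on_failure(hook_name):
    """Simplified policy table: pre_* aborts the step, the rest log and continue."""
    return hook_name.startswith("pre_")


def run_hook_chain(hook_name, commands, context, env_vars, mutable_keys):
    """Run commands in order, feeding the context on stdin and merging replies."""
    env = dict(os.environ)
    for name, value in env_vars.items():
        if value is None:
            env.pop(name, None)  # irrelevant field: unset, not empty-string
        else:
            env[name] = str(value)
    context = dict(context)
    for command in commands:
        proc = subprocess.run(
            ["sh", "-c", command],
            input=json.dumps(context),
            stdout=subprocess.PIPE,
            text=True,
            env=env,
        )
        if proc.returncode != 0:
            if aborts_on_failure(hook_name):
                return "abort", context
            continue  # logged-and-continue policy
        out = proc.stdout.strip()
        if not out:
            continue  # empty stdout: context unchanged
        try:
            returned = json.loads(out)
        except json.JSONDecodeError:
            continue  # assumption: a parse failure keeps the context as-is
        if isinstance(returned, dict):
            for key in mutable_keys:
                if key in returned:
                    context[key] = returned[key]  # only the mutable slice is replaced
    return "continue", context
```

Restricting the merge to `mutable_keys` is one way to model read-only context fields and ignored extra keys.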
The full lifecycle table is defined in PRD #207. The hooks shipped so far:
| Hook | When it fires | Env vars | Mutable slice | Exit-code policy |
|---|---|---|---|---|
| pre_session | Boot, before any queue work | RED_AFK_RUNNER, RED_AFK_WORKSPACE | session config (runner, worker_id, filter, iter_cap) | non-zero aborts the session loudly |
| pre_pick | Before listing the tracker queue | RED_AFK_RUNNER, RED_AFK_WORKSPACE | query params (label, state, limit) — filter.{kind,value} is read-only context | non-zero aborts the pick; queue listing is skipped this iteration and AFK falls through to the empty-queue / on_idle path |
| post_pick | After listing, before claiming | RED_AFK_RUNNER, RED_AFK_WORKSPACE | issues[] (filter / reorder; replace with {issues:[…]}) — extra keys are silently ignored | non-zero is logged and AFK continues with the un-mutated list (defensive default — a broken filter must not silently drop work) |
| pre_worktree | After claim, before git worktree add | RED_AFK_RUNNER, RED_AFK_WORKSPACE, RED_AFK_ISSUE, RED_AFK_SLOT | issue, target (worktree path), env (k/v map merged into the parent shell so CARGO_TARGET_DIR etc. propagate to the runner) — branch is read-only context | non-zero aborts: the claim is restored to ready-for-agent, the iteration tear-down runs, and the worktree is not created |
| pre_attempt | After worktree exists, before each runner invocation (per attempt, not per issue — re-fires on a --fallback-runner swap with attempt_n=2) | RED_AFK_RUNNER, RED_AFK_WORKSPACE (now the worktree), RED_AFK_ISSUE, RED_AFK_ATTEMPT_N | issue, workspace (worktree path), attempt_n — runner is read-only context | non-zero skips runner invocation: the worktree is preserved, the heartbeat stops, and the claim is restored to ready-for-agent so post-pick state is reconciled cleanly |
| post_attempt | After the runner returned with an authored <promise> exit — DONE or BLOCKED — for that attempt. Does not fire on runner crash or EOF-without-sentinel (see on_attempt_error). Under --fallback-runner it fires once per runner invocation (the swapped-away attempt closes with result.status=exhausted). The parsed sentinel outcome (done / blocked / no_more_tasks, or "" for the exhausted firings) rides in result.outcome (ADR 0028). | RED_AFK_RUNNER, RED_AFK_WORKSPACE (the worktree), RED_AFK_ISSUE, RED_AFK_RESULT_STATUS (success \| fail), RED_AFK_RESULT_OUTCOME (done \| blocked \| no_more_tasks \| empty), RED_AFK_ATTEMPT_N | issue, workspace, result ({status, outcome}), attempt_n | non-zero is logged and the loop continues — a broken notifier/pager must never wedge AFK |
| on_attempt_error | When the attempt produced no authored exit: either an unhandled exception in the worker path (run_inner exited non-zero outside the quota branch — runner-crash), or the runner's pipe closed with no <promise> sentinel (EOF-without-sentinel — no-sentinel, ADR 0028; the issue lands in ready-for-human). Distinct from post_attempt with result.status=fail, so hook authors do not have to demultiplex. | RED_AFK_RUNNER, RED_AFK_WORKSPACE (the worktree), RED_AFK_ISSUE, RED_AFK_ERROR_CLASS (runner-crash \| no-sentinel), RED_AFK_ATTEMPT_N | issue, workspace, error ({class, rc}), attempt_n | non-zero is logged and the loop continues |
| pre_merge | Before the merge mechanism (git merge --no-ff into the pinned base). The diff between the merge base and the worker branch is on stdin so a guard hook can reject changes by size, file pattern, etc. The merge itself plus conflict resolution remain mechanism (ADR 0008) and sit between pre_merge and post_merge — never dispatched as a hook. | RED_AFK_RUNNER, RED_AFK_WORKSPACE (primary checkout), RED_AFK_ISSUE, RED_AFK_MERGE_BASE | issue, workspace, diff — branch is read-only context | non-zero aborts the merge for this issue; the failure surfaces as a worker-failure (merge-conflict envelope, issue flipped to ready-for-human) |
| post_merge | After a successful merge and push to origin/{pinned}. The merge commit already exists, so user notifiers can include the real merge commit URL. Does not fire when the merge was aborted (pre_merge rejection, conflict resolver exhausted, push rejected). | RED_AFK_RUNNER, RED_AFK_WORKSPACE (primary checkout), RED_AFK_ISSUE, RED_AFK_MERGE_COMMIT (full sha), RED_AFK_MERGE_SHA (short sha) | issue, workspace, merge_commit ({sha, short}) — extended by the built-in validation default with result.{validation_status, validation_summary} | non-zero is logged and the loop continues — the merge has already landed; a broken notifier or a flaky smoke test must never roll it back |
| on_idle | Queue drained at top of loop iteration, before sleep/exit. Distinct from post_session — this is "between drains" maintenance (e.g. cache cleanup), not session termination. Does not fire on session exit. | RED_AFK_RUNNER, RED_AFK_WORKSPACE | none in this slice — stats.{done,blocked,total} are read-only context | non-zero is logged and the loop continues |
| post_session | Normal session termination | RED_AFK_RUNNER, RED_AFK_WORKSPACE | session stats (runner, worker_id, stats.{done,blocked,total}) | non-zero is logged and the session ends as NO MORE TASKS |
| on_session_error | Last gasp — the AFK loop itself crashed (unhandled set -e exit, supervisor died, unrecoverable orchestrator exception). Distinct from on_attempt_error (a single attempt blew up; the loop continued) and from post_session (clean shutdown). This is the only path that guarantees a notification when the autonomous worker stopped without the operator noticing. Does not fire on a user-requested abort (pre_session rejection, straggler decline, Ctrl+C / SIGTERM through the cleanup trap) — those set the clean-exit sentinel before exiting. | RED_AFK_RUNNER, RED_AFK_WORKSPACE, RED_AFK_ERROR_CLASS (session-crash by default), RED_AFK_ERROR_MESSAGE | error ({class, rc, message}) — none mutable (the loop is already collapsing) | non-zero is logged but the process still exits — this hook cannot rescue the session, only announce its death |
Attempt vocabulary & back-compat (issue #226, ADR 0026). The attempt-level hooks were renamed from pre_worker / post_worker / on_worker_error to pre_attempt / post_attempt / on_attempt_error so "worker" unambiguously names the orchestrator process (RED_AFK_WORKER_ID) and the hooks align 1:1 with ADR 0017's attempt (one node = one runner invocation). They fire per runner invocation, so a --fallback-runner swap on one issue yields two pre_attempt → post_attempt cycles; attempt_n (mutable-context field and the RED_AFK_ATTEMPT_N env var) carries the attempt counter (1 for the first runner, 2 for the swap). For one release window, the old names declared in .red/config.yaml still fire — they are translated to the canonical names at session boot with a single deprecation warning logged. Rename them before the next release; the back-compat shim is dropped then.
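The boot-time translation of the legacy hook names can be sketched as below. The function name and warning text are hypothetical, and appending legacy commands after canonical ones when both are declared is an assumption the source does not cover.

```python
LEGACY_HOOK_NAMES = {
    "pre_worker": "pre_attempt",
    "post_worker": "post_attempt",
    "on_worker_error": "on_attempt_error",
}


def canonicalise_hooks(hooks):
    """Translate legacy names; return (canonical_hooks, warnings)."""
    canonical, legacy_seen = {}, []
    for name, commands in hooks.items():
        if isinstance(commands, dict):
            canonical[name] = commands  # e.g. the `defaults` block passes through
            continue
        target = LEGACY_HOOK_NAMES.get(name, name)
        if target != name:
            legacy_seen.append(name)
        if isinstance(commands, str):
            commands = [commands]  # a bare string is a one-element list
        canonical.setdefault(target, []).extend(commands)
    warnings = []
    if legacy_seen:
        # A single deprecation warning covers every legacy name found.
        warnings.append("deprecated hook names: " + ", ".join(legacy_seen))
    return canonical, warnings
```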
Built-in defaults
Defaults are AFK-shipped commands registered before any user hook at the
same lifecycle point. They run first, in a fixed registration order
that users cannot change — only disable individual defaults via
afk.hooks.defaults.<name>: false. The disable-not-reorder rule keeps
later defaults (and user hooks) able to assume an earlier default has
already had its turn at the env.
Currently shipped:
| Default | Lifecycle point | Effect | Disable |
|---|---|---|---|
| cargo | pre_worktree | When Cargo.toml exists at $PROJECT_ROOT, sets CARGO_TARGET_DIR=${RED_AFK_CARGO_TARGET_BASE:-/opt/cargo-target}/slot-${RED_AFK_SLOT} (mkdir-p'd) so each slot's cargo state is isolated. | afk.hooks.defaults.cargo: false |
| gradle | pre_worktree | When build.gradle* exists at $PROJECT_ROOT and RED_AFK_GRADLE_USER_HOME_BASE is set, sets GRADLE_USER_HOME=${RED_AFK_GRADLE_USER_HOME_BASE}/slot-${RED_AFK_SLOT} so each slot's Gradle daemons / caches are isolated. The env-var opt-in is deliberate — AFK will not claim a path on your filesystem without consent. | afk.hooks.defaults.gradle: false |
| heartbeat | post_attempt | Stops the orchestrator's per-minute heartbeat sub-shell (RED_AFK_HEARTBEAT_PID) and appends the iteration stopped boundary marker to RED_AFK_ITER_LOG. Migrated from an inline heartbeat_stop call so the heartbeat now terminates before any user post_attempt hook runs. | afk.hooks.defaults.heartbeat: false |
| envelope | post_attempt | Reconciles result.status onto the AFK state file (current.result_status in RED_AFK_STATE_FILE) so a user post_attempt notifier reading state sees the worker's terminal status without re-deriving it from the sentinel. | afk.hooks.defaults.envelope: false |
| validation | post_merge | Runs pnpm test / typecheck / lint / build against the merged primary checkout (when a package.json is present at the workspace root), then attaches result.{validation_status, validation_summary} to the post_merge context so user hooks see the CI/smoke outcome reconciled before they fire. Migrated from the inline post-merge CI/smoke call; the pre-merge feedback-validation step remains as the mechanism-owned safety gate (ADR 0008 — only mechanism can refuse a merge), so this default is observability + notification surface, not a gate. | afk.hooks.defaults.validation: false |
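The cargo default and the "defaults first, then user hooks" ordering can be sketched as follows. Both function names are hypothetical and the real defaults are AFK-shipped commands, not Python. The `or` fallback mirrors the `${VAR:-default}` shell form, which also applies the default when the variable is set but empty.

```python
import os


def cargo_default(project_root, slot, env, disabled=False):
    """Return the env mutation the cargo default would contribute, or {}."""
    if disabled or not os.path.exists(os.path.join(project_root, "Cargo.toml")):
        return {}
    base = env.get("RED_AFK_CARGO_TARGET_BASE") or "/opt/cargo-target"
    target = f"{base}/slot-{slot}"
    os.makedirs(target, exist_ok=True)  # the mkdir -p step
    return {"CARGO_TARGET_DIR": target}


def ordered_chain(defaults, user_commands, defaults_config):
    """Built-in defaults in registration order, then user hooks in declared order."""
    chain = [cmd for name, cmd in defaults if defaults_config.get(name, True) is not False]
    return chain + list(user_commands)
```

Because a default can only be dropped, never moved, a user hook at the same lifecycle point can rely on the env it set.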
Example configuration:
afk:
  hooks:
    pre_session: "echo boot"
    post_pick:
      - "RED_AFK_GITHUB_LOGIN=$(gh api user --jq .login) \
        plugins/dev/skills/engineering/afk/examples/only-mine.sh"
    pre_worktree:
      - "echo isolated cargo dir: $CARGO_TARGET_DIR"
    on_idle:
      - "cargo clean -p reddb-storage"
    post_session:
      - "echo session done"
      - "curl -s -X POST $SLACK_URL -d \"done=$(jq -r .stats.done)\""
    defaults:
      gradle: false
Safety
See SAFETY.md. The orchestrator and the inner agent both inherit those rules. Violations abort the loop.
Source Of Truth
This skill is the single source of truth for autonomous execution in red-skills repos.