| name | ev-loop-interactive |
| description | Execution loop for human-paired work. Runs a phase as a sequence of deliverables, each with its own unit contract and evaluator checkpoint. Supports sequential (ordered) and free (user picks next) deliverable ordering. Composes /trout-* primitives; composes no other loop. Use when a phase is exploratory, creative, or otherwise not a bulk transform. |
| argument-hint | <project-slug-or-path> <phase-number> |
| allowed-tools | Read, Write, Edit, Bash, Agent, Skill |
/ev-loop-interactive
Execute one phase of a project as a human-paired loop: discrete
deliverables, per-deliverable contract and checkpoint. The human drives
order when ordering is free; the loop keeps the substrate honest.
Composes: .claude/scripts/trout/autosave.ts,
.claude/scripts/trout/autoload.ts (both via Bash),
/trout-pull-request, /guild-validate.
Does not compose: other loops.
Format reference: projects/CONVENTIONS.md (repo-relative).
Skill invocations like /trout-pull-request and /guild-validate below
mean Skill(skill: <name>, args: "…") — the Skill tool is how loops
compose substrate skills. Script invocations like
.claude/scripts/trout/autosave.ts mean
Bash("node .claude/scripts/trout/autosave.ts <args>"). Antagonist
evaluation runs through /guild-validate, which spawns evaluator agents
in parallel via /guild-spawn; the loop itself never calls the Agent
tool directly.
Arguments
<project-slug-or-path> — resolved like .claude/scripts/trout/autosave.ts
(exact slug → suffix match → full path). If missing or unresolved,
stop and ask the user for the project.
<phase-number> — which phase to run. If missing, default to the
next non-completed phase from MANIFEST.md and confirm with the
user before proceeding. If the named phase is already completed,
stop and ask whether to re-run or pick a different phase.
Ordering
Read the phase entry in PLAN.md to determine ordering:
- Sequential — deliverables are numbered and must run in order.
The loop picks the next one automatically.
- Free — deliverables are a set. The loop presents them and asks
the user to pick.
If PLAN.md doesn't specify, default to free and ask.
Phase-level process
Step 0. Pre-flight
Bash("node .claude/scripts/trout/autoload.ts <slug>") to refresh state.
- Working tree clean, branch matches MANIFEST, verification baseline.
Step 1. Enumerate deliverables
Parse the phase's deliverables from PLAN.md. Each deliverable becomes
one unit. If the phase names 5 deliverables, you expect 5 checkins.
Show the list to the user with status markers (done, in-progress, not
started) pulled from existing checkins on this branch.
Step 2. Unit loop
For each deliverable (picked per the ordering rule):
-
Negotiate. Draft the unit contract for this deliverable and
write it into a new numbered checkin (Contract section only). Show
the contract to the user for approval before execution. The human is
in the loop — they should see what you agreed to.
-
Execute. Do the work. For creative or exploratory deliverables,
pair with the user — ask when you hit a fork, report when you hit a
dead end, don't charge ahead.
-
Evaluate. Invoke /guild-validate via the Skill tool to run
the antagonist panel against this unit. v1 panels are single-
evaluator lists; Phase 2+ phases will name larger panels in PLAN.md.
agents: evaluator-contract-fit
packet: build a dense packet (see shape below). The substrate
default is dense — verbose packets correlate with budget-exhaustion
failures under evaluator-contract-fit's maxTurns=5. Live
examples in PR #13's checkins 02-06.
Dense packet shape (three sections, in this order):
## How to evaluate efficiently
You have a tight tool-use budget (maxTurns=5). Pre-computed
verification below is authoritative — do not re-run lint/build/
test/grep unless you find specific evidence the artifact summary
contradicts itself. Spot-check at most ONE or TWO criteria with
targeted reads, then emit `VERDICT:`. If you cannot reach a verdict
within budget, emit `VERDICT: flagged` with `parse-failure:
budget-exhausted` so the loop escalates rather than no-ops.
## Contract (paraphrased)
<Goal in 1-3 sentences. Acceptance criteria as a numbered list,
condensed (full text in <checkin path>). Disqualifiers as a
single-line summary. Inputs as a bulleted list of paths.>
## Artifact
**Files** (created/modified/deleted): <bulleted paths>
**Pre-computed verification (authoritative — do not re-run)**:
- `npm run lint` → <result>
- `npm run build` → <result>
- `npm run test` → <result>
- <other verification: grep results, smoke test outputs, etc.>
**Direct mappings to acceptance criteria** (for spot-check
efficiency): <AC N → file:line ranges or section pointers>
**Iteration story** (if applicable): <prior panel runs and what
was addressed; helps the evaluator avoid re-flagging fixed issues>
## Original ask
<verbatim from PLAN.md or the triggering message>
## Suggested spot-check (one tool use)
<the most efficient single read for confirming the most-suspicious
criterion; optional but reduces investigation thrashing>
Pass the contract as a paraphrased summary plus the checkin file
path link, not verbatim — the checkin file is in the repo and
renders one click away. The packet's job is orientation; the depth
is one click away.
The skill returns a structured verdict (approved | flagged |
flagged-conflict) with blocking_findings, advisory_findings,
cli_runs, and conflicts lists. See
.claude/agents/evaluator-base.md for the per-evaluator verdict
shape that /guild-validate parses and aggregates.
-
Iterate or commit.
- Flagged: address the specific reasons, re-invoke
/guild-validate.
Up to 2 retries (3 panel runs total).
- Approved: finalize the checkin.
-
Autosave. Bash("node .claude/scripts/trout/autosave.ts <slug> --event=checkin-created --detail='<NN> on <branch>' --phase-update=<N>:in-progress:branch=<branch>:checkin=<NN>").
-
Checkpoint. Free mode: after every deliverable. Sequential mode:
after every deliverable or when the human explicitly asks.
Invoke /trout-pull-request <slug> <branch>.
Step 3. Phase close
- All deliverables accounted for.
- Full verification passes.
/trout-pull-request is fresh.
Bash("node .claude/scripts/trout/autosave.ts <slug> --event=phase-completed --detail=<N> --phase-update=<N>:completed").
Output format
After each checkpoint and at phase close, report:
Phase <N> — <title>
Deliverables: <done>/<total> (list with status)
Last checkin: <path>
PR: <url or "not yet opened">
Next: <deliverable name, or "phase complete">
Message-driven redirects
Trigger: if the caller's message (from the router) contains a pattern
like address feedback on #<pr> while this loop is active on that PR's
branch, branch into the flow below instead of continuing the normal
unit loop.
For "address feedback on #N":
/trout-pr-respond <slug> <pr> → plan.
- Each Blocker becomes a new unit.
- Run the unit loop. Re-checkpoint when done.
Rules
- The human co-pilots. Don't write long stretches without pausing.
If a unit spans more than ~3 files or ~200 lines of new/changed code
without a natural pause, split it.
- Contract before execution. Always. Even if the deliverable feels
small.
- Evaluator always runs. Same as the confidence loop — never
self-approve. Evaluator budget is 3 runs per unit (initial + 2
retries); on the third flag escalate to the user.
- Scope discipline. One deliverable at a time in a given checkin.
- Record corrections in the checkin. If the user redirects a unit
mid-flight, overrides a decision, or the evaluator flags something
the generator defaulted to incorrectly, note it verbatim in the
checkin's "Notes for the PR" section with a
correction: prefix.
/trout-save-session captures every such line to
learnings/session-notes/ via
Bash("node .claude/scripts/griot/capture.ts --from-checkin=...")
at end of session; /griot-compact decides which get promoted.
The loop itself never writes to learnings/.
- No emojis.
Failure modes
- User goes quiet mid-deliverable → stop, checkpoint whatever is safe,
save a session handoff.
- Evaluator flags 3× → escalate to user.
- Working tree dirty → stop.