Run any Skill in Manus with one click

$pwd:

ev-loop-confidence

Name: Ev Loop Confidence
Author: krambuhl

// Execution loop for tiered-transform work. Runs a phase as a sequence of tiers, each tier processing a batch of files under a tier contract, gated by evaluator verdicts and pre-flight checks. Writes tactical retros between tiers. Composes /trout-* primitives; composes no other loop. Use when a phase is a bulk transform, audit, or find-replace-style operation across many files.

Run Skill in Manus

$ git log --oneline --stat

stars:1

forks:0

updated:May 6, 2026 at 09:12

SKILL.md

readonly

package.json

"author": "krambuhl"

"repository": "krambuhl/aart.camp"

View GitHub Repository

$ install --globalskills.sh

$ download --local

Run Skill in Manus

[HINT] Download the complete skill directory including SKILL.md and all related files

Run any Skill with one click

name	ev-loop-confidence
description	Execution loop for tiered-transform work. Runs a phase as a sequence of tiers, each tier processing a batch of files under a tier contract, gated by evaluator verdicts and pre-flight checks. Writes tactical retros between tiers. Composes /trout-* primitives; composes no other loop. Use when a phase is a bulk transform, audit, or find-replace-style operation across many files.
argument-hint	<project-slug-or-path> <phase-number>
allowed-tools	Read, Write, Edit, Bash, Agent, Skill, mcp__github__get_file_contents

/ev-loop-confidence

Execute one phase of a project as a confidence loop: tiered transforms, ratcheting from small/safe to large/risky, with an evaluator verdict per unit and a tactical retro per tier.

Composes: .claude/scripts/trout/autosave.ts, .claude/scripts/trout/autoload.ts (both via Bash), /trout-pull-request, /guild-validate. Does not compose: other loops. Peer loops are invoked by the router, not by each other.

Format reference: projects/CONVENTIONS.md (repo-relative).

Skill invocations like /trout-pull-request and /guild-validate below mean Skill(skill: <name>, args: "…") — the Skill tool is how loops compose substrate skills. Script invocations like .claude/scripts/trout/autosave.ts mean Bash("node .claude/scripts/trout/autosave.ts <args>"). Antagonist evaluation runs through /guild-validate, which spawns evaluator agents in parallel via /guild-spawn; the loop itself never calls the Agent tool directly.

Arguments

<project-slug-or-path> — resolved like .claude/scripts/trout/autosave.ts (exact slug → suffix match → full path).
<phase-number> — which phase of the project to run. Must exist in MANIFEST.md and not be in completed state.

If <phase-number> is missing, already completed, or the slug does not resolve, stop and ask the user rather than guessing.

Scope directory

This loop creates and uses a scope directory at ./projects/<slug>/<phase-slug>/ where <phase-slug> is a kebab-case form of the phase name. Inside:

MANIFEST.md — phase-scoped manifest (progress within the phase)
inventory.md — the full list of files in scope, generated in step 1
retros/tier-<N>.md — tactical retro per tier

Create this directory if it doesn't exist.

Phase-level process

Step 0. Pre-flight

Before any work:

Bash("node .claude/scripts/trout/autoload.ts <slug>") to refresh state.
Confirm working tree is clean (git status --porcelain). If not, stop and ask the user to commit or stash.
Confirm current branch matches the phase's branch in MANIFEST.md. If not, stop and ask whether to switch.
Run the verification commands from config.md as a baseline. Record exit status. A red baseline before any work means the loop stops — you are not making a red build redder.

Step 1. Coverage before transforms

Build inventory.md listing every file or item in scope for this phase. The phase description in PLAN.md or config.md tells you the pattern (e.g. "all .ts/.tsx files using ESLint disable comments"). Use git ls-files, grep, find, or equivalent to enumerate. Include counts.

Format:

# Scope inventory — Phase <N> <name>

**Generated**: YYYY-MM-DD HH:MM
**Total items**: <count>
**Pattern**: <description of what makes an item in-scope>

## Items
- [ ] path/to/file-1.ts  (tier: <tier>)
- [ ] path/to/file-2.tsx  (tier: <tier>)

Tier assignment is a judgment call (see Tier assignment below). Assign tiers as you build the inventory, or leave them unassigned and prompt the user.

Do not begin transforms until the inventory is complete. Partial inventory = unknown blast radius.

Step 2. Tier assignment

Divide inventory items into tiers of increasing risk/complexity:

Tier 1 — mechanical, obvious, identical across items
Tier 2 — same shape with small variations, low risk
Tier 3 — requires judgment, possible side effects
Tier 4+ — bespoke, high risk, may need human-paired work

If tier 4+ items appear, consider routing them to /ev-loop-interactive rather than handling here. Surface this to the user and ask.

Step 3. Execute tiers in order

For each tier, run a tier loop (see Tier-level process below). Between tiers, write a tactical retro and re-run pre-flight.

Step 4. Phase close

When all tiers in this phase are complete:

Verify every inventory item is checked off.
Run full verification.
Ensure the latest checkin exists.
Invoke /trout-pull-request <slug> <branch> so the PR reflects the final state.
Run Bash("node .claude/scripts/trout/autosave.ts <slug> --event=phase-completed --detail=<N> --phase-update=<N>:completed").
Return control to the router.

Tier-level process

Each tier runs as a sequence of units. A unit is one batch of inventory items transformed together — sized so that one checkin covers it cleanly.

Batch sizing

Tier 1: batch of 10–30 items (mechanical, cheap to redo)
Tier 2: batch of 5–15 items
Tier 3: batch of 1–5 items
Tier 4+: batch of 1 item

Size down if verification grows slow or evaluator flags pile up. Size up if you're burning checkins on trivial changes.

Tier contract (a specialization of unit contract)

Before the first unit of a tier, write a tier contract as the first checkin of that tier. The Contract section includes tier-wide rules that every unit in the tier must satisfy:

## Contract
- **Goal**: apply <transform> to all Tier <N> items
- **Acceptance criteria**:
  - Every item in this tier is updated
  - `<verification command>` passes after each batch
  - No unrelated files modified
- **Rules applied**:
  - <style/lint rules>
  - Verification: `<command>`
- **Disqualifiers**:
  - Any regression in <area>
  - Any file in scope left untouched
- **Inputs**: inventory.md Tier <N> items

Subsequent units inside the tier can reference the tier contract instead of restating it, as long as the unit's checkin contract says Rules applied: tier-<N> contract (see checkin <NN>) and lists only unit-specific deltas.

Unit loop

For each unit inside a tier:

Negotiate. Write a new checkin file with the Contract section populated. Pick the items for this batch from inventory.md (mark them with a tier tag if not already). Save the checkin with just the Contract section — Execution/Scope/Changes/Evaluator verdict come later.
Execute. Do the transform on the batch. Keep to scope.

Evaluate. Invoke /guild-validate via the Skill tool to run the antagonist panel against this unit. v1 panels are single- evaluator lists; Phase 2+ phases will name larger panels in PLAN.md.

agents: evaluator-contract-fit
packet: build a dense packet (see shape below). The substrate default is dense — verbose packets correlate with budget-exhaustion failures under evaluator-contract-fit's maxTurns=5. Live examples in PR #13's checkins 02-06.

Dense packet shape (three sections, in this order):

## How to evaluate efficiently

You have a tight tool-use budget (maxTurns=5). Pre-computed
verification below is authoritative — do not re-run lint/build/
test/grep unless you find specific evidence the artifact summary
contradicts itself. Spot-check at most ONE or TWO criteria with
targeted reads, then emit `VERDICT:`. If you cannot reach a verdict
within budget, emit `VERDICT: flagged` with `parse-failure:
budget-exhausted` so the loop escalates rather than no-ops.

## Contract (paraphrased)

<Goal in 1-3 sentences. Acceptance criteria as a numbered list,
condensed (full text in <checkin path>). Disqualifiers as a
single-line summary. Inputs as a bulleted list of paths.>

## Artifact

**Files** (in scope for this batch): <bulleted paths>

**Pre-computed verification (authoritative — do not re-run)**:
- `npm run lint` → <result>
- `npm run build` → <result>
- `npm run test` → <result>
- <other verification: tier-specific checks, codemod diff samples, etc.>

**Direct mappings to acceptance criteria** (for spot-check
efficiency): <AC N → file:line ranges or section pointers>

**Iteration story** (if applicable): <prior panel runs and what
was addressed; helps the evaluator avoid re-flagging fixed issues>

## Original ask

<verbatim from PLAN.md phase description or the triggering message>

## Suggested spot-check (one tool use)

<the most efficient single read for confirming the most-suspicious
criterion; optional but reduces investigation thrashing>

Pass the contract as a paraphrased summary plus the checkin file path link, not verbatim — the checkin file is in the repo and renders one click away. The packet's job is orientation; the depth is one click away.

The skill returns a structured verdict (approved | flagged | flagged-conflict) with blocking_findings, advisory_findings, cli_runs, and conflicts lists. See .claude/agents/evaluator-base.md for the per-evaluator verdict shape that /guild-validate parses and aggregates.

Iterate or commit.
- If flagged: update the checkin Execution section with the remedy taken, re-invoke /guild-validate. Maximum 2 re-iterations per unit — on the third flag, stop and escalate to the user.
- If approved: finalize the checkin (Execution, Scope, Changes, Evaluator verdict = approved, Notes for the PR). Check off the inventory items.
Autosave. Run Bash("node .claude/scripts/trout/autosave.ts <slug> --event=checkin-created --detail='<NN> on <branch>' --phase-update=<N>:in-progress:branch=<branch>:checkin=<NN>").
Checkpoint? Evaluate the should-checkpoint policy (below). If any condition holds, invoke /trout-pull-request <slug> <branch> so the PR tracks the latest state. Otherwise continue to the next unit.

Should-checkpoint policy

Checkpoint (invoke /trout-pull-request) when any of the following hold. All are read off state — there is no callable function.

A full tier has just finished.
The number of units since last PR update ≥ 5.
Verification is currently green and we're about to start a riskier tier.
The user has explicitly asked for a checkpoint.

Do not checkpoint mid-tier unless verification is green.

Tactical retro between tiers

Immediately after the last unit of a tier is approved, before moving to the next tier:

Re-run pre-flight (working tree, verification).
Write <scope-dir>/retros/tier-<N>.md:

# Tier <N> retro

**Items processed**: <count>
**Units**: <NN .. NN>
**Verification at tier close**: <green|red>

## What went smoothly
- …

## What bit us
- …

## Adjustment for next tier
- <one concrete change to batch size, tier assignment, or process>

Run Bash("node .claude/scripts/trout/autosave.ts <slug> --event=retro-written --detail='tier-<N>'").

Tactical retros are short and specific. Strategic retrospection happens at /trout-archive, not here.

Gate-and-ratchet

Before starting tier N+1, the gate closes if:

Any tier N unit is still flagged.
Verification is red.
The tier retro identified a blocker for tier N+1.

A closed gate stops the loop and reports to the user. The user decides whether to resume, re-tier, or bail.

Message-driven redirects

If the router passes a message like "address feedback on #14", this loop:

Invokes /trout-pr-respond <slug> <pr> to get the response plan.
Treats each Blocker item as a new unit in the current tier (or a new tier if the feedback rewrites scope).
Iterates the unit loop. When done, re-invokes /trout-pull-request to update the PR.

Rules

Coverage before transforms. Do not start tier 1 without a complete inventory.
Pre-flight before any tier. Every tier starts from a known-good state.
One contract per tier, restated per unit only for deltas.
Evaluator always runs. No exceptions. Never self-approve.
Scope discipline. Fixes outside this phase's pattern get noted in "Notes for the PR" and are deferred — not absorbed silently.
Record corrections in the checkin. If the user redirects a unit mid-flight, overrides a decision, or the evaluator flags something the generator defaulted to incorrectly, note it verbatim in the checkin's "Notes for the PR" section with a correction: prefix. /trout-save-session captures every such line to learnings/session-notes/ via Bash("node .claude/scripts/griot/capture.ts --from-checkin=...") at end of session; /griot-compact decides which get promoted. The loop itself never writes to learnings/.
No emojis.

Output to router

On any termination — phase close, closed gate, or escalation — return:

Status: completed | gated | escalated | aborted
Phase: <N> <name>
Tiers run: list with counts (e.g., 1: 3 files, 2: 7 files)
Last checkin: <NN>
Last PR update: <url> or none
Reason (if not completed): one-line cause
Next action: what the router or user should do next

Failure modes

Pre-flight fails → stop, report, do not proceed.
Evaluator flags 3× → stop, escalate to user, do not auto-merge or force-approve.
Inventory shrinks mid-phase (items disappear) → stop; something moved under you. Regenerate inventory and reconcile.
Working tree dirty at boundary → stop; do not stash silently.

name	ev-loop-confidence
description	Execution loop for tiered-transform work. Runs a phase as a sequence of tiers, each tier processing a batch of files under a tier contract, gated by evaluator verdicts and pre-flight checks. Writes tactical retros between tiers. Composes /trout-* primitives; composes no other loop. Use when a phase is a bulk transform, audit, or find-replace-style operation across many files.
argument-hint	<project-slug-or-path> <phase-number>
allowed-tools	Read, Write, Edit, Bash, Agent, Skill, mcp__github__get_file_contents