Run any Skill in Manus with one click

run-plan

Execute the next phase of a plan document: parse phases and status, dispatch implementation in a worktree, verify with a separate agent, update progress tracking, write reports/plan-{slug}.md, and optionally auto-land to main. Can self-schedule recurring runs via cron. Use `next` to check schedule, `stop` to cancel.

Run Skill in Manus

Overview

Install command

npx skills add https://github.com/zeveck/imagegen --skill run-plan

Copy and paste this command into Claude Code to install the skill

Source

zeveck/imagegen

Stars0

Forks0

UpdatedApril 13, 2026 at 21:39

SKILL.md

readonly

/run-plan <plan-file> [phase|finish] [auto] [every SCHEDULE] [now] | stop | next — Plan Phase Executor

Orchestrates plan-driven development. Reads a plan document, identifies the next incomplete phase, dispatches implementation in a worktree, verifies with a separate agent, updates progress tracking, writes a persistent report, and optionally auto-lands to main. Can self-schedule for recurring runs to work through multi-phase plans autonomously.

Ultrathink throughout. Use careful, thorough reasoning at every step.

Arguments

/run-plan <plan-file> [phase] [auto] [every SCHEDULE] [now]
/run-plan stop | next

plan-file (required) — path to plan, e.g. plans/FEATURE_PLAN.md
phase (optional) — specific phase, e.g. 4a. If omitted, auto-detect next incomplete phase
finish (optional) — run ALL remaining phases sequentially until the plan is complete. finish is approval to START — do not ask for confirmation before the first phase (the user already said "finish"). Without auto: pauses BETWEEN phases to show results and ask "continue to next phase?" With auto: runs all phases without pausing (overnight). Each phase still gets full verification, testing, and all safety rails. If any phase fails verification or hits a conflict, stops there. finish and every are mutually exclusive. finish runs all phases in one session. every schedules one phase per cron fire. Combining them is meaningless — finish either completes (cron self-terminates) or fails (Failure Protocol kills the cron). Use one or the other.
auto (optional) — bypass approval gates, auto-land to main via cherry-pick
every SCHEDULE (optional) — self-schedule recurring runs via cron:
- Accepts intervals: 4h, 2h, 30m, 12h
- Accepts time-of-day: day at 9am, day at 14:00, weekday at 9am
- Without now: schedules only, does NOT run immediately
- With now: schedules AND runs immediately
- Implies auto — scheduling only makes sense for autonomous runs
- Cron prompt omits phase number so each invocation auto-detects the next incomplete phase
- Each run re-registers the cron (self-perpetuating)
- Cron is session-scoped — dies when the session dies
now (optional) — run immediately. When combined with every, runs immediately AND schedules. Without every, now is the default behavior.
status — show plan progress: all phases, their status, what's next, and what's blocked. Read-only — no agents dispatched, no approval gate.
stop — cancel any existing /run-plan cron and exit. Takes precedence over all other arguments.
next — check when the next scheduled run will fire. Takes precedence over all other arguments except stop.

Detection: scan $ARGUMENTS for:

stop (case-insensitive) — cancel cron and exit (highest precedence)
next (case-insensitive) — check schedule and exit
status (case-insensitive) — show plan progress and exit
finish (case-insensitive) — run all remaining phases sequentially
now (case-insensitive) — run immediately
auto (case-insensitive) — autonomous mode
every followed by a schedule expression — scheduling mode

Examples:

/run-plan plans/FEATURE_PLAN.md — interactive, next phase
/run-plan plans/FEATURE_PLAN.md 4b — interactive, specific phase
/run-plan plans/FEATURE_PLAN.md finish — interactive, all remaining phases (pauses between each)
/run-plan plans/FEATURE_PLAN.md finish auto — autonomous, all remaining phases (no pausing)
/run-plan plans/FEATURE_PLAN.md auto every 4h — schedule every 4h
/run-plan plans/FEATURE_PLAN.md auto every 4h now — schedule + run now
/run-plan plans/FEATURE_PLAN.md status — show plan progress
/run-plan now — trigger the active cron early
/run-plan stop — cancel scheduled runs
/run-plan next — check when the next phase will run

Status (if `status` is present)

If $ARGUMENTS contains status (case-insensitive):

Read the plan file specified in the arguments
Also read any companion progress document if referenced
Parse all phases and their status (same parsing logic as Phase 1 steps 2-3: "Extract phases and status" and "Determine target phase." Do NOT run preflight checks — status is read-only)

Present a progress table:

Plan: plans/FEATURE_PLAN.md

| Phase | Status |
|-------|--------|
| 4a — Electrical | Done (abc1234) |
| 4b — Mechanical | Done (def5678) |
| 4c — Smooth Nonlinear | Next ← |
| 4d — Solver Fixes | Blocked (needs 4c) |
| 4e — UI Polish | Blocked (needs 4d) |

Next phase: 4c — Smooth Nonlinear Components
Dependencies: 4a ✓, 4b ✓

If a cron is active, also show the schedule:

Scheduled: every 4h (~8:15 PM ET next, cron XXXX)
Exit. Read-only — no agents dispatched, no work done.

Now (standalone — no plan-file provided)

If $ARGUMENTS is just now (no plan-file, no phase, no every):

Use CronList to list all cron jobs
Find any whose prompt starts with Run /run-plan
If found: extract the cron's prompt to get the plan-file, auto, and schedule. Run the phase immediately — proceed to Phase 1. Do NOT ask for confirmation — now IS the confirmation. The cron stays active.
If none found: report No active /run-plan cron to trigger. Use /run-plan <plan-file> to run manually. and exit.

Next (if `next` is present)

If $ARGUMENTS contains next (case-insensitive):

Use CronList to list all cron jobs
Find any whose prompt starts with Run /run-plan
Report:
- If found: parse the cron expression and compute the next fire time. Use date +%Z for the timezone. Show both relative and absolute:
  
  Next run-plan phase in ~2h 15m (~8:30 PM ET, cron XXXX). Prompt: Run /run-plan plans/FEATURE_PLAN.md auto every 4h
- If none found: No active /run-plan cron in this session.
Exit. Do not proceed to any phase.

Stop (if `stop` is present)

If $ARGUMENTS contains stop (case-insensitive):

Use CronList to list all cron jobs
Delete ALL whose prompt starts with Run /run-plan using CronDelete
Report what was cancelled:
- If one cron found: Run-plan cron stopped (was job ID XXXX, every INTERVAL).
- If multiple found: Stopped N run-plan crons (IDs: XXXX, YYYY).
- If none found: No active /run-plan cron found.
Exit. Do not proceed to any phase. The stop command does nothing else.

Phase 0 — Schedule (if `every` is present)

If $ARGUMENTS contains every <schedule>:

Parse the schedule — convert to a cron expression. The LLM interprets natural scheduling expressions.

For interval-based schedules (4h, 2h, 30m): use the CURRENT minute as the offset so the first fire is a full interval from now, not aligned to midnight. Check the current minute with date +%M:
- 4h at minute 9 → 9 */4 * * * (fires at :09 past every 4th hour)
- 2h at minute 15 → 15 */2 * * *
- 30m → */30 * * * * (no offset needed for sub-hour)
- 1h at minute 9 → 9 * * * *
For time-of-day schedules (day at 9am, weekday at 2pm): offset round minutes by a few to avoid API busy marks:
- day at 9am → 3 9 * * *
- day at 14:00 → 3 14 * * *
- weekday at 9am → 3 9 * * 1-5
Deduplicate — use CronList + CronDelete to remove any whose prompt starts with Run /run-plan.
Construct the cron prompt. Strip the phase number (so each invocation auto-detects the next incomplete phase). Always include now in the cron prompt so each cron fire runs immediately AND re-registers itself. Note: this now is for the CRON's invocation, not the current invocation:
```
Run /run-plan <plan-file> auto every <schedule> now
```
Note: the phase number is intentionally omitted so the cron auto-advances.
Create the cron — use CronCreate:
- cron: the cron expression from step 1
- recurring: true
- prompt: the constructed command from step 3
Confirm with wall-clock time. Always show times in America/New_York (ET) — use TZ=America/New_York date for conversion, not the system timezone (which may be UTC):

If now is present:

Run-plan scheduled every 4h. Running now. Next phase run after this one: ~8:15 PM ET (cron ID XXXX).

If now is NOT present:

Run-plan scheduled every 4h. First run: ~4:15 PM ET (cron ID XXXX). Use /run-plan next to check, /run-plan stop to cancel.
If now is present: proceed to Phase 1 (run immediately). If now is NOT present: Exit. The cron fires later.

End-of-phase scheduling note: when a phase finishes and a cron is active, always include the estimated next run time with timezone in the completion message. Example:

Phase complete. Next phase run in ~3h 45m (~11:30 PM ET, cron XXXX).

If every is NOT present, skip this phase and proceed to Phase 1 (bare invocation always runs immediately).

Phase 1 — Parse Plan & Extract Verbatim Phase Text

The key differentiator. Plans have varied formats, so the agent uses LLM comprehension rather than rigid parsing.

Preflight checks

Before parsing, check for stale state from a previous failed run:

In-progress git operation?

ls .git/CHERRY_PICK_HEAD .git/MERGE_HEAD .git/REBASE_HEAD 2>/dev/null
git status --porcelain | grep '^UU\|^AA\|^DD'

If either command produces output, STOP. Invoke the Failure Protocol.

Stash stack?
```
git stash list
```
If there is a stash with message containing "pre-cherry-pick", a previous run's stash was never restored. STOP. Invoke the Failure Protocol — the user needs to git stash pop or git stash drop before a new phase can start safely.
Leftover plan worktrees?
```
git worktree list
```
If worktrees from a previous run exist (paths containing plan-), warn the user. Do not remove them — note their presence and continue.
Unconfigured hook placeholders?
```
grep -q '{{' .claude/hooks/block-unsafe-project.sh 2>/dev/null
```
If found AND test infrastructure exists (package.json with a "test" script, or vitest.config.* / jest.config.* exists), STOP. Hook placeholders have not been configured — run /update-zskills first.

Parse plan

Read the plan file in full. Also read any companion progress document if referenced (e.g., FEATURE_PROGRESS_AND_NEXT_STEPS.md).
Extract phases and status — handle four formats:
- Progress tracker table (FEATURE_PLAN style): rows with ✅ Done, ⬚ (not started), 🟡 (in progress), etc.
- Numbered phase sections (## Phase 4a — Title): look for completion markers in the section body or companion doc
- Checklist (- [x] / - [ ]): checked = done, unchecked = not done
- Narrative: infer status from codebase evidence (files exist, tests pass, etc.)
Determine target phase:
- If phase arg given: use it. If already complete, warn (or skip in auto)
- If no phase arg: first incomplete phase
- If ALL phases complete: report "Plan complete" → stop. If every, delete the cron via CronList + CronDelete
- If multiple phases share the same number (e.g., 4a, 4b, 4c), treat each sub-phase as a separate phase
Check dependencies — if a prerequisite phase isn't Done, STOP. Report which dependency is missing. If every, the cron retries later.
Check for conflicts — if the target phase is "In Progress" (🟡 or equivalent), another agent may be working on it. STOP. Do not compete.
Check for staleness notes — if the plan's Dependencies section contains language like "drafted before," "may need refresh," or "APIs and data structures referenced here are based on [another plan's] design, not actual code," the plan may be stale:
- Without auto: tell the user "this plan was drafted before its dependency was implemented. Want me to refresh it with /draft-plan?"
- With auto: dispatch /draft-plan on the plan file to update it. /draft-plan handles existing files as modernizations. After the refresh, re-read the plan and continue.
- Skip this check if the plan file was modified more recently than the dependency's completion (it may already be up to date).
Save the VERBATIM phase text — copy the entire section from the plan file exactly as written. Every sentence, every bullet, every formula, every constraint. This text will be passed to agents in Phase 2 and Phase 3.

Do NOT summarize, paraphrase, or reinterpret. The plan is the spec.

Lesson from /fix-issues #387: summarized descriptions caused agents to implement the wrong thing. "Reset button" was interpreted as "clear canvas" instead of "reset mappings to defaults" because only the title was read. The same will happen with plan phases if the orchestrator summarizes "implement translational mechanical domain" without the formulas, state equations, and design constraints.

Create tracking fulfillment marker. Determine the tracking ID: use the ID passed by the parent skill if this is a delegated invocation, or derive from the plan file slug if standalone (e.g., FEATURE_PLAN.md → feature-plan). Then create the fulfillment file in the MAIN repo:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
mkdir -p "$MAIN_ROOT/.claude/tracking"
printf 'skill: run-plan\nid: %s\nplan: %s\nphase: %s\nstatus: started\ndate: %s\n' \
  "$TRACKING_ID" "$PLAN_FILE" "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/fulfilled.run-plan.$TRACKING_ID"

Classify UI impact from the plan text. Scan the phase description for UI indicators: mentions of editor, toolbar, canvas, panel, dialog, CSS, button, menu, viewport, renderer, dark mode, layout, or any reference to UI/editor/styles directories in the project. Flag the phase as UI-touching if any are found.

In finish mode, classify ALL phases upfront and report:

Running 5 phases. Phases 3 and 5 touch UI — landing will wait for your sign-off at the end.

This tells the user immediately whether the run will be fully automatic or will need their review before landing. No surprises at Phase 6.
Present the phase plan:
- Without auto: display the phase summary (name, status, dependencies, work items, UI classification) and wait for user approval
- With auto: proceed immediately

`finish` mode: overall verification after all phases

In finish mode, after ALL phases complete their per-phase implement → verify loops, run a final overall verification before writing the report and landing:

Dispatch an overall verification agent. In worktree mode, run /verify-changes worktree on the full worktree diff. In delegate mode (or mixed), run /verify-changes on main against the commits from all phases combined. This catches cross-phase integration issues: regressions from later phases breaking earlier work, conflicting imports, duplicated code.
If ANY phase was classified as UI-touching (step 7), dispatch a dedicated manual testing agent that exercises ALL UI changes together via playwright-cli. This agent:
- Tests the combined UI state (not each change in isolation)
- Takes comprehensive screenshots showing everything working together
- Prepares the sign-off report so the user can review efficiently
- Uses /manual-testing recipes for selectors and setup
The goal: instead of "3 items need sign-off, go check yourself," the user gets "3 items need sign-off, here are screenshots of all of them working together."
Proceed to Phase 5 (write report) with the combined verification results.

Phase 2 — Implement

Execution mode detection

Check the phase text for an execution mode directive:

### Execution: delegate <skill> [args] — delegate mode. The phase runs a skill (e.g., /add-block, /run-plan) that manages its own isolation. The orchestrating agent runs on main, not in a worktree. See "Delegate mode" below.
### Execution: worktree or no directive — default worktree mode. See "Worktree mode" below.

Delegate mode

The orchestrating agent runs on main and calls the specified skill. The skill manages its own worktree, verification, and landing.

Dispatch agent on main (no isolation: "worktree"). Give the agent:
- The verbatim phase text (same rule as worktree mode)
- Instruction to run the specified skill with the given arguments
- Instruction to wait for the skill to finish and report the result
Agent timeout: 2 hours. Same as worktree mode.
After the delegate skill finishes, /run-plan proceeds to Phase 3 (verification) which runs on main — checking that the delegated work actually landed correctly.
In finish mode: each delegate phase runs independently (no shared worktree — the delegate skill creates and destroys its own).

Use cases:

### Execution: delegate /add-block DiscreteFilter — block expansion
### Execution: delegate /run-plan plans/SUB_PLAN.md finish auto — meta-plans
### Execution: delegate /draft-plan plans/FOO.md <description> — plan generation

Worktree mode (default)

One worktree for the entire phase (not per-item like /fix-issues). In finish mode, reuse the SAME worktree across all phases — create it once before the first phase, pass the same path to every phase's agent:

Agent timeout: 2 hours. Note the dispatch time. If the implementation agent hasn't returned after 2 hours, declare it failed:

Mark the phase as "Timed out" in reports/plan-{slug}.md
The phase stays incomplete for the next run
The worktree is a cleanup artifact — do NOT auto-land late results
If the agent eventually returns, ignore it. Timed out = failed, period.
If the plan was drafted with /draft-plan, the phase may be too large — consider splitting it (each phase should be ~3-5 components, ~500 lines).

Dispatch implementation agent with isolation: "worktree". Tell the agent to write a .worktreepurpose file as its first action:
```
echo "<session-name or plan-name>: <phase name>" > .worktreepurpose
```
Example: echo "SKILLZ: briefing Phase 1" > .worktreepurpose This metadata helps /briefing worktrees and /briefing verify show what each worktree is for, instead of just an opaque agent ID.
Agent prompt MUST include the verbatim plan text. The implementing agent receives the EXACT text of the phase from the plan file — not a summary, not bullet points extracted from it, not "implement the mechanical domain." The full section with every requirement, formula, constraint, design note, and acceptance criterion.

The plan is the spec. If the agent doesn't have the verbatim text, it will guess, and it will guess wrong.

For plan sections longer than ~100 lines: write the verbatim text to a temp file (e.g., /tmp/phase-text.md) and tell the agent to Read the file. This avoids the natural LLM tendency to compress long text when inlining it in a prompt. Shorter sections can be inlined directly.
If dispatching sub-agents for parallel work items, each sub-agent gets:
- The full phase context (verbatim) — so they understand the big picture
- Their specific scope clearly delineated — e.g., "you are implementing Mass, Spring, Damper. Another agent is implementing sensors and force source."
- What parallel agents are doing — enough to avoid conflicts (shared files, shared infrastructure) but not so much detail that it confuses their scope. Format: "Another agent is handling: [list of items]. You should not modify [shared files] until that work lands."
- Shared infrastructure dependencies — if a base class or domain definition must exist first, that must be built sequentially before dispatching parallel agents. Never dispatch parallel agents that both need to create the same file.
Within-phase parallelism is the agent's judgment call — if items are independent (e.g., Mass, Spring, Damper components), the agent may dispatch sub-agents. If there's shared infrastructure to build first, it works sequentially then parallelizes. The skill does NOT force parallelism.
Commit discipline:
- One logical unit per commit — clean git history
- npm run test:all before every commit — not just npm test
- Tests alongside implementation, not deferred to later
- Agents commit freely in worktrees — that's the point of isolation
- Rebase onto current main before final commit:
```
git fetch origin main && git rebase origin/main
```
  This ensures the commit contains only the agent's changes, not stale copies of files other agents already fixed on main. If rebase conflicts, abort (git rebase --abort) and proceed — the cherry-pick verification will catch stale files via selective extraction.
Running tests in worktrees — CRITICAL. Agents waste hours getting tests working in worktrees without these instructions. Include this VERBATIM in every implementation and verification agent prompt:
Worktree test recipe:
1. Start a dev server FIRST: npm start &
2. Wait for it: sleep 3
3. Run tests with output captured to a file: npm run test:all > .test-results.txt 2>&1 Never pipe through | tail, | head, | grep — it loses output and forces re-runs. Capture once, read the file.
4. The dev server must stay running for E2E tests. If source files changed (they will have — you're implementing), E2E tests FAIL (not skip) without a dev server.
5. If tests fail, read .test-results.txt to find the failures. Then run ONLY the failing test file to iterate on the fix: node --test tests/the-failing-file.test.js Do NOT re-run npm run test:all to diagnose — that wastes 5 minutes when the single file takes 30 seconds.
6. After fixing, run the single file again to confirm. Then run npm run test:all > .test-results.txt 2>&1 ONE more time as the final gate before committing.
7. Max 2 fix attempts at the same error — do not thrash.
8. If a test fails in code you didn't touch, it may be pre-existing. See /verify-changes Phase 3 for the pre-existing failure protocol.
No steps skipped or deferred. If the plan says "implement 7 components," implement 7 components. If it says "write tests for free vibration," write those exact tests. Do not stop after the easy items and declare the hard ones "future work."

Post-implementation tracking

After the implementation agent finishes (whether worktree or delegate mode), create the implementation step marker:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.implement"

Pre-verification tracking

Before dispatching the verification agent, create a delegation requirement marker so the hook can enforce that verification actually runs:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'skill: verify-changes\nparent: run-plan\nid: %s\ndate: %s\n' \
  "$TRACKING_ID" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/requires.verify-changes.$TRACKING_ID"

Pass the tracking ID to the verification agent in the dispatch prompt so it can create its own fulfillment marker:

Your tracking ID is $TRACKING_ID. On entry, create fulfilled.verify-changes.$TRACKING_ID in the main repo's .claude/tracking/ directory.

Phase 3 — Verify (separate agent)

Critical: the verification agent is NOT the implementing agent. Fresh eyes catch implementer blindspots — deferred hard parts, missing tests, stubs, shortcuts.

Agent timeout: 45 minutes. Verification should take 15-30 minutes — reading diffs, running tests, checking acceptance criteria. If a verification agent hasn't returned after 45 minutes, it is thrashing (likely on test setup or repeated test failures). Declare it failed and invoke the Failure Protocol. Do NOT let verification agents run indefinitely — they are the most common source of time waste.

Delegate mode verification

If this phase used delegate execution, verification runs on main:

Verify commits landed — check git log --oneline -10 for the delegate's commits. If expected commits are missing, the delegate failed to land — invoke Failure Protocol.
Run npm run test:all on main — the delegate already tested, but /run-plan verifies against the plan's acceptance criteria.
Check acceptance criteria from the verbatim plan text — the delegate skill doesn't know the plan's criteria, only /run-plan does.
Dispatch a verification agent if needed (same rules as worktree mode below, but targeting main instead of a worktree path).

Worktree mode verification

Dispatch verification agent targeting the worktree's changes. The verification agent is dispatched without isolation: "worktree" — the Agent tool's isolation parameter creates a NEW worktree, it cannot attach to an existing one. Instead, give the verification agent:
- The worktree path from Phase 2 (so it can read files and run tests there via cd <worktree-path> && npm run test:all)
- The worktree branch name (so it can diff against main: git diff main...<branch>)
- The verbatim phase text from the plan (same text the implementer got)
- Instruction to run /verify-changes worktree — the verification agent runs this, NOT you. Do NOT run verification yourself — you are the orchestrator with implementer bias.
- The work items checklist — verify each item was actually implemented, not stubbed or skipped
Additional plan-specific checks (the verifier checks these against the verbatim plan text — not against a summary):
- Do commits cover ALL work items listed in the plan? Any missing?
- Does implementation follow the plan's stated approach? (e.g., "use internal displacement state for Spring" — did it actually do that?)
- Are constraints respected? (no external solvers, etc.)
- Any deferred hard parts, stubs, TODOs, or placeholder implementations?
- Do acceptance criteria match? (e.g., "test free vibration x(t) = A cos(wt)" — does that exact test exist with that exact formula?)
"Noted as gap" is a verification FAILURE. If any work item, acceptance criterion, or checklist item was skipped and merely noted — that is not a pass. It is a fail. The verifier must not rationalize skipped steps as "not blockers" or "gaps for future work." If the plan says to do it and it wasn't done, verification fails. Period.

Past failure: Block Expansion Plan Phase 1 — the implementer skipped the example model (Step 7 of /add-block) and runtime entry (Step 10). The verifier saw both skips but wrote "gaps noted" instead of invoking the Failure Protocol. The phase was reported as complete with missing work.
If verification fails:
- Without auto: present findings, ask user what to do
- With auto: dispatch a fresh fix agent for the missing items. The fix agent receives: the worktree path, the verbatim plan text, the specific items that failed verification, and instructions to complete them — not summarize them, not note them, COMPLETE them. If the missing item is an example model, the fix agent calls /add-example. If it's a runtime entry, the fix agent adds it. The fix agent is NOT the implementer — it's a fresh agent with no bias toward "this is good enough."
  
  After the fix agent finishes, re-verify (max 2 rounds). If still failing after 2 fix+verify cycles, STOP — needs human judgment. Invoke the Failure Protocol.

Post-verification tracking

After verification passes, create the verification step marker:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\nresult: pass\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.verify"

Phase 4 — Update Progress Tracking

After verification passes. These updates happen on MAIN, not in the worktree. The plan file tracks progress across all phases — it's an orchestrator concern, not an implementation artifact. Updating it on main ensures the next cron invocation sees the correct phase status and advances to the next incomplete phase (preventing infinite loops).

Update the plan file's progress tracker on main — change the phase status to Done with the commit hash (from worktree branch or delegate's landed commits) and notes. Examples by format:
- Table: | **4b: Mechanical** | ✅ Done | \abc1234` | 7 components, 45 tests |`
- Checklist: - [x] Phase 4b — Mechanical Domain (abc1234, 7 components)
- Section: add **Status:** ✅ Done (abc1234) to the section header
Update companion progress doc on main if one exists — add implementation details, architecture notes, lessons learned
If no tracker exists:
- Interactive mode: suggest adding one to the plan file, ask user
- Auto mode: note in the report that no tracker was updated
Mark the phase as "In Progress" and commit:
```
git add <plan-file> [companion-doc]
git commit -m "chore: mark phase <name> in progress"
```
Mark as 🟡 In Progress (not ✅ Done yet). Committing immediately ensures the next cron invocation sees the phase is being worked on (preventing re-runs). Phase 6 updates the tracker to ✅ Done AFTER landing succeeds. If landing fails, the tracker correctly says "In Progress" — the phase was attempted but not landed.

Phase 5 — Write Report

PREPEND new phase sections after the H1 in reports/plan-{slug}.md ({slug} from plan filename, e.g., FEATURE_PLAN.md → plan-physics module). Newest phase at the top — the reader's question is "what needs my attention?" and that's always the newest phase.

If the file doesn't exist, create it with a # Plan Report — {plan name} heading. Never overwrite the file — each phase adds a section.

After writing, regenerate PLAN_REPORT.md in the repo root as an index of all plan reports:

Scan reports/plan-*.md files
For each: extract plan name, phase count, overall status, unchecked [ ]
Write index with Needs Sign-off section (linked items) + Plans table
Staleness rule: items >7 days flagged STALE

Report format — each phase gets one ## Phase section:

## Phase — 4b Translational Mechanical Domain [UNFINALIZED]

**Plan:** plans/FEATURE_PLAN.md
**Status:** Completed (verified)
**Worktree:** ../plan-physics module-4b
**Commits:** abc1234, def5678

### Work Items
| # | Item | Status | Commit |
|---|------|--------|--------|
| 1 | Mass component | Done | abc1234 |
| 2 | Spring component | Done | def5678 |

### Verification
- Test suite: PASSED (4342 tests)
- Acceptance criteria: all met

### User Sign-off
{Only if UI files changed. Omit entirely for non-UI phases.}

- [ ] **P4b-1** — Variable viewer panel
  1. Open the app, load a physics module model (e.g., voltage-divider example)
  2. Run the simulation
  3. Click the lightning icon in the toolstrip
  4. Verify the Physical Variables panel opens with columns for V, I, P
  5. Check that values update after simulation completes
  ![viewer panel](.playwright/output/phase4b-variable-viewer.png)

- [ ] **P4b-2** — Toolstrip button
  1. Verify the lightning icon appears in the toolstrip
  2. Click it — panel should toggle open/closed

Report format rules:

One checkbox per item. Do NOT use a summary table with [ ] AND a detail section with [ ] — the viewer counts both as separate checkboxes. Use only the checklist format above.
Phase-prefixed IDs — P4b-1, P2-3, not #1, #2 (which reset per phase and collide).
Include verification instructions under each checkbox — numbered steps, screenshots. The reviewer needs to know what to do, not just what to check off.
One item per verifiable thing — "3 check blocks in Block Explorer" is wrong. Each block gets its own checkbox.
Avoid literal [ ] in description text — the viewer renders it as a phantom checkbox. Describe instead: "bracket pair" or use backtick escaping.

Post-report tracking

After writing the report and regenerating the index, create the report step marker:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.report"

Phase 5b — Plan Completion

Triggers when ALL phases are done: either the last phase just finished (single-phase run where it was the only remaining phase), or in finish mode after all phases complete. Run this BEFORE Phase 6 (Land).

1. Audit phase compliance

Before declaring the plan complete, verify every phase has a clean status:

Check completion indicators — every phase must have one of: Done, a commit hash, ✅, [x]. If any phase lacks a completion indicator, WARN (do not hard-block):

Phase 3 has no completion indicator — review before closing.
Scan for unresolved gaps — check each phase's status line AND its corresponding section in reports/plan-{slug}.md for any of these phrases (case-insensitive): "noted as gap", "deferred", "skipped", "future work". If found, WARN (do not hard-block):

Phase 3 has unresolved gaps — review before closing.

List all flagged phases together so the user can review in one pass.
If running with auto and warnings were emitted, log them in the report but continue — these are advisory, not blocking.

2. Close linked issue (if any)

If the plan file has YAML frontmatter with an issue: field (e.g., issue: 42 or issue: "#42"):

Check issue state:

gh issue view <N> --json state --jq '.state'

If open: close it with a summary comment listing key commits and what was accomplished:

gh issue close <N> --comment "Resolved via plan execution.

Plan: <plan-file>
Key commits: <comma-separated list of commit hashes from all phases>
Phases completed: <count>

All phases passed verification. See reports/plan-{slug}.md for details."

If already closed: no action needed — log "Issue #N already closed."
If no issue: field: skip this step entirely.

3. Update plan frontmatter

Change status: active (or status: in-progress) to status: complete in the plan file's YAML frontmatter. If the plan has no status: field, add one: status: complete. Commit the change:

git add <plan-file>
git commit -m "chore: mark plan complete — <plan-name>"

4. Update SPRINT_REPORT.md

Check if SPRINT_REPORT.md exists in the repo root. If it does:

Search for the closed issue number (from step 2) or the plan filename in a "Skipped" section (look for headers or list items containing "Skipped", "Too Complex", "Deferred", or "Punted").
If found, append a note to that entry:

Resolved via /run-plan (plan: )
If SPRINT_REPORT.md does not exist, or the issue/plan is not mentioned in a skipped section, skip this step.

Phase 6 — Land

Delegate mode landing

If this phase used delegate execution mode, the delegated skill already landed its own work to main. Phase 6 in delegate mode:

Verify commits on main — check git log --oneline -10 for the delegate's commits. If missing, the delegate failed to land — invoke Failure Protocol.
Verify the report exists (Phase 5 ran)
Update the progress tracker (mark phase done)
Skip cherry-picking — work is already on main
Done. Proceed to next phase or exit.

Worktree mode landing

Pre-landing checklist (worktree mode only)

Before ANY cherry-pick to main, verify ALL of these. If any fails, STOP.

ls reports/plan-{slug}.md — report file exists (Phase 5 ran)
Report has a ## Phase section for every completed phase
In finish mode: cross-phase /verify-changes worktree returned clean
If UI-touching phases: playwright-cli agent ran and produced screenshots
If UI-touching phases: report has ### User Verification with [ ] items

Without auto: Phase complete. Output:

Phase complete. Report written to reports/plan-{slug}.md. Review the worktree and cherry-pick when ready, or use /commit land.

All interactive landing and cleanup is the user's decision. /run-plan is DONE after writing the report.
With auto but User Verify items exist: Check the report you just wrote in Phase 5 — if it has a ### User Verification section with unchecked [ ] items, UI changes need human sign-off before landing. Output:

Phase complete. Report written to reports/plan-{slug}.md. User verification needed before landing — review the report, sign off on UI changes, then run /commit land from the worktree.

Items needing sign-off: [list from the User Verification section]

Do NOT auto-land. The worktree is ready; the user reviews and lands when satisfied. This is the landing gate for UI changes — auto automates everything EXCEPT human judgment.

In finish mode: all phases share one worktree. Do NOT land individual phases as they complete — wait until ALL phases are done, then land everything together. Even non-UI phases should wait, because if a later phase has UI that the user rejects, the earlier phases may need to be revised too. The worktree accumulates all commits; landing is one atomic cherry-pick sequence at the end after all sign-offs.
With auto and NO User Verify items: Auto-land verified phase commits to main. Exception for finish mode: do NOT auto-land per-phase — a later phase may have UI that needs sign-off. In finish mode, wait until all phases complete, then land everything together (same as the User Verify gate above). Only auto-land per-phase when running a single phase (no finish flag).

Auto-land steps (single phase, or after all finish-mode phases complete):
1. Try cherry-picking WITHOUT stashing first. Git allows cherry-picks on a dirty working tree as long as the cherry-picked files don't overlap with uncommitted changes. Other sessions may have uncommitted work in the tree — stashing captures THEIR changes too, and the pop can silently merge or lose them.
  
  If git refuses with "your local changes would be overwritten," the cherry-pick touches files with uncommitted changes. Handle with LLM-assisted merge:
  1. Note which files overlap
  2. Capture the pre-stash state of each overlapping file: git diff <file> — save/remember this output. It's your evidence of what the uncommitted changes were (possibly from another session).
  3. git stash -u -m "pre-cherry-pick stash"
  4. git cherry-pick <commit-hash>
  5. git stash apply (NOT pop — keep the stash as a recovery path)
  6. If stash apply produces conflict markers (<<<<<<<), resolve them — read both sides and combine. This is expected for overlaps.
  7. For every overlapping file, READ the result and compare against the pre-stash diff from step 2. Verify every changed line from the uncommitted changes is still present AND the cherry-pick's changes landed correctly. If the merge dropped changes, restore them.
  8. After verification, drop the stash: git stash drop
  9. If you genuinely can't reconcile, STOP and report to the user.
2. Verify main is clean before cherry-picking:
```
npm run test:all
```
  If main's tests are already failing, STOP. Invoke the Failure Protocol — do not cherry-pick on top of broken code.
3. Cherry-pick sequentially — one commit at a time. Try without stash first (step 1). Only stash if git refuses due to file overlap.
4. If a cherry-pick conflicts: unlike /fix-issues (which can skip individual issues), a plan phase is one logical unit — partial landing is not useful. Abort all cherry-picks and invoke the Failure Protocol.
```
git cherry-pick --abort
```
  Re-run the phase after the conflicting code is resolved on main.
5. Extract logs and mark worktree as landed: a. Copy unique session logs from worktree to main's .claude/logs/ b. Write .landed marker (atomic: .tmp → mv):
```
cat > "<worktree-path>/.landed.tmp" <<LANDED
status: full
date: $(TZ=America/New_York date -Iseconds)
source: run-plan
phase: <phase name>
commits: <list of cherry-picked hashes>
LANDED
mv "<worktree-path>/.landed.tmp" "<worktree-path>/.landed"
```
6. Commit extracted logs:
```
git add .claude/logs/
git commit -m "chore: session logs from run-plan phase"
```
7. Restore stash if one was created:
```
git stash pop
```
8. Run tests after all cherry-picks land:
```
npm run test:all
```
  If tests fail, invoke the Failure Protocol.
9. Update tracker to Done — now that landing succeeded, update the plan file's progress tracker from 🟡 In Progress to ✅ Done:
```
git add <plan-file>
git commit -m "chore: mark phase <name> done (landed)"
```
10. Update the plan report (reports/plan-{slug}.md) — mark the phase section as landed. Regenerate PLAN_REPORT.md index.
11. Auto-remove worktree after successful landing:
```
# Extract any remaining logs
if [ -d "<worktree>/.claude/logs" ]; then
  for log in <worktree>/.claude/logs/*.md; do
    [ -f ".claude/logs/$(basename "$log")" ] || cp "$log" .claude/logs/
  done
fi

# Check for real uncommitted work (not artifacts)
DIRTY=$(git -C "<worktree>" diff --name-only HEAD)
UNTRACKED=$(git -C "<worktree>" status --porcelain | \
  grep -v '\.landed\|\.worktreepurpose\|\.test-results\|\.playwright\|node_modules')

if [ -z "$DIRTY" ] && [ -z "$UNTRACKED" ]; then
  rm -f "<worktree>/.landed" "<worktree>/.worktreepurpose" \
        "<worktree>/.test-results.txt"
  git worktree remove "<worktree>"
  git branch -d "<branch>" 2>/dev/null
else
  echo "Worktree not auto-removed: uncommitted work found"
fi
```
  If removal fails for any reason, leave the worktree — it has .landed and /briefing worktrees will classify it as safe.

Post-landing tracking

After successful landing (cherry-pick + tests pass), create the land step marker and update the fulfillment file:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.land"

printf 'skill: run-plan\nid: %s\nplan: %s\nphase: %s\nstatus: complete\ndate: %s\n' \
  "$TRACKING_ID" "$PLAN_FILE" "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/fulfilled.run-plan.$TRACKING_ID"

In finish mode, per-phase markers use the phasestep prefix (the hook ignores these — they are informational only):

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/phasestep.run-plan.$TRACKING_ID.$PHASE.implement"

After the cross-phase verification in finish mode completes, aggregate with step.* markers:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
for stage in implement verify report land; do
  printf 'phases: all\ncompleted: %s\n' "$(TZ=America/New_York date -Iseconds)" \
    > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.$stage"
done

Failure Protocol

If anything goes wrong during an auto or every run — cherry-pick conflict, test failures after landing, verification fails after 2 fix cycles, all agents fail — execute these steps in this exact order:

1. Kill the cron FIRST

This is the most critical step. A broken run leaves state that a subsequent cron run will stomp on.

CronList → find the /run-plan job ID → CronDelete

Do this BEFORE any cleanup or reporting. Even if you're about to fix the problem, kill the cron. The user can restart it after reviewing.

2. Restore the working tree

If a cherry-pick is in a conflicted state:

git cherry-pick --abort

If a stash was created during Phase 6 auto-land:

git stash pop

If git stash pop conflicts, do NOT attempt to clean up. Do NOT git stash drop (destroys untracked files). Do NOT git checkout -- . (also destroys untracked files extracted during the failed pop). The conflicted state preserves all data — leave it as-is and report to the user:

Stash pop conflicts. Stash preserved (contains untracked files). Run git stash show to inspect, then git stash pop manually.

Never destroy work to clean up. It's always better to STOP with a messy state than to lose files trying to clean up.

3. Write the failure to the plan report

Add a ## Run Failed section at the top of the report:

## Run Failed — YYYY-MM-DD HH:MM

**Plan:** plans/FEATURE_PLAN.md
**Phase:** 4b — Translational Mechanical Domain
**Failed at:** Phase N — [description]
**Error:** [what went wrong]
**State:**
- Cherry-picks landed before failure: [list or "none"]
- Stash restored: yes/no
- Worktree with changes: [path]
- Cron killed: yes (was job ID XXXX)

**To resume:** Review the state above, then either:
- Fix the issue in the worktree and re-run `/run-plan <plan-file> <phase>`
- Run `/run-plan <plan-file> auto every <interval>` to restart the cron

4. Alert the user

Output a clear, prominent message:

⚠ RUN-PLAN FAILED — cron stopped

Phase [N] failed: [one-line reason]
[specific error details]

What happened:
  - Implementation was in worktree [path]
  - [M] commits were cherry-picked to main before failure (or "none")
  - Stash was [restored / not needed]
  - Cron job [ID] has been CANCELLED

Working tree is clean. See reports/plan-{slug}.md for full details.
To restart: /run-plan <plan-file> auto every INTERVAL
To cancel: /run-plan stop

When to trigger

Invoke this protocol for ANY of these:

Cherry-pick conflict during Phase 6
npm run test:all fails after cherry-picks are landed
Verification fails after 2 fix+verify cycles (auto mode)
Preflight checks detect stale state (conflict markers, orphaned stash)
Any unrecoverable error that stops the run from completing normally

Do NOT invoke for:

Individual test failures in the worktree during implementation (the implementer fixes those as part of their workflow)
Warnings or non-blocking issues

Key Rules

"Noted as gap" is a FAILURE, not a pass. If the implementer skips a work item and the verifier writes "gaps noted" or "not a blocker" — that is a verification failure. Dispatch a fix agent for the missing items. Do not advance to Phase 4. Do not write "gaps noted" in reports. Past failure: Block Expansion Phase 1 skipped example model + runtime entry; verifier accepted both skips instead of invoking Failure Protocol.
Never weaken tests — fix the code, not the test. Do not loosen tolerances, skip assertions, or remove test cases.
Honest status reporting — if the user asks "are you stuck?", answer with DATA: (1) current phase and when it started, (2) agent duration and tool call count, (3) errors or retries. Do not say "everything is fine" if an agent has been running >30 minutes or retried 2+ times.

Edge Cases

No progress tracker: LLM reads plan sections + checks codebase for evidence of completion (files exist, tests pass, git log mentions the phase)
Phase fails verification: auto mode tries one fix cycle (dispatch fix agent + re-verify), then stops after 2 total cycles
All phases complete: report "Plan complete", delete cron if scheduled
Dependency not met: stop cleanly, report which dependency. If every, the cron retries on next invocation (the dependency may be completed by then)
Phase "In Progress": another agent may be working — stop, don't compete. Report the conflict.
Existing worktree for phase: previous incomplete run — ask user (interactive) or try to resume from the existing worktree (auto)
Implementation produces no commits: the agent worked but committed nothing. Report in reports/plan-{slug}.md as "No commits produced — investigate worktree." Do not attempt to cherry-pick nothing. In auto mode, invoke the Failure Protocol (this is an unrecoverable state for cron)
Plan file not found: stop immediately, report the error
Phase arg doesn't match any phase: stop, list available phases

name	run-plan
disable-model-invocation	false
argument-hint	<plan-file> [phase\|finish\|status] [auto] [every SCHEDULE] [now] \| stop \| next
description	Execute the next phase of a plan document: parse phases and status, dispatch implementation in a worktree, verify with a separate agent, update progress tracking, write reports/plan-{slug}.md, and optionally auto-land to main. Can self-schedule recurring runs via cron. Use `next` to check schedule, `stop` to cancel.

/run-plan <plan-file> [phase|finish] [auto] [every SCHEDULE] [now] | stop | next — Plan Phase Executor

Ultrathink throughout. Use careful, thorough reasoning at every step.

Arguments

/run-plan <plan-file> [phase] [auto] [every SCHEDULE] [now]
/run-plan stop | next

plan-file (required) — path to plan, e.g. plans/FEATURE_PLAN.md
phase (optional) — specific phase, e.g. 4a. If omitted, auto-detect next incomplete phase
finish (optional) — run ALL remaining phases sequentially until the plan is complete. finish is approval to START — do not ask for confirmation before the first phase (the user already said "finish"). Without auto: pauses BETWEEN phases to show results and ask "continue to next phase?" With auto: runs all phases without pausing (overnight). Each phase still gets full verification, testing, and all safety rails. If any phase fails verification or hits a conflict, stops there. finish and every are mutually exclusive. finish runs all phases in one session. every schedules one phase per cron fire. Combining them is meaningless — finish either completes (cron self-terminates) or fails (Failure Protocol kills the cron). Use one or the other.
auto (optional) — bypass approval gates, auto-land to main via cherry-pick
every SCHEDULE (optional) — self-schedule recurring runs via cron:
- Accepts intervals: 4h, 2h, 30m, 12h
- Accepts time-of-day: day at 9am, day at 14:00, weekday at 9am
- Without now: schedules only, does NOT run immediately
- With now: schedules AND runs immediately
- Implies auto — scheduling only makes sense for autonomous runs
- Cron prompt omits phase number so each invocation auto-detects the next incomplete phase
- Each run re-registers the cron (self-perpetuating)
- Cron is session-scoped — dies when the session dies
now (optional) — run immediately. When combined with every, runs immediately AND schedules. Without every, now is the default behavior.
status — show plan progress: all phases, their status, what's next, and what's blocked. Read-only — no agents dispatched, no approval gate.
stop — cancel any existing /run-plan cron and exit. Takes precedence over all other arguments.
next — check when the next scheduled run will fire. Takes precedence over all other arguments except stop.

Detection: scan $ARGUMENTS for:

stop (case-insensitive) — cancel cron and exit (highest precedence)
next (case-insensitive) — check schedule and exit
status (case-insensitive) — show plan progress and exit
finish (case-insensitive) — run all remaining phases sequentially
now (case-insensitive) — run immediately
auto (case-insensitive) — autonomous mode
every followed by a schedule expression — scheduling mode

Examples:

/run-plan plans/FEATURE_PLAN.md — interactive, next phase
/run-plan plans/FEATURE_PLAN.md 4b — interactive, specific phase
/run-plan plans/FEATURE_PLAN.md finish — interactive, all remaining phases (pauses between each)
/run-plan plans/FEATURE_PLAN.md finish auto — autonomous, all remaining phases (no pausing)
/run-plan plans/FEATURE_PLAN.md auto every 4h — schedule every 4h
/run-plan plans/FEATURE_PLAN.md auto every 4h now — schedule + run now
/run-plan plans/FEATURE_PLAN.md status — show plan progress
/run-plan now — trigger the active cron early
/run-plan stop — cancel scheduled runs
/run-plan next — check when the next phase will run

Status (if `status` is present)

If $ARGUMENTS contains status (case-insensitive):

Read the plan file specified in the arguments
Also read any companion progress document if referenced
Parse all phases and their status (same parsing logic as Phase 1 steps 2-3: "Extract phases and status" and "Determine target phase." Do NOT run preflight checks — status is read-only)

Present a progress table:

Plan: plans/FEATURE_PLAN.md

| Phase | Status |
|-------|--------|
| 4a — Electrical | Done (abc1234) |
| 4b — Mechanical | Done (def5678) |
| 4c — Smooth Nonlinear | Next ← |
| 4d — Solver Fixes | Blocked (needs 4c) |
| 4e — UI Polish | Blocked (needs 4d) |

Next phase: 4c — Smooth Nonlinear Components
Dependencies: 4a ✓, 4b ✓

If a cron is active, also show the schedule:

Scheduled: every 4h (~8:15 PM ET next, cron XXXX)
Exit. Read-only — no agents dispatched, no work done.

Now (standalone — no plan-file provided)

If $ARGUMENTS is just now (no plan-file, no phase, no every):

Use CronList to list all cron jobs
Find any whose prompt starts with Run /run-plan
If found: extract the cron's prompt to get the plan-file, auto, and schedule. Run the phase immediately — proceed to Phase 1. Do NOT ask for confirmation — now IS the confirmation. The cron stays active.
If none found: report No active /run-plan cron to trigger. Use /run-plan <plan-file> to run manually. and exit.

Next (if `next` is present)

If $ARGUMENTS contains next (case-insensitive):

Use CronList to list all cron jobs
Find any whose prompt starts with Run /run-plan
Report:
- If found: parse the cron expression and compute the next fire time. Use date +%Z for the timezone. Show both relative and absolute:
  
  Next run-plan phase in ~2h 15m (~8:30 PM ET, cron XXXX). Prompt: Run /run-plan plans/FEATURE_PLAN.md auto every 4h
- If none found: No active /run-plan cron in this session.
Exit. Do not proceed to any phase.

Stop (if `stop` is present)

If $ARGUMENTS contains stop (case-insensitive):

Use CronList to list all cron jobs
Delete ALL whose prompt starts with Run /run-plan using CronDelete
Report what was cancelled:
- If one cron found: Run-plan cron stopped (was job ID XXXX, every INTERVAL).
- If multiple found: Stopped N run-plan crons (IDs: XXXX, YYYY).
- If none found: No active /run-plan cron found.
Exit. Do not proceed to any phase. The stop command does nothing else.

Phase 0 — Schedule (if `every` is present)

If $ARGUMENTS contains every <schedule>:

Parse the schedule — convert to a cron expression. The LLM interprets natural scheduling expressions.

For interval-based schedules (4h, 2h, 30m): use the CURRENT minute as the offset so the first fire is a full interval from now, not aligned to midnight. Check the current minute with date +%M:
- 4h at minute 9 → 9 */4 * * * (fires at :09 past every 4th hour)
- 2h at minute 15 → 15 */2 * * *
- 30m → */30 * * * * (no offset needed for sub-hour)
- 1h at minute 9 → 9 * * * *
For time-of-day schedules (day at 9am, weekday at 2pm): offset round minutes by a few to avoid API busy marks:
- day at 9am → 3 9 * * *
- day at 14:00 → 3 14 * * *
- weekday at 9am → 3 9 * * 1-5
Deduplicate — use CronList + CronDelete to remove any whose prompt starts with Run /run-plan.
Construct the cron prompt. Strip the phase number (so each invocation auto-detects the next incomplete phase). Always include now in the cron prompt so each cron fire runs immediately AND re-registers itself. Note: this now is for the CRON's invocation, not the current invocation:
```
Run /run-plan <plan-file> auto every <schedule> now
```
Note: the phase number is intentionally omitted so the cron auto-advances.
Create the cron — use CronCreate:
- cron: the cron expression from step 1
- recurring: true
- prompt: the constructed command from step 3
Confirm with wall-clock time. Always show times in America/New_York (ET) — use TZ=America/New_York date for conversion, not the system timezone (which may be UTC):

If now is present:

Run-plan scheduled every 4h. Running now. Next phase run after this one: ~8:15 PM ET (cron ID XXXX).

If now is NOT present:

Run-plan scheduled every 4h. First run: ~4:15 PM ET (cron ID XXXX). Use /run-plan next to check, /run-plan stop to cancel.
If now is present: proceed to Phase 1 (run immediately). If now is NOT present: Exit. The cron fires later.

End-of-phase scheduling note: when a phase finishes and a cron is active, always include the estimated next run time with timezone in the completion message. Example:

Phase complete. Next phase run in ~3h 45m (~11:30 PM ET, cron XXXX).

If every is NOT present, skip this phase and proceed to Phase 1 (bare invocation always runs immediately).

Phase 1 — Parse Plan & Extract Verbatim Phase Text

The key differentiator. Plans have varied formats, so the agent uses LLM comprehension rather than rigid parsing.

Preflight checks

Before parsing, check for stale state from a previous failed run:

In-progress git operation?

ls .git/CHERRY_PICK_HEAD .git/MERGE_HEAD .git/REBASE_HEAD 2>/dev/null
git status --porcelain | grep '^UU\|^AA\|^DD'

If either command produces output, STOP. Invoke the Failure Protocol.

Stash stack?
```
git stash list
```
If there is a stash with message containing "pre-cherry-pick", a previous run's stash was never restored. STOP. Invoke the Failure Protocol — the user needs to git stash pop or git stash drop before a new phase can start safely.
Leftover plan worktrees?
```
git worktree list
```
If worktrees from a previous run exist (paths containing plan-), warn the user. Do not remove them — note their presence and continue.
Unconfigured hook placeholders?
```
grep -q '{{' .claude/hooks/block-unsafe-project.sh 2>/dev/null
```
If found AND test infrastructure exists (package.json with a "test" script, or vitest.config.* / jest.config.* exists), STOP. Hook placeholders have not been configured — run /update-zskills first.

Parse plan

Read the plan file in full. Also read any companion progress document if referenced (e.g., FEATURE_PROGRESS_AND_NEXT_STEPS.md).
Extract phases and status — handle four formats:
- Progress tracker table (FEATURE_PLAN style): rows with ✅ Done, ⬚ (not started), 🟡 (in progress), etc.
- Numbered phase sections (## Phase 4a — Title): look for completion markers in the section body or companion doc
- Checklist (- [x] / - [ ]): checked = done, unchecked = not done
- Narrative: infer status from codebase evidence (files exist, tests pass, etc.)
Determine target phase:
- If phase arg given: use it. If already complete, warn (or skip in auto)
- If no phase arg: first incomplete phase
- If ALL phases complete: report "Plan complete" → stop. If every, delete the cron via CronList + CronDelete
- If multiple phases share the same number (e.g., 4a, 4b, 4c), treat each sub-phase as a separate phase
Check dependencies — if a prerequisite phase isn't Done, STOP. Report which dependency is missing. If every, the cron retries later.
Check for conflicts — if the target phase is "In Progress" (🟡 or equivalent), another agent may be working on it. STOP. Do not compete.
Check for staleness notes — if the plan's Dependencies section contains language like "drafted before," "may need refresh," or "APIs and data structures referenced here are based on [another plan's] design, not actual code," the plan may be stale:
- Without auto: tell the user "this plan was drafted before its dependency was implemented. Want me to refresh it with /draft-plan?"
- With auto: dispatch /draft-plan on the plan file to update it. /draft-plan handles existing files as modernizations. After the refresh, re-read the plan and continue.
- Skip this check if the plan file was modified more recently than the dependency's completion (it may already be up to date).
Save the VERBATIM phase text — copy the entire section from the plan file exactly as written. Every sentence, every bullet, every formula, every constraint. This text will be passed to agents in Phase 2 and Phase 3.

Do NOT summarize, paraphrase, or reinterpret. The plan is the spec.

Lesson from /fix-issues #387: summarized descriptions caused agents to implement the wrong thing. "Reset button" was interpreted as "clear canvas" instead of "reset mappings to defaults" because only the title was read. The same will happen with plan phases if the orchestrator summarizes "implement translational mechanical domain" without the formulas, state equations, and design constraints.

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
mkdir -p "$MAIN_ROOT/.claude/tracking"
printf 'skill: run-plan\nid: %s\nplan: %s\nphase: %s\nstatus: started\ndate: %s\n' \
  "$TRACKING_ID" "$PLAN_FILE" "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/fulfilled.run-plan.$TRACKING_ID"

Classify UI impact from the plan text. Scan the phase description for UI indicators: mentions of editor, toolbar, canvas, panel, dialog, CSS, button, menu, viewport, renderer, dark mode, layout, or any reference to UI/editor/styles directories in the project. Flag the phase as UI-touching if any are found.

In finish mode, classify ALL phases upfront and report:

Running 5 phases. Phases 3 and 5 touch UI — landing will wait for your sign-off at the end.

This tells the user immediately whether the run will be fully automatic or will need their review before landing. No surprises at Phase 6.
Present the phase plan:
- Without auto: display the phase summary (name, status, dependencies, work items, UI classification) and wait for user approval
- With auto: proceed immediately

`finish` mode: overall verification after all phases

In finish mode, after ALL phases complete their per-phase implement → verify loops, run a final overall verification before writing the report and landing:

Dispatch an overall verification agent. In worktree mode, run /verify-changes worktree on the full worktree diff. In delegate mode (or mixed), run /verify-changes on main against the commits from all phases combined. This catches cross-phase integration issues: regressions from later phases breaking earlier work, conflicting imports, duplicated code.
If ANY phase was classified as UI-touching (step 7), dispatch a dedicated manual testing agent that exercises ALL UI changes together via playwright-cli. This agent:
- Tests the combined UI state (not each change in isolation)
- Takes comprehensive screenshots showing everything working together
- Prepares the sign-off report so the user can review efficiently
- Uses /manual-testing recipes for selectors and setup
The goal: instead of "3 items need sign-off, go check yourself," the user gets "3 items need sign-off, here are screenshots of all of them working together."
Proceed to Phase 5 (write report) with the combined verification results.

Phase 2 — Implement

Execution mode detection

Check the phase text for an execution mode directive:

### Execution: delegate <skill> [args] — delegate mode. The phase runs a skill (e.g., /add-block, /run-plan) that manages its own isolation. The orchestrating agent runs on main, not in a worktree. See "Delegate mode" below.
### Execution: worktree or no directive — default worktree mode. See "Worktree mode" below.

Delegate mode

The orchestrating agent runs on main and calls the specified skill. The skill manages its own worktree, verification, and landing.

Dispatch agent on main (no isolation: "worktree"). Give the agent:
- The verbatim phase text (same rule as worktree mode)
- Instruction to run the specified skill with the given arguments
- Instruction to wait for the skill to finish and report the result
Agent timeout: 2 hours. Same as worktree mode.
After the delegate skill finishes, /run-plan proceeds to Phase 3 (verification) which runs on main — checking that the delegated work actually landed correctly.
In finish mode: each delegate phase runs independently (no shared worktree — the delegate skill creates and destroys its own).

Use cases:

### Execution: delegate /add-block DiscreteFilter — block expansion
### Execution: delegate /run-plan plans/SUB_PLAN.md finish auto — meta-plans
### Execution: delegate /draft-plan plans/FOO.md <description> — plan generation

Worktree mode (default)

Agent timeout: 2 hours. Note the dispatch time. If the implementation agent hasn't returned after 2 hours, declare it failed:

Mark the phase as "Timed out" in reports/plan-{slug}.md
The phase stays incomplete for the next run
The worktree is a cleanup artifact — do NOT auto-land late results
If the agent eventually returns, ignore it. Timed out = failed, period.
If the plan was drafted with /draft-plan, the phase may be too large — consider splitting it (each phase should be ~3-5 components, ~500 lines).

Dispatch implementation agent with isolation: "worktree". Tell the agent to write a .worktreepurpose file as its first action:
```
echo "<session-name or plan-name>: <phase name>" > .worktreepurpose
```
Example: echo "SKILLZ: briefing Phase 1" > .worktreepurpose This metadata helps /briefing worktrees and /briefing verify show what each worktree is for, instead of just an opaque agent ID.
Agent prompt MUST include the verbatim plan text. The implementing agent receives the EXACT text of the phase from the plan file — not a summary, not bullet points extracted from it, not "implement the mechanical domain." The full section with every requirement, formula, constraint, design note, and acceptance criterion.

The plan is the spec. If the agent doesn't have the verbatim text, it will guess, and it will guess wrong.

For plan sections longer than ~100 lines: write the verbatim text to a temp file (e.g., /tmp/phase-text.md) and tell the agent to Read the file. This avoids the natural LLM tendency to compress long text when inlining it in a prompt. Shorter sections can be inlined directly.
If dispatching sub-agents for parallel work items, each sub-agent gets:
- The full phase context (verbatim) — so they understand the big picture
- Their specific scope clearly delineated — e.g., "you are implementing Mass, Spring, Damper. Another agent is implementing sensors and force source."
- What parallel agents are doing — enough to avoid conflicts (shared files, shared infrastructure) but not so much detail that it confuses their scope. Format: "Another agent is handling: [list of items]. You should not modify [shared files] until that work lands."
- Shared infrastructure dependencies — if a base class or domain definition must exist first, that must be built sequentially before dispatching parallel agents. Never dispatch parallel agents that both need to create the same file.
Within-phase parallelism is the agent's judgment call — if items are independent (e.g., Mass, Spring, Damper components), the agent may dispatch sub-agents. If there's shared infrastructure to build first, it works sequentially then parallelizes. The skill does NOT force parallelism.
Commit discipline:
- One logical unit per commit — clean git history
- npm run test:all before every commit — not just npm test
- Tests alongside implementation, not deferred to later
- Agents commit freely in worktrees — that's the point of isolation
- Rebase onto current main before final commit:
```
git fetch origin main && git rebase origin/main
```
  This ensures the commit contains only the agent's changes, not stale copies of files other agents already fixed on main. If rebase conflicts, abort (git rebase --abort) and proceed — the cherry-pick verification will catch stale files via selective extraction.
Running tests in worktrees — CRITICAL. Agents waste hours getting tests working in worktrees without these instructions. Include this VERBATIM in every implementation and verification agent prompt:
Worktree test recipe:
1. Start a dev server FIRST: npm start &
2. Wait for it: sleep 3
3. Run tests with output captured to a file: npm run test:all > .test-results.txt 2>&1 Never pipe through | tail, | head, | grep — it loses output and forces re-runs. Capture once, read the file.
4. The dev server must stay running for E2E tests. If source files changed (they will have — you're implementing), E2E tests FAIL (not skip) without a dev server.
5. If tests fail, read .test-results.txt to find the failures. Then run ONLY the failing test file to iterate on the fix: node --test tests/the-failing-file.test.js Do NOT re-run npm run test:all to diagnose — that wastes 5 minutes when the single file takes 30 seconds.
6. After fixing, run the single file again to confirm. Then run npm run test:all > .test-results.txt 2>&1 ONE more time as the final gate before committing.
7. Max 2 fix attempts at the same error — do not thrash.
8. If a test fails in code you didn't touch, it may be pre-existing. See /verify-changes Phase 3 for the pre-existing failure protocol.
No steps skipped or deferred. If the plan says "implement 7 components," implement 7 components. If it says "write tests for free vibration," write those exact tests. Do not stop after the easy items and declare the hard ones "future work."

Post-implementation tracking

After the implementation agent finishes (whether worktree or delegate mode), create the implementation step marker:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.implement"

Pre-verification tracking

Before dispatching the verification agent, create a delegation requirement marker so the hook can enforce that verification actually runs:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'skill: verify-changes\nparent: run-plan\nid: %s\ndate: %s\n' \
  "$TRACKING_ID" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/requires.verify-changes.$TRACKING_ID"

Pass the tracking ID to the verification agent in the dispatch prompt so it can create its own fulfillment marker:

Your tracking ID is $TRACKING_ID. On entry, create fulfilled.verify-changes.$TRACKING_ID in the main repo's .claude/tracking/ directory.

Phase 3 — Verify (separate agent)

Critical: the verification agent is NOT the implementing agent. Fresh eyes catch implementer blindspots — deferred hard parts, missing tests, stubs, shortcuts.

Delegate mode verification

If this phase used delegate execution, verification runs on main:

Verify commits landed — check git log --oneline -10 for the delegate's commits. If expected commits are missing, the delegate failed to land — invoke Failure Protocol.
Run npm run test:all on main — the delegate already tested, but /run-plan verifies against the plan's acceptance criteria.
Check acceptance criteria from the verbatim plan text — the delegate skill doesn't know the plan's criteria, only /run-plan does.
Dispatch a verification agent if needed (same rules as worktree mode below, but targeting main instead of a worktree path).

Worktree mode verification

Dispatch verification agent targeting the worktree's changes. The verification agent is dispatched without isolation: "worktree" — the Agent tool's isolation parameter creates a NEW worktree, it cannot attach to an existing one. Instead, give the verification agent:
- The worktree path from Phase 2 (so it can read files and run tests there via cd <worktree-path> && npm run test:all)
- The worktree branch name (so it can diff against main: git diff main...<branch>)
- The verbatim phase text from the plan (same text the implementer got)
- Instruction to run /verify-changes worktree — the verification agent runs this, NOT you. Do NOT run verification yourself — you are the orchestrator with implementer bias.
- The work items checklist — verify each item was actually implemented, not stubbed or skipped
Additional plan-specific checks (the verifier checks these against the verbatim plan text — not against a summary):
- Do commits cover ALL work items listed in the plan? Any missing?
- Does implementation follow the plan's stated approach? (e.g., "use internal displacement state for Spring" — did it actually do that?)
- Are constraints respected? (no external solvers, etc.)
- Any deferred hard parts, stubs, TODOs, or placeholder implementations?
- Do acceptance criteria match? (e.g., "test free vibration x(t) = A cos(wt)" — does that exact test exist with that exact formula?)
"Noted as gap" is a verification FAILURE. If any work item, acceptance criterion, or checklist item was skipped and merely noted — that is not a pass. It is a fail. The verifier must not rationalize skipped steps as "not blockers" or "gaps for future work." If the plan says to do it and it wasn't done, verification fails. Period.

Past failure: Block Expansion Plan Phase 1 — the implementer skipped the example model (Step 7 of /add-block) and runtime entry (Step 10). The verifier saw both skips but wrote "gaps noted" instead of invoking the Failure Protocol. The phase was reported as complete with missing work.
If verification fails:
- Without auto: present findings, ask user what to do
- With auto: dispatch a fresh fix agent for the missing items. The fix agent receives: the worktree path, the verbatim plan text, the specific items that failed verification, and instructions to complete them — not summarize them, not note them, COMPLETE them. If the missing item is an example model, the fix agent calls /add-example. If it's a runtime entry, the fix agent adds it. The fix agent is NOT the implementer — it's a fresh agent with no bias toward "this is good enough."
  
  After the fix agent finishes, re-verify (max 2 rounds). If still failing after 2 fix+verify cycles, STOP — needs human judgment. Invoke the Failure Protocol.

Post-verification tracking

After verification passes, create the verification step marker:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\nresult: pass\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.verify"

Phase 4 — Update Progress Tracking

Update the plan file's progress tracker on main — change the phase status to Done with the commit hash (from worktree branch or delegate's landed commits) and notes. Examples by format:
- Table: | **4b: Mechanical** | ✅ Done | \abc1234` | 7 components, 45 tests |`
- Checklist: - [x] Phase 4b — Mechanical Domain (abc1234, 7 components)
- Section: add **Status:** ✅ Done (abc1234) to the section header
Update companion progress doc on main if one exists — add implementation details, architecture notes, lessons learned
If no tracker exists:
- Interactive mode: suggest adding one to the plan file, ask user
- Auto mode: note in the report that no tracker was updated
Mark the phase as "In Progress" and commit:
```
git add <plan-file> [companion-doc]
git commit -m "chore: mark phase <name> in progress"
```
Mark as 🟡 In Progress (not ✅ Done yet). Committing immediately ensures the next cron invocation sees the phase is being worked on (preventing re-runs). Phase 6 updates the tracker to ✅ Done AFTER landing succeeds. If landing fails, the tracker correctly says "In Progress" — the phase was attempted but not landed.

Phase 5 — Write Report

If the file doesn't exist, create it with a # Plan Report — {plan name} heading. Never overwrite the file — each phase adds a section.

After writing, regenerate PLAN_REPORT.md in the repo root as an index of all plan reports:

Scan reports/plan-*.md files
For each: extract plan name, phase count, overall status, unchecked [ ]
Write index with Needs Sign-off section (linked items) + Plans table
Staleness rule: items >7 days flagged STALE

Report format — each phase gets one ## Phase section:

## Phase — 4b Translational Mechanical Domain [UNFINALIZED]

**Plan:** plans/FEATURE_PLAN.md
**Status:** Completed (verified)
**Worktree:** ../plan-physics module-4b
**Commits:** abc1234, def5678

### Work Items
| # | Item | Status | Commit |
|---|------|--------|--------|
| 1 | Mass component | Done | abc1234 |
| 2 | Spring component | Done | def5678 |

### Verification
- Test suite: PASSED (4342 tests)
- Acceptance criteria: all met

### User Sign-off
{Only if UI files changed. Omit entirely for non-UI phases.}

- [ ] **P4b-1** — Variable viewer panel
  1. Open the app, load a physics module model (e.g., voltage-divider example)
  2. Run the simulation
  3. Click the lightning icon in the toolstrip
  4. Verify the Physical Variables panel opens with columns for V, I, P
  5. Check that values update after simulation completes
  ![viewer panel](.playwright/output/phase4b-variable-viewer.png)

- [ ] **P4b-2** — Toolstrip button
  1. Verify the lightning icon appears in the toolstrip
  2. Click it — panel should toggle open/closed

Report format rules:

One checkbox per item. Do NOT use a summary table with [ ] AND a detail section with [ ] — the viewer counts both as separate checkboxes. Use only the checklist format above.
Phase-prefixed IDs — P4b-1, P2-3, not #1, #2 (which reset per phase and collide).
Include verification instructions under each checkbox — numbered steps, screenshots. The reviewer needs to know what to do, not just what to check off.
One item per verifiable thing — "3 check blocks in Block Explorer" is wrong. Each block gets its own checkbox.
Avoid literal [ ] in description text — the viewer renders it as a phantom checkbox. Describe instead: "bracket pair" or use backtick escaping.

Post-report tracking

After writing the report and regenerating the index, create the report step marker:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.report"

Phase 5b — Plan Completion

1. Audit phase compliance

Before declaring the plan complete, verify every phase has a clean status:

Check completion indicators — every phase must have one of: Done, a commit hash, ✅, [x]. If any phase lacks a completion indicator, WARN (do not hard-block):

Phase 3 has no completion indicator — review before closing.
Scan for unresolved gaps — check each phase's status line AND its corresponding section in reports/plan-{slug}.md for any of these phrases (case-insensitive): "noted as gap", "deferred", "skipped", "future work". If found, WARN (do not hard-block):

Phase 3 has unresolved gaps — review before closing.

List all flagged phases together so the user can review in one pass.
If running with auto and warnings were emitted, log them in the report but continue — these are advisory, not blocking.

2. Close linked issue (if any)

If the plan file has YAML frontmatter with an issue: field (e.g., issue: 42 or issue: "#42"):

Check issue state:

gh issue view <N> --json state --jq '.state'

If open: close it with a summary comment listing key commits and what was accomplished:

gh issue close <N> --comment "Resolved via plan execution.

Plan: <plan-file>
Key commits: <comma-separated list of commit hashes from all phases>
Phases completed: <count>

All phases passed verification. See reports/plan-{slug}.md for details."

If already closed: no action needed — log "Issue #N already closed."
If no issue: field: skip this step entirely.

3. Update plan frontmatter

Change status: active (or status: in-progress) to status: complete in the plan file's YAML frontmatter. If the plan has no status: field, add one: status: complete. Commit the change:

git add <plan-file>
git commit -m "chore: mark plan complete — <plan-name>"

4. Update SPRINT_REPORT.md

Check if SPRINT_REPORT.md exists in the repo root. If it does:

Search for the closed issue number (from step 2) or the plan filename in a "Skipped" section (look for headers or list items containing "Skipped", "Too Complex", "Deferred", or "Punted").
If found, append a note to that entry:

Resolved via /run-plan (plan: )
If SPRINT_REPORT.md does not exist, or the issue/plan is not mentioned in a skipped section, skip this step.

Phase 6 — Land

Delegate mode landing

If this phase used delegate execution mode, the delegated skill already landed its own work to main. Phase 6 in delegate mode:

Verify commits on main — check git log --oneline -10 for the delegate's commits. If missing, the delegate failed to land — invoke Failure Protocol.
Verify the report exists (Phase 5 ran)
Update the progress tracker (mark phase done)
Skip cherry-picking — work is already on main
Done. Proceed to next phase or exit.

Worktree mode landing

Pre-landing checklist (worktree mode only)

Before ANY cherry-pick to main, verify ALL of these. If any fails, STOP.

ls reports/plan-{slug}.md — report file exists (Phase 5 ran)
Report has a ## Phase section for every completed phase
In finish mode: cross-phase /verify-changes worktree returned clean
If UI-touching phases: playwright-cli agent ran and produced screenshots
If UI-touching phases: report has ### User Verification with [ ] items

Without auto: Phase complete. Output:

Phase complete. Report written to reports/plan-{slug}.md. Review the worktree and cherry-pick when ready, or use /commit land.

All interactive landing and cleanup is the user's decision. /run-plan is DONE after writing the report.
With auto but User Verify items exist: Check the report you just wrote in Phase 5 — if it has a ### User Verification section with unchecked [ ] items, UI changes need human sign-off before landing. Output:

Phase complete. Report written to reports/plan-{slug}.md. User verification needed before landing — review the report, sign off on UI changes, then run /commit land from the worktree.

Items needing sign-off: [list from the User Verification section]

Do NOT auto-land. The worktree is ready; the user reviews and lands when satisfied. This is the landing gate for UI changes — auto automates everything EXCEPT human judgment.

In finish mode: all phases share one worktree. Do NOT land individual phases as they complete — wait until ALL phases are done, then land everything together. Even non-UI phases should wait, because if a later phase has UI that the user rejects, the earlier phases may need to be revised too. The worktree accumulates all commits; landing is one atomic cherry-pick sequence at the end after all sign-offs.
With auto and NO User Verify items: Auto-land verified phase commits to main. Exception for finish mode: do NOT auto-land per-phase — a later phase may have UI that needs sign-off. In finish mode, wait until all phases complete, then land everything together (same as the User Verify gate above). Only auto-land per-phase when running a single phase (no finish flag).

Auto-land steps (single phase, or after all finish-mode phases complete):
1. Try cherry-picking WITHOUT stashing first. Git allows cherry-picks on a dirty working tree as long as the cherry-picked files don't overlap with uncommitted changes. Other sessions may have uncommitted work in the tree — stashing captures THEIR changes too, and the pop can silently merge or lose them.
  
  If git refuses with "your local changes would be overwritten," the cherry-pick touches files with uncommitted changes. Handle with LLM-assisted merge:
  1. Note which files overlap
  2. Capture the pre-stash state of each overlapping file: git diff <file> — save/remember this output. It's your evidence of what the uncommitted changes were (possibly from another session).
  3. git stash -u -m "pre-cherry-pick stash"
  4. git cherry-pick <commit-hash>
  5. git stash apply (NOT pop — keep the stash as a recovery path)
  6. If stash apply produces conflict markers (<<<<<<<), resolve them — read both sides and combine. This is expected for overlaps.
  7. For every overlapping file, READ the result and compare against the pre-stash diff from step 2. Verify every changed line from the uncommitted changes is still present AND the cherry-pick's changes landed correctly. If the merge dropped changes, restore them.
  8. After verification, drop the stash: git stash drop
  9. If you genuinely can't reconcile, STOP and report to the user.
2. Verify main is clean before cherry-picking:
```
npm run test:all
```
  If main's tests are already failing, STOP. Invoke the Failure Protocol — do not cherry-pick on top of broken code.
3. Cherry-pick sequentially — one commit at a time. Try without stash first (step 1). Only stash if git refuses due to file overlap.
4. If a cherry-pick conflicts: unlike /fix-issues (which can skip individual issues), a plan phase is one logical unit — partial landing is not useful. Abort all cherry-picks and invoke the Failure Protocol.
```
git cherry-pick --abort
```
  Re-run the phase after the conflicting code is resolved on main.
5. Extract logs and mark worktree as landed: a. Copy unique session logs from worktree to main's .claude/logs/ b. Write .landed marker (atomic: .tmp → mv):
```
cat > "<worktree-path>/.landed.tmp" <<LANDED
status: full
date: $(TZ=America/New_York date -Iseconds)
source: run-plan
phase: <phase name>
commits: <list of cherry-picked hashes>
LANDED
mv "<worktree-path>/.landed.tmp" "<worktree-path>/.landed"
```
6. Commit extracted logs:
```
git add .claude/logs/
git commit -m "chore: session logs from run-plan phase"
```
7. Restore stash if one was created:
```
git stash pop
```
8. Run tests after all cherry-picks land:
```
npm run test:all
```
  If tests fail, invoke the Failure Protocol.
9. Update tracker to Done — now that landing succeeded, update the plan file's progress tracker from 🟡 In Progress to ✅ Done:
```
git add <plan-file>
git commit -m "chore: mark phase <name> done (landed)"
```
10. Update the plan report (reports/plan-{slug}.md) — mark the phase section as landed. Regenerate PLAN_REPORT.md index.
11. Auto-remove worktree after successful landing:
```
# Extract any remaining logs
if [ -d "<worktree>/.claude/logs" ]; then
  for log in <worktree>/.claude/logs/*.md; do
    [ -f ".claude/logs/$(basename "$log")" ] || cp "$log" .claude/logs/
  done
fi

# Check for real uncommitted work (not artifacts)
DIRTY=$(git -C "<worktree>" diff --name-only HEAD)
UNTRACKED=$(git -C "<worktree>" status --porcelain | \
  grep -v '\.landed\|\.worktreepurpose\|\.test-results\|\.playwright\|node_modules')

if [ -z "$DIRTY" ] && [ -z "$UNTRACKED" ]; then
  rm -f "<worktree>/.landed" "<worktree>/.worktreepurpose" \
        "<worktree>/.test-results.txt"
  git worktree remove "<worktree>"
  git branch -d "<branch>" 2>/dev/null
else
  echo "Worktree not auto-removed: uncommitted work found"
fi
```
  If removal fails for any reason, leave the worktree — it has .landed and /briefing worktrees will classify it as safe.

Post-landing tracking

After successful landing (cherry-pick + tests pass), create the land step marker and update the fulfillment file:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.land"

printf 'skill: run-plan\nid: %s\nplan: %s\nphase: %s\nstatus: complete\ndate: %s\n' \
  "$TRACKING_ID" "$PLAN_FILE" "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/fulfilled.run-plan.$TRACKING_ID"

In finish mode, per-phase markers use the phasestep prefix (the hook ignores these — they are informational only):

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
printf 'phase: %s\ncompleted: %s\n' "$PHASE" "$(TZ=America/New_York date -Iseconds)" \
  > "$MAIN_ROOT/.claude/tracking/phasestep.run-plan.$TRACKING_ID.$PHASE.implement"

After the cross-phase verification in finish mode completes, aggregate with step.* markers:

MAIN_ROOT=$(cd "$(git rev-parse --git-common-dir)/.." && pwd)
for stage in implement verify report land; do
  printf 'phases: all\ncompleted: %s\n' "$(TZ=America/New_York date -Iseconds)" \
    > "$MAIN_ROOT/.claude/tracking/step.run-plan.$TRACKING_ID.$stage"
done

Failure Protocol

1. Kill the cron FIRST

This is the most critical step. A broken run leaves state that a subsequent cron run will stomp on.

CronList → find the /run-plan job ID → CronDelete

Do this BEFORE any cleanup or reporting. Even if you're about to fix the problem, kill the cron. The user can restart it after reviewing.

2. Restore the working tree

If a cherry-pick is in a conflicted state:

git cherry-pick --abort

If a stash was created during Phase 6 auto-land:

git stash pop

Stash pop conflicts. Stash preserved (contains untracked files). Run git stash show to inspect, then git stash pop manually.

Never destroy work to clean up. It's always better to STOP with a messy state than to lose files trying to clean up.

3. Write the failure to the plan report

Add a ## Run Failed section at the top of the report:

## Run Failed — YYYY-MM-DD HH:MM

**Plan:** plans/FEATURE_PLAN.md
**Phase:** 4b — Translational Mechanical Domain
**Failed at:** Phase N — [description]
**Error:** [what went wrong]
**State:**
- Cherry-picks landed before failure: [list or "none"]
- Stash restored: yes/no
- Worktree with changes: [path]
- Cron killed: yes (was job ID XXXX)

**To resume:** Review the state above, then either:
- Fix the issue in the worktree and re-run `/run-plan <plan-file> <phase>`
- Run `/run-plan <plan-file> auto every <interval>` to restart the cron

4. Alert the user

Output a clear, prominent message:

⚠ RUN-PLAN FAILED — cron stopped

Phase [N] failed: [one-line reason]
[specific error details]

What happened:
  - Implementation was in worktree [path]
  - [M] commits were cherry-picked to main before failure (or "none")
  - Stash was [restored / not needed]
  - Cron job [ID] has been CANCELLED

Working tree is clean. See reports/plan-{slug}.md for full details.
To restart: /run-plan <plan-file> auto every INTERVAL
To cancel: /run-plan stop

When to trigger

Invoke this protocol for ANY of these:

Cherry-pick conflict during Phase 6
npm run test:all fails after cherry-picks are landed
Verification fails after 2 fix+verify cycles (auto mode)
Preflight checks detect stale state (conflict markers, orphaned stash)
Any unrecoverable error that stops the run from completing normally

Do NOT invoke for:

Individual test failures in the worktree during implementation (the implementer fixes those as part of their workflow)
Warnings or non-blocking issues

Key Rules

"Noted as gap" is a FAILURE, not a pass. If the implementer skips a work item and the verifier writes "gaps noted" or "not a blocker" — that is a verification failure. Dispatch a fix agent for the missing items. Do not advance to Phase 4. Do not write "gaps noted" in reports. Past failure: Block Expansion Phase 1 skipped example model + runtime entry; verifier accepted both skips instead of invoking Failure Protocol.
Never weaken tests — fix the code, not the test. Do not loosen tolerances, skip assertions, or remove test cases.
Honest status reporting — if the user asks "are you stuck?", answer with DATA: (1) current phase and when it started, (2) agent duration and tool call count, (3) errors or retries. Do not say "everything is fine" if an agent has been running >30 minutes or retried 2+ times.

Edge Cases

No progress tracker: LLM reads plan sections + checks codebase for evidence of completion (files exist, tests pass, git log mentions the phase)
Phase fails verification: auto mode tries one fix cycle (dispatch fix agent + re-verify), then stops after 2 total cycles
All phases complete: report "Plan complete", delete cron if scheduled
Dependency not met: stop cleanly, report which dependency. If every, the cron retries on next invocation (the dependency may be completed by then)
Phase "In Progress": another agent may be working — stop, don't compete. Report the conflict.
Existing worktree for phase: previous incomplete run — ask user (interactive) or try to resume from the existing worktree (auto)
Implementation produces no commits: the agent worked but committed nothing. Report in reports/plan-{slug}.md as "No commits produced — investigate worktree." Do not attempt to cherry-pick nothing. In auto mode, invoke the Failure Protocol (this is an unrecoverable state for cron)
Plan file not found: stop immediately, report the error
Phase arg doesn't match any phase: stop, list available phases

run-plan

More from this repository

More from this repository

/run-plan <plan-file> [phase|finish] [auto] [every SCHEDULE] [now] | stop | next — Plan Phase Executor

Arguments

Status (if status is present)

Now (standalone — no plan-file provided)

Next (if next is present)

Stop (if stop is present)

Phase 0 — Schedule (if every is present)

Phase 1 — Parse Plan & Extract Verbatim Phase Text

Preflight checks

Parse plan

finish mode: overall verification after all phases

Phase 2 — Implement

Execution mode detection

Delegate mode

Worktree mode (default)

Post-implementation tracking

Pre-verification tracking

Phase 3 — Verify (separate agent)

Delegate mode verification

Worktree mode verification

Post-verification tracking

Phase 4 — Update Progress Tracking

Phase 5 — Write Report

Post-report tracking

Phase 5b — Plan Completion

1. Audit phase compliance

2. Close linked issue (if any)

3. Update plan frontmatter

4. Update SPRINT_REPORT.md

Phase 6 — Land

Delegate mode landing

Worktree mode landing

Pre-landing checklist (worktree mode only)

Post-landing tracking

Failure Protocol

1. Kill the cron FIRST

2. Restore the working tree

3. Write the failure to the plan report

4. Alert the user

When to trigger

Key Rules

Edge Cases

/run-plan <plan-file> [phase|finish] [auto] [every SCHEDULE] [now] | stop | next — Plan Phase Executor

Arguments

Status (if status is present)

Now (standalone — no plan-file provided)

Next (if next is present)

Stop (if stop is present)

Phase 0 — Schedule (if every is present)

Phase 1 — Parse Plan & Extract Verbatim Phase Text

Preflight checks

Parse plan

finish mode: overall verification after all phases

Phase 2 — Implement

Execution mode detection

Delegate mode

Worktree mode (default)

Post-implementation tracking

Pre-verification tracking

Phase 3 — Verify (separate agent)

Delegate mode verification

Worktree mode verification

Post-verification tracking

Phase 4 — Update Progress Tracking

Phase 5 — Write Report

Post-report tracking

Phase 5b — Plan Completion

1. Audit phase compliance

2. Close linked issue (if any)

3. Update plan frontmatter

4. Update SPRINT_REPORT.md

Phase 6 — Land

Delegate mode landing

Worktree mode landing

Pre-landing checklist (worktree mode only)

Post-landing tracking

Failure Protocol

Status (if `status` is present)

Next (if `next` is present)

Stop (if `stop` is present)

Phase 0 — Schedule (if `every` is present)

`finish` mode: overall verification after all phases

Status (if `status` is present)

Next (if `next` is present)

Stop (if `stop` is present)

Phase 0 — Schedule (if `every` is present)

`finish` mode: overall verification after all phases