| name | dx-step-all |
| description | Autonomous execution loop — runs each plan step (implement + test + review + commit internally), with step-fix for failures. Stops after 2 consecutive fix failures on the same step. Use to execute the full plan hands-free. |
| argument-hint | [Work Item ID or slug (optional — uses most recent if omitted)] |
| context | fork |
| allowed-tools | ["read","edit","search","write","agent"] |
You are a coordinator. You run the step pipeline for each step in implement.md, delegating to skills via the Skill tool.
Progress visibility
You run in a forked context. Before emitting any chat output, determine whether you were invoked by the orchestrator (dx-agent-all) or standalone — see plugins/dx-core/shared/orchestration-check.md:
ORCHESTRATED=0
FLAG=".ai/run-context/orchestrating.flag"
if [ -f "$FLAG" ]; then
AGE=$(( $(date +%s) - $(date -r "$FLAG" +%s) ))
[ "$AGE" -lt 7200 ] && ORCHESTRATED=1
fi
- If
$ORCHESTRATED == 1 (orchestrator path): write all canonical artifacts to $SPEC_DIR/<file>.md (already documented below) and emit ONLY the ## Return block to chat.
- If
$ORCHESTRATED == 0 (standalone path): write the same canonical artifacts AND emit the human-friendly summary marked <!-- standalone-only --> below, followed by the ## Return block at the very end.
Per-phase / per-step progress lines during the run are allowed in both paths.
The orchestrator (dx-agent-all) cannot see your TaskList directly. To preserve user-visible progress, you MUST update $SPEC_DIR/dev-all-progress.md after each step transition:
- Mark step in_progress at the start of execution
- Mark step done | failed | healing immediately on transition
- Update within the same logical action — never batch updates
The orchestrator reads this file after this skill returns and prints a one-line status summary to the user. Skill invocations are blocking — the orchestrator does not poll mid-execution. If you skip an update, the user sees stale progress when they ask for a recap.
Format — one row per step in a table (see .ai/templates/spec/dev-all-progress.md.template if it exists, else use the existing format from prior runs):
| Step | Status | Note |
|---|
| 1:
| done | committed 4bf6fe5 |
| 2:
| in_progress | — |
| 3:
| — | — |
Progress Tracking
After loading implement.md, use TaskList to check for existing tasks from a previous run. If stale tasks exist, cancel them first with TaskUpdate (status: cancelled). Then create a task for each plan step using TaskCreate (e.g., "Step 1: Create dialog XML"). Mark each in_progress when executing, completed when committed. On fix attempts, update the task subject (e.g., "Step 1: Create dialog XML (fix 1)"). On heal cycles, add a task: "Healing: ".
Flow
digraph step_all {
"Locate spec dir + check run-state" [shape=box];
"Ensure feature branch" [shape=box];
"Load fix memory" [shape=box];
"Get next pending step" [shape=box];
"All steps done?" [shape=diamond];
"Execute step" [shape=box];
"Step passed?" [shape=diamond];
"Fix attempt" [shape=box];
"Fix succeeded?" [shape=diamond];
"Consecutive failures < 2?" [shape=diamond];
"Invoke step-fix (heal)" [shape=box];
"Heal result?" [shape=diamond];
"Execute corrective steps" [shape=box];
"Corrective step failed after 2 fixes?" [shape=diamond];
"Healing cycle < 2?" [shape=diamond];
"Log progress + track fix patterns" [shape=box];
"Completion summary + log run + promote fixes" [shape=doublecircle];
"STOP: Human intervention needed" [shape=doublecircle];
"Locate spec dir + check run-state" -> "Ensure feature branch";
"Ensure feature branch" -> "Load fix memory";
"Load fix memory" -> "Get next pending step";
"Get next pending step" -> "All steps done?";
"All steps done?" -> "Completion summary + log run + promote fixes" [label="yes"];
"All steps done?" -> "Execute step" [label="no"];
"Execute step" -> "Step passed?";
"Step passed?" -> "Log progress + track fix patterns" [label="yes"];
"Step passed?" -> "Fix attempt" [label="no (blocked)"];
"Fix attempt" -> "Fix succeeded?";
"Fix succeeded?" -> "Log progress + track fix patterns" [label="yes"];
"Fix succeeded?" -> "Consecutive failures < 2?" [label="no"];
"Consecutive failures < 2?" -> "Fix attempt" [label="yes, retry"];
"Consecutive failures < 2?" -> "Invoke step-fix (heal)" [label="no, 2 strikes"];
"Invoke step-fix (heal)" -> "Heal result?";
"Heal result?" -> "Execute corrective steps" [label="healed"];
"Heal result?" -> "STOP: Human intervention needed" [label="unrecoverable"];
"Execute corrective steps" -> "Corrective step failed after 2 fixes?";
"Corrective step failed after 2 fixes?" -> "Log progress + track fix patterns" [label="no, passed"];
"Corrective step failed after 2 fixes?" -> "Healing cycle < 2?" [label="yes, failed"];
"Healing cycle < 2?" -> "Invoke step-fix (heal)" [label="yes, try again"];
"Healing cycle < 2?" -> "STOP: Human intervention needed" [label="no, 2 cycles exhausted"];
"Log progress + track fix patterns" -> "Get next pending step";
}
Node Details
Locate spec dir + check run-state
SPEC_DIR=$(bash .ai/lib/dx-common.sh find-spec-dir $ARGUMENTS)
Maintain run state in $SPEC_DIR/run-state.json:
{
"skill": "dx-step-all",
"ticket": "<id>",
"started": "<ISO-8601>",
"phase": 0,
"step": 0,
"completed_phases": [],
"last_action": "",
"files_modified": []
}
On invocation:
- Check for existing
$SPEC_DIR/run-state.json
- If exists and
started < 2 hours ago → ask user: "Previous run found at phase {phase}, step {step}. Resume or start fresh?"
- If exists and
started ≥ 2 hours ago → warn: "Stale run state found (started {time}). Starting fresh." Delete run-state.json.
- If not exists → create new run-state.json
During execution: Update run-state.json after each phase/step completion.
On completion: Delete run-state.json (clean exit).
Get the plan metadata (step count and status) — do NOT read full implement.md here:
bash .ai/lib/plan-metadata.sh $SPEC_DIR
This outputs a compact summary (~10 tokens/step vs ~200 for full file). Workers (step, step-fix) will read the full implement.md when they need it.
Print: Starting execution: <N> steps total, <M> pending.
Ensure feature branch
Before executing any steps, run the shared branch guard:
bash .ai/lib/ensure-feature-branch.sh $SPEC_DIR
This no-ops if already on feature/* or bugfix/*. Creates a feature branch if on any other branch. The branch name is saved to $SPEC_DIR/.branch.
Print: Branch: <BRANCH> (<BRANCH_ACTION>)
This is a hard gate — never execute steps on a protected branch.
Load fix memory
Before entering the execution loop, check if the project has accumulated fix knowledge from previous runs.
If .ai/learning/fixes.md exists:
- Read it
- Include a note when dispatching skills: append to the prompt:
Known fix patterns for this project: <summary of fixes.md content>
- Print:
Learning: Loaded <N> known fix patterns from previous runs.
If it does not exist, skip silently.
Get next pending step
Re-read plan metadata to get the latest status (skills update implement.md). Especially important after step-fix creates new steps. Never read full implement.md in the orchestrator — workers handle that.
bash .ai/lib/plan-metadata.sh $SPEC_DIR
Identify the next step with status pending.
All steps done?
Check if any pending or in-progress steps remain.
- yes → proceed to "Completion summary + log run + promote fixes"
- no → proceed to "Execute step" with the next pending step
Execute step
Invoke Skill(/dx-step) for the current step. dx-step now handles implement + test + review + commit as internal phases — no separate dispatches needed.
Step passed?
Check the result from dx-step:
- success → proceed to "Log progress + track fix patterns"
- failure → classify the error (see Error Classification below), then proceed to "Fix attempt"
Error Classification: Before triggering healing, classify the failure using shared/error-handling.md:
- TRANSIENT → Retry the step (counts toward consecutive failure limit)
- VALIDATION → Let dx-step-fix attempt repair (counts as healing cycle)
- PERMANENT → Skip healing entirely. Mark step blocked. Report to user immediately. Proceed to "STOP: Human intervention needed".
Fix attempt
Invoke Skill(/dx-step-fix) for the current step.
Track fix attempts per step.
Fix succeeded?
Check the fix result:
- yes → proceed to "Log progress + track fix patterns"
- no → proceed to "Consecutive failures < 2?"
Consecutive failures < 2?
Check consecutive fix failure count for this step:
- yes, retry → proceed to "Fix attempt" for another attempt
- no, 2 strikes → proceed to "Invoke step-fix (heal)"
Invoke step-fix (heal)
Track healing cycles per step. Max 2 healing cycles per original step.
Invoke Skill(/dx-step-fix) with healing context:
/dx-step-fix <SPEC_DIR> --heal --failure-type step-blocked --blocked-step <N>
dx-step-fix now handles both fix and heal modes internally.
Heal result?
Check the step-fix return:
Execute corrective steps
Run the normal Execute step → Fix attempt cycle on each new corrective step created by step-fix.
Corrective step failed after 2 fixes?
After running the corrective step through the full cycle (including up to 2 fix attempts):
- no, passed → proceed to "Log progress + track fix patterns"
- yes, failed → proceed to "Healing cycle < 2?"
Healing cycle < 2?
Check the healing cycle count for the original step:
- yes, try again → proceed to "Invoke step-fix (heal)" for another healing cycle
- no, 2 cycles exhausted → print:
Step <N> blocked after 2 healing cycles. Human intervention needed. Proceed to "STOP: Human intervention needed".
Log progress + track fix patterns
After each step cycle, print:
Step <N>/<total> done — <step title>
Track and persist learning signals for each step that completes (or fails):
step_number: which step
step_title: the step's title from implement.md
fix_attempts: number of fix attempts (0 if none needed)
fix_succeeded: how many fix attempts succeeded
fix_failed: how many fix attempts failed
heal_cycles: number of healing cycles invoked (0 if none)
heal_succeeded: how many heal cycles succeeded
heal_failed: how many heal cycles failed
fix_types: list of error types encountered (e.g., "compilation", "test-failure", "review-rejection")
Persist immediately after each step (not deferred to end-of-run):
mkdir -p .ai/learning/raw
If fix attempts occurred for this step, append one JSONL line per attempt to .ai/learning/raw/fixes.jsonl:
{"timestamp":"<ISO-8601>","ticket":"<id>","step":<N>,"error_type":"<type>","fix_description":"<what was done>","result":"<success|failed>"}
Incremental pattern promotion: After appending, check if any pattern in fixes.jsonl now has 3+ successes and 0 failures. If so, promote it immediately (see "Completion summary + log run + promote fixes" for format). This ensures learning survives even if the run is interrupted.
Completion summary + log run + promote fixes
After all steps are done:
When running standalone ($ORCHESTRATED == 0), emit:
## Execution Complete
**<count> steps** executed — <count> succeeded, <count> failed, <count> self-healed.
| Step | Status | Note |
|------|--------|------|
| 1: <title> | done | committed <sha> |
| 2: <title> | done | committed <sha> |
<... one row per step ...>
<If any steps failed:>
### Blocked Steps
- Step <N>: <title> — <error summary>
<If cross-repo note from implement.md:>
### Other repos required
<repo list from implement.md>
### Next step:
- `/dx-step-build` — build and verify
When orchestrated ($ORCHESTRATED == 1), skip the above block entirely. The orchestrator (dx-agent-all) reads $SPEC_DIR/dev-all-progress.md directly to summarize step counts and present next-steps. The Cross-Repo Note (when implement.md has "Other repos required") still belongs in the orchestrator's Final Summary — pass it through via the ## Return block's summary (truncated) and next_action fields.
Log run record:
mkdir -p .ai/learning/raw
Append one JSONL line to .ai/learning/raw/runs.jsonl:
{"timestamp":"<ISO-8601>","ticket":"<id>","flow":"step-all","total_steps":<N>,"steps_completed":<N>,"fix_attempts":{"succeeded":<N>,"failed":<N>},"heal_cycles":{"succeeded":<N>,"failed":<N>},"step_titles":["<title1>","<title2>"]}
Use Bash to append — echo '<json>' >> .ai/learning/raw/runs.jsonl
Final promotion sweep:
Run one final pattern promotion check (same logic as incremental promotion in "Log progress + track fix patterns") to catch any patterns that crossed the 3-success threshold during the last step. This is a safety net — most promotions happen incrementally.
Read .ai/learning/raw/fixes.jsonl. Group entries by error_type + fix_description. For each combination with 3+ successes AND 0 failures:
Summary message:
If fix patterns were logged:
- Print:
Learning: <N> fix patterns logged, <M> promoted to rules.
If no fix patterns at all:
- Print:
Learning: Clean run — no fixes needed.
STOP: Human intervention needed
The execution loop has been halted. Either:
- A step was marked unrecoverable after 2 fix attempts + healing
- 2 healing cycles were exhausted on the same original step
Print the blocked step number, the error type, and a summary of what was tried. The user must manually intervene (edit implement.md, fix the environment, or re-plan).
Validation Gates
| After Phase | Gate | Fail Action |
|---|
| Step execution (2a) | Step status is done in implement.md | Enter fix loop (2b-2d) |
| Fix attempt (2d) | Fix count < 2 for this step | If >=2: enter heal cycle (2d-heal) |
| Heal cycle (2d-heal) | Corrective steps added to implement.md | If unrecoverable: STOP with blocked step report |
| All steps complete | No pending or in-progress steps remain | Proceed to build phase |
Examples
Execute full plan
/dx-step-all 2416553
Reads implement.md from .ai/specs/2416553-add-layout-switcher/, finds 6 pending steps, and runs each through dx-step (which handles implement + test + review + commit internally). Prints progress after each step.
Resume after partial completion
/dx-step-all 2416553
If steps 1-3 are already done, picks up at step 4 and continues. Only pending steps are executed.
Self-healing on failure
When step 3 fails twice, triggers dx-step-fix in heal mode which analyzes the error, creates a corrective step 3a, executes it, then retries step 3. If healing also fails after 2 cycles, stops and reports.
Troubleshooting
"No pending steps" but plan isn't done
Cause: Steps are marked blocked or in-progress from a previous interrupted run.
Fix: Open implement.md and reset the stuck step's status to pending, then re-run.
Build fails repeatedly on the same step
Cause: The plan step has incorrect instructions or missing context.
Fix: Let self-healing run (2 fix attempts + 2 heal cycles). If still failing, run /dx-step-fix manually to diagnose, or edit implement.md to fix the step instructions.
"Cannot execute on protected branch"
Cause: You're on development, main, or master instead of a feature branch.
Fix: The branch guard should auto-create a feature branch. If it fails, manually create one: git checkout -b feature/<id>-<slug>.
Decision Examples
Heal: Flaky Test
Failure: Test passes locally but fails in step execution (timing-dependent assertion)
Consecutive failures: 2
Decision: HEAL — test is flaky. Fix: add wait/retry in test assertion.
Escalate: Architecture Mismatch
Failure: Step creates a new utility, but existing utility covers the need
Consecutive failures: 2
Decision: ESCALATE — plan problem, not code problem. Report: "Step 4 violates reuse rule. Existing forms.js:validateField() covers this need."
Success Criteria
Rules
- Coordinator only — never implement code yourself. Always delegate via Skill tool.
- Sequential execution — steps must run in order. Never parallelize steps.
- 2-strike then heal — after 2 consecutive fix failures, invoke step-fix in heal mode before giving up.
- 2 healing cycles max — after 2 healing cycles on the same original step, stop for real.
- Progress reporting — print status after each step so the user can follow along.
- Re-read metadata between steps — run
plan-metadata.sh between steps to get the latest status (skills update implement.md). Especially important after step-fix creates new steps. Never read full implement.md in the orchestrator — workers handle that.
- Don't skip blocked steps — if a step is blocked and healing failed, stop. Steps have dependencies.
Platform Compatibility
This skill uses Skill() tool calls which work on both Claude Code and Copilot CLI.
Copilot CLI / VS Code Chat fallback: If subagent skill invocation fails, run the skills manually in sequence for each plan step:
/dx-step <step-number> — execute a single step (implement + test + review + commit)
- On failure:
/dx-step-fix <spec-dir> — attempt repair (up to 2 tries)
- On persistent failure:
/dx-step-fix <spec-dir> --heal — diagnose and create corrective steps
- Repeat from step 1 for each pending step in
implement.md
Return
This skill runs in a forked context. It MUST end with a ## Return block per plugins/dx-core/shared/skill-return-contract.md.
Examples:
## Return
verdict: pass
summary: Executed 4 steps; 3 commits; 1 self-heal cycle on Step 3 (lint no-shadow).
artifacts:
- .ai/specs/2490722-microsite/dev-all-progress.md
- .ai/specs/2490722-microsite/run-state.json
next_action: continue to Phase 4 (build)
## Return
verdict: fail
summary: Step 2 blocked after 2 fix attempts — JS test pollution; manual intervention needed.
artifacts:
- .ai/specs/2490722-microsite/dev-all-progress.md
- .ai/specs/2490722-microsite/run-state.json
next_action: human fix Step 2; re-run /dx-step-all