| name | design |
| description | Use when designing any non-trivial feature, refactor, or architectural change — design, architecture, scope, approach validation. Sketch agents (4 regular, 2 quick) propose approaches; 4-reviewer panel validates via 3-voter dialectic. |
| argument-hint | [--auto] [--quick] [--subagent] [--session-env <path>] <feature description> |
| allowed-tools | AskUserQuestion, Bash, Read, Edit, Write, Grep, Glob, Agent, Task, WebFetch, WebSearch |
Design Skill
Design an implementation plan for a feature and review it with a 4-reviewer panel (2 Cursor: Arch, Edge + 2 Codex: Innovation, Pragmatic — a diagonal split matching the sketch phase), adjudicated by a 3-voter panel (Claude + Codex + Cursor). The sketch phase (Step 2a) runs 4 agents in regular mode (Cursor-Arch + Cursor-Edge + Codex-Innovation + Codex-Pragmatic — one personality per vendor in a diagonal split) or 2 agents in quick mode (1 Cursor-Generic + 1 Codex-Generic).
Flags: Parse flags from the start of $ARGUMENTS before treating the remainder as the feature description. Flags may appear in any order; stop at the first non-flag token. All boolean flags default to false. Only set a flag to true when its --flag token is explicitly present in the arguments. Flags are independent — the presence of one flag must not influence the default value of any other flag.
| Flag | Default | Purpose | Load-bearing detail |
|---|---|---|---|
| --auto | false | Skip interactive question checkpoints (1c, 1d, 3.5) | No-op when /implement --quick skips /design entirely; dirty-tree recovery prompts are not suppressed |
| --quick | false | Quick sketch mode: 2 agents instead of 4 | Independent of --auto; see flags.md for /implement --quick vs /design --quick distinction |
| --subagent | false | Run Step 2a heavy phase in an isolated Agent-tool subagent (heavy-worker.md); writes artifacts only to $DESIGN_TMPDIR/ and returns terse status; standalone (--session-env empty) parents replay artifacts before cleanup | No-op when --quick is set; orthogonal to --session-env |
| --session-env <path> | empty | Forward discovered session values to session-setup.sh | Empty = standalone invocation, full discovery |
| --step-prefix <prefix> | empty | Nested-numbering prefix from /implement | :: delimiter splits numeric prefix from breadcrumb path; "1." (bare numeric) is backward-compat |
| --branch-info <values> | — | Skip redundant branch-state check when called from /implement | 4 keys required: IS_MAIN/IS_USER_BRANCH/USER_PREFIX/CURRENT_BRANCH; fallback on validation failure to create-branch.sh --check; power-user / nested-call flag with no standalone value validation |
MANDATORY — READ ENTIRE FILE before parsing argument flags: Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/flags.md completely. This reference is the single normative source for flag semantics — validation rules, fallback behaviors, :: delimiter encoding spec, 4-key --branch-info requirement, and backward-compat notes. The table above is a non-normative index.
The feature to design is described by the remainder of $ARGUMENTS after flags are stripped.
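The leading-flag rule above can be sketched in bash as follows. This is a minimal illustration and not the normative parser (flags.md remains authoritative): the function name `parse_design_flags` is hypothetical, only the four flags from the argument hint are handled, and no validation is performed.

```bash
# Hypothetical helper: parse leading flags, then treat the rest as the feature description.
parse_design_flags() {
  # Every boolean flag defaults to false, independently of the others.
  auto_mode=false
  quick_mode=false
  subagent_mode=false
  SESSION_ENV_PATH=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --auto)     auto_mode=true; shift ;;
      --quick)    quick_mode=true; shift ;;
      --subagent) subagent_mode=true; shift ;;
      --session-env)
        # Value flag: consume the following token as the path.
        SESSION_ENV_PATH="${2:-}"
        shift
        [ "$#" -gt 0 ] && shift
        ;;
      *) break ;;   # the first non-flag token ends flag parsing
    esac
  done
  # Everything left over is the feature description, even if it contains "--auto".
  FEATURE_DESCRIPTION="$*"
}
```

For example, `parse_design_flags --quick --auto add user auth` leaves `quick_mode=true`, `auto_mode=true`, `subagent_mode=false`, and `FEATURE_DESCRIPTION="add user auth"`.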
Anti-halt continuation reminder. After every Bash tool call that completes a numbered step or sub-step, and after every visible output (plans, diagrams, voting tallies, skip breadcrumbs), IMMEDIATELY continue with this skill's NEXT numbered step — do NOT end the turn on a Bash result, a status message, or a deliverable-looking output, and do NOT write a summary, handoff, status recap, or "returning to parent" message — those are halts in disguise. This applies to ALL step boundaries from Step 0 through Step 5, and to ALL sub-step transitions (1c→1d→2a→2a.5→2b→3→3.5→3b→4→5). Critical: the implementation plan (Step 2b) and architecture diagram (Step 3b) are intermediate deliverables, NOT the end of the design — plan review (Step 3) and cleanup (Step 5) must still execute. The rule is strictly subordinate to any explicit non-sequential control-flow directive in THIS file (e.g., skip to Step N, bail to cleanup, jump back, proceed to Step N). A normal sequential proceed to Step N+1 instruction is the default continuation this rule reinforces, NOT an exception.
Progress Reporting
Every step MUST print clearly visible breadcrumb status lines so the user can instantly see where execution is and which parent steps they are inside. Follow the formatting rules in ${CLAUDE_PLUGIN_ROOT}/skills/shared/progress-reporting.md.
- Print a start line when entering a step: e.g., > **🔶 1: branch** (standalone) or > **🔶 1.1: design plan | branch** (nested from /implement)
- Print a completion line only when it carries informational payload. Only the final step (Step 5) prints an unconditional completion announcement.
- When STEP_NUM_PREFIX is non-empty, prepend it to step numbers: {STEP_NUM_PREFIX}{local_step}. When STEP_PATH_PREFIX is non-empty, prepend it to breadcrumb paths: {STEP_PATH_PREFIX} | {step_short_name}. This rule overrides the literal step numbers and names in Print: directives and examples throughout this file. Examples shown below assume standalone mode; when nested, prepend the parent context.
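The prefixing rule can be illustrated with a small bash sketch. The helper name `step_start_line` is hypothetical; the authoritative formatting rules live in progress-reporting.md, and this only reproduces the two start-line examples shown above.

```bash
# Hypothetical helper: build a step start line from the local step number and short name.
step_start_line() {
  local local_step="$1" short_name="$2"
  # Numeric prefix is concatenated directly: "1." + "1" -> "1.1".
  local num="${STEP_NUM_PREFIX:-}${local_step}"
  local path="$short_name"
  # Path prefix is joined with " | " only when it is non-empty.
  if [ -n "${STEP_PATH_PREFIX:-}" ]; then
    path="${STEP_PATH_PREFIX} | ${short_name}"
  fi
  printf '> **🔶 %s: %s**\n' "$num" "$path"
}
```

With both prefixes empty, `step_start_line 1 branch` yields the standalone form; with `STEP_NUM_PREFIX="1."` and `STEP_PATH_PREFIX="design plan"` it yields the nested form.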
Step Name Registry:
| Step | Short Name |
|---|---|
| 0 | setup |
| 1 | branch |
| 1c | questions |
| 1d | discussion r1 |
| 2a | sketches |
| 2a.5 | dialectic |
| 2b | full plan |
| 3 | plan review |
| 3.5 | discussion r2 |
| 3b | arch diagram |
| 4 | rejected findings |
| 5 | cleanup |
Verbosity Control
- Use an empty string for the description parameter on all Bash tool calls.
- Use terse 3-5 word descriptions for Agent tool calls.
- Do not produce explanatory prose between tool call outputs — only print: step breadcrumb lines (start 🔶, completion ✅, skip ⏩), final completion line (Step 5), all warning/error lines (**⚠ ...**), structured summaries (voting tallies, scoreboards, round summaries, findings lists, approach synthesis, dialectic resolutions, implementation plans, architecture diagrams), and the compact reviewer status table (see below).
Suppressed output: explanatory prose, script paths, rationale for decisions between tool calls, per-reviewer individual completion messages.
When SESSION_ENV_PATH is non-empty (nested under /implement), suppress bulky inline artifact bodies for the implementation plan, voting tally, architecture diagram, rejected findings, and discussion syntheses; print one-line saved breadcrumbs instead and rely on the files under $DESIGN_TMPDIR/ plus the Step 5 design manifest. When SESSION_ENV_PATH is empty (standalone /design), preserve the existing verbose inline output and skip manifest export entirely.
Compact reviewer status table: After launching sketch agents (Step 2a) or plan reviewers (Step 3), maintain a mental tracker of each agent's status. Print a compact table after EACH status change:
📊 Sketches (regular): | Cursor-Arch: ⏳ | Cursor-Edge: ✅ 3m5s | Codex-Innovation: ❌ 8m3s | Codex-Pragmatic: ✅ 4m2s |
📊 Sketches (quick): | Cursor-Generic: ⏳ | Codex-Generic: ✅ 3m5s |
or for Step 3 plan review (4-reviewer panel):
📊 Reviewers: | Cursor-Arch: ✅ 4m12s | Cursor-Edge: ⏳ | Codex-Innovation: ⏳ | Codex-Pragmatic: ✅ 2m31s |
Icons: ✅ done (with elapsed time since launch), ⏳ pending/in-progress, ❌ failed/timeout (with elapsed time since launch), ⊘ skipped (unavailable). This replaces individual per-agent completion messages. See ${CLAUDE_PLUGIN_ROOT}/skills/shared/progress-reporting.md for elapsed time and step start formatting rules.
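As a formatting illustration only, the compact table layout can be reproduced with a short bash sketch. The helper `status_table` and its `name=status` argument convention are hypothetical; the tracker itself is maintained by the orchestrator, and elapsed-time formatting is defined in progress-reporting.md.

```bash
# Hypothetical helper: render one compact status line.
# Usage: status_table <label> <name>=<status text> ...
status_table() {
  local label="$1"
  shift
  local line="📊 ${label}: |"
  local entry
  for entry in "$@"; do
    # Split each argument at the first "=" into agent name and status text.
    line="${line} ${entry%%=*}: ${entry#*=} |"
  done
  printf '%s\n' "$line"
}
```

Calling `status_table Reviewers "Cursor-Arch=✅ 4m12s" "Cursor-Edge=⏳"` prints one line in the same shape as the examples above.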
Limitation: Verbosity suppression is prompt-enforced and best-effort.
Design Mindset
Before invoking /design, the orchestrator should internalize these questions. They bias every subsequent choice — sketch synthesis, plan drafting, review-finding acceptance — and are the thinking pattern this skill transfers along with its mechanical procedures.
- What is the smallest change that achieves the goal? Resist adding abstractions, flags, or layers the feature description did not ask for. Every additional moving part is a new failure mode.
- Where is anchoring risk highest? The first plausible approach locks architectural direction unless the sketch phase forces alternatives. Do NOT skip Step 2a (anti-pattern rule #1).
- What hidden constraints must this preserve? Canonical sources, CI invariants, downstream parsers, contract tokens, byte-preserved reference files. Identify them before edits, not during plan review.
- Which tradeoffs should surface to the user versus be quietly chosen? Scope and hard-constraint decisions surface via Round 1 discussion; architectural preferences belong to the sketch phase — not to the user.
- Which anti-patterns in the NEVER list below apply to this specific feature? Re-read the Anti-patterns section for every non-trivial feature; muscle memory for the six rules is the expert delta this skill aims to transfer.
Anti-patterns
Consolidated NEVER rules collected from the procedural steps below. Each rule states the WHY so edits can respect the original constraint. Inline step-local mentions remain where they carry load-bearing context.
- NEVER skip Step 2a (the sketch phase). Why: anchoring bias locks architectural direction before alternatives are considered. How to apply: always run all 4 sketch slots in regular mode or all 2 in quick mode, even when the feature seems trivial; Claude fallbacks preserve the configured lane count when externals are unavailable.
- NEVER substitute a Claude subagent into a dialectic debate bucket. Why: the debate path is externals-only (Cursor/Codex) because model-specific writing style could encode tool identity into adversarial arguments; the judge path uses the repo-wide replacement-first pattern because judges merely adjudicate pre-authored defenses. See GitHub issue #98. How to apply: Step 2a.5 skips debate buckets whose assigned tool is unavailable — do NOT reassign to Claude. Judge-panel slots (after debate) DO use Claude replacements per dialectic-protocol.md.
- NEVER mutate orchestrator-wide codex_available / cursor_available inside Step 2a.5. Why: Step 3 plan-review panel integrity depends on the Option B snapshot pattern — a debate-phase timeout must not lock a tool out of later plan review. How to apply: use the dialectic_*_available shadow flags inside Step 2a.5 and the judge_*_available shadow flags inside the judge re-probe; never touch the top-level flags.
- NEVER pass --caller-env or --write-health to session-setup.sh when SESSION_ENV_PATH is empty. Why: standalone /design invocations have no parent /implement to consume the session-env or health artifacts. How to apply: branch on SESSION_ENV_PATH non-empty in Step 0; omit both flags when standalone.
- NEVER call collect-agent-results.sh with zero positional arguments. Why: it exits 1 with "at least one output file is required". This is the zero-externals failure mode when every external slot has fallen back to a Claude subagent. How to apply: guard each collector call with an explicit check that at least one external slot was launched; the dialectic zero-externals guardrail (Step 2a.5 step 5) and the Step 3 collector both require this.
- NEVER conflate the two timeout families. Why: sketch-phase timeouts (sketches are shorter) differ from plan-review + dialectic timeouts (longer, deeper reasoning). How to apply: use timeout: 1260000 (Bash tool) / --timeout 1260 (collector) / --timeout 1200 (reviewer script) for sketch-phase launches and sketch collection; use timeout: 1860000 / --timeout 1860 / --timeout 1800 for plan-review launches, dialectic debaters, and dialectic judges.
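The two timeout families from the last rule can be kept apart with a lookup like the one below. This is a sketch under assumptions: the helper `phase_timeouts` and its output key names are hypothetical, and the 60-second margin between reviewer and collector timeouts is read off the numbers above rather than stated anywhere as a rule.

```bash
# Hypothetical helper: print the three timeouts for a phase as KEY=VALUE lines.
phase_timeouts() {
  local reviewer_s
  case "$1" in
    sketch)                reviewer_s=1200 ;;   # sketch launches and sketch collection
    plan-review|dialectic) reviewer_s=1800 ;;   # plan review, debaters, judges
    *) echo "unknown phase: $1" >&2; return 1 ;;
  esac
  # Observed pattern: collector = reviewer + 60 s; Bash tool = collector in milliseconds.
  local collector_s=$((reviewer_s + 60))
  printf 'REVIEWER_TIMEOUT_S=%s\n' "$reviewer_s"
  printf 'COLLECTOR_TIMEOUT_S=%s\n' "$collector_s"
  printf 'BASH_TOOL_TIMEOUT_MS=%s\n' "$((collector_s * 1000))"
}
```

`phase_timeouts sketch` reproduces 1200 / 1260 / 1260000, and `phase_timeouts plan-review` reproduces 1800 / 1860 / 1860000.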
Step 0 — Session Setup
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 0 — session setup" || true
Define branch_info_supplied=true only when the caller passed valid --branch-info containing all 4 keys: IS_MAIN, IS_USER_BRANCH, USER_PREFIX, and CURRENT_BRANCH. SESSION_ENV_PATH being non-empty is not a nesting signal by itself; --session-env is an exposed argument and can be passed manually.
If branch_info_supplied=true (trusted caller-supplied branch state, normally from /implement), use the parsed --branch-info values for CURRENT_BRANCH, IS_MAIN, IS_USER_BRANCH, and USER_PREFIX. The four key values are accepted as-is and not cross-checked against the working tree (see the --branch-info "Sharp edge" note in ${CLAUDE_PLUGIN_ROOT}/skills/design/references/flags.md). /implement is presumed to have already run the entry gate, so session-entry-gate.sh below will emit SKIP_BRANCH_CHECK=true.
If branch_info_supplied=false (standalone, regardless of SESSION_ENV_PATH), check the current branch before setup:
${CLAUDE_PLUGIN_ROOT}/scripts/create-branch.sh --check
Parse CURRENT_BRANCH, IS_MAIN, IS_USER_BRANCH, and USER_PREFIX from stdout.
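One way to parse such output without `eval` is a small key lookup, sketched below. It assumes the script prints one KEY=VALUE pair per line (inferred from the GATE_ERROR=... and PREFLIGHT_ERROR=... lines mentioned in this file, not confirmed against the script); the helper name `kv_get` is hypothetical.

```bash
# Hypothetical helper: print the value of KEY from KEY=VALUE lines on stdin.
# Returns non-zero when the key is absent, so callers can fail closed.
kv_get() {
  local key="$1" line
  while IFS= read -r line; do
    case "$line" in
      "$key="*)
        # Strip the literal "KEY=" prefix and keep the rest verbatim.
        printf '%s\n' "${line#"$key="}"
        return 0
        ;;
    esac
  done
  return 1
}
```

A caller would capture the script's stdout once and then read each fact from it, for example `IS_MAIN=$(printf '%s\n' "$branch_out" | kv_get IS_MAIN)`, where `branch_out` is an illustrative variable name.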
Run the shared entry gate helper using the parsed branch facts. Its contract lives at ${CLAUDE_PLUGIN_ROOT}/scripts/session-entry-gate.md.
${CLAUDE_PLUGIN_ROOT}/scripts/session-entry-gate.sh \
--mode design \
--current-branch "$CURRENT_BRANCH" \
--is-main "$IS_MAIN" \
--is-user-branch "$IS_USER_BRANCH" \
--user-prefix "$USER_PREFIX" \
--branch-info-supplied "$branch_info_supplied"
Parse ENTRY_GATE and SKIP_BRANCH_CHECK from this script's stdout in isolation. Do not concatenate it with create-branch.sh --check output for a single eval. On non-zero exit, print the raw GATE_ERROR=... line first, then print the normalized internal-contract message and abort:
⚠ /design: internal Step 0 contract violation in session-entry-gate.sh. Aborting.
Do NOT print the clean-main banner for GATE_ERROR; that banner is reserved for session-setup.sh PREFLIGHT_ERROR.
If SKIP_BRANCH_CHECK=true, run setup with --skip-branch-check:
${CLAUDE_PLUGIN_ROOT}/scripts/session-setup.sh --prefix claude-design --skip-branch-check --skip-repo-check --check-reviewers [--caller-env "$SESSION_ENV_PATH"] [--skip-codex-probe] [--skip-cursor-probe] [--write-health "${SESSION_ENV_PATH}.health"]
If SKIP_BRANCH_CHECK=false, run setup without --skip-branch-check; preflight.sh runs in default mode and enforces clean main plus fetch/rebase before design work begins:
${CLAUDE_PLUGIN_ROOT}/scripts/session-setup.sh --prefix claude-design --skip-repo-check --check-reviewers [--caller-env "$SESSION_ENV_PATH"] [--skip-codex-probe] [--skip-cursor-probe] [--write-health "${SESSION_ENV_PATH}.health"]
Only include --caller-env "$SESSION_ENV_PATH" and --write-health "${SESSION_ENV_PATH}.health" if SESSION_ENV_PATH is non-empty. This Anti-pattern #4 predicate is orthogonal to branch_info_supplied: session-env controls parent health I/O; branch-info controls whether /design trusts /implement's already-gated branch state. If SESSION_ENV_PATH provides CODEX_HEALTHY=false or CURSOR_HEALTHY=false, the script auto-sets the corresponding --skip-codex-probe / --skip-cursor-probe flag — you do not need to pass these explicitly when using --caller-env.
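The conditional argument list described above can be assembled with a bash array, as in this sketch. The helper `session_setup_args` is hypothetical; it only prints the arguments and does not invoke session-setup.sh. The probe-skip flags are left out on the assumption, stated above, that the script derives them itself when --caller-env is passed.

```bash
# Hypothetical helper: print session-setup.sh arguments, one per line.
# Arguments: SKIP_BRANCH_CHECK value, SESSION_ENV_PATH value (may be empty).
session_setup_args() {
  local skip_branch_check="$1" session_env_path="$2"
  local args=(--prefix claude-design)
  if [ "$skip_branch_check" = "true" ]; then
    args+=(--skip-branch-check)
  fi
  args+=(--skip-repo-check --check-reviewers)
  # Anti-pattern #4: only a non-empty SESSION_ENV_PATH adds the env and health flags.
  if [ -n "$session_env_path" ]; then
    args+=(--caller-env "$session_env_path" --write-health "${session_env_path}.health")
  fi
  printf '%s\n' "${args[@]}"
}
```

Keeping the two predicates in separate `if` blocks mirrors the orthogonality noted above: the branch-check decision never influences whether the session-env flags are passed.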
If the script exits non-zero, always print the raw PREFLIGHT_ERROR=... line first. Then print the normalized skill-level message and abort:
⚠ /design requires clean main to start. To continue, choose one of: (a) git checkout main && git status clean → re-run; (b) check out or create a <USER_PREFIX>/* feature branch and re-run (the branch naming convention is the explicit opt-in to continue from current state); (c) commit or stash uncommitted changes on main first.
Parse the output for SESSION_TMPDIR, CODEX_AVAILABLE, CURSOR_AVAILABLE, CODEX_HEALTHY, CURSOR_HEALTHY. Set DESIGN_TMPDIR = SESSION_TMPDIR. Substitute the actual path in every command below.
Set mental flags codex_available and cursor_available based on the output:
- If CODEX_AVAILABLE=false: codex_available=false. Print: **⚠ Codex not available (binary not found). Proceeding without Codex reviewer.**
- Else if CODEX_HEALTHY=false: codex_available=false. Print: **⚠ Codex installed but not responding (health check failed). Using Claude replacement.**
- Else: codex_available=true
- Same logic for Cursor.
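The decision order above (binary check first, then health check) can be written as a tiny bash function. The helper `tool_available` is hypothetical, and substituting the tool name into the warning text for Cursor is an assumption based on "Same logic for Cursor".

```bash
# Hypothetical helper: derive a tool's availability flag from the setup output.
# Arguments: display name, *_AVAILABLE value, *_HEALTHY value.
# Prints true/false on stdout and the matching warning on stderr.
tool_available() {
  local name="$1" available="$2" healthy="$3"
  if [ "$available" = "false" ]; then
    echo "**⚠ ${name} not available (binary not found). Proceeding without ${name} reviewer.**" >&2
    echo false
  elif [ "$healthy" = "false" ]; then
    echo "**⚠ ${name} installed but not responding (health check failed). Using Claude replacement.**" >&2
    echo false
  else
    echo true
  fi
}
```

Usage would look like `codex_available=$(tool_available Codex "$CODEX_AVAILABLE" "$CODEX_HEALTHY")`, with the warning still reaching the terminal through stderr.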
The --write-health flag writes the health status file for cross-skill propagation. It will be updated by collect-agent-results.sh --write-health during runtime if any reviewer times out.
Step 1 — Create Branch
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 1 — branch" || true
1a — Check current branch state
If branch_info_supplied=true (via --branch-info): Use the values parsed from the flag (CURRENT_BRANCH, IS_MAIN, IS_USER_BRANCH, USER_PREFIX). Skip the create-branch.sh --check call.
Otherwise (standalone invocation or validation failed): Use the values parsed from Step 0's standalone create-branch.sh --check call. If Step 0 did not capture those values for any reason, run the create-branch.sh script in check mode before proceeding:
${CLAUDE_PLUGIN_ROOT}/scripts/create-branch.sh --check
Parse the output for CURRENT_BRANCH, IS_MAIN, IS_USER_BRANCH, and USER_PREFIX.
1b — Decide action
Decision logic (using the script output):
- If IS_MAIN=true: Derive a short kebab-case branch name from the feature description (e.g., "add user auth" → <USER_PREFIX>/add-user-auth). Keep it under 50 characters. Then create it:
  ${CLAUDE_PLUGIN_ROOT}/scripts/create-branch.sh --branch <USER_PREFIX>/<branch-name>
- If IS_USER_BRANCH=true: Verify the branch name (CURRENT_BRANCH) aligns with the requested feature. If it appears unrelated (different feature name, unrelated commits), print a warning: **⚠ Current branch '<branch-name>' may not match the requested feature. Creating a new branch from main.** and create a new branch as above. Otherwise, skip branch creation. Print: > **🔶 1: branch — using existing: <branch-name>**
- Otherwise (non-main, non-user branch): Print a warning: **⚠ Currently on branch '<branch-name>' which doesn't match the expected '<USER_PREFIX>/*' pattern. Creating a new branch from main.** Then derive a name and create as above.
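A mechanical version of the name derivation might look like the sketch below. In practice the orchestrator picks a short, meaningful name from the description; this helper (`derive_branch_name`, hypothetical) only shows the kebab-case and length constraints, and it assumes the 50-character budget applies to the part after the prefix.

```bash
# Hypothetical helper: derive <USER_PREFIX>/<kebab-case-name> from a feature description.
derive_branch_name() {
  local user_prefix="$1" description="$2" slug
  # Lowercase, collapse every run of non-alphanumerics to one hyphen, trim edge hyphens.
  slug=$(printf '%s' "$description" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')
  # Assumption: "under 50 characters" means at most 49 for the slug itself.
  slug=$(printf '%s' "$slug" | cut -c1-49 | sed -e 's/-$//')
  printf '%s/%s\n' "$user_prefix" "$slug"
}
```

The result would then be passed to create-branch.sh via its --branch option as shown in the decision list above.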
Step 1c — Clarifying Questions
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 1c — questions" || true
Print: > **🔶 1c: questions**
If auto_mode=true: Print ⏩ 1c: questions — skipped (auto mode) (<elapsed>) and proceed to Step 1d.
If auto_mode=false: MANDATORY — READ ENTIRE FILE: Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/discussion-rounds.md completely. Execute the Step 1c body in that file. Do NOT load discussion-rounds.md when auto_mode=true — the short-circuit above exits first.
Step 1d — Design Discussion (Round 1)
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 1d — discussion r1" || true
Print: > **🔶 1d: discussion r1**
If auto_mode=true: Print ⏩ 1d: discussion r1 — skipped (auto mode) (<elapsed>) and proceed to Step 2a.
If auto_mode=false: Execute the Step 1d body in ${CLAUDE_PLUGIN_ROOT}/skills/design/references/discussion-rounds.md. If already loaded at Step 1c, no need to re-load; otherwise MANDATORY — READ ENTIRE FILE: Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/discussion-rounds.md completely.
Step 2a — Collaborative Approach Sketches
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 2a — sketches" || true
IMPORTANT: The collaborative sketch phase MUST ALWAYS run with all configured sketch agents — 4 in regular mode, 2 in quick mode (using Claude replacements when external tools are unavailable). Never skip or abbreviate this phase regardless of how simple, obvious, or documentation-only the feature appears. The sketch synthesis is required architectural input for the implementation plan — skipping it causes anchoring bias where a single perspective locks in the direction before alternatives are considered.
This is a diverge-then-converge phase: multiple agents independently produce short architectural sketches before the full plan is written. This surfaces different perspectives early, when they can still influence architectural direction, rather than waiting for review when the plan is already anchored.
Regular mode (quick_mode=false) — 4 sketch agents
The 4 sketch agents are 2 Cursor + 2 Codex, with per-slot Claude fallback when an external tool is unavailable:
- Cursor — Architecture/Standards — or Claude (Architecture/Standards) fallback.
- Cursor — Edge-cases/Failure-modes — or Claude (Edge-cases/Failure-modes) fallback.
- Codex — Innovation/Exploration — or Claude (Innovation/Exploration) fallback.
- Codex — Pragmatism/Safety — or Claude (Pragmatism/Safety) fallback.
When the assigned external is unavailable, the slot's Claude fallback uses the same personality prompt; the configured 4-agent shape is preserved.
Quick mode (quick_mode=true) — 2 sketch agents
- Cursor — Generic — or Claude (Generic) fallback: a broad-scope sketch without personality specialization.
- Codex — Generic — or Claude (Generic) fallback: same generic prompt as Cursor-Generic.
Heavy phase dispatch (regular and quick mode)
Print > **🔶 2a: sketches**.
Subagent heavy phase: If subagent_mode=true (i.e., --subagent was passed) AND quick_mode=false, invoke a single Agent-tool subagent (subagent_type: general-purpose) for the heavy non-interactive phase before entering 2a.2. The subagent MUST read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/heavy-worker.md, receive DESIGN_TMPDIR, IMPLEMENT_TMPDIR, SESSION_ENV_PATH, FEATURE_DESCRIPTION, quick_mode, auto_mode, branch info, and reviewer health flags as explicit data, and write raw artifacts to $DESIGN_TMPDIR/. The subagent returns only DESIGN_HEAVY=complete or DESIGN_HEAVY=failed REASON=<short-token>; it does not write the manifest and does not return plan/reviewer/tally prose.
Immediately after the Agent tool returns, parse the heavy-worker status line. Before following the success path, fail closed if the worker omitted a valid status line or returned success without the required artifacts. The gate has two tiers, drawn from two distinct normative sources: Tier 1 (non-empty) for substantive artifacts the heavy-worker.md "Artifact Contract" mandates as non-empty regardless of manifest export; Tier 2 (must-exist) for may-be-empty artifacts that skills/design/scripts/write-design-manifest.sh requires on disk for manifest export (copy_required_may_be_empty calls in that script). On the nested+auto_mode=true path, parent Step 4 (which creates missing empty files) is skipped, so Tier 2 is the load-bearing existence check before manifest export at Step 5; Tier 1 is the heavy-worker contract check independent of manifest export.

    # Fail closed: a missing or unrecognized status line counts as a failure.
    if [[ "${DESIGN_HEAVY:-}" != "complete" && "${DESIGN_HEAVY:-}" != "failed" ]]; then
      DESIGN_HEAVY=failed
      REASON=worker-yielded-without-artifacts
    # A reported success must be backed by the required artifacts on disk.
    elif [[ "${DESIGN_HEAVY:-}" == "complete" ]] && {
      # Tier 1: substantive artifacts must exist and be non-empty (-s).
      [[ ! -s "$DESIGN_TMPDIR/plan.txt" ]] ||
      [[ ! -s "$DESIGN_TMPDIR/approach-synthesis.txt" ]] ||
      [[ ! -s "$DESIGN_TMPDIR/voting-tally.md" ]] ||
      # Tier 2: may-be-empty artifacts must at least exist (-f).
      [[ ! -f "$DESIGN_TMPDIR/contested-decisions.md" ]] ||
      [[ ! -f "$DESIGN_TMPDIR/oos.md" ]] ||
      [[ ! -f "$DESIGN_TMPDIR/rejected-findings.md" ]] ||
      [[ ! -f "$DESIGN_TMPDIR/accepted-plan-findings.md" ]]
    }; then
      DESIGN_HEAVY=failed
      REASON=worker-yielded-without-artifacts
    fi

Tier 1 (non-empty -s checks) pins the substantive artifacts mandated as non-empty by heavy-worker.md "Artifact Contract"; this tier is independent of manifest export and includes approach-synthesis.txt, which write-design-manifest.sh does not stage. Tier 2 (existence -f checks) pins may-be-empty manifest-required artifacts (contested-decisions.md, oos.md, rejected-findings.md, accepted-plan-findings.md) that write-design-manifest.sh stages via copy_required_may_be_empty. Two artifacts are intentionally NOT in the gate: dialectic-resolutions.md (heavy-worker.md "Artifact Contract" requires it as an empty file when dialectic does not run, but dialectic-protocol.md allows absence on the NO_CONTESTED_DECISIONS short-circuit and the zero-externals guardrail — adding -f would false-positive on those legitimate paths until the two normative sources are reconciled, which is out of scope for this gate) and architecture-diagram.md (optional; auto_mode=true only). A failure from this gate routes through the normal DESIGN_HEAVY=failed branch below.
On DESIGN_HEAVY=complete:
- If SESSION_ENV_PATH is non-empty (nested under /implement): if auto_mode=false proceed directly to Step 3.5; if auto_mode=true proceed directly to Step 5 because the worker ran Step 3b and Step 4. (Parent /implement reads the manifest written at Step 5.)
- If SESSION_ENV_PATH is empty (standalone /design --subagent — NEW capability): read and print the following artifacts, in this order:
  - $DESIGN_TMPDIR/plan.txt under ## Implementation Plan.
  - $DESIGN_TMPDIR/voting-tally.md under ## Voting Tally and Reviewer Competition Scoreboard.
  - $DESIGN_TMPDIR/accepted-plan-findings.md under ## Plan Review Findings (Voted In) (skip header if file is empty or missing).
  - $DESIGN_TMPDIR/oos.md under ## Out-of-Scope Observations (skip header if file is empty or missing).
  - When auto_mode=true AND $DESIGN_TMPDIR/architecture-diagram.md exists and is non-empty: $DESIGN_TMPDIR/architecture-diagram.md under ## Architecture Diagram with the mermaid fence. When auto_mode=true AND the file is missing/empty, print **⚠ Architecture diagram unavailable (rejected by sanitizer).** if the session's Warnings section contains a mermaid sanitizer rejected entry; otherwise print **⚠ Architecture diagram unavailable (Step 3b generation failed in subagent).**
  - When auto_mode=true, also read $DESIGN_TMPDIR/rejected-findings.md: if non-empty, print it under ## Unimplemented Plan Review Suggestions; if empty or missing, print ## Plan Review — All Suggestions Implemented (matches Step 4's standalone output).

  Then if auto_mode=false proceed to Step 3.5 (Discussion Round 2 still runs interactively against the displayed artifacts); if auto_mode=true proceed to Step 5 (cleanup). This replay matches the inline standalone output that today's empty-SESSION_ENV_PATH path produces, so the user sees the deliverables that cleanup-tmpdir.sh would otherwise delete.
On DESIGN_HEAVY=failed:
- If SESSION_ENV_PATH is non-empty (nested): print **⚠ 2a: sketches — heavy worker subagent failed: $REASON. No design manifest will be exported. Parent /implement will see MANIFEST_FAILED at Step 1 and re-invoke /design.**, proceed to Step 5 for cleanup/export checks (Step 5 sets MANIFEST_EXPORT_OK=false, skips cleanup-tmpdir.sh, preserves $DESIGN_TMPDIR), and do not run the inline heavy steps. Recovery: the parent /implement Step 1 reads the manifest after /design returns; on missing/failed manifest it sets STALL_TRACKING=true and bails to Step 18 cleanup. To retry transient subagent failures (network blip, model timeout), the operator re-runs the same /implement invocation — Step 0.5 sentinel idempotency reuses the already-created tracking issue, and /design runs fresh.
- If SESSION_ENV_PATH is empty (standalone): print **⚠ 2a: sketches — heavy worker subagent failed: $REASON. Preserving $DESIGN_TMPDIR for inspection.**, set the mental flag STANDALONE_HEAVY_FAILED=true, skip the inline heavy steps, and proceed to Step 5. Step 5 sees STANDALONE_HEAVY_FAILED=true and skips cleanup-tmpdir.sh, preserving $DESIGN_TMPDIR so the operator can inspect partial artifacts. (No parent /implement consumer; no manifest needed for standalone.)
If subagent_mode=false (or quick_mode=true), proceed to 2a.2 and run the inline flow below. (SESSION_ENV_PATH continues to govern nested I/O semantics — verbosity suppression, manifest export, OOS routing — orthogonally to dispatch mode.)
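The dispatch and routing rules of this section reduce to two small decisions, sketched here in bash. Both helper names are hypothetical, and the sketch covers only which step comes next; the artifact replay, warnings, and flags such as STANDALONE_HEAVY_FAILED described above are not modeled.

```bash
# Hypothetical helper: succeed when Step 2a should dispatch to the heavy-worker subagent.
# Arguments: subagent_mode, quick_mode.
use_heavy_subagent() {
  # --subagent is a no-op in quick mode, so both conditions must hold.
  [ "$1" = "true" ] && [ "$2" != "true" ]
}

# Hypothetical helper: print the next step after the heavy worker returns.
# Arguments: DESIGN_HEAVY status (after the fail-closed gate), auto_mode.
next_step_after_heavy() {
  local design_heavy="$1" auto_mode="$2"
  if [ "$design_heavy" = "complete" ] && [ "$auto_mode" = "false" ]; then
    echo "3.5"   # Discussion Round 2 still runs interactively
  else
    echo "5"     # complete with auto_mode=true, or any failure: go to Step 5
  fi
}
```

The next step is the same for nested and standalone runs; SESSION_ENV_PATH changes what is printed and exported along the way, not where control goes.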
2a.2 — Launch Sketches in Parallel
Regular mode: 4 sketch agents run in parallel: 2 Cursor slots (Architecture/Standards, Edge-cases/Failure-modes) + 2 Codex slots (Innovation/Exploration, Pragmatism/Safety), with per-slot Claude Agent-tool fallback when an external tool is unavailable so the 4-agent count is preserved.
Quick mode: 2 sketch agents run in parallel: 1 Cursor-Generic + 1 Codex-Generic, with per-slot Claude Agent-tool fallback so the 2-agent count is preserved.
MANDATORY — READ ENTIRE FILE (load FIRST): Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/sketch-prompts.md completely. It defines ARCH_PROMPT, EDGE_PROMPT, INNOVATION_PROMPT, PRAGMATIC_PROMPT, and GENERIC_PROMPT — the four personality-prompt bodies and the quick-mode generic prompt, substituted into the launch shell blocks via the corresponding <…> token names.
MANDATORY — READ ENTIRE FILE (load SECOND, after sketch-prompts.md): Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/sketch-launch.md completely. It contains the byte-preserved launch shell blocks for the 4 regular-mode external slots (2 Cursor + 2 Codex) and the 2 quick-mode slots (1 Cursor-Generic + 1 Codex-Generic), the spawn-order rule, the per-slot run_in_background: true / timeout: 1260000 requirements, and the per-slot Claude fallback notes.
Execute the launches per sketch-launch.md — all external and fallback launches issued in a single message, Cursor slots first, then Codex slots, then any Claude fallbacks.
2a.3 — Wait and Validate Sketches
Collect and validate external sketch outputs using the shared collection script. Pass the output paths for whichever external slots were actually launched (omit any slot where the tool was unavailable and a Claude subagent fallback is returning via Agent tool instead).
Regular mode (4 external output files when both tools available):
${CLAUDE_PLUGIN_ROOT}/scripts/collect-agent-results.sh --timeout 1260 \
"$DESIGN_TMPDIR/cursor-sketch-arch-output.txt" \
"$DESIGN_TMPDIR/cursor-sketch-edge-output.txt" \
"$DESIGN_TMPDIR/codex-sketch-innovation-output.txt" \
"$DESIGN_TMPDIR/codex-sketch-pragmatic-output.txt"
Quick mode (2 external output files when both tools available):
${CLAUDE_PLUGIN_ROOT}/scripts/collect-agent-results.sh --timeout 1260 \
"$DESIGN_TMPDIR/cursor-sketch-generic-output.txt" \
"$DESIGN_TMPDIR/codex-sketch-generic-output.txt"
Use timeout: 1260000 on the Bash tool call. Do NOT set run_in_background: true — this call must block. Only include output paths for slots that were actually launched as external reviewers — omit any slot whose tool was unavailable (its fallback comes back via the Agent tool).
Note: This is a separate collect-agent-results.sh call from the one in Step 3. Both are permitted because they operate on completely distinct output file sets (*-sketch-*-output.txt vs *-plan-output.txt).
Parse the structured output for each reviewer's STATUS, REVIEWER_FILE, and HEALTHY. For sketches, a valid output is non-empty and contains substantive architectural content (at least a paragraph). If a reviewer's STATUS is not OK, follow the Runtime Timeout Fallback procedure in ${CLAUDE_PLUGIN_ROOT}/skills/shared/external-reviewers.md (set *_available=false for all subsequent steps).
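The "only launched slots" rule, combined with the zero-argument guard from the Anti-patterns section, can be sketched as follows. The helper names and the COLLECTOR override variable are hypothetical (the override exists only so the sketch can be exercised without the real script), and the sketch assumes availability is decided per tool rather than per slot.

```bash
# Hypothetical helper: print the sketch output paths for launched external slots.
# Arguments: tmpdir, quick_mode, cursor_launched, codex_launched.
sketch_collector_paths() {
  local tmpdir="$1" quick_mode="$2" cursor_launched="$3" codex_launched="$4"
  local cursor_slots="arch edge" codex_slots="innovation pragmatic" slot
  if [ "$quick_mode" = "true" ]; then
    cursor_slots="generic"
    codex_slots="generic"
  fi
  if [ "$cursor_launched" = "true" ]; then
    for slot in $cursor_slots; do
      printf '%s/cursor-sketch-%s-output.txt\n' "$tmpdir" "$slot"
    done
  fi
  if [ "$codex_launched" = "true" ]; then
    for slot in $codex_slots; do
      printf '%s/codex-sketch-%s-output.txt\n' "$tmpdir" "$slot"
    done
  fi
}

# Hypothetical wrapper: never call the collector with zero positional arguments.
collect_sketches() {
  local paths=() p
  while IFS= read -r p; do
    [ -n "$p" ] && paths+=("$p")
  done < <(sketch_collector_paths "$@")
  if [ "${#paths[@]}" -eq 0 ]; then
    echo "SKIP: no external sketch slots launched"
    return 0
  fi
  "${COLLECTOR:-${CLAUDE_PLUGIN_ROOT}/scripts/collect-agent-results.sh}" --timeout 1260 "${paths[@]}"
}
```

When every slot has fallen back to a Claude subagent, the wrapper skips the collector entirely instead of triggering its "at least one output file is required" exit.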
After this collection boundary, consult any ${OUTPUT}.dirty-tree launcher sidecars for launched Cursor/Codex outputs, then run ${CLAUDE_PLUGIN_ROOT}/scripts/check-mid-run-dirty-tree.sh --mode checkpoint. If a sidecar or checkpoint reports STATUS=dirty or STATUS=unknown, write $DESIGN_TMPDIR/dirty-tree-detected.env with STATUS, STAGE=sketch-collection, and RECOVERY_REQUIRED=true, then fire the dirty-tree recovery AskUserQuestion regardless of auto_mode. Use a $DESIGN_TMPDIR/.dirty-tree-prompted-sketch-collection flag so one logical boundary prompts once.
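The marker file and the prompt-once flag described above can be sketched in bash like this. The helper `record_dirty_tree` and its PROMPT=yes/no output are hypothetical; the actual prompt is the AskUserQuestion call, and the status values are assumed to come from the sidecars or check-mid-run-dirty-tree.sh.

```bash
# Hypothetical helper: record a dirty-tree detection and report whether to prompt.
# Arguments: tmpdir, STATUS value, stage name (e.g. sketch-collection).
# Prints PROMPT=yes the first time a boundary is hit, PROMPT=no otherwise.
record_dirty_tree() {
  local tmpdir="$1" status="$2" stage="$3"
  case "$status" in
    dirty|unknown) ;;                   # both require recovery
    *) echo "PROMPT=no"; return 0 ;;    # anything else: nothing to record
  esac
  printf 'STATUS=%s\nSTAGE=%s\nRECOVERY_REQUIRED=true\n' "$status" "$stage" \
    > "$tmpdir/dirty-tree-detected.env"
  local flag="$tmpdir/.dirty-tree-prompted-$stage"
  if [ -e "$flag" ]; then
    echo "PROMPT=no"                    # this logical boundary already prompted
  else
    : > "$flag"
    echo "PROMPT=yes"
  fi
}
```

Because the flag file is keyed by stage, a second detection at the same boundary rewrites the env file but does not prompt again, while a different boundary would still prompt.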
2a.4 — Synthesis
Read all sketches (or their Claude fallbacks if an external tool was unavailable). Produce a synthesis that:
1. Identifies where the approaches agree (likely the majority)
2. Identifies where they diverge and makes a reasoned call on each contested point with justification
3. Notes which ideas from each sketch are being incorporated into the full plan
Regular mode only (personality-specific highlights — skip these in quick mode):
4. Highlights any Architecture/Standards concerns that should be addressed in the plan
5. Highlights any Pragmatism/Safety warnings about regression risk or unnecessary complexity
6. Surfaces any Edge-case/Failure-mode risks that should be addressed in the plan's Failure modes section
7. Notes any Innovation/Exploration alternatives worth preserving as options even when not chosen
Quick mode: attribute sketches by tool (Cursor-Generic vs Codex-Generic). Skip personality-specific highlight bullets 4-7 above. Use generic agreement/divergence analysis only.
8. Lists contested decisions as a structured markdown list in $DESIGN_TMPDIR/contested-decisions.md. Use this schema:
### DECISION_1: <short title>
- **Chosen**: <the synthesis choice>
- **Alternative**: <the strongest alternative>
- **Tension**: <why this is contested — which sketches diverged and why>
- **Impact**: High/Medium/Low
- **Affected files**: <comma-separated list of files/modules impacted by this decision>
List decisions in priority order: High impact first, then by degree of sketch disagreement (more agents on different sides = higher priority), then by order of appearance in the synthesis. If no sketches diverged (all agents agreed on all points), write exactly NO_CONTESTED_DECISIONS as the entire file content.
Write the synthesis to $DESIGN_TMPDIR/approach-synthesis.txt so it can be referenced by Step 2b. If SESSION_ENV_PATH is empty, also print it under an ## Approach Synthesis header. If SESSION_ENV_PATH is non-empty, print only ✅ 2a: sketches — synthesis saved (<elapsed>).
### 2a.5 — Dialectic Resolution of Contested Decisions
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 2a.5 — dialectic" || true
Print: > **🔶 2a.5: dialectic**
Read $DESIGN_TMPDIR/contested-decisions.md. If the file contains only NO_CONTESTED_DECISIONS (ignoring leading/trailing whitespace and newlines), print ⏩ 2a.5: dialectic — no contested decisions (<elapsed>) and IMMEDIATELY proceed to Step 2b — do NOT halt after the skip breadcrumb.
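The short-circuit test could look something like the sketch below. `has_no_contested_decisions` is a hypothetical helper name. Stripping every whitespace character is a simplification of "ignoring leading/trailing whitespace and newlines"; it is assumed to be acceptable because a real decision list contains other text and cannot reduce to the bare sentinel.

```bash
# Hypothetical helper: succeed only when the file holds nothing but the
# NO_CONTESTED_DECISIONS sentinel (surrounding whitespace is ignored).
# Usage: has_no_contested_decisions <path-to-contested-decisions.md>
has_no_contested_decisions() {
  # tr removes spaces, tabs, and newlines before the comparison.
  [ "$(tr -d '[:space:]' < "$1")" = "NO_CONTESTED_DECISIONS" ]
}
```

When the check succeeds, the step prints the skip breadcrumb and moves straight on to Step 2b.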
Intentional divergence from the repo-wide replacement-first fallback architecture (debate phase only). The debate phase (steps 1-9 below) deliberately diverges from the "Voter Composition" rule in ${CLAUDE_PLUGIN_ROOT}/skills/shared/voting-protocol.md and from the Cursor/Codex fallback rules in the "Step 3 — Plan Review" section below: when an assigned debater tool is unavailable, the bucket is skipped entirely — Claude subagents are NEVER substituted into the dialectic debate path. Likewise, the "Runtime Timeout Fallback" procedure in ${CLAUDE_PLUGIN_ROOT}/skills/shared/external-reviewers.md flips orchestrator-wide *_available for all subsequent session steps; in this phase, runtime failures affect ONLY this phase's bookkeeping and never mutate the orchestrator-wide flags. Do NOT "fix" this carve-out back to global-flip + Claude-replacement behavior for debaters — see GitHub issue #98 for the rationale.
This divergence applies only to debate execution, not to judge adjudication. The post-debate judge panel (see ${CLAUDE_PLUGIN_ROOT}/skills/shared/dialectic-protocol.md) uses the repo-wide replacement-first pattern: when Cursor or Codex is unavailable for judging, a Claude Code Reviewer subagent replaces that slot so the panel always remains at 3 judges. Judges merely adjudicate between pre-authored defenses; the "no Claude substitution" rule is specific to adversarial debate where model-specific writing style could encode tool identity.
Otherwise, read $DESIGN_TMPDIR/approach-synthesis.txt — this provides {SYNTHESIS_TEXT} for the prompt templates below. Then apply the following protocol:
1. Cap = min(5, |contested-decisions|) — select that many decisions from the file (they are already in priority order from Step 2a.4).
2. Initialize dialectic-scoped shadow flags at the top of this step:
   - dialectic_codex_available = codex_available (snapshot at entry)
   - dialectic_cursor_available = cursor_available (snapshot at entry)

   The orchestrator-wide codex_available / cursor_available flags are NEVER mutated during this step. This preserves Step 3's plan-review panel integrity by construction (Option B).
3. Deterministic per-decision bucket assignment (1-based indexing):
   - Decision 1, 3, 5 → Cursor bucket (uses dialectic_cursor_available).
   - Decision 2, 4 → Codex bucket (uses dialectic_codex_available).
   - Both thesis and antithesis for a single decision use the same tool (bucket homogeneity).
4. Per-bucket pre-launch availability check. For each selected decision, check the assigned tool's dialectic_*_available flag:
   - If false: print **⚠ <Tool> unavailable — dialectic skipped for bucket <N> decisions (indices: <comma-list>). Step 2a.4 synthesis decisions stand.**, skip that decision, and continue. Do NOT fall back to a Claude Agent-tool subagent. Do NOT reassign the decision to the surviving tool. Do NOT abort this step.
   - If true: queue both the thesis and antithesis launch for that decision.
5. Zero-externals guardrail. If after iterating all selected decisions, zero buckets are queued, print no further launches, do NOT call collect-agent-results.sh at all, skip the judge phase entirely. The dialectic-resolutions.md file IS still written — it contains only Disposition: bucket-skipped entries (one per selected decision) plus any Disposition: over-cap entries for decisions ranked outside the top-5 cap — so Step 2b and Step 3.5 parse a uniform schema regardless of dialectic outcome. On this path, follow the second Do NOT load variant below.
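The cap, bucket assignment, and availability check in steps 1 through 5 can be sketched as a small planning pass. The function names and the three-column output format are hypothetical and chosen only for illustration; the odd/even split, the cap of 5, and the `queued` / `bucket-skipped` / `over-cap` outcomes follow the steps above.

```bash
# Cap = min(5, number of contested decisions).
dialectic_cap() {
  if [ "$1" -lt 5 ]; then echo "$1"; else echo 5; fi
}

# 1-based index -> bucket: odd decisions go to Cursor, even ones to Codex.
bucket_for() {
  if [ $(( $1 % 2 )) -eq 1 ]; then echo cursor; else echo codex; fi
}

# Print one line per decision: "<index> <bucket> <state>".
# Usage: plan_dialectic <total> <dialectic_cursor_available> <dialectic_codex_available>
plan_dialectic() {
  total=$1 cursor_ok=$2 codex_ok=$3
  cap=$(dialectic_cap "$total")
  i=1
  while [ "$i" -le "$total" ]; do
    if [ "$i" -gt "$cap" ]; then
      # Ranked outside the cap: never debated, no bucket assigned.
      echo "$i - over-cap"
    else
      bucket=$(bucket_for "$i")
      if [ "$bucket" = cursor ]; then ok=$cursor_ok; else ok=$codex_ok; fi
      # An unavailable tool skips the decision; it is never reassigned.
      if [ "$ok" = true ]; then state=queued; else state=bucket-skipped; fi
      echo "$i $bucket $state"
    fi
    i=$((i + 1))
  done
}
```

If no output line ends in `queued`, the zero-externals guardrail applies: nothing is launched and only `bucket-skipped` and `over-cap` entries are written.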
MANDATORY — READ ENTIRE FILE before rendering debate prompts (step 6): Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/dialectic-execution.md completely. It contains the byte-preserved execution choreography: per-decision prompt rendering, parallel debater launch, collection, the eligibility gate (Dispositions), the debate quorum gate, the dialectic-local judge-panel re-probe, ballot construction, judge launch, tally, and the Write dialectic-resolutions.md sub-step. The first directive inside that file is a nested MANDATORY pointing to references/dialectic-debate.md — the template-body file that holds the Thesis/Antithesis prompt substitution placeholders ({FEATURE_DESCRIPTION}, {SYNTHESIS_TEXT}, {DECISION_BLOCK}, {CHOSEN}, {ALTERNATIVE}, {TENSION}, {AFFECTED_FILES} plus the <debater_synthesis> / <debater_decision> reference-block wrappers).
Do NOT load dialectic-execution.md when contested-decisions.md contains only NO_CONTESTED_DECISIONS — the short-circuit print at the top of Step 2a.5 exits before reaching this point, so the reference file is naturally never loaded on the no-contest path.
Do NOT load dialectic-execution.md when the zero-externals guardrail fired (zero buckets queued in step 5 above) — instead, jump directly to the final sub-step of dialectic-execution.md conceptually (emit only bucket-skipped / over-cap entries into dialectic-resolutions.md) without loading the full execution procedure. The dialectic-resolutions schema for these entries is documented in the Write $DESIGN_TMPDIR/dialectic-resolutions.md section of dialectic-execution.md; if the orchestrator already has the schema in context from a prior run, skip the load entirely. Otherwise, a one-time load of dialectic-execution.md is acceptable but the debate-execution mechanics inside it MUST NOT fire (no debaters, no judges, no ballot).
Execute steps 6 through the final ✅ 2a.5: dialectic — … print directive as documented in ${CLAUDE_PLUGIN_ROOT}/skills/design/references/dialectic-execution.md (loaded via the MANDATORY directive above). That file is the single normative source for dialectic-execution mechanics. The final Write $DESIGN_TMPDIR/dialectic-resolutions.md sub-step (including the per-disposition field rules) lives inside that reference; print the ## Dialectic Resolutions header at the end and the ✅ 2a.5: dialectic — <V> voted, <F> fallback, <S> bucket-skipped, <O> over-cap (<elapsed>) print directive (omit a count if zero).
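The "omit a count if zero" rule for the closing breadcrumb can be illustrated with a short formatter. `dialectic_summary` is a hypothetical helper, and the elapsed value is passed in as an opaque string because its format is not specified here. The sketch assumes at least one count is non-zero, which holds whenever a decision was selected.

```bash
# Hypothetical helper: build the final 2a.5 breadcrumb, dropping zero counts.
# Usage: dialectic_summary <voted> <fallback> <bucket_skipped> <over_cap> <elapsed>
dialectic_summary() {
  elapsed=$5
  parts=""
  for pair in "$1 voted" "$2 fallback" "$3 bucket-skipped" "$4 over-cap"; do
    count=${pair%% *}            # text before the first space
    if [ "$count" -ne 0 ]; then
      parts="${parts:+$parts, }$pair"   # comma-join the non-zero entries
    fi
  done
  printf '✅ 2a.5: dialectic — %s (%s)\n' "$parts" "$elapsed"
}
```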
After each dialectic collection boundary (debate results and judge results), follow the dirty-tree probe contract in references/heavy-worker.digest.md: consult launcher sidecars, run check-mid-run-dirty-tree.sh --mode checkpoint, and ask for recovery on dirty/unknown regardless of auto_mode, deduped by $DESIGN_TMPDIR/.dirty-tree-prompted-<boundary>.
## Step 2b — Design the Implementation Plan
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 2b — plan" || true
Before writing any code, create a concrete implementation plan. Research the codebase (read relevant files, grep for patterns, understand existing architecture). See CLAUDE.md for project-specific development references and conventions.
Read $DESIGN_TMPDIR/approach-synthesis.txt from Step 2a and incorporate the synthesis into the plan. The synthesis should inform architectural decisions, file selection, and tradeoff resolutions.
Also read $DESIGN_TMPDIR/discussion-round1.md if it exists and is non-empty. Incorporate the scope boundaries and hard constraints established during the design discussion into the plan — these define what is in-scope, what must not break, and what the user explicitly does not want.
Also read $DESIGN_TMPDIR/dialectic-resolutions.md if it exists and is non-empty. Parse the structured fields defined in ${CLAUDE_PLUGIN_ROOT}/skills/shared/dialectic-protocol.md (Resolution, Disposition, Vote tally, Thesis summary, Antithesis summary, Why field). Branch on Disposition:
- Disposition: voted: the plan must follow the Resolution direction and explicitly note how the antithesis concern (from Antithesis summary) was addressed, referencing the Why thesis prevails / Why antithesis prevails justification. These resolutions are binding for Step 2b — do not override them.
- Disposition: fallback-to-synthesis: the synthesis decision stands (Resolution is the synthesis choice = CHOSEN). Note the Why fallback reason briefly (judge panel tie, quorum failure, etc.) but do NOT fabricate antithesis-engagement prose — no antithesis was heard with sufficient rigor to engage.
- Disposition: bucket-skipped: the synthesis decision stands. Note that debate was skipped (Why skipped reason) but do NOT fabricate antithesis-engagement prose — no debate occurred.
- Disposition: over-cap: the synthesis decision stands. Note that this decision was outside the dialectic cap (Why over-cap reason) but do NOT fabricate antithesis-engagement prose.
(Note: Step 3 plan review may subsequently revise the plan based on accepted review findings, which supersede dialectic resolutions.)
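The four-way branch on Disposition reduces to two behaviors, which a `case` statement makes explicit. The function name and the output labels below are hypothetical and exist only for this sketch; the four Disposition values are the ones listed above.

```bash
# Hypothetical helper: map a Disposition value to how Step 2b treats it.
# Usage: disposition_policy <disposition>
disposition_policy() {
  case "$1" in
    voted)
      # Binding resolution; the plan must engage the antithesis concern.
      echo "follow-resolution engage-antithesis" ;;
    fallback-to-synthesis|bucket-skipped|over-cap)
      # Synthesis choice stands; note the reason, write no antithesis prose.
      echo "keep-synthesis no-antithesis-prose" ;;
    *)
      echo "unknown disposition: $1" >&2
      return 1 ;;
  esac
}
```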
Produce a plan that includes:
- Files to modify/create: List each file with a brief description of what changes.
- Approach: Describe the implementation strategy, key decisions, and any trade-offs.
- Edge cases: Note important input/boundary conditions and how they'll be handled.
- Failure modes (for non-trivial changes): The 3 most likely architectural/systemic failure paths, earliest warning signals, and simplest mitigations. May be omitted for purely cosmetic or documentation-only changes.
- Testing strategy: What tests will be added or modified.
Write the plan to $DESIGN_TMPDIR/plan.txt with basename exactly plan.txt. If SESSION_ENV_PATH is empty, print the plan to the user under a ## Implementation Plan header so reviewers can see it. If SESSION_ENV_PATH is non-empty, print only ✅ 2b: full plan — saved (<elapsed>); /implement reads the exported plan file. The plan is an intermediate deliverable — IMMEDIATELY continue to Step 3 (Plan Review) after saving/printing. Do NOT halt, summarize, or treat the plan as the end of the design.
Continue to Step 3 IMMEDIATELY. The implementation plan is an intermediate design artifact — plan review, optional discussion, diagram generation, rejected-findings reporting, and cleanup still must run. See ${CLAUDE_PLUGIN_ROOT}/skills/shared/subskill-invocation.md section Step-boundary anti-halt.
## Step 3 — Plan Review
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 3 — plan review" || true
LARCH_TOKEN_SESSION_ID=$("${CLAUDE_PLUGIN_ROOT}/scripts/read-session-env-key.sh" --file "$SESSION_ENV_PATH" --key LARCH_TOKEN_SESSION_ID --default "")
LARCH_CLAUDE_SOURCE_FILE=$("${CLAUDE_PLUGIN_ROOT}/scripts/read-session-env-key.sh" --file "$SESSION_ENV_PATH" --key LARCH_CLAUDE_SOURCE_FILE --default "")
export LARCH_TOKEN_SESSION_ID LARCH_CLAUDE_SOURCE_FILE
"${CLAUDE_PLUGIN_ROOT}/scripts/token-ledger.sh" mark "Step 1 — design Step 3 plan review" || true
IMPORTANT: Plan review MUST ALWAYS run with all 4 reviewers (2 Cursor: Arch, Edge + 2 Codex: Innovation, Pragmatic). Never skip or abbreviate this step regardless of how straightforward the plan appears — even when all sketch agents agreed, the plan is short, or the change seems trivial. Reviewers validate against the actual codebase state, catching issues that sketch-phase reasoning alone cannot detect. When Cursor is unavailable, each Cursor archetype slot falls back to Codex; when Codex is unavailable, each Codex archetype slot falls back to Cursor; when both are unavailable, each falls back to a Claude subagent.
MANDATORY — READ ENTIRE FILE before launching reviewers: Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/plan-review.md completely. The reference is the normative source for the reviewer-prompt content and post-launch procedures: the byte-preserved Competition notice blockquote (appended to EACH reviewer prompt), the voter-1 / voter-2 / voter-3 detailed quoted prompts, the ballot file handling paragraph, the Collecting External Reviewer Results procedure (4 reviewers: 2 Cursor archetypes (Arch, Edge) + 2 Codex archetypes (Innovation, Pragmatic), all external), the Voting Panel launch-order + threshold + Competition scoring rules, the Finalize Plan Review 4-step procedure plus OOS artifact write rule, the Track Rejected Plan Review Findings rule, and the accepted FINDING_N template, accepted oos-accepted-design.md format, and rejected-findings template. Step 3 control flow that remains inline in SKILL.md below (not in plan-review.md): the 4-reviewer "MUST ALWAYS run" IMPORTANT banner, the overall parallel-launch + spawn-order rule, ### External Reviewer Setup (writing $DESIGN_TMPDIR/plan.txt + the focus-area enum summary line), and the external reviewer launch Bash blocks (2 Cursor archetypes + 2 Codex archetypes) which must stay inline because CI greps SKILL.md for the focus-area enum they carry. The Competition notice must be in context before any reviewer launch below — reading this file now guarantees that.
Launch all 4 reviewers in parallel (in a single message). When Cursor is unavailable, each Cursor archetype slot falls back to Codex; when Codex is unavailable, each Codex archetype slot falls back to Cursor; when both are unavailable, each archetype slot falls back to a Claude subagent. Spawn order matters for parallelism — launch the slowest reviewers first: 2 Cursor archetypes (Arch, Edge), then 2 Codex archetypes (Innovation, Pragmatic). Each reviewer receives the plan text and the feature description. Each must only report findings — never edit files.
### External Reviewer Setup (if codex_available or cursor_available)
Before launching external reviewers, verify the implementation plan exists at $DESIGN_TMPDIR/plan.txt so Codex and Cursor can read it. Step 2b owns writing this file.
Each reviewer walks five focus areas: code-quality / risk-integration / correctness / architecture / security.
### Cursor Archetype Reviewers (2 slots)
Launch 2 Cursor archetype plan reviewers first in the parallel message (Arch and Edge — they take the longest). Each archetype reviews the plan from its specialized perspective. Each Cursor reviewer has full repo access. Fallback chain per slot: Cursor → Codex → Claude subagent (subagent_type: larch:code-reviewer, model: "sonnet" with the archetype personality prepended).
Cursor — Architecture/Standards (if cursor_available):
${CLAUDE_PLUGIN_ROOT}/scripts/launch-cursor-review.sh --output "$DESIGN_TMPDIR/cursor-plan-arch-output.txt" --timeout 1800 --timing-task-kind cursor-plan-arch --prompt "You are an Architecture/Standards reviewer. Review the implementation plan in $DESIGN_TMPDIR/plan.txt for this project. Read the plan file, then explore the codebase to validate the plan. Your role is to emphasize maintainability, engineering standards, separation of concerns, and reuse of existing patterns. Walk five focus areas: (1) Code Quality: logical flaws, code reuse, test coverage, backward compat, style consistency. (2) Risk/Integration: breaking changes, side effects, thread safety, deployment risks, regressions, CI. (3) Correctness: logic errors, off-by-one, nil handling, type mismatches, races, error paths. (4) Architecture: separation of concerns, contract boundaries, invariants, semantic boundaries. (5) Security: injection, authn/authz, secret handling, crypto, deserialization, SSRF, path traversal, dependency CVEs. Tag each finding with its focus area (one of code-quality / risk-integration / correctness / architecture / security). Return numbered findings with focus-area tag, concern, and suggested revision. If a finding is out of scope for this PR but worth tracking, prefix it with [OUT_OF_SCOPE]. When emitting [OUT_OF_SCOPE] findings, include affected repo-relative file paths and line ranges (e.g., skills/foo/bar.sh:120-150) in the finding's concern text when applicable, so /implement Step 9a.1's file-conflict pre-pass can emit serialization edges. If NO issues, output exactly NO_ISSUES_FOUND. Do NOT modify files. Work at your maximum reasoning effort level."
Use run_in_background: true and timeout: 1860000 on the Bash tool call.
Cursor — Edge-cases/Failure-modes (if cursor_available):
${CLAUDE_PLUGIN_ROOT}/scripts/launch-cursor-review.sh --output "$DESIGN_TMPDIR/cursor-plan-edge-output.txt" --timeout 1800 --timing-task-kind cursor-plan-edge --prompt "You are an Edge-case/Failure-mode reviewer. Review the implementation plan in $DESIGN_TMPDIR/plan.txt for this project. Read the plan file, then explore the codebase to validate the plan. Your role is to focus on what can go wrong: boundary conditions, error handling, failure recovery, race conditions, and silent data corruption. Walk five focus areas: (1) Code Quality: logical flaws, code reuse, test coverage, backward compat, style consistency. (2) Risk/Integration: breaking changes, side effects, thread safety, deployment risks, regressions, CI. (3) Correctness: logic errors, off-by-one, nil handling, type mismatches, races, error paths. (4) Architecture: separation of concerns, contract boundaries, invariants, semantic boundaries. (5) Security: injection, authn/authz, secret handling, crypto, deserialization, SSRF, path traversal, dependency CVEs. Tag each finding with its focus area (one of code-quality / risk-integration / correctness / architecture / security). Return numbered findings with focus-area tag, concern, and suggested revision. If a finding is out of scope for this PR but worth tracking, prefix it with [OUT_OF_SCOPE]. When emitting [OUT_OF_SCOPE] findings, include affected repo-relative file paths and line ranges (e.g., skills/foo/bar.sh:120-150) in the finding's concern text when applicable, so /implement Step 9a.1's file-conflict pre-pass can emit serialization edges. If NO issues, output exactly NO_ISSUES_FOUND. Do NOT modify files. Work at your maximum reasoning effort level."
Use run_in_background: true and timeout: 1860000 on the Bash tool call.
Cursor archetype fallback (per slot, if cursor_available is false): For each Cursor archetype slot where Cursor is unavailable, try Codex first (if codex_available). Use the same archetype prompt but launch via the Codex pattern (no --capture-stdout; uses --output-last-message) with distinct per-archetype output paths: $DESIGN_TMPDIR/codex-fallback-cursor-plan-arch-output.txt, $DESIGN_TMPDIR/codex-fallback-cursor-plan-edge-output.txt. If both Cursor and Codex are unavailable for a slot, launch a Claude subagent fallback (subagent_type: larch:code-reviewer, model: "sonnet") with the archetype personality prepended to the plan-review context.
### Codex Archetype Reviewers (2 slots)
Launch 2 Codex archetype plan reviewers second in the parallel message (Innovation and Pragmatic, after Cursor). Each archetype reviews the plan from its specialized perspective. Each Codex reviewer has full repo access. Fallback chain per slot: Codex → Cursor → Claude subagent (subagent_type: larch:code-reviewer, model: "sonnet" with the archetype personality prepended).
Codex — Innovation/Exploration (if codex_available):
${CLAUDE_PLUGIN_ROOT}/scripts/launch-codex-review.sh --output "$DESIGN_TMPDIR/codex-primary-plan-innovation-output.txt" --timeout 1800 --timing-task-kind codex-plan-innovation --prompt "You are an Innovation/Exploration reviewer. Review the implementation plan in $DESIGN_TMPDIR/plan.txt for this project. Read the plan file, then explore the codebase to validate the plan. Your role is to question assumptions, suggest creative alternatives, and flag when the plan takes the obvious path without considering unconventional but potentially superior solutions. Walk five focus areas: (1) Code Quality: logical flaws, code reuse, test coverage, backward compat, style consistency. (2) Risk/Integration: breaking changes, side effects, thread safety, deployment risks, regressions, CI. (3) Correctness: logic errors, off-by-one, nil handling, type mismatches, races, error paths. (4) Architecture: separation of concerns, contract boundaries, invariants, semantic boundaries. (5) Security: injection, authn/authz, secret handling, crypto, deserialization, SSRF, path traversal, dependency CVEs. Tag each finding with its focus area (one of code-quality / risk-integration / correctness / architecture / security). Return numbered findings with focus-area tag, concern, and suggested revision. If a finding is out of scope for this PR but worth tracking, prefix it with [OUT_OF_SCOPE]. When emitting [OUT_OF_SCOPE] findings, include affected repo-relative file paths and line ranges (e.g., skills/foo/bar.sh:120-150) in the finding's concern text when applicable, so /implement Step 9a.1's file-conflict pre-pass can emit serialization edges. If NO issues, output exactly NO_ISSUES_FOUND. Do NOT modify files. Work at your maximum reasoning effort level."
Use run_in_background: true and timeout: 1860000 on the Bash tool call.
Codex — Pragmatism/Safety (if codex_available):
${CLAUDE_PLUGIN_ROOT}/scripts/launch-codex-review.sh --output "$DESIGN_TMPDIR/codex-primary-plan-pragmatic-output.txt" --timeout 1800 --timing-task-kind codex-plan-pragmatic --prompt "You are a Pragmatism/Safety reviewer. Review the implementation plan in $DESIGN_TMPDIR/plan.txt for this project. Read the plan file, then explore the codebase to validate the plan. Your role is to minimize the scope of changes, avoid unnecessary complexity, and ensure existing features are not broken. Walk five focus areas: (1) Code Quality: logical flaws, code reuse, test coverage, backward compat, style consistency. (2) Risk/Integration: breaking changes, side effects, thread safety, deployment risks, regressions, CI. (3) Correctness: logic errors, off-by-one, nil handling, type mismatches, races, error paths. (4) Architecture: separation of concerns, contract boundaries, invariants, semantic boundaries. (5) Security: injection, authn/authz, secret handling, crypto, deserialization, SSRF, path traversal, dependency CVEs. Tag each finding with its focus area (one of code-quality / risk-integration / correctness / architecture / security). Return numbered findings with focus-area tag, concern, and suggested revision. If a finding is out of scope for this PR but worth tracking, prefix it with [OUT_OF_SCOPE]. When emitting [OUT_OF_SCOPE] findings, include affected repo-relative file paths and line ranges (e.g., skills/foo/bar.sh:120-150) in the finding's concern text when applicable, so /implement Step 9a.1's file-conflict pre-pass can emit serialization edges. If NO issues, output exactly NO_ISSUES_FOUND. Do NOT modify files. Work at your maximum reasoning effort level."
Use run_in_background: true and timeout: 1860000 on the Bash tool call.
Codex archetype fallback (per slot, if codex_available is false): For each Codex archetype slot where Codex is unavailable, try Cursor first (if cursor_available). Use the same archetype prompt but launch via the Cursor pattern (with --capture-stdout) with distinct per-archetype output paths: $DESIGN_TMPDIR/cursor-fallback-codex-plan-innovation-output.txt, $DESIGN_TMPDIR/cursor-fallback-codex-plan-pragmatic-output.txt. If both Codex and Cursor are unavailable for a slot, launch a Claude subagent fallback (subagent_type: larch:code-reviewer, model: "sonnet") with the archetype personality prepended to the plan-review context.
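Both fallback chains (Cursor → Codex → Claude subagent for Cursor slots, Codex → Cursor → Claude subagent for Codex slots) follow the same shape, sketched below. `resolve_reviewer_tool` and the `claude-subagent` label are hypothetical names used for illustration; the sketch only picks the tool and does not derive the per-archetype output paths.

```bash
# Hypothetical helper: pick the tool that runs one archetype slot.
# Usage: resolve_reviewer_tool <slot_vendor> <cursor_available> <codex_available>
resolve_reviewer_tool() {
  vendor=$1 cursor_ok=$2 codex_ok=$3
  # The slot's own vendor is tried first, then the other external tool.
  case "$vendor" in
    cursor) first=cursor second=codex ;;
    codex)  first=codex  second=cursor ;;
    *) echo "unknown vendor: $vendor" >&2; return 1 ;;
  esac
  for tool in "$first" "$second"; do
    if [ "$tool" = cursor ] && [ "$cursor_ok" = true ]; then echo cursor; return 0; fi
    if [ "$tool" = codex ] && [ "$codex_ok" = true ]; then echo codex; return 0; fi
  done
  # Neither external tool is available: last resort is a Claude subagent.
  echo claude-subagent
}
```

With this shape the panel always has four reviewers, whatever the availability of the external tools.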
### Collecting, Voting, Finalize, Track Rejected
Follow plan-review.md (loaded via the MANDATORY at the top of Step 3) for: Collecting External Reviewer Results (collect-agent-results.sh for all 4 external reviewers, dedup in-scope and OOS separately), Voting Panel launch-order + threshold + Competition scoring, writing $DESIGN_TMPDIR/voting-tally.md, Finalize Plan Review (accepted findings revise plan, write $DESIGN_TMPDIR/accepted-plan-findings.md, write accepted OOS to $(dirname "$SESSION_ENV_PATH")/oos-accepted-design.md when SESSION_ENV_PATH is non-empty, write all OOS visibility content to $DESIGN_TMPDIR/oos.md, print non-accepted OOS under ## Out-of-Scope Observations only when SESSION_ENV_PATH is empty), and Track Rejected Plan Review Findings (append to $DESIGN_TMPDIR/rejected-findings.md, in-scope only). Accepted OOS Descriptions should include affected repo-relative file paths and line ranges when applicable; /implement Step 9a.1 serializes same-file OOS issues unless the exposed ranges are parseable and non-overlapping.
After the plan-review collection boundary, consult launcher ${OUTPUT}.dirty-tree sidecars, run check-mid-run-dirty-tree.sh --mode checkpoint, and ask for recovery on dirty/unknown regardless of auto_mode, deduped by $DESIGN_TMPDIR/.dirty-tree-prompted-plan-review.
If all reviewers report no in-scope issues and no out-of-scope observations, skip voting and proceed to Step 3.5 if auto_mode=false, or Step 3b if auto_mode=true.
Continue to Step 3.5 or Step 3b IMMEDIATELY. The plan-review result is not terminal — follow the auto_mode branch into discussion or diagram generation.
## Step 3.5 — Design Discussion (Round 2)
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 3.5 — discussion r2" || true
LARCH_TOKEN_SESSION_ID=$("${CLAUDE_PLUGIN_ROOT}/scripts/read-session-env-key.sh" --file "$SESSION_ENV_PATH" --key LARCH_TOKEN_SESSION_ID --default "")
LARCH_CLAUDE_SOURCE_FILE=$("${CLAUDE_PLUGIN_ROOT}/scripts/read-session-env-key.sh" --file "$SESSION_ENV_PATH" --key LARCH_CLAUDE_SOURCE_FILE --default "")
export LARCH_TOKEN_SESSION_ID LARCH_CLAUDE_SOURCE_FILE
"${CLAUDE_PLUGIN_ROOT}/scripts/token-ledger.sh" mark "Step 1 — design Step 3.5 discussion r2" || true
Print: > **🔶 3.5: discussion r2**
If auto_mode=true: Print ⏩ 3.5: discussion r2 — skipped (auto mode) (<elapsed>) and proceed to Step 3b. Do NOT load discussion-rounds.md when auto_mode=true.
If auto_mode=false: Execute the Step 3.5 body in ${CLAUDE_PLUGIN_ROOT}/skills/design/references/discussion-rounds.md. If already loaded at Step 1c, no need to re-load; otherwise MANDATORY — READ ENTIRE FILE: Read ${CLAUDE_PLUGIN_ROOT}/skills/design/references/discussion-rounds.md completely. The body defines Inputs, Behavior (still-contested criteria including close 2-1 voted, fallback-to-synthesis, bucket-skipped, over-cap), Short-circuit, Output schema, Cap, and Terse-answer rules.
## Step 3b — Architecture Diagram
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 3b — arch diagram" || true
LARCH_TOKEN_SESSION_ID=$("${CLAUDE_PLUGIN_ROOT}/scripts/read-session-env-key.sh" --file "$SESSION_ENV_PATH" --key LARCH_TOKEN_SESSION_ID --default "")
LARCH_CLAUDE_SOURCE_FILE=$("${CLAUDE_PLUGIN_ROOT}/scripts/read-session-env-key.sh" --file "$SESSION_ENV_PATH" --key LARCH_CLAUDE_SOURCE_FILE --default "")
export LARCH_TOKEN_SESSION_ID LARCH_CLAUDE_SOURCE_FILE
"${CLAUDE_PLUGIN_ROOT}/scripts/token-ledger.sh" mark "Step 1 — design Step 3b arch diagram" || true
Print: > **🔶 3b: arch diagram**
This step runs on ALL paths through Step 3 — whether voting produced revisions, rejected all findings, or was skipped entirely because all reviewers reported no issues. It always executes before Step 4.
Generate a mermaid Architecture Diagram that represents the high-level system/component structure of the feature based on the finalized implementation plan (revised or original). The diagram should focus on modules, boundaries, and their relationships — not runtime behavior or code flow.
Choose the most appropriate mermaid diagram type for the feature (e.g., graph TD, flowchart, C4Context, classDiagram, etc.). The diagram type is flexible — pick whatever best communicates the architecture.
Diagram contents must obey ${CLAUDE_PLUGIN_ROOT}/skills/shared/mermaid-safe-content.md to avoid sanitizer rejection.
Write the diagram to $DESIGN_TMPDIR/architecture-diagram.candidate.md first. The candidate file includes the ## Architecture Diagram heading and mermaid fence. Validate it before promotion:
"${CLAUDE_PLUGIN_ROOT}/scripts/sanitize-mermaid-fragment.sh" \
--input "$DESIGN_TMPDIR/architecture-diagram.candidate.md" \
--from-md \
--warnings-step "3b"
On STATUS=ok, rename the candidate to $DESIGN_TMPDIR/architecture-diagram.md. If SESSION_ENV_PATH is empty, also print the promoted diagram under a ## Architecture Diagram header with a mermaid code fence:
## Architecture Diagram
```mermaid
<diagram content>
```
**If diagram generation and sanitizer validation succeed**, print: `✅ 3b: arch diagram — saved (<elapsed>)` when `SESSION_ENV_PATH` is non-empty, or `✅ 3b: arch diagram — generated (<elapsed>)` when standalone — then IMMEDIATELY continue to Step 4.
**If the sanitizer returns `STATUS=rejected` or exits 2**, do NOT promote the candidate. Delete `$DESIGN_TMPDIR/architecture-diagram.candidate.md`, print `**⚠ 3b: architecture diagram — rejected by mermaid sanitizer (REASON_TOKEN=<token>); proceeding without diagram.**`, and continue to Step 4. When `SESSION_ENV_PATH` is non-empty, append `- **Step 3b — architecture diagram rejected:** <REASON_TOKEN>` under `### Warnings` in `$(dirname "$SESSION_ENV_PATH")/execution-issues.md` via `${CLAUDE_PLUGIN_ROOT}/scripts/append-execution-issue.sh`. Use only `REASON_TOKEN` values, not raw diagram content.
**If diagram generation fails** (e.g., the feature is too abstract to diagram meaningfully), print: `**⚠ 3b: arch diagram — generation failed, proceeding without diagram (<elapsed>)**` — then IMMEDIATELY continue to Step 4.
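The promote-or-discard decision for the candidate file can be sketched as below. `finish_diagram` is a hypothetical helper that receives the sanitizer's exit code and `STATUS` value as arguments, since how they are captured from the script's output is not shown here. Treating every result other than a clean `STATUS=ok` as a rejection is a conservative assumption of this sketch.

```bash
# Hypothetical helper: promote the candidate diagram only on a clean result.
# Usage: finish_diagram <tmpdir> <sanitizer_exit_code> <sanitizer_status>
finish_diagram() {
  tmpdir=$1 exit_code=$2 status=$3
  candidate="$tmpdir/architecture-diagram.candidate.md"
  if [ "$exit_code" -ne 2 ] && [ "$status" = ok ]; then
    # STATUS=ok: rename the candidate to the final artifact name.
    mv "$candidate" "$tmpdir/architecture-diagram.md"
    echo promoted
  else
    # Rejected (or exit 2): never promote, delete the candidate.
    rm -f "$candidate"
    echo rejected
  fi
}
```

In both outcomes the design continues to Step 4; only the presence of `architecture-diagram.md` differs.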
> **Continue to Step 4 IMMEDIATELY.** The architecture diagram branch is not terminal — rejected-findings reporting and cleanup still must run.
## Step 4 — Rejected Plan Review Findings Report
```bash
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 4 — rejected findings" || true
```
Print any rejected plan review findings:
1. Ensure $DESIGN_TMPDIR/rejected-findings.md, $DESIGN_TMPDIR/accepted-plan-findings.md, and $DESIGN_TMPDIR/oos.md exist; create empty files for any missing may-be-empty artifact so Step 5 can export a complete manifest.
2. Check if $DESIGN_TMPDIR/rejected-findings.md exists and is non-empty.
3. If it has content and SESSION_ENV_PATH is empty, print it under a ## Unimplemented Plan Review Suggestions header, formatted clearly with the reviewer name, the suggestion, and the reason for each.
4. If it has content and SESSION_ENV_PATH is non-empty, print only ✅ 4: rejected findings — saved (<elapsed>).
5. If $DESIGN_TMPDIR/rejected-findings.md is empty (it always exists after item 1), print: ✅ 4: rejected findings — all suggestions implemented (<elapsed>)
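The Step 4 items can be read as one small routine, sketched here. `report_rejected_findings` is a hypothetical helper; it prints the file contents verbatim with `cat`, which simplifies the "formatted clearly" requirement, and takes the elapsed value as an opaque string.

```bash
# Hypothetical helper covering the Step 4 branches.
# Usage: report_rejected_findings <tmpdir> <session_env_path> <elapsed>
report_rejected_findings() {
  tmpdir=$1 session_env_path=$2 elapsed=$3
  # Create any missing may-be-empty artifact so the manifest is complete.
  for name in rejected-findings.md accepted-plan-findings.md oos.md; do
    [ -e "$tmpdir/$name" ] || : > "$tmpdir/$name"
  done
  rejected="$tmpdir/rejected-findings.md"
  if [ ! -s "$rejected" ]; then
    # Empty file: nothing was rejected.
    echo "✅ 4: rejected findings — all suggestions implemented ($elapsed)"
  elif [ -n "$session_env_path" ]; then
    # Invoked from a parent session: terse status only.
    echo "✅ 4: rejected findings — saved ($elapsed)"
  else
    # Standalone: show the findings inline.
    echo "## Unimplemented Plan Review Suggestions"
    cat "$rejected"
  fi
}
```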
After printing rejected findings (or the "all implemented" message), IMMEDIATELY continue to Step 5 — do NOT halt or treat this as the end of the design.
> **Continue to Step 5 IMMEDIATELY.** Rejected-findings output is not terminal — cleanup and manifest export still must run.
## Step 5 — Cleanup and Final Warnings
```bash
SESSION_ENV_PATH="$SESSION_ENV_PATH" LARCH_TIMING_SKILL=design "${CLAUDE_PLUGIN_ROOT}/scripts/timing-ledger.sh" mark "design Step 5 — cleanup" || true
```
### 5a — Update Health Status File
Health status file updates are handled automatically by `collect-agent-results.sh --write-health` during reviewer collection (Steps 2a.3 and 3). No additional cleanup-time write is needed unless a reviewer was marked unhealthy outside of a `collect-agent-results.sh` call (e.g., via manual timeout detection). If `SESSION_ENV_PATH` is non-empty and any reviewer was marked unhealthy during this session without that state being written by `collect-agent-results.sh`, re-write the health status file at `${SESSION_ENV_PATH}.health` with the final health state before cleanup.
### 5b — Remove Temp Directory
If `SESSION_ENV_PATH` is non-empty, export design artifacts before cleanup:
```bash
${CLAUDE_PLUGIN_ROOT}/skills/design/scripts/write-design-manifest.sh --design-tmpdir "$DESIGN_TMPDIR" --implement-tmpdir "$(dirname "$SESSION_ENV_PATH")"
```
Parse `MANIFEST_WRITTEN=<path>` from stdout. Set the mental flag `MANIFEST_EXPORT_OK=true` if the command exited 0 AND the manifest file exists AND is non-empty. Otherwise set `MANIFEST_EXPORT_OK=false`, print `**⚠ 5: cleanup — design manifest export failed. Parent /implement will rerun /design. Preserving $DESIGN_TMPDIR for inspection.**`, and SKIP the `cleanup-tmpdir.sh` step below entirely so the parent `/implement` (or operator) can inspect the partial artifacts. If `SESSION_ENV_PATH` is empty, skip this manifest write and treat `MANIFEST_EXPORT_OK` as `true` for cleanup-gating purposes (standalone `/design` preserves visible inline output, has no parent consumer, and always cleans up on the normal path).
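The three-part success condition can be sketched as a small check. This is only an illustration: `manifest_export_ok` is a hypothetical helper, and it assumes the writer prints `MANIFEST_WRITTEN=<path>` on a line of its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): derive MANIFEST_EXPORT_OK.
# $1 = writer exit code, $2 = writer stdout
manifest_export_ok() {
  local exit_code="$1" stdout="$2" manifest_path
  # Assumption: the path is reported as a standalone MANIFEST_WRITTEN=<path> line.
  manifest_path="$(printf '%s\n' "$stdout" | sed -n 's/^MANIFEST_WRITTEN=//p' | tail -n 1)"
  # All three conditions must hold: exit 0, a reported path, and a non-empty file.
  if [ "$exit_code" -eq 0 ] && [ -n "$manifest_path" ] && [ -s "$manifest_path" ]; then
    echo true
  else
    echo false
  fi
}
```

A caller would capture the writer's stdout and exit code, then set `MANIFEST_EXPORT_OK="$(manifest_export_ok "$rc" "$out")"`. When `SESSION_ENV_PATH` is empty the writer is not invoked, so the flag is simply treated as `true`.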
Manifest helper contracts (per `${CLAUDE_PLUGIN_ROOT}/.claude/rules/script-md-siblings.md`):
- `${CLAUDE_PLUGIN_ROOT}/skills/design/scripts/write-design-manifest.sh`: atomic writer invoked above. Sibling contract: `${CLAUDE_PLUGIN_ROOT}/skills/design/scripts/write-design-manifest.md`.
- `${CLAUDE_PLUGIN_ROOT}/skills/design/scripts/read-design-manifest.sh`: consumer-side reader/verifier invoked from `skills/implement/SKILL.md` Step 1 after `/design` returns. Producer/reader colocation under `skills/design/scripts/` is intentional (plan-review FINDING_12 vote: keep colocated, do not relocate to `skills/implement/scripts/`). Sibling contract: `${CLAUDE_PLUGIN_ROOT}/skills/design/scripts/read-design-manifest.md`.
- `${CLAUDE_PLUGIN_ROOT}/skills/design/scripts/test-design-manifest.sh`: regression harness for both writer and reader (atomicity, missing-required-artifact rejection, KV grammar, source/eval injection rejection, path-traversal rejection, symlink rejection, control-character rejection, malformed-key rejection). Wired into `make lint` via the `test-design-manifest` Makefile target. Sibling contract: `${CLAUDE_PLUGIN_ROOT}/skills/design/scripts/test-design-manifest.md`.
Remove the session temp directory and all files within it. Run `cleanup-tmpdir.sh` only when `MANIFEST_EXPORT_OK=true` AND `STANDALONE_HEAVY_FAILED` is unset or `false`; otherwise skip cleanup so `$DESIGN_TMPDIR` is preserved for inspection. `STANDALONE_HEAVY_FAILED=true` is set by the Step 2a Subagent heavy phase failure branch when `SESSION_ENV_PATH` is empty (standalone `/design --subagent` failed); `MANIFEST_EXPORT_OK=false` is set by Step 5b's writer-invocation failure (nested `/implement` path):
```bash
${CLAUDE_PLUGIN_ROOT}/scripts/cleanup-tmpdir.sh --dir "$DESIGN_TMPDIR"
```
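The cleanup gate can be sketched as a single predicate. This is only an illustration: `should_cleanup` is a hypothetical name, and treating an unset `MANIFEST_EXPORT_OK` as `false` and an empty `STANDALONE_HEAVY_FAILED` as unset are conservative assumptions rather than documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical predicate (not part of the skill): succeeds only when cleanup may run.
should_cleanup() {
  # Assumption: an unset MANIFEST_EXPORT_OK is treated as false (preserve artifacts).
  [ "${MANIFEST_EXPORT_OK:-false}" = "true" ] || return 1
  # STANDALONE_HEAVY_FAILED blocks cleanup only when it is exactly "true".
  [ "${STANDALONE_HEAVY_FAILED:-false}" != "true" ]
}
```

A caller would run `cleanup-tmpdir.sh --dir "$DESIGN_TMPDIR"` only when `should_cleanup` succeeds, and otherwise leave `$DESIGN_TMPDIR` in place for inspection.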
Repeat any external reviewer warnings from earlier steps (Step 0 reviewer-availability checks via `session-setup.sh`, Step 2a sketch-phase failures/timeouts, Step 3 runtime failures, or Step 3b diagram generation failure) so they are visible at the end of the workflow. For example:
- `**⚠ Codex not available: <reason>**`
- `**⚠ Cursor review failed: <reason>**`
- `**⚠ Cursor sketch timed out / produced empty output**`
- `**⚠ Codex sketch timed out / produced empty output**`
- `**⚠ 3b: arch diagram — generation failed, proceeding without diagram (<elapsed>)**`
If `STEP_NUM_PREFIX` is empty (standalone mode), print: `✅ 5: cleanup — design complete! (<elapsed>)`
If `STEP_NUM_PREFIX` is non-empty (orchestrated mode), skip this final print; the parent orchestrator handles overall progress.