start
// Start a new task. Give it a description and it handles discovery, spec, plan, execution, and audit.
| name | start |
| description | Start a new task. Give it a description and it handles discovery, spec, plan, execution, and audit. |
You are the entry point for dynos-work. You own all human-in-the-loop gates before execution. When done, the task is ready for /dynos-work:execute.
There is one pipeline for all tasks. There are no shortcuts. Historical memory may inform discovery and design review, but it is advisory only. Human approval and deterministic artifact checks decide readiness.
After EVERY Agent tool call in this skill (planner, spec-completion auditor, testing-executor), you MUST write a receipt that records token usage. Read total_tokens from the Agent tool result's usage summary and pass it as --tokens-used in the matching receipt command below.
Before EVERY planner receipt write, you MUST first write a per-phase injected-prompt sidecar by piping the planner prompt body into hooks/router.py planner-inject-prompt. Capture the printed sha256 digest and pass it back through as the injected_prompt_sha256=<digest> kwarg on the receipt call. The receipt will raise ValueError if the sidecar is missing or its contents do not match — that is the proof-of-injection gate.
# Discovery planner — write sidecar, capture digest:
DISCOVERY_DIGEST=$(printf '%s' "$DISCOVERY_PROMPT" | python3 "${PLUGIN_HOOKS}/router.py" planner-inject-prompt --task-id {id} --phase discovery)
# Spec planner — write sidecar, capture digest:
SPEC_DIGEST=$(printf '%s' "$SPEC_PROMPT" | python3 "${PLUGIN_HOOKS}/router.py" planner-inject-prompt --task-id {id} --phase spec)
# Plan planner — write sidecar, capture digest:
PLAN_DIGEST=$(printf '%s' "$PLAN_PROMPT" | python3 "${PLUGIN_HOOKS}/router.py" planner-inject-prompt --task-id {id} --phase plan)
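Conceptually, the proof-of-injection gate reduces to a digest comparison. A minimal sketch, assuming the router hashes the raw UTF-8 prompt bytes (the real router.py and receipt code may normalize differently):

```python
import hashlib

def prompt_digest(prompt_body: str) -> str:
    # Sketch: sha256 over the exact prompt bytes, as captured by the sidecar.
    return hashlib.sha256(prompt_body.encode("utf-8")).hexdigest()

def verify_injection(sidecar_text: str, claimed_digest: str) -> None:
    # Mirrors the receipt-side check: recompute and compare, raise on mismatch.
    if prompt_digest(sidecar_text) != claimed_digest:
        raise ValueError("injected-prompt sidecar missing or digest mismatch")

prompt = "Phase: discovery\nRead raw-input.md ..."
verify_injection(prompt, prompt_digest(prompt))  # passes silently
```

Any drift between the sidecar contents and the digest passed to the receipt trips the ValueError, which is exactly the failure mode the gate exists to surface.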
Then, after each planner subagent returns, write the matching deterministic receipt:
# After planner spawn (discovery/design/classification):
python3 hooks/ctl.py planner-receipt .dynos/task-{id} discovery \
--tokens-used {TOTAL_TOKENS} \
--model {MODEL_USED} \
--agent-name planning \
--injected-prompt-sha256 "${DISCOVERY_DIGEST}"
# After planner spawn (spec normalization):
python3 hooks/ctl.py planner-receipt .dynos/task-{id} spec \
--tokens-used {TOTAL_TOKENS} \
--model {MODEL_USED} \
--agent-name planning \
--injected-prompt-sha256 "${SPEC_DIGEST}"
# After planner spawn (plan generation OR combined Spec + Plan):
python3 hooks/ctl.py planner-receipt .dynos/task-{id} plan \
--tokens-used {TOTAL_TOKENS} \
--model {MODEL_USED} \
--agent-name planning \
--injected-prompt-sha256 "${PLAN_DIGEST}"
# After validate_task_artifacts passes — REQUIRED before execute skill can run.
python3 hooks/ctl.py plan-validated-receipt .dynos/task-{id}
# After spec-completion auditor on high/critical-risk tasks only:
python3 hooks/ctl.py plan-audit-receipt .dynos/task-{id} \
--tokens-used {TOTAL_TOKENS} \
--model {MODEL_USED}
Each receipt auto-records tokens to token-usage.json. If you skip this, the retrospective will show 0 tokens and the effectiveness scores will be wrong. This is the same enforcement pattern as the execute skill's receipts.
Ensure .dynos/ exists: mkdir -p .dynos. Then auto-register this project with the global registry (silent, idempotent): run python3 "${PLUGIN_HOOKS}/registry.py" register "$(pwd)" 2>/dev/null || true. This creates ~/.dynos/projects/{slug}/ and adds the project to ~/.dynos/registry.json if not already registered. No user action needed. Then ensure the local maintenance daemon is running (silent, idempotent): run PYTHONPATH="${PLUGIN_HOOKS}:${PYTHONPATH:-}" python3 "${PLUGIN_HOOKS}/daemon.py" start --root "$(pwd)" 2>/dev/null || true. If already running, it is a no-op.

Generate a task id of the form task-YYYYMMDD-NNN. Create .dynos/task-{id}/ and write raw-input.md with the full task description exactly as given. Then write manifest.json with at least:

{
"task_id": "task-20260403-001",
"created_at": "ISO timestamp",
"title": "First 80 characters of task description",
"raw_input": "Full task description as provided by user",
"input_type": "text | prd | wireframe | mixed",
"stage": "FOUNDRY_INITIALIZED",
"classification": null,
"retry_counts": {},
"blocked_reason": null,
"completed_at": null
}
Create .dynos/task-{id}/execution-log.md. After writing raw-input.md, inspect the user input for:

- Attached or referenced spec files ending in .prd.md, .pdf, or .txt — treat these as a primary spec source.

Validate that manifest.json parses as valid JSON and that task_id, created_at, raw_input, and stage are present before continuing. Run:

python3 hooks/ctl.py validate-task .dynos/task-{id}
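As a mental model of what that validation enforces, here is a hypothetical mirror of the manifest checks (ctl validate-task is authoritative; this sketch covers only the keys named above):

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"task_id", "created_at", "raw_input", "stage"}

def check_manifest(path: Path) -> list:
    # Returns a list of error strings; empty list means the manifest passes.
    try:
        manifest = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        return [f"manifest.json is not valid JSON: {exc}"]
    return [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - manifest.keys())]
```

An empty return means the skill may continue; anything else means repair manifest.json before proceeding.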
Print: dynos-work: Foundry Task Initialized: task-YYYYMMDD-NNN

After every subagent spawn AND every deterministic validation, record the event:
For LLM subagent spawns (planner, testing-executor, spec-completion-auditor):
PYTHONPATH="${PLUGIN_HOOKS}:${PYTHONPATH:-}" python3 "${PLUGIN_HOOKS}/lib_tokens.py" record \
--task-dir .dynos/task-{id} \
--agent "{agent_name}" \
--model "{model_name}" \
--input-tokens {input_tokens} \
--output-tokens {output_tokens} \
--phase planning \
--stage "{current_manifest_stage}" \
--type spawn \
--detail "{what the agent did}"
For deterministic Python validations (validate_task_artifacts, ctl validate-task, spec heading check, etc.):
PYTHONPATH="${PLUGIN_HOOKS}:${PYTHONPATH:-}" python3 "${PLUGIN_HOOKS}/lib_tokens.py" record \
--task-dir .dynos/task-{id} \
--agent "{validation_tool_name}" \
--model "none" \
--input-tokens 0 \
--output-tokens 0 \
--phase planning \
--stage "{current_manifest_stage}" \
--type deterministic \
--detail "{validation result summary}"
Where:
- input_tokens/output_tokens come from the Agent tool result's usage summary (pass 0 if unavailable)
- model_name is the model used (e.g. "opus", "sonnet", "haiku")
- current_manifest_stage is the current stage from manifest.json (e.g. DISCOVERY, SPEC_NORMALIZATION, PLANNING)
- detail is a short description of what happened (e.g. "Discovery + Design + Classification", "Validated spec.md headings — 0 errors")

This writes to .dynos/task-{id}/token-usage.json with a chronological event log. The same hook is used by execute and audit skills — events accumulate across all phases.
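The chronological event log rolls up trivially for the retrospective. A sketch of the aggregation, assuming the event fields shown above (the real lib_tokens.py may track more):

```python
def total_tokens(events: list) -> dict:
    # Sum token counts and count LLM spawns across the event log.
    totals = {"input": 0, "output": 0, "spawns": 0}
    for e in events:
        totals["input"] += e.get("input_tokens", 0)
        totals["output"] += e.get("output_tokens", 0)
        if e.get("type") == "spawn":
            totals["spawns"] += 1
    return totals
```

This is also why skipping a record call shows up as 0 tokens downstream: the retrospective only sees what landed in the event log.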
Specific events to record in this skill:
Every Agent tool spawn in this skill (planner, spec-completion-auditor, testing-executor) requires a stamped active-segment-role file. The subagent's pre_tool_use.py reads this file to resolve its write role; without it the subagent runs as execute-inline and write_policy.decide_write denies the writes the subagent is trying to make:
- planning is the only role that may write discovery-notes.md, design-decisions.md, spec.md, and plan.md. Without the stamp, the planner falls to execute-inline and every spec/plan write is denied.
- audit-* roles are the only roles that may write audit-reports/. Without the stamp, the spec-completion-auditor falls to execute-inline and its audit report write is denied.
- testing-executor is the only executor role permitted to write evidence/tdd-tests.md plus the test files in this stage; without it the spawn falls to execute-inline (works for repo files but not for the executor-scoped invariants downstream).

Direct writes to active-segment-role are denied by write_policy.py — the file is wrapper-required. Always go through ctl:
python3 "${PLUGIN_HOOKS}/ctl.py" stamp-role .dynos/task-{id} --role "{role}"
The wrapper enforces _STAMP_ROLE_ALLOWLIST (hooks/ctl.py), which gates the allowed values. Forgery defense for audit-* claims is enforced downstream by receipt_audit_done, which cross-checks against spawn-log.jsonl — stamping a role that no real Agent spawn matches produces an unforgeable audit-trail mismatch at receipt time.
The stamp file is overwritten by each new stamp, so successive phases do not need explicit cleanup between spawns. Step 9 cleans it up before handing off to /dynos-work:execute.
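The role rules above can be pictured as a small decision table. A hypothetical distillation of write_policy.decide_write for just the paths named here (the real policy has many more rules and a stricter default):

```python
def may_write(role: str, relpath: str) -> bool:
    # Hypothetical sketch; hooks/write_policy.py is authoritative.
    if relpath == "active-segment-role":
        return False  # wrapper-required: only ctl stamp-role may write it
    if relpath.startswith("audit-reports/"):
        return role.startswith("audit-")
    if relpath in {"discovery-notes.md", "design-decisions.md", "spec.md", "plan.md"}:
        return role == "planning"
    # Repo files: both execute-inline and *-executor roles may write (brevity default).
    return True
```

The "falls to execute-inline" failure mode is visible here: an unstamped planner hits the planning-only branch and is denied.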
Read raw-input.md, and .dynos/trajectories.json if available (skip the latter when learning_enabled=false in project policy.json). If critical ambiguities remain, ask the user via AskUserQuestion and append the Q&A to discovery-notes.md.

There is no direct-classify shortcut here. Always use the planner for Discovery + Design + Classification. Do NOT skip the planner spawn and do NOT infer classification directly from the task input in prompt logic.
Learned Planning Skill Injection (skip when learning_enabled=false): Before spawning the planner, check if a learned planning skill exists for this task type:
PYTHONPATH="${PLUGIN_HOOKS}:${PYTHONPATH:-}" python3 -c "from pathlib import Path; from router import resolve_route; r = resolve_route(Path('.'), 'plan-skill', '{task_type}'); print(r['agent_path'] or '')"
If a non-empty path is returned AND the file exists, read it, strip frontmatter, and append its contents to the planner's instruction below under a ## Learned Planning Rules heading. This injects project-specific planning patterns (e.g., tighter acceptance criteria, better segment sizing) derived from past task retrospectives. Log: {timestamp} [ROUTE] plan-skill route={mode} agent={agent_name}.
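Stripping frontmatter before appending is mechanical. A minimal sketch, assuming YAML frontmatter delimited by --- lines (the learned-skill file format is an assumption here):

```python
def strip_frontmatter(text: str) -> str:
    # Drop a leading YAML frontmatter block before appending the body
    # under a '## Learned Planning Rules' heading.
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return "\n".join(lines[i + 1:]).lstrip("\n")
    return text  # no (or unterminated) frontmatter: pass through unchanged
```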
Stamp role BEFORE the spawn (MANDATORY):
python3 "${PLUGIN_HOOKS}/ctl.py" stamp-role .dynos/task-{id} --role "planning"
Without this stamp the planner subagent resolves to execute-inline and its writes to discovery-notes.md and design-decisions.md are denied by write_policy.
Spawn the Planner subagent (dynos-work:planning) with instruction:
Phase: Discovery + Design + Classification (combined).
Read raw-input.md and discovery-notes.md if present. Also read any attached trajectory context as advisory prior history only.
Be ruthless. Surface ambiguity, hidden requirements, failure modes, and soft assumptions. Do not produce generic questions or generic design options.
Perform three things in one pass:
1. Discovery: generate only the highest-value unresolved questions.
2. Design Options: break the task into subtasks. For any subtask rated hard complexity or critical value, generate 2-3 design options with pros and cons. For easy or medium subtasks, decide directly.
3. Classification: produce type, domains, risk_level, and notes.
Return three sections:
- Questions
- Design Options
- Classification (JSON)
Reject lazy output mentally before accepting it. If the planner returns generic questions, generic design options, or a mushy classification, send it back.
Present any remaining discovery questions to the user and append answers to discovery-notes.md.
If hard or critical design options were returned, present each option to the user with pros/cons and record the chosen design in design-decisions.md.
If no high-risk design options were returned, write design-decisions.md with the autonomous design choices and rationale.
Write the returned classification object to /tmp/classification-{id}.json, then run python3 hooks/ctl.py write-classification .dynos/task-{id} --from /tmp/classification-{id}.json.
Deterministic validation before proceeding:
- classification.type is present.
- classification.risk_level is present.
- classification.domains is present.
- If write-classification fails, stop and correct the payload before moving on.

Finalize classification through the deterministic control-plane entrypoint:
python3 hooks/ctl.py run-start-classification .dynos/task-{id}
run-start-classification validates the classification payload, applies fast-track + tdd_required, and advances the manifest to SPEC_NORMALIZATION when the task is ready to continue. If it exits non-zero, the JSON payload names the exact classification defects.
If the output contains "tdd_required": true: Step 8 (TDD-First Gate) is mandatory for this task. Do not override this with your own risk assessment — tdd_required is set deterministically by the system for high and critical risk tasks. The state machine will block PLAN_AUDIT → PRE_EXECUTION_SNAPSHOT if Step 8 is skipped. Note this now and plan accordingly.
Fast-track is determined by run-start-classification. Do not recompute it in prompt logic or by hand.
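A sketch of reading the two flags out of the command's JSON payload (the tdd_required and fast_track field names are as used in this document; the real payload may carry more):

```python
import json

def plan_flags(ctl_stdout: str):
    # Returns (tdd_required, fast_track) from run-start-classification output.
    payload = json.loads(ctl_stdout)
    return bool(payload.get("tdd_required")), bool(payload.get("fast_track"))
```

Treat the parsed values as authoritative; never recompute either flag in prompt logic.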
When fast-tracked (fast_track: true), apply these simplifications throughout the remaining steps:
- Implicit Requirements Surfaced and Risk Notes sections can contain a single line each if no significant risks exist.
- With fast_track: true in the manifest, spawn only spec-completion-auditor and security-auditor. Skip all other auditors regardless of streak or domain.
- If any condition is not met, proceed normally (no fast-track). Do not ask the user — this is a deterministic gate.
Before inventing a solution from scratch, run the deterministic gate:
python3 hooks/ctl.py run-external-solution-gate .dynos/task-{id}
This command writes .dynos/task-{id}/external-solution-gate.json and prints the same decision payload to stdout. The gate owns the decision artifact. Do NOT hand-write or rewrite this JSON in prompt logic.
The artifact shape is:
{
"search_recommended": true,
"search_used": false,
"query_reason": "One sentence explaining the recommendation",
"candidates": [],
"recommended_choice": null,
"decision_basis": {
"task_type": "feature|bugfix|refactor|migration|ml|full-stack",
"risk_level": "low|medium|high|critical",
"domains": ["backend"],
"trigger_matches": ["stripe"],
"local_bug_matches": [],
"file_scoped": false
}
}
Rules:
If search_recommended is false, proceed with local repo evidence only.
If search_recommended is true, you MUST conduct external research before proceeding. Use query_reason and decision_basis to form the search query, then call:
python3 hooks/ctl.py write-search-receipt .dynos/task-{id} \
--query "<your search query>" \
--urls-consulted "<url1>,<url2>" \
--findings-summary "<one-sentence summary of what the research found>"
Both --urls-consulted and --findings-summary are required when search_recommended is true; they record the research evidence in the receipt so the audit chain can verify that real external research was performed.
run-spec-ready (Step 3 exit) checks for this receipt and exits non-zero if it is missing. There is no rationalization that bypasses this — if search_recommended is true and the receipt is absent, the spec cannot advance to SPEC_REVIEW.
The planner still owns the final design choice. Research findings inform the plan; they do not automatically authorize adopting any external library or pattern.
Do not mutate the gate artifact by hand to claim search happened or to inject candidates.
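The rules above reduce to one predicate on the gate artifact. A sketch, using the artifact shape shown earlier:

```python
import json

def research_required(gate_json: str) -> bool:
    # External research (and the write-search-receipt call) is mandatory
    # exactly when the gate recommends a search that has not been recorded.
    gate = json.loads(gate_json)
    return bool(gate.get("search_recommended")) and not gate.get("search_used", False)
```

run-spec-ready enforces the same condition at Step 3 exit by checking for the search receipt.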
Logging: append exactly one line to the execution log:
{timestamp} [GATE] external-solution — recommended: {true|false}
Proceed to Step 3.
Fast-track combined spawn: If manifest.json has "fast_track": true, skip the spawn and the spec validation. Spec is produced in Step 5 by the combined Spec + Plan planner spawn. Do NOT advance the manifest stage here — leave it at SPEC_NORMALIZATION. Walking the stage forward before spec.md exists breaks the artifact invariant in hooks/lib_validate.py (_SPEC_REQUIRED_AFTER requires spec.md once stage is SPEC_REVIEW or beyond), and any /dynos-work:status or /dynos-work:resume invocation in the window between Step 3 and Step 5 completing would observe stage=PLANNING with no spec on disk. The stage walk happens in Step 5 after spec.md is written. Log: {timestamp} [SKIP] spec-normalization-spawn — fast_track combined planner (stage walk deferred to Step 5). Skip the rest of this step and proceed to Step 4.
Normal path: Stamp role BEFORE the spawn (MANDATORY):
python3 "${PLUGIN_HOOKS}/ctl.py" stamp-role .dynos/task-{id} --role "planning"
Without this stamp the planner falls to execute-inline and the spec.md write is denied by write_policy.decide_write (the spec.md guard at hooks/write_policy.py:290-296 denies executor roles outright). This is the failure mode reported in the SPEC_NORMALIZATION block incident.
Spawn the Planner subagent with instruction:
Phase: Spec Normalization.
Read raw-input.md, discovery-notes.md, and design-decisions.md.
Also read the actual implementation files referenced in the task (e.g., the files that will be modified). Verify runtime semantics directly from the code — do not assume template engines, escaping conventions, or generation mechanisms without reading the relevant functions. Include specific function signatures, data flow paths, and module boundaries in the spec.
Write a spec that leaves executors zero room to hand-wave. Name the exact behavior, exact boundaries, exact failure modes, and exact evidence needed to prove completion.
Write spec.md.
See docs/spec-writing-rules.md for known spec-writing anti-patterns.
If the spec still contains vague adjectives, missing states, or unstated boundary behavior after normalization, send it back again.
After spec.md is written, run deterministic spec validation:
- All required sections are present: Task Summary, User Context, Acceptance Criteria, Implicit Requirements Surfaced, Out of Scope, Assumptions, and Risk Notes.
- Acceptance criteria are numbered starting at 1 and incrementing by 1 with no gaps.
- Every assumption is tagged needs confirmation or safe assumption.

If any rule fails, send the Planner back to fix spec.md before presenting it.
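A rough sketch of those deterministic spec rules (run-spec-ready is authoritative; the heading and numbering regexes here are illustrative assumptions about the spec's markdown shape):

```python
import re

REQUIRED_HEADINGS = [
    "Task Summary", "User Context", "Acceptance Criteria",
    "Implicit Requirements Surfaced", "Out of Scope", "Assumptions", "Risk Notes",
]

def spec_errors(spec_text: str) -> list:
    # Check required headings, then that numbered items start at 1 with no gaps.
    errors = [f"missing heading: {h}" for h in REQUIRED_HEADINGS
              if not re.search(rf"^#+\s*{re.escape(h)}\s*$", spec_text, re.M)]
    numbers = [int(m) for m in re.findall(r"^(\d+)\.\s", spec_text, re.M)]
    if numbers != list(range(1, len(numbers) + 1)):
        errors.append("acceptance criteria must start at 1 and increment with no gaps")
    return errors
```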
Finalize spec readiness through the deterministic control-plane entrypoint:
python3 hooks/ctl.py run-spec-ready .dynos/task-{id}
run-spec-ready validates spec.md, writes the spec-validated receipt, and advances SPEC_NORMALIZATION -> SPEC_REVIEW when the artifact is sound. If it exits non-zero, the JSON payload tells you exactly why the spec must be regenerated.
Fast-track skip: If manifest.json has "fast_track": true, skip this step. Spec is reviewed together with the plan in Step 6 (combined approval gate). Log: {timestamp} [SKIP] spec-review — fast_track combined gate.
Auto-approve path (precedes the human path): If manifest.json has "auto_approve_gates": true, do NOT present spec.md to the user. Run the auto-approved variant of the approve-stage ctl command instead. This is the only sanctioned bypass — it still hashes the live spec.md, writes a human-approval-SPEC_REVIEW receipt with approver_type="residual-auto", and advances SPEC_REVIEW → PLANNING through the same atomic gate that the human path uses. The receipt is forensically distinguishable from a human approval (the approver_type field), so the audit chain remains intact.
python3 hooks/ctl.py approve-stage .dynos/task-{id} SPEC_REVIEW --auto-approved
Exit code 1 means the gate refused — typically because auto_approve_gates is not true (the manifest flag was flipped off mid-flight) or a hash mismatch. Log the stderr message and fall through to the human-approval path below. Do NOT retry with --auto-approved if the manifest flag is no longer true; the gate is correct to refuse.

In the auto path, the "if changes requested" and "if rejected" branches of the human path below are unreachable — the residual queue's classification has already filtered out tasks where human review is required. If you find yourself in those branches with auto_approve_gates=true, that is a bug in the pick-time ceilings, not a runtime decision to make here.
Normal path: Present spec.md to the user and ask for approval.
- If approved: run the approve-stage ctl command below. It hashes the current spec.md and writes the human-approval-SPEC_REVIEW receipt with that hash; the scheduler then observes the receipt write and advances the task to PLANNING asynchronously. Do NOT write a manual [HUMAN] log line — approve-stage is the only path that satisfies the receipt-gate in transition_task (which compares the receipt's artifact_sha256 against the live spec.md at transition time and refuses with human-approval-SPEC_REVIEW / hash mismatch substrings on drift).
- If changes are requested: append the feedback, respawn the Planner, re-run receipt_spec_validated, and present the updated spec again. Do NOT call approve-stage until the user re-approves the regenerated spec.
- If rejected: run python3 hooks/ctl.py transition .dynos/task-{id} FAILED, append [FAILED] Spec rejected by user, and stop. Do not edit manifest.json directly.

When approved:
python3 hooks/ctl.py approve-stage .dynos/task-{id} SPEC_REVIEW
Exit code 0 means the receipt was written and the scheduler queued the advance to PLANNING (verify via manifest.stage after the in-process event dispatch completes). Exit code 1 means the gate refused — the stderr text identifies the cause (missing artifact, hash drift, illegal transition). Do not retry without addressing the reported cause; in particular, do not call python3 hooks/ctl.py transition ... --force to bypass — that would advance the stage without a receipt and break the audit chain.
(transition_task auto-appends the [STAGE] → PLANNING log line; do not write it manually.)
Choose planning mode through ctl:
python3 "${PLUGIN_HOOKS}/ctl.py" run-planning-mode .dynos/task-{id}
Use the JSON output as authoritative:
- planning_mode == "fast_track_combined": use the fast-track combined flow
- planning_mode == "hierarchical": use hierarchical planning
- planning_mode == "standard": use standard planning

Do NOT re-derive fast-track, risk-based escalation, or acceptance-criteria thresholds in prompt logic.
Stamp role BEFORE every planner spawn in this step (MANDATORY — applies to all three flows below):
python3 "${PLUGIN_HOOKS}/ctl.py" stamp-role .dynos/task-{id} --role "planning"
For hierarchical flow, stamp once before the Master Planner spawn AND again before each Worker Planner spawn (each spawn reads the file fresh at its first tool call — successive stamps with the same role are idempotent and overwrite cleanly). Without these stamps the planner falls to execute-inline and plan.md / execution-graph.json writes are denied by write_policy.
Hierarchical flow:

Spawn the Master Planner, then the Worker Planners, to produce plan.md and an execution-graph payload, then persist the final graph ONLY via python3 hooks/ctl.py write-execution-graph .dynos/task-{id} --from /tmp/execution-graph-{id}.json.

Fast-track combined flow (when fast_track: true):

The manifest is still at SPEC_NORMALIZATION (Step 3 deferred the walk). Do NOT advance yet. Spawn a single combined Spec + Plan planner to produce spec.md, plan.md, and an execution-graph payload in /tmp/execution-graph-{id}.json, then persist the final graph ONLY via python3 hooks/ctl.py write-execution-graph .dynos/task-{id} --from /tmp/execution-graph-{id}.json. This replaces both Step 3 (Spec Normalization) and Step 5's normal planner spawn. Once validate_task_artifacts passes (see below), walk the stage forward through SPEC_NORMALIZATION → SPEC_REVIEW → PLANNING (each transition is legal per ALLOWED_STAGE_TRANSITIONS in hooks/lib_core.py). Only advance once the artifacts that justify each stage exist on disk. Log each transition. Then continue with the post-validation flow below (which advances to PLAN_REVIEW).

Standard flow:

Spawn the Planner to produce plan.md and an execution-graph payload, then persist the final graph ONLY via python3 hooks/ctl.py write-execution-graph .dynos/task-{id} --from /tmp/execution-graph-{id}.json.

After generation, run deterministic artifact validation before any human review. If available in this repo, run:
python3 hooks/validate_task_artifacts.py .dynos/task-{id} --no-gap
The command is the source of truth for artifact validation. Use the rules below to explain and repair failures:
For plan.md:
- Required sections: Technical Approach, Reference Code, Components / Modules, Data Flow, Error Handling Strategy, Test Strategy, Dependency Graph, and Open Questions.
- When the domains call for it, an API Contracts section is required. When domains include db: a Data Model section is required.
- Reference Code paths must exist in the repo unless explicitly marked as to-be-created.
- When the plan has API Contracts or Data Model sections, their claims are verified against the codebase. Endpoints listed in the API Contracts table must correspond to actual route definitions. Tables listed in the Data Model table must correspond to actual model/schema/migration definitions. Claimed-but-not-found entries are validation errors — the planner must either fix the table or mark new entries as to-be-created.

For execution-graph.json:
- Every segment has an id.
- Every segment lists files_expected.
- Every depends_on reference must point to an existing segment.
- Every criteria_id must map to a real acceptance criterion in spec.md.
- Every acceptance criterion in spec.md must be covered by at least one segment.

If any validation fails, respawn planning and fix the artifacts before continuing.
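A sketch of the graph invariants (the segment field names, including a per-segment criteria_ids list, are assumptions about the payload shape; validate_task_artifacts is authoritative):

```python
def graph_errors(graph: dict, criteria_ids: set) -> list:
    # Validate segment ids, files_expected, dependency edges, and
    # bidirectional coverage between segments and acceptance criteria.
    errors = []
    ids = {s.get("id") for s in graph.get("segments", [])}
    covered = set()
    for seg in graph.get("segments", []):
        if not seg.get("id"):
            errors.append("segment missing id")
        if not seg.get("files_expected"):
            errors.append(f"{seg.get('id')}: empty files_expected")
        for dep in seg.get("depends_on", []):
            if dep not in ids:
                errors.append(f"{seg.get('id')}: unknown dependency {dep}")
        for cid in seg.get("criteria_ids", []):
            covered.add(cid)
            if cid not in criteria_ids:
                errors.append(f"{seg.get('id')}: unknown criterion {cid}")
    errors += [f"criterion {c} not covered by any segment"
               for c in sorted(criteria_ids - covered)]
    return errors
```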
Receipt: plan-validated (MANDATORY). Once validate_task_artifacts passes, write the plan-validated receipt. Without this receipt the eventual transition to EXECUTION (in the execute skill) will be blocked by the state machine:
python3 hooks/ctl.py plan-validated-receipt .dynos/task-{id}
Append to the execution log (transition_task auto-appends the [STAGE] → PLAN_REVIEW line — only the [DONE] line is the skill's responsibility):
{timestamp} [DONE] planning — final plan.md and execution-graph.json written (mode: {hierarchical|standard})
Transition the stage by running:
python3 hooks/ctl.py transition .dynos/task-{id} PLAN_REVIEW
This gate always runs. For fast-track tasks it acts as the combined Spec + Plan approval (since Step 4 was skipped) — present BOTH spec.md AND plan.md together.
Auto-approve path (precedes the human path): If manifest.json has "auto_approve_gates": true, do NOT present plan.md (or the combined spec.md + plan.md) to the user. Use the --auto-approved variant of approve-stage instead. Two cases:
Normal path (Step 4 already wrote the SPEC_REVIEW receipt — auto or human):
python3 hooks/ctl.py approve-stage .dynos/task-{id} PLAN_REVIEW --auto-approved
Fast-track combined gate (Step 4 was skipped; the manifest is at SPEC_REVIEW after Step 5's stage walk and needs both receipts in order):
python3 hooks/ctl.py approve-stage .dynos/task-{id} SPEC_REVIEW --auto-approved
python3 hooks/ctl.py approve-stage .dynos/task-{id} PLAN_REVIEW --auto-approved
The state machine requires the SPEC_REVIEW receipt before the PLAN_REVIEW receipt — this ordering is unchanged by the auto-approval feature. Both calls must return exit 0; if either returns exit 1, log the stderr message and fall through to the human-approval path for that specific gate.
Either auto path:
Exit code 1 means the gate refused (auto_approve_gates is not true, hash mismatch, or illegal transition). Log the stderr message and fall through to the human path below for the refused gate. Do not bypass with transition --force.

In the auto path, the "if changes requested" and "if rejected outright" branches below are unreachable — see the same note in Step 4.
Present the artifact(s) to the user and ask for approval.
If approved (normal path): run python3 hooks/ctl.py approve-stage .dynos/task-{id} PLAN_REVIEW. This hashes the current plan.md, writes the human-approval-PLAN_REVIEW receipt with that hash, and atomically advances PLAN_REVIEW → PLAN_AUDIT. Exit code 0 means success; exit code 1 means the gate refused (stderr identifies the cause). Do not bypass with transition --force.
If approved (fast-track combined gate): the manifest is currently at SPEC_REVIEW (Step 5 walked it through SPEC_NORMALIZATION → SPEC_REVIEW → PLANNING → PLAN_REVIEW). Run approve-stage twice in order — first for the spec, then for the plan:
python3 hooks/ctl.py approve-stage .dynos/task-{id} SPEC_REVIEW
python3 hooks/ctl.py approve-stage .dynos/task-{id} PLAN_REVIEW
Each call hashes the live artifact, writes the matching receipt, and advances one stage. Both must succeed; if either returns exit 1, address the reported cause before retrying.
If changes are requested: append the feedback, respawn planning (combined Spec + Plan phase for fast-track, otherwise standard planning), re-run deterministic artifact validation, and present the updated artifact(s) again. Do NOT call approve-stage until the user re-approves the regenerated artifact(s) — the gate compares the receipt hash to the live file at transition time, so an approval against an out-of-date hash will be refused with hash mismatch.
If rejected outright: run python3 hooks/ctl.py transition .dynos/task-{id} FAILED, append [FAILED] Plan rejected by user, and stop. Do not edit manifest.json directly.
The deterministic gap analysis ALWAYS runs. The LLM auditor only runs for high/critical-risk tasks (the deterministic check covers low/medium because validate_task_artifacts already enforces criteria coverage). This avoids 1.5–3M tokens per task on auditor work that duplicates the deterministic checks.
Deterministic gap analysis (mandatory, always runs):
python3 hooks/plan_gap_analysis.py --root . --task-dir .dynos/task-{id}
This verifies that claims in ## API Contracts and ## Data Model sections correspond to real code. If the plan claims an endpoint or table exists that the codebase doesn't have, the planner must either fix the table or explicitly mark the entry as to-be-created. Gap analysis failures block — repair before continuing.
LLM plan auditor (conditional): Only spawn spec-completion-auditor when risk_level is high or critical. For low/medium risk, the deterministic checks (validate_task_artifacts for criteria coverage + gap analysis for code/plan alignment) are authoritative — skip the LLM spawn. Log: {timestamp} [SKIP] plan-audit-llm — risk_level={risk}.
Stamp role BEFORE the spawn (MANDATORY — only when the conditional fires):
python3 "${PLUGIN_HOOKS}/ctl.py" stamp-role .dynos/task-{id} --role "audit-spec-completion"
Without this stamp the auditor falls to execute-inline and its audit-reports/spec-completion.json write is denied by write_policy.decide_write (which restricts audit-reports/ to audit-* roles). Forgery defense: receipt_audit_done cross-checks the orchestrator-claimed spawn against spawn-log.jsonl, so stamping without a real Agent spawn produces an unforgeable mismatch at receipt time.
Additionally, surface any segment where len(segment.files_expected) >= 10 (computed budget ≥ 35, the TOOL_BUDGET_ADVISORY threshold from hooks/lib_tool_budget.py) as a non-blocking advisory finding near-budget-ceiling. The advisory is informational and does NOT block plan approval; it warns the operator that the segment is approaching the 11-file overflow ceiling and may want decomposition.
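The advisory itself is a one-liner over the graph segments. A sketch, assuming the files_expected field shown earlier:

```python
def near_budget_segments(segments: list) -> list:
    # Non-blocking advisory: flag segments whose files_expected count is at or
    # above the >= 10 threshold described above (computed budget >= 35).
    return [s["id"] for s in segments if len(s.get("files_expected", [])) >= 10]
```

Flagged ids go into the near-budget-ceiling finding; they inform the operator but never block plan approval.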
If gap analysis finds gaps, or (when invoked) the auditor finds gaps, route back to planning, repair, and rerun deterministic artifact validation.
Create a git branch safety net: dynos/task-{id}-snapshot.
This gate is mandatory when manifest.classification.tdd_required is true (auto-derived by the system for high and critical risk tasks; also set for explicit opt-in). Do not skip this step based on your own risk judgment. The run-start-classification output surfaces tdd_required; if it is true, this step is required and the state machine will block PLAN_AUDIT → PRE_EXECUTION_SNAPSHOT without it.
When tdd_required is false: tests are written by testing-executor after production code, in the execute skill (Step 4 of execute), where the implementation context is already known. This avoids ~1.5–2M tokens of pre-code context loading per task.
When tdd_required is true:
Stamp role BEFORE the spawn (MANDATORY):
python3 "${PLUGIN_HOOKS}/ctl.py" stamp-role .dynos/task-{id} --role "testing-executor"
Without this stamp the testing-executor falls to execute-inline. Repo-file writes still work because write_policy permits both execute-inline and *-executor for repo artifacts, but the role file is what every other dynos-work skill in this codebase stamps before an executor spawn — keeping the convention here ensures the spawn-log entry's claimed role matches the runtime role and that downstream receipts identify the spawn correctly.
Spawn testing-executor with instruction:
TDD-First Mode.
Read spec.md and plan.md.
Write a complete test suite covering every acceptance criterion.
Do not implement production code.
Write only test files and evidence to .dynos/task-{id}/evidence/tdd-tests.md.
Deterministically validate the generated tests before user review:
Every acceptance criterion in spec.md must be mapped to at least one test case in the evidence summary. Present the test file paths and summary to the user.
If changes are requested, rerun the testing executor and the same deterministic validation.
When validation passes (and before asking for human approval), write the TDD receipt:
# Run with PYTHONPATH="${PLUGIN_HOOKS}" so lib_receipts resolves.
from pathlib import Path
from lib_receipts import receipt_tdd_tests, hash_file

evidence_path = Path(".dynos/task-{id}/evidence/tdd-tests.md")
receipt_tdd_tests(
    task_dir=Path(".dynos/task-{id}"),
    test_file_paths=[...],  # list of test file paths from this run
    tests_evidence_sha256=hash_file(evidence_path),
    tokens_used=TOTAL_TOKENS,  # from testing-executor spawn usage
    model_used="opus",  # or whichever model was used
)
Auto-approve path (precedes the human path): If manifest.json has "auto_approve_gates": true AND tdd_required: true (which is the only condition under which Step 8 fires at all), do NOT present the test suite to the user. After the testing-executor spawn and the deterministic validation in step 2 above have both passed, run the auto-approved variant of approve-stage:
python3 hooks/ctl.py approve-stage .dynos/task-{id} TDD_REVIEW --auto-approved
Exit code 0 means the receipt was written and the stage advanced; exit code 1 means the gate refused — log the stderr message and fall through to the human path below. Do not bypass with transition --force.

This conditional does NOT change the tdd_required == false behavior. When tdd_required is false, Step 8 does not fire at all (existing condition unchanged); the auto-approval flag has no effect on a step that is not executed.
When the user approves the test suite (or after step 6's auto-approval has run), transition out of TDD_REVIEW via the approve-stage ctl command. This hashes evidence/tdd-tests.md, writes the human-approval-TDD_REVIEW receipt with that hash, and advances TDD_REVIEW → PRE_EXECUTION_SNAPSHOT in one atomic step. Do NOT append a manual [HUMAN] log line — the receipt + approve-stage path is the only one the state machine accepts (the gate refuses with human-approval-TDD_REVIEW / hash mismatch substrings on drift):
python3 hooks/ctl.py approve-stage .dynos/task-{id} TDD_REVIEW
Exit code 0 means the receipt was written and the stage advanced. Exit code 1 means the gate refused — the stderr text identifies the cause. Do not bypass with transition --force. (If step 6's auto-approval already advanced the stage, this human-path call is unreachable.)
Commit the approved tests to the snapshot branch before any production code is written. The commit message MUST start with tdd: (e.g. tdd: PRO-XYZ test suite (RED)). This is a load-bearing convention, not just a style hint: ctl record-snapshot in the execute skill detects HEAD's commit message and rewinds the recorded snapshot SHA to HEAD^ when the message starts with tdd:. Without that rewind, the TDD-committed test files end up AT the snapshot SHA, never appear in git diff <snapshot>, and break run-execution-segment-done's coverage check for the test segment. Other commit-message prefixes (feat:, fix:, refactor:, etc.) suppress the rewind.
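The rewind rule can be stated in three lines. A sketch of the convention (ctl record-snapshot in the execute skill is the real implementation):

```python
def snapshot_sha(head_sha: str, head_parent_sha: str, head_message: str) -> str:
    # A HEAD commit whose message starts with 'tdd:' rewinds the recorded
    # snapshot to HEAD^ so the TDD tests appear in `git diff <snapshot>`.
    return head_parent_sha if head_message.startswith("tdd:") else head_sha
```

Any other prefix (feat:, fix:, refactor:) suppresses the rewind, which is why the tdd: prefix is load-bearing.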
Role file cleanup (MANDATORY — BEFORE the handoff transition): Delete .dynos/task-{id}/active-segment-role so /dynos-work:execute starts each segment with a clean slate. Deletion is permitted by write_policy (only the write path is wrapper-required); leaving the file in place is benign because every executor stamp overwrites it, but cleaning up keeps the spawn-log → role file relationship one-to-one for the next phase.
rm -f .dynos/task-{id}/active-segment-role
Transition the stage by running:
python3 hooks/ctl.py transition .dynos/task-{id} PRE_EXECUTION_SNAPSHOT
Append to the execution log:
{timestamp} [ADVANCE] PLAN_AUDIT → PRE_EXECUTION_SNAPSHOT
Print:
Foundry Ready to Execute.
Task: {task_id}
Spec: {N} acceptance criteria (human approved)
Plan: approved, validated, and audited (mode: {standard|hierarchical})
Memory: advisory only
Next: /dynos-work:execute
Never present spec.md, plan.md, execution-graph.json, or the generated test suite for approval until deterministic validation passes.