| name | dev-pipeline |
| description | Mercury's preset Main → Dev → Acceptance chain for executing a single, well-scoped coding task end-to-end with blind acceptance review. **Use this skill proactively** whenever the user has a ready-to-implement task (instead of coding inline), even if they don't explicitly ask for 'dev pipeline'. Trigger phrases: 'dev pipeline', 'dispatch task', '派发任务', 'blind review', or '完整开发链'. When the task is well-scoped, the skill spawns a dev subagent to implement and an acceptance subagent to blind-review, then loops or completes based on the verdict. Independent of Mercury's other modules: works in any repo with .claude/agents/dev.md + .claude/agents/acceptance.md. |
| user-invocable | true |
| allowed-tools | Read, Write, Edit, Glob, Grep, Bash, Agent, WebSearch, WebFetch |
Dev Pipeline — Main → Dev → Acceptance Preset Chain
A linear, single-task pipeline. The Main agent (you, the orchestrator running this skill) coordinates two sub-agent invocations and decides loopback vs completion.
Why a preset, not dynamic orchestration? See PHILOSOPHY.md in this directory.
Prerequisites
Before invoking this skill, the following must be true:
- Agents discoverable: .claude/agents/dev.md and .claude/agents/acceptance.md exist with valid YAML frontmatter (name: dev / name: acceptance). Verify by running the claude agents command or by inspecting the files directly. If they do not exist, this skill cannot run; fall back to inline implementation.
- Task is well-scoped: clear definition of done, bounded write scope, listed acceptance criteria. If the task is ambiguous, run a research/design pass first instead of dispatching dev.
- Branch is correct: you are on a feature branch (not main, develop, or master). The dev agent will commit and push to whatever branch is current.
- Issue exists: every task must have a GitHub Issue (Mercury rule). PR will reference it via Closes #N.
Iron Rules
| Rule | Why |
|---|---|
| One task per pipeline run | Mercury's preset is linear single-task, not parallel multi-task. For parallelism, the user opens multiple sessions. |
| TaskBundle is inline JSON, not Obsidian | This skill is dependency-free. The Memory Layer / Obsidian KB integration ships in Phase 3 — until then, the bundle is constructed inline by Main and passed to the dev subagent in the prompt. |
| Acceptance is blind | The acceptance subagent MUST NOT receive the dev subagent's reasoning, narrative, self-assessment, or risk evaluation. It receives only the AcceptanceBundle (criteria) plus a blindReceipt containing changed-file paths, test results, and a structured dodChecklist (per-criterion citations — structured pointers, not narrative reasoning). |
| Max 3 dev iterations | If acceptance returns fail 3 times, escalate to user — do not loop forever. |
| Main does not write code | Main coordinates, reviews receipts, decides next action. Implementation belongs to dev. Verification belongs to acceptance. |
| Sub-agents cannot spawn sub-agents | Per Claude Code documented constraint. Only the Main thread (running this skill) can dispatch dev or acceptance. Dev cannot call acceptance directly. Note: built-in Claude Code search tools (Read, Grep, Glob) and the built-in Explore subagent are exempt — they are part of the Claude Code platform, not custom Mercury subagents. Main may dispatch them per normal Claude Code semantics. The "no sub-agent spawning" rule applies specifically to Mercury's custom dev/acceptance/critic subagents defined under .claude/agents/. (Explore is dispatched via the Agent tool with subagent_type: "Explore".) |
Receipt return-size discipline
Sub-agent return values are the main session's primary context inflation source (typically 5–15K tokens per round). To keep the pipeline sustainable across many iterations, the dev receipt MUST use the structured slim format defined in Phase 2 — no free-form evidence or risks narrative. The dodChecklist array provides structured per-criterion citations that give acceptance everything it needs without prose overhead. See Issue #362 for background and real-world soak targets (<2K tokens avg per receipt).
Phase 1: Build TaskBundle
MANDATORY before any dispatch. Main constructs the TaskBundle inline based on the user's request, the GitHub Issue body, and any referenced design docs.
{
  "taskId": "<short-slug>",
  "issue": "<owner/repo#N>",
  "title": "<one-line summary>",
  "context": "<2-5 sentences why this task exists>",
  "definitionOfDone": [
    "<verifiable criterion 1>",
    "<verifiable criterion 2>"
  ],
  "allowedWriteScope": [
    "<file or glob 1>",
    "<file or glob 2>"
  ],
  "mustNotTouch": [
    "<file or glob>"
  ],
  "readScope": [
    "<file paths the dev should read first>"
  ],
  "acceptanceCriteria": [
    "<what acceptance will check 1>",
    "<what acceptance will check 2>"
  ],
  "verifyCommands": [
    "<exact bash command to validate, e.g. pnpm test packages/foo>"
  ],
  "worktreePath": "<absolute path — injected by Main in Phase 2; leave blank here>"
}
Gate: every field non-empty (except worktreePath, which is filled by Main in Phase 2). If definitionOfDone contains a subjective phrase (clean, elegant, good), rewrite it as a measurable criterion or escalate to the user.
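The gate above can be checked mechanically before dispatch. The following is a minimal sketch, not part of the skill itself: the function name `check_task_bundle` and the word list are illustrative assumptions, and the subjective-word set contains only the three examples named in the gate.

```python
# Field names mirror the TaskBundle template; worktreePath is exempt because
# Main fills it in during Phase 2.
REQUIRED_FIELDS = [
    "taskId", "issue", "title", "context", "definitionOfDone",
    "allowedWriteScope", "mustNotTouch", "readScope",
    "acceptanceCriteria", "verifyCommands",
]
# Assumption: only the examples from the gate text; a real list would be longer.
SUBJECTIVE_WORDS = {"clean", "elegant", "good"}


def check_task_bundle(bundle: dict) -> list[str]:
    """Return gate violations; an empty list means the bundle may be dispatched."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Missing keys, empty strings, and empty arrays all fail the gate.
        if not bundle.get(field):
            problems.append(f"{field}: empty or missing")
    for criterion in bundle.get("definitionOfDone") or []:
        words = {w.strip(".,;:()").lower() for w in criterion.split()}
        hits = sorted(words & SUBJECTIVE_WORDS)
        if hits:
            problems.append(
                f"definitionOfDone is subjective ({', '.join(hits)}): {criterion}"
            )
    return problems
```

A non-empty result means Main should rewrite the offending criterion or escalate to the user rather than dispatch dev.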
Phase 2: Dispatch Dev
Before dispatching, capture the task-start SHA and create the isolated worktree for this task:
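# Record the commit the task starts from; Phase 3 needs it on every iteration.
TASK_START_SHA=$(git rev-parse HEAD)
# Key the SHA file by the current branch name: '/' becomes '_', then every
# character other than alphanumerics, '_' and '-' is dropped.
BRANCH_KEY=$(git rev-parse --abbrev-ref HEAD | tr '/' '_' | tr -cd '[:alnum:]_-')
SHA_FILE="${TMPDIR:-/tmp}/dev-pipeline-task-start-sha-${BRANCH_KEY}"
echo "$TASK_START_SHA" > "$SHA_FILE"
# Create the isolated worktree on a new task branch cut from the current HEAD.
TASK_ID="<taskId from TaskBundle>"
TASK_BRANCH="feat/${TASK_ID}"
REPO_ROOT=$(git rev-parse --show-toplevel)
WORKTREE_PATH="${REPO_ROOT}/.worktrees/${TASK_ID}"
git worktree add "${WORKTREE_PATH}" -b "${TASK_BRANCH}"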
The SHA file is keyed by the current branch name (slash-sanitized). This is stable across Bash invocations within the same pipeline run, and concurrent pipelines collide only if they are on the same branch — which would be a pre-existing git conflict anyway. Phase 6 hand-off must remove both it and the worktree.
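To make the keying concrete, here is a small sketch that mirrors the two `tr` steps from the shell snippet. The function names are hypothetical, and the ASCII-only filter assumes the shell runs in the C locale.

```python
def branch_key(branch: str) -> str:
    """Mirror `tr '/' '_' | tr -cd '[:alnum:]_-'` for ASCII input."""
    replaced = branch.replace("/", "_")
    return "".join(
        c for c in replaced if (c.isascii() and c.isalnum()) or c in "_-"
    )


def sha_file_path(branch: str, tmpdir: str = "/tmp") -> str:
    """Build the SHA file path; tmpdir stands in for ${TMPDIR:-/tmp}."""
    return f"{tmpdir}/dev-pipeline-task-start-sha-{branch_key(branch)}"
```

For example, `branch_key("feat/add-login")` yields `feat_add-login`. One property the sketch makes visible: the sanitization is not injective, so two distinct branch names such as `feat/a` and `feat_a` map to the same key.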
Use the Agent tool with subagent_type set to dev. The prompt template:
You are operating under the dev agent role (.claude/agents/dev.md). Implement the following task and return a JSON receipt as your final message.
**Working directory: `<worktreePath>` (isolated git worktree). Use `cd <worktreePath>` before any file operation.**
## TaskBundle
[paste TaskBundle JSON built in Phase 1, with worktreePath field filled in]
## Execution Protocol
1. cd <worktreePath> — all file reads/writes and git commands run from this directory.
2. Read every file listed in readScope.
3. Implement within allowedWriteScope only. Touching anything in mustNotTouch is forbidden.
4. Run every command in verifyCommands. ALL must pass before you commit.
5. Self-fix once if a verifyCommand fails. If it still fails, STOP and report — do NOT commit broken code.
6. Commit with format type(scope): summary (Mercury convention).
7. Push to current branch.
8. Output the JSON receipt below as your FINAL message.
## Receipt template
{
  "taskId": "[copied out of the bundle]",
  "status": "completed|blocked|escalated",
  "branch": "<current branch name>",
  "commitSha": "sha",
  "changedFiles": ["path", "..."],
  "verifyResults": [
    {"command": "cmd", "exitCode": 0, "summary": "one line"}
  ],
  "dodChecklist": [
    {"criterion": "<text from definitionOfDone>", "met": true, "citation": "<file:line or test output>"}
  ],
  "escalationReason": "only if status is not completed"
}
## Forbidden
- git switch, git checkout, git branch, git reset, git rebase, git merge, git push --force
- git add -A or git add .
- Modifying CLAUDE.md or any file under .claude/agents/
- Creating or modifying git worktrees (Main's responsibility)
- Picking up additional work after the receipt is filed
Gate: dev must return a JSON receipt with status completed. If blocked or escalated, jump to Phase 5 (escalate to user).
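The Phase 2 gate amounts to a small routing decision on dev's final message. A minimal sketch follows, assuming the final message is the bare JSON receipt; the function name and the "invalid" outcome are illustrative assumptions, since the gate does not say how a malformed receipt is handled.

```python
import json

ESCALATION_STATUSES = {"blocked", "escalated"}


def route_dev_receipt(final_message: str) -> str:
    """Map dev's final message to "review", "escalate", or "invalid".

    "review" means continue to Phase 3, "escalate" means jump to Phase 5.
    """
    try:
        receipt = json.loads(final_message)
    except json.JSONDecodeError:
        return "invalid"
    if not isinstance(receipt, dict):
        return "invalid"
    status = receipt.get("status")
    if not isinstance(status, str):
        return "invalid"
    if status == "completed":
        return "review"
    if status in ESCALATION_STATUSES:
        return "escalate"
    return "invalid"
```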
Phase 3: Receipt Review (Main)
Main checks receipt completeness — NOT correctness (that is acceptance's job).
Checklist:
Gate: if any check fails, send a correction prompt to dev (still iteration 1) with the specific deficiency. Do not advance to acceptance with an incomplete receipt.
Phase 4: Dispatch Acceptance (BLIND)
Build the blindReceipt by forwarding the structured fields from the dev receipt. Preserve original JSON types — changedFiles, verifyResults, and dodChecklist are arrays in the dev receipt and MUST remain arrays here, not stringified placeholders:
{
  "taskId": "task-slug",
  "branch": "<branch name>",
  "changedFiles": ["path/to/file1.ts", "path/to/file2.ts"],
  "commitSha": "abc123def",
  "verifyResults": [
    {"command": "pnpm test packages/foo", "exitCode": 0, "summary": "12 passed"},
    {"command": "pnpm lint", "exitCode": 0, "summary": "0 issues"}
  ],
  "dodChecklist": [
    {"criterion": "criterion text from DoD", "met": true, "citation": "file:line or test output"}
  ]
}
Note what was REMOVED relative to the dev receipt: status and escalationReason. Only a completed receipt reaches this phase, so status carries no information, and escalationReason is present only when status != completed. The dodChecklist is forwarded because it contains structured per-criterion citations (not dev's narrative reasoning); acceptance uses it to cross-check each DoD item against actual code. The acceptance agent still forms its own independent conclusions from code and tests; dodChecklist citations are starting pointers, not authoritative verdicts.
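One way to keep the blind receipt honest is to build it from an allow-list rather than by deleting fields, so anything dev adds beyond the template never reaches acceptance. This is a sketch under that assumption; `build_blind_receipt` is a hypothetical name, and the field list is taken from the blindReceipt template above.

```python
import copy

# Exactly the keys shown in the blindReceipt template.
BLIND_FIELDS = (
    "taskId", "branch", "changedFiles", "commitSha",
    "verifyResults", "dodChecklist",
)
# These must stay JSON arrays, never stringified placeholders.
ARRAY_FIELDS = ("changedFiles", "verifyResults", "dodChecklist")


def build_blind_receipt(dev_receipt: dict) -> dict:
    """Forward only the allow-listed fields, preserving their JSON types."""
    blind = {key: copy.deepcopy(dev_receipt[key]) for key in BLIND_FIELDS}
    for key in ARRAY_FIELDS:
        if not isinstance(blind[key], list):
            raise TypeError(f"{key} must be an array, got {type(blind[key]).__name__}")
    return blind
```

A missing field raises `KeyError`, which is the intended behavior here: an incomplete receipt should have been caught in Phase 3.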
Build the AcceptanceBundle (also preserve original types — definitionOfDone, acceptanceCriteria, verifyCommands are arrays, not strings):
{
  "taskId": "task-slug",
  "title": "one-line summary",
  "definitionOfDone": ["criterion 1", "criterion 2"],
  "acceptanceCriteria": ["check 1", "check 2"],
  "verifyCommands": ["pnpm test packages/foo", "pnpm lint"]
}
Use the Agent tool with subagent_type set to acceptance. Prompt template:
You are operating under the acceptance agent role (.claude/agents/acceptance.md). BLIND REVIEW: you are FORBIDDEN from inferring or asking about the dev agent's reasoning, narrative, or self-assessment.
## AcceptanceBundle
[paste AcceptanceBundle JSON]
## Blind Receipt (changed files, test results, structured dodChecklist — NO dev narrative)
[paste blindReceipt JSON]
## Instructions
1. Read every file listed in changedFiles at the latest commit.
2. Run every command in verifyCommands. Capture exit codes and output.
3. Evaluate each acceptanceCriteria and definitionOfDone item against the actual code and runtime output. Cite file:line evidence.
4. Output your verdict as JSON.
## Verdict template
{
  "verdict": "pass|partial|fail|blocked",
  "criteriaResults": [
    {"criterion": "text", "verdict": "pass|fail|partial", "evidence": "file:line or test output"}
  ],
  "findings": ["problem 1", "problem 2"],
  "recommendations": ["actionable fix 1"]
}
Gate: capture the verdict.
Phase 5: Decide Next Action
Based on the acceptance verdict:
| Verdict | Action |
|---|---|
| pass | Pipeline complete. Summarize result for user (Chinese for milestones). Hand off to /pr-flow if a PR is the next step. Run cleanup (see below). |
| partial | Re-dispatch dev with the original full TaskBundle plus a priorFindings array containing acceptance's findings. Constraints (definitionOfDone, allowedWriteScope, mustNotTouch, readScope) MUST be carried over verbatim from iteration 1 — never widened, never dropped. Increment iteration. Do NOT clean up $SHA_FILE between iterations — Phase 3 needs it on every retry. |
| fail | Same as partial: dispatch with full original TaskBundle + priorFindings + priorRecommendations. Constraints carried verbatim. Increment iteration. Do NOT clean up between iterations. |
| blocked | Escalate to user. Acceptance hit an environmental block; user must resolve. Run cleanup. |
Constraint preservation: every retry dispatch must include the EXACT original definitionOfDone, allowedWriteScope, mustNotTouch, and readScope from iteration 1. Adding a new constraint is OK; widening or dropping an existing one is forbidden — that defeats the purpose of the bundle as a contract.
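The constraint-preservation rule is easiest to satisfy if the retry bundle is always derived from a copy of the iteration-1 bundle. The sketch below assumes that approach; the helper names are hypothetical. The check uses strict equality, which is stricter than the rule itself because it would also reject an added constraint.

```python
import copy

FROZEN_FIELDS = ("definitionOfDone", "allowedWriteScope", "mustNotTouch", "readScope")


def build_retry_bundle(original: dict, verdict: dict) -> dict:
    """Copy the iteration-1 bundle and attach acceptance feedback."""
    retry = copy.deepcopy(original)
    retry["priorFindings"] = list(verdict.get("findings", []))
    # Per the verdict table, recommendations are forwarded on "fail".
    if verdict.get("verdict") == "fail":
        retry["priorRecommendations"] = list(verdict.get("recommendations", []))
    return retry


def constraints_preserved(original: dict, retry: dict) -> bool:
    """True when every frozen field is identical to iteration 1."""
    return all(retry.get(f) == original.get(f) for f in FROZEN_FIELDS)
```

Main would call `constraints_preserved` immediately before each re-dispatch and refuse to dispatch when it returns `False`.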
Iteration cap: if iteration is at least 3 and verdict is still not pass, escalate to user with the full history and run cleanup. Do not silently keep looping.
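The verdict table and the iteration cap together form a small decision function. A minimal sketch, assuming iterations are counted from 1; the action names `handoff`, `redispatch`, and `escalate` are labels invented for this example.

```python
MAX_ITERATIONS = 3  # "Max 3 dev iterations" from the Iron Rules


def next_action(verdict: str, iteration: int) -> str:
    """Return "handoff", "redispatch", or "escalate" for a 1-based iteration."""
    if verdict == "pass":
        return "handoff"
    if verdict == "blocked":
        return "escalate"
    if verdict in ("partial", "fail"):
        return "escalate" if iteration >= MAX_ITERATIONS else "redispatch"
    raise ValueError(f"unknown verdict: {verdict!r}")


def runs_cleanup(action: str) -> bool:
    """Cleanup runs on every terminal exit path; only re-dispatches skip it."""
    return action != "redispatch"
```

With these definitions, a `fail` on iteration 2 re-dispatches dev without cleanup, while a `partial` on iteration 3 escalates and cleans up.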
Cleanup (mandatory on every terminal exit path)
rm -f "$SHA_FILE"
REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || echo "")
if [ -z "$REPO_ROOT" ]; then
  echo "WARN: cannot determine REPO_ROOT — skipping worktree/branch cleanup" >&2
else
  bash "$REPO_ROOT/scripts/cleanup-worktree-branch.sh" \
    "$TASK_BRANCH" "$(git rev-parse --abbrev-ref HEAD)" \
    --force --worktree-path "$WORKTREE_PATH" \
    || echo "WARN: cleanup-worktree-branch.sh exited non-zero — see stderr" >&2
fi
This runs on pass, blocked, escalation after partial/fail, and on iteration-cap escalation. The ONLY paths that skip cleanup are intra-iteration dev re-dispatches (because Phase 3 still needs the SHA and the worktree is still active). If the loop terminates without reaching one of these branches (e.g. host crash), the SHA file at ${TMPDIR:-/tmp}/dev-pipeline-task-start-sha-${BRANCH_KEY} will be cleaned up on the next pipeline run against the same branch (the new invocation overwrites it) or by OS tmp eviction; orphaned worktrees under .worktrees/ can be reclaimed by scripts/worktree-reaper.sh --prune.
Phase 6: Hand-off
Phase 6 is reached only on pass. Cleanup for non-pass terminal exits is handled inside Phase 5 — do not duplicate it here.
On pass:
- Confirm commit is pushed (git status)
- If user requested PR: invoke /pr-flow
- Mark related GitHub Project item Done (via /gh-project-flow if Mercury self-dev) or via Closes #N in PR (general case)
- Summarize in Chinese for the user
- Notify (Mercury-only, fail-safe): emit a user-actionable Telegram notification announcing pipeline completion so the user can decide next step (review PR, run cleanup, hand off) without watching the terminal:
  bash scripts/notify-event.sh info "Dev pipeline complete: <taskId>" "verdict=pass | files=<N> | branch=<branch>"
  The wrapper writes a single JSON line to stdout and never blocks the pipeline. Return shapes: happy path → {ok:true}; MERCURY_NOTIFY_DISABLED=1 → {ok:true,skipped:true} (intentional opt-out); router unreachable → {ok:false,error:"transport"}; router replied non-2xx → {ok:false,error:"router_<status>"} (e.g. router_500); token file missing → {ok:false,error:"no_token"}; broken adapter / missing node → {ok:false,error:"adapter_missing"|"node_missing"|"invoke:..."}. Exit code is always 0 except for usage errors. The pipeline continues regardless. Anti-patterns (loop-detector stalls, hook failures, autocompact, heartbeat) MUST NOT call this; see adapters/mercury-channel-router/README.md "Acceptable Callers" + Issue #316. Skip this step in portable forks (no router → no-op anyway, but the helper file is Mercury-specific; see Detachability below).
- After PR merge is confirmed, run the Phase 5 Cleanup block as the final action (see Phase 5 above; the retry + rm -rf fallback logic is the SoT and is not duplicated here).
Single source of truth: the Phase 5 Cleanup block is the only authoritative description of when $SHA_FILE is removed. Phase 6 only reaches it via the pass branch above. If you find yourself debating "should I clean up here", re-read Phase 5.
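For the Phase 6 notify step, the documented return shapes can be turned into a one-line log summary. This sketch assumes the wrapper's stdout line is strict JSON with quoted keys (the shapes above are written in shorthand), and `summarize_notify_result` is a hypothetical helper, not part of scripts/notify-event.sh.

```python
import json


def summarize_notify_result(stdout_line: str) -> str:
    """Turn the wrapper's single JSON line into a short log message."""
    result = json.loads(stdout_line)
    if result.get("ok"):
        # {ok:true,skipped:true} is the intentional opt-out shape.
        return "skipped" if result.get("skipped") else "sent"
    error = str(result.get("error", "unknown"))
    prefix = "router_"
    if error.startswith(prefix):
        status = error[len(prefix):]
        return f"router replied {status}"
    return f"failed: {error}"
```

Because the pipeline continues regardless of the result, a summary like this is only useful for logging; it should never gate the hand-off.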
Explore guardrail (Main-side)
When Main dispatches Explore for readScope discovery, scope verification, or general codebase exploration, the Explore prompt MUST impose the following constraints on the return:
- Token cap: limit the return to ~5K tokens (caller-stated soft cap).
- Path-only preference: when matches exceed 20 files, return file paths only (one per line) — no file contents, no snippets, no surrounding context beyond the path.
- No raw file contents: never paste raw file contents into the return. Use file:line citations with at most a 1-line context excerpt per citation.
- Overflow behavior (mandatory fallback): if any of the above thresholds is exceeded, the return MUST switch to path-only mode AND emit a single explicit fallback line at the top: [guardrail-fallback: <reason>; matches=<N>; tokens≈<T>; raw output suppressed — caller may re-dispatch with narrower scope]. Do not silently truncate or arbitrarily summarize; the caller must know fallback was triggered so they can re-dispatch with tighter scope.
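The overflow rule in the list above can be sketched as a pure function over the match list and a token estimate. The thresholds come from the list; the function name, the reason labels, and the way the token estimate is obtained are assumptions made for illustration.

```python
TOKEN_CAP = 5000             # "~5K tokens" soft cap
MAX_FILES_WITH_DETAIL = 20   # above this, return paths only


def render_explore_return(paths: list[str], detailed: str, est_tokens: int) -> str:
    """Return the detailed text, or fall back to path-only mode with a header."""
    reasons = []
    if len(paths) > MAX_FILES_WITH_DETAIL:
        reasons.append("match-count")  # hypothetical reason label
    if est_tokens > TOKEN_CAP:
        reasons.append("token-cap")    # hypothetical reason label
    if not reasons:
        return detailed
    # \u2014 is the dash used in the documented fallback-line format.
    header = (
        f"[guardrail-fallback: {'+'.join(reasons)}; matches={len(paths)}; "
        f"tokens≈{est_tokens}; raw output suppressed \u2014 "
        "caller may re-dispatch with narrower scope]"
    )
    return "\n".join([header, *paths])
```

The fallback header is always the first line, so a caller can detect it with a simple prefix check before deciding whether to re-dispatch with a narrower scope.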
These constraints preserve the main session's context budget when using Explore for discovery. Violation risks are the same as raw-search injection: context pressure and session stops (Issue #215, #101 Gap 4).
Note: this section governs Main's use of the built-in Explore tool. Dev subagents operate in isolated worktree contexts and use Read/Grep/Glob directly per their allowed-tools list — they do not dispatch Explore as a subagent.
Detachability
This skill is designed to be portable to any repository that uses GitHub + Claude Code, provided:
- .claude/agents/dev.md and .claude/agents/acceptance.md exist with valid frontmatter
- The target repo uses GitHub Issues + GitHub PRs (the protocol references Closes #N, gh pr create, and Mercury's /pr-flow skill, all GitHub-specific). Non-GitHub repos would need protocol adaptation.
- The repo has a sane verifyCommands story (tests, lint, build commands that exit non-zero on failure)
- The user is on a feature branch (not main, develop, or master)
The /gh-project-flow reference in Phase 6 is Mercury-specific and should be removed or replaced when porting elsewhere — it is mentioned only because Mercury self-development uses Project #3 for task tracking.
The Phase 6 step 5 notify call (bash scripts/notify-event.sh ...) is also Mercury-specific: it depends on scripts/notify-event.sh + adapters/mercury-notify/notify.cjs + adapters/mercury-channel-router/. Portable forks should remove the step or leave it as a no-op. The underlying notify.cjs fails safe when the router is absent, so calling the script in a fork without the router will simply log {ok:false} and continue: no breakage, just a confusing log line.
To use it elsewhere, copy this skill directory plus the two agent files, then strip the /gh-project-flow line from Phase 6 and remove the notify step (or strip just the notify line and accept the harmless log noise). No other Mercury dependency.
Known Limitations
- No parallel tasks. By design — Mercury's preset chain is linear single-task. For parallelism, open another session.
- No persistent memory between invocations. Each pipeline run is fresh. Phase 3 Memory Layer will lift this constraint.
- Subagent context is independent. The dev and acceptance subagents do NOT see the main session's history — only the prompt you send them. Be explicit; do not assume shared context.
- Critic agent not included by default. If you want a third independent verification pass (different model), add a Phase 4.5 dispatch to subagent_type critic. Out of scope for the baseline pipeline.
Failure Modes
| Symptom | Likely Cause | Fix |
|---|---|---|
| Agent tool returns unknown subagent type dev | Frontmatter missing or invalid in .claude/agents/dev.md | Check that name: dev is the first non-divider line; restart session |
| Dev commits files outside allowedWriteScope | Bundle scope was too vague, or dev hallucinated needed files | Tighten scope; if hallucination, fix and re-dispatch with explicit prohibition |
| Acceptance returns pass but obvious bug exists | Acceptance criteria did not cover the bug class | Update bundle criteria; this is a design failure of the bundle, not the agent |
| Pipeline loops 3+ times on the same finding | Dev keeps fixing the wrong thing | Escalate immediately; usually means the finding text is ambiguous |