code-production-process
Six-stage quality-gate pipeline for any code implementation task
| name | code-production-process |
| description | Six-stage quality-gate pipeline for any code implementation task |
| user-invocable | false |
| disable-model-invocation | true |
| license | Apache-2.0 |
| compatibility | claude-code |
| progressive_disclosure | {"entry_point":{"summary":"6-stage gate pipeline (Research→Architect→Implement→Tests→Critic→Security). Hotfix path skips 1-2, mandatory 3-6. WARN proceeds with findings logged; BLOCK halts and returns to user — PM must NOT auto-retry without user direction.","when_to_use":"When PM is about to dispatch an engineer for non-trivial implementation (>50 lines OR >1 source file touched). Also fires when engineer-dispatch produces .py/source file changes detected via git diff. EXCLUDE: docs edits, commit messages, config-only changes.","quick_start":"1. MUST: dispatch Research before any engineer task. 2. MUST: Architect produces interface spec — NO implementation code. 3. Engineer implements + writes pytest. 3.5 MUST: pytest passes green — failing tests skip critic and return to engineer. 4. MUST: dispatch code-critic (isolated context — no implementer framing). 5. Verdict: APPROVE proceeds; WARN proceeds with findings logged to docs handoff; BLOCK halts and surfaces findings to user. 6. Security pass."},"references":["stage-research.md","stage-architect.md","stage-implement.md","stage-tests.md","stage-critic.md","stage-security.md","skip-rules.md","critic-isolation.md"]} |
This skill implements a six-stage quality-gate pipeline that every non-trivial code implementation task must pass through before being declared complete. The pipeline exists because the repository owner cannot visually assess code quality — all quality assurance must therefore be enforced by automated gates inside the agent pipeline itself, not by human review after the fact. Each stage produces a concrete artifact and must clear a defined gate before the next stage may begin. No stage may be skipped on the standard path. The critic agent operates with full context isolation to prevent anchoring bias from the implementer's own framing. A deliberately adversarial review catches issues that a cooperative pass-through would miss.
This skill loads when any of the following conditions are true:
Explicit triggers (pre-dispatch):
Post-dispatch detection:
- git diff --stat shows source file changes (.py, .ts, .js, .go, .rs, .java, .rb, .sh with logic) regardless of stated task scope
- src/, lib/, app/, services/, api/ directories

Explicit exclusions. Do NOT trigger this pipeline for:
- Documentation edits (.md, .rst, .txt, .html content-only)
- Commit messages
- Config-only changes (*.yaml, *.toml, *.json, *.env with no code logic)

When in doubt: err toward triggering the pipeline. A false positive costs one critic dispatch. A false negative ships broken or insecure code.
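The trigger and exclusion rules above can be approximated by classifying the paths reported by git diff. This is a minimal sketch, not the skill's actual detection logic: the function name `should_trigger_pipeline` is hypothetical, and an extension check cannot tell whether a `.sh` file contains logic or whether a config file is logic-free, so those qualifiers still need human or agent judgment.

```python
from pathlib import PurePosixPath

# Extensions taken from the post-dispatch detection rule above.
SOURCE_EXTS = {".py", ".ts", ".js", ".go", ".rs", ".java", ".rb", ".sh"}
# Extensions taken from the explicit exclusions above.
DOC_EXTS = {".md", ".rst", ".txt", ".html"}
CONFIG_EXTS = {".yaml", ".toml", ".json", ".env"}


def should_trigger_pipeline(changed_paths: list[str]) -> bool:
    """Return True if any changed path should send the task through the pipeline."""
    for raw in changed_paths:
        path = PurePosixPath(raw)
        suffix = path.suffix.lower()
        # A bare ".env" file has an empty suffix, so match it by name.
        if path.name == ".env":
            suffix = ".env"
        if suffix in SOURCE_EXTS:
            return True
        if suffix in DOC_EXTS or suffix in CONFIG_EXTS:
            continue
        # Unknown file type: when in doubt, err toward triggering.
        return True
    return False
```

Treating unknown file types as triggers mirrors the "when in doubt" rule: the cost of a false positive is one extra critic dispatch.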
Each stage has a defined agent, required output artifact, and a gate condition. The gate condition must be satisfied before proceeding. Failing a gate returns to the current stage, not to stage 1, unless the failure indicates a fundamental misunderstanding of requirements.
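As a rough illustration of the sequencing rule above, the sketch below models stage order and gate failures. The names `STAGES` and `next_stage` are hypothetical. It covers only the generic return-to-current-stage rule, the restart on a requirements misunderstanding, and the Stage 3.5 failure that sends work back to Stage 3; verdict handling for the critic and security stages is not modeled.

```python
from typing import Optional

# Stage identifiers in pipeline order; "3.5" is the tests-must-pass gate.
STAGES = ["1", "2", "3", "3.5", "4", "5"]


def next_stage(
    current: str,
    gate_passed: bool,
    requirements_misunderstood: bool = False,
) -> Optional[str]:
    """Return the stage to run next, or None when the pipeline is complete."""
    index = STAGES.index(current)  # raises ValueError for an unknown stage
    if gate_passed:
        return STAGES[index + 1] if index + 1 < len(STAGES) else None
    if requirements_misunderstood:
        # A fundamental misunderstanding of requirements restarts at Stage 1.
        return "1"
    if current == "3.5":
        # Failing tests go back to the engineer in Stage 3.
        return "3"
    # Default: a failed gate repeats the current stage.
    return current
```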
Agent: research
Tools: mcp__vector-indexer-mcp__search_hybrid, mcp__knowledge__kb_search,
mcp__knowledge__search_local
What the agent does:
Required output artifact: A written spec document (markdown) covering:
Gate: Spec document exists and names at least one existing codebase reference. Research agent MUST NOT produce implementation code. If no relevant existing code exists, the output explicitly states "no prior implementation found" — this is a valid research finding, not a failure.
See: references/stage-research.md for prompt templates and search strategy.
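One way to mechanize the Stage 1 gate is sketched below. It assumes the PM has a list of known repository paths to match against the spec text (the `known_repo_paths` parameter is hypothetical), and it does not check the separate rule that the research agent produces no implementation code.

```python
NO_PRIOR_PHRASE = "no prior implementation found"


def research_gate_passes(spec_text: str, known_repo_paths: list[str]) -> bool:
    """Pass if the spec exists and cites the codebase or states that nothing was found."""
    if not spec_text.strip():
        return False  # no spec document at all
    if NO_PRIOR_PHRASE in spec_text.lower():
        return True  # an explicit negative finding is a valid result
    # Otherwise at least one existing codebase reference must be named.
    return any(path in spec_text for path in known_repo_paths)
```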
Agent: python-engineer (in design mode — no implementation code produced)
Skills loaded: software-patterns
What the agent does:
Required output artifact: An interface specification document containing:
... or pass)

Gate: Interface document exists. Zero implementation code (no function bodies with logic, no algorithm steps, no I/O calls). The architect produces the contract; the engineer fills it.
Important: The code-critic agent may optionally be dispatched here to review the
interface design for API coherence before implementation begins. This is the Phase 2 design
critic pass. If dispatched, critic input is the interface document only — no code exists yet.
See: references/stage-architect.md for interface specification templates.
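The "zero implementation code" gate for Stage 2 can be checked mechanically when the interface document contains Python stubs. The sketch below uses the standard `ast` module and assumes that a stub body consists only of a docstring, `...`, or `pass`; the function name `functions_with_logic` is hypothetical, and the check does not cover module-level statements.

```python
import ast


def _is_stub_statement(stmt: ast.stmt) -> bool:
    """True for `pass`, a bare `...`, or a docstring-style string expression."""
    if isinstance(stmt, ast.Pass):
        return True
    if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant):
        value = stmt.value.value
        return value is Ellipsis or isinstance(value, str)
    return False


def functions_with_logic(source: str) -> list[str]:
    """Return the names of functions whose bodies contain more than stubs."""
    tree = ast.parse(source)
    offenders = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not all(_is_stub_statement(s) for s in node.body):
                offenders.append(node.name)
    return offenders
```

An empty result means every function in the interface document is a pure stub; any returned name is a candidate gate failure.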
Agent: python-engineer
Skills loaded: software-patterns, asyncio (if async code required), pytest
What the agent does:
pytest covering:
- Runs mypy --strict on the new code and resolves all type errors before declaring done
- Runs pytest locally and confirms all tests pass before returning

Required output artifact:
- mypy --strict output showing zero errors
- pytest output showing all tests passed

Gate: Engineer must provide mypy and pytest output with zero errors/failures before returning to PM. If the engineer cannot provide this output, the engineer did not finish; PM must re-dispatch, not proceed to Stage 3.5.
See: references/stage-implement.md for implementation standards and self-check
checklist.
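The Stage 3 evidence requirement (mypy and pytest output with zero errors) can be collected with a small wrapper around `subprocess.run`. This is a sketch under assumptions: `CheckResult`, `run_check`, and `stage3_gate` are hypothetical names, and the `src/` path in the usage comment is a placeholder for wherever the new code lives.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    command: list[str]
    returncode: int
    output: str


def run_check(command: list[str]) -> CheckResult:
    """Run one check and capture combined stdout/stderr as evidence for the PM."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return CheckResult(command, proc.returncode, proc.stdout + proc.stderr)


def stage3_gate(results: list[CheckResult]) -> bool:
    """Pass only if every check ran, produced output, and exited with code 0."""
    return bool(results) and all(
        r.returncode == 0 and bool(r.output.strip()) for r in results
    )


# Typical use (paths are placeholders):
# results = [run_check(["mypy", "--strict", "src/"]), run_check(["pytest"])]
# ready_for_stage_3_5 = stage3_gate(results)
```

Requiring non-empty output reflects the rule that an engineer who cannot show the output has not finished.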
Agent: PM (not an agent dispatch — PM evaluates the Stage 3 output directly)
What PM does:
This gate is non-negotiable. Dispatching code-critic against code with failing tests wastes tokens on findings that may be artifacts of the broken state. The critic reviews working code, not code under active repair.
Failure return message to engineer (template):
Tests failed. Return to Stage 3. Fix the following failures before requesting critic review:
[paste pytest output here]
When tests pass green, provide updated implementation + passing pytest output.
See: references/stage-tests.md for test quality requirements and coverage standards.
Agent: code-critic
Skills loaded: code-review-standards, code-production-process
Context isolation is mandatory. See "Critic Isolation Rule" section below for full requirements. Failure to isolate the critic context is a process violation.
What the critic agent does:
- code-review-standards severity-tagged checklist (CRITICAL/HIGH/MEDIUM/LOW)

Required output artifact:
Gate: Critic returns a verdict. PM acts on verdict per the Verdict Protocol below.
See: references/stage-critic.md for dispatch template and isolation checklist.
Agent: security
Scope: OWASP Top 10, secrets/credentials exposure, injection vulnerabilities (SQL, shell, template), authentication and authorization bypass, arbitrary code execution paths, insecure deserialization, cryptographic weaknesses.
What the agent does:
Gate: Zero security findings at CRITICAL or HIGH severity. MEDIUM findings are documented and logged. LOW findings are noted. If CRITICAL or HIGH security findings are present, PM halts and surfaces to user — same protocol as a critic BLOCK.
See: references/stage-security.md for security review scope and OWASP mapping.
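The security gate above reduces to a count per severity. A minimal sketch, assuming findings arrive as a list of severity tags (the function name and the keys of the returned dictionary are hypothetical):

```python
from collections import Counter


def security_gate(severities: list[str]) -> dict[str, object]:
    """Summarize security findings into the actions the gate prescribes."""
    counts = Counter(s.upper() for s in severities)
    blocking = counts["CRITICAL"] + counts["HIGH"]
    return {
        "halt": blocking > 0,            # PM halts and surfaces to the user
        "documented": counts["MEDIUM"],  # documented and logged
        "noted": counts["LOW"],          # noted only
    }
```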
The critic agent's verdict is only as independent as the context it receives. Anchoring bias — where the reviewer unconsciously accepts the author's framing of what the code does — is the primary failure mode of code review in multi-agent systems.
The PM MUST construct the critic dispatch prompt to contain ONLY:
The PM MUST NOT include in the critic dispatch prompt:
If the engineer's implementation response contains inline explanatory text mixed with code, PM must extract only the code and test output for the critic dispatch — strip the implementer's narration.
This rule exists because an engineer who writes "I chose this approach because the alternative would have performance issues" subtly pre-argues against the critic raising performance concerns. The critic must encounter the code cold.
See: references/critic-isolation.md for prompt construction templates.
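Stripping the implementer's narration can be approximated by keeping only fenced code blocks from the engineer's response. The sketch below is an illustration, not the template from references/critic-isolation.md: the function names and the "CODE UNDER REVIEW" and "TEST OUTPUT" labels are assumptions, and comments inside the code itself are not removed, so they can still carry the author's framing.

```python
import re

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3
CODE_BLOCK_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(engineer_response: str) -> list[str]:
    """Return the contents of every fenced block, dropping surrounding prose."""
    return [body.strip() for body in CODE_BLOCK_RE.findall(engineer_response)]


def build_critic_prompt(engineer_response: str, pytest_output: str) -> str:
    """Assemble a critic prompt from code and test output only."""
    code = "\n\n".join(extract_code_blocks(engineer_response))
    return f"CODE UNDER REVIEW:\n{code}\n\nTEST OUTPUT:\n{pytest_output}"
```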
PM behavior is fully determined by the critic's top-level verdict. PM MUST NOT use judgment to override or soften the verdict protocol.
APPROVE (zero CRITICAL findings, zero HIGH findings):
WARN (zero CRITICAL findings, one or more HIGH findings):
BLOCK (any CRITICAL finding):
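The verdict thresholds above map directly onto a small lookup. A minimal sketch, assuming findings are available as severity tags; `Verdict`, `verdict_from_findings`, and `pm_action` are hypothetical names, and the action strings paraphrase the entry-point summary.

```python
from enum import Enum


class Verdict(Enum):
    APPROVE = "APPROVE"
    WARN = "WARN"
    BLOCK = "BLOCK"


def verdict_from_findings(severities: list[str]) -> Verdict:
    """Derive the top-level verdict from the severity tags of all findings."""
    tags = {s.upper() for s in severities}
    if "CRITICAL" in tags:
        return Verdict.BLOCK
    if "HIGH" in tags:
        return Verdict.WARN
    return Verdict.APPROVE


def pm_action(verdict: Verdict) -> str:
    """Return the PM's fixed response; no judgment or softening is applied."""
    actions = {
        Verdict.APPROVE: "proceed",
        Verdict.WARN: "proceed and log findings to the docs handoff",
        Verdict.BLOCK: "halt and surface findings to the user; no auto-retry",
    }
    return actions[verdict]
```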
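The following rules define when stages may be skipped. Stages 3 through 5 (including the Stage 3.5 tests gate) are NEVER skippable on any path that changes source code; only the documentation-only path described below skips them.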
Standard path: All 6 stages. Required for all new features, refactors >100 lines, any change to authentication/authorization code, any change to data models.
Hotfix path: Stages 1 and 2 may be skipped when ALL of the following are true:
Even on the hotfix path: Stage 3 (Implement + Tests), Stage 3.5 (Tests-Must-Pass Gate), Stage 4 (Critic), and Stage 5 (Security) are mandatory and cannot be skipped.
Documentation-only path: If ALL changes are confirmed to be documentation, configuration
values, or dependency version bumps with no logic changes — all 6 stages are skipped.
PM should verify via git diff --stat that no source files were modified.
See: references/skip-rules.md for skip rule decision tree and examples.
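The three paths above can be summarized as a stage selector. This sketch takes the hotfix preconditions as a single boolean because it does not encode them individually; `stages_to_run` and its parameters are hypothetical names.

```python
# Stage identifiers in pipeline order; "3.5" is the tests-must-pass gate.
ALL_STAGES = ("1", "2", "3", "3.5", "4", "5")


def stages_to_run(docs_only: bool, hotfix_conditions_met: bool) -> tuple[str, ...]:
    """Select the stages for the documentation-only, hotfix, or standard path."""
    if docs_only:
        # No source files modified: the whole pipeline is skipped.
        return ()
    if hotfix_conditions_met:
        # Hotfix path: only Research (1) and Architect (2) may be skipped.
        return tuple(s for s in ALL_STAGES if s not in ("1", "2"))
    return ALL_STAGES
```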
When a stage gate fails, the finding returns to the responsible agent in a structured format. PM does not attempt to interpret or patch the finding — PM passes it verbatim.
Format for returning to engineer after BLOCK:
PIPELINE GATE FAILURE — Return to Stage [N]
Critic verdict: BLOCK
Findings (CRITICAL):
| Severity | File | Line | Issue | Required Fix |
|----------|------|------|-------|--------------|
[paste finding table verbatim]
Instructions:
- Fix ALL CRITICAL items above
- Re-run pytest (must pass green)
- Re-run mypy --strict (must pass)
- Return updated implementation with fresh test output
Do NOT return until all CRITICAL findings are resolved and tests pass.
Format for returning to engineer after WARN (if user requests fixes):
PIPELINE GATE — WARN findings for review
Critic verdict: WARN
Pipeline proceeded, but the following HIGH findings were logged:
| Severity | File | Line | Issue | Recommended Fix |
|----------|------|------|-------|-----------------|
[paste finding table]
These were noted in the documentation handoff. If you choose to address them in this session,
re-run tests and confirm they still pass. No re-critic required for WARN resolutions.
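If the critic's findings are available as structured records rather than a ready-made table (an assumption; the protocol above has the PM paste the table verbatim), the finding table in either return format could be rendered as follows. `Finding` and `render_findings_table` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str
    file: str
    line: int
    issue: str
    fix: str


def _cell(text: str) -> str:
    """Escape pipe characters so they do not break the table layout."""
    return text.replace("|", "\\|")


def render_findings_table(findings: list[Finding], severity: str, fix_header: str) -> str:
    """Render findings of one severity using the column layout shown above."""
    header = f"| Severity | File | Line | Issue | {fix_header} |"
    separator = "|----------|------|------|-------|" + "-" * (len(fix_header) + 2) + "|"
    rows = [
        f"| {f.severity} | {_cell(f.file)} | {f.line} | {_cell(f.issue)} | {_cell(f.fix)} |"
        for f in findings
        if f.severity == severity
    ]
    return "\n".join([header, separator, *rows])
```

Call it with `"CRITICAL"` and `"Required Fix"` for a BLOCK message, or `"HIGH"` and `"Recommended Fix"` for a WARN message.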
Stages 1 and 2 are lightweight by design. Research and Architect agents operate primarily on text documents (specs, KB entries, interface files) rather than large codebases. The majority of token spend occurs in Stages 3-5, where the implementation, tests, critic review, and security review all operate on the full codebase context. Dispatching Research and Architect first ensures that Stage 3 (Implement) receives a clear spec and does not require iterative clarification, reducing the number of engineer re-dispatches — which is the primary source of token waste in unstructured implementation workflows.