---
name: writing-tdd-plans
description: Use when you have a design document and need to create an implementation plan with enforced TDD and adversarial review gates per task
---
Transform a design document into an implementation plan where every feature gets three tasks: write failing tests, implement to pass them, adversarial review. Plans are structured for subagent execution — each task is self-contained with full context.
Core principle: TDD and review are enforced at the plan level, not left to executor discipline. The plan structure makes skipping tests or reviews impossible because they are separate, tracked tasks.
Announce at start: "I'm using the writing-tdd-plans skill to create the implementation plan from the design document."
```dot
digraph when_to_use {
  "Have a design document?" [shape=diamond];
  "Want enforced TDD + review gates?" [shape=diamond];
  "Use writing-tdd-plans" [shape=box, style=filled, fillcolor=lightgreen];
  "Write the design first" [shape=box];
  "Have a design document?" -> "Want enforced TDD + review gates?" [label="yes"];
  "Have a design document?" -> "Write the design first" [label="no"];
  "Want enforced TDD + review gates?" -> "Use writing-tdd-plans" [label="yes"];
}
```
```dot
digraph process {
  rankdir=TB;
  "Read design document" [shape=box];
  "Identify goal, architecture, tech stack" [shape=box];
  "Decompose into features/components" [shape=box];
  "Order features by dependency" [shape=box];
  subgraph cluster_per_feature {
    label="For Each Feature";
    "Create Task N.1: Write failing tests (RED)" [shape=box];
    "Create Task N.2: Implement to pass tests (GREEN)" [shape=box];
    "Create Task N.3: Adversarial review" [shape=box];
  }
  "Build wiring ledger" [shape=box];
  "Save as subfolder in docs/plans/" [shape=box];
  "Read design document" -> "Identify goal, architecture, tech stack";
  "Identify goal, architecture, tech stack" -> "Decompose into features/components";
  "Decompose into features/components" -> "Order features by dependency";
  "Order features by dependency" -> "Create Task N.1: Write failing tests (RED)";
  "Create Task N.1: Write failing tests (RED)" -> "Create Task N.2: Implement to pass tests (GREEN)";
  "Create Task N.2: Implement to pass tests (GREEN)" -> "Create Task N.3: Adversarial review";
  "Create Task N.3: Adversarial review" -> "Build wiring ledger";
  "Build wiring ledger" -> "Flow trace check";
  "Flow trace check" [shape=diamond];
  "Fix plan gaps" [shape=box];
  "Flow trace check" -> "Save as subfolder in docs/plans/" [label="all steps covered"];
  "Flow trace check" -> "Fix plan gaps" [label="gaps found"];
  "Fix plan gaps" -> "Flow trace check";
}
```
REQUIRED: Read ./plan-format.md before writing any plan. It defines the exact task structure, required fields, commit patterns, and detail level.
Key points (see plan-format.md for full templates):
The plan MUST cover the ENTIRE design document. Every feature, component, endpoint, UI element, and infrastructure piece described in the design MUST appear as a triplet in the plan. The plan's job is to decompose and order ALL work — not to decide what fits in a PR. Scope decisions (what ships in which PR) are made AFTER the plan exists, not during planning. A plan that omits sections of the design is incomplete.
Coverage check (MANDATORY): Before writing any triplets, list every top-level section/component/layer from the design document. After writing the plan, verify each item on that list maps to at least one triplet. If any item is missing, the plan is incomplete — add the missing triplets before saving.
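The coverage check can be mechanized. A minimal sketch of the idea — the data shapes here are illustrative, not the plan-format.md schema:

```typescript
// Every design section must map to at least one triplet in the plan.
// Shapes below are illustrative assumptions, not a real plan schema.

interface Triplet {
  feature: string;
  coversDesignSections: string[]; // design sections this triplet implements
}

function findUncoveredSections(designSections: string[], plan: Triplet[]): string[] {
  const covered = new Set<string>();
  for (const t of plan) {
    for (const s of t.coversDesignSections) covered.add(s);
  }
  return designSections.filter((s) => !covered.has(s));
}

// Example: "Auth" has a triplet, "Billing" was forgotten.
const design = ["Auth", "Billing"];
const plan: Triplet[] = [{ feature: "auth", coversDesignSections: ["Auth"] }];
console.log(findUncoveredSections(design, plan)); // → ["Billing"]
```

Any non-empty result means the plan is incomplete: add triplets for the missing sections before saving.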
Read the design document carefully and identify:
1. **Independent features** — Can be implemented in any order. Their triplets can potentially run in parallel (different subagents working on non-overlapping files).
2. **Dependent features** — Feature B needs Feature A. Order triplets: A.1 → A.2 → A.3 → B.1 → B.2 → B.3.
3. **Shared infrastructure** — If multiple features need the same base (database setup, config, types), create a Task 0 for scaffolding, then triplets for each feature.
4. **Mock boundaries** — When a feature will be tested with mocks (e.g., mocking IProvider to test a service that depends on it), that mock boundary represents a real connection that feature tests will NOT verify. List every mock boundary — these become mandatory integration test targets in the integration triplet. See plan-format.md "Integration Triplet" for the Mock Boundary Table format.
5. **Integration test prerequisites (MANDATORY)** — For each mock boundary where the integration triplet will use real services instead of mocks, check: does the project already have the infrastructure to run those real services in tests? If not, Task 0 must install packages (testcontainers, docker-compose), create container/fixture definitions, and provide seed data. An integration test that references testcontainers without a task to install the package is a plan bug.
6. **UI test infrastructure (MANDATORY)** — If the design includes ANY UI component, check whether the project has component rendering test packages (bUnit for Blazor, React Testing Library for React, Vue Test Utils for Vue). Search project files for package references. If not installed, Task 0 MUST install the package and create a minimal test scaffold. Without this, RED tasks cannot write rendering tests, agents fall back to store/state-only tests, and GREEN never creates the component.
6b. **Visual design foundation (MANDATORY for new apps)** — If the design document includes a Visual Design section (color palette, typography, spacing, component style), Task 0 MUST bootstrap a design system: CSS variables/design tokens, base styles, and any chosen CSS framework configuration. If the design document does NOT include visual design decisions, flag this as a gap — the brainstorming phase should have captured visual direction. For existing apps, identify and document the existing design patterns that UI tasks should follow (reference specific styled components by name). Without a visual design foundation, each feature subagent invents its own visual style, producing an incoherent UI.
7. **DI registrations (MANDATORY)** — Any class that depends on TimeProvider, IHttpClientFactory, or any framework type not auto-registered needs a task that registers it. If a service is registered in one host (e.g., Agent) but needed in another (e.g., MCP server, Blazor WASM), the plan must register it in each host separately. Cross-host DI is the #1 source of "works in unit tests, crashes at startup" errors.

File-scope analysis: Independent triplets execute as parallel subagents sharing the same workspace. Two agents editing the same file simultaneously cause merge conflicts, build failures, and spurious test failures.
Before marking features as independent, analyze file-scope overlap: list every file each feature's tasks will create or modify, and check those lists for intersections.

When overlap is found, either extract the shared changes into Task 0 or serialize the overlapping triplets. Prefer Task 0 extraction when the shared changes are small and mechanical (type additions, re-exports). Prefer serialization when the shared file changes are substantial or depend on each feature's implementation.
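The overlap analysis reduces to a set intersection. A sketch, assuming each feature's file scope has already been listed (the shapes and file names are hypothetical):

```typescript
// Two features are parallel-safe only if the sets of files their
// tasks touch do not intersect. Shapes here are illustrative.

interface FileScope {
  feature: string;
  files: Set<string>; // files this feature's tasks create or modify
}

function overlappingFiles(a: FileScope, b: FileScope): string[] {
  return Array.from(a.files).filter((f) => b.files.has(f));
}

const featA: FileScope = { feature: "search", files: new Set(["src/search.ts", "src/routes.ts"]) };
const featB: FileScope = { feature: "export", files: new Set(["src/export.ts", "src/routes.ts"]) };

// Both features modify src/routes.ts, so they must be serialized
// or the shared change extracted into Task 0.
console.log(overlappingFiles(featA, featB)); // → ["src/routes.ts"]
```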
Granularity: Each triplet should be 5-15 minutes of work. If a feature is too large, split it into sub-features, each with its own triplet.
Dependency graph: Include a visual dependency graph at the end of the plan showing which triplets can run in parallel and which are sequential. The graph must reflect both logical dependencies AND file-scope overlap — two features are parallel only if they have zero file overlap.
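One way to derive a valid sequential ordering from the dependency graph is a topological sort. A sketch using Kahn's algorithm, with illustrative feature names:

```typescript
// Order features so every dependency precedes its dependents.
// deps maps a feature to the features that must come AFTER it.

function topoOrder(deps: Record<string, string[]>): string[] {
  const indegree = new Map<string, number>();
  for (const f of Object.keys(deps)) {
    if (!indegree.has(f)) indegree.set(f, 0);
    for (const t of deps[f]) indegree.set(t, (indegree.get(t) ?? 0) + 1);
  }
  // Start from features nothing depends on being done first.
  const queue = Array.from(indegree.keys()).filter((f) => indegree.get(f) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const f = queue.shift()!;
    order.push(f);
    for (const t of deps[f] ?? []) {
      indegree.set(t, indegree.get(t)! - 1);
      if (indegree.get(t) === 0) queue.push(t);
    }
  }
  return order;
}

// "schema" must precede both "api" and "ui"; "api" must precede "ui".
console.log(topoOrder({ schema: ["api", "ui"], api: ["ui"], ui: [] }));
// → ["schema", "api", "ui"]
```

Each feature in the resulting order expands to its RED → GREEN → REVIEW triplet; features at the same depth with zero file overlap may run in parallel.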
Build wiring ledger (MANDATORY): After writing ALL triplets, build the wiring ledger (see plan-format.md "Wiring Ledger"). For every DI registration, route, layout change, and pipeline hookup in the plan, add a ledger entry with the task that creates it. Cross-reference: every ledger entry should have a corresponding wiring test in a RED task. Save as wiring-ledger.md in the plan directory.
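The ledger cross-reference can be checked mechanically. A sketch under assumed data shapes (not the plan-format.md "Wiring Ledger" schema):

```typescript
// Every wiring ledger entry should have a corresponding wiring test
// in some RED task. Shapes below are illustrative assumptions.

interface LedgerEntry { wiring: string; createdByTask: string }
interface RedTask { task: string; wiringTestsFor: string[] }

function entriesWithoutWiringTests(ledger: LedgerEntry[], redTasks: RedTask[]): string[] {
  const tested = new Set<string>();
  for (const t of redTasks) {
    for (const w of t.wiringTestsFor) tested.add(w);
  }
  return ledger.filter((e) => !tested.has(e.wiring)).map((e) => e.wiring);
}

const ledger: LedgerEntry[] = [
  { wiring: "register TimeProvider.System", createdByTask: "2.2" },
  { wiring: "map /api/chat route", createdByTask: "3.2" },
];
const redTasks: RedTask[] = [{ task: "2.1", wiringTestsFor: ["register TimeProvider.System"] }];

// The route hookup has no wiring test, so GREEN has no
// test-driven reason to implement it — a plan gap.
console.log(entriesWithoutWiringTests(ledger, redTasks)); // → ["map /api/chat route"]
```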
Flow trace check (MANDATORY): After building the wiring ledger, trace each user-facing flow from the design end-to-end through the plan. For each step in the flow (user sees X → clicks Y → gets Z), identify which task creates or modifies the code for that step. A step with no corresponding task is a plan gap — add the missing task before saving. Save the trace as flow-trace.md in the plan directory (see plan-format.md "Flow Trace Artifact") — the executor reads this to verify wiring between layers. Common gaps this catches:
- [FromBody] inferred on a GET endpoint

| Excuse | Reality |
|---|---|
| "I'll combine tests and implementation for speed" | Separate tasks enforce TDD. Combined tasks let you write tests-after. |
| "Tests pass, no need for review" | Tests only cover what you thought of. Adversarial review finds what you didn't. |
| "No bugs found, looks good" | Bug-free ≠ correct. Does it actually do what the requirements ask? Review requirements compliance, not just code quality. |
| "Review is overkill for this simple feature" | Simple features have subtle edge cases. Review takes 5 minutes. |
| "I'll write the tests in the implementation task" | That's tests-after with extra steps. The test task must exist separately. |
| "The design is clear enough, I don't need to quote requirements" | Reviewers need verbatim requirements to catch misinterpretations. |
| "Subagents are slow, I'll execute tasks myself" | Fresh subagent context prevents cross-task contamination and shortcuts. |
| "These features are logically independent, so they can run in parallel" | Logical independence ≠ file independence. Check for shared files before marking as parallel. |
| "Store/action tests cover the UI feature" | Store tests verify state logic, not that the component exists or renders. The GREEN step (YAGNI) won't create a component no test requires. Add a rendering test that imports and renders the component. If no rendering test infrastructure exists, establishing it (bUnit, React Testing Library) is a Task 0 item — not a reason to exclude the component. |
| "No bUnit/RTL installed, so we can't write rendering tests" | Missing test infrastructure is a Task 0 prerequisite, not a reason to drop rendering tests. Task 0 MUST install the package and create a test scaffold. Check project files during decomposition (point 6) — don't discover this at RED task time. |
| "CSS/styling is beyond what tests require (YAGNI)" | Tests verify behavior, not appearance — but unstyled HTML is not a deliverable. The GREEN step must style UI components following the visual design direction (from the design document) or matching the codebase's existing visual patterns. YAGNI applies to features, not to basic visual quality. A component with CSS class names that have no CSS rules is a broken deliverable. |
| "Visual design is subjective, we can't test for it" | Correct — tests can't verify visual quality. That's why the REVIEW task must assess it. Visual acceptance criteria in the plan give the reviewer concrete expectations to evaluate against. A review that only checks "CSS rules exist" misses whether the UI is actually usable and appealing. |
| "The design doc doesn't specify colors/fonts, so we'll use defaults" | Missing visual direction is a gap in the design document, not a license to skip visual quality. Flag it as a brainstorming gap and establish a visual foundation in Task 0. Browser defaults produce an unusable UI. |
| "The executor can figure out the types/signatures" | The executor has only the task spec, not the design doc or debate log. If the plan says "create UserService with CRUD operations", 10 executors produce 10 different APIs. Specify signatures, types, error conditions — lock down design decisions. |
| "The integration triplet will catch wiring issues" | Only if the integration task SPECIFICALLY tests wiring. A vague "test features together" integration task won't catch missing DI registrations, unapplied decorators, or stub implementations. Build the Mock Boundary Table (see plan-format.md). |
| "Integration tests will use real services" (but no task installs the infrastructure) | Integration tests need runnable infrastructure. If the plan says "test against real PostgreSQL via testcontainers" but no Task 0 installs testcontainers or creates container definitions, the test is unrunnable at execution time. Prerequisites must be explicit plan tasks. |
| "This design is too large for one PR — I'll split into PR 01 and PR 02" | The plan covers the ENTIRE design. Scope decisions (what ships in which PR) happen AFTER the plan exists, not during planning. A plan that omits sections of the design is incomplete — even if you intend to "plan the rest later." Plan everything, then let the executor or user decide PR boundaries. |
| "I'll defer the UI/auth/infrastructure to a separate plan" | Deferring = omitting. If the design describes it, the plan must include it. The plan is a decomposition of the design, not a subset. |
| "The coverage check passed, so the plan is complete" | Coverage maps design sections to triplets. It doesn't verify that every step in a user flow has a task. A component can have a triplet but never be wired into the layout. An interface can exist in Task 0 but have no implementation class. The flow trace catches what the coverage check misses. |
| "Wiring will be implemented because it's in the Files section" | GREEN subagents prioritize making tests pass. Without wiring tests in RED, there's no test-driven reason to implement wiring changes — they get skipped. Every "Modify" file needs a corresponding wiring test. |
| "The flow trace is in my head, I verified it" | A mental trace is lost after planning. Save it as flow-trace.md — the executor needs it to verify wiring between layers. If it's not written down, it doesn't exist. |
| "The wiring ledger is redundant with the DI registration table" | The DI table is per-task. The ledger is cross-task — it shows WHO registers WHAT across the entire plan. The executor uses it to verify wiring after each layer, not just within a single task. |
| "Unit tests pass, so the DI wiring is correct" | Unit tests mock DI containers. A class can pass all unit tests while its real DI container can't resolve it — because a transitive dependency (like TimeProvider) was never registered. Every GREEN task must include a DI registration table. |
| "The framework infers parameter binding correctly" | Minimal API, MVC, and other frameworks have inference rules that surprise. A GET endpoint with an unattributed complex parameter is inferred as [FromBody] and throws InvalidOperationException. GREEN tasks for web endpoints must specify binding sources explicitly. |
| "The executor can figure out DI registrations" | The executor has only the task spec. If the plan says "create TokenInjector(ITokenStore, TimeProvider)" but doesn't say "register TimeProvider.System in DI", the executor creates the class, unit tests pass (mocked), and the app crashes at startup. DI registrations are design decisions — lock them down. |
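The flow trace check described above can likewise be mechanized. A sketch with an illustrative data model (the flow steps and task IDs are invented for the example):

```typescript
// Each step of a user-facing flow must be created or modified by
// some task in the plan. A step with no task is a plan gap.

interface FlowStep {
  step: string;
  implementedByTask?: string; // absent = no task covers this step
}

function findFlowGaps(flow: FlowStep[]): string[] {
  return flow.filter((s) => !s.implementedByTask).map((s) => s.step);
}

const loginFlow: FlowStep[] = [
  { step: "user sees login form", implementedByTask: "1.2" },
  { step: "form posts to /api/login", implementedByTask: "2.2" },
  { step: "success redirects to dashboard" }, // no task wires the redirect
];

console.log(findFlowGaps(loginFlow)); // → ["success redirects to dashboard"]
```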
Never:

- Leave parameter binding sources implicit — specify [FromQuery], [FromServices], [FromBody], etc. explicitly in endpoint tasks.
- Let executors invent abstractions — e.g., creating IFooHubService when the plan says to inject IChatConnectionService. New abstractions = new DI registrations the plan didn't account for = runtime crashes.

The triplet is atomic: If you can't write all three tasks for a feature, the feature needs to be decomposed further.