---
name: tdd
description: >-
  Enforces strict Test-Driven Development with RED-GREEN-REFACTOR cycles.
  Writes one failing test at a time, implements minimal code to pass, then
  refactors. Uses subagent isolation to prevent tests-by-construction. Pairs
  with the `code-quality` skill: invokes `Skill('code-quality')` during the
  REFACTOR phase to apply the full code-quality rule set against the GREEN
  output, and cites refactor recipes (R1–R20) by ID when reporting changes.
  Triggers on: "tdd", "write tests", "test this", "add test coverage",
  "test driven", "red green refactor", "/tdd".
license: MIT
metadata: {"author":"mthines","version":"1.2.0","workflow_type":"applied","modes":["tdd-full","test-first","test-after"],"tags":["tdd","testing","red-green-refactor","test-first","test-after","refactoring","code-quality"]}
---
# Test-Driven Development
Enforce strict RED-GREEN-REFACTOR discipline. One test at a time. Tests describe WHAT the system does, never HOW.
## Input
Check `$ARGUMENTS` for the feature or behavior description.

- If `$ARGUMENTS` contains a feature description, use it directly.
- If `$ARGUMENTS` contains "after" (e.g., `/tdd after`), run in Test-After Mode — write tests for existing code. See `rules/test-after.md`.
- If `$ARGUMENTS` is empty, ask the user what behavior they want to implement or test.
## Step 0: Discover Project Test Setup

Before writing any tests:

- Find existing tests — glob for `**/*.test.*`, `**/*.spec.*`, `**/*_test.*`, `**/test_*.*`, `**/tests/**` to identify the test framework, naming conventions, and directory structure.
- Find the test runner — check `package.json` scripts, `Makefile`, `pyproject.toml`, `Cargo.toml`, `go.mod`, or similar for the test command.
- Adopt existing patterns — match the project's test style exactly: same imports, same assertion library, same file naming, same directory placement. Never introduce a new test framework or pattern.
- Identify the run command — store it mentally as `TEST_CMD` for use throughout the cycle. If you can run a single test file or test case, prefer that over the full suite.

If no tests exist yet, ask the user which framework to use before proceeding.
## Step 1: Prioritize by Business Criticality
Before diving into implementation, identify what matters most:
- Core user flows first — what are the primary actions users perform? Test those before edge cases.
- Ask if unclear — if the feature has multiple behaviors, ask the user to rank them or list the critical paths.
- Build a test list — write a numbered list of behaviors to test, ordered by importance. Each item should be a single, specific behavior (not "test the login flow" but "reject login with expired password").
Present the test list to the user for confirmation before starting the cycle.
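For example, a test list for the illustrative `createOrder` behavior used elsewhere in this skill (the specific behaviors are invented) might read:

1. should create an order when all items are in stock
2. should reject order when inventory is zero
3. should apply a valid discount code to the order total
4. should reject order when payment authorization fails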
## Step 2: RED-GREEN-REFACTOR Cycle
For each item in the test list, execute one full cycle. Follow the rules strictly:
### RED Phase

See `rules/red.md`.
Write exactly ONE failing test. Run it. Confirm it fails with the expected error. Do NOT write implementation code.
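A minimal RED sketch, assuming a Vitest-style runner and reusing the illustrative `createOrder` example from the naming convention below (the module path and shapes are invented for this example):

```typescript
// RED: one failing test, written before any implementation exists.
// Assumes a Vitest-style runner; `createOrder`, its module path, and
// the order shape are illustrative, not part of this skill.
import { describe, it, expect } from 'vitest';
import { createOrder } from './createOrder';

describe('createOrder', () => {
  it('should reject order when inventory is zero', () => {
    const result = createOrder({ sku: 'ABC-1', quantity: 1, inventory: 0 });
    expect(result).toEqual({ ok: false, reason: 'out_of_stock' });
  });
});
```

Run `TEST_CMD` against just this file and confirm the failure is the expected one (missing module or behavior), not a typo in the test itself.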
### GREEN Phase

See `rules/green.md`.
Write the MINIMUM code to make the failing test pass. No more. Run the test. Confirm it passes. Run the full relevant test suite to check for regressions.
Even in this phase, apply the basic readability primitives while you write: meaningful names, guard clauses for the cases the test forces, no nesting beyond 2 levels. These cost almost nothing during authoring but are expensive to bolt on later. Don't optimize, abstract, or add unrequested features — REFACTOR will handle deeper improvements via Skill('code-quality').
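Continuing the hypothetical example from RED, a minimal GREEN sketch: the guard clause covers exactly the case the test forces, and nothing speculative is added:

```typescript
// GREEN: the minimum implementation the failing test forces, no more.
// Shapes are illustrative and match the RED sketch above.
type OrderInput = { sku: string; quantity: number; inventory: number };
type OrderResult = { ok: true } | { ok: false; reason: 'out_of_stock' };

export function createOrder(input: OrderInput): OrderResult {
  // Guard clause for the one case the test names; happy path stays flat.
  if (input.inventory === 0) {
    return { ok: false, reason: 'out_of_stock' };
  }
  return { ok: true };
}
```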
### REFACTOR Phase

See `rules/refactor.md`.
Evaluate whether refactoring is needed. If yes, refactor while keeping all tests green. If no, move to the next cycle.
Invoke `Skill('code-quality')` for this phase. The code-quality skill is the source of truth for what "clean" means — guard clauses, low cognitive complexity, single-responsibility functions, intent-revealing names, single source of truth for union-type metadata, schema-first validation with type inference, total functions, type-driven design, functional core / imperative shell, and the named refactor recipes catalog (R1–R20). Routing through the skill — rather than reading individual rule files — picks up its Review Mode procedure, the structured output contract, and any rules added since this skill was written. Cite recipes by ID in commit messages and PR descriptions (e.g., "R1 Consolidate Parallel Maps", "R10 Total-ise the Function") so reviews are reproducible.
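As a hedged illustration of what a recipe-style move looks like (the recipe IDs are defined in the code-quality skill; the domain shapes here are invented), a sketch in the spirit of "R1 Consolidate Parallel Maps":

```typescript
// Before: parallel records keyed by the same union drift apart silently.
// After: one record is the single source of truth for union-type metadata.
type OrderStatus = 'pending' | 'shipped' | 'cancelled';

const ORDER_STATUS_META = {
  pending:   { label: 'Pending',   terminal: false },
  shipped:   { label: 'Shipped',   terminal: true },
  cancelled: { label: 'Cancelled', terminal: true },
} as const satisfies Record<OrderStatus, { label: string; terminal: boolean }>;

// Derived lookups replace the scattered parallel maps; adding a status
// now fails to compile until every field is supplied.
const statusLabel = (s: OrderStatus): string => ORDER_STATUS_META[s].label;
```

Each such move is a single change, made only while all tests stay green.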
## Step 3: Cycle Completion Check
After each RED-GREEN-REFACTOR cycle:
- Run the full relevant test suite (not just the new test).
- If all tests pass, move to the next item on the test list.
- If a test fails, stop and fix it before proceeding. Never accumulate broken tests.
- After every 3 cycles, briefly report progress to the user.
## Step 4: Final Verification
After all items in the test list are complete:
- Run the full test suite one final time.
- Check test coverage if the project has coverage tooling — report any critical paths that are uncovered.
- Provide a summary of what was tested and what was implemented.
## Critical Rules (apply to ALL phases)
### Test Quality
- Test behavior, not implementation — tests must exercise public interfaces only. A test must survive a complete internal refactor unchanged.
- One behavior per test — each test should verify exactly one thing. The test name should describe that behavior.
- No testing framework internals — never test that `setTimeout` works, that React renders, or that Go's `http.ListenAndServe` starts. Test YOUR code.
- Maximum 10-15 tests per file — if you need more, split by behavior group.
- Factory functions for test data — use `buildUser(overrides?)` patterns instead of inline object literals scattered across tests (see the sketch after this list).
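A minimal factory sketch, assuming TypeScript; the `User` shape is invented for this example:

```typescript
// Test-data factory: sensible defaults, per-test overrides.
type User = { id: string; email: string; role: 'admin' | 'member'; active: boolean };

function buildUser(overrides: Partial<User> = {}): User {
  return {
    id: 'user-1',
    email: 'test@example.com',
    role: 'member',
    active: true,
    ...overrides,
  };
}

// Each test states only the fields it actually cares about.
const inactiveAdmin = buildUser({ role: 'admin', active: false });
```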
### Mocking Strategy
- DO mock: external HTTP APIs, third-party services, file system (when testing logic, not I/O), time/dates, randomness.
- DO NOT mock: your own code, framework features, database (prefer test DB or in-memory), internal modules (unless crossing a major boundary).
- Never mock what you don't own — if you don't control the interface, write an adapter and mock that (see the sketch after this list).
- If the test needs more than 3 mocks, the design is wrong — refactor the code under test first.
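A sketch of the adapter move, assuming TypeScript; the payment domain and the vendor client shape are hypothetical:

```typescript
// "Never mock what you don't own": define a small interface you control,
// wrap the vendor SDK behind it, and mock only your own interface in tests.
interface PaymentGateway {
  charge(amountCents: number, token: string): Promise<{ ok: boolean }>;
}

// Hypothetical vendor client shape; its API is not under your control.
type VendorClient = {
  createCharge(args: { amount: number; source: string }): Promise<unknown>;
};

// Production adapter: the only place that touches the vendor API.
class VendorPaymentGateway implements PaymentGateway {
  constructor(private readonly client: VendorClient) {}

  async charge(amountCents: number, token: string): Promise<{ ok: boolean }> {
    await this.client.createCharge({ amount: amountCents, source: token });
    return { ok: true };
  }
}

// Test double: a hand-rolled fake of the interface you own.
const fakeGateway: PaymentGateway = {
  charge: async () => ({ ok: true }),
};
```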
### Naming Convention
Follow the project's existing convention. If none exists, use:
- `describe` block: the unit under test (function, class, component)
- `it`/`test` block: `should [expected behavior] when [condition]`
- Example: `describe('createOrder')` → `it('should reject order when inventory is zero')`
### Code Quality (during GREEN and REFACTOR)
The code-quality skill is the source of truth for what "well-written code" means in this workflow. Apply during GREEN as inline primitives; invoke as a full pass during REFACTOR via `Skill('code-quality')`.
GREEN-phase primitives (apply inline while writing the minimal implementation):
- Guard clauses + early returns instead of nested `if`s — flat code is easier to read and easier to test (see the sketch after this list).
- Cognitive complexity ≤ 15 per function (SonarSource scoring). If the function feels hard to test (lots of mocks, complex setup), that is the metric warning you the function does too much.
- Names that describe intent, not types or position. `pendingOrders` beats `arr2`.
- One responsibility per function — if you cannot name it without "and", split it.
- Validate at boundaries, trust internally. No defensive null checks for impossible states.
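A small sketch of these primitives working together, with invented domain shapes:

```typescript
// Guard clauses keep the happy path flat; names state intent, not position.
type Order = { items: string[]; paid: boolean };

function shipOrder(order: Order | null): string {
  if (order === null) return 'no_order';        // guard: boundary input validated once
  if (order.items.length === 0) return 'empty_order';
  if (!order.paid) return 'unpaid_order';
  return 'shipped';                             // happy path, zero nesting
}
```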
For REFACTOR, invoke `Skill('code-quality')` so the skill's Review Mode procedure runs against the GREEN output. The skill returns findings under the High / Medium / Low / Maintainability / Correctness / Testability headings; address each as a separate refactoring move (one change at a time, per `rules/refactor.md`). Cite recipe IDs (R1–R20) in commit messages.
## Anti-Patterns to Avoid
| Anti-Pattern | Why It's Bad | What to Do Instead |
|---|---|---|
| Writing tests in bulk | Tests imagined behavior, not observed | One test per cycle |
| Testing and implementing together | Unconsciously designs tests around implementation | Strict phase separation |
| "Make sure tests pass" prompt | Encourages implementation-first thinking | "Write a FAILING test" |
| Changing test expectations to pass | Masks real bugs | Fix the source code |
| Testing private methods | Couples tests to implementation | Test through public API |
| Copy-pasting mock setup | Brittle, hard to maintain | Extract shared fixtures |
## When Things Go Wrong
- Test won't fail (RED phase): The behavior already exists or the test is wrong. Investigate before proceeding.
- Can't make test pass without large changes (GREEN phase): The test step is too big. Break it into smaller behaviors.
- Refactoring breaks tests: The tests were testing implementation details. Rewrite the test to test behavior, then refactor.
- After 2 failed attempts to fix: Clear context and start the cycle fresh with a better-scoped test.