Run any Skill in Manus with one click

tdd

Stars1

Forks0

UpdatedFebruary 23, 2026 at 13:24

Strict Test-Driven Development workflow enforcing the RED-GREEN-REFACTOR cycle. Ensures every line of production code is justified by a failing test. Use when building new features, fixing bugs, or adding behavior.

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

Run Skill in Manus

Source

hypeJunction

hypeJunction/ai-assistant-starter

View GitHub Repository View Creator Repositories

Download

Run Skill in Manus

Related occupationsSOC

Based on SOC occupation classification

Software Quality Assurance Analysts and TestersComputer and Mathematical Occupations·SOC 15-1253

File Explorer

5 files

SKILL.md

readonly

TDD

Purpose: Enforce strict Test-Driven Development with verified RED-GREEN-REFACTOR cycles Phases: RED -> Verify RED -> GREEN -> Verify GREEN -> REFACTOR Usage: /tdd <feature or behavior description>

Iron Laws

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST -- Every line of production code must be justified by a test that fails without it. No exceptions. No "just this once."
WRITE THE MINIMUM CODE TO PASS -- The GREEN phase produces only what the test demands. No extra features, no future-proofing, no refactoring. Just make the red test green.
DELETE AND RESTART IF VIOLATED -- If production code was written before its test, delete it completely. Then write the test first. There is no shortcut that preserves TDD's guarantees.

Note: Command examples use npm as default. Adapt to the project's package manager per ai-assistant-protocol — Project Commands.

Why TDD for AI Agents

Everything that makes TDD tedious for humans makes it ideal for AI: clear measurable goals per cycle, mechanical verification at each step, and tight feedback loops. Tests serve as natural-language specs that guide the agent toward exactly the behavior you expect.

When to Use

Building new features or modules from scratch
Fixing bugs (write test that reproduces bug first)
Adding behavior to existing code
Implementing interfaces or contracts
Any work where correctness matters more than speed

When NOT to Use

Exploratory prototyping (explore first, then delete and TDD) -> /explore
Code already written that needs tests after the fact -> /test-coverage
Refactoring working code with passing tests -> /refactor
Investigating a bug you don't understand yet -> /debug
Configuration or infrastructure files with no testable logic

Never Do

Never keep production code written before its test -- Delete it. Write the test. Then rewrite the code. The test must fail first or you have no proof it catches anything.
Never skip the verify step -- Running the test and confirming it fails (RED) or passes (GREEN) is not optional. The verify step is what separates TDD from "writing tests."
Never mock what you don't own -- Don't mock third-party libraries or framework internals. Wrap them in your own interface and mock that. See testing-anti-patterns.md.
Never test implementation details -- Test behavior, not how the code achieves it. If refactoring breaks your tests but not your behavior, your tests are wrong.
Never write more than one failing test at a time -- One RED test. Make it GREEN. Refactor. Then write the next test. Multiple failing tests create confusion and split focus.
Never refactor during GREEN -- GREEN means "make it pass." REFACTOR is a separate phase. Mixing them means you can't tell if breakage came from the fix or the cleanup.
Never commit with failing tests -- Every commit must be GREEN. If you can't make it green, revert to the last green state.

Gate Enforcement

CRITICAL: Each phase transition requires verification.

RED -> Verify RED: Test MUST fail. Failure MUST be for the right reason.
Verify RED -> GREEN: Test MUST pass after code change.
GREEN -> Verify GREEN: All tests MUST pass, not just the new one.
Verify GREEN -> REFACTOR: Only begin cleanup with a fully green suite.

Phase 0: Setup -- Decompose and Plan

Mode: Analysis and test file creation -- no production code.

Step 0.1: Decompose into Testable Behaviors

Decompose features into testable behaviors, not methods. A behavior is a single observable outcome: "When [condition], it should [result]." Methods may have multiple behaviors -- happy paths, edge cases, and error cases are separate behaviors. Decompose complex behaviors into simpler ones until each can be tested with a single assertion.

## Feature Decomposition

**Feature:** [Feature name]
**Behaviors identified:** [count]

| # | Behavior | Expected |
|---|----------|----------|
| 1 | When [condition], it should [result] | [expected output] |
| 2 | When [condition], it should [result] | [expected output] |
| ... | ... | ... |

Step 0.2: Create Test File with Roadmap

Before the first RED phase, create the test file with the describe block and a list of planned test descriptions as skipped/todo tests. This provides a roadmap for the session and makes progress visible:

describe('ModuleName', () => {
  it.todo('calculates subtotal from item prices and quantities');
  it.todo('returns zero for an empty item list');
  it.todo('applies percentage discount code');
  it.todo('rejects invalid discount codes with an error');
  it.todo('calculates tax based on region-specific rate');
});

Run the test file to confirm it loads without errors (all tests show as "todo/skipped"). This file is the session plan -- each RED phase replaces the next it.todo with a real test.

Phase 1: RED -- Write a Failing Test

Mode: Test files only -- no production code.

Step 1.1: Identify the Next Behavior

Pick the next behavior from the decomposition (Step 0.1). Replace its it.todo entry with a real test. One assertion, one concept.

## TDD Cycle -- Behavior N/M

**Behavior:** [What the code should do]
**Input:** [What goes in]
**Expected output:** [What comes out]

Step 1.2: Write ONE Minimal Failing Test

describe('ModuleName', () => {
  it('should [expected behavior] when [condition]', () => {
    // Arrange: set up inputs and dependencies
    const input = createInput();

    // Act: call the function/method under test
    const result = moduleName.doSomething(input);

    // Assert: verify the expected outcome
    expect(result).toEqual(expectedOutput);
  });
});

Rules:

Test name describes behavior, not implementation ("should calculate total with tax" not "should call multiply")
Use real objects over mocks wherever possible
One logical assertion per test (multiple expect calls are fine if they assert one concept)
The test must reference production code that does not yet exist or does not yet handle this case

Step 1.3: Confirm Test is Written

## RED Phase Complete

**Test file:** `path/to/module.spec.ts`
**Test name:** `should [behavior] when [condition]`
**Asserts:** [what it checks]

Ready to verify this test fails.

Phase 2: Verify RED -- Confirm the Test Fails

Mode: Read-only verification -- run tests, do not change anything.

Step 2.1: Run the Test

npm run test -- path/to/module.spec.ts

Step 2.2: Confirm Failure Reason

CRITICAL: The test must fail for the right reason.

Failure Type	Verdict	Action
Function not found / module not exported	Valid RED	Proceed to GREEN
Assertion fails (wrong return value)	Valid RED	Proceed to GREEN
Syntax error in test	Invalid RED	Fix the test, re-run
Wrong import path	Invalid RED	Fix the test, re-run
Unrelated test fails	Invalid RED	Fix the unrelated failure first
Test passes unexpectedly	Invalid RED	The behavior already exists -- write a different test or verify your test actually tests what you think

## RED Verified

**Test result:** FAIL
**Failure reason:** [why it failed]
**Valid failure:** Yes / No
**Action:** Proceed to GREEN / Fix test and re-verify

GATE: Test must fail for a valid reason before proceeding.

Phase 3: GREEN -- Write Minimum Code to Pass

Mode: Production code -- minimal changes only.

Step 3.1: Write the Simplest Code That Passes

If the test expects a return value, hardcode it if that makes the test pass
Do not add error handling the test doesn't require
Do not add features beyond what the test checks
Do not refactor, rename, or reorganize

Step 3.2: Keep It Minimal

Temptation	Response
"I should also handle the edge case"	Write a test for it first
"This needs error handling"	Write a test for the error first
"I should extract a helper"	Do that in REFACTOR
"The variable name is bad"	Rename in REFACTOR
"I know what the next test will need"	Write that test first

## GREEN Phase Complete

**File changed:** `path/to/module.ts`
**Change:** [what was added/modified]
**Lines added:** [count]

Ready to verify all tests pass.

Phase 4: Verify GREEN -- Confirm All Tests Pass

Mode: Read-only verification -- run tests, do not change anything.

Step 4.1: Run the Test

npm run test -- path/to/module.spec.ts

Step 4.2: Run Related Tests

npm run test -- "path/to/directory/"

Step 4.3: Confirm Clean Pass

## GREEN Verified

**New test:** PASS
**Related tests:** PASS ({N} tests)
**Clean output:** Yes / No (warnings?)

Ready to refactor.

GATE: ALL tests must pass before proceeding to REFACTOR.

Phase 5: REFACTOR -- Improve Without Changing Behavior

Mode: Production and test code -- behavior must remain identical.

Step 5.1: Identify Improvements

Extract duplicated logic into helpers
Rename variables and functions for clarity
Simplify conditional logic
Remove dead code
Improve test readability
For utilities with well-defined input/output contracts, consider adding property-based tests (e.g., fast-check) to catch edge cases example-based tests miss

Step 5.2: Refactor in Small Steps

After each change:

npm run test -- path/to/module.spec.ts

If any test fails, undo the last change immediately. Refactoring must never break tests.

Step 5.3: Verify Final State

npm run test -- "path/to/directory/"
npm run typecheck
npm run lint

## REFACTOR Complete

**Changes made:**
- [Refactoring 1]
- [Refactoring 2]

**All tests:** PASS
**Type check:** PASS
**Lint:** PASS

Cycle complete. Ready for next behavior or commit.

Step 5.4: Track Progress

Track progress across cycles: "Behavior N/M: [description] -- RED/GREEN/REFACTOR". This prevents losing track in long TDD sessions. Report progress after each cycle completion.

## Progress

- [x] Behavior 1/M: [description] -- COMPLETE
- [x] Behavior 2/M: [description] -- COMPLETE
- [ ] Behavior 3/M: [description]
- [ ] ...

Commit Frequency

Commit at natural TDD boundaries:

After completing each behavior cycle for complex features (each behavior involves significant logic or multiple files)
After every 2-3 cycles for simpler features (behaviors are straightforward, single-file changes)
Always commit before starting a risky refactor -- if the refactor might break things, ensure the current green state is committed first

Commit messages should reference the behavior:

test: add PriceCalculator subtotal calculation
test: add discount validation and tax calculation
refactor: extract shared pricing helpers

Never commit with failing tests. Every commit must be GREEN.

Common Rationalizations

When tempted to skip TDD, consult this table (see references/tdd-rationalizations.md for deeper analysis with real-world scenarios):

Rationalization	Rebuttal
"Too simple to test"	Simple code breaks. A one-line function with a typo is still a bug in production.
"Need exploration first"	Explore freely. Then delete the exploration and rebuild with TDD. Exploration code is throwaway.
"Tests after achieve same result"	Tests-after verify what the code IS. Tests-first verify what the code SHOULD BE. Only one of these catches design mistakes.
"TDD is dogmatic"	TDD is pragmatic: it finds bugs at write time instead of debug time. Dogma would be following it without evidence. The evidence is overwhelming.
"I'll write the test right after"	You won't. And if you do, you can't verify it catches the bug -- because the code already passes.
"This is just a config file"	If it can break production, it can have a test. If it can't break production, why are you changing it?
"The function is obvious"	Obvious functions get called with non-obvious inputs. The test documents what "obvious" means.
"I need to see the implementation shape first"	The test IS the shape. Write what you want to call, then make it work. The test is the first client of your API.
"Mocking is too hard for this"	Hard-to-mock means hard-to-test means bad design. Fix the design. The difficulty is the signal.
"We're in a hurry"	TDD is faster. Debugging untested code is what wastes time. You're not saving time, you're borrowing it at high interest.
"Existing code has no tests"	Start now. Every tested line is one less future debugging session. Don't perpetuate the problem because someone else created it.

Red Flags -- Stop and Restart

If any of these occur, stop the current cycle and reassess:

You wrote production code before a test -- Delete the code. Write the test.
The test passes on the first run -- Your test doesn't test anything new. Rewrite it.
You're not sure why the test fails -- You don't understand the system well enough. Read the code first.
You added "just one more thing" in GREEN -- Revert to the last green state. Write a test for the extra thing.
You're mocking more than two dependencies -- The unit under test has too many collaborators. Refactor the design.
The test name describes implementation -- "should call database" is wrong. "should return user by email" is right.
You're testing private methods -- Test the public interface. Private methods are implementation details.
Multiple tests are failing at once -- You jumped ahead. Revert to last green. One failing test at a time.
You refactored during GREEN -- Revert. Make it pass first, then clean up.
The test requires complex setup (>15 lines of arrange) -- The code under test needs a simpler interface. Refactor first.
You're writing tests to match existing code -- That's /test-coverage, not /tdd. TDD means the test comes first.
You feel confident enough to skip verification -- That's exactly when bugs slip through. Run the test.

Example: TDD Cycle for a Bug Fix

RED

A user reports that calculateDiscount returns negative prices for 100% discounts.

// discount.spec.ts
describe('calculateDiscount', () => {
  it('should return zero when discount is 100%', () => {
    const result = calculateDiscount(50.00, 100);
    expect(result).toBe(0);
  });
});

Verify RED

npm run test -- discount.spec.ts
# FAIL: Expected 0, received -50
# Valid failure: the function subtracts beyond zero

GREEN

// discount.ts
export function calculateDiscount(price: number, discountPercent: number): number {
  const discounted = price - (price * discountPercent / 100);
  return Math.max(0, discounted);
}

Verify GREEN

npm run test -- discount.spec.ts
# PASS: 1 test passed

REFACTOR

No refactoring needed for this small change. Run full suite to confirm:

npm run test -- "src/pricing/"
# PASS: 12 tests passed

Commit the fix with its regression test.

Verification Checklist

Before committing, confirm every item:

Every production code change has a corresponding test that was written FIRST
Every test was verified to FAIL before writing production code
Every test was verified to PASS after writing production code
No production code exists beyond what the tests require
All tests pass (not just the new ones)
Type check passes
Lint passes
Test names describe behavior, not implementation

When Stuck

See references/tdd-troubleshooting.md for detailed decision tables, the "wish" technique, and debugging integration guidance.

Problem	Solution
Don't know what test to write	Describe the behavior in plain English first. The test is that sentence turned into code.
Test is too complex	Break the behavior into smaller pieces. Test each piece separately.
Can't make the test pass simply	The design is wrong. Step back. What interface would make this test trivial? Build that interface.
Too many mocks needed	The code has too many dependencies. Extract an interface, inject dependencies, or split the unit.
Existing code has no tests	Don't retrofit TDD. Use `/test-coverage` to add tests to existing code. Use `/tdd` for new behavior.
Feature is unclear	Use `/explore` or `/plan` first. Come back to `/tdd` when you know what to build.
Tests pass but behavior is wrong	Your tests don't cover the actual requirements. Write a new test for the failing scenario.
Refactoring breaks tests	Undo the refactoring. Refactoring should not change behavior. If it does, you're changing functionality -- write a test first.

References

Testing Anti-Patterns -- Common testing mistakes and how to avoid them

Acceptance Tests

ID	Type	Prompt / Condition	Expected
TDD-T1	Positive	"Write the test first, then implement"	Skill triggers
TDD-T2	Positive	"Use red-green-refactor for this feature"	Skill triggers
TDD-T3	Positive	"Test-driven development for the parser"	Skill triggers
TDD-T4	Negative	"Add tests for existing code"	Does NOT trigger (→ /test-coverage)
TDD-T5	Negative	"Build the login page"	Does NOT trigger (→ /implement)
TDD-T6	Negative	"Run the test suite"	Does NOT trigger (→ /validate)
TDD-T7	Boundary	"Implement with tests"	Does NOT trigger (→ /implement, has TDD-lite)

name	tdd
description	Strict Test-Driven Development workflow enforcing the RED-GREEN-REFACTOR cycle. Ensures every line of production code is justified by a failing test. Use when building new features, fixing bugs, or adding behavior.
category	process
triggers	["test first","test-driven","red green refactor","write test before code"]

tdd

More from this repository

TDD

Iron Laws

Why TDD for AI Agents

When to Use

When NOT to Use

Never Do

Gate Enforcement

Phase 0: Setup -- Decompose and Plan

Step 0.1: Decompose into Testable Behaviors

Step 0.2: Create Test File with Roadmap

Phase 1: RED -- Write a Failing Test

Step 1.1: Identify the Next Behavior

Step 1.2: Write ONE Minimal Failing Test

Step 1.3: Confirm Test is Written

Phase 2: Verify RED -- Confirm the Test Fails

Step 2.1: Run the Test

Step 2.2: Confirm Failure Reason

Phase 3: GREEN -- Write Minimum Code to Pass

Step 3.1: Write the Simplest Code That Passes

Step 3.2: Keep It Minimal

Phase 4: Verify GREEN -- Confirm All Tests Pass

Step 4.1: Run the Test

Step 4.2: Run Related Tests

Step 4.3: Confirm Clean Pass

Phase 5: REFACTOR -- Improve Without Changing Behavior

Step 5.1: Identify Improvements

Step 5.2: Refactor in Small Steps

Step 5.3: Verify Final State

Step 5.4: Track Progress

Commit Frequency

Common Rationalizations

Red Flags -- Stop and Restart

Example: TDD Cycle for a Bug Fix

RED

Verify RED

GREEN

Verify GREEN

REFACTOR

Verification Checklist

When Stuck

References

Acceptance Tests

TDD

Iron Laws

Why TDD for AI Agents

When to Use

When NOT to Use

Never Do

Gate Enforcement

Phase 0: Setup -- Decompose and Plan

Step 0.1: Decompose into Testable Behaviors

Step 0.2: Create Test File with Roadmap

Phase 1: RED -- Write a Failing Test

Step 1.1: Identify the Next Behavior

Step 1.2: Write ONE Minimal Failing Test

Step 1.3: Confirm Test is Written

Phase 2: Verify RED -- Confirm the Test Fails

Step 2.1: Run the Test

Step 2.2: Confirm Failure Reason

Phase 3: GREEN -- Write Minimum Code to Pass

Step 3.1: Write the Simplest Code That Passes

Step 3.2: Keep It Minimal

Phase 4: Verify GREEN -- Confirm All Tests Pass

Step 4.1: Run the Test

Step 4.2: Run Related Tests

Step 4.3: Confirm Clean Pass

Phase 5: REFACTOR -- Improve Without Changing Behavior

Step 5.1: Identify Improvements

Step 5.2: Refactor in Small Steps

Step 5.3: Verify Final State

Step 5.4: Track Progress

Commit Frequency

Common Rationalizations

Red Flags -- Stop and Restart

Example: TDD Cycle for a Bug Fix

RED

Verify RED

GREEN