| name | tdd-workflow |
| description | Spec-driven TDD full-cycle skill. Triggered when implementing features: Interactive requirements gathering -> UseCase documentation -> Test plan -> TDD implementation (unit -> integration -> E2E) -> Regression verification -> Issue tracking -> Delivery. Trigger words: implement, new feature, develop, add, build, I want to, help me build |
| user-invocable | true |
| allowed-tools | Read, Write, Edit, Bash, Glob, Grep, Agent |
| metadata | {"version":"2.4.7","compatible":"claude-code, cursor, cline, windsurf, codebuddy, github-copilot","hooks":"Installed to .claude/hooks/tdd/ via tdd-workflow init. See .claude/settings.json for registration."} |
TDD Workflow — Spec-Driven Full-Cycle Development
Command Overview
| Command | Purpose |
|---|---|
| /tdd:new <name> | Start a new feature with interactive requirements gathering (collects the UC framework) |
| /tdd:ff <name> | UseCase-first: generate usecases.md as the primary output, then derive requirements → design → tasks from it |
| /tdd:change | Mid-course requirement change: analyze impact (UseCase dimension first), sync all 4 docs |
| /tdd:spec | Generate/update spec documents individually |
| RED / GREEN / REFACTOR phases | Phase markers used inside /tdd:loop; not separate slash commands. See "Loop-internal phases" below for the rules each phase enforces. |
| /tdd:loop | Auto-cycle red -> green -> refactor until Phase 2 is complete |
| /tdd:e2e | Derive E2E tests from usecases.md paths (each UC path → one E2E test) |
| /tdd:verify-setup | Interactive project-level verify config (tdd-specs/.verify/project.md) |
| /tdd:verify-local | Interactive personal verify params (tdd-specs/.verify/project.local.md, gitignored) |
| /tdd:cleanup [env] | Manual cleanup — run pre_verify_cleanup without running verification itself |
| /tdd:done | 4-stage verification: code checks → local E2E → staging → delivery (includes UC sync to paths.usecases.dir, default docs/usecases/) |
| /tdd:notes | Generate TDD practice notes — record decisions, pitfalls, lessons learned |
| /tdd:bug | Bug-fix workflow: report -> analyze -> test -> fix -> verify |
| /tdd:continue <name> | Resume an in-progress feature |
| /tdd:archive | Archive completed specs (warns if usecases.md is not synced to docs/) |
Project Context Detection
Before starting any feature, detect project structure to determine test framework and directory conventions:
```bash
ls package.json pyproject.toml Cargo.toml go.mod pom.xml build.gradle 2>/dev/null | head -5
grep -E '"test"|"jest"|"vitest"|"pytest"|"mocha"' package.json 2>/dev/null || true
ls -d src/ app/ lib/ tests/ test/ spec/ __tests__/ 2>/dev/null || true
```
Adapt all subsequent commands based on detection results:
- Test commands: npm test / npx jest / npx vitest / pytest / go test ./... / mvn test / cargo test, etc.
- Test directories: test/ / tests/ / __tests__/ / spec/, etc.
- Source directories: src/ / app/ / lib/, etc.
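The detection steps above can be sketched programmatically. The following is a minimal illustration only; the manifest-to-command mapping is a partial assumption, not the skill's exhaustive logic:

```python
import json
from pathlib import Path

def detect_test_command(project_dir: str) -> str:
    """Best-effort guess of the project's test command from its manifest files."""
    root = Path(project_dir)
    pkg = root / "package.json"
    if pkg.exists():
        manifest = json.loads(pkg.read_text())
        deps = {**manifest.get("dependencies", {}),
                **manifest.get("devDependencies", {})}
        if "vitest" in deps:
            return "npx vitest run"
        if "jest" in deps:
            return "npx jest"
        return "npm test"          # fall back to the package's own test script
    if (root / "pyproject.toml").exists():
        return "pytest"
    if (root / "go.mod").exists():
        return "go test ./..."
    if (root / "Cargo.toml").exists():
        return "cargo test"
    return "unknown"               # ask the user instead of guessing
```

All subsequent loop/verify steps would then use the detected command instead of a hard-coded one.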
/tdd:new <name>
Start a new feature.
- If no <name> is given, ask the user what they want to build; derive a kebab-case name from the description
- Create the tdd-specs/<name>/ directory and write the name to tdd-specs/.current
- Enter requirements gathering (cannot be skipped)
Requirements gathering — all dimensions must be covered:
| Dimension | Question |
|---|---|
| Target users | Who will use this? (based on actual project roles) |
| Core scenarios | Top 1-3 most important use cases? |
| Input/Output | What does the user input? What does the system return? |
| Error handling | What situations cause failure? Expected error behavior? |
| Scope boundaries | What is explicitly out of scope? |
| Acceptance criteria | How do we know it's done? |
After each round of Q&A, reflect understanding back to user for confirmation. Scope must be confirmed before proceeding.
Output after collection:
- Feature name and path: tdd-specs/<name>/
- Confirmation summary (user stories + acceptance criteria)
- Prompt: Run /tdd:ff to generate all spec docs at once, or /tdd:spec for step-by-step
/tdd:ff <name>
Fast-forward: generate requirements -> design -> tasks in one shot.
- If tdd-specs/<name>/ doesn't exist, first run /tdd:new requirements gathering
- Step 1: Review known Issues (if the project uses an issues directory; path from paths.issues.dir in tdd-specs/.verify/project.md, defaults to docs/issues)

```bash
ls ${ISSUES_DIR}/*.md 2>/dev/null | grep -v README || echo "No issues directory, skipping"
grep -rl "<feature-keywords>" ${ISSUES_DIR}/ 2>/dev/null || true
```

- Step 2: Update UseCase docs (target dir from paths.usecases.dir, default docs/usecases/; external-tool mode prompts manual sync)
- Step 3: Generate tdd-specs/<name>/requirements.md
- Step 4: Generate tdd-specs/<name>/design.md (incorporating the actual project tech stack)
- Step 5: Generate tdd-specs/<name>/tasks.md (using the actual project test commands and paths)
CRITICAL — Vertical Slice Rule (mandatory):
Phase 2 tasks MUST be organized by UC (vertical slice), and each UC MUST include tasks for ALL technical layers it touches:
| Layer | Include when... | Example tasks |
|---|---|---|
| Database migration | UC introduces new table/field | CREATE TABLE, execute migration to local DB |
| Backend service | UC has business logic | service unit test + implementation |
| Backend controller | UC has API endpoint | controller/route test |
| Frontend page | UC actor is end-user (browser/miniprogram) | page JS/HTML/CSS implementation |
| Client app | UC involves client-side processing | Python/native test + implementation |
FORBIDDEN: Separating frontend into a standalone "Phase 4". All layers of a UC belong together in Phase 2.
Exception: Pure infrastructure setup (Phase 1: creating directories, Entity skeletons, module registration) is allowed as a separate phase since it's shared across all UCs.
Database migration execution rule: Phase 1 must include actually running the migration on local dev DB (not just writing the SQL file). After migration, run schema:dump if the project has that script.
- Step 6: Test coverage check (mandatory, cannot skip)
After generating tasks.md, immediately verify all 3 test layers have tasks. If any layer has 0, proactively add before continuing:
| Layer | Check | Gap-fill direction |
|---|---|---|
| Unit tests | tasks.md has tasks with "unit test:" prefix | Add pure function unit tests for core business logic |
| Integration tests | tasks.md has "integration test:" prefix tasks covering key HTTP endpoint chains + DB write verification | Add: POST /api/xxx full chain (request -> response -> DB state); concurrency safety; permission boundaries (4xx for unauthorized roles) |
| E2E | Phase 3 has E2E tasks | Add key user flow end-to-end verification |
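The Step 6 layer check can be expressed as a small scan over tasks.md. This is a sketch only, assuming tasks use the literal "unit test:" / "integration test:" / E2E prefixes described above:

```python
def check_test_layers(tasks_md: str) -> list[str]:
    """Return the test layers that have zero tasks in tasks.md (gap-fill candidates)."""
    lines = [line.lower() for line in tasks_md.splitlines()]
    layers = {
        "unit": "unit test:",
        "integration": "integration test:",
        "e2e": "e2e",
    }
    # A layer is missing if no task line mentions its marker
    return [name for name, marker in layers.items()
            if not any(marker in line for line in lines)]
```

If the returned list is non-empty, tasks for those layers must be added before continuing to Step 7.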
- Show summary, wait for confirmation
Output format:
OK requirements.md — N requirements, N acceptance criteria
OK design.md — N modules, N interfaces
OK tasks.md — Phase 1: N items / Phase 2: N items (by UC, all layers) / Phase 3: N E2E items
Issues reviewed: <related IDs or "none">
Ready! Run /tdd:loop to start TDD implementation.
/tdd:spec
Generate or update spec documents individually. Same as /tdd:ff Steps 1-6, but checks existing files and asks whether to overwrite.
Loop-internal phases
The three phases below are NOT standalone slash commands. They are phase markers enforced inside /tdd:loop (and inside /tdd:bug when fixing a bug). Users do not invoke /tdd:red directly — the loop transitions through these phases automatically for each Phase 2 task.
The rules in each phase are the contract between the loop and the Coder/Reviewer it spawns.
RED phase
Write a failing test (TDD Red phase).
- Pick the next [ ] Phase 2 task from tasks.md
- Write the test file (rules: one test at a time, test behavior not mocks, name describes behavior)
- Run it immediately to verify it actually fails (using the project's actual test command)
- Confirm the failure reason is "feature not implemented", not a syntax error
- Mark the task as [~]
Failure verification triple-check:
- The test fails (not errors out)
- Failure message matches expectation
- Failure is due to missing functionality, not typo
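The first two checks can be approximated by inspecting the runner's output. The marker strings below are heuristics for pytest/jest-style output, not an official parser; adapt them to the project's actual test runner:

```python
def classify_red_failure(output: str) -> str:
    """Triple-check helper: distinguish a genuine RED failure from a broken test."""
    # Markers that indicate the test itself is broken (typo, missing import)
    error_markers = ("SyntaxError", "ImportError", "ModuleNotFoundError",
                     "NameError", "Cannot find module")
    if any(m in output for m in error_markers):
        return "error"    # test errored out: fix the test, not the code
    if "AssertionError" in output or "FAILED" in output:
        return "failure"  # genuine RED: feature not implemented yet
    return "unknown"      # e.g. all tests passed, so the test asserts nothing
```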
GREEN phase
Write minimum code to pass current test (TDD Green phase).
Step 0 (mandatory): Check Issues first (if project has issues tracking; path from paths.issues.dir)
```bash
grep -rl "<error-keywords>" ${ISSUES_DIR}/ 2>/dev/null || echo "No existing records"
```
- Write only the minimum code to pass the test — no premature abstraction
- Run tests (using project's actual command)
- All green (including existing tests, no regressions allowed) to mark complete
- Mark the task as [x]
Three-Strike Protocol — triggered when same test fails 3 times:
WARNING: Three-Strike Protocol
Test: <test-name>
Attempt history:
1. <approach> -> <error>
2. <approach> -> <error>
3. <approach> -> <error>
Issues search result: <found/none>
Please choose:
A. Try a different approach (describe your idea)
B. Split into smaller test granularity
C. Mark [!] skip, move to next
D. Need more context
REFACTOR phase
Refactor (only when all tests are green).
- Eliminate duplication, improve naming, extract shared logic
- Run tests after each small change
- Follow project's existing conventions (reference lint/style config and issues prevention notes)
/tdd:loop
Auto-cycle until all implementation tasks (Phase 1 + Phase 2) are fully complete.
```
WHILE tasks.md has ANY [ ] or [~] task (regardless of Phase):
  IF current task is Phase 1 (infrastructure):
    Execute directly (no RED/GREEN cycle needed for migrations, scaffolding)
    VERIFY, then mark [x]
  ELSE (implementation task — any Phase):
    IF task is a "unit test" task:
      RED phase -> write the failing test
    IF task is an "implement" task:
      GREEN phase -> implement to pass (with issues lookup)
    IF task is a frontend page task:
      Write page files directly (js/wxml/wxss/json for miniprogram, tsx for React)
    VERIFY, then mark [x]
    REFACTOR phase -> refactor (if applicable)
  IF the same test fails 3 times:
    STOP -> Three-Strike Protocol -> await decision

IF ALL tasks across ALL Phases are [x]:
  Run the full test suite (project's actual command)
  Output: completion report (N tests, Xs elapsed)
  Prompt: Run /tdd:e2e for E2E acceptance
```
Marking [x] Verification Protocol (MANDATORY — cannot bypass):
Before marking ANY task [x], you MUST verify with evidence:
| Task type | Required evidence before [x] |
|---|---|
| Unit test | Test file exists + jest/pytest run shows it passes |
| Implementation | Source file exists + related tests pass |
| Frontend page | All page files exist (js/wxml/wxss/json or tsx) + registered in app config |
| Database migration | SHOW TABLES confirms new tables exist |
| Any task | FORBIDDEN: marking [x] with "待后续" ("defer to later"), "TODO", or "skip" in the same line. Use [!] for blocked tasks. |
If you cannot complete a task, you MUST:
- Mark it [!] with a documented blocker reason, or
- Keep it as [ ] and ask the user for guidance
- NEVER mark [x] for unfinished work
Key behavior: The loop processes ALL layers within each UC (backend test → backend impl → frontend page) before moving to the next UC. This ensures each UC is fully deliverable when its tasks complete.
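The "no deferral words on a completed task" rule is mechanically checkable. A minimal sketch of that scan, using the same forbidden words the protocol names above:

```python
def forbidden_done_marks(tasks_md: str) -> list[str]:
    """Flag tasks marked [x] that still carry deferral words in the same line.

    The forbidden words mirror the protocol above; "待后续" means "defer to later".
    """
    forbidden = ("待后续", "TODO", "skip")
    return [line for line in tasks_md.splitlines()
            if "[x]" in line and any(word in line for word in forbidden)]
```

Any non-empty result means the loop marked work complete that is not actually finished; those lines must be reverted to [!] or [ ].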
Task completeness scan (runs once at loop start, cannot skip):
| Scenario Type | Check | Prompt |
|---|---|---|
| DB migration executed | Phase 1 has migration task AND local DB has the new tables? | Migration SQL exists but was never executed. Run it now and verify with SHOW TABLES. |
| Real DB integration test | At least 1 test in tasks.md connects to real DB (not all mocked)? | All tests use mock repositories. Add at least 1 integration test that writes to real DB and reads back to verify schema correctness. |
| Error response parsing | Tests for "response structure doesn't match expected"? | Missing error response parsing test, suggest adding |
| Crash/restart recovery | Tests for "state recovery after process restart"? | Missing crash recovery test, suggest adding |
| External URL/Host changes | Tests for "external resource URL host mismatch"? | Missing URL host rewrite test, suggest adding |
| Network timeout | Tests for "return False/empty instead of throwing on timeout"? | Missing timeout handling test, suggest adding |
| Integration tests (HTTP chain) | tasks.md has "integration test:" tasks covering key endpoint HTTP request -> response -> DB write chain? | Missing integration test tasks, suggest adding: key endpoint e2e chain (with DB state verification), concurrency safety, permission boundaries |
DB Migration Verification Protocol (mandatory when Phase 1 has migration tasks):
When processing a Phase 1 migration task, the loop MUST:
- Execute the SQL file against local dev DB
- Verify tables/columns exist: SHOW TABLES LIKE '<pattern>' or equivalent
- Run schema:dump if the project has it
- Only mark the task [x] after verification passes
If DB is not running or migration fails → STOP and ask user to fix DB before continuing. Do NOT proceed with mock-only tests and claim "verification passed".
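The execute-then-verify pattern can be sketched as follows. sqlite3 stands in for the real dev DB here, and the table names are illustrative; against MySQL the verification query would be SHOW TABLES instead of sqlite_master:

```python
import sqlite3

def run_and_verify_migration(conn: sqlite3.Connection, migration_sql: str,
                             expected_tables: list[str]) -> bool:
    """Execute a migration, then confirm the tables actually exist before marking [x]."""
    conn.executescript(migration_sql)
    # Query the schema catalog rather than trusting that the script ran
    existing = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")}
    return all(t in existing for t in expected_tables)
```

Only a True result justifies marking the migration task [x]; False means stop and surface the discrepancy to the user.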
/tdd:e2e
Phase 3: E2E acceptance tests.
Pre-check (based on project's actual service address):
```bash
curl -s http://localhost:<PORT>/health 2>/dev/null || echo "WARNING: Service not running, please start dev server first"
```
- Detect project's E2E framework (Playwright, Cypress, Selenium, etc.)
- Add acceptance test cases in project's E2E test directory (following existing test structure)
- Run E2E tests (using project's actual command)
- Fix failures (Three-Strike Protocol applies)
- Update tasks.md Phase 3 status
E2E Hard Rules:
Rule 1: Must cover the real network layer
FORBIDDEN (mutates app state directly, bypassing the network layer):
page.evaluate(() => { window.__store__.state.status = 'done' })
CORRECT (drives the real UI and asserts on the visible result):
await page.click('[data-testid="submit-btn"]')
await expect(page.locator('[data-testid="success-msg"]')).toBeVisible()
Rule 2: Skipped tests must have documented reasons
FORBIDDEN (vague, unactionable reason):
test.skip('env not supported')
CORRECT (specific reason with a restore condition):
test.skip('Step N: [specific reason], restore after resolution')
If accumulated skips exceed 3, a mock/stub environment must be established to resolve them; no more skip stacking.
Rule 3: Assert results after every key action
Rule 4: Assert specific values for critical business fields
/tdd:done
Phase 4: Delivery. Every check must pass before continuing.
- Compilation verification (mandatory for compiled languages: TypeScript, Java, Go, Rust, etc.)
- Full unit tests with coverage (project's actual command) — coverage >= 80% (or project target)
- Full regression (if project has regression scripts)
- E2E (if applicable)
- Issue tracking judgment — must create Issue document if ANY:
- Bug fix took > 5 minutes
- Same type of error occurred more than once
- Fix spans 2+ files
- Delivery checklist:
[ ] Compilation clean (if applicable)
[ ] All tests passing, coverage >= 80% (or project target)
[ ] Full regression passing (if project has regression scripts)
[ ] E2E tests passing (if applicable)
[ ] Feature docs updated (if project has usecases/docs directory)
[ ] Issues logged (if qualifying bugs found)
[ ] Environment variable examples synced (if new env vars added)
[ ] tdd-specs/<name>/tasks.md all [x]
- Output delivery report
- Prompt to run /tdd:notes to capture practice notes, then /tdd:archive
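The coverage gate in the checklist can be approximated by parsing the runner's textual summary. This sketch assumes a pytest-cov style "TOTAL ... NN%" summary line; it is a heuristic, not a substitute for the runner's own threshold options:

```python
import re

def passes_coverage_gate(report: str, target: float = 80.0) -> bool:
    """Compare the overall coverage percentage in a text summary to the project target."""
    for line in report.splitlines():
        if line.strip().startswith("TOTAL"):
            # Match only the number directly attached to a percent sign
            m = re.search(r"(\d+(?:\.\d+)?)\s*%", line)
            if m:
                return float(m.group(1)) >= target
    return False  # no summary line found: fail closed, do not assume coverage
```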
/tdd:notes
Generate TDD practice notes — record the full development story.
When to use: after /tdd:done, or at any point you want to capture the development journey.
- Read requirements.md, design.md, tasks.md from the current spec
- Scan git history for feature-related commits, reverts, fix iterations
- Generate tdd-specs/<name>/tdd-practice-notes.md covering:
  - Background: what the user wanted
  - TDD process: Phase 1-4 record; each behavior chain's RED/GREEN/integration steps
  - Pitfalls: real problems encountered (from git history, not hypothetical)
  - File inventory: new/modified files + test coverage numbers
  - Key lessons: 3-5 actionable lessons learned
- Cross-check: pitfalls ≥ 1, lessons ≥ 3, file list matches git diff
Guardrails:
- Must read spec docs first — don't fabricate from memory
- Pitfalls must reference actual problems (git reverts, fix commits, user reports)
- Lessons must be actionable ("do X" / "avoid Y"), not vague ("testing is important")
/tdd:bug
Bug fix full workflow: traceable from problem description to Issue archive.
- Collect bug info (symptoms, module, reproduction steps, severity)
- Create Issue document (status: "investigating")
- Root cause analysis — no guessing, find root cause first; check existing issues to avoid duplicate work
- Write reproduction test (RED) — choose test layer based on bug type:
- Business logic error -> Unit test
- API error -> Integration test
- UI/interaction issue -> E2E
- Fix code (GREEN) — minimum code to pass, then full regression
- Complete Issue documentation (root cause, fix, verification steps, prevention measures)
- Run E2E verification (if UI/end-to-end flow involved)
Three-Strike Protocol applies if reproduction test fails 3 times.
/tdd:change
Mid-course requirement change flow.
- Confirm the current spec
- Collect the change description (interactive if not provided)
- Analyze impact across all 3 spec docs
Output impact assessment:
## Change Impact Assessment
### Affected spec entries
| Document | Entry | Impact Type | Description |
### Affected tasks
| Task | Current Status | Action Needed |
### Risk notes
- Completed tasks affected: N
- Estimated additional work: small / medium / large
- Wait for user confirmation before modifying anything
- Execute updates (requirements.md, design.md, tasks.md, UseCases if applicable)
  - Completed but affected tasks: revert to [ ] with note <- needs redo due to requirement change
- Output change summary
Guardrail: If change causes 10+ task reverts, suggest considering a fresh /tdd:ff
/tdd:continue <name>
Resume in-progress feature.
- Read tdd-specs/<name>/tasks.md
- Find the first [ ], [~], or [!] task
- Write the name to tdd-specs/.current and resume from the corresponding phase
- Output recovery summary (completed N/M tasks, current phase, next step)
/tdd:archive
Archive completed specs.
- Verify all tasks are [x] — if not, stop and prompt to complete /tdd:done first
- Check for tdd-practice-notes.md — if missing, prompt to run /tdd:notes first (recommend, don't block)
- Move the spec to tdd-specs/archive/<YYYY-MM>/
- Clear tdd-specs/.current
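The archive destination is deterministic from the feature name and current month. A trivial sketch of that path computation:

```python
from datetime import date
from pathlib import Path

def archive_path(feature: str, today: date) -> Path:
    """Destination for a completed spec: tdd-specs/archive/<YYYY-MM>/<feature>/."""
    return Path("tdd-specs") / "archive" / today.strftime("%Y-%m") / feature
```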
File Structure Convention
```
tdd-specs/
+-- .current                   <- Currently active spec name
+-- <feature-name>/
|   +-- requirements.md        <- Requirements (EARS format)
|   +-- design.md              <- Technical design
|   +-- tasks.md               <- Implementation checklist (live-updated)
|   +-- tdd-practice-notes.md  <- Practice record (generated by /tdd:notes)
+-- archive/
    +-- YYYY-MM/
        +-- <completed-feature>/
```
Task Status Markers
| Marker | Meaning |
|---|---|
| [ ] | Not started |
| [~] | In progress (RED written, GREEN incomplete) |
| [x] | Completed |
| [!] | Blocked (Three-Strike Protocol, awaiting decision) |
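These markers make progress reporting (as used by /tdd:continue's recovery summary) a simple line scan. A minimal sketch, assuming tasks are markdown list items beginning with "- [marker]":

```python
from collections import Counter

def task_progress(tasks_md: str) -> Counter:
    """Count tasks per status marker for a recovery/progress summary."""
    counts = Counter()
    for line in tasks_md.splitlines():
        stripped = line.strip()
        for marker in ("[ ]", "[~]", "[x]", "[!]"):
            if stripped.startswith(f"- {marker}"):
                counts[marker] += 1
                break  # one marker per task line
    return counts
```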
Mandatory Issues Lookup Timing
| Timing | Method |
|---|---|
| Before /tdd:ff or /tdd:spec | Browse the project issues directory (if it exists) |
| Before each GREEN phase (inside /tdd:loop) | grep -rl "<error-keywords>" <issues-dir>/ |
| After Three-Strike Protocol triggers | Full-text search + module filter |
Not Applicable For
- Single-line typo fixes
- Pure documentation/configuration changes
- Simple style adjustments
Use direct git commit for these instead.
Post-Delivery Development Rules
After /tdd:done, the harness enters the deliver state. From then on, any change under src/ must follow the rules below, or test debt accumulates quickly.
Scenario A: Bug found during integration testing
Editing code directly is forbidden; the fix must go through /tdd:bug:
Bug found
→ /tdd:bug (write reproduction test RED → fix code GREEN → record Issue)
→ Commit after full regression passes
Even if the bug looks tiny (a one-line fix), a reproduction test must come first. Editing code directly gives no way to confirm the scope of the fix or to prevent regressions.
Scenario B: Adding functionality after spec delivery
For logic added only after delivery, such as payment integration or state transitions:

```bash
sed -i '' 's/phase=deliver/phase=green/' tdd-specs/<spec>/.harness
```

Modifying implementation code in the deliver phase without adding tests is not allowed.
Scenario C: Pure style / UX / config adjustments
These may be changed directly, but:
- Mark the commit message with [style] / [ux] / [config]
- Run the full test suite afterward to confirm no regressions
/tdd:done check 8: post-delivery change audit
Runs after check 7 (environment variables):

```bash
git log --oneline --name-only -- 'backend/src/**' 'miniprogram/pages/**' \
  | grep -v "^[a-f0-9]" | sort -u | head -30
```

Cross-check against tasks.md to confirm:
- Every new service method → has unit test coverage
- Every new/modified API endpoint → has e2e test coverage
- Every bug fixed during integration testing → has a corresponding Issue record
If uncovered logic is found → stop delivery, add the missing tests, and rerun /tdd:done.
Pre-commit self-check list
□ Does every new service method have a unit test?
□ Does every new/modified API endpoint have an e2e test?
□ Integrated an external service (WeChat / Tencent Cloud / COS)? → Is there a mock test?
□ Was the full jest suite run? (not just a single module)
□ Any bug fixes? → Is there an Issue record and a reproduction test?