| name | tdd-workflow |
| description | Spec-driven TDD full-cycle skill. Triggered when implementing features: Interactive requirements gathering -> UseCase documentation -> Test plan -> TDD implementation (unit -> integration -> E2E) -> Regression verification -> Issue tracking -> Delivery. Trigger words: implement, new feature, develop, add, build, I want to, help me build |
| user-invocable | true |
| allowed-tools | Read, Write, Edit, Bash, Glob, Grep, Agent |
| metadata | {"version":"2.4.7","compatible":"claude-code, cursor, cline, windsurf, codebuddy, github-copilot","hooks":"Installed to .claude/hooks/tdd/ via tdd-workflow init. See .claude/settings.json for registration."} |
TDD Workflow — Spec-Driven Full-Cycle Development
Command Overview
| Command | Purpose |
|---|---|
| /tdd:new <name> | Start a new feature with interactive requirements gathering (collects the UC framework) |
| /tdd:ff <name> | UseCase-first: generate usecases.md as the primary output, then derive requirements → design → tasks from it |
| /tdd:change | Mid-course requirement change: analyze impact (UseCase dimension first), sync all 4 docs |
| /tdd:spec | Generate/update spec documents individually |
| RED / GREEN / REFACTOR phases | Phase markers used inside /tdd:loop; not separate slash commands. See "Loop-internal phases" below for the rules each phase enforces. |
| /tdd:loop | Auto-cycle red -> green -> refactor until Phase 2 is complete |
| /tdd:e2e | Derive E2E tests from usecases.md paths (each UC path → one E2E test) |
| /tdd:verify-setup | Interactive project-level verify config (tdd-specs/.verify/project.md) |
| /tdd:verify-local | Interactive personal verify params (tdd-specs/.verify/project.local.md, gitignored) |
| /tdd:cleanup [env] | Manual cleanup — run pre_verify_cleanup without running verification itself |
| /tdd:done | 4-stage verification: code checks → local E2E → staging → delivery (includes UC sync to paths.usecases.dir, default docs/usecases/) |
| /tdd:notes | Generate TDD practice notes — record decisions, pitfalls, lessons learned |
| /tdd:bug | Bug-fix workflow: report -> analyze -> test -> fix -> verify |
| /tdd:continue <name> | Resume an in-progress feature |
| /tdd:archive | Archive completed specs (warns if usecases.md is not synced to docs/) |
Project Context Detection
Before starting any feature, detect project structure to determine test framework and directory conventions:
```bash
ls package.json pyproject.toml Cargo.toml go.mod pom.xml build.gradle 2>/dev/null | head -5
grep -E '"test"|"jest"|"vitest"|"pytest"|"mocha"' package.json 2>/dev/null || true
ls -d src/ app/ lib/ tests/ test/ spec/ __tests__/ 2>/dev/null || true
```
Adapt all subsequent commands based on detection results:
- Test commands: npm test / npx jest / npx vitest / pytest / go test ./... / mvn test / cargo test, etc.
- Test directories: test/ / tests/ / __tests__/ / spec/, etc.
- Source directories: src/ / app/ / lib/, etc.
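The detection steps above can be sketched programmatically. The following is a minimal illustration only; the manifest-to-command mapping is a partial assumption, not the skill's exhaustive logic:

```python
import json
from pathlib import Path

def detect_test_command(project_dir: str) -> str:
    """Best-effort guess of the project's test command from its manifest files."""
    root = Path(project_dir)
    pkg = root / "package.json"
    if pkg.exists():
        manifest = json.loads(pkg.read_text())
        deps = {**manifest.get("dependencies", {}),
                **manifest.get("devDependencies", {})}
        if "vitest" in deps:
            return "npx vitest run"
        if "jest" in deps:
            return "npx jest"
        return "npm test"          # fall back to the package's own test script
    if (root / "pyproject.toml").exists():
        return "pytest"
    if (root / "go.mod").exists():
        return "go test ./..."
    if (root / "Cargo.toml").exists():
        return "cargo test"
    return "unknown"               # ask the user instead of guessing
```

All subsequent loop/verify steps would then use the detected command instead of a hard-coded one.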
/tdd:new <name>
Start a new feature.
- If no <name> is given, ask the user what they want to build; derive a kebab-case name from the description
- Create the tdd-specs/<name>/ directory and write the name to tdd-specs/.current
- Enter requirements gathering (cannot be skipped)
Requirements gathering — all dimensions must be covered:
| Dimension | Question |
|---|---|
| Target users | Who will use this? (based on actual project roles) |
| Core scenarios | Top 1-3 most important use cases? |
| Input/Output | What does the user input? What does the system return? |
| Error handling | What situations cause failure? Expected error behavior? |
| Scope boundaries | What is explicitly out of scope? |
| Acceptance criteria | How do we know it's done? |
After each round of Q&A, reflect understanding back to user for confirmation. Scope must be confirmed before proceeding.
Output after collection:
- Feature name and path: tdd-specs/<name>/
- Confirmation summary (user stories + acceptance criteria)
- Prompt: Run /tdd:ff to generate all spec docs at once, or /tdd:spec for step-by-step
/tdd:ff <name>
Fast-forward: generate requirements -> design -> tasks in one shot.
- If tdd-specs/<name>/ doesn't exist, first run /tdd:new requirements gathering
- Step 1: Review known Issues (if the project uses an issues directory; path from paths.issues.dir in tdd-specs/.verify/project.md, defaults to docs/issues)

```bash
ls ${ISSUES_DIR}/*.md 2>/dev/null | grep -v README || echo "No issues directory, skipping"
grep -rl "<feature-keywords>" ${ISSUES_DIR}/ 2>/dev/null || true
```

- Step 2: Update UseCase docs (target dir from paths.usecases.dir, default docs/usecases/; external-tool mode prompts manual sync)
- Step 3: Generate tdd-specs/<name>/requirements.md
- Step 4: Generate tdd-specs/<name>/design.md (incorporating the actual project tech stack)
- Step 5: Generate tdd-specs/<name>/tasks.md (using the actual project test commands and paths)
CRITICAL — Vertical Slice Rule (mandatory):
Phase 2 tasks MUST be organized by UC (vertical slice), and each UC MUST include tasks for ALL technical layers it touches:
| Layer | Include when... | Example tasks |
|---|---|---|
| Database migration | UC introduces new table/field | CREATE TABLE, execute migration to local DB |
| Backend service | UC has business logic | service unit test + implementation |
| Backend controller | UC has API endpoint | controller/route test |
| Frontend page | UC actor is end-user (browser/miniprogram) | page JS/HTML/CSS implementation |
| Client app | UC involves client-side processing | Python/native test + implementation |
FORBIDDEN: Separating frontend into a standalone "Phase 4". All layers of a UC belong together in Phase 2.
Exception: Pure infrastructure setup (Phase 1: creating directories, Entity skeletons, module registration) is allowed as a separate phase since it's shared across all UCs.
Database migration execution rule: Phase 1 must include actually running the migration on local dev DB (not just writing the SQL file). After migration, run schema:dump if the project has that script.
- Step 6: Test coverage check (mandatory, cannot skip)
After generating tasks.md, immediately verify all 3 test layers have tasks. If any layer has 0, proactively add before continuing:
| Layer | Check | Gap-fill direction |
|---|---|---|
| Unit tests | tasks.md has tasks with "unit test:" prefix | Add pure function unit tests for core business logic |
| Integration tests | tasks.md has "integration test:" prefix tasks covering key HTTP endpoint chains + DB write verification | Add: POST /api/xxx full chain (request -> response -> DB state); concurrency safety; permission boundaries (4xx for unauthorized roles) |
| E2E | Phase 3 has E2E tasks | Add key user flow end-to-end verification |
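The Step 6 layer check can be expressed as a small scan over tasks.md. This is a sketch only, assuming tasks use the literal "unit test:" / "integration test:" / E2E prefixes described above:

```python
def check_test_layers(tasks_md: str) -> list[str]:
    """Return the test layers that have zero tasks in tasks.md (gap-fill candidates)."""
    lines = [line.lower() for line in tasks_md.splitlines()]
    layers = {
        "unit": "unit test:",
        "integration": "integration test:",
        "e2e": "e2e",
    }
    # A layer is missing if no task line mentions its marker
    return [name for name, marker in layers.items()
            if not any(marker in line for line in lines)]
```

If the returned list is non-empty, tasks for those layers must be added before continuing to Step 7.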
- Show summary, wait for confirmation
Output format:
OK requirements.md — N requirements, N acceptance criteria
OK design.md — N modules, N interfaces
OK tasks.md — Phase 1: N items / Phase 2: N items (by UC, all layers) / Phase 3: N E2E items
Issues reviewed: <related IDs or "none">
Ready! Run /tdd:loop to start TDD implementation.
/tdd:spec
Generate or update spec documents individually. Same as /tdd:ff Steps 1-6, but checks existing files and asks whether to overwrite.
Loop-internal phases
The three phases below are NOT standalone slash commands. They are phase markers enforced inside /tdd:loop (and inside /tdd:bug when fixing a bug). Users do not invoke /tdd:red directly — the loop transitions through these phases automatically for each Phase 2 task.
The rules in each phase are the contract between the loop and the Coder/Reviewer it spawns.
RED phase
Write a failing test (TDD Red phase).
- Pick the next [ ] Phase 2 task from tasks.md
- Write the test file (rules: one test at a time, test behavior not mocks, name describes behavior)
- Run it immediately to verify it actually fails (using the project's actual test command)
- Confirm the failure reason is "feature not implemented", not a syntax error
- Mark the task as [~]
Failure verification triple-check:
- The test fails (not errors out)
- Failure message matches expectation
- Failure is due to missing functionality, not typo
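The first two checks can be approximated by inspecting the runner's output. The marker strings below are heuristics for pytest/jest-style output, not an official parser; adapt them to the project's actual test runner:

```python
def classify_red_failure(output: str) -> str:
    """Triple-check helper: distinguish a genuine RED failure from a broken test."""
    # Markers that indicate the test itself is broken (typo, missing import)
    error_markers = ("SyntaxError", "ImportError", "ModuleNotFoundError",
                     "NameError", "Cannot find module")
    if any(m in output for m in error_markers):
        return "error"    # test errored out: fix the test, not the code
    if "AssertionError" in output or "FAILED" in output:
        return "failure"  # genuine RED: feature not implemented yet
    return "unknown"      # e.g. all tests passed, so the test asserts nothing
```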
GREEN phase
Write minimum code to pass current test (TDD Green phase).
Step 0 (mandatory): Check Issues first (if project has issues tracking; path from paths.issues.dir)
```bash
grep -rl "<error-keywords>" ${ISSUES_DIR}/ 2>/dev/null || echo "No existing records"
```
- Write only the minimum code to pass the test — no premature abstraction
- Run tests (using project's actual command)
- All green (including existing tests, no regressions allowed) to mark complete
- Mark the task as [x]
Three-Strike Protocol — triggered when same test fails 3 times:
WARNING: Three-Strike Protocol
Test: <test-name>
Attempt history:
1. <approach> -> <error>
2. <approach> -> <error>
3. <approach> -> <error>
Issues search result: <found/none>
Please choose:
A. Try a different approach (describe your idea)
B. Split into smaller test granularity
C. Mark [!] skip, move to next
D. Need more context
REFACTOR phase
Refactor (only when all tests are green).
- Eliminate duplication, improve naming, extract shared logic
- Run tests after each small change
- Follow project's existing conventions (reference lint/style config and issues prevention notes)
/tdd:loop
Auto-cycle until all implementation tasks (Phase 1 + Phase 2) are fully complete.
```
WHILE tasks.md has ANY [ ] or [~] task (regardless of Phase):
  IF current task is Phase 1 (infrastructure):
    Execute directly (no RED/GREEN cycle needed for migrations, scaffolding)
    VERIFY, then mark [x]
  ELSE (implementation task — any Phase):
    IF task is a "unit test" task:
      RED phase -> write the failing test
    IF task is an "implement" task:
      GREEN phase -> implement to pass (with issues lookup)
    IF task is a frontend page task:
      Write page files directly (js/wxml/wxss/json for miniprogram, tsx for React)
    VERIFY, then mark [x]
    REFACTOR phase -> refactor (if applicable)
  IF the same test fails 3 times:
    STOP -> Three-Strike Protocol -> await decision

IF ALL tasks across ALL Phases are [x]:
  Run the full test suite (project's actual command)
  Output: completion report (N tests, Xs elapsed)
  Prompt: Run /tdd:e2e for E2E acceptance
```
Marking [x] Verification Protocol (MANDATORY — cannot bypass):
Before marking ANY task [x], you MUST verify with evidence:
| Task type | Required evidence before [x] |
|---|---|
| Unit test | Test file exists + jest/pytest run shows it passes |
| Implementation | Source file exists + related tests pass |
| Frontend page | All page files exist (js/wxml/wxss/json or tsx) + registered in app config |
| Database migration | SHOW TABLES confirms new tables exist |
| Any task | FORBIDDEN: marking [x] with "待后续" ("defer to later"), "TODO", or "skip" in the same line. Use [!] for blocked tasks. |
If you cannot complete a task, you MUST:
- Mark it [!] with a documented blocker reason, or
- Keep it as [ ] and ask the user for guidance
- NEVER mark [x] for unfinished work
Key behavior: The loop processes ALL layers within each UC (backend test → backend impl → frontend page) before moving to the next UC. This ensures each UC is fully deliverable when its tasks complete.
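The "no deferral words on a completed task" rule is mechanically checkable. A minimal sketch of that scan, using the same forbidden words the protocol names above:

```python
def forbidden_done_marks(tasks_md: str) -> list[str]:
    """Flag tasks marked [x] that still carry deferral words in the same line.

    The forbidden words mirror the protocol above; "待后续" means "defer to later".
    """
    forbidden = ("待后续", "TODO", "skip")
    return [line for line in tasks_md.splitlines()
            if "[x]" in line and any(word in line for word in forbidden)]
```

Any non-empty result means the loop marked work complete that is not actually finished; those lines must be reverted to [!] or [ ].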
Task completeness scan (runs once at loop start, cannot skip):
| Scenario Type | Check | Prompt |
|---|---|---|
| DB migration executed | Phase 1 has migration task AND local DB has the new tables? | Migration SQL exists but was never executed. Run it now and verify with SHOW TABLES. |
| Real DB integration test | At least 1 test in tasks.md connects to real DB (not all mocked)? | All tests use mock repositories. Add at least 1 integration test that writes to real DB and reads back to verify schema correctness. |
| Error response parsing | Tests for "response structure doesn't match expected"? | Missing error response parsing test, suggest adding |
| Crash/restart recovery | Tests for "state recovery after process restart"? | Missing crash recovery test, suggest adding |
| External URL/Host changes | Tests for "external resource URL host mismatch"? | Missing URL host rewrite test, suggest adding |
| Network timeout | Tests for "return False/empty instead of throwing on timeout"? | Missing timeout handling test, suggest adding |
| Integration tests (HTTP chain) | tasks.md has "integration test:" tasks covering key endpoint HTTP request -> response -> DB write chain? | Missing integration test tasks, suggest adding: key endpoint e2e chain (with DB state verification), concurrency safety, permission boundaries |
DB Migration Verification Protocol (mandatory when Phase 1 has migration tasks):
When processing a Phase 1 migration task, the loop MUST:
- Execute the SQL file against local dev DB
- Verify tables/columns exist: SHOW TABLES LIKE '<pattern>' or equivalent
- Run schema:dump if the project has it
- Only mark the task [x] after verification passes
If DB is not running or migration fails → STOP and ask user to fix DB before continuing. Do NOT proceed with mock-only tests and claim "verification passed".
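The execute-then-verify pattern can be sketched as follows. sqlite3 stands in for the real dev DB here, and the table names are illustrative; against MySQL the verification query would be SHOW TABLES instead of sqlite_master:

```python
import sqlite3

def run_and_verify_migration(conn: sqlite3.Connection, migration_sql: str,
                             expected_tables: list[str]) -> bool:
    """Execute a migration, then confirm the tables actually exist before marking [x]."""
    conn.executescript(migration_sql)
    # Query the schema catalog rather than trusting that the script ran
    existing = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")}
    return all(t in existing for t in expected_tables)
```

Only a True result justifies marking the migration task [x]; False means stop and surface the discrepancy to the user.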
/tdd:e2e
Phase 3: E2E acceptance tests.
Pre-check (based on project's actual service address):
```bash
curl -s http://localhost:<PORT>/health 2>/dev/null || echo "WARNING: Service not running, please start dev server first"
```
- Detect project's E2E framework (Playwright, Cypress, Selenium, etc.)
- Add acceptance test cases in project's E2E test directory (following existing test structure)
- Run E2E tests (using project's actual command)
- Fix failures (Three-Strike Protocol applies)
- Update tasks.md Phase 3 status
E2E Hard Rules:
Rule 1: Must cover the real network layer
FORBIDDEN (mutates app state directly, bypassing the network layer):
page.evaluate(() => { window.__store__.state.status = 'done' })
CORRECT (drives the real UI and asserts on the visible result):
await page.click('[data-testid="submit-btn"]')
await expect(page.locator('[data-testid="success-msg"]')).toBeVisible()
Rule 2: Skipped tests must have documented reasons
FORBIDDEN (vague, unactionable reason):
test.skip('env not supported')
CORRECT (specific reason with a restore condition):
test.skip('Step N: [specific reason], restore after resolution')
If accumulated skips exceed 3, a mock/stub environment must be established to resolve them; no more skip stacking.
Rule 3: Assert results after every key action
Rule 4: Assert specific values for critical business fields
/tdd:done
Phase 4: Delivery. Every check must pass before continuing.
- Compilation verification (mandatory for compiled languages: TypeScript, Java, Go, Rust, etc.)
- Full unit tests with coverage (project's actual command) — coverage >= 80% (or project target)
- Full regression (if project has regression scripts)
- E2E (if applicable)
- Issue tracking judgment — must create Issue document if ANY:
- Bug fix took > 5 minutes
- Same type of error occurred more than once
- Fix spans 2+ files
- Delivery checklist:
[ ] Compilation clean (if applicable)
[ ] All tests passing, coverage >= 80% (or project target)
[ ] Full regression passing (if project has regression scripts)
[ ] E2E tests passing (if applicable)
[ ] Feature docs updated (if project has usecases/docs directory)
[ ] Issues logged (if qualifying bugs found)
[ ] Environment variable examples synced (if new env vars added)
[ ] tdd-specs/<name>/tasks.md all [x]
- Output delivery report
- Prompt to run /tdd:notes to capture practice notes, then /tdd:archive
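The coverage gate in the checklist can be approximated by parsing the runner's textual summary. This sketch assumes a pytest-cov style "TOTAL ... NN%" summary line; it is a heuristic, not a substitute for the runner's own threshold options:

```python
import re

def passes_coverage_gate(report: str, target: float = 80.0) -> bool:
    """Compare the overall coverage percentage in a text summary to the project target."""
    for line in report.splitlines():
        if line.strip().startswith("TOTAL"):
            # Match only the number directly attached to a percent sign
            m = re.search(r"(\d+(?:\.\d+)?)\s*%", line)
            if m:
                return float(m.group(1)) >= target
    return False  # no summary line found: fail closed, do not assume coverage
```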
/tdd:notes
Generate TDD practice notes — record the full development story.
When to use: after /tdd:done, or at any point you want to capture the development journey.
- Read requirements.md, design.md, tasks.md from the current spec
- Scan git history for feature-related commits, reverts, fix iterations
- Generate tdd-specs/<name>/tdd-practice-notes.md covering:
  - Background: what the user wanted
  - TDD process: Phase 1-4 record; each behavior chain's RED/GREEN/integration steps
  - Pitfalls: real problems encountered (from git history, not hypothetical)
  - File inventory: new/modified files + test coverage numbers
  - Key lessons: 3-5 actionable lessons learned
- Cross-check: pitfalls ≥ 1, lessons ≥ 3, file list matches git diff
Guardrails:
- Must read spec docs first — don't fabricate from memory
- Pitfalls must reference actual problems (git reverts, fix commits, user reports)
- Lessons must be actionable ("do X" / "avoid Y"), not vague ("testing is important")
/tdd:bug
Bug fix full workflow: traceable from problem description to Issue archive.
- Collect bug info (symptoms, module, reproduction steps, severity)
- Create Issue document (status: "investigating")
- Root cause analysis — no guessing, find root cause first; check existing issues to avoid duplicate work
- Write reproduction test (RED) — choose test layer based on bug type:
- Business logic error -> Unit test
- API error -> Integration test
- UI/interaction issue -> E2E
- Fix code (GREEN) — minimum code to pass, then full regression
- Complete Issue documentation (root cause, fix, verification steps, prevention measures)
- Run E2E verification (if UI/end-to-end flow involved)
Three-Strike Protocol applies if reproduction test fails 3 times.
/tdd:change
Mid-course requirement change flow.
- Confirm the current spec
- Collect the change description (interactive if not provided)
- Analyze impact across all 3 spec docs
Output impact assessment:
## Change Impact Assessment
### Affected spec entries
| Document | Entry | Impact Type | Description |
### Affected tasks
| Task | Current Status | Action Needed |
### Risk notes
- Completed tasks affected: N
- Estimated additional work: small / medium / large
- Wait for user confirmation before modifying anything
- Execute updates (requirements.md, design.md, tasks.md, UseCases if applicable)
  - Completed but affected tasks: revert to [ ] with note <- needs redo due to requirement change
- Output change summary
Guardrail: If change causes 10+ task reverts, suggest considering a fresh /tdd:ff
/tdd:continue <name>
Resume in-progress feature.
- Read tdd-specs/<name>/tasks.md
- Find the first [ ], [~], or [!] task
- Write the name to tdd-specs/.current and resume from the corresponding phase
- Output recovery summary (completed N/M tasks, current phase, next step)
/tdd:archive
Archive completed specs.
- Verify all tasks are [x] — if not, stop and prompt to complete /tdd:done first
- Check for tdd-practice-notes.md — if missing, prompt to run /tdd:notes first (recommend, don't block)
- Move the spec to tdd-specs/archive/<YYYY-MM>/
- Clear tdd-specs/.current
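The archive destination is deterministic from the feature name and current month. A trivial sketch of that path computation:

```python
from datetime import date
from pathlib import Path

def archive_path(feature: str, today: date) -> Path:
    """Destination for a completed spec: tdd-specs/archive/<YYYY-MM>/<feature>/."""
    return Path("tdd-specs") / "archive" / today.strftime("%Y-%m") / feature
```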
File Structure Convention
```
tdd-specs/
+-- .current                   <- Currently active spec name
+-- <feature-name>/
|   +-- requirements.md        <- Requirements (EARS format)
|   +-- design.md              <- Technical design
|   +-- tasks.md               <- Implementation checklist (live-updated)
|   +-- tdd-practice-notes.md  <- Practice record (generated by /tdd:notes)
+-- archive/
    +-- YYYY-MM/
        +-- <completed-feature>/
```
Task Status Markers
| Marker | Meaning |
|---|---|
| [ ] | Not started |
| [~] | In progress (RED written, GREEN incomplete) |
| [x] | Completed |
| [!] | Blocked (Three-Strike Protocol, awaiting decision) |
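These markers make progress reporting (as used by /tdd:continue's recovery summary) a simple line scan. A minimal sketch, assuming tasks are markdown list items beginning with "- [marker]":

```python
from collections import Counter

def task_progress(tasks_md: str) -> Counter:
    """Count tasks per status marker for a recovery/progress summary."""
    counts = Counter()
    for line in tasks_md.splitlines():
        stripped = line.strip()
        for marker in ("[ ]", "[~]", "[x]", "[!]"):
            if stripped.startswith(f"- {marker}"):
                counts[marker] += 1
                break  # one marker per task line
    return counts
```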
Mandatory Issues Lookup Timing
| Timing | Method |
|---|---|
| Before /tdd:ff or /tdd:spec | Browse the project issues directory (if it exists) |
| Before each GREEN phase (inside /tdd:loop) | grep -rl "<error-keywords>" <issues-dir>/ |
| After Three-Strike Protocol triggers | Full-text search + module filter |
Not Applicable For
- Single-line typo fixes
- Pure documentation/configuration changes
- Simple style adjustments
Use direct git commit for these instead.
Post-Delivery Development Rules
After /tdd:done, the harness enters the deliver state. From then on, any change under src/ must follow the rules below, or test debt accumulates quickly.
Scenario A: Bug found during integration testing
Editing code directly is forbidden; the fix must go through /tdd:bug:
Bug found
→ /tdd:bug (write reproduction test RED → fix code GREEN → record Issue)
→ Commit after full regression passes
Even if the bug looks tiny (a one-line fix), a reproduction test must come first. Editing code directly gives no way to confirm the scope of the fix or to prevent regressions.
Scenario B: Adding functionality after spec delivery
For logic added only after delivery, such as payment integration or state transitions:

```bash
sed -i '' 's/phase=deliver/phase=green/' tdd-specs/<spec>/.harness
```

Modifying implementation code in the deliver phase without adding tests is not allowed.
Scenario C: Pure style / UX / config adjustments
These may be changed directly, but:
- Mark the commit message with [style] / [ux] / [config]
- Run the full test suite afterward to confirm no regressions
/tdd:done check 8: post-delivery change audit
Runs after check 7 (environment variables):

```bash
git log --oneline --name-only -- 'backend/src/**' 'miniprogram/pages/**' \
  | grep -v "^[a-f0-9]" | sort -u | head -30
```

Cross-check against tasks.md to confirm:
- Every new service method → has unit test coverage
- Every new/modified API endpoint → has e2e test coverage
- Every bug fixed during integration testing → has a corresponding Issue record
If uncovered logic is found → stop delivery, add the missing tests, and rerun /tdd:done.
Pre-commit self-check list
□ Does every new service method have a unit test?
□ Does every new/modified API endpoint have an e2e test?
□ Integrated an external service (WeChat / Tencent Cloud / COS)? → Is there a mock test?
□ Was the full jest suite run? (not just a single module)
□ Any bug fixes? → Is there an Issue record and a reproduction test?