com um clique
test-driven-development
// Test-driven development discipline for any feature or bugfix. Use when implementing features, fixing bugs, or changing behavior — requires writing failing tests before implementation code.
// Test-driven development discipline for any feature or bugfix. Use when implementing features, fixing bugs, or changing behavior — requires writing failing tests before implementation code.
Use tmux correctly from an agent — explicit pane targeting, safe-state verification, send verification, state distinction, and recovery. Use when Claude Code, Codex, GitHub Copilot CLI, or any TUI/REPL must keep state across commands.
Dispatch parallel subagents for independent tasks without shared state. Use when facing 2+ independent failures, debugging unrelated subsystems, or researching separate topics concurrently.
Create isolated git worktrees for feature work with safety verification. Use when starting feature branches that need isolation, running parallel agents in separate workspaces, or before executing implementation plans.
Closed-loop planning discipline for non-trivial changes. Use when creating implementation plans, reviewing plan completeness, or preparing PRs that require structured planning evidence.
Fetch up-to-date library documentation. Use when user asks about libraries, frameworks, or needs current API references/code examples.
Browse websites, interact with web pages, and run Playwright tests. Use for navigating URLs, clicking elements, filling forms, taking screenshots, running browser automation, or executing Playwright test suites.
| name | test-driven-development |
| description | Test-driven development discipline for any feature or bugfix. Use when implementing features, fixing bugs, or changing behavior — requires writing failing tests before implementation code. |
Test-driven development is a discipline, not a suggestion. This skill defines when and how to apply TDD, provides the iron law that prevents shortcuts, and connects to Hill90's TDD agents for execution.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.
This is not a guideline. It is a hard constraint. If you find yourself writing implementation code without a failing test, stop immediately and write the test first.
The only exception: trivial configuration changes (adding an env var, updating a version number) that have no behavioral impact.
Every change follows this cycle:
If the test passes immediately, either:
Do not optimize. Do not add features. Do not refactor. Just make it green.
Then return to Red for the next requirement.
# BATS example
@test "deploy.sh rejects unknown service name" {
# Arrange
source scripts/deploy.sh
# Act
run deploy "nonexistent-service" "prod"
# Assert
[[ "$status" -ne 0 ]]
[[ "$output" =~ "unknown service" ]]
}
// TypeScript example
test("retryOperation retries on transient failure then succeeds", async () => {
// Arrange
let attempts = 0;
const operation = async () => {
attempts++;
if (attempts < 3) throw new Error("transient");
return "success";
};
// Act
const result = await retryOperation(operation, { maxRetries: 3 });
// Assert
expect(result).toBe("success");
expect(attempts).toBe(3);
});
| Property | Meaning | Example |
|---|---|---|
| Focused | Tests one behavior | "rejects empty input", not "handles all edge cases" |
| Descriptive | Name explains what and why | deploy rejects unknown service |
| Independent | No reliance on other tests | Each test sets up its own state |
| Deterministic | Same result every run | No reliance on timing, network, or random values |
| Fast | Runs in milliseconds | Mock external dependencies |
One test case is not enough. A hardcoded return value can pass a single test. Add multiple examples to force a real implementation:
@test "slugify converts spaces to hyphens" {
run slugify "hello world"
[[ "$output" == "hello-world" ]]
}
@test "slugify converts multiple spaces to single hyphens" {
run slugify "hello world"
[[ "$output" == "hello-world" ]]
}
@test "slugify lowercases uppercase letters" {
run slugify "Hello World"
[[ "$output" == "hello-world" ]]
}
Writing tests after implementation has fundamental problems:
Writing tests first:
These are excuses. Recognize them and resist.
| Rationalization | Why It's Wrong |
|---|---|
| "I'll write the tests after" | You won't. And if you do, they'll be weaker. |
| "This is too simple to test" | Simple code grows. The test takes 30 seconds to write. |
| "I'm just exploring" | Spike in a branch. When you implement for real, TDD. |
| "The tests would be trivial" | Trivial tests catch regressions. Write them anyway. |
| "I need to see the implementation first" | No. Define the behavior first. Implementation follows. |
| "It's just a refactor" | Refactoring means tests stay green. If they don't exist, write them. |
| "This is infrastructure code" | Infrastructure has behavior. Test it. |
| "I'll slow down the team" | Bugs slow down the team more. TDD prevents bugs. |
| "The framework makes it hard to test" | Isolate the framework. Test your logic separately. |
| "It's a one-line change" | One-line changes cause production incidents. Test it. |
| "I know this works" | You don't. The test proves it. |
Stop and course-correct if you notice:
A user reports that deploy.sh crashes when given an empty service name.
@test "deploy.sh exits with error when service name is empty" {
run bash scripts/deploy.sh "" "prod"
[[ "$status" -ne 0 ]]
[[ "$output" =~ "service name required" ]]
}
Run the test. It fails because deploy.sh doesn't validate the service name. Good — this is the right failure.
Add validation at the top of deploy.sh:
if [[ -z "$1" ]]; then
echo "Error: service name required" >&2
exit 1
fi
Run the test. It passes. Run the full suite. No regressions.
The validation is simple enough — no refactoring needed. Move to the next requirement.
Before considering a TDD cycle complete:
| Situation | Action |
|---|---|
| Don't know what test to write | Describe the behavior in plain English first. The test is that description in code. |
| Test is hard to write | The design needs to change. Hard-to-test code is poorly structured code. |
| Test keeps failing after implementation | Read the failure message carefully. The test and implementation may disagree on the interface. |
| Too many tests to write | Start with the most important behavior. One test at a time. |
| Existing code has no tests | Write tests for the behavior you're about to change. Don't try to backfill everything. |
| Can't isolate dependencies | Use test doubles (stubs, spies). If you can't inject dependencies, refactor to allow it. |
| Not sure if test is meaningful | Ask: "Would this test catch a real bug?" If no, reconsider. |
| Blocked on understanding behavior | Ask the user to clarify expected behavior before writing tests. |
When a bug is found:
Never fix a bug without first writing a test that demonstrates it. The test is proof the bug existed and evidence it's fixed.
Hill90 provides three specialized agents that execute the TDD cycle:
| Agent | Phase | What It Does | Tools |
|---|---|---|---|
tdd-red | Red | Writes failing tests in tests/ | Read, Grep, Glob, Write |
tdd-green | Green | Implements minimum code to pass tests | Read, Grep, Glob, Bash, Edit, Write |
tdd-refactor | Refactor | Improves structure, keeps tests green | Read, Grep, Glob, Bash, Edit, Write |
Each agent hands off to the next automatically. The agents enforce tool restrictions (e.g., tdd-red cannot run tests or edit implementation files).
This skill provides the discipline and philosophy — when to use TDD, why order matters, how to resist rationalizations. The agents provide the execution mechanism with enforced constraints.
Use the agents when you want strict phase separation. Use this skill's principles directly when working in the main conversation.
See references/testing-anti-patterns.md for detailed anti-patterns including: