ks-test-first
// Default behavior-locking stage for the current block. Write failing checks first, then implement. Only controlled low-risk exceptions may skip it.
| Field | Value |
|---|---|
| name | ks-test-first |
| description | Default behavior-locking stage for the current block. Write failing checks first, then implement. Only controlled low-risk exceptions may skip it. |
| layer | execution |
| owner | test-first |
| inputs | ["plan_document","current_block","risk_level"] |
| outputs | ["test_contract","failed_tests"] |
| entry_modes | ["from-scratch","aligned-offline","plan-ready"] |
Preamble: see templates/preamble.md
Locks the behavior boundary before implementation by writing failing checks first, then moving into implementation. Enter when a plan exists, the current block is clear, and the next step is to lock behavior with executable failing checks. Does not write implementation code, replace writing-plan, or make the final quality decision.
Do not write implementation code before you have seen a failing check.
"This looks simple" is not a valid reason to skip test-first.
The test contract must exist on disk before state points to it.
Only these cases may skip test-first (and the skip must be approved in writing-plan):
If the block is an exception, that approval must already exist in writing-plan. It cannot be invented here.
- plan_document exists and is readable

If the project has no test framework, say so explicitly and define the dependency or manual contract instead of pretending the test phase happened. See E1.
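One way to make the "no test framework" statement explicit is a small marker-file probe. The sketch below is illustrative only: `detect_test_framework` is a hypothetical helper, and the marker files it looks for are assumptions, not rules defined by this skill.

```shell
# Hypothetical helper: guess the test framework from marker files in a directory.
# The marker-file choices are assumptions for illustration, not part of ks-test-first.
detect_test_framework() {
  _dtf_dir="${1:-.}"
  if [ -f "$_dtf_dir/Cargo.toml" ]; then
    echo "rust"
  elif [ -f "$_dtf_dir/go.mod" ]; then
    echo "go"
  elif [ -f "$_dtf_dir/build.gradle" ] || [ -f "$_dtf_dir/build.gradle.kts" ]; then
    echo "gradle"
  elif [ -f "$_dtf_dir/package.json" ]; then
    echo "node"
  elif [ -f "$_dtf_dir/pytest.ini" ] || [ -f "$_dtf_dir/pyproject.toml" ] || [ -f "$_dtf_dir/conftest.py" ]; then
    echo "python"
  elif ls "$_dtf_dir"/tests/*.bats >/dev/null 2>&1; then
    echo "bats"
  else
    # Nothing recognized: print "none" and signal failure so the caller can follow E1.
    echo "none"
    return 1
  fi
}
```

A caller might run `detect_test_framework . || echo "no test framework detected; see E1"`. A marker file only suggests a framework; it does not prove that tests can actually run.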
If writing-plan already approved a true low-risk controlled exception and explained why test-first may be skipped, do not enter this skill.
Read the plan document. Extract each done criterion.
For each done criterion:
If a done criterion has no test scenario after 2 attempts, return to writing-plan with note: "done criterion [X] cannot be tested; needs rewriting".
Create a test contract file on disk before proceeding. The contract is the single source of truth for what "done" means for this block.
# Test Contract: [block name]
## Scope
[what this contract covers — must match plan scope]
## Test Scenarios
### Scenario 1: [name]
- **Given**: [precondition]
- **When**: [action]
- **Then**: [expected result]
- **Test function**: `test_scenario_1`
### Scenario 2: [name]
- **Given**: [precondition]
- **When**: [action]
- **Then**: [expected result]
- **Test function**: `test_scenario_2`
## Edge Conditions
[edge cases derived from boundary values, empty inputs, error paths]
## Failure Conditions
[what must fail — negative tests, invalid input, authorization checks]
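Because the contract names a test function for every scenario, the file can be checked mechanically. The following is a minimal sketch, assuming the contract keeps the exact `- **Test function**: ...` line format of the template above; `list_contract_tests` and `count_contract_tests` are hypothetical helper names.

```shell
# Hypothetical helpers: read test function names out of a test contract file.
# Assumes each scenario has a line of the form:
#   - **Test function**: `test_name`
list_contract_tests() {
  sed -n 's/^- \*\*Test function\*\*: `\([^`]*\)`.*$/\1/p' "$1"
}

# Count the scenarios that declare a test function.
count_contract_tests() {
  list_contract_tests "$1" | wc -l | tr -d '[:space:]'
}
```

The count could, for example, feed a variable such as `SCENARIO_COUNT` before the stage is recorded, though nothing in this skill requires that wiring.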
Write checks that must fail. Execute them to confirm they actually fail.
For framework-specific commands, see templates/test-commands.md.
Quick reference:
| Framework | Run all tests | Run specific scenario | Run with coverage |
|---|---|---|---|
| Python | pytest | pytest -k "scenario_name" | pytest --cov |
| Node/Jest | npx jest | npx jest --testNamePattern="scenario name" | npx jest --coverage |
| Java/Gradle | ./gradlew test | ./gradlew test --tests "ClassName" | ./gradlew jacocoTestReport |
| Go | go test ./... | go test -run TestScenarioName ./... | go test -cover ./... |
| Shell/bats | bats tests/ | bats tests/case.bats | N/A |
| Rust | cargo test | cargo test test_name | cargo llvm-cov (requires cargo-llvm-cov) |
Confirm: each new check fails. A check that passes immediately did not test anything meaningful. See E3.
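The "confirm it fails" step can be wrapped so that a passing check is treated as an error. This is a sketch under assumptions: `confirm_red` is a hypothetical wrapper, and it only inspects the exit status of whatever command it is given.

```shell
# Hypothetical wrapper: run a check command and succeed only when it FAILS (red state).
confirm_red() {
  if "$@" >/dev/null 2>&1; then
    # The check passed before any implementation exists, so it locks no behavior.
    echo "NOT RED: '$*' passed before implementation (see E3)" >&2
    return 1
  fi
  echo "RED confirmed: '$*' fails as expected"
}
```

For example, `confirm_red pytest -k "scenario_name"` returns success only while that scenario still fails. An exit status cannot tell a meaningful assertion failure from a syntax or import error, so the failure output should still be read once by hand.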
Red -> write a check that must fail; run it; confirm failure
Green -> write the smallest implementation that passes it; run it; confirm pass
Refactor -> improve structure with tests still green; run all; confirm still pass
Repeat -> move to next scenario
Do not write implementation code before you have seen a failing check.
Risk guidance:
| Risk | Requirement |
|---|---|
| high | behavior must be locked before implementation; no shortcuts |
| medium | default to test-first; full red-green-refactor |
| low | still default to test-first unless writing-plan approved the exception |
After main scenarios pass, run edge-condition checks. Each edge condition should have its own test function.
Common edge condition categories:
Once all scenarios and edge conditions are covered:
| Failure | Action |
|---|---|
| No test framework detected | See E1 |
| Test scenarios exceed plan scope | See E2 |
| Red phase never fails | See E3 |
| Done criterion cannot be tested | Return to writing-plan with note |
When the project has no test framework:
- bats, shell scripts, or Make targets)
- escalate = true and route to orchestrate

When the test scenarios grow beyond the plan's defined scope:

- Return to writing-plan with note: "done criterion [X] requires scope expansion"

When a written check passes immediately instead of failing:
| Risk level | Minimum test depth |
|---|---|
| high | all done criteria + edge conditions + failure conditions; no shortcuts |
| medium | all done criteria + at least one edge condition per scenario |
| low | all done criteria; edge conditions are recommended but not required if the plan approved the exception |
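The depth table above could be encoded as a lookup so that a script can state what a block still owes. This is a minimal sketch: `min_test_depth` and the label strings it prints are hypothetical and not part of the keystone CLI.

```shell
# Hypothetical lookup: map a risk level to the minimum test depth from the table.
# The label strings are illustrative names, not identifiers used by keystone.
min_test_depth() {
  case "$1" in
    high)   echo "done-criteria edge-conditions failure-conditions" ;;
    medium) echo "done-criteria edge-per-scenario" ;;
    low)    echo "done-criteria" ;;
    *)      echo "unknown risk level: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown levels keeps a misspelled `risk_level` from silently falling back to the lowest depth.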
When test-first was entered despite the block being a controlled exception (approved by writing-plan):
decision: test-contract-ready
confidence: 0.9
rationale: The current block needs a locked behavior boundary, the key done criteria are covered, and the failing checks are ready to drive implementation.
fallback: If the checks cannot be written, the done criteria are probably broken and the block should return to writing-plan.
escalate: false
next_skill: implement
next_action: Enter implement and run the red-green-refactor loop.
# Fail fast if the contract path is unset or the file is missing on disk.
_TEST_CONTRACT_PATH="${TEST_CONTRACT_PATH:?set TEST_CONTRACT_PATH to the actual test contract path}"
[ -f "$_TEST_CONTRACT_PATH" ] || { echo "test_contract not found: $_TEST_CONTRACT_PATH" >&2; exit 1; }
# Point state at the contract only after it is confirmed to exist.
_KS_CLI="${KEYSTONE_CLI:-./keystone}"
$_KS_CLI state set current_stage "test-first" >/dev/null
$_KS_CLI state set last_decision "test-contract-produced" >/dev/null
$_KS_CLI state set artifacts.test_contract "$_TEST_CONTRACT_PATH" >/dev/null
$_KS_CLI state set exit_code "ok" >/dev/null
$_KS_CLI state set exit_reason "test_contract locked the core done criteria" >/dev/null
$_KS_CLI state set next_skill "implement" >/dev/null
# Append a telemetry event; create the events directory first so the append works on a fresh checkout.
_SCENARIO_COUNT="${SCENARIO_COUNT:-0}"
_EVENTS_DIR=".keystone/telemetry/events"
mkdir -p "$_EVENTS_DIR"
echo "{\"skill\":\"test-first\",\"ts\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\",\"decision\":\"test-contract-produced\",\"confidence\":0.9,\"scenario_count\":$_SCENARIO_COUNT}" >> "$_EVENTS_DIR/$(date +%Y-%m).jsonl"