| name | test-design-mandates |
| description | Use when designing test coverage matrices, assigning tests to Clean Architecture layers, planning the outside-in implementation order, or applying the Walking Skeleton strategy. Ensures every behaviour is tested at the right level with no redundancy. Load after writing Gherkin scenarios. |
Test Design Mandates
Overview
Four mandatory rules that govern how tests are designed in a Clean Architecture context. Applied after Gherkin scenarios are written, before any implementation starts.
Core principle: Tests enter through a use case boundary and assert at the next visible boundary. Never test internal classes directly.
Mandate 1: Layer Boundary Enforcement
Rule: Every acceptance criterion (AC) names its use case boundary: the Application layer entry point the test enters through. All tests at the Application layer enter through that use case. Assertions are made via application interfaces (repositories, gateways) or the use case return value. Never instantiate an internal domain class directly in a test that is exercising Application behaviour.
Why: Prevents TBU (Tested But Unwired) defects, meaning production code that works in isolation but is never called through the real composition root.
Application:
- Acceptance test → enters through the `{UseCaseName}` use case / command handler
- Domain unit test (when extracted) → enters through the domain function or policy's public signature
- Infrastructure test → enters through the application interface (repository contract)
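As an illustration of the acceptance-test entry point, here is a minimal sketch in plain Java. Every name in it (`CheckEligibilityCommand`, `EligibilityRepository`, `InMemoryEligibilityRepository`, `AccidentPolicy`) and the "at most 2 accidents" rule are assumptions made for the example, not part of any real codebase. The test enters through the use case and asserts through the return value and the repository interface; it never touches the domain policy directly.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Application interface (port). All names in this sketch are hypothetical.
interface EligibilityRepository {
    void save(String driverId, boolean eligible);
    Optional<Boolean> findDecision(String driverId);
}

// InMemory fake used as the test double.
class InMemoryEligibilityRepository implements EligibilityRepository {
    private final Map<String, Boolean> decisions = new HashMap<>();
    public void save(String driverId, boolean eligible) { decisions.put(driverId, eligible); }
    public Optional<Boolean> findDecision(String driverId) {
        return Optional.ofNullable(decisions.get(driverId));
    }
}

// Command carried across the use case boundary.
class CheckEligibilityCommand {
    final String driverId;
    final int accidentCount;
    CheckEligibilityCommand(String driverId, int accidentCount) {
        this.driverId = driverId;
        this.accidentCount = accidentCount;
    }
}

// Internal domain policy: the test below never instantiates it directly.
class AccidentPolicy {
    boolean accepts(int accidentCount) { return accidentCount <= 2; } // assumed rule
}

// Use case boundary: the only entry point the acceptance test uses.
class CheckEligibilityUseCase {
    private final EligibilityRepository repository;
    private final AccidentPolicy policy = new AccidentPolicy();
    CheckEligibilityUseCase(EligibilityRepository repository) { this.repository = repository; }

    boolean handle(CheckEligibilityCommand command) {
        boolean eligible = policy.accepts(command.accidentCount);
        repository.save(command.driverId, eligible);
        return eligible;
    }
}

class EligibilityAcceptanceTest {
    public static void main(String[] args) {
        EligibilityRepository repository = new InMemoryEligibilityRepository();
        CheckEligibilityUseCase useCase = new CheckEligibilityUseCase(repository);

        boolean eligible = useCase.handle(new CheckEligibilityCommand("driver-1", 0));

        // Assert at the next visible boundary: return value and repository.
        if (!eligible) throw new AssertionError("expected driver-1 to be eligible");
        if (!repository.findDecision("driver-1").orElse(false)) {
            throw new AssertionError("expected the decision to be stored");
        }
        System.out.println("acceptance test passed");
    }
}
```

If `CheckEligibilityUseCase` stopped calling `AccidentPolicy`, this test would fail, which is the wiring check a direct test on the policy cannot give.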
TBU detection checklist (run after GREEN):
Mandate 2: Business Language Abstraction
Rule: Maintain 3 strictly separated layers of abstraction in test code. No technical vocabulary leaks upward.
| Layer | Location | Language | Example |
|---|---|---|---|
| Gherkin | .feature files | Pure business | Given a driver with a clean record |
| Step methods | Step definition classes | Bridge: calls use case | eligibilityUseCase.handle(command) |
| Business services / Use cases | Test helpers or application | Technical | new CheckEligibilityCommand(driverId, ...) |
Violations:
- ❌ Gherkin step: `Given the EligibilityApplicationService is instantiated`
- ❌ Step method builds DTOs with raw primitives without a factory method
- ✅ Step method: `var command = EligibilityCommandBuilder.forDriverWith(cleanRecord);`
Implementation:
- Create `{Feature}Builder` / `{Feature}Mother` test factories in the step definition layer
- Step methods delegate to builders; never hardcode IDs, dates, or primitives in Gherkin or steps
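The builder delegation above can be sketched as follows. This is an illustrative Java sketch under assumptions: `DrivingRecord`, the shape of `CheckEligibilityCommand`, and the step method names are hypothetical, and the step class is shown without any Cucumber annotations so that it stays self-contained.

```java
// All names below are hypothetical and exist only for this sketch.
class DrivingRecord {
    final int accidentCount;
    private DrivingRecord(int accidentCount) { this.accidentCount = accidentCount; }

    // Object Mother style factories expressed in business terms.
    static DrivingRecord clean() { return new DrivingRecord(0); }
    static DrivingRecord withAccidents(int count) { return new DrivingRecord(count); }
}

class CheckEligibilityCommand {
    final String driverId;
    final int accidentCount;
    CheckEligibilityCommand(String driverId, int accidentCount) {
        this.driverId = driverId;
        this.accidentCount = accidentCount;
    }
}

class EligibilityCommandBuilder {
    private static int nextDriver = 1;

    // The builder owns IDs and other primitives so step methods never hardcode them.
    static CheckEligibilityCommand forDriverWith(DrivingRecord record) {
        String driverId = "driver-" + nextDriver++;
        return new CheckEligibilityCommand(driverId, record.accidentCount);
    }
}

// Step definition layer: bridges business wording to the technical command.
class EligibilitySteps {
    CheckEligibilityCommand command;

    // Bridges "Given a driver with a clean record".
    void givenADriverWithACleanRecord() {
        command = EligibilityCommandBuilder.forDriverWith(DrivingRecord.clean());
    }

    // Bridges "Given a driver with {int} accidents".
    void givenADriverWithAccidents(int count) {
        command = EligibilityCommandBuilder.forDriverWith(DrivingRecord.withAccidents(count));
    }
}
```

The Gherkin text stays purely in business language, the step methods read as a thin bridge, and only the builder knows how a command is assembled.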
Mandate 3: User Journey Completeness
Rule: Tests validate complete user journeys, not isolated operations. A scenario must include setup, action, and observable outcome: all three, always.
Structure:
Setup (Given): state of the system before the action
Action (When): ONE business trigger
Outcome (Then): observable business result (not internal state)
Incomplete journey anti-patterns:
- ❌ Only testing `Given` + `When` with no assertion: a setup test, not a behaviour test
- ❌ `Then` asserting internal state (e.g., field value on domain object): not observable by the user
- ❌ Missing setup: the test depends on implicit shared state (test ordering bug)
Journey completeness check: Ask "could a user observe this outcome in the real system?" If yes, the Then is correctly observable.
Mandate 4: Domain Test Extraction (Gated)
Default rule: Acceptance tests do the work. No domain unit test by default, even when a pure function is extracted by refactoring. A regression that breaks 2+ AC together is the correct business signal: the duplicated red is the proof that the rule traverses multiple user journeys, not a redundancy to eliminate.
Anti-pattern this rule prevents: DOUBLE-COVERAGE. A pure function whose every behavioral branch is already reached by planned acceptance scenarios MUST NOT receive a dedicated domain test. Two suites asserting the same behavior drift together on every rule change and add zero failure-discrimination value.
A domain unit test is REQUIRED if and only if one of the two gates opens:
- Gate (a): Branch unreachable via AC. A behavioral branch of the function exists but no realistic acceptance scenario can trigger it (defensive case, exhaustive-enum fallback, technical guard). The AC physically CANNOT observe it.
- Gate (b): Combinatorial economy. Covering the input grid through AC alone would explode the Gherkin scenario count (indicative threshold: > 10 to 15 scenarios for a single rule). Keep 3 to 5 representative business AC (happy path + key rejections + critical boundaries) and delegate the combinatorial sweep to a parameterized domain test (`[Theory]`, `@ParameterizedTest`, table-driven).
No gate opens → REJECT extraction. Record in the test plan: M4 negative - saturated by AC.
Process for each candidate pure function P:
- Enumerate `B(P)`: distinct behavioral branches of P.
- Enumerate `A(P)`: branches reached by planned AC scenarios.
- If `A(P) == B(P)` AND combinatorial size ≤ 10 to 15 → forbidden to extract, log M4 negative - saturated by AC.
- Otherwise, name which gate opens and record the corresponding `Extraction Reason` code.
Counter-example (STORY-41 counterfactual): Vehicle.MinimumAge() has B(P) = {21, 16}. Five planned AC reach both outputs, so A(P) == B(P). Combinatorial size = 3 cases, well under the threshold. Both gates are closed → no domain test. The 5 AC do all the work; if MinimumAge() breaks, all 5 turn red, which is the intended business signal.
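The gate check can be written down as a small decision function. The sketch below is illustrative only: `DomainTestGate` and `decide` are hypothetical names, branches are modelled as plain strings, the threshold is passed in because the text gives only an indicative range, and any branch missing from A(P) is assumed to be truly unreachable through AC (rather than simply unplanned).

```java
import java.util.HashSet;
import java.util.Set;

class DomainTestGate {
    // Returns the Extraction Reason code, or the M4 negative marker when no gate opens.
    static String decide(Set<String> branches, Set<String> reachedByAc,
                         int combinatorialSize, int threshold) {
        Set<String> unreached = new HashSet<>(branches);
        unreached.removeAll(reachedByAc);          // B(P) minus A(P)

        if (!unreached.isEmpty()) {
            return "branch_unreachable_via_AC";    // Gate (a)
        }
        if (combinatorialSize > threshold) {
            return "combinatorial_economy";        // Gate (b)
        }
        return "M4 negative - saturated by AC";    // no gate opens: reject extraction
    }

    public static void main(String[] args) {
        // Counter-example from the text: both outputs reached, 3 cases.
        Set<String> outputs = Set.of("21", "16");
        System.out.println(decide(outputs, outputs, 3, 15));
    }
}
```

With the counter-example inputs, neither gate opens and the function returns the M4 negative marker, matching the conclusion that no domain test is authorized.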
Coverage Matrix
Build this matrix for every story before implementation starts. Any row with Layer = Domain MUST carry an Extraction Reason code (see Mandate 4). Without a reason code, the row is not authorized; remove it.
Allowed Extraction Reason codes:
branch_unreachable_via_AC → Gate (a)
combinatorial_economy → Gate (b)
| Scenario | Use Case Boundary | Layer | Extraction Reason | Double Type | Walking Skeleton | Priority |
|---|---|---|---|---|---|---|
| Happy path: driver eligible | CheckEligibilityUseCase | Application | - | InMemory repository | A | P1 |
| Edge: driver at age limit | CheckEligibilityUseCase | Application | - | InMemory repository | A | P1 |
| Rejection: too many accidents | CheckEligibilityUseCase | Application | - | InMemory repository | A | P2 |
| Combinatorial sweep: premium grid | PricingPolicy.computePremium | Domain | combinatorial_economy | None (pure function) | - | P2 |
| Infrastructure: real persistence | IEligibilityRepository | Infrastructure | - | Real DB (Testcontainers) | - | P3 |
Priority:
- P1: happy path, walking skeleton basis
- P2: business rule coverage, edge cases
- P3: infrastructure, integration, error paths
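The authorization rule for Domain rows is mechanical enough to check automatically. The following is a rough Java sketch, assuming the matrix has been loaded into simple row objects; `CoverageRow` and `CoverageMatrixValidator` are hypothetical names, and only the three columns needed for the check are modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// One row of the coverage matrix (hypothetical shape, reduced to what the check needs).
class CoverageRow {
    final String scenario;
    final String layer;
    final String extractionReason; // null when the cell is empty

    CoverageRow(String scenario, String layer, String extractionReason) {
        this.scenario = scenario;
        this.layer = layer;
        this.extractionReason = extractionReason;
    }
}

class CoverageMatrixValidator {
    static final List<String> ALLOWED_REASONS =
            Arrays.asList("branch_unreachable_via_AC", "combinatorial_economy");

    // Returns the scenarios of Domain rows that lack an allowed Extraction Reason code.
    static List<String> unauthorizedRows(List<CoverageRow> rows) {
        List<String> rejected = new ArrayList<>();
        for (CoverageRow row : rows) {
            boolean isDomain = "Domain".equals(row.layer);
            boolean hasAllowedReason = ALLOWED_REASONS.contains(row.extractionReason);
            if (isDomain && !hasAllowedReason) {
                rejected.add(row.scenario);
            }
        }
        return rejected;
    }
}
```

Rows returned by `unauthorizedRows` are the ones to remove from the matrix before implementation starts.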
Walking Skeleton Strategy
A walking skeleton is the thinnest possible slice that exercises the full path from use case to output, enough to prove the wiring works.
4 Strategies
| Strategy | When to use | Setup |
|---|---|---|
| A: Full InMemory | Feature is purely internal (no external services) | InMemory repository, no real I/O |
| B: Real local + Fake costly | Feature needs a local DB but avoids expensive external calls | Testcontainers DB + fake/stub for external service |
| C: Real local | Feature integrates with controllable local infrastructure | Testcontainers for all local dependencies |
| D: Configurable | Feature must run in both unit and integration mode | Strategy/feature flag selects double type at test startup |
Decision Tree
Does the feature write to or read from persistent storage?
āāā NO ā Strategy A (full InMemory)
āāā YES ā Does it call an expensive external service (payment, SMS, AI)?
āāā YES ā Strategy B (real local DB + fake external)
āāā NO ā Is the storage local and controllable?
āāā YES ā Strategy C (real local with Testcontainers)
āāā NO ā Strategy D (configurable)
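The decision tree translates directly into a small function. This Java sketch is only an illustration; the enum, class, and parameter names are hypothetical, and each boolean corresponds to one question in the tree.

```java
enum SkeletonStrategy {
    A_FULL_IN_MEMORY,
    B_REAL_LOCAL_FAKE_COSTLY,
    C_REAL_LOCAL,
    D_CONFIGURABLE
}

class WalkingSkeletonSelector {
    // Each parameter answers one question of the decision tree, top to bottom.
    static SkeletonStrategy choose(boolean usesPersistentStorage,
                                   boolean callsExpensiveExternalService,
                                   boolean storageIsLocalAndControllable) {
        if (!usesPersistentStorage) {
            return SkeletonStrategy.A_FULL_IN_MEMORY;
        }
        if (callsExpensiveExternalService) {
            return SkeletonStrategy.B_REAL_LOCAL_FAKE_COSTLY;
        }
        return storageIsLocalAndControllable
                ? SkeletonStrategy.C_REAL_LOCAL
                : SkeletonStrategy.D_CONFIGURABLE;
    }
}
```

Note that the later questions are only consulted when the earlier ones do not settle the outcome, so a feature without persistent storage always lands on Strategy A regardless of the other answers.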
Walking Skeleton Sizing
- 2 to 5 walking skeletons per feature (one per major flow variant)
- 15 to 20 focused scenarios total per feature (detailed behaviour coverage)
- Tag walking skeleton scenarios with `@smoke` for fast validation
Layer Assignment Rules
| What to test | Test project | Layer | Double type |
|---|---|---|---|
| Use case / command handler | UnitTest | Application | InMemory application interfaces |
| Domain policy / specification | UnitTest | Domain | None (pure function, call directly) |
| Repository adapter | IntegrationTest | Infrastructure | Real DB via Testcontainers |
| API controller / endpoint | IntegrationTest | API | In-process app host (WebApplicationFactory or equivalent) |
| Architecture boundaries | IntegrationTest | Architecture | Static analysis (NetArchTest, ArchUnit, etc.) |
Never:
- Test a domain entity by instantiating it directly in an Application acceptance test
- Use a real database in `UnitTest`
- Use a mock where an InMemory fake exists (InMemory > mock for repositories)
References