| name | xp-design-contract-review |
| description | Use this skill during XP pair-programming review or pre-commit code review. Walks the 8-class design-contract blocker checklist (silent fallback / non-empty unenforced / commit-state discontinuity / sentinel value confusion / conditional cleanup pre-state / NotFound short-circuit / terminating-vs-absent / operator precedence). Catches contract-level defects that traditional Dev/Test split commonly misses. |
| allowed-tools | Bash(grep *) Bash(rg *) Read |
XP Design Contract Review
Hard rules — enforce during pair review or pre-commit
- Read for contract, not for syntax. Linters catch missing semicolons. Pair review must catch "this code looks correct but the contract is wrong."
- 8 classes, walk all 8. Don't stop at the first issue you find. Each class catches a different failure mode; co-occurrences are common.
- Confirm by re-reading the call site, not just the implementation. Contract is between caller and callee — flaws appear at the boundary, not inside one or the other in isolation.
- State your verdict as: "I checked class N, here's what I saw." Not "looks fine." Vague approval is the failure mode this skill exists to block.
When to invoke
- Before merging any non-trivial PR
- During XP pair coding when the partner says "I'm done with this part"
- Before writing test cases (so test design tests the contract, not the syntax)
- When debugging a recurring race or silent failure — high probability one of the 8 classes was missed at PR time
The 8 design-contract blocker classes (frequency descending)
1. Silent fallback overrides what you wanted to eliminate
Symptom: code says "fallback to default if X is empty" but X is the very thing you were trying to enforce.
Catch: ask "if X is empty, is that an acceptable system state? Or did we just hide the bug?"
Example: if config.role == "" { config.role = "follower" } — but role being empty is the bug; defaulting to follower silences the cluster split.
2. Non-empty contract field is not enforced at write site
Symptom: doc says "Field A must be set"; reader assumes A != "" and dereferences. But the writer never validates non-empty.
Catch: search for every place that writes to A; confirm each path has a non-empty assertion or returns an error.
3. State within a single commit boundary is misused as continuous
Symptom: "I observed X.state at line 5, used it at line 50, but the call between them changed it under me."
Catch: any time state is read more than once on the same path, ask "could it have changed between reads?"
4. Sentinel value (empty string, zero, nil) carries error meaning
Symptom: function returns "" to mean "not found" but caller doesn't know the difference between "not found" and "found but empty".
Catch: prefer explicit (value, ok) or Result<value, error>. If sentinel is unavoidable, document it loud and verify caller checks.
4a. Cross-column sentinel-value namespace collision
Symptom: the same sentinel string (or numeric value) is reused across two different columns / fields / layers that have different value spaces, different decision authorities, or different time horizons. A row carrying token Z in column X and token Z in column Y are pulled together at aggregation / readability / triage time and silently conflated, even though their meanings diverge.
Catch: when walking class 4, do not stop at "is the value space inside this single column / enum well-defined?" — also walk the cross-column axis:
- List every column / field / layer that ever carries sentinel-namespace tokens.
- For each token, list every column it appears in.
- Any token appearing in ≥2 columns is a candidate collision. Confirm by asking three questions:
- Same value space across both columns? (If no → collision.)
- Same decision authority — i.e. does the token in column A trigger the same downstream action as in column B? (If no → collision.)
- Same time horizon? (If no → collision.)
- Any "no" → rename one side. Pure rename, no logic change is the safest fix; do not touch decision logic or discriminator while resolving the collision.
- Add a unit fixture that has the same row carry distinct values in the two columns (cross-layer disagreement case), plus a spot-check assert that the two columns are observably different on that row. Internally row-consistent fixtures cannot catch this class.
Acceptance fixture: SQL Server soak writer dual-metric class-4 sub-case (canonical artifact 20260506-000019-24h-soak-mssql-laneb-dr-idc2-pod row id=3591, fault-009 ch30 6s switchover quorum vacuum). v3.0 used token durability_bad_ack in both:
| Column | Layer | Computed at | Decision authority | Old value space | Collision token |
|---|
bad_ack_class (col 20) | A — row-time forensic | iteration end (~6s) | none (warn-only) | clean / availability_window / durability_bad_ack | durability_bad_ack |
final_classification | B — durability ground truth | SOAK-40 reconcile | exclusive halt | ack_success / durability_bad_ack / commit_unknown / client_failed_absent | durability_bad_ack |
Three failures of the cross-column check: different value space, different decision authority (A: warn-only vs B: halt), different time horizon (A: ~6s row-time vs B: SOAK-40). On row id=3591 Layer A bounded retry never recovered (col 20 = durability_bad_ack) but Layer B saw the row durably present at SOAK-40 (final_classification = ack_success) — same string, opposite decisions, raw TSV unreadable.
v3.1 fix: rename Layer A durability_bad_ack → bounded_retry_exhausted (rename-only at the row-time layer; halt logic, discriminator, and Layer-B sovereignty unchanged). Unit test gained case 9 with cross-layer disagreement (Layer A bounded_retry_exhausted + Layer B ack_success) plus a cross-layer spot-check assert that col 20 and final_classification show distinct values on id=3591.
The earlier v3 review walked bounded_retry_result enum value space inside col 20 and missed this collision because every fixture in the v3 32-row suite was internally row-consistent across the two columns. The cross-column walk above is what catches it.
5. Conditional cleanup depends on pre-state, but state changes mid-flight
Symptom: "if obj.status == X then cleanup" — but obj.status may have transitioned before the cleanup runs. Cleanup either skipped (leak) or applied wrongly (data loss).
Catch: re-read state inside cleanup itself, or use a watch / informer that delivers state-at-trigger-time.
6. NotFound short-circuit drops the side effect that should have been written
Symptom: obj, err := Get(name); if errors.IsNotFound(err) { return nil } — but the original intent was "if not found, create it". Short-circuit hides the missing create.
Catch: every NotFound branch needs an explicit decision: skip / create / error. Default-skip is the failure mode.
7. Release sequence does not distinguish "object still terminating" vs "object absent"
Symptom: "if pod is gone, we can release the PVC." But "gone" could mean "deletion in flight" — the pod's finalizers haven't run yet, so releasing PVC corrupts the volume mount.
Catch: in cleanup paths, treat "still terminating" and "absent" as different states. Only "absent" allows release.
8. Operator precedence trap — compiles but NPE at runtime
Symptom: if foo != nil && foo.Bar != nil || baz() — operator precedence parses as (foo != nil && foo.Bar != nil) || baz(). If foo == nil, short-circuit fires correctly. But if a refactor changes || to &&, the nil check is no longer protective.
Catch: parenthesize complex boolean chains. Never trust precedence for safety-critical short-circuits.
Self-check protocol — before saying "LGTM"
For each PR / patch you review, walk the 8 classes mentally:
- Are there fallback paths that hide the root cause? (Class 1)
- Are non-empty contracts enforced at every write? (Class 2)
- Is any state read more than once without re-validation? (Class 3)
- Are sentinel values explicitly documented + checked? Walk two axes:
- intra-column — single-enum value space well-defined? (Class 4)
- cross-column — any token reused in ≥2 columns / fields / layers with different value space / decision authority / time horizon? (Class 4a)
- Does any conditional cleanup depend on pre-state? (Class 5)
- Is NotFound an explicit decision or a silent return? (Class 6)
- Does cleanup distinguish terminating vs absent? (Class 7)
- Are complex boolean chains parenthesized? (Class 8)
Document your walk-through in the PR review comment. "I checked classes 1-8: class 5 has a concern at line 42 — see comment." Not "LGTM." If you can't list which classes you checked, you didn't review.
Anti-patterns (catch yourself before posting LGTM)
| Anti-pattern | Symptom | Why it's wrong |
|---|
| Vague approve | "LGTM" with no class walkthrough | Reviewer didn't actually check; just signed |
| Type-only review | "Type checks, syntax fine" | Compiler does this. Pair review's job is contract level |
| Skip-because-tested | "tests pass, so LGTM" | Tests test what's written, not what's missing. Class 6 (NotFound short-circuit) has tests that pass against the empty branch |
| One-class-only | "Class 4 looks bad" → reject without walking 1-3, 5-8 | Co-occurrences are common; missing other classes leaves work for next reviewer |
| Single-axis class 4 | "Class 4 walked: enum value space inside col X is fine" without checking cross-column reuse | Cross-column collision (4a) bypasses single-column reviews. Internally-row-consistent fixtures cannot catch this — see SQL Server v3 → v3.1 (id=3591) for the canonical miss-then-catch |
Closeout language
Good: "Reviewed: classes 1-8 walked. Found class 5 concern at line 42 (cleanup re-checks pre-state). Class 7 OK because we use the absent/terminating distinction in pkg/cleanup. Other 6 classes clean. Approve after class 5 fix."
Bad: "Approved." (no walk-through, no evidence)
Bad: "Looks fine." (no class attribution)
Related docs
docs/addon-design-contract-review-during-xp-guide.md — full prose methodology + 8 case appendices
docs/addon-evidence-discipline-guide.md — companion: when reviewing test reports, also size description language to evidence count
docs/addon-test-acceptance-and-first-blocker-guide.md — when a defect found post-merge slips to runtime, classify into 5-layer first-blocker for triage