integration-test
// [Testing] Use when you need to generate or review integration tests.
| Field | Value |
|---|---|
| name | integration-test |
| description | [Testing] Use when you need to generate or review integration tests. |
Codex compatibility note:
- Invoke repository skills with `$skill-name` in Codex; this mirrored copy rewrites legacy Claude `/skill-name` references.
- Prefer the `plan-hard` skill for planning guidance in this Codex mirror.
- Task tracker mandate: BEFORE executing any workflow or skill step, create/update task tracking for all steps and keep it synchronized as progress changes.
- User-question prompts mean to ask the user directly in Codex.
- Ignore Claude-specific mode-switch instructions when they appear.
- Strict execution contract: when a user explicitly invokes a skill, execute that skill protocol as written.
- Subagent authorization: when a skill is user-invoked or AI-detected and its protocol requires subagents, that skill activation authorizes use of the required `spawn_agent` subagent(s) for that task.
- Do not skip, reorder, or merge protocol steps unless the user explicitly approves the deviation first.
- For workflow skills, execute each listed child-skill step explicitly and report step-by-step evidence.
- If a required step/tool cannot run in this environment, stop and ask the user before adapting.
Codex does not receive Claude hook-based doc injection. When coding, planning, debugging, testing, or reviewing, open project docs explicitly using this routing.
Always read:
- docs/project-config.json (project-specific paths, commands, modules, and workflow/test settings)
- docs/project-reference/docs-index-reference.md (routes to the full docs/project-reference/* catalog)
- docs/project-reference/lessons.md (always-on guardrails and anti-patterns)

Situation-based docs:
- Backend/CQRS/API: backend-patterns-reference.md, domain-entities-reference.md, project-structure-reference.md
- Frontend/UI: frontend-patterns-reference.md, scss-styling-guide.md, design-system/README.md
- Feature docs/specs: feature-docs-reference.md
- Integration tests: integration-test-reference.md
- E2E: e2e-test-reference.md
- Code review: code-review-rules.md plus the domain docs above based on changed files

Do not read all docs blindly. Start from docs-index-reference.md, then open only the relevant files for the task.
[BLOCKING] Execute skill steps in declared order. NEVER skip, reorder, or merge steps without explicit user approval.
[BLOCKING] Before each step or sub-skill call, update task tracking: set `in_progress` when the step starts, set `completed` when the step ends.
[BLOCKING] Every completed/skipped step MUST include brief evidence or an explicit skip reason.
[BLOCKING] If Task tools are unavailable, create and maintain an equivalent step-by-step plan tracker with the same status transitions.
Goal: Generate/review integration test files using real DI (no mocks). 5 modes: (1) from-changes · (2) from-prompt · (3) review · (4) diagnose · (5) verify-traceability.
Workflow: Detect mode → Find targets → Gather context → Execute → Report
Key Rules:
- Read references/integration-test-patterns.md before writing
- NEVER create Queries/ or Commands/ folders
- $integration-test-verify runs without DB reset

Prerequisites — MUST ATTENTION READ before executing:
- references/integration-test-patterns.md — canonical test templates: collection attributes, base class usage, TC annotation format, async polling helpers, unique name generators, DB assertion patterns. MUST READ before writing ANY test.
- docs/project-reference/domain-entities-reference.md — domain entity catalog, relationships, cross-service sync (read directly when relevant; do not rely on hook-injected conversation text)
- docs/specs/ — existing TCs by module: read to verify test-to-spec traceability and get TC IDs before generating (read directly when relevant; do not rely on hook-injected conversation text)

CRITICAL: Search existing patterns FIRST. Before generating ANY test, grep existing integration test files in the same service. Read ≥1 existing test file to match conventions (namespace, usings, collection name, base class, helper usage). NEVER generate tests contradicting established codebase patterns.
CRITICAL: NO Smoke/Fake/Useless Tests. Every test MUST execute actual commands/handlers and verify DB data state. NO DI-resolution-only tests. NO exception-check-only tests. Before writing assertions: READ handler/entity/event source — understand WHAT fields change, WHAT entities created/updated/deleted, WHAT event handlers fire. Assert specific field values.
CRITICAL: Async Polling for ALL Data Assertions. ALWAYS wrap data state assertions in async polling/retry helper. DEFAULT for ALL data verification — not just async handlers. Data persistence may be delayed by event handlers, message bus consumers, background jobs, DB write latency. Rule: If asserting data in DB → use async polling. No exceptions.
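A minimal sketch of the intended polling shape, assuming a WaitUntilAsync-style helper (the name, timeout, and interval below are placeholders; prefer the project's canonical helper from references/integration-test-patterns.md):

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical polling helper: retries a condition until it holds or times out.
public static async Task WaitUntilAsync(
    Func<Task<bool>> condition,
    TimeSpan? timeout = null,
    TimeSpan? pollInterval = null)
{
    var deadline = DateTime.UtcNow + (timeout ?? TimeSpan.FromSeconds(10));
    var interval = pollInterval ?? TimeSpan.FromMilliseconds(200);

    while (DateTime.UtcNow < deadline)
    {
        if (await condition())
            return; // condition met: the data has landed
        await Task.Delay(interval);
    }

    throw new TimeoutException("Condition not met in time; likely delayed persistence.");
}

// Usage: poll until the entity is visible instead of asserting immediately.
// await WaitUntilAsync(() => DbContext.Orders.AnyAsync(o => o.Id == orderId));
```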
For test specifications and test case generation from PBIs, use the `$tdd-spec` skill instead.
External Memory: Complex/lengthy work → write findings to `plans/reports/` — prevents context loss.
Evidence Gate: MANDATORY IMPORTANT MUST ATTENTION — every claim requires `file:line` proof or traced evidence with a confidence percentage (>80% act, <80% verify first).
Before implementation, search codebase for patterns:
`IntegrationTest`, `TestFixture`, `TestUserContext`, `IntegrationTestBase`

MANDATORY IMPORTANT MUST ATTENTION: plan a task to READ `integration-test-reference.md` for project-specific patterns and code examples. If not found, continue with search-based discovery.
Key Rules:
- Read references/integration-test-patterns.md before writing any test
- Organize by domain feature (e.g., Orders/OrderCommandIntegrationTests.*). NEVER create Queries/ or Commands/ folders.
- // TC-{FEATURE}-{NNN}: Description comment + test-spec annotation — before the method, outside the body
- If TCs do not exist yet, run $tdd-spec first

ALWAYS create and execute tasks in this exact order:
FIRST: Verify/upsert test specs in feature docs
- Read feature docs (docs/business-features/{App}/detailed-features/) for the target domain
- Check the specs dashboard (docs/specs/{App}/README.md) if it exists
- Confirm each TC-{FEATURE}-{NNN} exists

MIDDLE: Implement integration tests
// TC-OM-001: Create valid order — happy path
[Trait("TestSpec", "TC-OM-001")]
[Fact]
public async Task CreateOrder_WhenValidData_ShouldCreateSuccessfully()
FINAL: Verify bidirectional traceability
- Confirm each test maps to a TC-{FEATURE}-{NNN} in feature doc Section 15 / specs doc
- Update the IntegrationTest field in feature doc TCs with {File}::{MethodName}

| Module | Abbreviation | Test Folder |
|---|---|---|
| Order Management | OM | Orders/ |
| Inventory | INV | Inventory/ |
| User Profiles | UP | UserProfiles/ |
| Notification Management | NM | Notifications/ |
| Report Generation | RG | Reports/ |
| Feedback | FB | Feedback/ |
| Background Jobs | BJ | — |
Creating new TC-{FEATURE}-{NNN} codes:
Check docs/business-features/{App}/detailed-features/ for existing codes. New codes must not collide.

Mode detection:

Args = command/query name (e.g., "$integration-test CreateOrderCommand")
→ FROM-PROMPT mode: generate tests for the specified command/query
No args (e.g., "$integration-test")
→ FROM-CHANGES mode: detect changed command/query files from git
Args = "review" (e.g., "$integration-test review Orders")
→ REVIEW mode: audit existing test quality, find flaky patterns, check best practices
Args = "diagnose" (e.g., "$integration-test diagnose OrderCommandIntegrationTests")
→ DIAGNOSE mode: analyze why tests fail — determine test bug vs code bug
Args = "verify" (e.g., "$integration-test verify {Service}")
→ VERIFY-TRACEABILITY mode: check test code matches specs and feature docs
Run via Bash tool:
git diff --name-only; git diff --cached --name-only
Filter for command/query files using project naming conventions (e.g., *Command.*, *Query.*). Path patterns from docs/project-config.json → modules or backendServices. Extract service from path:
| Path pattern | Service | Test project |
|---|---|---|
Per docs/project-config.json service path pattern | {Service} | {Service}.IntegrationTests (or project equivalent) |
Search codebase for existing *.IntegrationTests.* projects to find correct mapping.
If no test project exists: inform user "No integration test project for {service}. See CLAUDE.md Integration Testing section to create one."
If test file already exists: ask user overwrite or skip.
User specifies command/query name. Use Grep tool (NOT bash grep):
Grep pattern="class {CommandName}" path="." glob="*.cs"
For each target, read in parallel:
- Existing tests: {Service}.IntegrationTests/**/*IntegrationTests.* — read ≥1 for conventions (collection name, trait, namespace, usings, base class)
- Base class: grep `class .*ServiceIntegrationTestBase`
- references/integration-test-patterns.md — canonical templates (adapt {Service} placeholders)

For each target domain, read:
- docs/business-features/{App}/detailed-features/ Section 15 (primary source)
- docs/specs/{App}/README.md (secondary reference)

Build mapping: test case description → TC code (e.g., "create valid order" → TC-OM-001).
If no TCs exist for the target, run $tdd-spec first.

File path: {project-test-dir}/{Service}.IntegrationTests/{Domain}/{CommandName}IntegrationTests{ext} (adapt path/extension per docs/project-config.json → integrationTestVerify.testProjectPattern)
Folder = domain feature.
{Domain} = business domain (Orders, Inventory, Notifications, UserProfiles), NOT CQRS type. Command and query tests for the same domain live in the same folder.
Structure (C#/xUnit — adapt to your framework):
#region
using FluentAssertions;
// ... service-specific usings (copy from existing tests)
#endregion
namespace {Service}.IntegrationTests.{Domain};
[Collection({Service}IntegrationTestCollection.Name)]
[Trait("Category", "Command")] // or "Query"
public class {CommandName}IntegrationTests : {Service}ServiceIntegrationTestBase
{
// Minimum 3 tests: happy path, validation failure, DB state verification
}
Test method naming: {CommandName}_When{Condition}_Should{Expectation}
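To make the expected shape concrete, here is an illustrative happy-path test. It is a sketch only: Mediator, DbContext, UniqueName, WaitUntilAsync, and OrderStatus are hypothetical stand-ins for the project's actual base-class members and helpers (see references/integration-test-patterns.md):

```csharp
// TC-OM-001: Create valid order — happy path (illustrative sketch)
[Trait("TestSpec", "TC-OM-001")]
[Fact]
public async Task CreateOrder_WhenValidData_ShouldCreateSuccessfully()
{
    // Arrange: unique data per run keeps the test infinitely repeatable
    var command = new CreateOrderCommand { Name = UniqueName("order") };

    // Act: execute the real handler through DI (no mocks)
    var result = await Mediator.Send(command);

    // Assert: poll the DB, since persistence may be delayed by event handlers
    await WaitUntilAsync(async () =>
        await DbContext.Orders.AnyAsync(o => o.Id == result.Id));

    var order = await DbContext.Orders.SingleAsync(o => o.Id == result.Id);
    order.Name.Should().Be(command.Name);          // assert specific field values
    order.Status.Should().Be(OrderStatus.Draft);   // not just "no exception"
}
```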
Required patterns per command type:
| Command type | Required tests |
|---|---|
| Save/Create | Happy path + validation failure + DB state |
| Update | Create-then-update + verify updated fields in DB |
| Delete | Create-then-delete + AssertEntityDeletedAsync |
| Query | Filter returns results + pagination + empty result |
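As one concrete instance of the Delete row, a create-then-delete test might look like this sketch (DeleteOrderCommand and the generic AssertEntityDeletedAsync signature are assumptions; use the project's actual helper):

```csharp
// TC-OM-0XX: Delete order — create-then-delete (illustrative sketch)
[Trait("TestSpec", "TC-OM-0XX")]
[Fact]
public async Task DeleteOrder_WhenOrderExists_ShouldRemoveFromDb()
{
    // Arrange: create through the real use-case path, never via repository hacks
    var created = await Mediator.Send(new CreateOrderCommand { Name = UniqueName("order") });

    // Act
    await Mediator.Send(new DeleteOrderCommand { Id = created.Id });

    // Assert: polling delete-assertion helper waits until the row is gone
    await AssertEntityDeletedAsync<Order>(created.Id);
}
```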
Build test project via project's build tool (see $integration-test-verify for config-driven build).
MUST ATTENTION verify ALL of the following:
- Every test method carries a // TC-{FEATURE}-{NNN}: Description comment + test-spec annotation

Search codebase for existing integration test files:
find . -name "*IntegrationTests.*" -type f
find . -name "*IntegrationTestBase.*" -type f
find . -name "*IntegrationTestFixture.*" -type f
| Pattern | Shows |
|---|---|
{Service}.IntegrationTests/{Domain}/*CommandIntegrationTests.* | Create + update + validation |
{Service}.IntegrationTests/{Domain}/*QueryIntegrationTests.* | Query with create-then-query |
{Service}.IntegrationTests/{Domain}/Delete*IntegrationTests.* | Delete + cascade |
{Service}.IntegrationTests/{Service}ServiceIntegrationTestBase.* | Service base class pattern |
Case: Generate tests from existing test specs (feature docs Section 15)
$integration-test CreateOrderCommand
→ Reads Section 15 TCs, generates test file with TC annotations
Case: Generate tests from git changes (default)
$integration-test
→ Detects changed command/query files, checks Section 15 for matching TCs, generates tests
Case: Generate tests after $tdd-spec created new TCs
$tdd-spec → $integration-test
→ tdd-spec writes TCs to Section 15, then integration-test generates tests from those TCs
Case: Review existing tests for quality
$integration-test review Orders
→ Audits test quality, finds flaky patterns, checks best practices
Case: Diagnose test failures
$integration-test diagnose OrderCommandIntegrationTests
→ Analyzes failures, determines test bug vs code bug
Case: Verify test-spec traceability
$integration-test verify {Service}
→ Checks test code matches specs and feature docs bidirectionally
Mode = REVIEW: audit existing integration tests for quality, flaky patterns, best practices.
| Input type | Sub-agent | Why |
|---|---|---|
| Test file quality audit | integration-tester | Purpose-built for spec generation, TC traceability, and test patterns — catches integration-specific issues code-reviewer misses |
| Security-sensitive test data (PII, auth fixtures) | security-auditor | Detects PII leakage in test fixtures |
MANDATORY: Integration test REVIEW mode spawns the `integration-tester` sub-agent (agent_type: "integration-tester"), NOT `code-reviewer`. Rationale: `integration-tester` specializes in test spec generation, TC traceability, CQRS test patterns, `WaitUntilAsync` correctness, and microservices integration context — areas `code-reviewer` does not cover at depth.
Fresh Eyes Protocol: Run Round 1 inline. If findings are LOW confidence or contradictory → spawn fresh integration-tester sub-agent (zero memory of Round 1) for Round 2. Main agent reads report, NEVER filters findings. Max 2 rounds, then escalate.
Scan targets: {Service}.IntegrationTests/{Domain}/**/*IntegrationTests.*

Dimension 1: Reliability — Think: What causes intermittent failures?
- DB assertions without WaitUntilAsync() or equivalent → WILL flake
- Thread.Sleep() or Task.Delay() instead of condition-based polling
- DateTime.Now without time abstraction

Dimension 2: Assertion Value — Think: Does the test actually verify anything?
- exception.Should().BeNull() alone → HIGH severity

Dimension 3: Conventions — Think: Does the test follow project patterns?
- Missing [Trait("Category", "Command")] or equivalent

Dimension 4: Code Quality — Think: Maintainability and isolation?
- Method names not matching {Action}_When{Condition}_Should{Expectation}

# Integration Test Quality Report — {Domain}
## Summary
- Tests scanned: {N}
- Issues found: {N} (HIGH: {n}, MEDIUM: {n}, LOW: {n})
- Overall quality: {GOOD|NEEDS_WORK|CRITICAL}
## HIGH Severity Issues (Flaky Risk)
| Test | Issue | Fix |
| ------------ | ------------------------------------------------ | -------------------------------------- |
| {MethodName} | DB assertion without polling after async handler | Wrap in project's async polling helper |
## MEDIUM Severity Issues (Best Practice)
| Test | Issue | Fix |
| ---- | ----- | --- |
## LOW Severity Issues (Style)
| Test | Issue | Fix |
| ---- | ----- | --- |
## Recommendations
1. {Prioritized fix suggestions}
Mode = DIAGNOSE: analyze failing tests to determine test bug vs application code bug.
Test fails
├── Compilation error?
│ ├── Missing type/method → Code changed, test not updated → TEST BUG
│ └── Wrong import/namespace → TEST BUG
├── Timeout/hang?
│ ├── Missing async/await → TEST BUG
│ ├── Deadlock in handler → CODE BUG
│ └── Infrastructure down → INFRA ISSUE
├── Assertion failure?
│ ├── Expected value wrong?
│ │ ├── Test hardcoded old behavior → TEST BUG
│ │ └── Business logic changed → CODE BUG (if unintended) or TEST BUG (if intended change)
│ ├── Null/empty result?
│ │ ├── Entity not found → Check if create step succeeded → TEST BUG (setup) or CODE BUG (handler)
│ │ └── Query returns empty → Check filters/predicates → CODE BUG
│ ├── Intermittent (passes sometimes)?
│ │ ├── Async assertion without polling → TEST BUG (add async polling/retry)
│ │ ├── Non-unique test data collision → TEST BUG (use unique name generator)
│ │ └── Race condition in handler → CODE BUG
│ └── Wrong count/order?
│ ├── Test data leak from other tests → TEST BUG (isolation)
│ └── Logic error in query → CODE BUG
├── Validation error (expected success)?
│ ├── Test sends invalid data → TEST BUG
│ └── Validation rule too strict → CODE BUG
└── Exception thrown?
├── Known exception type in handler → CODE BUG
└── DI/config error → INFRA ISSUE
# Test Failure Diagnosis — {TestClass}
## Failing Tests
| Test Method | Error Type | Root Cause | Classification |
| ----------- | ----------------- | ------------- | --------------------------- |
| {Method} | {AssertionFailed} | {Description} | TEST BUG / CODE BUG / INFRA |
## Detailed Analysis
### {MethodName}
**Error:** {error message}
**Expected:** {what test expected}
**Actual:** {what happened}
**Root Cause:** {explanation with code evidence}
**Classification:** TEST BUG | CODE BUG | INFRA ISSUE
**Evidence:** `{file}:{line}` — {what the code does}
**Recommended Fix:** {specific fix with code location}
## Summary
- Test bugs: {N} — fix in test code
- Code bugs: {N} — fix in application code
- Infra issues: {N} — fix in configuration/environment
Mode = VERIFY: bidirectional traceability check between test code, test specs, feature docs.
| Scenario | Likely Correct Source | Action |
|---|---|---|
| Test passes, spec describes different behavior | Test (reflects current code) | Update spec to match test |
| Test fails, spec describes expected behavior | Spec (test is stale) | Update test to match spec |
| Test exists, no spec | Test (spec was never written) | Create spec from test |
| Spec exists, no test | Spec (test was never written) | Generate test from spec |
| Test and spec agree, but code behaves differently | Spec (code has regression) | Fix code or update spec+test |
MUST ATTENTION verify ALL of the following:
- Every TC without a matching test is flagged (Status: Untested)
- docs/specs/ dashboard is in sync with feature doc Section 15

# Traceability Report — {Service}
## Summary
- TCs in feature docs: {N}
- Test methods with TC annotations: {N}
- Fully traced (both directions): {N}
- Orphaned tests (no matching TC): {N}
- Orphaned TCs (no matching test): {N}
- Mismatched behavior: {N}
## Traceability Matrix
| TC ID | Feature Doc? | Test Code? | Dashboard? | Status |
| --------- | ------------ | ---------- | ---------- | ------------ |
| TC-OM-001 | ✅ | ✅ | ✅ | Traced |
| TC-OM-005 | ✅ | ❌ | ✅ | Missing test |
| TC-OM-010 | ❌ | ✅ | ❌ | Missing spec |
## Orphaned Tests (no matching TC in docs)
| Test File | Method | Annotation | Action |
| --------- | -------- | ---------- | ------------------------ |
| {file} | {method} | TC-OM-010 | Create TC in feature doc |
## Orphaned TCs (no matching test)
| TC ID | Doc Location | Priority | Action |
| --------- | ------------ | -------- | ----------------------------------- |
| TC-OM-005 | Section 15 | P0 | Generate test via $integration-test |
## Behavior Mismatches
| TC ID | Doc Says | Test Does | Correct Source | Action |
| ----- | -------- | --------- | -------------- | ------ |
## Recommendations
1. {Prioritized actions}
| Pattern | When to Use | Example |
|---|---|---|
| Per-test inline | Simple tests, unique data | var order = new CreateOrderCommand { Name = UniqueName() } |
| Factory methods | Repeated entity creation | TestDataFactory.CreateValidOrder() |
| Builder pattern | Complex entities with many fields | new OrderBuilder().WithStatus(Active).WithItems(3).Build() |
| Shared fixture | Reference data needed by all tests | CollectionFixture.SeedReferenceData() |
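A sketch of the builder row above (Order, OrderItem, and OrderStatus are stand-in shapes for illustration; match the project's real entities):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in entity shapes, for illustration only
public enum OrderStatus { Draft, Active }
public class OrderItem { public string Sku { get; set; } = ""; }
public class Order
{
    public string Name { get; set; } = "";
    public OrderStatus Status { get; set; }
    public List<OrderItem> Items { get; set; } = new();
}

// Builder matching the table's example: new OrderBuilder().WithStatus(Active).WithItems(3).Build()
public class OrderBuilder
{
    private OrderStatus _status = OrderStatus.Draft;
    private int _itemCount;
    private string _name = $"order-{Guid.NewGuid():N}"; // unique per run by default

    public OrderBuilder WithStatus(OrderStatus status) { _status = status; return this; }
    public OrderBuilder WithItems(int count) { _itemCount = count; return this; }
    public OrderBuilder WithName(string name) { _name = name; return this; }

    public Order Build() => new()
    {
        Name = _name,
        Status = _status,
        Items = Enumerable.Range(0, _itemCount)
                          .Select(i => new OrderItem { Sku = $"sku-{i}" })
                          .ToList(),
    };
}
```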
Rules:
MANDATORY IMPORTANT MUST ATTENTION — NO EXCEPTIONS: NOT in a workflow? Ask the user directly — do NOT decide complexity yourself. The user decides between:
- `test-to-integration` workflow (Recommended) — scout → integration-test → integration-test-review → integration-test-verify → test → docs-update → watzup → workflow-end
- `$integration-test` directly — standalone
IMPORTANT MUST ATTENTION: After generating/modifying integration tests, MUST:
- Run tests: `$integration-test-verify` (reads `quickRunCommand` from `docs/project-config.json`)
- If tests fail: diagnose root cause — (a) wrong test setup/assertions → fix test, or (b) service bug → report as finding
- NEVER mark done until tests pass. Unrun tests have zero value.
- Iterate: Fix → rerun → verify until all pass or failures confirmed as service bugs
MANDATORY IMPORTANT MUST ATTENTION — NO EXCEPTIONS: after completing, ask the user directly, presenting the following:
| Skill | Relationship | When to Call |
|---|---|---|
$tdd-spec | Producer — TCs in feature doc Section 15 are the source for test generation | Must run tdd-spec before integration-test (CREATE or UPDATE mode). TCs must exist before generating tests. |
$tdd-spec-review | Upstream reviewer — validates TC quality before test generation | Run before integration-test to ensure TCs have real assertion value |
$tdd-spec [direction=sync] | Dashboard — syncs QA dashboard after TCs are linked to test files | Run after integration-test to update IntegrationTest: fields in dashboard |
$feature-docs | TC host — Section 15 of feature doc is where TCs live | If feature doc is missing or Section 15 is empty → run $feature-docs first |
$spec-discovery | Upstream spec — engineering spec is source of truth for what tests should assert | If tests diverge from spec → check spec-discovery output for correct behavior |
$integration-test-review | Reviewer — 6-gate quality audit of generated tests | Always call after generating integration tests |
$integration-test-verify | Runner — executes tests and reports pass/fail | Always call after integration-test-review clears |
$docs-update | Orchestrator — calls tdd-spec sync (Phase 4) with test traceability | Run for full doc sync after integration test files updated |
When called outside a workflow, follow this chain to complete the integration test authoring cycle.
integration-test (you are here)
│
├─ PREREQUISITE: TCs must exist in feature doc Section 15
│ [REQUIRED] Verify: docs/business-features/{Module}/README.md Section 15 has TC-{FEATURE}-{NNN} entries
│ If empty → run $tdd-spec [CREATE mode] first
│
├─ [REQUIRED] → $integration-test-review
│ 6-gate quality audit: assertion value, data state, repeatability, domain logic, traceability, three-way sync.
│ Never skip — Gate 6 (three-way sync) is the only place where spec/code/test conflicts surface.
│
├─ [REQUIRED] → $integration-test-verify
│ Runs tests and reports pass/fail counts. Never mark complete without real runner output.
│
├─ [REQUIRED] → $tdd-spec [direction=sync]
│ Updates QA dashboard with IntegrationTest: file::method traceability links.
│
├─ [RECOMMENDED] → $docs-update
│ Updates feature doc evidence fields and version history if test coverage changed materially.
│
└─ [RECOMMENDED] → $tdd-spec-review
Re-run if integration-test-review (Gate 6) flagged TC issues requiring TC edits.
### Mode-Specific Chains
| Mode | Pre-step | Post-step |
|------|---------|-----------|
| from-changes | verify TCs updated (run $tdd-spec UPDATE first) | $integration-test-review → /verify → /sync |
| from-prompt | confirm TC exists for target feature | $integration-test-review → /verify → /sync |
| review | N/A (read-only) | report findings → $tdd-spec UPDATE if TCs need fixes |
| diagnose | run $test to see failures first | fix identified issue → re-run $integration-test-verify |
| verify-traceability | N/A (read-only) | if orphaned TCs: $tdd-spec UPDATE → $integration-test [from-prompt] |
[IMPORTANT] task tracking — break ALL work into small tasks BEFORE starting. NEVER skip task creation.
AI Mistake Prevention — Failure modes to avoid on every task:
- Check downstream references before deleting. Deleting components causes documentation and code staleness cascades. Map all referencing files before removal.
- Verify AI-generated content against actual code. AI hallucinates APIs, class names, and method signatures. Always grep to confirm existence before documenting or referencing.
- Trace the full dependency chain after edits. Changing a definition misses downstream variables and consumers derived from it. Always trace the full chain.
- Trace ALL code paths when verifying correctness. Confirming code exists is not confirming it executes. Always trace early exits, error branches, and conditional skips — not just the happy path.
- When debugging, ask "whose responsibility?" before fixing. Trace whether the bug is in the caller (wrong data) or the callee (wrong handling). Fix at the responsible layer — never patch the symptom site.
- Assume existing values are intentional — ask WHY before changing. Before changing any constant, limit, flag, or pattern: read comments, check git blame, examine surrounding code.
- Verify ALL affected outputs, not just the first. Changes touching multiple stacks require verifying EVERY output. One green check is not all green checks.
- Holistic-first debugging — resist the nearest-attention trap. When investigating any failure, list EVERY precondition first (config, env vars, DB names, endpoints, DI registrations, data preconditions), then verify each against evidence before forming any code-layer hypothesis.
- Surgical changes — apply the diff test. Bug fix: every changed line must trace directly to the bug. Don't restyle or improve adjacent code. Enhancement task: implement improvements AND announce them explicitly.
- Surface ambiguity before coding — don't pick silently. If a request has multiple interpretations, present each with an effort estimate and ask. Never assume the all-records, file-based, or more complex path.
Critical Thinking Mindset — Apply critical thinking and sequential thinking. Every claim needs traced proof; confidence must exceed 80% to act. Anti-hallucination: never present a guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, and stay skeptical of your own confidence; certainty without evidence is the root of all hallucination.
Understand Code First — HARD-GATE: Do NOT write, plan, or fix until you READ existing code.
- Search 3+ similar patterns (grep/glob) — cite file:line evidence
- Read existing files in the target area — understand structure, base classes, conventions
- Run `python .claude/scripts/code_graph trace <file> --direction both --json` when `.code-graph/graph.db` exists
- Map dependencies via `connections` or `callers_of` — know what depends on your target
- Write investigation to `.ai/workspace/analysis/` for non-trivial tasks (3+ files)
- Re-read the analysis file before implementing — never work from memory alone
- NEVER invent new patterns when existing ones work — match exactly or document deviation
BLOCKED until:
- [ ] Read target files
- [ ] Grep 3+ patterns
- [ ] Graph trace (if graph.db exists)
- [ ] Assumptions verified with evidence
Graph Impact Analysis — When `.code-graph/graph.db` exists, run `blast-radius --json` to detect ALL files affected by changes (7 edge types: CALLS, MESSAGE_BUS, API_ENDPOINT, TRIGGERS_EVENT, PRODUCES_EVENT, TRIGGERS_COMMAND_EVENT, INHERITS). Compute the gap: impacted_files - changed_files = potentially stale files. Risk: <5 Low, 5-20 Medium, >20 High. Use `trace --direction downstream` for deep chains on high-impact files.
Infinitely Repeatable Tests — Tests MUST run N times without failure. Like manual QC — run 100 times, each run adds data. Verification is only PASS after the relevant suite/project passes 3 consecutive runs without DB reset.
- Unique data per run: Use project's unique ID generator for ALL entity IDs. NEVER hardcode IDs.
- Additive only: Tests create data, never delete/reset. Prior runs MUST NOT interfere.
- No schema rollback dependency: Tests work with current schema only. Never rely on rollback.
- Idempotent seeders: Fixture-level seeders use create-if-missing (check existence before insert). Test-level data uses unique IDs per execution.
- No cleanup required: No teardown, no DB reset between runs. Isolation by unique seed data, not cleanup.
- Unique names/codes: Entities requiring unique names/codes — append unique suffix via project's ID generator.
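A minimal sketch of these rules, assuming EF Core-style access (AppDbContext and Currency are hypothetical stand-ins for the project's actual context and reference entities):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Fixture-level seeder: idempotent create-if-missing, safe to run every session
public static async Task SeedUsdCurrencyAsync(AppDbContext db)
{
    if (!await db.Currencies.AnyAsync(c => c.Code == "USD")) // check existence before insert
    {
        db.Currencies.Add(new Currency { Code = "USD", Name = "US Dollar" });
        await db.SaveChangesAsync();
    }
}

// Test-level data: unique per execution, so runs are additive and never interfere
public static string UniqueName(string prefix) => $"{prefix}-{Guid.NewGuid():N}";
```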
Red Flag Stop Conditions — STOP and escalate by asking the user directly when:
- Confidence drops below 60% on any critical decision
- Changes affect >20 files
- Cross-service boundary crossed
- Security-sensitive code (auth, crypto, PII)
- Breaking change detected (interface, API contract, DB schema)
- Test coverage would decrease
- Approach requires technology/pattern not in project
NEVER proceed past a red flag without explicit user approval.
Rationalization Prevention — AI skips steps via these evasions. Recognize and reject:
Evasion Rebuttal "Too simple for a plan" Simple + wrong assumptions = wasted time. Plan anyway. "I'll test after" RED before GREEN. Write/verify test first. "Already searched" Show grep evidence with file:line. No proof = no search."Just do it" Still need task tracking. Skip depth, never skip tracking. "Just a small fix" Small fix in wrong location cascades. Verify file:line first. "Code is self-explanatory" Future readers need evidence trail. Document anyway. "Combine steps to save time" Combined steps dilute focus. Each step has distinct purpose.
Incremental Result Persistence — MANDATORY for all sub-agents or heavy inline steps processing >3 files.
- Before starting: Create report file plans/reports/{skill}-{date}-{slug}.md
- After each file/section reviewed: Append findings to the report immediately — never hold in memory
- Return to main agent: Summary only (per SYNC:subagent-return-contract) with a `Full report:` path
- Main agent: Reads the report file only when resolving specific blockers
Why: Context cutoff mid-execution loses ALL in-memory findings. Each disk write survives compaction.
Report naming:
plans/reports/{skill-name}-{YYMMDD}-{HHmm}-{slug}.md
Sub-Agent Return Contract — When this skill spawns a sub-agent, the sub-agent MUST return ONLY this structure. Main agent reads only this summary — NEVER requests full sub-agent output inline.
## Sub-Agent Result: [skill-name]
Status: ✅ PASS | ⚠️ PARTIAL | ❌ FAIL
Confidence: [0-100]%
### Findings (Critical/High only — max 10 bullets)
- [severity] [file:line] [finding]
### Actions Taken
- [file changed] [what changed]
### Blockers (if any)
- [blocker description]
Full report: plans/reports/[skill-name]-[date]-[slug].md

Main agent reads the full report ONLY when: (a) resolving a specific blocker, or (b) building a fix plan. Sub-agent writes the full report incrementally (per SYNC:incremental-persistence) — not held in memory.
Sub-Agent Selection — Full routing contract: .claude/skills/shared/sub-agent-selection-guide.md
Rule: NEVER use `code-reviewer` for specialized domains (architecture, security, performance, DB, E2E, integration-test, git).
Nested Task Expansion Contract — For workflow-step invocation, the `[Workflow] ...` row is only a parent container; the child skill still creates visible phase tasks.
- Call the current task list first. If a matching active parent workflow row exists, set nested=true and record parentTaskId; otherwise run standalone.
- Create one task per declared phase before phase work. When nested, prefix subjects with `[N.M] $skill-name — phase`.
- When nested, link the parent with TaskUpdate(parentTaskId, addBlockedBy: [childIds]).
- Orchestrators must pre-expand a child skill's phase list and link the workflow row before invoking that child skill or sub-agent.
- Mark exactly one child in_progress before work and completed immediately after evidence is written.
- Complete the parent only after all child tasks are completed or explicitly cancelled with a reason.

Blocked until: the current task list call is done, child phases created, parent linked when nested, first child marked in_progress.
Project Reference Docs Gate — Run after task-tracking bootstrap and before target/source file reads, grep, edits, or analysis. Project docs override generic framework assumptions.
- Identify scope: file types, domain area, and operation.
- Required docs by trigger: always → docs/project-reference/lessons.md; doc lookup → docs-index-reference.md; review → code-review-rules.md; backend/CQRS/API → backend-patterns-reference.md; domain/entity → domain-entities-reference.md; frontend/UI → frontend-patterns-reference.md; styles/design → scss-styling-guide.md + design-system/README.md; integration tests → integration-test-reference.md; E2E → e2e-test-reference.md; feature docs/specs → feature-docs-reference.md; architecture/new area → project-structure-reference.md.
- Read every required doc that exists; skip absent docs as not applicable. Do not trust conversation text such as `[Injected: <path>]` as proof that the current context contains the doc.
- Before target work, state: `Reference docs read: ... | Missing/not applicable: ...`.

Blocked until: scope evaluated, required docs checked/read, lessons.md confirmed, citation emitted.
Task Tracking & External Report Persistence — Bootstrap this before execution; then run project-reference doc prefetch before target/source work.
- Create a small task breakdown before target file reads, grep, edits, or analysis. On context loss, inspect the current task list first.
- Mark one task in_progress before work and completed immediately after evidence; never batch transitions.
- For plan/review work, create plans/reports/{skill}-{YYMMDD}-{HHmm}-{slug}.md before the first finding.
- Append findings after each file/section/decision and synthesize from the report file at the end.
- Final output cites `Full report: plans/reports/{filename}`.

Blocked until: task breakdown exists, report path declared for plan/review work, first finding persisted before the next finding.
- Evidence: every claim cites file:line.
- Run blast-radius when graph.db exists; flag impacted files NOT in the changeset as potentially stale.
- MUST ATTENTION apply critical thinking — every claim needs traced proof, confidence >80% to act. Anti-hallucination: never present a guess as fact.
- MUST ATTENTION apply AI mistake prevention — holistic-first debugging, fix at the responsible layer, surface ambiguity before coding, re-read files after compaction.
- Write findings to plans/reports/ incrementally and synthesize from disk.
- Emit `Reference docs read: ...` and confirm lessons.md; project conventions override generic defaults.
- Use `[N.M] $skill-name — phase` prefixes and one-in_progress discipline.
IMPORTANT MUST ATTENTION follow declared step order for this skill; NEVER skip, reorder, or merge steps without explicit user approval
IMPORTANT MUST ATTENTION for every step/sub-skill call: set in_progress before execution, set completed after execution
IMPORTANT MUST ATTENTION every skipped step MUST include explicit reason; every completed step MUST include concise evidence
IMPORTANT MUST ATTENTION if Task tools unavailable, maintain an equivalent step-by-step plan tracker with synchronized statuses
- Read references/integration-test-patterns.md BEFORE writing any test
- NEVER create Queries/ or Commands/ folders — organize by domain feature
- Run tests via $integration-test-verify after generating/modifying
- Verify file:line before modifying anything

Anti-Rationalization:
| Evasion | Rebuttal |
|---|---|
| "Test is simple, skip TC lookup" | TC traceability = test value. Skip = untraceable test. |
| "Async polling not needed here" | ALL DB assertions need polling. Handler type irrelevant. |
| "Already searched patterns" | Show file:line evidence. No proof = no search. |
| "Smoke test is fine for now" | Smoke-only FORBIDDEN. Assert specific field values. |
| "Repo setup is faster" | Direct repository data hacks create invalid state. Use real use-case paths or valid seeded fixtures. |
| "One green run is enough" | Verification requires 3 consecutive passing runs without DB reset. |
| "REVIEW: one pass is enough" | Low confidence → spawn fresh sub-agent. Never declare PASS after Round 1. |
| "Skip task creation, it's obvious" | task tracking is non-negotiable. Tracking prevents context loss. |
Source: .claude/hooks/lib/prompt-injections.cjs + .claude/.ck.json
Use $workflow-start <workflowId> for standard workflows; sequence custom steps manually.

[CRITICAL] Hard-won project debugging/architecture rules. MUST ATTENTION apply BEFORE forming a hypothesis or writing code.
Goal: Prevent recurrence of known failure patterns — debugging, architecture, naming, AI orchestration, environment.
Top Rules (apply always):
- Parallel async + repo/UoW: use ExecuteInjectScopedAsync, NEVER ExecuteUowTask. ExecuteUowTask creates a new UoW but reuses the outer DI scope (same DbContext) — parallel iterations sharing a non-thread-safe DbContext silently corrupt data. ExecuteInjectScopedAsync creates a new UoW + new DI scope (fresh repo per iteration).
- Event bus ownership: the owning service defines the event (AccountUserEntityEventBusMessage = Accounts owns). Core services (Accounts, Communication) are leaders. Feature services (Growth, Talents) sending to core MUST use {CoreServiceName}...RequestBusMessage — never define their own event for core to consume.
- Naming (HrManagerOrHrOrPayroll → HrOperations): a policy name that lists set members describes contents, not what it guards. Add a role → rename = broken abstraction. Rule: names express DOES/GUARDS, not CONTAINS. Test: does adding/removing a member force a rename? YES = content-driven = bad → rename to purpose (e.g., HrOperationsAccessPolicy). Nuance: "Or" is fine in behavioral idioms (FirstOrDefault, SuccessOrThrow) — it expresses HAPPENS, not membership.
- Python on Windows: NEVER assume python/python3 resolves — verify the alias first (where python / where py). Python may not be in the bash PATH under those names. Prefer py (Windows Python Launcher) for one-liners, or node if a JS alternative exists.
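For intuition, the generic DI equivalent of that first rule looks like the sketch below. This is not the project's ExecuteInjectScopedAsync implementation; it only shows why a fresh scope per parallel iteration avoids sharing one DbContext:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

// Each parallel iteration resolves its own scope, so no non-thread-safe
// DbContext/repository instance is ever shared across iterations.
public static Task ProcessInParallelAsync<T>(
    IServiceScopeFactory scopeFactory,
    IEnumerable<T> items,
    Func<IServiceProvider, T, Task> work)
{
    return Task.WhenAll(items.Select(async item =>
    {
        using var scope = scopeFactory.CreateScope(); // fresh scope per iteration
        await work(scope.ServiceProvider, item);
    }));
}
```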
docs/project-reference/integration-test-reference.mdLessons Learned section. Production-code anti-patterns →docs/project-reference/backend-patterns-reference.mdAnti-Patterns section. Generic debugging/refactoring reminders → System Lessons in.claude/hooks/lib/prompt-injections.cjs.
- ExecuteInjectScopedAsync, NEVER ExecuteUowTask (shared DbContext = silent data corruption)
- Feature → core messaging uses {CoreServiceName}...RequestBusMessage
- Never assume python/python3 resolves — run where python/where py first, use the py launcher or node

Break work into small tasks (task tracking) before starting. Add a final task: "Analyze AI mistakes & lessons learned".
Extract lessons — ROOT CAUSE ONLY, not symptom fixes:
- Record durable lessons via $learn.
- Ask: "Would $code-review/$code-simplifier/$security/$lint catch this?" — Yes → improve the review skill instead of adding a lesson via $learn.
[TASK-PLANNING] [MANDATORY] BEFORE executing any workflow or skill step, create/update task tracking for all planned steps, then keep it synchronized as each step starts/completes.