Codex compatibility note:

Invoke repository skills with $skill-name in Codex; this mirrored copy rewrites legacy Claude /skill-name references.

Prefer the plan-hard skill for planning guidance in this Codex mirror.

Task tracker mandate: BEFORE executing any workflow or skill step, create/update task tracking for all steps and keep it synchronized as progress changes.

User-question prompts mean to ask the user directly in Codex.

Ignore Claude-specific mode-switch instructions when they appear.

Strict execution contract: when a user explicitly invokes a skill, execute that skill protocol as written.

Subagent authorization: when a skill is user-invoked or AI-detected and its protocol requires subagents, that skill activation authorizes use of the required spawn_agent subagent(s) for that task.

Do not skip, reorder, or merge protocol steps unless the user explicitly approves the deviation first.

For workflow skills, execute each listed child-skill step explicitly and report step-by-step evidence.

If a required step/tool cannot run in this environment, stop and ask the user before adapting.

Codex Project-Reference Loading (No Hooks)

Codex does not receive Claude hook-based doc injection. When coding, planning, debugging, testing, or reviewing, open project docs explicitly using this routing.

Always read:

docs/project-config.json (project-specific paths, commands, modules, and workflow/test settings)
docs/project-reference/docs-index-reference.md (routes to the full docs/project-reference/* catalog)
docs/project-reference/lessons.md (always-on guardrails and anti-patterns)

Situation-based docs:

Backend/CQRS/API/domain/entity changes: backend-patterns-reference.md, domain-entities-reference.md, project-structure-reference.md
Frontend/UI/styling/design-system: frontend-patterns-reference.md, scss-styling-guide.md, design-system/README.md
Spec/test-case planning or TC mapping: feature-docs-reference.md
Integration test implementation/review: integration-test-reference.md
E2E test implementation/review: e2e-test-reference.md
Code review/audit work: code-review-rules.md plus domain docs above based on changed files

Do not read all docs blindly. Start from docs-index-reference.md, then open only relevant files for the task.

[BLOCKING] Execute skill steps in declared order. NEVER skip, reorder, or merge steps without explicit user approval. [BLOCKING] Before each step or sub-skill call, update task tracking: set in_progress when step starts, set completed when step ends. [BLOCKING] Every completed/skipped step MUST include brief evidence or explicit skip reason. [BLOCKING] If Task tools are unavailable, create and maintain an equivalent step-by-step plan tracker with the same status transitions.

Quick Summary

Goal: Run integration tests after $integration-test writes them and $integration-test-review reviews them. Confirm all pass and remain repeatable.

Workflow:

Read Config — Load docs/project-config.json → integrationTestVerify section for project-specific run guidance
System Check — Verify required system is healthy before running
Determine Test Projects — Discover via testProjectPattern glob, testProjects list, or git auto-detect
Run Tests — Execute quickRunCommand on determined test projects for 3 consecutive runs
Report — Pass/fail counts, failed test names, next steps on failure

Key Rules:

MUST read project config integrationTestVerify section before doing anything else
MUST read project-specific reference docs named by integrationTestVerify.referenceDocs or the project's integration-test doc path before running tests
Use quickRunCommand from config — NEVER hardcode dotnet test or any language-specific command
If system check fails → instruct user how to start system (reference startupScript from config)
If config says local infrastructure, databases, services, or full system startup is required, treat that as a blocking prerequisite
On test failure → diagnose root cause: test bug or service bug. NEVER weaken assertions.
Verification only passes after 3 consecutive successful runs of each relevant suite/project without DB reset
Always report exact failure counts and names — "all passed" requires evidence

Be skeptical. Apply critical thinking. Every pass/fail claim needs actual test runner output.

Step 1: Read Project Config

Read docs/project-config.json and extract the integrationTestVerify section.

Expected config shape:
{
  "integrationTestVerify": {
    "guidance":             string   — instructions for this project's test run approach
    "referenceDocs":        string[] — project docs that explain integration-test setup/run prerequisites
    "quickRunCommand":      string   — test runner command (e.g., "dotnet test --no-build", "npm test", "pytest")
    "testProjectPattern":   string   — glob pattern to discover test projects (e.g., "src/Services/**/*.IntegrationTests.csproj")
    "testProjects":         string[] — explicit list of test project paths (fallback if no pattern)
    "systemCheckCommand":   string   — shell command to check system readiness
    "runScript":            string   — path to CI-style full run script (reference only)
    "startupScript":        string   — path to system startup script (reference only)
  }
}

Config priority: testProjectPattern (auto-discovers via glob) > testProjects (explicit list) > git auto-detect (fallback).

If integrationTestVerify section is missing: proceed to Fallback Mode.

If section exists: display the guidance value to the user verbatim — it contains project-specific instructions the implementer wrote intentionally.

Then read the project-specific setup guidance before any system check or test command:

Read every file listed in integrationTestVerify.referenceDocs, if present.
If no referenceDocs list exists, read the integration-test reference doc indicated elsewhere in docs/project-config.json (for example a framework/testing integration test doc path), if present.
If config names runScript or startupScript, read those scripts when needed to understand startup, health checks, arguments, or labels. Use them as project-specific evidence, not generic assumptions.
If no project-specific reference exists, proceed only with the explicit config values and call out that the project should add reference docs to integrationTestVerify.

Step 2: System Check

If systemCheckCommand exists in config:

Run the system check via Bash:

{systemCheckCommand}

Evaluate output:

Healthy → proceed to Step 3
Partially healthy / no containers → display startup instructions to user: > "System not fully ready. To start: run {startupScript} (or follow the guidance above). Wait for all services to be healthy, then re-run $integration-test-verify."

STOP — do not run tests against an unhealthy system. Results would be unreliable.

If no systemCheckCommand:

If guidance, reference docs, runScript, or startupScript indicate required local infrastructure/services, STOP and tell the user the project config needs a concrete readiness check before AI verification can run.
Otherwise, proceed to Step 3 and explicitly report that no system check was configured.

Step 3: Determine Test Projects

Priority order: testProjectPattern (glob auto-discover) > testProjects (explicit list) > git auto-detect (fallback).

If testProjectPattern exists in config:

Discover test projects by running a glob search for the pattern:

# Example for .NET projects (pattern: "src/Services/**/*.IntegrationTests.csproj")
find . -path "{testProjectPattern}" -type f
# or use language-appropriate glob tool

Use all discovered .csproj files (or equivalent) as the test project list. Exclude any paths outside the pattern scope.

If no testProjectPattern but testProjects list exists:

Use the explicit list from config directly.

If neither exists — auto-detect from git:

# Auto-detect changed test projects
git diff --name-only HEAD | grep -i "IntegrationTest" | sed 's|/[^/]*$||' | sort -u

If auto-detect finds nothing (no uncommitted test changes), ask user: "No changed test files detected. Run all test projects or skip?"

Filter rule: Only run projects relevant to the current change. If user explicitly asks to run all → run all discovered/configured projects.

Step 4: Run Tests

Do not run this step unless Step 2 passed or the config/reference docs explicitly state no external system is required.

Execute using quickRunCommand from config. Run each relevant suite/project 3 consecutive times without resetting data.

Three-run idempotency gate: If any run fails, verification fails. Fix the root cause, then restart the 3-run sequence from run 1.

Example for a.NET project:

# Run each test project individually for clear per-project results
{quickRunCommand} {testProject1}
{quickRunCommand} {testProject2}
# ...

Or run all at once using the solution filter if supported:

{quickRunCommand} --filter "Category=integration"

Capture output for every run: count Passed, Failed, Skipped. Note: skipped tests (tests marked with a framework-specific skip annotation, e.g., [Fact(Skip=...)] in xUnit, @Disabled in JUnit) are expected and not a failure.

Step 5: Report Results

After all tests complete, report:

### Integration Test Verify Results

**Run command:** {quickRunCommand}
**Projects tested:** {N}
**Repeatability gate:** 3 consecutive runs without DB reset

| Project | Run | Passed | Failed | Skipped |
|---------|-----|--------|--------|---------|
| {Project1} | 1 | X | 0 | Y |
| {Project1} | 2 | X | 0 | Y |
| {Project1} | 3 | X | 0 | Y |

**Total:** {total_passed} passed, {total_failed} failed, {total_skipped} skipped (expected skip annotations)

Status: ✅ ALL PASS | ❌ {N} FAILURES

On failure:

List each failing test name + failure message
Diagnose: test bug (wrong assertion setup) or service bug (handler actually broken)?
If test bug → fix in the test file (do NOT weaken assertions — fix setup/data)
If service bug → report as finding, do NOT silently fix without telling user
After fixing → re-run the full 3-run verify sequence

Fallback Mode (No Project Config)

When docs/project-config.json has no integrationTestVerify section:

Detect project type from root files:
- *.sln or *.csproj → dotnet test
- package.json → npm test or npx jest
- pytest.ini / setup.py / pyproject.toml → pytest
- go.mod → go test ./...
Auto-detect changed test files from git:
```
git diff --name-only HEAD
```
Run detected command on changed test projects.
Report results and recommend: "Add integrationTestVerify to docs/project-config.json for project-specific run guidance."

CI-Style Full Run (Reference)

When runScript is configured, reference it for the full CI-style run (not run by AI directly — Windows.cmd scripts and CI runners require user/pipeline execution):

"For a full CI-style run including Docker orchestration and health polling, execute: {runScript}"

This script typically: creates networks → removes stale containers → builds images → starts infrastructure (wait healthy) → starts APIs (wait healthy) → runs all tests.

On Test Failure Protocol

NEVER do these to make failures go away:

❌ Remove or weaken assertions
❌ Add skip annotations (e.g., [Fact(Skip=...)] in xUnit, @Disabled in JUnit) to hide failures
❌ Create or mutate domain data through repositories to bypass real use-case paths
❌ Mark passing by ignoring error output
❌ Report "all passed" without showing actual runner output

DO this instead:

Read the failing test method
Read the handler/service the test targets
Identify: is the assertion wrong, or is the code wrong?
Fix at the root cause layer; use real use cases or valid seeded fixtures for data setup
Re-run to confirm green

If a test fails because the system is unavailable → report as "system not ready" and reference startupScript / runScript. Never change the test.

Workflow Recommendation

MANDATORY IMPORTANT MUST ATTENTION — NO EXCEPTIONS: If you are NOT already in a workflow, you MUST ATTENTION use a direct user question to ask the user. Do NOT judge task complexity or decide this is "simple enough to skip" — the user decides whether to use a workflow, not you:

Activate test-to-integration workflow (Recommended) — scout → integration-test → integration-test-review → integration-test-verify → test → docs-update → watzup → workflow-end

Execute $integration-test-verify directly — run this skill standalone

Next Steps

MANDATORY IMPORTANT MUST ATTENTION — NO EXCEPTIONS after completing this skill, you MUST ATTENTION use a direct user question to present these options. Do NOT skip because the task seems "simple" or "obvious" — the user decides:

"$workflow-review-changes (Recommended)" — Review all changes before committing
"$integration-test-review" — If tests fail: review and fix integration tests before re-verify
"$docs-update" — Update documentation if test counts changed
"Skip, continue manually" — user decides

[IMPORTANT] Use task tracking to break ALL work into small tasks BEFORE starting. A verify step that does not actually run tests 3 consecutive times is not repeatability verification. It is theater. Read project config FIRST to understand how to run tests for this specific project.

AI Mistake Prevention — Failure modes to avoid on every task:

Check downstream references before deleting. Deleting components causes documentation and code staleness cascades. Map all referencing files before removal. Verify AI-generated content against actual code. AI hallucinates APIs, class names, and method signatures. Always grep to confirm existence before documenting or referencing. Trace full dependency chain after edits. Changing a definition misses downstream variables and consumers derived from it. Always trace the full chain. Trace ALL code paths when verifying correctness. Confirming code exists is not confirming it executes. Always trace early exits, error branches, and conditional skips — not just happy path. When debugging, ask "whose responsibility?" before fixing. Trace whether bug is in caller (wrong data) or callee (wrong handling). Fix at responsible layer — never patch symptom site. Assume existing values are intentional — ask WHY before changing. Before changing any constant, limit, flag, or pattern: read comments, check git blame, examine surrounding code. Verify ALL affected outputs, not just the first. Changes touching multiple stacks require verifying EVERY output. One green check is not all green checks. Holistic-first debugging — resist nearest-attention trap. When investigating any failure, list EVERY precondition first (config, env vars, DB names, endpoints, DI registrations, data preconditions), then verify each against evidence before forming any code-layer hypothesis. Surgical changes — apply the diff test. Bug fix: every changed line must trace directly to the bug. Don't restyle or improve adjacent code. Enhancement task: implement improvements AND announce them explicitly. Surface ambiguity before coding — don't pick silently. If request has multiple interpretations, present each with effort estimate and ask. Never assume all-records, file-based, or more complex path.

Nested Task Expansion Contract — For workflow-step invocation, the [Workflow] ... row is only a parent container; the child skill still creates visible phase tasks.

Call the current task list first. If a matching active parent workflow row exists, set nested=true and record parentTaskId; otherwise run standalone.

Create one task per declared phase before phase work. When nested, prefix subjects [N.M] $skill-name — phase.

When nested, link the parent with TaskUpdate(parentTaskId, addBlockedBy: [childIds]).

Orchestrators must pre-expand a child skill's phase list and link the workflow row before invoking that child skill or sub-agent.

Mark exactly one child in_progress before work and completed immediately after evidence is written.

Complete the parent only after all child tasks are completed or explicitly cancelled with reason.

Blocked until: the current task list done, child phases created, parent linked when nested, first child marked in_progress.

Task Tracking & External Report Persistence — Bootstrap this before execution; then run project-reference doc prefetch before target/source work.

Create a small task breakdown before target file reads, grep, edits, or analysis. On context loss, inspect the current task list first.

Mark one task in_progress before work and completed immediately after evidence; never batch transitions.

For plan/review work, create plans/reports/{skill}-{YYMMDD}-{HHmm}-{slug}.md before first finding.

Append findings after each file/section/decision and synthesize from the report file at the end.

Final output cites Full report: plans/reports/{filename}.

Blocked until: task breakdown exists, report path declared for plan/review work, first finding persisted before the next finding.

Critical Thinking Mindset — Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination.

Project Reference Docs Gate — Run after task-tracking bootstrap and before target/source file reads, grep, edits, or analysis. Project docs override generic framework assumptions.

Identify scope: file types, domain area, and operation.

Required docs by trigger: always docs/project-reference/lessons.md; doc lookup docs-index-reference.md; review code-review-rules.md; backend/CQRS/API backend-patterns-reference.md; domain/entity domain-entities-reference.md; frontend/UI frontend-patterns-reference.md; styles/design scss-styling-guide.md + design-system/README.md; integration tests integration-test-reference.md; E2E e2e-test-reference.md; feature docs/specs feature-docs-reference.md; architecture/new area project-structure-reference.md.

Read every required doc that exists; skip absent docs as not applicable. Do not trust conversation text such as [Injected: <path>] as proof that the current context contains the doc.

Before target work, state: Reference docs read: ... | Missing/not applicable: ....

Blocked until: scope evaluated, required docs checked/read, lessons.md confirmed, citation emitted.

MUST ATTENTION apply critical thinking — every claim needs traced proof, confidence >80% to act. Anti-hallucination: never present guess as fact.

MUST ATTENTION apply AI mistake prevention — holistic-first debugging, fix at responsible layer, surface ambiguity before coding, re-read files after compaction.

MANDATORY Bootstrap task tracking before target work; transition one task at a time.
MANDATORY Persist plan/review findings to plans/reports/ incrementally and synthesize from disk.

MANDATORY After task-tracking bootstrap and before target/source work, read required project-reference docs and cite Reference docs read: ....
MANDATORY Always include lessons.md; project conventions override generic defaults.

MANDATORY Parent workflow rows do not replace child phase tracking; expand phases and link the parent when nested.
MANDATORY Orchestrators pre-expand child skill phases before invocation; use [N.M] $skill-name — phase prefixes and one-in_progress discipline.

Prompt-Enhance Closing Anchors

IMPORTANT MUST ATTENTION follow declared step order for this skill; NEVER skip, reorder, or merge steps without explicit user approval IMPORTANT MUST ATTENTION for every step/sub-skill call: set in_progress before execution, set completed after execution IMPORTANT MUST ATTENTION every skipped step MUST include explicit reason; every completed step MUST include concise evidence IMPORTANT MUST ATTENTION if Task tools unavailable, maintain an equivalent step-by-step plan tracker with synchronized statuses

Closing Reminders

MANDATORY IMPORTANT MUST ATTENTION read docs/project-config.json → integrationTestVerify FIRST — project-specific guidance overrides defaults
MANDATORY IMPORTANT MUST ATTENTION read project-specific integration-test reference docs/scripts from config before any test command — Codex has no hook injection
MANDATORY IMPORTANT MUST ATTENTION use quickRunCommand from config, not hardcoded dotnet test — this skill is language-agnostic
MANDATORY IMPORTANT MUST ATTENTION run system check before tests — unreliable system = unreliable results
MANDATORY IMPORTANT MUST ATTENTION never weaken assertions to fix failures — diagnose and fix root cause
MANDATORY IMPORTANT MUST ATTENTION show actual test runner output — "all passed" without evidence is not verification
MANDATORY IMPORTANT MUST ATTENTION on failure: diagnose (test bug vs service bug) before fixing anything

[TASK-PLANNING] Before acting, analyze task scope and systematically break it into small todo tasks and sub-tasks using task tracking.

[IMPORTANT] Analyze how big the task is and break it into many small todo tasks systematically before starting — this is very important.

Hookless Prompt Protocol Mirror (Auto-Synced)

Source: .claude/hooks/lib/prompt-injections.cjs + .claude/.ck.json

[WORKFLOW-EXECUTION-PROTOCOL] [BLOCKING] Workflow Execution Protocol — MANDATORY IMPORTANT MUST CRITICAL. Do not skip for any reason.

DETECT: Match prompt against workflow catalog
ANALYZE: Find best-match workflow AND evaluate if a custom step combination would fit better
ASK (REQUIRED FORMAT): Use a direct user question with this structure:
- Question: "Which workflow do you want to activate?"
- Option 1: "Activate [BestMatch Workflow] (Recommended)"
- Option 2: "Activate custom workflow: [step1 → step2 → ...]" (include one-line rationale)
ACTIVATE (if confirmed): Call $workflow-start <workflowId> for standard; sequence custom steps manually
CREATE TASKS: task tracking for ALL workflow steps
EXECUTE: Follow each step in sequence [CRITICAL-THINKING-MINDSET] Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination principle: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination. AI Attention principle (Primacy-Recency): Put the 3 most critical rules at both top and bottom of long prompts/protocols so instruction adherence survives long context windows.

Learned Lessons

Lessons Learned

[CRITICAL] Hard-won project debugging/architecture rules. MUST ATTENTION apply BEFORE forming hypothesis or writing code.

Quick Summary

Goal: Prevent recurrence of known failure patterns — debugging, architecture, naming, AI orchestration, environment.

Top Rules (apply always):

MUST ATTENTION verify ALL preconditions (config, env, DB names, DI regs) BEFORE code-layer hypothesis
MUST ATTENTION fix responsible layer — NEVER patch symptom sites with caller-specific defensive code
MUST ATTENTION use ExecuteInjectScopedAsync for parallel async + repo/UoW — NEVER ExecuteUowTask
MUST ATTENTION name by PURPOSE not CONTENT — adding member forces rename = abstraction broken
MUST ATTENTION persist sub-agent findings incrementally after each file — NEVER batch at end
MUST ATTENTION Windows bash: verify Python alias (where python/where py) — NEVER assume python/python3 resolves

Debugging & Root Cause Reasoning

[2026-04-11] Holistic-first: verify environment before code. Failure → list ALL preconditions (config, env vars, DB names, endpoints, DI regs, credentials, permissions, data prerequisites) → verify each via evidence (grep/cat/query) BEFORE code-layer hypothesis. Worst rabbit holes: diving nearest layer while bug sits elsewhere — e.g., hours debugging "sync timeout", real cause: test appsettings pointing wrong DB. Cheapest check first.
[2026-04-01] Ask "whose responsibility?" before fixing. Trace: bug in caller (wrong data) or callee (wrong handling)? Fix responsible layer — NEVER patch symptom site masking real issue.
[2026-04-01] Trace data lifecycle, not error site. Follow data: creation → transformation → consumption. Bug usually where data created wrong, not consumed.
[2026-04-01] Code is caller-agnostic. Functions/handlers/consumers don't know who invokes them. Comments/guards/messages describe business intent — NEVER reference specific callers (tests, seeders, scripts).

Architecture Invariants

[2026-03-31] ParallelAsync + repo/UoW MUST use ExecuteInjectScopedAsync, NEVER ExecuteUowTask. ExecuteUowTask creates new UoW but reuses outer DI scope (same DbContext) — parallel iterations sharing non-thread-safe DbContext silently corrupt data. ExecuteInjectScopedAsync creates new UoW + new DI scope (fresh repo per iteration).
[2026-03-31] Bus message naming MUST include service name prefix — core services NEVER consume feature events. Prefix declares schema ownership (AccountUserEntityEventBusMessage = Accounts owns). Core services (Accounts, Communication) are leaders. Feature services (Growth, Talents) sending to core MUST use {CoreServiceName}...RequestBusMessage — never define own event for core to consume.

Naming & Abstraction

[2026-04-12] Name PURPOSE not CONTENT — "OrXxx" anti-pattern. HrManagerOrHrOrPayrollHrOperationsPolicy names set members, not what it guards. Add role → rename = broken abstraction. Rule: names express DOES/GUARDS, not CONTAINS. Test: adding/removing member forces rename? YES = content-driven = bad → rename to purpose (e.g., HrOperationsAccessPolicy). Nuance: "Or" fine in behavioral idioms (FirstOrDefault, SuccessOrThrow) — expresses HAPPENS, not membership.

Environment & Tooling

[2026-04-20] Windows bash: NEVER assume python/python3 resolves — verify alias first. Python may not be in bash PATH under those names. Check: where python / where py. Prefer py (Windows Python Launcher) for one-liners, node if JS alternative exists.

Test-specific lessons → docs/project-reference/integration-test-reference.md Lessons Learned section. Production-code anti-patterns → docs/project-reference/backend-patterns-reference.md Anti-Patterns section. Generic debugging/refactoring reminders → System Lessons in .claude/hooks/lib/prompt-injections.cjs.

Closing Reminders

IMPORTANT MUST ATTENTION holistic-first: verify ALL preconditions (config, env, DB names, endpoints, DI regs) BEFORE code-layer hypothesis — cheapest check first
IMPORTANT MUST ATTENTION fix responsible layer — NEVER patch symptom site; trace caller (wrong data) vs callee (wrong handling), fix root owner
IMPORTANT MUST ATTENTION parallel async + repo/UoW → ALWAYS ExecuteInjectScopedAsync, NEVER ExecuteUowTask (shared DbContext = silent data corruption)
IMPORTANT MUST ATTENTION bus message prefix = schema ownership; feature services NEVER define events for core services — use {CoreServiceName}...RequestBusMessage
IMPORTANT MUST ATTENTION name by PURPOSE — adding/removing member forces rename = broken abstraction
IMPORTANT MUST ATTENTION sub-agents MUST write findings after each file/section — NEVER batch all findings into one final write
IMPORTANT MUST ATTENTION Windows bash: NEVER assume python/python3 resolves — run where python/where py first, use py launcher or node

[LESSON-LEARNED-REMINDER] [BLOCKING] Task Planning & Continuous Improvement — MANDATORY. Do not skip.

Break work into small tasks (task tracking) before starting. Add final task: "Analyze AI mistakes & lessons learned".

Extract lessons — ROOT CAUSE ONLY, not symptom fixes:

Name the FAILURE MODE (reasoning/assumption failure), not symptom — "assumed API existed without reading source" not "used wrong enum value".
Generality test: does this failure mode apply to ≥3 contexts/codebases? If not, abstract one level up.
Write as a universal rule — strip project-specific names/paths/classes. Useful on any codebase.
Consolidate: multiple mistakes sharing one failure mode → ONE lesson.
Recurrence gate: "Would this recur in future session WITHOUT this reminder?" — No → skip $learn.
Auto-fix gate: "Could $code-review/$code-simplifier/$security/$lint catch this?" — Yes → improve review skill instead.
BOTH gates pass → ask user to run $learn. [TASK-PLANNING] [MANDATORY] BEFORE executing any workflow or skill step, create/update task tracking for all planned steps, then keep it synchronized as each step starts/completes.

name	integration-test-verify
description	[Testing] Use when you need to verify integration tests pass after writing and reviewing them.

Codex compatibility note:

Invoke repository skills with $skill-name in Codex; this mirrored copy rewrites legacy Claude /skill-name references.

Prefer the plan-hard skill for planning guidance in this Codex mirror.

Task tracker mandate: BEFORE executing any workflow or skill step, create/update task tracking for all steps and keep it synchronized as progress changes.

User-question prompts mean to ask the user directly in Codex.

Ignore Claude-specific mode-switch instructions when they appear.

Strict execution contract: when a user explicitly invokes a skill, execute that skill protocol as written.

Subagent authorization: when a skill is user-invoked or AI-detected and its protocol requires subagents, that skill activation authorizes use of the required spawn_agent subagent(s) for that task.

Do not skip, reorder, or merge protocol steps unless the user explicitly approves the deviation first.

For workflow skills, execute each listed child-skill step explicitly and report step-by-step evidence.

If a required step/tool cannot run in this environment, stop and ask the user before adapting.

Codex Project-Reference Loading (No Hooks)

Codex does not receive Claude hook-based doc injection. When coding, planning, debugging, testing, or reviewing, open project docs explicitly using this routing.

Always read:

docs/project-config.json (project-specific paths, commands, modules, and workflow/test settings)
docs/project-reference/docs-index-reference.md (routes to the full docs/project-reference/* catalog)
docs/project-reference/lessons.md (always-on guardrails and anti-patterns)

Situation-based docs:

Backend/CQRS/API/domain/entity changes: backend-patterns-reference.md, domain-entities-reference.md, project-structure-reference.md
Frontend/UI/styling/design-system: frontend-patterns-reference.md, scss-styling-guide.md, design-system/README.md
Spec/test-case planning or TC mapping: feature-docs-reference.md
Integration test implementation/review: integration-test-reference.md
E2E test implementation/review: e2e-test-reference.md
Code review/audit work: code-review-rules.md plus domain docs above based on changed files

Do not read all docs blindly. Start from docs-index-reference.md, then open only relevant files for the task.

[BLOCKING] Execute skill steps in declared order. NEVER skip, reorder, or merge steps without explicit user approval. [BLOCKING] Before each step or sub-skill call, update task tracking: set in_progress when step starts, set completed when step ends. [BLOCKING] Every completed/skipped step MUST include brief evidence or explicit skip reason. [BLOCKING] If Task tools are unavailable, create and maintain an equivalent step-by-step plan tracker with the same status transitions.

Quick Summary

Goal: Run integration tests after $integration-test writes them and $integration-test-review reviews them. Confirm all pass and remain repeatable.

Workflow:

Read Config — Load docs/project-config.json → integrationTestVerify section for project-specific run guidance
System Check — Verify required system is healthy before running
Determine Test Projects — Discover via testProjectPattern glob, testProjects list, or git auto-detect
Run Tests — Execute quickRunCommand on determined test projects for 3 consecutive runs
Report — Pass/fail counts, failed test names, next steps on failure

Key Rules:

MUST read project config integrationTestVerify section before doing anything else
MUST read project-specific reference docs named by integrationTestVerify.referenceDocs or the project's integration-test doc path before running tests
Use quickRunCommand from config — NEVER hardcode dotnet test or any language-specific command
If system check fails → instruct user how to start system (reference startupScript from config)
If config says local infrastructure, databases, services, or full system startup is required, treat that as a blocking prerequisite
On test failure → diagnose root cause: test bug or service bug. NEVER weaken assertions.
Verification only passes after 3 consecutive successful runs of each relevant suite/project without DB reset
Always report exact failure counts and names — "all passed" requires evidence

Be skeptical. Apply critical thinking. Every pass/fail claim needs actual test runner output.

Step 1: Read Project Config

Read docs/project-config.json and extract the integrationTestVerify section.

Expected config shape:
{
  "integrationTestVerify": {
    "guidance":             string   — instructions for this project's test run approach
    "referenceDocs":        string[] — project docs that explain integration-test setup/run prerequisites
    "quickRunCommand":      string   — test runner command (e.g., "dotnet test --no-build", "npm test", "pytest")
    "testProjectPattern":   string   — glob pattern to discover test projects (e.g., "src/Services/**/*.IntegrationTests.csproj")
    "testProjects":         string[] — explicit list of test project paths (fallback if no pattern)
    "systemCheckCommand":   string   — shell command to check system readiness
    "runScript":            string   — path to CI-style full run script (reference only)
    "startupScript":        string   — path to system startup script (reference only)
  }
}

Config priority: testProjectPattern (auto-discovers via glob) > testProjects (explicit list) > git auto-detect (fallback).

If integrationTestVerify section is missing: proceed to Fallback Mode.

If section exists: display the guidance value to the user verbatim — it contains project-specific instructions the implementer wrote intentionally.

Then read the project-specific setup guidance before any system check or test command:

Read every file listed in integrationTestVerify.referenceDocs, if present.
If no referenceDocs list exists, read the integration-test reference doc indicated elsewhere in docs/project-config.json (for example a framework/testing integration test doc path), if present.
If config names runScript or startupScript, read those scripts when needed to understand startup, health checks, arguments, or labels. Use them as project-specific evidence, not generic assumptions.
If no project-specific reference exists, proceed only with the explicit config values and call out that the project should add reference docs to integrationTestVerify.

Step 2: System Check

If systemCheckCommand exists in config:

Run the system check via Bash:

{systemCheckCommand}

Evaluate output:

Healthy → proceed to Step 3
Partially healthy / no containers → display startup instructions to user: > "System not fully ready. To start: run {startupScript} (or follow the guidance above). Wait for all services to be healthy, then re-run $integration-test-verify."

STOP — do not run tests against an unhealthy system. Results would be unreliable.

If no systemCheckCommand:

If guidance, reference docs, runScript, or startupScript indicate required local infrastructure/services, STOP and tell the user the project config needs a concrete readiness check before AI verification can run.
Otherwise, proceed to Step 3 and explicitly report that no system check was configured.

Step 3: Determine Test Projects

Priority order: testProjectPattern (glob auto-discover) > testProjects (explicit list) > git auto-detect (fallback).

If testProjectPattern exists in config:

Discover test projects by running a glob search for the pattern:

# Example for .NET projects (pattern: "src/Services/**/*.IntegrationTests.csproj")
find . -path "{testProjectPattern}" -type f
# or use language-appropriate glob tool

Use all discovered .csproj files (or equivalent) as the test project list. Exclude any paths outside the pattern scope.

If no testProjectPattern but testProjects list exists:

Use the explicit list from config directly.

If neither exists — auto-detect from git:

# Auto-detect changed test projects
git diff --name-only HEAD | grep -i "IntegrationTest" | sed 's|/[^/]*$||' | sort -u

If auto-detect finds nothing (no uncommitted test changes), ask user: "No changed test files detected. Run all test projects or skip?"

Filter rule: Only run projects relevant to the current change. If user explicitly asks to run all → run all discovered/configured projects.

Step 4: Run Tests

Do not run this step unless Step 2 passed or the config/reference docs explicitly state no external system is required.

Execute using quickRunCommand from config. Run each relevant suite/project 3 consecutive times without resetting data.

Three-run idempotency gate: If any run fails, verification fails. Fix the root cause, then restart the 3-run sequence from run 1.

Example for a.NET project:

# Run each test project individually for clear per-project results
{quickRunCommand} {testProject1}
{quickRunCommand} {testProject2}
# ...

Or run all at once using the solution filter if supported:

{quickRunCommand} --filter "Category=integration"

Step 5: Report Results

After all tests complete, report:

### Integration Test Verify Results

**Run command:** {quickRunCommand}
**Projects tested:** {N}
**Repeatability gate:** 3 consecutive runs without DB reset

| Project | Run | Passed | Failed | Skipped |
|---------|-----|--------|--------|---------|
| {Project1} | 1 | X | 0 | Y |
| {Project1} | 2 | X | 0 | Y |
| {Project1} | 3 | X | 0 | Y |

**Total:** {total_passed} passed, {total_failed} failed, {total_skipped} skipped (expected skip annotations)

Status: ✅ ALL PASS | ❌ {N} FAILURES

On failure:

List each failing test name + failure message
Diagnose: test bug (wrong assertion setup) or service bug (handler actually broken)?
If test bug → fix in the test file (do NOT weaken assertions — fix setup/data)
If service bug → report as finding, do NOT silently fix without telling user
After fixing → re-run the full 3-run verify sequence

Fallback Mode (No Project Config)

When docs/project-config.json has no integrationTestVerify section:

Detect project type from root files:
- *.sln or *.csproj → dotnet test
- package.json → npm test or npx jest
- pytest.ini / setup.py / pyproject.toml → pytest
- go.mod → go test ./...
Auto-detect changed test files from git:
```
git diff --name-only HEAD
```
Run detected command on changed test projects.
Report results and recommend: "Add integrationTestVerify to docs/project-config.json for project-specific run guidance."

CI-Style Full Run (Reference)

When runScript is configured, reference it for the full CI-style run (not run by AI directly — Windows.cmd scripts and CI runners require user/pipeline execution):

"For a full CI-style run including Docker orchestration and health polling, execute: {runScript}"

This script typically: creates networks → removes stale containers → builds images → starts infrastructure (wait healthy) → starts APIs (wait healthy) → runs all tests.

On Test Failure Protocol

NEVER do these to make failures go away:

❌ Remove or weaken assertions
❌ Add skip annotations (e.g., [Fact(Skip=...)] in xUnit, @Disabled in JUnit) to hide failures
❌ Create or mutate domain data through repositories to bypass real use-case paths
❌ Mark passing by ignoring error output
❌ Report "all passed" without showing actual runner output

DO this instead:

Read the failing test method
Read the handler/service the test targets
Identify: is the assertion wrong, or is the code wrong?
Fix at the root cause layer; use real use cases or valid seeded fixtures for data setup
Re-run to confirm green

If a test fails because the system is unavailable → report as "system not ready" and reference startupScript / runScript. Never change the test.

Workflow Recommendation

MANDATORY IMPORTANT MUST ATTENTION — NO EXCEPTIONS: If you are NOT already in a workflow, you MUST ATTENTION use a direct user question to ask the user. Do NOT judge task complexity or decide this is "simple enough to skip" — the user decides whether to use a workflow, not you:

Activate test-to-integration workflow (Recommended) — scout → integration-test → integration-test-review → integration-test-verify → test → docs-update → watzup → workflow-end

Execute $integration-test-verify directly — run this skill standalone

Next Steps

"$workflow-review-changes (Recommended)" — Review all changes before committing
"$integration-test-review" — If tests fail: review and fix integration tests before re-verify
"$docs-update" — Update documentation if test counts changed
"Skip, continue manually" — user decides

[IMPORTANT] Use task tracking to break ALL work into small tasks BEFORE starting. A verify step that does not actually run tests 3 consecutive times is not repeatability verification. It is theater. Read project config FIRST to understand how to run tests for this specific project.

AI Mistake Prevention — Failure modes to avoid on every task:

Check downstream references before deleting. Deleting components causes documentation and code staleness cascades. Map all referencing files before removal. Verify AI-generated content against actual code. AI hallucinates APIs, class names, and method signatures. Always grep to confirm existence before documenting or referencing. Trace full dependency chain after edits. Changing a definition misses downstream variables and consumers derived from it. Always trace the full chain. Trace ALL code paths when verifying correctness. Confirming code exists is not confirming it executes. Always trace early exits, error branches, and conditional skips — not just happy path. When debugging, ask "whose responsibility?" before fixing. Trace whether bug is in caller (wrong data) or callee (wrong handling). Fix at responsible layer — never patch symptom site. Assume existing values are intentional — ask WHY before changing. Before changing any constant, limit, flag, or pattern: read comments, check git blame, examine surrounding code. Verify ALL affected outputs, not just the first. Changes touching multiple stacks require verifying EVERY output. One green check is not all green checks. Holistic-first debugging — resist nearest-attention trap. When investigating any failure, list EVERY precondition first (config, env vars, DB names, endpoints, DI registrations, data preconditions), then verify each against evidence before forming any code-layer hypothesis. Surgical changes — apply the diff test. Bug fix: every changed line must trace directly to the bug. Don't restyle or improve adjacent code. Enhancement task: implement improvements AND announce them explicitly. Surface ambiguity before coding — don't pick silently. If request has multiple interpretations, present each with effort estimate and ask. Never assume all-records, file-based, or more complex path.

Nested Task Expansion Contract — For workflow-step invocation, the [Workflow] ... row is only a parent container; the child skill still creates visible phase tasks.

Call the current task list first. If a matching active parent workflow row exists, set nested=true and record parentTaskId; otherwise run standalone.

Create one task per declared phase before phase work. When nested, prefix subjects [N.M] $skill-name — phase.

When nested, link the parent with TaskUpdate(parentTaskId, addBlockedBy: [childIds]).

Orchestrators must pre-expand a child skill's phase list and link the workflow row before invoking that child skill or sub-agent.

Mark exactly one child in_progress before work and completed immediately after evidence is written.

Complete the parent only after all child tasks are completed or explicitly cancelled with reason.

Blocked until: the current task list done, child phases created, parent linked when nested, first child marked in_progress.

Task Tracking & External Report Persistence — Bootstrap this before execution; then run project-reference doc prefetch before target/source work.

Create a small task breakdown before target file reads, grep, edits, or analysis. On context loss, inspect the current task list first.

Mark one task in_progress before work and completed immediately after evidence; never batch transitions.

For plan/review work, create plans/reports/{skill}-{YYMMDD}-{HHmm}-{slug}.md before first finding.

Append findings after each file/section/decision and synthesize from the report file at the end.

Final output cites Full report: plans/reports/{filename}.

Blocked until: task breakdown exists, report path declared for plan/review work, first finding persisted before the next finding.

Critical Thinking Mindset — Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination.

Project Reference Docs Gate — Run after task-tracking bootstrap and before target/source file reads, grep, edits, or analysis. Project docs override generic framework assumptions.

Identify scope: file types, domain area, and operation.

Required docs by trigger: always docs/project-reference/lessons.md; doc lookup docs-index-reference.md; review code-review-rules.md; backend/CQRS/API backend-patterns-reference.md; domain/entity domain-entities-reference.md; frontend/UI frontend-patterns-reference.md; styles/design scss-styling-guide.md + design-system/README.md; integration tests integration-test-reference.md; E2E e2e-test-reference.md; feature docs/specs feature-docs-reference.md; architecture/new area project-structure-reference.md.

Read every required doc that exists; skip absent docs as not applicable. Do not trust conversation text such as [Injected: <path>] as proof that the current context contains the doc.

Before target work, state: Reference docs read: ... | Missing/not applicable: ....

Blocked until: scope evaluated, required docs checked/read, lessons.md confirmed, citation emitted.

MUST ATTENTION apply critical thinking — every claim needs traced proof, confidence >80% to act. Anti-hallucination: never present guess as fact.

MUST ATTENTION apply AI mistake prevention — holistic-first debugging, fix at responsible layer, surface ambiguity before coding, re-read files after compaction.

MANDATORY Bootstrap task tracking before target work; transition one task at a time.
MANDATORY Persist plan/review findings to plans/reports/ incrementally and synthesize from disk.

MANDATORY After task-tracking bootstrap and before target/source work, read required project-reference docs and cite Reference docs read: ....
MANDATORY Always include lessons.md; project conventions override generic defaults.

MANDATORY Parent workflow rows do not replace child phase tracking; expand phases and link the parent when nested.
MANDATORY Orchestrators pre-expand child skill phases before invocation; use [N.M] $skill-name — phase prefixes and one-in_progress discipline.

Prompt-Enhance Closing Anchors

Closing Reminders

MANDATORY IMPORTANT MUST ATTENTION read docs/project-config.json → integrationTestVerify FIRST — project-specific guidance overrides defaults
MANDATORY IMPORTANT MUST ATTENTION read project-specific integration-test reference docs/scripts from config before any test command — Codex has no hook injection
MANDATORY IMPORTANT MUST ATTENTION use quickRunCommand from config, not hardcoded dotnet test — this skill is language-agnostic
MANDATORY IMPORTANT MUST ATTENTION run system check before tests — unreliable system = unreliable results
MANDATORY IMPORTANT MUST ATTENTION never weaken assertions to fix failures — diagnose and fix root cause
MANDATORY IMPORTANT MUST ATTENTION show actual test runner output — "all passed" without evidence is not verification
MANDATORY IMPORTANT MUST ATTENTION on failure: diagnose (test bug vs service bug) before fixing anything

[TASK-PLANNING] Before acting, analyze task scope and systematically break it into small todo tasks and sub-tasks using task tracking.

[IMPORTANT] Analyze how big the task is and break it into many small todo tasks systematically before starting — this is very important.

Hookless Prompt Protocol Mirror (Auto-Synced)

Source: .claude/hooks/lib/prompt-injections.cjs + .claude/.ck.json

[WORKFLOW-EXECUTION-PROTOCOL] [BLOCKING] Workflow Execution Protocol — MANDATORY IMPORTANT MUST CRITICAL. Do not skip for any reason.

DETECT: Match prompt against workflow catalog
ANALYZE: Find best-match workflow AND evaluate if a custom step combination would fit better
ASK (REQUIRED FORMAT): Use a direct user question with this structure:
- Question: "Which workflow do you want to activate?"
- Option 1: "Activate [BestMatch Workflow] (Recommended)"
- Option 2: "Activate custom workflow: [step1 → step2 → ...]" (include one-line rationale)
ACTIVATE (if confirmed): Call $workflow-start <workflowId> for standard; sequence custom steps manually
CREATE TASKS: task tracking for ALL workflow steps
EXECUTE: Follow each step in sequence [CRITICAL-THINKING-MINDSET] Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination principle: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination. AI Attention principle (Primacy-Recency): Put the 3 most critical rules at both top and bottom of long prompts/protocols so instruction adherence survives long context windows.

Learned Lessons

Lessons Learned

[CRITICAL] Hard-won project debugging/architecture rules. MUST ATTENTION apply BEFORE forming hypothesis or writing code.

Quick Summary

Goal: Prevent recurrence of known failure patterns — debugging, architecture, naming, AI orchestration, environment.

Top Rules (apply always):

MUST ATTENTION verify ALL preconditions (config, env, DB names, DI regs) BEFORE code-layer hypothesis
MUST ATTENTION fix responsible layer — NEVER patch symptom sites with caller-specific defensive code
MUST ATTENTION use ExecuteInjectScopedAsync for parallel async + repo/UoW — NEVER ExecuteUowTask
MUST ATTENTION name by PURPOSE not CONTENT — adding member forces rename = abstraction broken
MUST ATTENTION persist sub-agent findings incrementally after each file — NEVER batch at end
MUST ATTENTION Windows bash: verify Python alias (where python/where py) — NEVER assume python/python3 resolves

Debugging & Root Cause Reasoning

[2026-04-11] Holistic-first: verify environment before code. Failure → list ALL preconditions (config, env vars, DB names, endpoints, DI regs, credentials, permissions, data prerequisites) → verify each via evidence (grep/cat/query) BEFORE code-layer hypothesis. Worst rabbit holes: diving nearest layer while bug sits elsewhere — e.g., hours debugging "sync timeout", real cause: test appsettings pointing wrong DB. Cheapest check first.
[2026-04-01] Ask "whose responsibility?" before fixing. Trace: bug in caller (wrong data) or callee (wrong handling)? Fix responsible layer — NEVER patch symptom site masking real issue.
[2026-04-01] Trace data lifecycle, not error site. Follow data: creation → transformation → consumption. Bug usually where data created wrong, not consumed.
[2026-04-01] Code is caller-agnostic. Functions/handlers/consumers don't know who invokes them. Comments/guards/messages describe business intent — NEVER reference specific callers (tests, seeders, scripts).

Architecture Invariants

[2026-03-31] ParallelAsync + repo/UoW MUST use ExecuteInjectScopedAsync, NEVER ExecuteUowTask. ExecuteUowTask creates new UoW but reuses outer DI scope (same DbContext) — parallel iterations sharing non-thread-safe DbContext silently corrupt data. ExecuteInjectScopedAsync creates new UoW + new DI scope (fresh repo per iteration).
[2026-03-31] Bus message naming MUST include service name prefix — core services NEVER consume feature events. Prefix declares schema ownership (AccountUserEntityEventBusMessage = Accounts owns). Core services (Accounts, Communication) are leaders. Feature services (Growth, Talents) sending to core MUST use {CoreServiceName}...RequestBusMessage — never define own event for core to consume.

Naming & Abstraction

[2026-04-12] Name PURPOSE not CONTENT — "OrXxx" anti-pattern. HrManagerOrHrOrPayrollHrOperationsPolicy names set members, not what it guards. Add role → rename = broken abstraction. Rule: names express DOES/GUARDS, not CONTAINS. Test: adding/removing member forces rename? YES = content-driven = bad → rename to purpose (e.g., HrOperationsAccessPolicy). Nuance: "Or" fine in behavioral idioms (FirstOrDefault, SuccessOrThrow) — expresses HAPPENS, not membership.

Environment & Tooling

[2026-04-20] Windows bash: NEVER assume python/python3 resolves — verify alias first. Python may not be in bash PATH under those names. Check: where python / where py. Prefer py (Windows Python Launcher) for one-liners, node if JS alternative exists.

Test-specific lessons → docs/project-reference/integration-test-reference.md Lessons Learned section. Production-code anti-patterns → docs/project-reference/backend-patterns-reference.md Anti-Patterns section. Generic debugging/refactoring reminders → System Lessons in .claude/hooks/lib/prompt-injections.cjs.

Closing Reminders

IMPORTANT MUST ATTENTION holistic-first: verify ALL preconditions (config, env, DB names, endpoints, DI regs) BEFORE code-layer hypothesis — cheapest check first
IMPORTANT MUST ATTENTION fix responsible layer — NEVER patch symptom site; trace caller (wrong data) vs callee (wrong handling), fix root owner
IMPORTANT MUST ATTENTION parallel async + repo/UoW → ALWAYS ExecuteInjectScopedAsync, NEVER ExecuteUowTask (shared DbContext = silent data corruption)
IMPORTANT MUST ATTENTION bus message prefix = schema ownership; feature services NEVER define events for core services — use {CoreServiceName}...RequestBusMessage
IMPORTANT MUST ATTENTION name by PURPOSE — adding/removing member forces rename = broken abstraction
IMPORTANT MUST ATTENTION sub-agents MUST write findings after each file/section — NEVER batch all findings into one final write
IMPORTANT MUST ATTENTION Windows bash: NEVER assume python/python3 resolves — run where python/where py first, use py launcher or node

[LESSON-LEARNED-REMINDER] [BLOCKING] Task Planning & Continuous Improvement — MANDATORY. Do not skip.

Break work into small tasks (task tracking) before starting. Add final task: "Analyze AI mistakes & lessons learned".

Extract lessons — ROOT CAUSE ONLY, not symptom fixes:

Name the FAILURE MODE (reasoning/assumption failure), not symptom — "assumed API existed without reading source" not "used wrong enum value".
Generality test: does this failure mode apply to ≥3 contexts/codebases? If not, abstract one level up.
Write as a universal rule — strip project-specific names/paths/classes. Useful on any codebase.
Consolidate: multiple mistakes sharing one failure mode → ONE lesson.
Recurrence gate: "Would this recur in future session WITHOUT this reminder?" — No → skip $learn.
Auto-fix gate: "Could $code-review/$code-simplifier/$security/$lint catch this?" — Yes → improve review skill instead.
BOTH gates pass → ask user to run $learn. [TASK-PLANNING] [MANDATORY] BEFORE executing any workflow or skill step, create/update task tracking for all planned steps, then keep it synchronized as each step starts/completes.