| name | cover |
| license | MIT |
| compatibility | Claude Code 2.1.148+. Requires network access. |
| description | Generate and run comprehensive test suites — unit tests, integration tests with real services (testcontainers/docker-compose), and Playwright E2E tests. Analyzes coverage gaps, spawns parallel test-generator agents per tier, runs tests, and heals failures (max 3 iterations). Use when generating tests for existing code, improving coverage after implementation, or creating a full test suite from scratch. Chains naturally after /ork:implement. Do NOT use for verifying/grading existing tests (use /ork:verify) or running tests without generation (use npm test directly). |
| argument-hint | [scope-or-feature] |
| context | fork |
| version | 1.2.0 |
| author | OrchestKit |
| tags | ["testing","coverage","unit","integration","e2e","test-generation","real-services","testcontainers"] |
| user-invocable | true |
| allowed-tools | ["AskUserQuestion","Bash","Read","Write","Edit","Grep","Glob","Task","TaskCreate","TaskUpdate","TaskList","TaskStop","ToolSearch","CronCreate","CronDelete","Monitor","PushNotification","mcp__memory__search_nodes","mcp__context7__resolve-library-id","mcp__context7__query-docs"] |
| skills | ["testing-unit","testing-integration","testing-e2e","testing-perf","testing-llm","chain-patterns","memory","quality-gates"] |
| complexity | high |
| persuasion-type | discipline |
| effort | high |
| model | sonnet |
| hooks | {"PreToolUse":[{"matcher":"Bash","command":"${CLAUDE_PLUGIN_ROOT}/hooks/bin/run-hook.mjs skill/test-framework-detector","once":true}]} |
| metadata | {"category":"workflow-automation","mcp-server":"memory, context7"} |
| triggers | {"keywords":["cover","generate tests","test suite","test coverage","write tests for","add tests","need tests"],"examples":["generate tests for the authentication service","cover the payment module with unit and integration tests","I just finished implementing checkout, now write tests"],"anti-triggers":["verify","run tests","npm test","fix","implement","review"]} |
| paths | ["src/**/*.test.{ts,tsx,js}","**/.coveragerc","vitest.config.*","jest.config.*"] |
| invocation_hooks | ["command -v vitest >/dev/null 2>&1 || command -v jest >/dev/null 2>&1 || echo 'Warning: no test runner found — run npm install first'"] |
Cover — Test Suite Generator
Generate comprehensive test suites for existing code with real-service integration testing and automated failure healing.
Note: If disableSkillShellExecution is enabled (CC 2.1.91), the precondition check for vitest/jest won't run. Verify a test runner is installed before proceeding: npx vitest --version or npx jest --version.
Quick Start
/ork:cover authentication flow
/ork:cover --model=opus payment processing
/ork:cover --tier=unit,integration user service
/ork:cover --real-services checkout pipeline
Argument Resolution
SCOPE = "$ARGUMENTS"
MODEL_OVERRIDE = None
TIERS = ["unit", "integration", "e2e"]
REAL_SERVICES = False
for token in "$ARGUMENTS".split():
if token.startswith("--model="):
MODEL_OVERRIDE = token.split("=", 1)[1]
SCOPE = SCOPE.replace(token, "").strip()
elif token.startswith("--tier="):
TIERS = token.split("=", 1)[1].split(",")
SCOPE = SCOPE.replace(token, "").strip()
elif token == "--real-services":
REAL_SERVICES = True
SCOPE = SCOPE.replace(token, "").strip()
Step -0.5: Effort-Aware Coverage Scaling (CC 2.1.76, env var since 2.1.120)
Read ${CLAUDE_EFFORT} (CC 2.1.120+) first; explicit --effort= token wins as override. Default high when CC < 2.1.120 and no flag. Pattern matches assess + explore (#1540). Scale test generation depth:
| Effort Level | Tiers Generated | Agents | Heal Iterations |
|---|
| low | Unit only | 1 agent | 1 max |
| medium | Unit + Integration | 2 agents | 2 max |
| high (default) | Unit + Integration + E2E | 3 agents | 3 max |
| xhigh (Opus 4.8, CC 2.1.111+) | Unit + Integration + E2E | 3 agents | 4 max (one extra heal pass) |
Override: Explicit --tier= flag or user selection overrides /effort downscaling.
Step -1: MCP Probe + Resume Check
ToolSearch(query="select:mcp__memory__search_nodes")
ToolSearch(query="select:mcp__context7__resolve-library-id")
Write(".claude/chain/capabilities.json", {
"memory": <true if found>,
"context7": <true if found>,
"skill": "cover",
"timestamp": now()
})
Read(".claude/chain/state.json")
Step 0: Scope & Tier Selection
AskUserQuestion(
questions=[
{
"question": "What test tiers should I generate?",
"header": "Test Tiers",
"options": [
{"label": "Full coverage (Recommended)", "description": "Unit + Integration (real services) + E2E"},
{"label": "Unit + Integration", "description": "Skip E2E, focus on logic and service boundaries"},
{"label": "Unit only", "description": "Fast isolated tests for business logic"},
{"label": "E2E only", "description": "Playwright browser tests"}
],
"multiSelect": false
},
{
"question": "Healing strategy for failing tests?",
"header": "Failure Handling",
"options": [
{"label": "Auto-heal (Recommended)", "description": "Fix failing tests up to 3 iterations"},
{"label": "Generate only", "description": "Write tests, report failures, don't fix"},
{"label": "Strict", "description": "All tests must pass or abort"}
],
"multiSelect": false
}
]
)
Override TIERS based on selection. Skip this step if --tier= flag was provided.
Task Management (MANDATORY)
TaskCreate(subject=f"Cover: {SCOPE}", description="Generate comprehensive test suite with real-service testing", activeForm=f"Generating tests for {SCOPE}")
TaskCreate(subject="Discover scope and detect frameworks", activeForm="Discovering test scope")
TaskCreate(subject="Analyze coverage gaps", activeForm="Analyzing coverage gaps")
TaskCreate(subject="Generate tests (parallel per tier)", activeForm="Generating tests")
TaskCreate(subject="Execute generated tests", activeForm="Running tests")
TaskCreate(subject="Heal failing tests", activeForm="Healing test failures")
TaskCreate(subject="Generate coverage report", activeForm="Generating report")
TaskUpdate(taskId="3", addBlockedBy=["2"])
TaskUpdate(taskId="4", addBlockedBy=["3"])
TaskUpdate(taskId="5", addBlockedBy=["4"])
TaskUpdate(taskId="6", addBlockedBy=["5"])
TaskUpdate(taskId="7", addBlockedBy=["6"])
task = TaskGet(taskId="2")
TaskUpdate(taskId="2", status="in_progress")
TaskUpdate(taskId="2", status="completed")
6-Phase Workflow
| Phase | Activities | Output |
|---|
| 1. Discovery | Detect frameworks, scan scope, find untested code | Framework map, file list |
| 2. Coverage Analysis | Run existing tests, map gaps per tier | Coverage baseline, gap map |
| 3. Generation | Parallel test-generator agents per tier | Test files created |
| 4. Execution | Run all generated tests | Pass/fail results |
| 5. Heal | Fix failures, re-run (max 3 iterations) | Green test suite |
| 6. Report | Coverage delta, test count, summary | Coverage report |
Phase Handoffs
| After Phase | Handoff File | Key Outputs |
|---|
| 1. Discovery | 01-cover-discovery.json | Frameworks, scope files, tier plan |
| 2. Analysis | 02-cover-analysis.json | Baseline coverage, gap map |
| 3. Generation | 03-cover-generation.json | Files created, test count per tier |
| 5. Heal | 05-cover-healed.json | Final pass/fail, iterations used |
Phase 1: Discovery
Detect the project's test infrastructure and scope the work.
Grep(pattern="vitest|jest|mocha|playwright|cypress", glob="package.json", output_mode="content")
Grep(pattern="pytest|unittest|hypothesis", glob="pyproject.toml", output_mode="content")
Grep(pattern="pytest|unittest|hypothesis", glob="requirements*.txt", output_mode="content")
Glob(pattern="**/docker-compose*.yml")
Glob(pattern="**/testcontainers*")
Grep(pattern="testcontainers", glob="**/package.json", output_mode="content")
Grep(pattern="testcontainers", glob="**/requirements*.txt", output_mode="content")
Glob(pattern="**/tests/**/*.test.*")
Glob(pattern="**/tests/**/*.spec.*")
Glob(pattern="**/__tests__/**/*")
Glob(pattern="**/test_*.py")
Grep(pattern=SCOPE, output_mode="files_with_matches")
Real-service decision:
docker-compose*.yml found → integration tests use real services
testcontainers in deps → use testcontainers for isolated service instances
- Neither found +
--real-services flag → error: "No docker-compose or testcontainers found. Install testcontainers or remove --real-services flag."
- Neither found, no flag → integration tests use mocks (MSW/VCR)
Load real-service detection details: Read("${CLAUDE_SKILL_DIR}/references/real-service-detection.md")
Phase 2: Coverage Analysis
Run existing tests and identify gaps.
Output coverage baseline to user immediately (progressive output).
Phase 3: Generation (Parallel Agents)
Spawn test-generator agents per tier. Launch ALL in ONE message with run_in_background=true.
⚠️ isolation="worktree" does NOT reliably isolate parallel agents in
CC Opus 4.7. Observed: spawning multiple agents with that param thrashes
the PRIMARY worktree's HEAD via sequential git checkout, cutting agents
off at ~60 tool uses. Tracked at Yonatan-HQ/platform#3224.
Use the manual pre-create pattern instead. Create one worktree per
agent BEFORE spawning (git worktree add ../<repo>-<tier> -b feat/<slug>-<tier> origin/dev),
then make FIRST: cd <worktree-path> the first Bash call in each agent
prompt. Empirical: 4–22 tool-uses per agent (vs 60–86 with broken
isolation), zero cutoffs across M164 Wave 2/3 + M170 sub-agents.
Full pattern + helper function + prompt-constraint template:
Read("/Users/yonatangross/coding/yonatangross/orchestkit/src/skills/implement/references/manual-worktree-pattern.md")
(cross-referenced from /ork:implement — same doc applies here).
import subprocess
REPO_ROOT = Bash("git rev-parse --show-toplevel").stdout.strip()
BASE_REF = "origin/dev"
TIER_WORKTREES = {}
for tier in TIERS:
wt_path = f"../<repo>-cover-{tier}"
branch = f"feat/cover-{SCOPE_SLUG}-{tier}"
Bash(f"git worktree add {wt_path} -b {branch} {BASE_REF}")
TIER_WORKTREES[tier] = wt_path
if "unit" in TIERS:
Agent(
subagent_type="test-generator",
prompt=f"""FIRST: cd {TIER_WORKTREES["unit"]}
Generate unit tests for: {SCOPE}
Coverage gaps: {gap_map.unit_gaps}
Framework: {detected_framework}
Existing tests: {existing_test_files}
Focus on:
- AAA pattern (Arrange-Act-Assert)
- Parametrized tests for multiple inputs
- MSW/VCR for HTTP mocking (never mock fetch directly)
- Factory-based test data (FactoryBoy/faker-js)
- Edge cases: empty input, errors, timeouts, boundary values
- Target: 90%+ business logic coverage
Do NOT use `isolation="worktree"` — this worktree was pre-created
by the lead. Every Bash call should start from {TIER_WORKTREES["unit"]}.""",
run_in_background=True,
max_turns=50,
model=MODEL_OVERRIDE
)
Output each agent's results as soon as it returns — don't wait for all agents.
Focus mode (CC 2.1.101): In focus mode, include the full coverage report (before/after delta, test count per tier, files created) in your final message.
Phase 4: Execution
Run all generated tests and collect results.
Phase 5: Heal Loop
Fix failing tests iteratively. Max 3 iterations to prevent infinite loops.
for iteration in range(3):
if all_tests_pass:
break
Load heal strategy details: Read("${CLAUDE_SKILL_DIR}/references/heal-loop-strategy.md")
Boundary: heal fixes TESTS, not source code. If a test fails because the source code has a bug, report it — don't silently fix production code.
Phase 6: Report
Generate coverage report with before/after comparison.
Full report layout (baseline→after table, tests-generated counts, heal iterations, files created, remaining gaps, next-steps commands): Read("${CLAUDE_SKILL_DIR}/references/coverage-report-template.md").
PushNotification on Completion (CC 2.1.110+)
Full /ork:cover runs (unit + integration + E2E with heal loop) take 15–45 min. After the Phase 6 report is assembled, call PushNotification(message=f"ork:cover complete — {SCOPE}: {coverage_pct}% coverage · {tests_generated} tests · {heal_loops} heal iters", status="proactive"). Full rule: Read("/Users/yonatangross/coding/yonatangross/orchestkit/plugins/ork/skills/chain-patterns/rules/push-notification-on-completion.md").
Coverage Drift Monitor (CC 2.1.71)
Optionally schedule weekly coverage drift detection:
CronCreate(
schedule="0 2 * * 0",
prompt="Weekly coverage drift check for {SCOPE}: npm test -- --coverage.
If coverage >= baseline → CronDelete.
If coverage drops > 5% → alert with regression details and recommendation."
)
Key Principles
- Output limits (CC 2.1.77+): Opus 4.8 defaults to 64k output tokens (128k upper bound). For large test suites, chunk generation across multiple agent turns if output approaches the limit.
- Partial reads (CC 2.1.144+): Read returns a
[PARTIAL view] first page (not an error) on oversized files — re-read with explicit offset/limit so coverage analysis sees the whole file.
- Tests only — never modify production source code, only generate test files
- Real services when available — prefer testcontainers/docker-compose over mocks for integration tests because mock/prod divergence causes silent failures in production
- Parallel generation — spawn one test-generator agent per tier in ONE message
- Heal, don't loop forever — max 3 iterations, then report remaining failures
- Progressive output — show results as each agent completes
- Factory over fixtures — use FactoryBoy/faker-js for test data, not hardcoded values
- Mock at network level — MSW/VCR, never mock fetch/axios directly
Agent Coordination
Context Passing
Each test-generator agent receives: coverage gaps for its tier, test framework config, real-service infrastructure (testcontainers, docker-compose), and fixture patterns from the project.
Monitor + Partial Results (CC 2.1.98)
Use Monitor for streaming test execution output from background agents:
Bash(command="npm test -- --coverage 2>&1", run_in_background=true)
Monitor(pid=test_task_id)
Full pattern reference (until-condition gates, partial-result salvage, TaskOutput vs Monitor decision): Read("/Users/yonatangross/coding/yonatangross/orchestkit/plugins/ork/skills/chain-patterns/references/monitor-patterns.md").
Partial results (CC 2.1.98): If a test-generator crashes mid-generation, synthesize what it produced:
for agent_result in test_gen_results:
if "[PARTIAL RESULT]" in agent_result.output:
partial_tests = Glob(pattern="**/tests/**/*.test.*", path=agent_result.worktree)
if partial_tests:
for test_file in partial_tests:
Bash(command=f"cp {test_file} {main_worktree}/{test_file}")
SendMessage (Test Healing)
When a generated test fails, the healer agent can request context from the generator:
SendMessage(to="test-generator-unit", message="Test user_service_test.py:42 fails — TypeError on mock return. What's the expected shape?")
Skill Chain
Standard chain: implement → cover → verify → commit. Use addBlockedBy between each.
Verification Gate
Before claiming coverage is complete, apply: Read("${CLAUDE_PLUGIN_ROOT}/skills/shared/rules/verification-gate.md"). Run the coverage report fresh. "Should pass" is not evidence.
Agent Status Protocol
All test-generator agents report using: Read("${CLAUDE_PLUGIN_ROOT}/agents/shared/status-protocol.md"). BLOCKED if tests can't be written due to missing interfaces. NEEDS_CONTEXT if test expectations are unclear.
Related Skills
ork:implement — generates tests during implementation (Phase 5); use /ork:cover after for deeper coverage
ork:verify — grades existing tests 0-10; chain: implement → cover → verify
testing-unit / testing-integration / testing-e2e — knowledge skills loaded by test-generator agents
ork:commit — commit generated test files
Session recovery (CC 2.1.108+): After idle periods or interruptions, use /recap to restore conversational context alongside checkpoint-resume state. Enabled by default since CC 2.1.110 (even with telemetry disabled).
References
Load on demand with Read("${CLAUDE_SKILL_DIR}/references/<file>"):
| File | Content |
|---|
real-service-detection.md | Docker-compose/testcontainers detection, service startup, teardown |
heal-loop-strategy.md | Failure classification, fix patterns, iteration budget |
coverage-report-template.md | Report format, delta calculation, gap analysis |
Version: 1.2.0 (April 2026) — ${CLAUDE_EFFORT} env var as primary effort signal (CC 2.1.120, #1540)