| description | Review whole-repo test quality, rerun coverage, score remaining worth-testing files, inspect slow-drift and stale test debt, and publish the next testing batch. Use every few weeks or before large breaking changes and rearchitecture. |
| name | testing-review |
| metadata | {"skiller":{"source":".agents/rules/testing-review.mdc"}} |
Testing Review
Review the repo test suite from current reality, not stale vibes.
Use this when you want a periodic testing audit, a fresh coverage map, or a new next-batch recommendation before a breaking-change wave.
Goal
- rerun fresh repo coverage
- inspect test-suite health
- score remaining files by real regression value
- publish the next recommended batch
- stop fake work before it starts
This workflow is audit-first. Do not implement the recommended tests unless the user explicitly asks for execution.
Inputs
@.agents/rules/task.mdc
@.agents/rules/testing.mdc
Core Rules
- Use fresh lcov as source of truth.
- Score files, not just packages.
- Do not default to package sweeps once the obvious package-wide passes are spent.
- Prefer file-ranked batches across packages when the remaining value is scattered.
- Only recommend a package sweep when a package is still largely untouched and contains multiple top-ranked seams.
- Lock a roadmap for the current review phase instead of re-inventing the next batch on every pass.
- Future passes should update roadmap status in place unless the candidate set materially changes.
- Do not permanently exclude /react. Only exclude it when the current review explicitly says so.
- Penalize wrappers, crumbs, giant sludge files, and likely-dead code.
- Reward deterministic transforms, queries, merge helpers, parser/serializer seams, plugin resolution, normalization, and public editor contracts.
- Coverage is regression telemetry, not a KPI.
- If the remaining misses are mostly low-ROI dust, say stop.
Workflow
1. Refresh Coverage
Run fresh repo coverage with a date-stamped output directory:
bun test --coverage --coverage-reporter=lcov --coverage-dir=.coverage-repo-YYYY-MM-DDx --reporter=dots
Capture:
- pass/fail count
- file count
- runtime
- lcov.info path
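To turn the captured lcov.info into per-file numbers, a small reader is enough. The sketch below assumes the standard LCOV tracefile records (`SF:`, `DA:`, `end_of_record`); the names `FileCoverage`, `parseLcov`, and `linePercent` are hypothetical, not part of the repo.

```typescript
// Minimal LCOV reader (sketch). One record per SF: ... end_of_record block.
// Type and function names here are illustrative assumptions.
interface FileCoverage {
  file: string;
  linesFound: number;
  linesHit: number;
}

function parseLcov(text: string): FileCoverage[] {
  const records: FileCoverage[] = [];
  let current: FileCoverage | null = null;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      // SF:<source file path> opens a new record.
      current = { file: line.slice(3), linesFound: 0, linesHit: 0 };
    } else if (current && line.startsWith("DA:")) {
      // DA:<line number>,<execution count>
      const hits = Number(line.slice(3).split(",")[1]);
      current.linesFound += 1;
      if (hits > 0) current.linesHit += 1;
    } else if (current && line === "end_of_record") {
      records.push(current);
      current = null;
    }
  }
  return records;
}

// Line coverage as a percentage; files with no instrumented lines report 0.
function linePercent(c: FileCoverage): number {
  return c.linesFound === 0 ? 0 : (c.linesHit / c.linesFound) * 100;
}
```

Counting `DA:` entries directly, rather than trusting `LF:`/`LH:` summaries, keeps the numbers tied to the raw line data.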
2. Inspect Suite Health
Run the fast-lane timing checks:
pnpm test:profile -- --top 25
pnpm test:slowest -- --top 25
Then scan for stale suite debt:
rg -n "describe\\.skip|it\\.skip|test\\.skip|xit\\(|xdescribe\\(" packages apps
rg -n "^\\s*//\\s*(describe|it|test)\\(" packages apps -g "*.spec.ts" -g "*.spec.tsx"
rg -n "from '.*\\.spec'" packages apps -g "*.spec.ts" -g "*.spec.tsx"
Only report debt that is actually worth fixing.
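If the scan needs to feed a report rather than a terminal, the same three patterns can be applied in code. This is a sketch only: `DebtKind`, `DebtFinding`, and `scanForTestDebt` are hypothetical names, and the regexes are the rg patterns above with shell escaping removed.

```typescript
// Classify stale-suite debt in one file's source text (sketch).
// Kind labels are illustrative assumptions.
type DebtKind = "skipped" | "commented-out" | "spec-import";

interface DebtFinding {
  kind: DebtKind;
  line: number; // 1-based line number
  text: string;
}

const DEBT_PATTERNS: ReadonlyArray<[DebtKind, RegExp]> = [
  ["skipped", /describe\.skip|it\.skip|test\.skip|xit\(|xdescribe\(/],
  ["commented-out", /^\s*\/\/\s*(describe|it|test)\(/],
  ["spec-import", /from '.*\.spec'/],
];

function scanForTestDebt(source: string): DebtFinding[] {
  const findings: DebtFinding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const [kind, pattern] of DEBT_PATTERNS) {
      // Patterns carry no global flag, so test() holds no state between calls.
      if (pattern.test(text)) {
        findings.push({ kind, line: index + 1, text: text.trim() });
      }
    }
  });
  return findings;
}
```

The findings are raw hits. Deciding which ones are worth fixing is still a judgment call.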
3. Score Remaining Files
Score every remaining packages/**/src/** file for worth-testing value.
Exclude by default:
- test files
- barrels
- declaration files
- obvious type-only files
- generated junk
- zero-value crumbs
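Part of that exclusion list can be decided from the path alone. The sketch below is a hypothetical pre-filter: the suffixes, the `index.ts` barrel convention, and the generated-directory names are assumptions, and type-only files or zero-value crumbs still need a look at the file contents.

```typescript
// Path-based pre-filter for the default exclusions (sketch).
// All naming conventions below are assumptions, not repo facts.
function isExcludedByPath(path: string): boolean {
  const name = path.split("/").pop() ?? "";
  if (/\.(spec|test)\.tsx?$/.test(name)) return true; // test files
  if (/^index\.tsx?$/.test(name)) return true; // barrels, by naming convention only
  if (name.endsWith(".d.ts")) return true; // declaration files
  if (/(^|\/)(__generated__|generated)\//.test(path)) return true; // generated output
  return false;
}
```

An `index.ts` that carries real logic would be wrongly dropped here, so treat the result as a first cut.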
Scoring should reflect:
- seam type
- runtime coverage
- uncovered behavior
- likely regression value during breaking changes
- test ROI
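One way to make those factors concrete is a small additive score. Everything numeric below is an assumption chosen for illustration: the seam weights, the coverage cutoffs, and the penalties are not taken from the repo, and `Candidate` and `scoreCandidate` are hypothetical names.

```typescript
// Illustrative worth-testing score (sketch). All weights are assumptions.
type SeamType = "transform" | "query" | "parser" | "plugin" | "wrapper" | "dom-only";

interface Candidate {
  file: string;
  seam: SeamType;
  linePercent: number; // from fresh lcov, 0..100
  uncoveredLines: number;
  likelyRewritten: boolean;
}

const SEAM_WEIGHT: Record<SeamType, number> = {
  transform: 4,
  query: 4,
  parser: 4,
  plugin: 3,
  wrapper: 0,
  "dom-only": 0,
};

function scoreCandidate(c: Candidate): number {
  let score = SEAM_WEIGHT[c.seam];
  // Lower runtime coverage means more uncovered behavior to protect.
  if (c.linePercent < 50) score += 2;
  else if (c.linePercent < 80) score += 1;
  if (c.uncoveredLines >= 20) score += 1;
  if (c.uncoveredLines > 400) score -= 2; // giant low-ROI file penalty
  if (c.likelyRewritten) score -= 3; // tests would be thrown away
  return Math.max(0, score);
}
```

The point of the shape, not the numbers: seam type sets the base, coverage adjusts it, and rewrite risk can zero it out.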
When recommending the next batch:
- prefer the best files across packages over "do package X next"
- call out when package totals are inflated by crumbs, wrappers, or giant low-ROI leftovers
- say explicitly when a package sweep would be dumb
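The cross-package preference can be sketched as: rank files globally, cut at a threshold, then check whether any one package actually dominates the batch. `ScoredFile`, `pickBatch`, and `dominantPackage` are hypothetical names, and the share cutoff is an assumption. Concentration in the batch is only a rough proxy for "a sweep is defensible"; it does not check whether the package is still largely untouched.

```typescript
// File-ranked batch selection across packages (sketch).
interface ScoredFile {
  file: string;
  pkg: string;
  score: number;
}

// Highest score first; ties broken by path so the order is stable between passes.
function pickBatch(files: ScoredFile[], threshold: number, limit: number): ScoredFile[] {
  return files
    .filter((f) => f.score >= threshold)
    .sort((a, b) => b.score - a.score || a.file.localeCompare(b.file))
    .slice(0, limit);
}

// Returns the package holding at least minShare of the batch, or null when
// the value is scattered and a package sweep would be the wrong call.
function dominantPackage(batch: ScoredFile[], minShare: number): string | null {
  const counts: Record<string, number> = {};
  for (const f of batch) counts[f.pkg] = (counts[f.pkg] ?? 0) + 1;
  for (const pkg of Object.keys(counts)) {
    if ((counts[pkg] ?? 0) / batch.length >= minShare) return pkg;
  }
  return null;
}
```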
4. Write Artifacts
Write:
- a markdown map under docs/plans/
- a package TSV
- a file TSV
- a locked roadmap markdown file when this is the first meaningful pass for the current phase, or update that roadmap if it already exists
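For the file TSV, a plain tab-joined writer is enough. The column set below is an assumption for illustration, since the rule does not fix one, and `FileRow` and `toTsv` are hypothetical names.

```typescript
// Render the file ranking as TSV (sketch). Columns are assumed, not specified.
interface FileRow {
  file: string;
  pkg: string;
  linePercent: number;
  score: number;
}

function toTsv(rows: FileRow[]): string {
  const header = ["file", "package", "line_pct", "score"].join("\t");
  const body = rows.map((r) =>
    [r.file, r.pkg, r.linePercent.toFixed(1), String(r.score)].join("\t"),
  );
  // A trailing newline keeps the file friendly to line-based tools.
  return [header, ...body].join("\n") + "\n";
}
```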
The markdown map should include:
- fresh coverage result
- scoring rules
- strict next batch
- wider next batch if still defensible
- package ranking
- file ranking
- stop condition
- clear caveats about fake-high package totals
The roadmap should include:
- the frozen threshold for the current phase
- the execution queue in stable order
- explicit deferrals with reasons
- status for each queued or deferred file
- an update rule that says future passes mark items done, removed, or deferred instead of reshuffling the whole list
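The update rule amounts to "change status, never reorder". A minimal sketch, assuming a roadmap is a list of items with a status field; `RoadmapItem`, `Status`, and `updateStatus` are hypothetical names:

```typescript
// In-place status update that preserves queue order (sketch).
type Status = "queued" | "done" | "removed" | "deferred";

interface RoadmapItem {
  file: string;
  status: Status;
  reason?: string; // expected for deferrals
}

function updateStatus(
  roadmap: RoadmapItem[],
  file: string,
  status: Status,
  reason?: string,
): RoadmapItem[] {
  // map() keeps every item at its original position; only the match changes.
  return roadmap.map((item) =>
    item.file === file ? { ...item, status, reason: reason ?? item.reason } : item,
  );
}
```

Returning a new array instead of mutating makes it easy to diff the roadmap before and after a pass.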
5. Final Recommendation
Answer with:
- what the real next batch is
- whether to keep pushing coverage or stop
- what should be deferred by design
Output Standard
Use blunt rankings, not mush.
Say things like:
- "core first, then markdown, then diff"
- "do not do another package sweep"
- "the best next files are split across packages, so do not sweep package X"
- "this roadmap is locked for the current phase; future passes update status, not the whole ranking"
- "this file is uncovered but not worth touching"
- "stop after the >= 5 batch"
Stop Conditions
Recommend stopping when the remaining misses are mostly:
- wrappers
- provider/store dust
- DOM-only seams
- giant low-ROI files
- tiny uncovered crumbs
- code likely to be rewritten soon
At that point, tell the user to switch from coverage work to architecture-safety work.
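"Mostly" needs a number before it can be checked. The sketch below assumes each remaining miss has been labeled by kind and treats an 80% low-ROI share as the cutoff; that cutoff, the labels, and the name `shouldStop` are assumptions, not values from this rule.

```typescript
// Stop-condition check over labeled remaining misses (sketch).
type MissKind =
  | "wrapper"
  | "provider-store"
  | "dom-only"
  | "giant-low-roi"
  | "crumb"
  | "rewrite-soon"
  | "worth-testing";

function shouldStop(misses: MissKind[], lowRoiShare: number = 0.8): boolean {
  if (misses.length === 0) return true; // nothing left to cover
  // Every kind except "worth-testing" is one of the low-ROI stop categories.
  const lowRoi = misses.filter((k) => k !== "worth-testing").length;
  return lowRoi / misses.length >= lowRoiShare;
}
```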