strengthen-tests-deep

Heavyweight test review and edit workflow using fresh-context subagents, validation, edits, and checks. Use when the user asks for deep, thorough, multi-agent, high-confidence, or heavyweight test strengthening or regression coverage improvements. Use strengthen-tests-solo for ordinary/direct single-agent test strengthening.

Install command

npx skills add https://github.com/alexandersumer/config --skill strengthen-tests-deep

Copy and paste this command into Claude Code to install the skill.

Source

alexandersumer/config

Stars: 1

Forks: 1

Updated: June 4, 2026, 00:00

SKILL.md

name	strengthen-tests-deep
description	Heavyweight test review and edit workflow using fresh-context subagents, validation, edits, and checks. Use when the user asks for deep, thorough, multi-agent, high-confidence, or heavyweight test strengthening or regression coverage improvements. Use strengthen-tests-solo for ordinary/direct single-agent test strengthening.

Strengthen Tests Deep

This is the heavyweight test-strengthening path. You do not strengthen tests by guessing from the patch or by only polishing tests that already exist. This is not review-only: after validation, make the justified test-code improvements in the workspace instead of merely recommending them. Fresh-context subagents identify realistic regressions, missing coverage, and weak assertions; your job is to supply them with the changed behavior, relevant production code, discovered test harnesses, and conventions, and then to make only validated test improvements.

Get the effective diff and behavior surface automatically. Do not require a PR, explicit scope, or committed branch changes. Resolve the remote default branch from origin/HEAD, falling back to origin/main or origin/master only if needed; if no remote default exists, omit only the committed-branch part. Build one effective diff from the union of: committed branch changes with git diff $(git merge-base <remote-default> HEAD)..HEAD, staged changes with git diff --cached, unstaged changes with git diff, and untracked files from git ls-files --others --exclude-standard rendered as new-file diffs. Always include staged, unstaged, and untracked changes even when the committed branch diff exists. If focus_area or $ARGUMENTS is provided, use it only to narrow this discovered diff. If the effective diff is empty, generated-only, formatter-only, version-bump-only, or has no behavior/test relevance, stop with no test changes justified and one short reason.
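The diff assembly described in this step can be sketched in Python. This is a minimal sketch under stated assumptions, not the skill's own implementation: the helper names `git`, `resolve_remote_default`, and `effective_diff` are hypothetical, the git runner is injected so the logic can be exercised without a repository, and rendering untracked files with `git diff --no-index -- /dev/null <path>` is one assumed way to produce new-file diffs (POSIX-style paths and simple file names assumed).

```python
import subprocess
from typing import Callable, List, Optional


def git(*args: str) -> str:
    """Run git in the current directory and return its stdout."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True)
    # `git diff --no-index` exits with 1 when the inputs differ, which is expected here.
    ok_codes = (0, 1) if "--no-index" in args else (0,)
    if proc.returncode not in ok_codes:
        raise RuntimeError(f"git {' '.join(args)} failed: {proc.stderr.strip()}")
    return proc.stdout


def resolve_remote_default(run: Callable[..., str] = git) -> Optional[str]:
    """Prefer origin/HEAD, then origin/main, then origin/master; None if none exist."""
    try:
        ref = run("symbolic-ref", "--quiet", "--short", "refs/remotes/origin/HEAD").strip()
        if ref:
            return ref
    except RuntimeError:
        pass
    for candidate in ("origin/main", "origin/master"):
        try:
            run("rev-parse", "--verify", "--quiet", candidate)
            return candidate
        except RuntimeError:
            continue
    return None


def effective_diff(run: Callable[..., str] = git, remote_default: Optional[str] = None) -> str:
    """Union of committed, staged, unstaged, and untracked changes as one diff text."""
    parts: List[str] = []
    if remote_default is not None:
        # Committed branch changes; omitted only when no remote default exists.
        base = run("merge-base", remote_default, "HEAD").strip()
        parts.append(run("diff", f"{base}..HEAD"))
    parts.append(run("diff", "--cached"))  # staged changes
    parts.append(run("diff"))              # unstaged changes
    for path in run("ls-files", "--others", "--exclude-standard").splitlines():
        # Assumed rendering of an untracked file as a new-file diff.
        parts.append(run("diff", "--no-index", "--", "/dev/null", path))
    return "\n".join(part for part in parts if part.strip())
```

Inside a repository, `effective_diff(remote_default=resolve_remote_default())` would return the combined text; an empty string corresponds to the "effective diff is empty" stop condition. The generated-only, formatter-only, and version-bump-only checks are judgment calls that this sketch does not attempt.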
Read production code first, then discover the test seam. Identify the changed observable behaviors, contracts, public entry points, and failure paths before editing. Read nearby tests if they exist; if they do not, inspect sibling packages/modules, build config, test naming patterns, fixtures, and documented commands until you can name the correct harness and file location for a new test. Treat "no nearby tests" as a reason to add the first focused regression test when there is changed behavior, not as a reason to stop. Find every CLAUDE.md, AGENTS.md, or REVIEW.md whose directory is an ancestor of any changed production or test file and include those conventions in reviewer prompts.
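The convention lookup at the end of this step is a walk up the directory tree from each changed file. A minimal Python sketch, assuming changed files are given as paths relative to the repository root; the function name `find_convention_files` is hypothetical:

```python
from pathlib import Path
from typing import Iterable, List

CONVENTION_FILES = ("CLAUDE.md", "AGENTS.md", "REVIEW.md")


def find_convention_files(repo_root: Path, changed_files: Iterable[str]) -> List[Path]:
    """Collect convention files located in any ancestor directory of a changed file."""
    root = repo_root.resolve()
    found = set()
    for rel in changed_files:
        directory = (root / rel).parent
        while True:
            for name in CONVENTION_FILES:
                candidate = directory / name
                if candidate.is_file():
                    found.add(candidate)
            # Stop at the repository root (or the filesystem root as a safety net).
            if directory == root or directory == directory.parent:
                break
            directory = directory.parent
    return sorted(found)
```

Convention files in sibling directories are deliberately ignored, since only ancestors of a changed production or test file apply.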
Dispatch four fresh-context test reviewers in parallel. Before dispatch, confirm {DIFF} contains non-empty pasted diff text or focused excerpts, not a path, filename, or summary. Each reviewer gets this prompt verbatim, with {ROLE}, {DIFF}, {PRODUCTION_CONTEXT}, {TEST_CONTEXT}, {TEST_HARNESS}, and {CONVENTIONS} filled in. No session context — only what you paste:
You are reviewing tests as {ROLE}. Diff: {DIFF}. Production context: {PRODUCTION_CONTEXT}. Existing test context, if any: {TEST_CONTEXT}. Test harness and likely new-test locations: {TEST_HARNESS}. Conventions: {CONVENTIONS}. One issue = one missing or weak regression signal. If there are no tests, propose the smallest high-signal public-behavior test instead of saying there are no improvements. Skip coverage theater, style, private implementation details, and mock-call-order assertions unless the public contract is the call. Return exactly one of:

CANDIDATES:
- severity: <Critical | High | Medium | Low> path: line: claim: evidence: <realistic bug, public path, and why existing tests miss it> suggested_fix:
NO_FINDINGS Reviewed: <files/scope> Reason:
Roles:
- Behavior coverage — changed public behavior or contract not exercised through a real entry point
- Failure and edge cases — null/empty/boundary inputs, wrong exception, missing await, stale cache, retries, errors, auth bypass, schema drift
- Assertion strength — weak assertions, snapshots/golden files that hide the important outcome, tests that would pass with swapped arguments or wrong branches
- Test integration and maintainability — over-mocking, brittle implementation-detail coupling, missing fixture realism, unclear test names that obscure the regression
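Filling the reviewer prompt once per role can be sketched as follows. This is an illustration only: `PROMPT_TEMPLATE` is a shortened stand-in for the full prompt text above (which should be pasted verbatim in practice), and `looks_like_pasted_diff` is an assumed heuristic for the "non-empty pasted diff text, not a path, filename, or summary" check.

```python
import re
from typing import Dict, List

ROLE_NAMES = [
    "Behavior coverage",
    "Failure and edge cases",
    "Assertion strength",
    "Test integration and maintainability",
]

# Shortened stand-in; the real prompt continues with the return-format instructions.
PROMPT_TEMPLATE = (
    "You are reviewing tests as {ROLE}. Diff: {DIFF}. "
    "Production context: {PRODUCTION_CONTEXT}. "
    "Existing test context, if any: {TEST_CONTEXT}. "
    "Test harness and likely new-test locations: {TEST_HARNESS}. "
    "Conventions: {CONVENTIONS}."
)


def looks_like_pasted_diff(text: str) -> bool:
    """Heuristic (assumption): real diff text has a file header or a hunk marker."""
    return any(
        line.startswith(("diff --git", "@@"))
        for line in text.strip().splitlines()
    )


def build_reviewer_prompts(fields: Dict[str, str]) -> List[str]:
    """Return one filled prompt per reviewer role, refusing an empty or path-only DIFF."""
    if not looks_like_pasted_diff(fields.get("DIFF", "")):
        raise ValueError("DIFF must contain pasted diff text, not a path or summary")
    prompts = []
    for role in ROLE_NAMES:
        values = dict(fields, ROLE=role)
        # Single-pass substitution so braces inside pasted text are never re-expanded.
        prompts.append(
            re.sub(
                r"\{([A-Z_]+)\}",
                lambda m: values.get(m.group(1), m.group(0)),
                PROMPT_TEMPLATE,
            )
        )
    return prompts
```

Each returned prompt would go to its own fresh-context subagent; how subagents are launched depends on the host tool and is not shown here.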
Validate reviewer output before candidate validation. A reviewer response is valid only if it contains either CANDIDATES: or NO_FINDINGS. Empty, whitespace-only, truncated, or otherwise unstructured output is invalid. If any reviewer output is invalid, retry that reviewer once with a smaller pasted diff/context packet. If it is still invalid, stop with Review inconclusive. Never treat invalid output as no findings.
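The validity rule and single retry can be sketched like this. The names `ReviewInconclusive`, `is_valid_reviewer_output`, and `review_with_retry` are hypothetical, and `ask` stands in for whatever call dispatches a subagent. Truncation cannot be detected reliably from text alone, so this sketch only covers the empty and unstructured cases.

```python
from typing import Callable


class ReviewInconclusive(Exception):
    """Raised when a reviewer stays invalid after one retry."""


def is_valid_reviewer_output(text: str) -> bool:
    """Valid only if non-blank and containing one of the two structured markers."""
    stripped = text.strip()
    return bool(stripped) and ("CANDIDATES:" in stripped or "NO_FINDINGS" in stripped)


def review_with_retry(
    ask: Callable[[str], str], full_packet: str, smaller_packet: str, role: str
) -> str:
    """Try the full packet, then retry once with the smaller packet."""
    for packet in (full_packet, smaller_packet):
        output = ask(packet)
        if is_valid_reviewer_output(output):
            return output
    # Invalid output is never treated as "no findings".
    raise ReviewInconclusive(role)
```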
Validate every candidate. Use one fresh Task subagent per candidate test improvement. Each validator gets this prompt verbatim, with {ISSUE}, {FILES}, {TEST_HARNESS}, and {CONVENTIONS} filled in:

Issue: {ISSUE}. Relevant production and test files in full or focused excerpts: {FILES}. Test harness and candidate file location: {TEST_HARNESS}. Conventions: {CONVENTIONS}. Confirm or refute. Return VALIDATED: with concrete evidence: changed behavior, realistic regression, existing test gap or absence, public path the new or changed test should exercise, why this is the minimal maintainable test location, and score 0–100. Anything that does not catch a named realistic bug scores under 80.

A validator response is invalid if it is empty, whitespace-only, truncated, or missing VALIDATED: with a score and evidence. Retry invalid validator output once with smaller focused production/test excerpts. If it is still invalid, stop with Review inconclusive. Drop every candidate below 80.
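The 80-point cutoff can be sketched as below. The validator prompt does not fix an exact score syntax, so the `score: 85` pattern matched here is an assumption, as are the names `parse_validator_score` and `keep_candidate`. The sketch checks for the `VALIDATED:` marker and a score but does not judge whether evidence is present.

```python
import re
from typing import Optional

# Assumed score syntax: "score: 85", "score=85", or "score 85".
SCORE_PATTERN = re.compile(r"score\s*[:=]?\s*(\d{1,3})\b", re.IGNORECASE)
THRESHOLD = 80


def parse_validator_score(response: str) -> Optional[int]:
    """Return the 0-100 score, or None when the response is invalid."""
    text = response.strip()
    if "VALIDATED:" not in text:
        return None
    match = SCORE_PATTERN.search(text)
    if match is None:
        return None
    score = int(match.group(1))
    return score if 0 <= score <= 100 else None


def keep_candidate(score: int) -> bool:
    """Candidates scoring below the threshold are dropped."""
    return score >= THRESHOLD
```

A `None` result maps to the retry-once path; a second `None` means stopping with Review inconclusive instead of silently dropping the candidate.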
Implement only validated improvements. Validated improvements are mandatory edits, not suggestions: modify existing tests or add new test code at the narrowest useful seam. Prefer public behavior over private fields or mock call order. When tests exist, strengthen or extend them at the narrowest useful level. When no suitable test exists, create the smallest idiomatic test file in the discovered harness that exercises the changed public path end-to-end enough to fail for the named regression. Replace weak assertions with exact observable outcomes. Add edge/failure cases only when tied to real changed paths. Reject tests that only prove mocks, test-only production APIs, or implementation details. If a mock becomes more complex than the behavior, prefer a public seam or integration path. Skip trivial getters, generated code, framework boilerplate, style conventions, and broad coverage goals.
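To illustrate "replace weak assertions with exact observable outcomes", here is a small sketch built around a hypothetical `apply_discount` function, which is invented for this example and not taken from any real codebase. The weak test passes for many wrong implementations; the exact tests fail on wrong arithmetic, swapped arguments, and a missing range check.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical production function: integer discount with floor rounding."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


def test_discount_weak():
    # Weak: still passes if the discount were 1% or 99%.
    result = apply_discount(1000, 10)
    assert result is not None
    assert result <= 1000


def test_discount_exact():
    # Exact outcome through the public entry point; asymmetric inputs mean
    # swapped arguments (10, 1000) would raise instead of passing by accident.
    assert apply_discount(1000, 10) == 900
    # Boundary values tied to the range check in the changed path.
    assert apply_discount(999, 0) == 999
    assert apply_discount(999, 100) == 0


def test_discount_rejects_out_of_range_percent():
    # Failure path: the exception type is part of the observable contract.
    try:
        apply_discount(1000, 101)
    except ValueError:
        return
    raise AssertionError("expected ValueError for percent > 100")
```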
Run checks. Run the targeted tests that prove each improvement, then the broader relevant check when available. If no check applies, say why.
Report. If any reviewer or validator returned invalid output after retry, output Review inconclusive and the failed role. For each touched test, output:

<file>::<test_name> — catches <named bug> via <public path> — <command> -> <result>

If no validated improvement exists, output no test changes justified and one short sentence naming the reviewed behavior surface.
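The report shape can be sketched as a small formatter. The `TouchedTest` dataclass and `format_report` function are hypothetical, and the exact wording of the no-change line is an assumption based on the description above.

```python
from dataclasses import dataclass
from typing import List

SEP = " \u2014 "  # the dash separator used in the report line format above


@dataclass
class TouchedTest:
    file: str
    test_name: str
    named_bug: str
    public_path: str
    command: str
    result: str


def format_report(touched: List[TouchedTest], behavior_surface: str) -> str:
    """One line per touched test, or the no-change message when nothing was validated."""
    if not touched:
        return f"no test changes justified. Reviewed: {behavior_surface}."
    return "\n".join(
        f"{t.file}::{t.test_name}{SEP}catches {t.named_bug} via {t.public_path}"
        f"{SEP}{t.command} -> {t.result}"
        for t in touched
    )
```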

Never add tests that only lock in implementation details. Never weaken, skip, delete, or baseline existing checks to get green.

More from this repository

architecture-review-deep

alexandersumer/config

Heavyweight codebase architecture review using fresh-context subagents and validation. Use when the user asks for deep, thorough, multi-agent, high-confidence, or heavyweight review of architecture, modularity, seams, coupling, ownership, domain boundaries, AI-navigability, or testability. Use architecture-review-solo for ordinary/direct single-agent architecture review.

2026-06-04

architecture-review-solo

alexandersumer/config

Direct codebase architecture review without subagents. Use for ordinary, direct, inline, single-agent, or no-subagent architecture review. Use architecture-review-deep when the user asks for deep, thorough, multi-agent, high-confidence, or heavyweight architecture review.

2026-06-04

design-review-deep

alexandersumer/config

Heavyweight design review using fresh-context subagents and validation. Use when the user asks for deep, thorough, multi-agent, high-confidence, or heavyweight review of a software design, architecture proposal, API boundary, module split, data model, or design-sensitive diff. Use design-review-solo for ordinary/direct single-agent design review.

2026-06-04

design-review-solo

alexandersumer/config

Direct design review without subagents. Use for ordinary, direct, inline, single-agent, or no-subagent review of software design, architecture proposals, API boundaries, module splits, data models, or design-sensitive diffs. Use design-review-deep when the user asks for deep, thorough, multi-agent, high-confidence, or heavyweight design review.

2026-06-04

execute-plan

alexandersumer/config

Implement a plan

2026-06-04

review-deep

alexandersumer/config

Heavyweight code-change review using fresh-context subagents and validation. Use when the user asks for deep, thorough, multi-agent, high-confidence, or heavyweight review of PRs, diffs, branches, staged changes, unstaged changes, untracked files, or any set of code/config/test changes. Use review-solo for ordinary/direct single-agent review.

2026-06-04

Useful for (SOC)

Software Quality Assurance Analysts and Testers · Computer and Mathematical Occupations · 15-1253 · L4
