Jeden Skill in Manus ausführen
mit einem Klick

Jeden Skill in Manus mit einem Klick ausführen

adversarial-review

Sterne0

Forks0

Aktualisiert24. Juni 2026 um 03:26

Use when a change is written and "looks done" but has not had a hostile second pass before merge — especially diffs touching auth, money, migrations, concurrency, or anything the author is quietly unsure about. Spawns a fresh-eyes reviewer subagent that sees ONLY the diff and the spec, collects findings, drives fixes, and re-dispatches until findings degrade to nitpicks. Reach for this instead of self-reviewing; the author is the worst reviewer of their own diff.

Installation

Mit Codex oder Claude installieren Kopieren Sie diesen Prompt, fügen Sie ihn in Codex, Claude oder einen anderen Assistant ein und lassen Sie die Skill-Seite prüfen und installieren.

In Manus ausführen

Quelle

az9713

az9713/skill-best-practices

GitHub-Repository öffnen Creator-Repositorys ansehen

Download

In Manus ausführen

Datei-Explorer

2 Dateien

SKILL.md

readonly

name

adversarial-review

description

Adversarial Review

Overview

Self-review is weak because the author already believes the code is correct — they re-read it looking for confirmation, not for holes. This skill manufactures a genuinely independent critic: a fresh subagent dispatched via the Task tool that has never seen your reasoning, only the raw diff and the spec it must satisfy. You collect its findings, fix the real ones, and dispatch a new fresh reviewer on the updated diff. You repeat until a reviewer comes back with nothing worse than nitpicks. That degradation-to-nitpicks is the stop signal.

The reviewer is adversarial on purpose: its job is to find what is wrong, not to bless the change. Your job across the loop is to keep it honest (don't feed it your conclusions) and to keep the loop bounded (don't chase cosmetic findings forever).

When to Use

Use this when:

A change is functionally complete and you are about to merge, ship, or open a PR.
The diff touches something where a subtle bug is expensive: auth, billing, data migrations, money math, concurrency, deletes, or external API contracts.
You feel a vague unease about a change but can't name the problem — that unease is exactly what a fresh reviewer surfaces.
A review came back clean too fast and you suspect it rubber-stamped.

Do NOT use this when:

There is no spec or acceptance criterion to judge against. Write one first — a reviewer with no spec can only check style, and you'll get noise. (For pure style, use code-style; for test quality, use testing-practices.)
The change is a one-line typo fix or a generated lockfile bump; the loop costs more than the risk.
You haven't run the existing tests yet. Adversarial review critiques design and spec-fit; it is not a substitute for running the suite.

The loop

The skill is a loop, not a single review. One pass anchors and misses; the value is in re-dispatching fresh reviewers on the shrinking set of problems.

1. Build the packet (deterministic, anchoring-free)

Run the script to assemble exactly what the reviewer is allowed to see — the diff plus the spec, and nothing from your own head:

scripts/prepare_review_packet.sh path/to/spec.md > /tmp/review_packet.md

The script refuses to run without a spec and refuses on an empty diff. Its output already contains the reviewer instructions; do not edit them to soften the ask.

2. Dispatch a fresh reviewer subagent

Use the Task tool to spawn a general-purpose / read-only reviewer subagent. Pass the packet contents as the prompt. Keep it tool-generic — any fresh subagent works; do not depend on one specific installed reviewer agent by name.

Critically: the subagent must start cold. Do not prepend your summary, your PR description, "I already verified X", or "I think the only risk is Y". Every word of context you add is a hint that steers the reviewer toward your blind spots.

Ask it for findings as a severity-tagged list: blocker / major / minor / nitpick, each with file:line and the reason it violates the spec or is unsafe.

3. Triage and fix the real findings

Blocker / major: fix in the working tree now.
Minor: fix if cheap; otherwise note and decide.
Nitpick: usually ignore. A wave of only-nitpicks is your exit condition.
Wrong/hallucinated finding: the reviewer lacked context the diff couldn't show. Don't argue with the subagent — instead, make the diff self-explanatory (a comment, a clearer name) or add the missing fact to the spec, then move on.

Stay inside the diff. If the reviewer points at a pre-existing problem outside the change, log it as follow-up — do not let the review balloon the diff (see Gotchas).

4. Re-dispatch on the updated diff

Regenerate the packet (prepare_review_packet.sh again — the diff is now different) and spawn a brand new reviewer. Reusing the same subagent re-anchors it on its earlier verdict; a fresh one re-judges from scratch.

5. Stop when findings degrade to nitpicks

Stop when a fresh reviewer returns nothing above nitpick severity, OR when two consecutive rounds surface only cosmetic/subjective items. Don't loop for a perfect score — diminishing returns are real and reviewers will always find something to say. Record what you fixed and ship.

Gotchas

Anchoring is the #1 failure. The instant you tell the reviewer your conclusion ("just sanity-check the error handling"), it confines itself to that and rubber-stamps the rest. Hand it ONLY the diff + spec via the packet. The script exists specifically so you can't leak your reasoning by accident.
A reviewer without the spec can only nitpick. Code alone tells you what the code does, not whether that's correct. Without acceptance criteria the reviewer has nothing to measure against and produces style noise. No spec → write one → then review.
Scope creep masquerading as thoroughness. A good reviewer will spot real problems in code your diff merely sits next to. Tempting to fix them "while I'm here" — but that balloons the diff, invites new bugs, and makes the next review pass harder. Keep fixes inside the diff; file the rest as follow-ups.
Knowing when to stop. Reviewers never run out of opinions. If you treat every nitpick as a must-fix you loop forever and start introducing churn-bugs. Degradation to nitpicks IS done. Two cosmetic-only rounds = stop.
Re-using the same subagent re-anchors it. It remembers its prior verdict and defends it. Always dispatch a new fresh subagent each round.
Don't argue with a hallucinated finding — fix the diff's legibility. If the reviewer flags something that's actually handled but invisible in the diff, the diff is under-communicating. Add the clarifying comment or spec line rather than rebutting; the next reviewer would have tripped on the same thing.

Files

SKILL.md — this file; the review loop and its stop condition.
scripts/prepare_review_packet.sh — emits the anchoring-free reviewer packet (diff + spec only). Run before every dispatch; refuses on missing spec or empty diff.

Mehr aus diesem Repository

gleiches Repository

babysit-pr

az9713/skill-best-practices

Use when a PR is open and green-but-blocked, or red on CI for reasons that smell like flake — a timed-out test runner, a transient network 500 in a setup step, a check that passed locally but failed in CI. Reach for this whenever someone says "this PR keeps failing CI but the test is flaky", "can you babysit this PR to merge", "it's just a flaky check, retry it", or wants a PR shepherded through retries, conflict resolution, and auto-merge without sitting on it manually. Prefer this over hand-clicking "Re-run failed jobs" in the GitHub UI, which gives up no signal on flaky-vs-real and forgets to enable auto-merge.

2026-06-240

billing-lib

az9713/skill-best-practices

Use when writing or reviewing code that meters API token usage, bills accounts, issues invoices, applies credit grants, or computes balances with the internal `billing` library — especially around retries, mid-cycle plan changes, cache-read vs cache-write token pricing, or any place where double-billing or rounding drift would be a problem.

2026-06-240

checkout-verifier

az9713/skill-best-practices

Use when an API-credits checkout or paid-plan upgrade needs to be proven end-to-end against Stripe test mode — confirming a card charge actually creates the invoice and subscription in the right state, reproducing a "I paid but my credits didn't show up" report, checking that a declined or 3DS card fails the way the UI claims, or wiring a billing smoke test into CI so a checkout regression is caught before a customer's money is.

2026-06-240

cherry-pick-prod

az9713/skill-best-practices

Use when a specific fix that's already on main needs to land on a production/release branch without dragging along everything else — a hotfix to backport, a "cherry-pick this commit onto release-2.4", a "we need just that one PR on prod" request. Reach for this whenever someone wants to port one or a few commits to a release branch and open a PR for it, especially before doing it by hand in their main checkout, which pollutes their working tree and routinely leaves conflict markers committed or loses the original commit's provenance.

2026-06-240

code-style

az9713/skill-best-practices

Use when writing or editing code in this org's Python or JS/TS, especially before committing or opening a PR — and proactively the moment a diff adds an import, an except/catch, or any logging. Enforces the style rules Claude gets wrong by default: import grouping, error-wrapping (no bare except / empty catch), no leftover debug prints, explicit over clever. Runs scripts/check_style.sh (ruff, mypy --strict, eslint + grep guards) which exits nonzero so it drops into a pre-commit hook or CI.

2026-06-240

cohort-compare

az9713/skill-best-practices

Use when someone wants to compare two cohorts' retention or conversion, asks whether a difference between segments is "real" or "significant", wants retention curves for an A/B group or a launch vs control, or says one cohort "looks better" and needs the delta flagged with a p-value. Reach for this whenever the question is two-group comparison plus significance — and especially before eyeballing two percentages and declaring a winner, which ignores sample size and observation-window mismatch.

2026-06-240

name

adversarial-review

description

Adversarial Review

Overview

When to Use

Use this when:

A change is functionally complete and you are about to merge, ship, or open a PR.
The diff touches something where a subtle bug is expensive: auth, billing, data migrations, money math, concurrency, deletes, or external API contracts.
You feel a vague unease about a change but can't name the problem — that unease is exactly what a fresh reviewer surfaces.
A review came back clean too fast and you suspect it rubber-stamped.

Do NOT use this when:

There is no spec or acceptance criterion to judge against. Write one first — a reviewer with no spec can only check style, and you'll get noise. (For pure style, use code-style; for test quality, use testing-practices.)
The change is a one-line typo fix or a generated lockfile bump; the loop costs more than the risk.
You haven't run the existing tests yet. Adversarial review critiques design and spec-fit; it is not a substitute for running the suite.

The loop

The skill is a loop, not a single review. One pass anchors and misses; the value is in re-dispatching fresh reviewers on the shrinking set of problems.

1. Build the packet (deterministic, anchoring-free)

Run the script to assemble exactly what the reviewer is allowed to see — the diff plus the spec, and nothing from your own head:

scripts/prepare_review_packet.sh path/to/spec.md > /tmp/review_packet.md

The script refuses to run without a spec and refuses on an empty diff. Its output already contains the reviewer instructions; do not edit them to soften the ask.

2. Dispatch a fresh reviewer subagent

Ask it for findings as a severity-tagged list: blocker / major / minor / nitpick, each with file:line and the reason it violates the spec or is unsafe.

3. Triage and fix the real findings

Blocker / major: fix in the working tree now.
Minor: fix if cheap; otherwise note and decide.
Nitpick: usually ignore. A wave of only-nitpicks is your exit condition.
Wrong/hallucinated finding: the reviewer lacked context the diff couldn't show. Don't argue with the subagent — instead, make the diff self-explanatory (a comment, a clearer name) or add the missing fact to the spec, then move on.

Stay inside the diff. If the reviewer points at a pre-existing problem outside the change, log it as follow-up — do not let the review balloon the diff (see Gotchas).

4. Re-dispatch on the updated diff

5. Stop when findings degrade to nitpicks

Gotchas

Anchoring is the #1 failure. The instant you tell the reviewer your conclusion ("just sanity-check the error handling"), it confines itself to that and rubber-stamps the rest. Hand it ONLY the diff + spec via the packet. The script exists specifically so you can't leak your reasoning by accident.
A reviewer without the spec can only nitpick. Code alone tells you what the code does, not whether that's correct. Without acceptance criteria the reviewer has nothing to measure against and produces style noise. No spec → write one → then review.
Scope creep masquerading as thoroughness. A good reviewer will spot real problems in code your diff merely sits next to. Tempting to fix them "while I'm here" — but that balloons the diff, invites new bugs, and makes the next review pass harder. Keep fixes inside the diff; file the rest as follow-ups.
Knowing when to stop. Reviewers never run out of opinions. If you treat every nitpick as a must-fix you loop forever and start introducing churn-bugs. Degradation to nitpicks IS done. Two cosmetic-only rounds = stop.
Re-using the same subagent re-anchors it. It remembers its prior verdict and defends it. Always dispatch a new fresh subagent each round.
Don't argue with a hallucinated finding — fix the diff's legibility. If the reviewer flags something that's actually handled but invisible in the diff, the diff is under-communicating. Add the clarifying comment or spec line rather than rebutting; the next reviewer would have tripped on the same thing.

Files

SKILL.md — this file; the review loop and its stop condition.
scripts/prepare_review_packet.sh — emits the anchoring-free reviewer packet (diff + spec only). Run before every dispatch; refuses on missing spec or empty diff.