Jeden Skill in Manus ausführen
mit einem Klick

Jeden Skill in Manus mit einem Klick ausführen

review-verification-protocol

Mandatory verification steps for all code reviews to reduce false positives. Load this skill before reporting ANY code review findings.

In Manus ausführen

Überblick

Mandatory verification steps for all code reviews to reduce false positives. Load this skill before reporting ANY code review findings.

Installationsbefehl

npx skills add https://github.com/existential-birds/beagle --skill review-verification-protocol

Kopieren Sie diesen Befehl und fügen Sie ihn in Claude Code ein, um den Skill zu installieren

Quelle

existential-birds/beagle

Sterne61

Forks8

Aktualisiert31. Mai 2026 um 22:13

SKILL.md

readonly

Jeden Skill mit einem Klick ausführen

name	review-verification-protocol
description	Mandatory verification steps for all code reviews to reduce false positives. Load this skill before reporting ANY code review findings.
user-invocable	false

Review Verification Protocol

This protocol MUST be followed before reporting any code review finding. Skipping these steps leads to false positives that waste developer time and erode trust in reviews.

Anti-confabulation (gate 0 — runs before every other gate)

Before issuing any verdict — flag, reject, or downgrade a finding — you MUST echo the exact artifact you are judging, quoted from a source you read in this turn:

For a code finding: the file:line plus the cited code, read freshly now (not recalled from earlier in the session).
For a diff review: the actual diff hunk under review.

The artifact is the only source of truth. Never infer what you are reviewing from the branch name, the working directory, surrounding files, or recollection. If your mental model differs from the freshly read source, the source wins. A verdict issued without a same-turn echo of its target is invalid — emit the echo first, or do not emit the verdict.

This gate exists because an LLM under contextual priming will confidently flag code that is not in the file. It runs before the hard gates below.

Hard gates (sequence)

Complete gates in order for each finding (or finding class). A gate passes only when the pass condition is objectively met—internal confidence is not enough.

Read scope — PASS when: You cite the full enclosing unit you judged (e.g. function, method, or class): file path plus start–end line range or symbol name. Diff-only context without full-unit read fails this gate.
Reference check (required for “unused”, “dead code”, “orphaned export”, “never called”) — PASS when: You ran a workspace-wide search for the symbol (ripgrep, IDE references, or find_referencing_symbols-style lookup) and noted whether non-definition matches exist. If use may be dynamic (decorators, getattr, entry points, plugins), PASS when: you state that and name the registration or import path that could justify the symbol.
Upstream / downstream (required for “missing validation”, “no error handling”, “race”, “leak”) — PASS when: You checked at least one of: caller, route/middleware, parent task, framework hook, or lifecycle (e.g. teardown, signal), and recorded whether responsibility already sits there.
Evidence line — PASS when: The finding includes [FILE:LINE] to the line that shows the issue (same requirement as Before Submitting Review).

If any gate fails, do not report the issue; gather evidence or drop it.

Pre-Report Verification Checklist

Before flagging ANY issue, verify (after Hard gates above):

I read the actual code - Not just the diff context, but the full function/class
I searched for usages - Before claiming "unused", searched all references
I checked surrounding code - The issue may be handled elsewhere (guards, earlier checks)
I verified syntax against current docs - Framework syntax evolves (Tailwind v4, TS 5.x, React 19)
I distinguished "wrong" from "different style" - Both approaches may be valid
I considered intentional design - Checked comments, project conventions (e.g. AGENTS.md or CLAUDE.md), architectural context

Verification by Issue Type

"Unused Variable/Function"

Before flagging, you MUST:

Search for ALL references in the codebase (grep/find)
Check if it's exported and used by external consumers
Check if it's used via reflection, decorators, or dynamic dispatch
Verify it's not a callback passed to a framework

Common false positives:

State setters in React (may trigger re-renders even if value appears unused)
Variables used in templates/JSX
Exports used by consuming packages

"Missing Validation/Error Handling"

Before flagging, you MUST:

Check if validation exists at a higher level (caller, middleware, route handler)
Check if the framework provides validation (Pydantic, Zod, TypeScript)
Verify the "missing" check isn't present in a different form

Common false positives:

Framework already validates (FastAPI + Pydantic, React Hook Form)
Parent component validates before passing props
Error boundary catches at higher level

"Type Assertion/Unsafe Cast"

Before flagging, you MUST:

Confirm it's actually an assertion, not an annotation
Check if the type is narrowed by runtime checks before the point
Verify if framework guarantees the type (loader data, form data)

Valid patterns often flagged incorrectly:

# Type annotation, NOT cast
data: UserData = await load_user()

# Type narrowing with isinstance
if isinstance(data, User):
    data.name  # Mypy knows this is User

"Potential Memory Leak/Race Condition"

Before flagging, you MUST:

Verify cleanup function is actually missing (not just in a different location)
Check if AbortController signal is checked after awaits
Confirm the component can actually unmount during the async operation

Common false positives:

Cleanup exists in useEffect return
Signal is checked (code reviewer missed it)
Operation completes before unmount is possible

"Performance Issue"

Before flagging, you MUST:

Confirm the code runs frequently enough to matter (render vs click handler)
Verify the optimization would have measurable impact
Check if the framework already optimizes this (React compiler, memoization)

Do NOT flag:

Functions created in click handlers (runs once per click)
Array methods on small arrays (< 100 items)
Object creation in event handlers

Severity Calibration

Critical (Block Merge)

ONLY use for:

Security vulnerabilities (injection, auth bypass, data exposure)
Data corruption bugs
Crash-causing bugs in happy path
Breaking changes to public APIs

Major (Should Fix)

Use for:

Logic bugs that affect functionality
Missing error handling that causes poor UX
Performance issues with measurable impact
Accessibility violations

Minor (Consider Fixing)

Use for:

Code clarity improvements
Documentation gaps
Inconsistent style (within reason)
Non-critical test coverage gaps

Informational (No Action Required)

Use for:

Improvements that require adding new dependencies or modules
Suggestions for net-new code that didn't exist in the codebase before (new modules, test suites, abstractions)
Architectural ideas for future consideration
Test infrastructure suggestions (new mock libraries, behaviour extraction)
Optimizations without measurable impact in the current context

These are NOT review blockers. They should be noted for the author's awareness but must not appear in the actionable issue count. The Verdict should ignore informational items entirely.

Do NOT Flag At All

Style preferences where both approaches are valid
Optimizations with no measurable benefit
Test code not meeting production standards (intentionally simpler)
Library/framework internal code (shadcn components, generated code)
Hypothetical issues that require unlikely conditions

Valid Patterns (Do NOT Flag)

Python

Pattern	Why It's Valid
`dict.get(key, [])`	Returns default for missing keys, not error suppression
`Optional[T]` return type	Standard way to express nullable in Python typing
`assert` in test code	pytest uses assertions, not try/except
Type annotation on variable	Not a cast, just a hint for type checkers
`typing.cast()` with prior validation	Valid after runtime check confirms type

FastAPI

Pattern	Why It's Valid
`Depends()` without explicit type	FastAPI infers dependency type from function signature
`async def` endpoint without await	May use sync DB calls or simple returns
Response model different from DB model	Separation of concerns between API and persistence
`BackgroundTasks` parameter	Valid for fire-and-forget operations
Direct `request.state` access	Standard pattern for middleware-injected data

Testing

Pattern	Why It's Valid
`assert` without message	pytest rewrites assertions to show detailed diffs
`@pytest.fixture` without explicit scope	Default `function` scope is correct for most fixtures
`monkeypatch` over `unittest.mock`	Simpler API, pytest-native
Fixture returning mutable state	Each test gets fresh fixture invocation by default

General

Pattern	Why It's Valid
`+?` lazy quantifier in regex	Prevents over-matching, correct for many patterns
Direct string concatenation	Simpler than template literals for simple cases
Multiple returns in function	Can improve readability
Comments explaining "why"	Better than no comments

Context-Sensitive Rules

Type Annotations

Flag missing type annotation ONLY IF ALL of these are true:

Function is public API (not prefixed with _)
Types are not obvious from context (e.g., x = 5 is clearly int)
Not a test function or fixture
Codebase has existing typing conventions

Exception Handling

Flag bare except ONLY IF:

Not in a top-level error boundary / middleware
The caught exception is actually swallowed (not logged/re-raised)
Specific exception types are known and available
Not in cleanup/teardown code where any error should be caught

Error Handling

Flag missing try/except ONLY IF:

No middleware or error handler catches this at a higher level
The framework doesn't handle errors (FastAPI exception handlers)
The error would cause a crash, not just a failed operation
User needs specific feedback for this error type

Before Submitting Review

Final verification:

Re-read each finding and ask: "Did I verify this is actually an issue?"
For each finding, can you point to the specific line that proves the issue exists?
Would a domain expert agree this is a problem, or is it a style preference?
Does fixing this provide real value, or is it busywork?
Format every finding as: [FILE:LINE] ISSUE_TITLE
For each finding, ask: "Does this fix existing code, or does it request entirely new code that didn't exist before?" If the latter, downgrade to Informational.
If this is a re-review: ONLY verify previous fixes. Do not introduce new findings.

If uncertain about any finding, either:

Remove it from the review
Mark it as a question rather than an issue
Verify by reading more code context