| name | review-verification-protocol |
| description | Mandatory verification steps for all code reviews to reduce false positives. Load this skill before reporting ANY code review findings. |
| user-invocable | false |
Review Verification Protocol
This protocol MUST be followed before reporting any code review finding. Skipping these steps leads to false positives that waste developer time and erode trust in reviews.
Anti-confabulation (gate 0 — runs before every other gate)
Before issuing any verdict — flag, reject, or downgrade a finding — you MUST echo the exact artifact you are judging, quoted from a source you read in this turn:
- For a code finding: the file:line plus the cited code, read freshly now (not recalled from earlier in the session).
- For a diff review: the actual diff hunk under review.
The artifact is the only source of truth. Never infer what you are reviewing from the branch name, the working directory, surrounding files, or recollection. If your mental model differs from the freshly read source, the source wins. A verdict issued without a same-turn echo of its target is invalid — emit the echo first, or do not emit the verdict.
This gate exists because an LLM under contextual priming will confidently flag code that is not in the file. It runs before the hard gates below.
Hard gates (sequence)
Complete gates in order for each finding (or finding class). A gate passes only when the pass condition is objectively met—internal confidence is not enough.
- Read scope — PASS when: You cite the full enclosing unit you judged (e.g. function, method, or class): file path plus start–end line range or symbol name. Diff-only context without full-unit read fails this gate.
- Reference check (required for “unused”, “dead code”, “orphaned export”, “never called”) — PASS when: You ran a workspace-wide search for the symbol (ripgrep, IDE references, or
find_referencing_symbols-style lookup) and noted whether non-definition matches exist. If use may be dynamic (decorators, getattr, entry points, plugins), PASS when: you state that and name the registration or import path that could justify the symbol.
- Upstream / downstream (required for “missing validation”, “no error handling”, “race”, “leak”) — PASS when: You checked at least one of: caller, route/middleware, parent task, framework hook, or lifecycle (e.g. teardown, signal), and recorded whether responsibility already sits there.
- Evidence line — PASS when: The finding includes
[FILE:LINE] to the line that shows the issue (same requirement as Before Submitting Review).
If any gate fails, do not report the issue; gather evidence or drop it.
Pre-Report Verification Checklist
Before flagging ANY issue, verify (after Hard gates above):
Verification by Issue Type
"Unused Variable/Function"
Before flagging, you MUST:
- Search for ALL references in the codebase (grep/find)
- Check if it's exported and used by external consumers
- Check if it's used via reflection, decorators, or dynamic dispatch
- Verify it's not a callback passed to a framework
Common false positives:
- State setters in React (may trigger re-renders even if value appears unused)
- Variables used in templates/JSX
- Exports used by consuming packages
"Missing Validation/Error Handling"
Before flagging, you MUST:
- Check if validation exists at a higher level (caller, middleware, route handler)
- Check if the framework provides validation (Pydantic, Zod, TypeScript)
- Verify the "missing" check isn't present in a different form
Common false positives:
- Framework already validates (FastAPI + Pydantic, React Hook Form)
- Parent component validates before passing props
- Error boundary catches at higher level
"Type Assertion/Unsafe Cast"
Before flagging, you MUST:
- Confirm it's actually an assertion, not an annotation
- Check if the type is narrowed by runtime checks before the point
- Verify if framework guarantees the type (loader data, form data)
Valid patterns often flagged incorrectly:
data: UserData = await load_user()
if isinstance(data, User):
data.name
"Potential Memory Leak/Race Condition"
Before flagging, you MUST:
- Verify cleanup function is actually missing (not just in a different location)
- Check if AbortController signal is checked after awaits
- Confirm the component can actually unmount during the async operation
Common false positives:
- Cleanup exists in useEffect return
- Signal is checked (code reviewer missed it)
- Operation completes before unmount is possible
"Performance Issue"
Before flagging, you MUST:
- Confirm the code runs frequently enough to matter (render vs click handler)
- Verify the optimization would have measurable impact
- Check if the framework already optimizes this (React compiler, memoization)
Do NOT flag:
- Functions created in click handlers (runs once per click)
- Array methods on small arrays (< 100 items)
- Object creation in event handlers
Severity Calibration
Critical (Block Merge)
ONLY use for:
- Security vulnerabilities (injection, auth bypass, data exposure)
- Data corruption bugs
- Crash-causing bugs in happy path
- Breaking changes to public APIs
Major (Should Fix)
Use for:
- Logic bugs that affect functionality
- Missing error handling that causes poor UX
- Performance issues with measurable impact
- Accessibility violations
Minor (Consider Fixing)
Use for:
- Code clarity improvements
- Documentation gaps
- Inconsistent style (within reason)
- Non-critical test coverage gaps
Informational (No Action Required)
Use for:
- Improvements that require adding new dependencies or modules
- Suggestions for net-new code that didn't exist in the codebase before (new modules, test suites, abstractions)
- Architectural ideas for future consideration
- Test infrastructure suggestions (new mock libraries, behaviour extraction)
- Optimizations without measurable impact in the current context
These are NOT review blockers. They should be noted for the author's awareness but must not appear in the actionable issue count. The Verdict should ignore informational items entirely.
Do NOT Flag At All
- Style preferences where both approaches are valid
- Optimizations with no measurable benefit
- Test code not meeting production standards (intentionally simpler)
- Library/framework internal code (shadcn components, generated code)
- Hypothetical issues that require unlikely conditions
Valid Patterns (Do NOT Flag)
Python
| Pattern | Why It's Valid |
|---|
dict.get(key, []) | Returns default for missing keys, not error suppression |
Optional[T] return type | Standard way to express nullable in Python typing |
assert in test code | pytest uses assertions, not try/except |
| Type annotation on variable | Not a cast, just a hint for type checkers |
typing.cast() with prior validation | Valid after runtime check confirms type |
FastAPI
| Pattern | Why It's Valid |
|---|
Depends() without explicit type | FastAPI infers dependency type from function signature |
async def endpoint without await | May use sync DB calls or simple returns |
| Response model different from DB model | Separation of concerns between API and persistence |
BackgroundTasks parameter | Valid for fire-and-forget operations |
Direct request.state access | Standard pattern for middleware-injected data |
Testing
| Pattern | Why It's Valid |
|---|
assert without message | pytest rewrites assertions to show detailed diffs |
@pytest.fixture without explicit scope | Default function scope is correct for most fixtures |
monkeypatch over unittest.mock | Simpler API, pytest-native |
| Fixture returning mutable state | Each test gets fresh fixture invocation by default |
General
| Pattern | Why It's Valid |
|---|
+? lazy quantifier in regex | Prevents over-matching, correct for many patterns |
| Direct string concatenation | Simpler than template literals for simple cases |
| Multiple returns in function | Can improve readability |
| Comments explaining "why" | Better than no comments |
Context-Sensitive Rules
Type Annotations
Flag missing type annotation ONLY IF ALL of these are true:
Exception Handling
Flag bare except ONLY IF:
Error Handling
Flag missing try/except ONLY IF:
Before Submitting Review
Final verification:
- Re-read each finding and ask: "Did I verify this is actually an issue?"
- For each finding, can you point to the specific line that proves the issue exists?
- Would a domain expert agree this is a problem, or is it a style preference?
- Does fixing this provide real value, or is it busywork?
- Format every finding as:
[FILE:LINE] ISSUE_TITLE
- For each finding, ask: "Does this fix existing code, or does it request entirely new code that didn't exist before?" If the latter, downgrade to Informational.
- If this is a re-review: ONLY verify previous fixes. Do not introduce new findings.
If uncertain about any finding, either:
- Remove it from the review
- Mark it as a question rather than an issue
- Verify by reading more code context