with one click
review
// This skill should be used when performing exhaustive code reviews using multi-agent analysis, ultra-thinking, and worktrees.
// This skill should be used when performing exhaustive code reviews using multi-agent analysis, ultra-thinking, and worktrees.
| name | review |
| description | This skill should be used when performing exhaustive code reviews using multi-agent analysis, ultra-thinking, and worktrees. |
<command_purpose> Perform exhaustive code reviews using multi-agent analysis, ultra-thinking, and Git worktrees for deep local inspection. </command_purpose>
Senior Code Review Architect with expertise in security, performance, architecture, and quality assurance
Load project conventions:
# Load project conventions
if [[ -f "CLAUDE.md" ]]; then
cat CLAUDE.md
fi
Read CLAUDE.md if it exists - apply project conventions during review.
<review_target> #$ARGUMENTS </review_target>
First, I need to determine the review target type and set up the code for analysis.<task_list>
skill: git-worktree with branch namegh pr view --json for title, body, files, linked issuesEnsure that the code is ready for analysis (either in worktree or on current branch). ONLY then proceed to the next step.
</task_list>
Before spawning review agents, classify the PR to avoid spawning agents whose expertise is irrelevant to the change.
Run git diff --name-only origin/main...HEAD | head -n 200 to get the list of changed files. Also capture status letters and line counts:
git diff --name-only origin/main...HEAD > .git/review-changed.txt
git diff --name-status origin/main...HEAD > .git/review-status.txt
git diff --numstat origin/main...HEAD > .git/review-numstat.txt
Check for override: scan $ARGUMENTS for "deep review" or "full review". Also run gh pr view --json body,title --jq '.body + " " + .title' and check for the same phrases. If override detected, skip classification and spawn all 8 agents.
Apply the four-class decision tree below in order; first match wins (override always trumps):
If $ARGUMENTS or PR body/title contains "deep review" / "full review":
class = code (full override) → 8 agents
Else if every changed file matches the lockfile glob OR
(lockfile glob + optional knowledge-base/** or *.md edit)
AND zero source-code extensions are present:
class = lockfile-only → 2 agents (git-history-analyzer + security-sentinel)
Else if total_files > 0 AND total_lines > 0 AND
(deleted_files * 100 / total_files) >= 80 AND
(deleted_lines * 100 / total_lines) >= 80 AND
zero source-code extensions are present in the diff:
class = deletion-dominated → 2 agents (git-history-analyzer + security-sentinel)
Else if any changed file has a source-code extension:
class = code → 8 agents
Else:
class = non-code → 4 agents
The "zero source-code extensions" guard on deletion-dominated closes a piggyback class: a 1000-line cleanup PR that adds a 50-line .ts file would otherwise route to 2 agents and bypass pattern-recognition / code-quality / architecture / data-integrity / performance / agent-native review on the new source file. Mirroring lockfile-only's $has_source empty requirement keeps the savings on legitimate orphan-cleanup PRs while routing any deletion-dominated PR with new source code through the full 8-agent path.
Source-code extensions: .ts, .tsx, .js, .jsx, .rb, .py, .go, .rs, .swift, .kt, .java, .c, .cpp, .cs, .php, .sh, .bash, .zsh, .mjs, .cjs — any file containing executable logic. Non-code: .md, .txt, .yml, .yaml, .toml, .json, .css, .html, .njk, .svg, .png, .jpg, .gif, .pen, LICENSE, CHANGELOG*, .github/** workflow files, and plugin/agent/skill definition files (plugins/**/*.md, agents/**/*.md).
Compute the predicates inline (set -uo pipefail — drop the e so legitimately-empty greps don't abort):
total_files=$(wc -l < .git/review-changed.txt)
deleted_files=$(grep -cE '^D' .git/review-status.txt || true)
added_lines=$(awk 'BEGIN{s=0} {if ($1 != "-") s += $1} END{print s}' .git/review-numstat.txt)
deleted_lines=$(awk 'BEGIN{s=0} {if ($2 != "-") s += $2} END{print s}' .git/review-numstat.txt)
total_lines=$((added_lines + deleted_lines))
LOCKFILE_RE='(^|/)(package-lock\.json|bun\.lock|yarn\.lock|Cargo\.lock|go\.sum|Gemfile\.lock|poetry\.lock|uv\.lock)$'
ALLOWED_NONLOCK_RE='^(knowledge-base/|.*\.md$)'
SOURCE_RE='\.(ts|tsx|js|jsx|rb|py|go|rs|swift|kt|java|c|cpp|cs|php|sh|bash|zsh|mjs|cjs)$'
non_lock_files=$(grep -vE "$LOCKFILE_RE" .git/review-changed.txt || true)
non_lock_non_doc=$(printf '%s\n' "$non_lock_files" | grep -vE "$ALLOWED_NONLOCK_RE" | grep -v '^$' || true)
has_source=$(grep -E "$SOURCE_RE" .git/review-changed.txt | head -1 || true)
any_lockfile=$(grep -E "$LOCKFILE_RE" .git/review-changed.txt | head -1 || true)
lockfile-only matches when $non_lock_non_doc is empty AND $any_lockfile is non-empty AND $has_source is empty.deletion-dominated matches when total_files > 0 AND total_lines > 0 AND (deleted_files * 100 / total_files) >= 80 AND (deleted_lines * 100 / total_lines) >= 80 AND $has_source is empty. Bash arithmetic evaluates left-to-right; multiply-first avoids the integer-truncation-to-zero trap. Note: git diff --name-only does not distinguish added/deleted paths, so $has_source may match a path that is itself a deletion — this is intentionally conservative (we want zero source-file activity in either direction) and prevents a piggyback attack where a backdoor .ts file rides along on a bulk-deletion cleanup PR.Announce the classification result before spawning agents.
<parallel_tasks>
If override is detected (deep review / full review), spawn all 8 agents regardless of class:
Else if class is code (any source-code extension and not deletion-dominated/lockfile-only), spawn all 8 agents (existing behavior).
Else if class is non-code (no source files, not lockfile-only or deletion-dominated), spawn 4 agents:
Skipped for non-code PRs: architecture-strategist, performance-oracle, data-integrity-guardian, agent-native-reviewer. These agents analyze source code structure, runtime performance, database integrity, and agent accessibility — none are relevant to documentation, configuration, or CI changes.
Else if class is lockfile-only or deletion-dominated (and override not detected), spawn 2 agents:
Skipped for lockfile-only / deletion-dominated PRs: pattern-recognition-specialist, code-quality-analyst, architecture-strategist, performance-oracle, data-integrity-guardian, agent-native-reviewer. Lockfile diffs and bulk deletions do not contain semantic patterns or quality regressions for the pattern/quality agents to find; architecture/perf/integrity/agent-native agents have no source code to analyze. Use deep review to force full pipeline.
Announce: "Change classified as [code/non-code/deletion-dominated/lockfile-only]. Spawning [N]/8 review agents. [If skipped agents: Skipped: — not relevant to changes. Use 'deep review' to force full pipeline.]"
</parallel_tasks>
Note: The conditional agents block below (agents 9-14: Rails reviewers, migration experts, test-design-reviewer, semgrep) is unaffected by the classification gate. Both gates run independently — the classification controls only the always-on agents above.
<conditional_agents>
These agents are run ONLY when the PR matches specific criteria. Check the PR files list and project structure to determine if they apply:
If project is a Rails app (Gemfile AND config/routes.rb exist at repo root):
When to run Rails review agents:
Gemfile and config/routes.rbWhat these agents check:
kieran-rails-reviewer: Strict Rails conventions, naming clarity, controller complexity, Turbo patternsdhh-rails-reviewer: Rails philosophy adherence, JavaScript framework contamination, unnecessary abstractionIf PR contains database migrations (db/migrate/*.rb files) or data backfills:
When to run migration agents:
db/migrate/*.rbWhat these agents check:
data-migration-expert: Verifies hard-coded mappings match production reality (prevents swapped IDs), checks for orphaned associations, validates dual-write patternsdeployment-verification-agent: Produces executable pre/post-deploy checklists with SQL queries, rollback procedures, and monitoring plansIf PR contains test files:
When to run test review agent:
*_test.rb, *_spec.rbtest_*.py, *_test.py*.test.ts, *.test.js, *.spec.ts, *.spec.js*_test.go*_test.swift, *Tests.swift__tests__/ or spec/ or test/ directoriesWhat this agent checks:
test-design-reviewer: Scores tests against Farley's 8 properties, produces a weighted Test Quality Score with letter grade and top 3 improvement recommendationsIf PR modifies source code files, semgrep-sast is a mandatory gate:
When to run SAST agent:
Bootstrap (mandatory before spawning the agent): Run ensure-semgrep.sh from the repo root. The script checks PATH first, then auto-installs via brew → pipx → pip --user in that order. Exits 0 when semgrep is reachable. Exit 1 means an install was attempted and failed; exit 2 means no install path was available (no brew, pipx, or python3 with pip). On non-zero exit, print the script's stderr to the user and abort the review. Do NOT silently skip — the deterministic SAST pass is what catches CodeQL-equivalent patterns like js/file-system-race before push.
Custom rules file: semgrep-custom-rules.yaml ships alongside the public rule packs and covers CodeQL queries the public packs miss (e.g. the TOCTOU patterns that blocked PR #2463 in CI). The semgrep-sast agent loads it via --config=plugins/soleur/skills/review/references/semgrep-custom-rules.yaml. Extend it whenever a CodeQL finding in CI was not caught locally — the goal is no-surprises on CI.
What this agent checks:
semgrep-sast: Known vulnerability signatures (CWE patterns), hardcoded secrets, insecure function calls, taint analysis. Complements security-sentinel's LLM-based architectural review with deterministic rule-based scanning.If the plan declares Brand-survival threshold as single-user incident:
## User-Brand Impact section mitigates or scope-outs eachWhen to run user-impact-reviewer:
Brand-survival threshold: single-user incident## User-Brand Impact section with that threshold labelWhat this agent checks:
user-impact-reviewer: Enumerates concrete user-facing artifacts exposed by the change (user.email, workspace.name, api_key.token, conversation.id, message.body, billing.amount, oauth.installation_id, etc.) AND a concrete exposure vector per artifact (cross-tenant read, RLS bypass, credential leak in logs, data loss on rollback, double-charge on retry, silent drop on degraded fallback). Rejects generic boilerplate (e.g., "users experience a bug", "error state", TBD/TODO placeholders). Coexists with security-sentinel — security-sentinel handles OWASP/CWE scanning across all PRs; user-impact-reviewer handles user-facing-outcome enumeration when the plan declares the brand-survival threshold as single-user incident.</conditional_agents>
<decision_gate>
After all parallel and conditional agents complete, check their outputs:
This is a binary gate: all-failed triggers the fallback; any-succeeded means continue.
</decision_gate>
<ultrathink_instruction> For each phase below, spend maximum cognitive effort. Think step by step. Consider all angles. Question assumptions. And bring all reviews in a synthesis to the user.</ultrathink_instruction>
Complete system context map with component interactions<thinking_prompt> ULTRA-THINK: Put yourself in each stakeholder's shoes. What matters to them? What are their pain points? </thinking_prompt>
<stakeholder_perspectives>
Developer Perspective
Operations Perspective
End User Perspective
Security Team Perspective
Business Perspective
<thinking_prompt> ULTRA-THINK: Explore edge cases and failure scenarios. What could go wrong? How does the system behave under stress? </thinking_prompt>
<scenario_checklist>
Run the Task code-simplicity-reviewer() to see if we can simplify the code.
When reviewing a PR that changes *.njk, *.md, README, or content under
apps/**, scan every fenced code block tagged bash, sh, shell, or
untagged-but-CLI-shaped. For each <command> <subcommand> pair:
WebFetch or run <tool> --help. If unsure, flag as
cli-verification-unverified and require an explicit annotation or
citation before approving../scripts/*,
plugins/soleur/skills/*/scripts/*), verify the script exists at the
path.<model>:<tag>,
@<version>), fetch the registry or cite the registry URL.Flag any unverified CLI invocation as P1 (docs-trust) — NOT P3 polish. A fabricated CLI command on a high-intent landing page breaks first-touch trust (#1810/#2550).
When a review agent claims that a build-step CI gate (e.g., post-Eleventy
grep -rEn ... _site/, post-Webpack chunk regex, post-tsc output scan)
will fail on rendered output, rebuild the artifact directory BEFORE
running the gate locally. Never run the gate against an existing
_site/, dist/, build/, or .next/ from a prior session — those
predate the source change under review and return false-pass (zero
matches) even when the rendered output post-rebuild contains the flagged
strings.
The verification command order is non-negotiable:
<rebuild command> && <literal CI gate command>
Examples:
npx @11ty/eleventy --quiet && grep -rEn '<regex>' _site/bun run build && grep -rEn '<regex>' .next/If the rebuild step is unfamiliar, read the corresponding .github/workflows/
job to find the exact build command the gate runs against — match it, do
not invent one. A stale-artifact false-pass is the most common dismissal
class for build-output gates (PR #3296 → #3347 hotfix). Treat any agent
finding of the form "rendered/built artifact X contains Y" as a
fresh-build-required claim by default.
<critical_requirement>
Each finding's default action is to FIX IT INLINE on the PR branch: make the edit,
commit with a message review: <summary> (P<N>), and push. Apply to P1, P2, P3
equally.
Filing a GitHub issue instead of fixing is allowed ONLY when the finding meets one of these four scope-out criteria:
apps/web-platform/, plugins/soleur/) as the primary
changed file. Bare multi-file fixes do NOT qualify; the unrelatedness
must be concrete and defensible — count specific files or drop the
scope-out.main before this PR and
is not exacerbated by the PR's changes. (Does NOT block merge.) Only
reachable through the pre-existing branch of the provenance triage in
Step 1 below — never applies to pr-introduced findings. Mirroring an
existing brittle pattern "for symmetry" is exacerbation, not preservation:
if git diff origin/main...HEAD -- <file> | grep '^+' | grep <pattern>
returns ≥1 line, the criterion fails — fix inline. See
knowledge-base/project/learnings/2026-05-04-in-isolation-probe-missed-user-shape-and-scope-out-exacerbation.md.When filing:
## Scope-Out Justification section naming the
specific criterion and a 1-3 sentence rationale.--label deferred-scope-out and --milestone
(per guardrails:require-milestone).review:, Code review #,
Refactor:, arch:, compound:, follow-through:).gh issue create --body-file <path> — never --body "$VAR" — so
untrusted finding text (diffs, agent output) cannot shell-interpolate.Everything else (magic numbers, duplicated helpers, small refactors, missing tests for PR-introduced code, polish, naming, a11y on PR-introduced surfaces, performance issues introduced by the PR) MUST be fixed inline.
Second-reviewer confirmation gate: Before creating a scope-out issue under
any criterion, invoke code-simplicity-reviewer via Task. The prompt MUST
include:
CONCUR (to co-sign the filing) or DISSENT: <one-sentence reason> (to flip to fix-inline). Everything after the first line is
advisory context."If the first line of the agent's reply begins with DISSENT, the disposition
flips to fix-inline — do not file the issue. If the first line is CONCUR,
proceed with filing. Any other first-line content is treated as DISSENT
(fail-safe toward fix-inline).
Write-time self-check: Before invoking gh issue create --label deferred-scope-out, scroll up in the conversation and confirm the most
recent code-simplicity-reviewer Task reply begins with CONCUR for THIS
finding. If no such Task exists in this conversation, or the reply begins
with anything other than CONCUR, STOP — invoke the agent first. Filing
first and co-signing second is a protocol violation even when the agent
eventually returns CONCUR; the gate exists for the DISSENT case, and
filing-first leaves a publicly-visible issue that has to be closed if the
agent dissents. See learning
knowledge-base/project/learnings/best-practices/2026-05-05-extracted-bash-functions-need-self-contained-state.md
Pattern 3.
Rationale: One agent's "scope-out is fine here" can be wrong in the same
way a single test can miss a bug. Requiring a second, simplicity-biased agent
to co-sign blocks the most common regression pattern: an agent-author pair
rationalizing a filing that a fresh pair of eyes would reject. See
knowledge-base/project/learnings/2026-04-15-multi-agent-review-catches-bugs-tests-miss.md.
Filing without scope-out justification will be caught by /ship Phase 5.5 Review- Findings Exit Gate and BLOCK merge. See rule rf-review-finding-default-fix-inline. </critical_requirement>
<synthesis_tasks>
pr-introduced or pre-existing.
A finding is pr-introduced if the code the finding critiques was added
or modified by this PR's diff (verify with git log -L :<function>:<file> origin/main..HEAD or git diff origin/main...HEAD -- <file>). A finding
is pre-existing if the code existed on main before this PR and the
PR neither changed nor moved it. Provenance-ambiguous findings (e.g., a
helper the PR refactored but didn't introduce) default to
pr-introduced — the PR touched it, the PR owns the fix.Disposition by provenance:
pre-existing-unrelated criterion AND a re-evaluation deadline
(a target phase milestone such as Phase 4, or a concrete trigger
condition such as "revisit when syncWorkspace lands in #2244").
Open-ended scope-outs with no deadline are NOT permitted — they become
the backlog this rule exists to drain.The pr-introduced → fix inline rule is the mechanical version of rule
rf-review-finding-default-fix-inline: it removes the judgment loophole ("is
this really cross-cutting?") for findings the PR itself introduced.
</synthesis_tasks>
Coupling note: Ship Phase 1.5, Phase 5.5, and pre-merge hook pre-merge:review-evidence-gate detect review evidence by searching for GitHub issues with the code-review label whose body contains PR #<number>. If the issue body template or label changes, update detection logic in ship/SKILL.md and .claude/hooks/pre-merge-rebase.sh. Phase 5.5 Review-Findings Exit Gate (new in #2374) additionally detects open review-origin issues cross-referencing the PR by body regex (Ref|Closes|Fixes) #<N>\b without deferred-scope-out label; filing without scope-out justification will block merge.
<critical_instruction> Fix inline or, where a scope-out criterion applies, create a deferred-scope-out issue. Do NOT present findings for per-item user approval. </critical_instruction>
Read plugins/soleur/skills/review/references/review-todo-structure.md now for the complete GitHub issue creation flow: label prerequisite, issue body template, --body-file pattern, label/milestone selection, duplicate detection, error handling, and batch strategy.
Pipeline detection (run BEFORE writing the summary): Scan the conversation for skill: soleur:work or skill: soleur:one-shot output. If either is present, you are in pipeline mode — the calling orchestrator owns the lifecycle and is waiting on you to return so it can run step 5 / Phase 4. Emit the compact progress marker below instead of the verbose summary, then return immediately. Do NOT use the heading ## Code Review Complete, do NOT include a ### Next Steps section, and do NOT write a wrap-up sentence — those framings cause one-shot to mistake the summary for a turn boundary and stop mid-pipeline.
Compact progress marker (pipeline mode):
## Review Phase Complete
- **Findings:** N total — N1 P1 / N2 P2 / N3 P3
- **Fixed inline:** N (commits: <sha>, <sha>, …)
- **Filed as scope-out:** N (#NNN, #NNN — criteria listed below)
- **Agents run:** <comma-separated list>
[Optional 1-line table of scope-out issues with criteria, if any.]
After emitting the marker, the calling skill's continuation gate takes over — control returns to one-shot step 5 / work Phase 4 in the SAME response.
Direct invocation summary (interactive mode only — no soleur:work or soleur:one-shot in conversation): Use the verbose summary template below.
## Code Review Complete
**Review Target:** PR #XXXX - [PR Title] **Branch:** [branch-name]
### Findings Summary
- **Total Findings:** [X]
- **P1 CRITICAL:** [count] - BLOCKS MERGE
- **P2 IMPORTANT:** [count] - Should Fix
- **P3 NICE-TO-HAVE:** [count] - Enhancements
- **By provenance:** [pr-introduced count] pr-introduced, [pre-existing count] pre-existing
- **Pre-existing disposition:** [fix-inline count] fixed, [scope-out count] scoped-out, [wontfix count] wontfix
### Fixed Inline
**P1 - Critical (BLOCKS MERGE):**
- {description} — commit {sha}
- {description} — commit {sha}
**P2 - Important:**
- {description} — commit {sha}
**P3 - Nice-to-Have:**
- {description} — commit {sha}
### Filed as Deferred Scope-Out
**Scope-out criterion required per finding (cross-cutting-refactor | contested-design | architectural-pivot | pre-existing-unrelated):**
- #NNN - review: {description} — criterion: {name} — rationale: {1-3 sentences}
- #NNN - review: {description} — criterion: {name} — rationale: {1-3 sentences}
**Failed (if any):**
- {description} - Error: {error message}
### Review Agents Used
- security-sentinel
- performance-oracle
- architecture-strategist
- agent-native-reviewer
- [other agents]
### Next Steps
1. **Verify inline fixes landed**: Each finding above should have a commit on the PR branch.
```bash
git log --oneline origin/main..HEAD | grep '^[a-f0-9]* review:'
```
2. **Inspect any scope-out issues**: Review findings filed as `deferred-scope-out` with justification.
```bash
gh issue list --label deferred-scope-out --search "Ref #<PR_NUMBER>"
```
3. **Phase 5.5 gate self-check**: `/ship` will run the Review-Findings Exit Gate and block merge on any open review-origin issue cross-referencing the PR without the `deferred-scope-out` label. If the gate blocks, either fix inline and close the issue, or add the `deferred-scope-out` label + `## Scope-Out Justification`.
P1 (Critical - Blocks Merge):
P2 (Important - Should Fix):
P3 (Nice-to-Have):
Pipeline detection: If the conversation contains skill: soleur:work output earlier (indicating review was invoked by work's Phase 4 chain) or soleur:one-shot output (indicating review was invoked by one-shot step 4), skip the exit gate. The calling pipeline handles compound, commit, and lifecycle progression. When review is invoked by work or one-shot, do not duplicate these steps and do not output the verbose ## Code Review Complete block from Step 3 — the compact ## Review Phase Complete marker (Step 3, pipeline mode) is the only output and the orchestrator's continuation gate handles progression. The verbose summary's ### Next Steps block is the failure mode that causes orchestrators to mistake the report for a turn-ending deliverable.
If invoked directly by the user (no work or one-shot orchestrator in the conversation):
Run skill: soleur:compound to capture learnings from the review session.
If compound finds nothing to capture, it will skip gracefully — do not block on this.
Commit any local artifacts. GitHub issues are already created remotely,
but local files may have been modified (plan updates, todo resolutions).
Run git status --short. If there are changes:
git add <changed files>
git commit -m "docs: review artifacts for feat-<name>"
git push
If there are no local changes, skip the commit (this is the expected case — review's primary output is GitHub issues, which are remote-only). If push fails (no network), warn and continue.
Display: "Review complete. All findings are tracked as GitHub issues.
Run /clear then /soleur:work or /soleur:ship for maximum context headroom."
Read plugins/soleur/skills/review/references/review-e2e-testing.md now for project type detection, testing offers (Web/iOS/Hybrid), and subagent procedures for browser and Xcode testing.
Multi-agent parallel review has been shown to catch bugs in shipped, green-CI code across these classes (each a real P1 caught on PR #2347):
let bindings captured by a once-built object that multiple components import. Pattern-recognition and code-quality agents spot the closure capture in seconds; unit tests rarely co-mount instances..is("archived_at", null)) no longer matches the index's WHERE clause. Data-integrity-guardian reads both files and compares WHERE clauses symbolically; the bug stays silent until a user archives a row.leaderId: "system" reusing an internal taxonomy value whose UI semantics collide with router output; a registry.reap() method with no scheduler outside tests (tsc is silent on "never called in prod"); a nullable callback parameter the caller contract forbids but the implementer maps to a value that breaks invariants. Review prompts must enumerate the downstream consumer / scheduler / invariant explicitly for agents to reach it. See knowledge-base/project/learnings/best-practices/2026-04-24-multi-agent-review-catches-feature-wiring-bugs.md.user-impact-reviewer's "name artifact + name vector" mandate reliably surfaces this where simplicity-biased peer review at plan time does not. PR #3067 added D5 (commenter-author-pin + immutability-pin) after the 11-agent review caught the gap that 3-reviewer plan-time review missed. See knowledge-base/project/learnings/2026-05-03-user-impact-reviewer-catches-runtime-content-tamper-vectors.md."plugin:name" vs bare "name", dotted IDs vs slashed IDs, hashed keys vs raw keys). Review agents and unit tests both miss this because each side's tests look internally consistent. The defect surfaces only via a derived-metric counter (orphan rate, miss rate, fall-through rate) whose surprising value points back at the contract. PR #3124 surfaced a soleur:plan (hook) vs plan (inventory) mismatch only after the orphan-skill counter — added as polish — reported a non-zero count in production data. See knowledge-base/project/learnings/2026-05-04-telemetry-join-format-mismatch-caught-by-orphan-counter.md. Reviewer takeaway: when a PR adds a join across two streams, ask whether at least one fixture per side uses each producer's actual emission format, not a normalized placeholder.See knowledge-base/project/learnings/2026-04-15-multi-agent-review-catches-bugs-tests-miss.md for the full pattern catalogue.
Review agent suggestions that modify workflow if conditions or event filters must be smoke tested against the full user journey (not just the reduced trigger case) before shipping -- agents optimize locally and can break flows they don't fully model.
When a reviewer prescribes --arg for jq injection defense in a gh ... --jq context, verify the CLI forwards jq flags before implementing. gh --jq accepts a single expression string and does NOT forward --arg, --argjson, or --slurp to the underlying jq binary — applying the fix produces unknown arguments at runtime. Fall back to shape-validating the shell variable (e.g., [[ "$VAR" =~ ^[0-9]+$ ]]) before interpolation, or pipe to a second-stage standalone jq --arg. See knowledge-base/project/learnings/2026-04-15-gh-jq-does-not-forward-arg-to-jq.md.
Generalizing the rule above: whenever a review agent prescribes a CLI flag or subcommand as a fix (e.g., gh issue create --json number, gh issue close --body-file, <tool> <subcommand> --<flag>), verify the flag exists on that exact subcommand via <tool> <subcommand> --help BEFORE applying. Agents hallucinate flags by generalizing from sibling subcommands (gh issue list has --json, gh issue create does not). Cost of verification: one --help call. Cost of applying a non-existent flag: revert + rework + commit pollution. If the prescribed flag is absent, fall back to a verified pattern (split into two verified commands, parse output with awk -F/, etc.) and note the substitution in the disposition table. See knowledge-base/project/learnings/best-practices/2026-04-19-verify-reviewer-prescribed-cli-flags-before-applying.md.
Parallel review batches can stall silently — spawning 12 review agents at once has been observed to produce completion notifications for only 6, with the remaining agents' transcripts frozen ~15s after spawn and no completion event emitted. When more than 30% of spawned agents stop producing output for >2 minutes after launch, proactively announce "N of M agents stalled" rather than silently waiting. Proceed with synthesis from the agents that returned — the Rate Limit Fallback gate already permits partial coverage. See knowledge-base/project/learnings/2026-04-17-postgrest-aggregate-disabled-forces-rpc-option.md.
Before reporting a broken link or missing file, reviewer agents MUST verify via Glob or Read. Unverified "broken link" claims waste reviewer-response cycles — the file may exist at the exact path. Why: PR #2226 pattern-recognition-specialist false-positive on a runtime-errors/2026-02-13-... learning file that did exist.
When reviewing a Nunjucks/Eleventy page that pairs a visible HTML answer with a FAPage/FAQPage JSON-LD acceptedAnswer.text, compare the two surfaces character-for-character per Question. Google's FAQ rich-result parity check compares codepoints — flag (a) {{ ... }} interpolation in HTML paired with a hardcoded value in JSON-LD, and (b) HTML entities (’, &, etc.) in one surface and ASCII or \uXXXX in the other. See knowledge-base/project/learnings/2026-04-18-faq-html-jsonld-parity.md.
When flagging a skill description word-budget overrun, the tokenizer MUST match the CI gate. plugins/soleur/test/components.test.ts uses desc.split(/\s+/).filter(Boolean).length against the YAML value only (1800-word skill budget); the grep -h 'description:' | wc -w pattern in AGENTS.md belongs to the agent 2500-word budget and includes YAML framing, inflating counts by ~5 words per skill. Run bun test plugins/soleur/test/components.test.ts before reporting — if it passes, the budget is satisfied. See knowledge-base/project/learnings/2026-04-19-skill-description-word-budget-tokenizer.md.
When a review agent reports branch-scope regressions (claims the PR reverts merged commits, touches files outside the PR's linked issue/directory, or shows a file list materially larger than expected), verify with git diff origin/main...HEAD --name-only (three-dot) before accepting. Two-dot variants like git diff main..HEAD show commits on main since the fork point (NOT commits on HEAD) and produce wildly different file lists when the branch is behind main — a common agent failure mode that surfaces as a false-positive P0. See knowledge-base/project/learnings/2026-04-22-markdown-table-parser-papercuts-and-review-diff-direction.md.
When a review agent recommends ADDING a field, header, or schema element to a security-relevant surface (wire schema, redaction filter, log scrubber, error envelope), grep the diff scope for // See #N provenance comments referencing prior REMOVALS of the same artifact BEFORE applying the fix. A Pn rating reflects local severity; it does not auto-override deliberate cross-cutting decisions encoded in code comments. If a prior PR removed the field as a security/privacy mitigation, flip disposition to contested-design scope-out with the prior issue # named in the filing — code-simplicity-reviewer reliably co-signs when the threat-model context is surfaced. See knowledge-base/project/learnings/2026-05-05-agent-native-recommendation-vs-prior-security-removal.md.
ADRs documenting an already-chosen-and-shipping architecture fail architectural-pivot — the criterion requires the fix itself to change a cross-codebase pattern, and an ADR for the path you're already shipping is documentation work, not pattern-changing work. Inline-absorb ADRs of this shape (~1 markdown file under knowledge-base/engineering/architecture/decisions/) rather than scoping them out. Symmetric rule: when code-simplicity-reviewer DISSENTs by naming a different criterion that fits, re-file under that criterion (fresh concur cycle) rather than absorbing inline — the dissent is on the label, not on the underlying deferral. See knowledge-base/project/learnings/2026-05-06-scope-out-criterion-misclassification-adr-not-architectural-pivot.md.
When a reviewer prescribes ADDING a defensive wrapper (try/catch around an SDK call, a typeof guard, a validation step, a retry envelope) citing a single in-tree precedent, grep the same file/module for ≥3 sibling unwrapped invocations of the same primitive BEFORE applying. If precedent is consistent and the new code mirrors it, the wrapper recommendation is precedent-contradicting — reject with a one-line disposition citing the unwrapped sites. The cited precedent may be helper-internal (boot-path safety) and not generalize to call-site code. Cost of verification: one grep. Cost of applying a precedent-contradicting wrapper: a commit that future reviewers will roll back when they apply the same heuristic. See knowledge-base/project/learnings/2026-05-05-phase-1-instrumentation-when-prior-fix-visibly-missed.md (#3287 review's false-positive P1 on a Sentry.addBreadcrumb call that mirrored 5 in-file precedents).
Any P1 (CRITICAL) findings must be addressed before merging the PR. Present these prominently and ensure they're resolved before accepting the PR.
[HINT] Download the complete skill directory including SKILL.md and all related files