一键导入
review-pr
Multi-agent correctness review of a pull request.
用 Codex 或 Claude 帮你安装 复制这段 Prompt,粘贴到 Codex、Claude 或其他助手里,让它检查 Skill 页面并帮你完成安装。
菜单
Multi-agent correctness review of a pull request.
用 Codex 或 Claude 帮你安装 复制这段 Prompt,粘贴到 Codex、Claude 或其他助手里,让它检查 Skill 页面并帮你完成安装。
基于 SOC 职业分类
| name | review-pr |
| description | Multi-agent correctness review of a pull request. |
| allowed-tools | Bash(gh issue view:*), Bash(gh search:*), Bash(gh issue list:*), Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*), Bash(gh pr list:*), Bash(uv run infra/codehealth/log_stats.py:*), mcp__github_inline_comment__create_inline_comment |
Provide a code review for the given pull request.
Agent assumptions (applies to all agents and subagents):
Follow these steps precisely:
Launch a haiku agent to check if any of the following are true:
gh pr view <PR> --comments) AND a re-review was not explicitly requested. When a maintainer explicitly requests a re-review, always proceed even if a prior review exists.If any condition is true, stop. Note: still review Claude-generated PRs.
Launch a haiku agent to return file paths (not contents) for all relevant CLAUDE.md and AGENTS.md files:
Launch a sonnet agent to view the PR and return a summary of the changes.
Launch 4 agents in parallel to independently review the changes. Each returns a list of issues; each issue includes a description and the reason it was flagged (e.g. "CLAUDE.md adherence", "bug").
Agents 1 + 2: CLAUDE.md/AGENTS.md compliance sonnet agents. Audit changes for compliance. When evaluating a file, only consider CLAUDE.md/AGENTS.md files that share its path or are parents. If the PR adds or changes tests, read root TESTING.md plus the relevant module-specific testing docs, and check for low-value/slop tests or local testing-policy violations.
Agents 3 + 4: Opus bug agents (parallel). Scan for obvious bugs, security issues, and incorrect logic within the changed code. Focus only on the diff without reading extra context. Flag only significant bugs you can validate from the diff alone; ignore nitpicks and likely false positives.
CRITICAL: We only want HIGH SIGNAL issues. Flag issues where:
Do NOT flag:
If you are not certain an issue is real, do not flag it. False positives erode trust.
Tell each subagent the PR title and description for author-intent context.
Marin-specific: In experiments/grug, duplication is often intentional for high-velocity research iteration. Do not flag copy/paste or DRY concerns if behavior/contracts are correct.
For each issue from agents 3 and 4, launch a parallel subagent to validate it. Give the subagent the PR title, description, and issue description. It must confirm with high confidence that the issue is real — e.g. for "variable is not defined", verify that in the code; for a CLAUDE.md issue, verify the rule is scoped to this file and actually violated. Use Opus subagents for bugs/logic, sonnet for CLAUDE.md violations.
Filter out any issues not validated in step 5. The remainder is the high-signal review list.
Emit a stats event for this review (best-effort — never retry, never
surface failures to the user). This step runs unconditionally, before
any of the early-stop branches below, so we capture no-finding runs and
non---comment runs in the dashboard. Run from the repo root:
cat <<'EOF' | uv run infra/codehealth/log_stats.py
{
"tool": "review-pr",
"invocation": {
"trigger": "local",
"agent_cli": "claude",
"pr_number": <PR>,
"agent_exit_code": 0,
"timed_out": false
},
"findings": [
["<file>", <line>, "<category>", 1.0, "<first 200 chars of issue description>"]
]
}
EOF
<category> is one of: bug, claude-md-adherence.findings row per validated issue. Pass "findings": [] if there
were none — the empty row in the invocations table is the
"tool ran with no signal" datapoint we want. finding_count is derived
from the findings array length by log_stats.py.Output a summary of the review findings to the terminal:
If --comment argument was NOT provided, stop here. Do not post any GitHub comments.
If --comment argument IS provided and NO issues were found, post a summary comment using gh pr comment and stop.
If --comment argument IS provided and issues were found, continue to step 9.
Draft the list of comments you plan to leave. For your own review only — do not post it anywhere.
Post inline comments for each issue using mcp__github_inline_comment__create_inline_comment with confirmed: true. For each comment:
IMPORTANT: Only post ONE comment per unique issue. Do not post duplicate comments.
Use this list when evaluating issues in Steps 4 and 5 (these are false positives, do NOT flag):
Notes:
TESTING.md and the relevant module-specific AGENTS.md/TESTING.md/testing docs as the review checklist. Flag only concrete violations; do not use them to request broad coverage improvements.--comment argument is provided, post a comment with the following format:No issues found. Checked for bugs and CLAUDE.md compliance.
https://github.com/owner/repo/blob/$(git rev-parse HEAD)/foo/bar will not work, since your comment will be directly rendered in Markdown.L4-7)Lint, run the pre-PR checks, commit, push, and author or update the branch's pull request in the required plain-text format. Use when committing, pushing, or creating/updating a PR.
Modify or upstream a Grug/Grugformer experiment variant.
Run a perf gate on a PR that touches lib/zephyr internals.
Curate the experiment report index at docs/reports/index.md.
Triage a failed canary ferry run (CI-invoked).
Refresh Marin TPU-vLLM forks from a tpu-inference release/LKG pair, update exact SHA pins, run TPU smokes, and open the Marin PR.