com um clique
com um clique
Monitor an Iris job and recover it on failure. Use when asked to babysit or watch a job or run.
Author or update a Marin PR in the required plain-text format. Use when creating or updating a PR.
De-rot markdown docs in lib/iris, lib/zephyr, and lib/fray.
Multi-session research workflow: logbooks, experiment issues, and W&B.
Scheduled scrub: docs and code parity.
Scheduled scrub: TL;DR blocks on experiment issues.
| name | review-pr |
| description | Multi-agent correctness review of a pull request. |
| allowed-tools | Bash(gh issue view:*), Bash(gh search:*), Bash(gh issue list:*), Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*), Bash(gh pr list:*), Bash(uv run infra/codehealth/log_stats.py:*), mcp__github_inline_comment__create_inline_comment |
Provide a code review for the given pull request.
Agent assumptions (applies to all agents and subagents):
Follow these steps precisely:
Launch a haiku agent to check if any of the following are true:
gh pr view <PR> --comments) AND the review was NOT explicitly requested via comment (e.g. "claude review this"). When a maintainer explicitly requests a re-review, always proceed even if a prior review exists.If any condition is true, stop. Note: still review Claude-generated PRs.
Launch a haiku agent to return file paths (not contents) for all relevant CLAUDE.md and AGENTS.md files:
Launch a sonnet agent to view the PR and return a summary of the changes.
Launch 4 agents in parallel to independently review the changes. Each returns a list of issues; each issue includes a description and the reason it was flagged (e.g. "CLAUDE.md adherence", "bug").
Agents 1 + 2: CLAUDE.md/AGENTS.md compliance sonnet agents. Audit changes for compliance. When evaluating a file, only consider CLAUDE.md/AGENTS.md files that share its path or are parents.
Agents 3 + 4: Opus bug agents (parallel). Scan for obvious bugs, security issues, and incorrect logic within the changed code. Focus only on the diff without reading extra context. Flag only significant bugs you can validate from the diff alone; ignore nitpicks and likely false positives.
CRITICAL: We only want HIGH SIGNAL issues. Flag issues where:
Do NOT flag:
If you are not certain an issue is real, do not flag it. False positives erode trust.
Tell each subagent the PR title and description for author-intent context.
Marin-specific: In experiments/grug, duplication is often intentional for high-velocity research iteration. Do not flag copy/paste or DRY concerns if behavior/contracts are correct.
For each issue from agents 3 and 4, launch a parallel subagent to validate it. Give the subagent the PR title, description, and issue description. It must confirm with high confidence that the issue is real — e.g. for "variable is not defined", verify that in the code; for a CLAUDE.md issue, verify the rule is scoped to this file and actually violated. Use Opus subagents for bugs/logic, sonnet for CLAUDE.md violations.
Filter out any issues not validated in step 5. The remainder is the high-signal review list.
Emit a stats event for this review (best-effort — never retry, never
surface failures to the user). This step runs unconditionally, before
any of the early-stop branches below, so we capture no-finding runs and
non---comment runs in the dashboard. Run from the repo root:
cat <<'EOF' | uv run infra/codehealth/log_stats.py
{
"tool": "review-pr",
"invocation": {
"trigger": "local",
"agent_cli": "claude",
"pr_number": <PR>,
"agent_exit_code": 0,
"timed_out": false
},
"findings": [
["<file>", <line>, "<category>", 1.0, "<first 200 chars of issue description>"]
]
}
EOF
<category> is one of: bug, claude-md-adherence.findings row per validated issue. Pass "findings": [] if there
were none — the empty row in the invocations table is the
"tool ran with no signal" datapoint we want. finding_count is derived
from the findings array length by log_stats.py.Output a summary of the review findings to the terminal:
If --comment argument was NOT provided, stop here. Do not post any GitHub comments.
If --comment argument IS provided and NO issues were found, post a summary comment using gh pr comment and stop.
If --comment argument IS provided and issues were found, continue to step 9.
Draft the list of comments you plan to leave. For your own review only — do not post it anywhere.
Post inline comments for each issue using mcp__github_inline_comment__create_inline_comment with confirmed: true. For each comment:
IMPORTANT: Only post ONE comment per unique issue. Do not post duplicate comments.
Use this list when evaluating issues in Steps 4 and 5 (these are false positives, do NOT flag):
Notes:
--comment argument is provided, post a comment with the following format:No issues found. Checked for bugs and CLAUDE.md compliance.
https://github.com/owner/repo/blob/$(git rev-parse HEAD)/foo/bar will not work, since your comment will be directly rendered in Markdown.L4-7)