multi-review
Multi-model code review. Runs code-review skill with 2 models in parallel, then synthesizes findings.
| Field | Value |
|-------|-------|
| name | multi-review |
| description | Multi-model code review. Runs code-review skill with 2 models in parallel, then synthesizes findings. |
Runs the code-review skill with 2 different models in parallel, then synthesizes with active validation.
Create a unique temp dir + get the PR diff (same as code-review)
# Unique temp dir for this run
TMP_DIR="$(mktemp -d -t multi-review.XXXXXX)"
PR_DIFF="$TMP_DIR/pr-diff.txt"
# If PR number provided, use it. Otherwise current branch.
gh pr diff [PR_NUMBER] > "$PR_DIFF"
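If the diff comes back empty (wrong branch, no open PR), both reviewers would run against nothing. A minimal sketch of the step above with a fail-fast check follows. `fetch_pr_diff` is a hypothetical helper name, not part of the skill; it assumes `gh` is installed and authenticated.

```shell
# Hypothetical helper (not part of the skill): fetch the PR diff into a
# fresh temp dir and fail fast if gh errors out or the diff is empty.
# Usage: fetch_pr_diff [PR_NUMBER]   (no argument = PR for the current branch)
fetch_pr_diff() {
  local tmp_dir pr_diff
  tmp_dir="$(mktemp -d -t multi-review.XXXXXX)" || return 1
  pr_diff="$tmp_dir/pr-diff.txt"

  # "$@" expands to nothing when no PR number is given,
  # so gh falls back to the PR for the current branch.
  if ! gh pr diff "$@" > "$pr_diff"; then
    echo "gh pr diff failed" >&2
    return 1
  fi

  if [ ! -s "$pr_diff" ]; then
    echo "PR diff is empty: $pr_diff" >&2
    return 1
  fi

  # Print the temp dir so the caller can capture it.
  printf '%s\n' "$tmp_dir"
}
```

A caller could then write `TMP_DIR="$(fetch_pr_diff)" || exit 1` and set `PR_DIFF="$TMP_DIR/pr-diff.txt"` as before.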
Run 2 parallel reviews via bash
# Claude review MUST use the `claude` CLI (not `pi -p --model ...`) so it uses Claude auth.
claude -p --model opus --permission-mode bypassPermissions \
"Read and follow /Users/pat/Work/pi-skills/skills/code-review/SKILL.md to review the PR. Diff is at $PR_DIFF" \
> "$TMP_DIR/review-opus.md" &
# Codex review continues to use pi.
pi -p --model gpt-5.5 --provider openai-codex \
"Read and follow /Users/pat/Work/pi-skills/skills/code-review/SKILL.md to review the PR. Diff is at $PR_DIFF" \
> "$TMP_DIR/review-codex.md" &
wait
If the Claude review fails because of auth or local lock contention, fix auth if needed (claude auth), then rerun only the Claude command.
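A bare `wait` returns once both background jobs finish but does not say which one failed. One possible way to know which command to rerun is to record each PID with `$!` and wait on them individually. `wait_for_review` below is a hypothetical helper name, and treating an empty output file as a failure is an assumption, not something the skill specifies.

```shell
# Hypothetical helper (not part of the skill): wait for one background
# reviewer and report whether it exited cleanly and wrote any output.
# Usage: wait_for_review <label> <pid> <output-file>
# Call it directly, not inside $(...): wait only works on children
# of the current shell.
wait_for_review() {
  local label="$1" pid="$2" out="$3"

  if ! wait "$pid"; then
    echo "$label review failed (non-zero exit)" >&2
    return 1
  fi

  # Assumption: an empty review file counts as a failed run.
  if [ ! -s "$out" ]; then
    echo "$label review produced no output: $out" >&2
    return 1
  fi

  echo "$label review ok: $out"
}

# Example wiring (reviewer commands elided; see the invocations above):
#   claude ... > "$TMP_DIR/review-opus.md" &
#   opus_pid=$!
#   pi ... > "$TMP_DIR/review-codex.md" &
#   codex_pid=$!
#   wait_for_review "Claude" "$opus_pid" "$TMP_DIR/review-opus.md"
#   wait_for_review "Codex" "$codex_pid" "$TMP_DIR/review-codex.md"
```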
Do not blindly trust the reviewers. Validate each finding yourself.
**Read PR context first.** Before looking at the sub-agent reviews, get the full picture:
# What the PR claims to do
gh pr view [PR_NUMBER] --json title,body
# What it actually does
cat "$PR_DIFF"
# What others have already said
gh pr view [PR_NUMBER] --json comments,reviews --jq '.comments[].body, .reviews[].body'
Form your own impressions. Note any issues already flagged in PR feedback.
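The two `gh pr view` calls above can also be saved next to the diff, so the PR's claims and existing feedback stay available while validating findings. This is a sketch only: `save_pr_context` and the two output file names are hypothetical, while the `--json` and `--jq` arguments are the ones used above.

```shell
# Hypothetical helper (not part of the skill): save the PR description
# and existing feedback into the run's temp dir.
# Usage: save_pr_context <out-dir> [PR_NUMBER]
save_pr_context() {
  local out_dir="$1"
  shift

  # What the PR claims to do.
  gh pr view "$@" --json title,body > "$out_dir/pr-claims.json" || return 1

  # What others have already said.
  gh pr view "$@" --json comments,reviews \
    --jq '.comments[].body, .reviews[].body' > "$out_dir/pr-feedback.txt" || return 1

  echo "claims:   $out_dir/pr-claims.json"
  echo "feedback: $out_dir/pr-feedback.txt"
}
```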
**Collect all findings.** Build a deduplicated list of every issue from both reviews. Note which model(s) found each issue.
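**Validate EACH finding.** For every finding, look at the actual code and verify it, then record the result as Confirmed, Needs verification, or False positive.
**Score by IMPACT, not consensus.** Rate each validated issue by its actual severity (Critical, High, Medium, or Low).
Consensus count (both models) ≠ importance.
**Flag unique findings for extra scrutiny.** When only one model found something, check it especially carefully: it may be a false positive, or a real issue the other model missed.
**Check for gaps.** What might BOTH models have missed? Compare the diff against the claims in the PR description and the existing PR feedback.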
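The "Collect all findings" step above can be given a mechanical first pass before the manual validation. The sketch below assumes each review's findings have already been extracted into a tab-separated file with one `<path>:<line><TAB><summary>` entry per line. That format and the `merge_findings` name are assumptions for illustration; the reviewers themselves write free-form Markdown. Keying on location alone is crude, so the merged list still needs to be validated by hand.

```shell
# Hypothetical helper (not part of the skill): merge two findings files
# and note which model(s) reported each location.
# Assumed input format, one finding per line: <path>:<line><TAB><summary>
# Usage: merge_findings <opus-findings.tsv> <codex-findings.tsv>
merge_findings() {
  awk -F'\t' -v first="$1" '
    NF < 2 { next }                                   # skip malformed lines
    { src = (FILENAME == first) ? "Opus" : "Codex" }
    !($1 in found_by) {                               # first time this location is seen
      order[++n] = $1
      summary[$1] = $2
      found_by[$1] = src
      next
    }
    index(found_by[$1], src) == 0 {                   # same location, other model
      found_by[$1] = found_by[$1] "+" src
    }
    END {
      for (i = 1; i <= n; i++) {
        k = order[i]
        printf "%s\t%s\t%s\n", k, found_by[k], summary[k]
      }
    }
  ' "$1" "$2"
}
```

Each output line has the form `<path>:<line><TAB><models><TAB><summary>`, keeping the summary from whichever file reported the location first.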
# 🔍 Multi-Model PR Review: [PR title]
## Validated Issues
### 🔴 Critical
[Issues that must be fixed - functionality broken, security, etc.]
### 🟠 High Priority
[Real bugs, incorrect behavior - should fix before merge]
### 🟡 Medium Priority
[Performance, maintainability, edge cases - should discuss]
### 🟢 Low Priority
[Style, minor improvements - nice to have]
Each issue should include:
- **File**: path/to/file.ext#L10-L15
- **Status**: ✅ Confirmed | ⚠️ Needs verification | ❌ False positive
- **Found by**: Opus / Codex / PR feedback
- **Description**: What's wrong and why it matters
- **Suggestion**: How to fix (if applicable)
## ❌ False Positives Filtered
[List any findings that were wrong, with brief explanation]
## ⚠️ Potential Gaps
[Things all models may have missed - especially check PR description claims]
## 📊 Model Coverage
| Issue | Opus | Codex | PR | Status |
|-------|:----:|:-----:|:--:|--------|
| Issue 1 | ✅ | ✅ | - | ✅ Confirmed |
| Issue 2 | ❌ | ✅ | - | ✅ Confirmed |
| Issue 3 | ✅ | ❌ | - | ❌ False positive |
| Issue 4 | ❌ | ❌ | ✅ | ⚠️ Models missed! |
## Final Verdict
**[MERGE / FIX FIRST / NEEDS DISCUSSION]**
[Brief explanation of verdict]
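The template's severity descriptions suggest a default mapping from confirmed issues to a verdict. The sketch below encodes one such mapping. Both the mapping and the `pick_verdict` name are assumptions, intended only as a starting point that the reviewer can override.

```shell
# Hypothetical helper (not part of the skill): suggest a default verdict
# from the number of CONFIRMED issues at each severity.
# Assumed mapping: any Critical or High -> FIX FIRST,
# otherwise any Medium -> NEEDS DISCUSSION, otherwise MERGE.
# Usage: pick_verdict <critical-count> <high-count> <medium-count>
pick_verdict() {
  local critical="$1" high="$2" medium="$3"

  if [ "$critical" -gt 0 ] || [ "$high" -gt 0 ]; then
    echo "FIX FIRST"
  elif [ "$medium" -gt 0 ]; then
    echo "NEEDS DISCUSSION"
  else
    echo "MERGE"
  fi
}
```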