multi-review

Stars: 16

Forks: 2

Updated: May 8, 2026 17:42

Multi-model code review. Runs code-review skill with 2 models in parallel, then synthesizes findings.

Source: PSPDFKit-labs/pi-skills

Multi Review

Runs the code-review skill with 2 different models in parallel, then synthesizes with active validation.

Process

Phase 1: Gather Reviews

Create a unique temp dir + get the PR diff (same as code-review)

# Unique temp dir for this run
TMP_DIR="$(mktemp -d -t multi-review.XXXXXX)"
PR_DIFF="$TMP_DIR/pr-diff.txt"

# If PR number provided, use it. Otherwise current branch.
gh pr diff [PR_NUMBER] > "$PR_DIFF"
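
The "PR number if provided, otherwise current branch" rule can be sketched as a small helper. This is a minimal sketch, not part of the skill itself: `fetch_pr_diff` is a hypothetical name, and it assumes `gh pr diff` with no argument selects the PR for the current branch. It also fails early on an empty diff, since the reviewers would have nothing to read.

```shell
# Hypothetical helper: write the PR diff to a file.
# $1 = PR number (may be empty), $2 = output file.
fetch_pr_diff() {
  pr_number="${1:-}"
  out_file="$2"
  if [ -n "$pr_number" ]; then
    gh pr diff "$pr_number" > "$out_file" || return 1
  else
    # No number given: gh falls back to the PR for the current branch.
    gh pr diff > "$out_file" || return 1
  fi
  # Treat an empty diff as a failure so the reviews are not started on nothing.
  [ -s "$out_file" ] || { echo "empty diff: $out_file" >&2; return 1; }
}
```

Usage would look like `fetch_pr_diff "" "$PR_DIFF"` for the current branch, or the same call with a PR number as the first argument.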

Run 2 parallel reviews via bash

# Claude review MUST use the `claude` CLI (not `pi -p --model ...`) so it uses Claude auth.
claude -p --model opus --permission-mode bypassPermissions \
  "Read and follow /Users/pat/Work/pi-skills/skills/code-review/SKILL.md to review the PR. Diff is at $PR_DIFF" \
  > "$TMP_DIR/review-opus.md" &

# Codex review continues to use pi.
pi -p --model gpt-5.5 --provider openai-codex \
  "Read and follow /Users/pat/Work/pi-skills/skills/code-review/SKILL.md to review the PR. Diff is at $PR_DIFF" \
  > "$TMP_DIR/review-codex.md" &

wait

If the Claude review fails because of auth or local lock contention, fix auth if needed (claude auth), then rerun only the Claude command.
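
A bare `wait` returns success even when a background job failed, so it cannot tell you which review to rerun. One way to find out is to wait on each PID separately. The following is a sketch under assumptions: `run_pair` is a hypothetical helper, and the two commands are passed as strings, so the real `claude` and `pi` invocations would need their quoting adapted.

```shell
# Hypothetical helper: run two commands in parallel, each writing to its own
# output file, and report which one (if any) failed.
# $1/$3 = command strings, $2/$4 = output files.
run_pair() {
  cmd_a="$1"; out_a="$2"
  cmd_b="$3"; out_b="$4"

  sh -c "$cmd_a" > "$out_a" 2> "$out_a.err" &
  pid_a=$!
  sh -c "$cmd_b" > "$out_b" 2> "$out_b.err" &
  pid_b=$!

  status=0
  # `wait PID` returns that job's exit status, unlike `wait` with no arguments.
  wait "$pid_a" || { echo "first command failed; see $out_a.err" >&2; status=1; }
  wait "$pid_b" || { echo "second command failed; see $out_b.err" >&2; status=1; }
  return "$status"
}
```

If only one command fails, the other output file is still usable, which matches the advice to rerun only the failed review.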

Phase 2: Active Validation (IMPORTANT)

Do not blindly trust the reviewers. Validate each finding yourself.

Read PR context first. Before looking at sub-agent reviews, get the full picture:

# What the PR claims to do
gh pr view [PR_NUMBER] --json title,body

# What it actually does
cat "$PR_DIFF"

# What others have already said
gh pr view [PR_NUMBER] --json comments,reviews --jq '.comments[].body, .reviews[].body'

Form your own impressions. Note any issues already flagged in PR feedback.

Collect all findings. Build a deduplicated list of every issue from both reviews. Note which model(s) found each issue.

Validate EACH finding. For every finding, actually look at the code and verify:
- Is this a real bug/issue? (check the code, don't just trust the claim)
- Is it a false positive? (model hallucinated or misunderstood)
- What file/line is affected? (verify it exists and matches)

Score by IMPACT, not consensus. Rate each validated issue by actual severity:
- 🔴 Critical: Breaks functionality, security issue, data loss
- 🟠 High: Real bugs, incorrect behavior, major guideline violations
- 🟡 Medium: Performance, maintainability, edge cases
- 🟢 Low: Style, minor improvements, nitpicks

Consensus count (both models) ≠ importance.
- Consensus often means "obvious issue any reviewer would catch"
- Unique findings may be subtle insights worth MORE attention, not less

Flag unique findings for extra scrutiny. When only one model found something:
- WHY did only one catch it? (deeper insight vs hallucination?)
- Validate more carefully: it could be the most important find
- It could also be a false positive: verify against the actual code

Check for gaps. What might BOTH models have missed?
- Complex state/timing issues (e.g., async race conditions)
- Claimed features that don't actually work (check PR description)
- Subtle logic errors in control flow
- Look at the PR description: are all claims implemented?
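
For the "verify it exists and matches" check, a cheap first filter is whether the file a reviewer cites is touched by the diff at all. The sketch below is an assumption-laden illustration: `file_in_diff` is a hypothetical helper, and it relies on the `+++ b/<path>` headers of a unified git diff, so it will not match files the PR deletes (those show `+++ /dev/null`). It does not replace reading the code.

```shell
# Hypothetical helper: succeed if the diff adds or modifies the given path.
# $1 = path as cited in a finding, $2 = diff file (e.g. "$PR_DIFF").
file_in_diff() {
  path="$1"
  diff_file="$2"
  # -F: fixed string, -x: match the whole line, -q: no output, status only.
  grep -Fxq -- "+++ b/$path" "$diff_file"
}
```

A finding whose file fails this check is a strong candidate for the false-positive list, though it may still refer to unchanged code that the PR affects indirectly.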

Phase 3: Synthesized Output

Output format

# 🔍 Multi-Model PR Review: [PR title]

## Validated Issues

### 🔴 Critical
[Issues that must be fixed - functionality broken, security, etc.]

### 🟠 High Priority  
[Real bugs, incorrect behavior - should fix before merge]

### 🟡 Medium Priority
[Performance, maintainability, edge cases - should discuss]

### 🟢 Low Priority
[Style, minor improvements - nice to have]

Each issue should include:
- **File**: path/to/file.ext#L10-L15
- **Status**: ✅ Confirmed | ⚠️ Needs verification | ❌ False positive
- **Found by**: Opus / Codex / PR feedback
- **Description**: What's wrong and why it matters
- **Suggestion**: How to fix (if applicable)

## ❌ False Positives Filtered
[List any findings that were wrong, with brief explanation]

## ⚠️ Potential Gaps
[Things all models may have missed - especially check PR description claims]

## 📊 Model Coverage
| Issue | Opus | Codex | PR | Status |
|-------|:----:|:-----:|:--:|--------|
| Issue 1 | ✅ | ✅ | - | ✅ Confirmed |
| Issue 2 | ❌ | ✅ | - | ✅ Confirmed |
| Issue 3 | ✅ | ❌ | - | ❌ False positive |
| Issue 4 | ❌ | ❌ | ✅ | ⚠️ Models missed! |

## Final Verdict
**[MERGE / FIX FIRST / NEEDS DISCUSSION]**

[Brief explanation of verdict]

Key Principles

Validate, don't just synthesize - You are the senior reviewer, not a secretary
Unique findings deserve MORE attention - They might be the deepest insights
Consensus ≠ importance - Obvious issues get caught by all; critical bugs may be subtle
Check what's missing - The worst bugs are the ones no one found
Compare against PR description - Do claimed features actually work?
