review-roulette

Stars: 39

Forks: 15

Last updated: March 23, 2026 at 14:08

Dispatch a review task to 3 randomly-selected reasoning models in parallel for diverse perspectives, then merge all suggestions into a single result.

Source: in-the-loop-labs/pair-review

Related occupation (based on the SOC occupational classification): Software Quality Assurance Analysts and Testers, Computer and Mathematical Occupations, SOC 15-1253

SKILL.md

name	review-roulette
description	Dispatch a review task to 3 randomly-selected reasoning models in parallel for diverse perspectives, then merge all suggestions into a single result.

Review Roulette

When this skill is active, your ONLY job is orchestration — you do NOT perform any review analysis yourself. You randomly select 3 reasoning models, dispatch the review to all of them in parallel, and merge the results.

Step 1: Discover Available Reasoning Models

Run ${PI_CMD:-pi} --list-models via bash to get the current list of models with valid API keys. Eligible models are those that show thinking: yes in the output — these are the reasoning-capable / premium models.

Excluded models: Never select openai/o3-pro — it is prohibitively expensive. If it appears in the model list, skip it entirely.

Example reasoning models you might see (provider/model format):

anthropic/claude-opus-4-6
anthropic/claude-sonnet-4-5 (with thinking)
openai/o3
openai/o4-mini
openai/gpt-5-pro
openai/gpt-5.2-pro
google/gemini-2.5-pro (with thinking)
google/gemini-2.5-flash (with thinking)
google/gemini-3.1-pro-preview
xai/grok-4

The exact list depends on which API keys are configured. Always check — do not assume models are available.
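The filtering rule in Step 1 can be sketched in Python. This is a minimal illustration, not the tool's actual parser: the output format of --list-models is not shown in this document, so the sketch assumes one model per line, with the provider/model id as the first whitespace-separated token and the literal text "thinking: yes" on the same line. The function name eligible_models is hypothetical.

```python
# Models that must never be selected, per the exclusion rule above.
EXCLUDED = {"openai/o3-pro"}


def eligible_models(listing: str) -> list[str]:
    """Return ids of reasoning-capable models, minus excluded ones.

    ASSUMPTION: each line describes one model, its first token is the
    provider/model id, and "thinking: yes" appears on that line.
    The real --list-models output may be laid out differently.
    """
    models = []
    for line in listing.splitlines():
        tokens = line.split()
        if not tokens or "thinking: yes" not in line:
            continue  # blank line, header, or non-reasoning model
        model_id = tokens[0]
        if "/" in model_id and model_id not in EXCLUDED:
            models.append(model_id)
    return models
```

If the real listing uses a table or JSON layout, only the line-parsing part would need to change; the two filters (thinking: yes, and the exclusion set) stay the same.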

Step 2: Randomly Select 3 Models

From the eligible reasoning models, pick exactly 3 at random.

CRITICAL — true randomness and diversity:

Do NOT always pick the same 3 models. The entire point of review roulette is variety of perspectives across runs.
Prefer different providers when possible. If you have reasoning models from Anthropic, OpenAI, Google, and xAI, pick from 3 different providers. Only double up on a provider if fewer than 3 providers have eligible models.
Shuffle or randomize your selection each time. Do not default to alphabetical order or any fixed preference.
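One way to satisfy the selection rules above, sketched in Python under stated assumptions: model ids are strings in provider/model format, and the helper name pick_models and its parameters are hypothetical. The sketch shuffles providers first, so that up to 3 distinct providers are used, and doubles up on a provider only when fewer than 3 providers are available.

```python
import random
from collections import defaultdict


def pick_models(eligible, k=3, rng=None):
    """Pick k models at random, preferring distinct providers.

    Returns fewer than k models if fewer than k are eligible.
    """
    if rng is None:
        rng = random.Random()  # unseeded, so each run differs

    # Group model ids by the provider prefix before the first "/".
    by_provider = defaultdict(list)
    for model in eligible:
        by_provider[model.split("/", 1)[0]].append(model)

    # One random model from each of up to k randomly ordered providers.
    providers = list(by_provider)
    rng.shuffle(providers)
    chosen = [rng.choice(by_provider[p]) for p in providers[:k]]

    # Fewer than k providers: double up using the remaining models.
    if len(chosen) < k:
        leftovers = [m for m in eligible if m not in chosen]
        rng.shuffle(leftovers)
        chosen.extend(leftovers[: k - len(chosen)])

    rng.shuffle(chosen)  # avoid any fixed ordering in the result
    return chosen
```

Passing a seeded random.Random is only useful for testing; in normal use the default unseeded generator keeps the selection varied across runs.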

Step 3: Dispatch the Review in Parallel

Use the task tool with the tasks array to dispatch all 3 reviews simultaneously. Each task object must include:

model: The selected model in provider/model format.
task: The FULL original review prompt/instructions. Each subtask starts fresh with NO conversation history and NO context from the parent. You must forward EVERYTHING you were asked to do — the complete prompt, all instructions, the diff, file contents, any constraints or formatting requirements, the expected JSON output schema, etc. Do not summarize or abbreviate. Pass it all through verbatim.

Example structure:

{
  "tasks": [
    {
      "task": "<the ENTIRE original review prompt and instructions>",
      "model": "anthropic/claude-opus-4-6"
    },
    {
      "task": "<the ENTIRE original review prompt and instructions>",
      "model": "openai/o3"
    },
    {
      "task": "<the ENTIRE original review prompt and instructions>",
      "model": "google/gemini-2.5-pro"
    }
  ]
}
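The payload above could be assembled programmatically along these lines. This is a small Python sketch; build_dispatch is a hypothetical helper name, and the only structure assumed is the tasks array of task/model objects shown in the example. The prompt is passed through unchanged for every model, matching the requirement to forward it verbatim.

```python
def build_dispatch(prompt: str, models: list[str]) -> dict:
    """Build the tasks payload: same verbatim prompt, one entry per model."""
    return {"tasks": [{"task": prompt, "model": model} for model in models]}
```

Because every entry reuses the same prompt string, no subtask can receive a summarized or abbreviated version of the instructions.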

Step 4: Merge Results

Each subtask will return a review result containing a summary string and a suggestions array (the standard review JSON format).

Collect the results from all 3 subtask responses and merge them:

Suggestions: Concatenate all suggestions arrays into a single array.
Summary: Concatenate all summaries with model attribution. Format the merged summary as:
```
<provider/model1>:
<summary1>

<provider/model2>:
<summary2>

<provider/model3>:
<summary3>
```
This attributed format also serves as a record of which models were used in the review.

Return the merged result as the final JSON response.

Do NOT:

Deduplicate suggestions — let the consumer decide what overlaps
Synthesize, summarize, or editorialize on the combined results
Perform any review analysis yourself

Do:

Concatenate all suggestion arrays: [...model1, ...model2, ...model3]
Concatenate all summaries with provider/model: attribution as shown above
Return the merged result as the final JSON response in the same schema the subtasks used
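The merge rules above amount to pure concatenation, which can be sketched in Python as follows. The helper name merge_results and the (model, result) pair input are assumptions made for illustration; the sketch also assumes each result carries only the summary and suggestions fields described above, and it does not handle any other fields a real schema might include.

```python
def merge_results(results):
    """Merge (model_id, result) pairs without deduplicating or editing.

    Each result is assumed to be a dict with a "summary" string and a
    "suggestions" list, as described for the subtask responses.
    """
    # Attribute each summary to its model, separated by blank lines.
    summary = "\n\n".join(
        f"{model}:\n{result['summary']}" for model, result in results
    )
    # Concatenate suggestions in order; overlaps are kept on purpose.
    suggestions = [s for _, result in results for s in result["suggestions"]]
    return {"summary": summary, "suggestions": suggestions}
```

Duplicates across models are kept intentionally, so the consumer sees every suggestion exactly as each model returned it.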

Summary

You (parent)                    Subtask 1 (model A)
    │                               │
    ├── pick 3 random models        ├── receive full prompt
    ├── forward full prompt ──────► ├── perform review
    │                               └── return suggestions JSON
    │
    ├── forward full prompt ──────► Subtask 2 (model B) ──► suggestions JSON
    │
    ├── forward full prompt ──────► Subtask 3 (model C) ──► suggestions JSON
    │
    └── merge all summaries (with model attribution) + suggestions[] ──► final JSON response

The parent does zero analysis. It is purely a dispatcher and merger. Each model's summary is attributed so the final output records which models contributed.
