interrogate
Use for "interrogate", "adversarial review", "multi-model review", "challenge this", "stress test this code", "find blind spots", or "tear this apart". Four LLM reviewers challenge changes from independent angles.
| name | interrogate |
| description | Use for "interrogate", "adversarial review", "multi-model review", "challenge this", "stress test this code", "find blind spots", or "tear this apart". Four LLM reviewers challenge changes from independent angles. |
| disable-model-invocation | true |
Spawn four reviewers on four different models to adversarially review code changes. Each model gets the same prompt and rubric. The adversarial signal comes from model diversity, not assigned personas. Different models have different blind spots, priors, and reasoning patterns. Agreement across models is high-confidence signal; lone-model findings are worth reading but lower confidence.
The deliverable is a synthesized verdict. Do NOT auto-apply changes.
Identify what to review from context:
git diff main...HEAD (or the appropriate base branch) to get the full changeset

Collect the material into a clear package: the diff (or file contents), and any surrounding context files the reviewers will need to understand the code.
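As a minimal sketch of the collection step, assuming Python and a local git checkout, the diff could be gathered like this. The function names `diff_command` and `collect_diff` are hypothetical and not part of this skill; only the `git diff main...HEAD` command comes from the text above.

```python
import subprocess


def diff_command(base="main"):
    """Build the three-dot diff command: changes on HEAD since it diverged from `base`."""
    return ["git", "diff", f"{base}...HEAD"]


def collect_diff(base="main", repo_dir="."):
    """Run the diff inside `repo_dir` and return its text.

    Raises subprocess.CalledProcessError if git exits with a non-zero status,
    for example when `base` does not exist in the repository.
    """
    result = subprocess.run(
        diff_command(base),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing a different `base` (for instance `collect_diff("develop")`) covers the "appropriate base branch" case.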
Before spawning reviewers, state the intent explicitly. What is this code trying to accomplish? Derive this from:
Write one clear paragraph. This is critical: reviewers challenge whether the work achieves the intent well, not whether the intent itself is correct. If you're unsure about the intent, ask the user before proceeding.
Launch all four in a single message using the Task tool, each with a different model. All four get the same prompt built from the template in references/reviewer-prompt.md.
| Subagent | Model |
|---|---|
| Reviewer A | claude-opus-4-7-thinking-xhigh |
| Reviewer B | gpt-5.3-codex-high-fast |
| Reviewer C | gpt-5.5-high-fast |
| Reviewer D | composer-2.5-fast |
For each reviewer:
subagent_type: generalPurpose
model: the model from the table
readonly: true

If a model slug in the table above is rejected as unresolvable when you try to spawn the subagent, check the current list of valid slugs in the Task tool's error message, pick the closest equivalent (prefer the highest-reasoning tier of the same family), spawn with the valid slug, and open a separate PR to update this table. Do not block the review on the slug issue.
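One way to picture the "closest equivalent" rule is the sketch below. It rests on two loud assumptions that this document does not confirm: that a slug's family is the text before its first hyphen, and that reasoning tiers appear as hyphen-separated tokens ordered `xhigh`, `high`, `medium`, `low`. The helper names and the example slugs in the usage note are made up for illustration.

```python
# Assumed tier ordering, highest reasoning first. Not taken from any real slug list.
TIER_ORDER = ["xhigh", "high", "medium", "low"]


def family(slug):
    """Assumed heuristic: the family is everything before the first hyphen."""
    return slug.split("-", 1)[0]


def tier_rank(slug):
    """Return the index of the slug's tier in TIER_ORDER (lower is better).

    Slugs with no recognizable tier token sort after all known tiers.
    """
    parts = slug.split("-")
    for rank, tier in enumerate(TIER_ORDER):
        if tier in parts:
            return rank
    return len(TIER_ORDER)


def pick_fallback(rejected, valid_slugs):
    """Pick the highest-tier valid slug from the same family as `rejected`.

    Returns None when no valid slug shares the family, so the caller can
    choose by hand instead.
    """
    same_family = [s for s in valid_slugs if family(s) == family(rejected)]
    if not same_family:
        return None
    return min(same_family, key=tier_rank)
```

With a made-up valid list `["gpt-9-low", "gpt-9-high", "claude-x-xhigh"]`, a rejected `gpt-5.3-codex-high-fast` would resolve to `gpt-9-high`. A real replacement still deserves a human glance, since slug naming schemes change.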
Read references/reviewer-prompt.md and fill in the template with:
references/rubric.md

Each reviewer produces structured findings as described in the prompt template.
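The spawn settings can be pictured as plain data: four specifications that differ only in the model. This is a sketch, not the Task tool's actual call shape, which this document does not show. The `prompt` and `name` keys and the `build_task_specs` helper are hypothetical; `subagent_type`, `model`, `readonly`, and the model slugs come from the table and settings above.

```python
# Reviewer-to-model mapping copied from the table in this document.
REVIEWER_MODELS = {
    "Reviewer A": "claude-opus-4-7-thinking-xhigh",
    "Reviewer B": "gpt-5.3-codex-high-fast",
    "Reviewer C": "gpt-5.5-high-fast",
    "Reviewer D": "composer-2.5-fast",
}


def build_task_specs(prompt):
    """Return one spec per reviewer; every reviewer gets the identical prompt."""
    return [
        {
            "name": name,                     # hypothetical label for bookkeeping
            "subagent_type": "generalPurpose",
            "model": model,
            "readonly": True,                 # reviewers must not modify anything
            "prompt": prompt,                 # hypothetical key; same text for all four
        }
        for name, model in REVIEWER_MODELS.items()
    ]
```

Keeping the prompt identical across specs is the point: any difference in findings then comes from the models, not from the instructions.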
As reviewer results come back, build a unified picture:
You are the lead reviewer, a pragmatic senior engineer, not a neutral aggregator.
Read references/lead-judgment.md for the full framework. Core principle: reviewers only see a slice of the codebase. You have the full context: the goal, the constraints, the timeline, and which tradeoffs were already considered. Use that context aggressively.
Categorize every finding into one of four buckets:
For each finding, include:
Present the verdict in this structure:
[The stated intent paragraph from Step 2]
[Findings that should be addressed. For each: description, which models raised it, why it matters.]
[Findings worth thinking about. For each: description, which models raised it, tradeoff involved.]
[Valid but low-priority. Brief list.]
[Rejected findings with brief rationale. This section matters because it shows the user what was filtered out and why, so they can override your judgment if they disagree.]
[Where did models agree? Where did they diverge? What does the pattern of agreement/disagreement tell us?]
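The agreement signal described earlier (cross-model agreement is high-confidence, lone-model findings are lower confidence) can be sketched as a simple grouping. This assumes the lead reviewer has already normalized equivalent findings to a shared key, which is the hard part and is not automated here. The function names, the key strings, and the threshold of two models are illustrative assumptions.

```python
from collections import defaultdict


def group_by_agreement(findings):
    """Group (model, finding_key) pairs into {finding_key: sorted list of models}."""
    models_by_key = defaultdict(set)
    for model, key in findings:
        models_by_key[key].add(model)
    return {key: sorted(models) for key, models in models_by_key.items()}


def confidence(models, threshold=2):
    """Label a finding 'high' when at least `threshold` models raised it.

    The threshold of 2 is an assumption; a lone-model finding is labeled
    'lower', not discarded.
    """
    return "high" if len(models) >= threshold else "lower"


# Usage with made-up findings:
findings = [
    ("Reviewer A", "missing-null-check"),
    ("Reviewer B", "missing-null-check"),
    ("Reviewer C", "unclear-naming"),
]
grouped = group_by_agreement(findings)
labels = {key: confidence(models) for key, models in grouped.items()}
```

The labels are an input to the lead reviewer's judgment, not a replacement for it: a lone-model finding can still be the one that matters.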