Performs multi-agent, multi-model code review of F# compiler PRs across 19 dimensions including type checking, IL emission, binary compatibility, and IDE performance. Dispatches parallel assessment agents per dimension, consolidates with cross-model agreement scoring, and filters false positives. Invoke when reviewing compiler changes, requesting expert feedback, or performing pre-merge quality checks.
Reviewing Compiler PRs
Full dimension definitions and CHECK rules live in the expert-reviewer agent.
When to Invoke
PR touches src/Compiler/ — invoke the expert-reviewer agent
PR touches src/FSharp.Core/ — focus on FSharp.Core Stability, API Surface, Backward Compat, XML Docs
PR touches vsintegration/ or LanguageServer/ — focus on IDE Responsiveness, Concurrency, Memory
FSharp.Core Stability, Backward Compat, XML Docs, RFC Process
vsintegration/: IDE Responsiveness, Memory Footprint, Cross-Platform
eng/, setup/, build scripts: Build Infrastructure, Cross-Platform
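The path-to-dimension routing above can be sketched as a prefix lookup. This is only an illustration, not the expert-reviewer's actual selection logic: the `ROUTES` table and `select_dimensions` function are hypothetical names, the file paths in the usage line are made up, and only path prefixes that appear in this document with an explicit dimension list are included.

```python
# Hypothetical routing table: path prefix -> dimensions to focus on.
# Entries are copied from the rules above; the names are illustrative only.
ROUTES = {
    "src/FSharp.Core/": ["FSharp.Core Stability", "API Surface", "Backward Compat", "XML Docs"],
    "vsintegration/": ["IDE Responsiveness", "Memory Footprint", "Cross-Platform"],
    "eng/": ["Build Infrastructure", "Cross-Platform"],
    "setup/": ["Build Infrastructure", "Cross-Platform"],
}


def select_dimensions(changed_files):
    """Return focus dimensions for a PR in first-seen order, without duplicates."""
    selected = []
    for path in changed_files:
        for prefix, dimensions in ROUTES.items():
            if path.startswith(prefix):
                for dimension in dimensions:
                    if dimension not in selected:
                        selected.append(dimension)
    return selected


if __name__ == "__main__":
    # Made-up file names, used only to exercise the lookup.
    print(select_dimensions(["vsintegration/src/Example.fs", "eng/build.sh"]))
```

A PR that touches several areas gets the union of their dimensions, so a dimension shared by two areas (here Cross-Platform) is assessed once rather than twice.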
Subagent Dispatch
For each selected dimension from the table above, the expert-reviewer agent MUST launch an independent subagent (background task) to assess that dimension. This is not optional — a single agent doing all dimensions sequentially produces shallow analysis and wall-of-text summaries.
Each subagent receives:
The dimension's CHECK rules (from expert-reviewer.md)
The relevant file diffs (filtered by the dimension's hotspot paths)
Instructions to produce a structured finding: {file, line, severity, dimension, issue, suggestion} or LGTM if no findings
The expert-reviewer consolidates the subagent results, deduplicates them, applies the assessment gates, and posts the surviving findings as inline comments per Wave 5.
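A minimal sketch of the dispatch-and-consolidate flow, assuming each subagent can be modelled as a function call run on a thread pool. The `Finding` fields mirror the structured finding listed above; `assess_dimension` is a hypothetical stand-in for a real subagent (it merely flags diffs containing "TODO"), and `dispatch` is an illustrative name. Filtering diffs by hotspot path is omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    severity: str
    dimension: str
    issue: str
    suggestion: str


def assess_dimension(dimension, diffs):
    """Hypothetical stand-in for one subagent. An empty list means LGTM."""
    findings = []
    for file, text in diffs.items():
        if "TODO" in text:
            findings.append(
                Finding(file, 1, "low", dimension, "unresolved TODO", "resolve or remove")
            )
    return findings


def dispatch(dimensions, diffs, assess=assess_dimension):
    """Run one independent assessment per dimension, then merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(dimensions))) as pool:
        # pool.map keeps results in the same order as the input dimensions.
        results = list(pool.map(lambda d: assess(d, diffs), dimensions))

    seen, merged = set(), []
    for findings in results:
        for finding in findings:
            # Two dimensions reporting the same issue on the same line count once.
            key = (finding.file, finding.line, finding.issue)
            if key not in seen:
                seen.add(key)
                merged.append(finding)
    return merged
```

Keeping each assessment in its own task means one dimension's output cannot crowd out another's, and deduplication happens once, at consolidation, rather than inside each subagent.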
Assessment gates — apply before flagging:
Understand execution context before judging (test harness ≠ compiler runtime)
Classify as regression, improvement, or unclear — only regressions are findings
Require a concrete failing scenario — no hypotheticals
"Correct convention" for the context in use → discard, not a finding
"Unexplained" ≠ "wrong" — missing rationale in a commit message is a doc gap, not a defect