reviewing-compiler-prs

Updated May 20, 2026 at 15:51

Performs multi-agent, multi-model code review of F# compiler PRs across 19 dimensions including type checking, IL emission, binary compatibility, and IDE performance. Dispatches parallel assessment agents per dimension, consolidates with cross-model agreement scoring, and filters false positives. Invoke when reviewing compiler changes, requesting expert feedback, or performing pre-merge quality checks.

Source

dotnet

dotnet/fsharp

SKILL.md

name

reviewing-compiler-prs

description

Reviewing Compiler PRs

Full dimension definitions and CHECK rules live in the expert-reviewer agent.

When to Invoke

PR touches src/Compiler/ — invoke the expert-reviewer agent
PR touches src/FSharp.Core/ — focus on FSharp.Core Stability, API Surface, Backward Compat, XML Docs
PR touches vsintegration/ or LanguageServer/ — focus on IDE Responsiveness, Concurrency, Memory
PR touches tests/ only — quick check: baselines explained? Cross-TFM coverage? Tests actually assert?
PR touches eng/ or build scripts — focus on Build Infrastructure, Cross-Platform

Dimension Selection

| Files Changed | Focus Dimensions |
| --- | --- |
| `Checking/`, `TypedTree/` | Type System, Overload Resolution, Struct Awareness, Feature Gating |
| `CodeGen/`, `AbstractIL/` | IL Emission, Debug Experience, Test Coverage |
| `Optimize/` | Optimization Correctness, IL Emission, Test Coverage |
| `SyntaxTree/`, `pars.fsy` | Parser Integrity, Feature Gating, Typed Tree Discipline |
| `TypedTreePickle.`, `CompilerImports.` | Binary Compatibility (highest priority) |
| `Service/` | FCS API Surface, IDE Responsiveness, Concurrency, Incremental Checking |
| `LanguageServer/` | IDE Responsiveness, Concurrency |
| `Driver/` | Build Infrastructure, Incremental Checking, Cancellation |
| `Facilities/` | Feature Gating, Concurrency |
| `FSComp.txt` | Diagnostic Quality |
| `FSharp.Core/` | FSharp.Core Stability, Backward Compat, XML Docs, RFC Process |
| `vsintegration/` | IDE Responsiveness, Memory Footprint, Cross-Platform |
| `eng/`, `setup/`, build scripts | Build Infrastructure, Cross-Platform |
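The table lookup can be sketched in a few lines of Python. This is only an illustration, not how the expert-reviewer agent is implemented: the path fragments and dimension names are copied from the table, while `RULES`, `matches`, `select_dimensions`, the matching strategy (directory fragments match a path segment, file fragments match the start of the file name), and the example paths are assumptions. The "build scripts" entry has no path pattern in the table, so it is left out.

```python
from pathlib import PurePosixPath

# (path fragments, focus dimensions), copied from the table above.
RULES = [
    (["Checking/", "TypedTree/"], ["Type System", "Overload Resolution", "Struct Awareness", "Feature Gating"]),
    (["CodeGen/", "AbstractIL/"], ["IL Emission", "Debug Experience", "Test Coverage"]),
    (["Optimize/"], ["Optimization Correctness", "IL Emission", "Test Coverage"]),
    (["SyntaxTree/", "pars.fsy"], ["Parser Integrity", "Feature Gating", "Typed Tree Discipline"]),
    (["TypedTreePickle.", "CompilerImports."], ["Binary Compatibility"]),
    (["Service/"], ["FCS API Surface", "IDE Responsiveness", "Concurrency", "Incremental Checking"]),
    (["LanguageServer/"], ["IDE Responsiveness", "Concurrency"]),
    (["Driver/"], ["Build Infrastructure", "Incremental Checking", "Cancellation"]),
    (["Facilities/"], ["Feature Gating", "Concurrency"]),
    (["FSComp.txt"], ["Diagnostic Quality"]),
    (["FSharp.Core/"], ["FSharp.Core Stability", "Backward Compat", "XML Docs", "RFC Process"]),
    (["vsintegration/"], ["IDE Responsiveness", "Memory Footprint", "Cross-Platform"]),
    (["eng/", "setup/"], ["Build Infrastructure", "Cross-Platform"]),
]


def matches(path, fragment):
    """Assumed matching rule: 'Dir/' matches a directory segment,
    anything else matches the beginning of the file name."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    if fragment.endswith("/"):
        return fragment[:-1] in parts[:-1]
    return parts[-1].startswith(fragment)


def select_dimensions(changed_files):
    """Collect focus dimensions for all changed files, without duplicates,
    in table order per file."""
    selected = []
    for path in changed_files:
        for fragments, dimensions in RULES:
            if any(matches(path, fragment) for fragment in fragments):
                for dimension in dimensions:
                    if dimension not in selected:
                        selected.append(dimension)
    return selected


if __name__ == "__main__":
    # Illustrative path only.
    print(select_dimensions(["src/Compiler/Optimize/Optimizer.fs"]))
```

A file can hit more than one row: under this matching rule a pickle-format file inside `TypedTree/` selects both the type-checking dimensions and Binary Compatibility.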

Subagent Dispatch

For each selected dimension from the table above, the expert-reviewer agent MUST launch an independent subagent (background task) to assess that dimension. This is not optional — a single agent doing all dimensions sequentially produces shallow analysis and wall-of-text summaries.

Each subagent receives:

The dimension's CHECK rules (from expert-reviewer.md)
The relevant file diffs (filtered by the dimension's hotspot paths)
Instructions to produce a structured finding: {file, line, severity, dimension, issue, suggestion} or LGTM if no findings
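As a rough sketch of the fan-out described above, one independent task per dimension could be launched as follows. The `Finding` fields mirror the structured finding shape listed above; everything else (`dispatch`, the `assess` callable standing in for a subagent, the dictionaries of rules and diffs, and the use of a thread pool) is a hypothetical stand-in, since the actual subagent mechanism is defined by the expert-reviewer agent.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Field names follow the structured finding shape from the list above.
    file: str
    line: int
    severity: str
    dimension: str
    issue: str
    suggestion: str


LGTM = "LGTM"  # returned by a subagent that has no findings


def dispatch(dimensions, check_rules, diffs_by_dimension, assess):
    """Run one independent assessment per dimension and collect results.

    `assess` is a hypothetical callable standing in for a subagent. It takes
    (dimension, rules, diffs) and returns a list of Finding or LGTM.
    """
    def run(dimension):
        return assess(dimension, check_rules[dimension], diffs_by_dimension[dimension])

    # One worker per dimension, so no dimension waits behind another.
    with ThreadPoolExecutor(max_workers=max(1, len(dimensions))) as pool:
        results = list(pool.map(run, dimensions))
    return dict(zip(dimensions, results))


if __name__ == "__main__":
    def stub_assess(dimension, rules, diffs):
        # Stand-in for a real subagent; always reports no findings.
        return LGTM

    print(dispatch(["IL Emission"], {"IL Emission": []}, {"IL Emission": []}, stub_assess))
```

Each task sees only its own dimension's rules and diffs, which is the property the dispatch rule asks for: no single agent works through every dimension sequentially.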

The expert-reviewer consolidates subagent results, deduplicates, applies assessment gates, and posts as inline comments per Wave 5.

Assessment gates — apply before flagging:

Understand execution context before judging (test harness ≠ compiler runtime)
Classify as regression, improvement, or unclear — only regressions are findings
Require a concrete failing scenario — no hypotheticals
Code that follows the correct convention for the context in use → discard, not a finding
"Unexplained" ≠ "wrong" — missing rationale in a commit message is a doc gap, not a defect
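The gates above read naturally as a chain of early exits. The sketch below is one possible encoding, assuming each candidate finding has already been annotated with the answers to the five questions; the `Candidate` type, its field names, and `apply_gates` are hypothetical and not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All field names are hypothetical; each one records the answer to one gate.
    context_understood: bool          # is the execution context established?
    classification: str               # "regression", "improvement", or "unclear"
    failing_scenario: str             # concrete failing scenario, "" if none
    matches_context_convention: bool  # code follows the convention for its context
    only_missing_rationale: bool      # the only complaint is an unexplained change


def apply_gates(candidate):
    """Return (keep, reason). A candidate is flagged only if it passes every gate."""
    if not candidate.context_understood:
        return False, "execution context not established"
    if candidate.classification != "regression":
        return False, "not a regression"
    if not candidate.failing_scenario.strip():
        return False, "no concrete failing scenario"
    if candidate.matches_context_convention:
        return False, "correct convention for this context"
    if candidate.only_missing_rationale:
        return False, "doc gap, not a defect"
    return True, "finding"


if __name__ == "__main__":
    print(apply_gates(Candidate(True, "improvement", "", False, False)))
```

Returning the reason alongside the decision keeps discarded candidates auditable during consolidation.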

Consolidation:

Deduplicate findings at same location
Filter: wrong context → discard; improvement → downgrade; speculation → LOW
Classify: Behavioral (correctness) → Quality (structure) → Nitpick (style)
Rank by cross-model agreement (≥2 models agree = higher confidence)
Present Behavioral findings first; include Nitpicks only if nothing higher-ranked exists. Agents love producing nitpicks to have something to say, so deprioritize them.
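The consolidation steps can be sketched as a single pass followed by a sort. This is a minimal illustration under several assumptions: findings are plain dictionaries, the `kind` and `model` keys are hypothetical, "downgrade" is read as moving one category down, severity names other than LOW are invented, and merged entries keep the most serious category and highest severity seen at that location.

```python
CATEGORY_ORDER = {"Behavioral": 0, "Quality": 1, "Nitpick": 2}
DOWNGRADE = {"Behavioral": "Quality", "Quality": "Nitpick", "Nitpick": "Nitpick"}
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}  # names other than LOW are assumed


def consolidate(findings):
    """Filter, deduplicate by location, then rank by category and model agreement."""
    merged = {}
    for finding in findings:
        if finding["kind"] == "wrong_context":
            continue  # wrong context: discard
        finding = dict(finding)  # work on a copy, leave the input untouched
        if finding["kind"] == "improvement":
            finding["category"] = DOWNGRADE[finding["category"]]
        if finding["kind"] == "speculation":
            finding["severity"] = "LOW"

        key = (finding["file"], finding["line"])  # same location means duplicate
        entry = merged.setdefault(key, {**finding, "models": set()})
        entry["models"].add(finding["model"])
        if CATEGORY_ORDER[finding["category"]] < CATEGORY_ORDER[entry["category"]]:
            entry["category"] = finding["category"]
        if SEVERITY_RANK[finding["severity"]] > SEVERITY_RANK[entry["severity"]]:
            entry["severity"] = finding["severity"]

    # Behavioral first; within a category, more agreeing models rank higher.
    ranked = sorted(
        merged.values(),
        key=lambda e: (CATEGORY_ORDER[e["category"]], -len(e["models"]), e["file"], e["line"]),
    )
    # Nitpicks are shown only when nothing higher was found.
    if any(e["category"] != "Nitpick" for e in ranked):
        ranked = [e for e in ranked if e["category"] != "Nitpick"]
    return ranked
```

Counting distinct models per location is one simple way to express "≥2 models agree = higher confidence"; a real implementation might also compare the issue text before treating two findings as the same.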

Self-Review Checklist

Every behavioral change has a test
FSharp.Core changes maintain binary compatibility
No unintended public API surface changes
New language features have a LanguageFeature guard and RFC
No raw TType_* matching without stripTyEqns
Cancellation tokens threaded through async operations
Cleanup changes separate from feature enablement

Full dimension CHECK rules are in the expert-reviewer agent.

