review-scenario
// Use when reviewing a conformance PR that adds or changes scenario .ts files for a SEP — before approving, before requesting changes, or as a self-check before opening one.
| name | review-scenario |
| description | Use when reviewing a conformance PR that adds or changes scenario .ts files for a SEP — before approving, before requesting changes, or as a self-check before opening one. |
Spec diff is ground truth. Pull the SEP's actual spec changes and read the RFC-2119 sentences yourself — don't trust the PR description or SEP summary for keyword levels:
gh api "repos/modelcontextprotocol/modelcontextprotocol/pulls/<SEP>/files" \
--jq '.[] | select(.filename | test("^docs/specification/draft/.*\\.mdx$")) | {filename, patch}'
If the SEP includes a conformance-test-case table, that table is authoritative for the cases it lists. A table/prose mismatch is a spec gap to flag, not something to silently resolve either way.
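Pulling the normative sentences out of a patch can be sketched as below. This is a rough illustration, not something shipped in the conformance repo: `extractNormativeSentences` and `NormativeSentence` are hypothetical names, the input is assumed to be one `patch` string from the gh output above, and the sentence splitting is deliberately naive. Treat the result as a reading aid and still read the diff yourself.

```typescript
// Hypothetical helper (not part of the conformance repo): list the added
// sentences in a unified-diff patch that contain an RFC 2119 keyword.

// Longer phrases come first so "MUST NOT" is not reported as "MUST".
const RFC2119 =
  /\b(MUST NOT|MUST|SHALL NOT|SHALL|SHOULD NOT|SHOULD|REQUIRED|NOT RECOMMENDED|RECOMMENDED|MAY|OPTIONAL)\b/;

interface NormativeSentence {
  keyword: string;
  sentence: string;
}

function extractNormativeSentences(patch: string): NormativeSentence[] {
  // Keep only added lines ("+" prefix), skipping the "+++" file header.
  const added = patch
    .split("\n")
    .filter((line) => line.startsWith("+") && !line.startsWith("+++"))
    .map((line) => line.slice(1).trim())
    .join(" ");

  // Naive split: terminal punctuation followed by whitespace ends a sentence.
  const sentences = added.replace(/([.!?])\s+/g, "$1\n").split("\n");

  const found: NormativeSentence[] = [];
  for (const raw of sentences) {
    const sentence = raw.trim();
    const match = RFC2119.exec(sentence);
    if (match) {
      found.push({ keyword: match[1], sentence });
    }
  }
  return found;
}
```

Matching is case-sensitive on purpose: a lowercase "must" in prose is not treated as a keyword here, which is an assumption worth checking against how the SEP itself uses the terms.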
Traceability YAML. src/seps/sep-<SEP>.yaml should exist (run /new-sep <SEP> first if not). Diff its rows against the spec sentences you extracted; flag rows that paraphrase rather than quote, claim a keyword level the spec doesn't, or assert something the spec never says. Check IDs follow sep-<NNNN>-<kebab-slug>.
Per-scenario-file:
Coverage. Count YAML check: rows vs how many the PR's scenarios actually exercise; list the gaps.
Proof it runs. The PR should reference at least one real implementation the scenario ran green against — the in-repo everything-client/server, or an external SDK via npx https://pkg.pr.new/@modelcontextprotocol/conformance@<PR>. No run referenced → ask for one before approving.
Negative test. It should pin the specific failing slugs, not just assert failures.length > 0 (AGENTS.md §Examples: prove it passes and fails).
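The coverage count and the negative-test pin above could look roughly like this. It is a sketch under assumptions: `compareCoverage` and `failingSlugsMatch` are hypothetical names, the `{ id }` failure shape is invented for illustration, and the check-ID pattern is one reading of sep-<NNNN>-<kebab-slug>. The real scenario and check types in the repo may differ.

```typescript
// Hypothetical helpers (not part of the conformance repo).

// One reading of the sep-<NNNN>-<kebab-slug> convention.
const CHECK_ID = /^sep-\d+-[a-z0-9]+(?:-[a-z0-9]+)*$/;

interface CoverageReport {
  covered: string[];   // in the YAML and emitted by a scenario
  gaps: string[];      // in the YAML, never emitted
  unlisted: string[];  // emitted, but no YAML row
  malformed: string[]; // IDs that do not match CHECK_ID
}

function compareCoverage(yamlCheckIds: string[], emittedCheckIds: string[]): CoverageReport {
  const yaml = new Set(yamlCheckIds);
  const emitted = new Set(emittedCheckIds);
  const all = Array.from(new Set(yamlCheckIds.concat(emittedCheckIds)));
  return {
    covered: Array.from(yaml).filter((id) => emitted.has(id)),
    gaps: Array.from(yaml).filter((id) => !emitted.has(id)),
    unlisted: Array.from(emitted).filter((id) => !yaml.has(id)),
    malformed: all.filter((id) => !CHECK_ID.test(id)),
  };
}

// Negative-test pin: the set of failing slugs must equal the expected set,
// which is stricter than checking that failures.length > 0.
function failingSlugsMatch(failures: { id: string }[], expected: string[]): boolean {
  const actual = Array.from(new Set(failures.map((f) => f.id))).sort();
  const wanted = Array.from(new Set(expected)).sort();
  return actual.length === wanted.length && actual.every((id, i) => id === wanted[i]);
}
```

The `gaps` array is the gap list to report; `covered.length` over the YAML row count gives the X/Y figure. Comparing sets rather than counts means a negative test that fails for the wrong reason does not pass.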
This is a first pass for a human reviewer — give them what they need to verify each finding without re-deriving it.
Open with a summary: N scenarios added/changed, M distinct check IDs emitted, X/Y YAML check: rows covered, and which implementation it was run against.
Then one bullet per finding. Each bullet makes its own case — the reviewer should be able to confirm or refute it from the bullet alone:
<check-id> — file.ts:Lnn — claim. Spec: "quoted normative sentence" (page#anchor). Consequence: what a compliant impl would do and how this check would mis-report it.
e.g.
client-consistent-version — stateless.ts:86 — no spec backing. Spec: "Servers MUST NOT rely on prior requests over the same connection to establish context (e.g., capabilities, protocol version)" (lifecycle#stateless). A compliant client may change protocolVersion per request; this check FAILs it. The flippingVersionClient negative test enforces a non-requirement.
Get <HEAD-SHA> once via gh pr view <PR> --json headRefOid -q .headRefOid and use it for all permalinks so they don't drift on rebase.
Order: spec-backing → logic/dead → coverage (gap list) → conventions. Put spec gaps in a separate trailing list — those go upstream, not to the PR author.
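The permalink and ordering rules can be sketched as follows. `permalink`, `sortFindings`, and the `Finding` shape are hypothetical, and the repo slug is passed in rather than assumed. The URL layout is GitHub's standard blob permalink form; the category names are taken from the order above.

```typescript
// Hypothetical helpers (not part of the conformance repo).

type Category = "spec-backing" | "logic/dead" | "coverage" | "conventions";

// Report order from the guidance above.
const ORDER: Category[] = ["spec-backing", "logic/dead", "coverage", "conventions"];

interface Finding {
  checkId: string;
  file: string;
  line: number;
  category: Category;
}

// Pin links to the PR head SHA so they keep pointing at the reviewed
// code after a rebase or force-push.
function permalink(repo: string, headSha: string, file: string, line: number): string {
  return `https://github.com/${repo}/blob/${headSha}/${file}#L${line}`;
}

// Returns a sorted copy; findings in the same category keep their input order.
function sortFindings(findings: Finding[]): Finding[] {
  return findings
    .slice()
    .sort((a, b) => ORDER.indexOf(a.category) - ORDER.indexOf(b.category));
}
```

Spec gaps are left out of `Category` on purpose: they go in the separate trailing list, not in the ordered findings.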
Self-review: fix in place and re-run.
If asked to push fixes (stacked diff on top of the PR head): one commit per finding, commit message is the finding. Leave design-level items (scenario count, refactors) as prose.