Run any Skill in Manus with one click

apm-spec-guardian

Use this skill to run a four-panel adversarial advisory review on any pull request that touches the OpenAPM specification artifact (docs/src/content/docs/specs/openapm-*.md), its inline / sidecar JSON Schemas (docs/src/content/docs/specs/schemas/*.schema.json), or the conformance fixture seed (tests/fixtures/spec-conformance/**). The panel fans out to four spec-ecosystem reviewers (swagger-openapi-editor, oci-distribution-editor, pkgmgr-registry-contract-editor, w3c-tag-architect), each running in its own agent thread, and a spec-editor synthesizer that produces a fold-now / defer-v0.1.1 / defer-v0.2 / reject list plus a ship decision keyed off a 1..10 shocked_meter scale. The orchestrator is the sole writer to the PR: ONE consolidated comment, no verdict labels, no merge gating. The panel is advisory -- it surfaces findings, prioritizes folds, and renders a ship recommendation that the maintainer weighs.

Run Skill in Manus

Stars2,805

Forks232

UpdatedJune 9, 2026 at 19:52

Source

microsoft

microsoft/apm

View GitHub Repository View Creator Repositories

Install command

Download

Run Skill in Manus

File Explorer

5 files

SKILL.md

readonly

APM Spec Guardian -- Four-Panel Advisory Review for OpenAPM

This skill institutionalizes the two-round adversarial spec review that produced OpenAPM v0.1 by hand. The panel is FAN-OUT + SYNTHESIZER. Each panelist runs in its own agent thread (via the task tool) and returns JSON matching assets/panelist-return-schema.json. The orchestrator schema-validates each return, hands all returns to the spec-editor-synthesizer (also a task thread, returns JSON matching assets/synthesizer-return-schema.json), runs the linter checklist in assets/linter-checklist.md, then renders ONE comment from assets/comment-template.md.

This skill is ADVISORY by design. It does not compute a binary verdict, it does not apply verdict labels, and it does not gate merge. The panel surfaces findings; the maintainer ships.

Activation scope

This skill activates ONLY when the PR diff touches at least one of:

docs/src/content/docs/specs/openapm-*.md (the normative spec artifact, current and future versions)
docs/src/content/docs/specs/schemas/*.schema.json (sidecar JSON Schemas, if/when the inline Appendix-A schemas are extracted to files)
tests/fixtures/spec-conformance/** (the conformance fixture seed)

Edits to any OTHER documentation page MUST NOT trigger this skill. The maintainer's general docs-sync skill covers those.

Architecture invariants

Advisory regime, not gate regime. There is no APPROVE / REJECT, no spec-approved / spec-rejected label, no deterministic verdict computation. The synthesizer returns a ship_decision (fold_and_ship / needs_revision / next_brief); this is prose for the human reviewer, never auto-applied as a label or status check.
Ship-meter floor. ship_decision: fold_and_ship REQUIRES shocked_meter_avg >= 7.0. Below 7.0 the synthesizer MUST emit needs_revision (single drafter pass on the existing artifact) or next_brief (another round of panel review with a new brief). The floor is advisory wording in the comment, not a status check.
Blocker veto. If ANY panelist returns new_blocking_findings[].length > 0, the synthesizer MUST emit ship_decision: next_brief regardless of the shocked_meter_avg. A blocking finding from one panel is not outweighed by three panels rating the artifact 9/10.
Single-writer interlock. Only the orchestrator writes to the PR: exactly one add-comment call and one remove-labels call. The remove-labels call sweeps spec-review (trigger idempotency). NO add-labels call -- there are no verdict labels. Panelist subagents and the synthesizer subagent return JSON only and MUST NOT call any gh write command, post comments, apply labels, or touch PR state.
Single-emission discipline. Exactly one comment per panel run, rendered from assets/comment-template.md after all subagents return and the linter checklist runs.
ASCII-only artifact. Every byte the skill writes (the comment, the synthesizer prose, any rendered fold instruction) MUST be within printable ASCII (U+0020 - U+007E). The skill inherits the repo encoding rule from .github/instructions/encoding.instructions.md (if present) and additionally enforces it on the spec artifact via linter check 1.
No-vendor-foundation language ban. The spec artifact MUST NOT contain "CNCF", "Linux Foundation", "Sandbox", "Incubation", "W3C Process", or "IETF RFC stream". The persona prompts MAY reference these as pedigree (a panelist's credibility comes from having edited OpenAPI; that does not put OpenAPI's foundation affiliation in the spec text). Linter check 2 greps the artifact for the forbidden token list AFTER any fold.

Agent roster

Agent	Role	Always active?
Swagger / OpenAPI Editor	Interface-contract discipline (schemas, $ref hygiene, oneOf discriminators, conformance enumeration)	Yes
OCI Distribution Editor	Registry-HTTP rigor (hash envelopes, mirror tolerance, fail-closed extraction, supply-chain threat model)	Yes
Package-Manager Registry-Contract Editor	Dependency-resolution rigor (semver dialect pinning, lockfile determinism, transitive conflict policy, reserved-slot defensive MUSTs)	Yes
W3C TAG Architect	Web-platform integration / architecture (extensibility, layering, fingerprinting, machine-readable contract surface)	Yes
Spec Editor Synthesizer	Same hand that drafted; aggregates panel returns, computes shocked_meter_avg, clusters convergent themes, produces fold-now / defer / reject lists and ship_decision	Yes

The roster is invariant for the v0.1 lineage. Changing it requires bumping the skill version.

Topology

   apm-spec-guardian SKILL (orchestrator thread)
                      |
   +------ Wave 0: scope decision (orchestrator-internal) ------+
   |  classify diff:                                            |
   |   - editorial-only (tiny patch, single section, no schema  |
   |     change, no fixture add)  -> skip Wave 3, run Wave 5    |
   |     linter only, render lightweight comment                |
   |   - editorial-patch (default for PR-trigger)               |
   |     -> skip Wave 1+2, start at Wave 3                      |
   |   - new-version (operator opt-in via PR body marker        |
   |     `apm-spec-guardian: new-version`)                      |
   |     -> run Wave 1 + 2 + 3 + 4 + 5                          |
   +------------------------------------------------------------+
                      |
        IF new-version mode: run Wave 1 + Wave 2 first
                      |
   Wave 1 (optional, new-version only): task -> spec-editor-synthesizer
          acting as ASSESSOR -- reads issue context + corpus +
          produces SPEC_BRIEF_v0 (session-state artifact)
   Wave 2 (optional, new-version only): task -> spec-editor-synthesizer
          acting as DRAFTER -- produces SPEC_DRAFT_v0 from SPEC_BRIEF_v0
                      |
   Wave 3: FAN-OUT via task tool (4 panelists in parallel)
   +------+------+--------+------+
   v      v      v        v
   swagger oci  pkgmgr   tag
   (each returns JSON per assets/panelist-return-schema.json)
                      |
                      v   <-- S4 schema-validate per return
                      v   <-- on malformed: re-spawn that panelist
                      v       (max 2 attempts; then placeholder)
                      v
   Wave 4: task -> spec-editor-synthesizer
   - aggregates findings across 4 panels
   - computes shocked_meter_avg
   - resolves dissent
   - clusters into convergent themes
   - emits fold_now[] + defer_v0_1_1[] + defer_v0_2[] + reject[]
   - emits ship_decision honoring ship-meter floor + blocker veto
   - returns assets/synthesizer-return-schema.json
                      |
                      v   <-- S4 schema-validate
                      v
   Wave 5: LINTER (mechanical, from assets/linter-checklist.md)
          - 11 checks; each MUST pass
          - failures append to ship_prose as advisory notes
          - linter does NOT change ship_decision; it informs the
            human reviewer
                      |
   Wave 6: orchestrator (sole writer)
            |               |
            v               v
        add-comment    remove-labels
        (max:1)        [spec-review]
                       (trigger idempotency reset)

Wave 0 -- scope decision

The orchestrator classifies the diff before spawning any panelist. Decision rules, in order:

New-version mode. If the PR body contains the literal marker line apm-spec-guardian: new-version, OR the diff creates a new docs/src/content/docs/specs/openapm-*.md file (not an edit to an existing one), classify as new-version. Run Waves 1 -> 2 -> 3 -> 4 -> 5 -> 6.
Editorial-only mode. If ALL of the following hold:
- the diff added < 50 lines AND removed < 50 lines total across all in-scope paths,
- no JSON Schema file was added, removed, or had its top-level properties keys changed,
- no fixture file was added or removed (existing fixture content edits are OK),
- no anchor of the form <a id="req- was added or removed, classify as editorial-only. SKIP Wave 3 + Wave 4. Run Wave 5 linter on the modified artifact. Render the lightweight editorial-only branch of assets/comment-template.md (just the linter result + a one-line "no substantive spec change detected").
Editorial-patch mode (default). Everything else. Run Wave 3 -> 4 -> 5 -> 6. Wave 1 + Wave 2 are SKIPPED; the panel reviews the existing artifact as modified by the PR diff.

Document the decision in the comment header (one line: "Scope: ; diff = +X/-Y lines across N files").

Wave 3 -- panel fan-out

Spawn the following four tasks in PARALLEL via the task tool, one task per persona:

spec-swagger-editor
spec-oci-editor
spec-pkgmgr-editor
spec-tag-architect

Each task prompt MUST:

Reference its persona file by relative path (../../agents/spec-<slug>.agent.md) so the subagent loads its own scope, lens, and pedigree.
Include the PR number, title, body, and full diff (passed inline), PLUS the current contents of the in-scope spec artifact AS MODIFIED by the diff (the panel reviews the post-merge state of the file, not just the diff).
Cite assets/panelist-return-schema.json and require the subagent to emit JSON matching that schema as its FINAL message.
State the calibrated severity contract: "Use new_blocking_findings ONLY for issues that would break a conformant implementation, leak a security guarantee, or invalidate a published normative claim. Use new_recommended_findings for substantive improvements. Use new_nit_findings for one-line editorial polish. The panel is advisory; nothing you return blocks merge; pick the severity that honestly matches your signal strength."
State the no-vendor-foundation rule: "Your pedigree as a [persona] is part of your prompt. Do NOT propose adding the names of any standards body, foundation, or governance program (CNCF, Linux Foundation, Sandbox, Incubation, W3C Process, IETF RFC stream) to the spec artifact text. Findings that recommend such additions will be auto-rejected by the synthesizer."
State the ASCII rule: "Every byte in your return JSON and every byte of any proposed fix or replacement text MUST be within U+0020
- U+007E. No emojis, no Unicode dashes, no curly quotes."
Pass round (1 on first pass, 2+ on subsequent rounds in new-version mode).
Restate the output contract: NO gh write commands, NO posting comments, NO label changes, NO touching PR state. JSON return only.

Wave 4 -- synthesizer

Pass all four validated panelist JSON returns to a task invocation that loads ../../agents/spec-editor-synthesizer.agent.md. The prompt MUST:

Provide all panelist returns as structured input.
Ask for: convergence_table, convergent_themes (themes flagged by 2+ panels), fold_now[] (surgical single-section fixes only), defer_v0_1_1[] (small patches deferrable to next patch release), defer_v0_2[] (architectural work requiring a reserved slot in a future major), reject[] (findings the synthesizer declines, with rationale), ship_decision, ship_prose, linter_handoff_notes.
State the ship-decision rules verbatim:
- If sum(panelist.new_blocking_findings) > 0 across all panels, ship_decision MUST be next_brief.
- Else if shocked_meter_avg < 7.0, ship_decision MUST be needs_revision.
- Else ship_decision MAY be fold_and_ship.
State the no-vendor-foundation rule: auto-reject any panelist-proposed fix that would add a banned token to the artifact; surface in reject[] with rationale.
Cite assets/synthesizer-return-schema.json and require JSON return.
Restate the contract: the panel is advisory. The synthesizer does NOT pick a verdict label. The ship_decision is prose for the human reviewer, not a gate. NO gh write commands.

Validate the synthesizer return against assets/synthesizer-return-schema.json. On failure, re-spawn once with the violation cited.

Wave 5 -- linter

Run assets/linter-checklist.md against the in-scope artifact set (spec markdown + schemas + fixtures). The checklist has 11 mechanical checks; each is a one-liner producing exit code 0 (or empty grep output where noted). Record pass / fail per check.

Linter outcomes are ADVISORY: a failed check does NOT change the synthesizer's ship_decision. It DOES surface in the comment as a "Linter notes" section so the maintainer can decide whether to fold the fix into the same PR. If ship_decision == fold_and_ship AND any linter check failed, the comment surfaces the conflict prominently ("Synthesizer recommends ship; linter found N issues worth folding first").

Wave 6 -- render the comment

Load assets/comment-template.md, fill the placeholders from the synthesizer + panelist JSON + linter results, and emit it as exactly ONE comment.

Filling rules:

The convergence table renders ONE row per panelist with verdict, shocked_meter, new_blockers, new_recommended, new_nits counts.
The fold-now list renders the synthesizer's fold_now[] verbatim, ordered as returned.
The defer-v0.1.1 and defer-v0.2 lists render below the fold list, collapsed in <details> blocks if either has more than 3 items.
The reject list renders only if non-empty.
The "Linter notes" section renders only if any check failed; each failed check renders the check id + the one-line failure summary.
Full per-panel findings collapse into a <details> at the bottom.
NEVER render the words "Verdict", "APPROVE", "REJECT", "blocked", "merge gate", or any equivalent. The panel is advisory.

Then sweep the spec-review label via safe-outputs.remove-labels (idempotent on missing labels). NO add-labels call.

Output contract (non-negotiable)

Exactly ONE comment per panel run, rendered from assets/comment-template.md.
Exactly ONE remove-labels call sweeping [spec-review].
NO add-labels call.
Subagents (panelists + synthesizer) NEVER write to PR state, NEVER call gh pr comment, NEVER call gh pr edit --add-label. They return JSON. The orchestrator is the sole writer.
ASCII-only across every byte the orchestrator writes.
Never invent new top-level template sections or drop existing ones.

Loop budget

Editorial-patch mode: at most 2 panel rounds. If round 2 still carries new_blocking_findings, the synthesizer emits ship_decision: next_brief with a ship_prose note that the loop budget is exhausted and the maintainer should ESCALATE (manually decide between drafting a fix or closing the PR).
New-version mode: at most 3 panel rounds. Same exhaustion semantics on round 3.
Editorial-only mode: zero panel rounds (linter only).

The orchestrator increments and tracks the round counter; subagents receive it as input but MUST NOT trust panel-side memory.

Gotchas

Roster invariant. The frontmatter description, the activation scope list, the roster table, the topology diagram, and the schema enum MUST agree on the 4 panelists + 1 synthesizer. If you change one, change all in the same edit.
Blocker veto trumps ship-meter. A panelist who returns one new_blocking_finding and a shocked_meter of 9 means "the spec is mostly excellent but this one thing would break a conformant implementation". That blocker still vetoes fold_and_ship. Do not let the synthesizer average it away.
Calibrated severity discipline. The advisory regime relies on panelists distinguishing blocking from recommended honestly. If a panelist marks every editorial nit as blocking, the synthesizer's blocker veto becomes a denial-of-ship. The panelist prompts state the contract explicitly; the synthesizer arbitration prose is the safety valve.
Wave 0 editorial-only is a noise filter, not a quality shortcut. It exists so a one-line typo fix to a fixture comment does not summon four expert agents. It MUST NOT fire on schema changes, anchor additions, or fixture-tree topology changes; the rules above are intentionally conservative.
No-vendor-foundation tokens may appear in panelist returns (a panelist can name "the W3C TAG" as their pedigree in their summary field) but MUST NOT appear in the synthesizer's fold_now[].patch_instruction or in the rendered comment body outside <details> collapsed sections. The synthesizer auto-rejects panelist proposals that would add banned tokens to the artifact.
ASCII enforcement is per-byte, not per-codepoint. A character with codepoint > 0x7E is non-ASCII even if it would round-trip through a different encoding. The linter checks raw byte values, not the rendered visual.
Subagent write enforcement is contract-based, not sandbox-based. Tool permissions are workflow-scoped, not subagent-scoped, so every spawned task technically inherits the same gh toolset. The "subagents must not write" rule is enforced by the prompt contract in each .agent.md plus the safe-outputs.add-comment.max: 1 fail-soft.
Spec drift across count sites. Linter check 6 catches when sec. 1.3 sentence, Appendix C trailer, and Appendix D revision-history disagree on the normative-statement total. This is the most common failure mode of a fold pass and is the reason the linter is mandatory before render.

Relationship to apm-review-panel

apm-review-panel is the general OSS multi-persona review for any non-trivial PR in the repo. apm-spec-guardian is its narrow, spec-only sibling: a different persona roster, a different ship decision schema (shocked_meter instead of stance enum), and a mandatory linter step. The architectural shape (FAN-OUT + SYNTHESIZER + single-writer interlock + advisory regime) is deliberately the same so a contributor reading one can read the other. Do not merge them; the persona pedigrees and the artifact type (spec vs code) are different enough that one-size-fits-all prompts would dilute both.

apm-spec-guardian

More from this repository

More from this repository

APM Spec Guardian -- Four-Panel Advisory Review for OpenAPM

Activation scope

Architecture invariants

Agent roster

Topology

Wave 0 -- scope decision

Wave 3 -- panel fan-out

Wave 4 -- synthesizer

Wave 5 -- linter

Wave 6 -- render the comment

Output contract (non-negotiable)

Loop budget

Gotchas

Relationship to apm-review-panel

APM Spec Guardian -- Four-Panel Advisory Review for OpenAPM

Activation scope

Architecture invariants

Agent roster

Topology

Wave 0 -- scope decision

Wave 3 -- panel fan-out

Wave 4 -- synthesizer

Wave 5 -- linter

Wave 6 -- render the comment

Output contract (non-negotiable)

Loop budget

Gotchas

Relationship to apm-review-panel