with one click
adversarial-pr-review
Adversarial spec-compliance PR review — cross-references diffs against approved specs, verifies runtime claims against source, detects competing PRs, audits scope/convention compliance. Use before merging.
Menu
Adversarial spec-compliance PR review — cross-references diffs against approved specs, verifies runtime claims against source, detects competing PRs, audits scope/convention compliance. Use before merging.
Generate intelligence briefings for the planning dashboard. Use this skill whenever a request file appears in `.planning/console/inbox/pending/` whose `type` field starts with `intelligence.` (refresh_briefing, triage_finding, snapshot_diff, investigate_section, dismiss_finding). The skill reads the per-project KB at `.gsd/intelligence/intelligence.db`, synthesizes a briefing with a causal hypothesis + acknowledged uncertainty + confidence label and ranked moves, then writes the result back to the KB. Always trigger this skill for these request types — do not generate briefings manually.
Sketched Isotropic Gaussian Regularization primitive. Scalar loss matching the embedding distribution to a standard-normal target via Cramér-Wold slicing and the Epps-Pulley empirical characteristic function test. Port of rbalestr-lab/lejepa (MIT). Default-off in v1.49.571.
Pick among candidate outputs (code, configs, plans) by running them on diverse inputs and clustering by behavioural fingerprint, rather than by textual aggregation or log-probability. Activates when an executor returns multiple plausible candidates that need disambiguation, when output-majority voting would be the default choice, or when reviewing generated code that has not yet been validated. The 2026 evidence (Semantic Voting, arxiv 2605.08680v1) is that any execution-based selector dominates output-majority voting by 19-52pp; sketch-generated inputs beat random fuzz by 11.3pp. Triggers: "pick the best candidate", "majority vote on code", "select from N samples", "validate the generated output", "behavioural verification".
Extract creative intent from images into executable build specs. Activates on images + build intent, "image to mission", "i2m", or capturing visual energy in code/design.
Classify the information-need of a query and dispatch it to the appropriate retrieval or reasoning strategy. Use before read-side memory access, before multi-strategy retrieval, or any time you'd otherwise default to "one retriever for everything". Returns a strategy label, a token budget, and a retrieval depth so downstream handlers can be specialised. Backed by Pre-Route (arxiv 2605.10235v2) and MemFlow (arxiv 2605.03312v1), which together show LLMs possess latent routing ability elicitable via a structured prompt — and that externalising the routing decision improves small-model performance by ~2x. Triggers: "route this", "what strategy", "before retrieving", "intent classification", or any query whose ideal handling depends on what KIND of question it is.
Audit a skill by running a paired probe — the same task once with the skill loaded and once without — segment both traces into goal-directed phases, align phases, and emit a SIP report (surface anchoring, template copy, excess planning, task recovery, off-task artifact). Use whenever a skill is created, modified, or proposed for retirement. Pass-rate is BLIND to most skill effects: CTA (arxiv 2605.11946v1) shows a single skill can produce 522 measurable behavioural changes across 49 tasks while pass-rate moves only +0.3%. Triggers: "audit this skill", "is this skill helping", "retire skill", "before shipping skill", "behavioural impact of skill X", or any skill review event.
| name | adversarial-pr-review |
| description | Adversarial spec-compliance PR review — cross-references diffs against approved specs, verifies runtime claims against source, detects competing PRs, audits scope/convention compliance. Use before merging. |
| version | 1.0.0 |
| user-invocable | true |
| format | "2025-10-02T00:00:00.000Z" |
| status | ACTIVE |
| updated | "2026-04-15T00:00:00.000Z" |
| triggers | ["reviewing a pull request before merge","cross-referencing a diff against an approved spec","auditing PR scope/convention compliance or detecting competing PRs"] |
Adversarial review with spec-compliance focus. Reviews PRs like a hostile-but-fair reviewer — verifies that what the PR claims matches what it actually does and what the approved spec requires.
Accept one of:
--pr <N> — review a single PR--all — review all open PRs by the current author--branch <name> — review the diff on a specific branch against mainFor each PR under review, collect:
gh pr view <N> -R Tibsfox/gsd-skill-creator --json title,body,headRefName,labels,reviewDecision,statusCheckRollup,files
Extract issue number from PR body (Closes #N, Fixes #N, Resolves #N).
gh api repos/Tibsfox/gsd-skill-creator/issues/<N> --jq '{title, labels: [.labels[].name], state, body}'
Verify:
From the issue body or PR description, extract:
gh pr list -R Tibsfox/gsd-skill-creator --state open --search "closes #<issue_number> OR fixes #<issue_number>" --json number,author,title
If multiple PRs target the same issue, flag the competition.
gh pr diff <N> -R Tibsfox/gsd-skill-creator
Or for branch mode:
git diff main...HEAD
Cross-reference every claim against reality.
Compare files in the diff against files listed in the approved issue scope:
| Check | Verdict |
|---|---|
| All in-scope files are modified | PASS / UNDER-DELIVERY |
| Only in-scope files are modified | PASS / SCOPE CREEP |
| Modifications match approved intent | PASS / DEVIATION |
For each acceptance criterion in the approved issue:
Flag any criterion that is:
When the spec defines a data schema, API interface, or config shape:
This repo has strict import boundaries (CLAUDE.md):
src/ never imports desktop/@tauri-apps/apidesktop/ never imports Node.js modulesVerify that runtime references in the code actually exist.
When the diff adds new imports:
# Check the import target exists
grep -r "export.*<name>" src/ desktop/
If the imported symbol doesn't exist, this is BLOCKING.
When code references skills (.claude/skills/) or agents (.claude/agents/):
subagent_type values in Agent calls match defined agent typesWhen code references settings or config fields:
grep -n "<field_name>" .claude/settings.json src/
Verify field names, defaults, and types match the source of truth.
When workflows reference GSD commands (.claude/commands/gsd/):
Check project conventions per CLAUDE.md.
<type>(<scope>): <subject>For test files:
import { describe, it, expect } from 'vitest'any types without justificationFor skill files in .claude/skills/:
name, description requiredversion present for versioned skillsuser-invocable: true only if the skill should be directly callable.planning/ files never committed (gitignored by design)When code interpolates values into shell commands:
When code accepts file paths:
../ traversal in user-provided paths.planning/ files are not accidentally committed.env files, credentials, API keys in the diff.gitignore covers sensitive patternsThis is a self-modifying system — skills, agents, hooks can change behavior:
Beyond spec compliance:
For each PR, produce a structured review:
## Adversarial Review — PR #<N>
### Issue Linkage
- Issue: #<N> — labels: [list] — **PASS** / **FAIL** (reason)
### Scope Compliance
| Check | Status | Detail |
|-------|--------|--------|
| All in-scope files modified | PASS/FAIL | ... |
| No out-of-scope files | PASS/FAIL | ... |
| Import boundaries respected | PASS/FAIL | ... |
### Acceptance Criteria
| Criterion | Implemented | Tested | Correct | Notes |
|-----------|-------------|--------|---------|-------|
| AC-1: ... | Yes/No | Yes/No | Yes/No | ... |
### Runtime Claims
| Claim | Verified | Detail |
|-------|----------|--------|
| Import target exists | Yes/No | ... |
| Skill reference valid | Yes/No | ... |
### Convention Compliance
| Check | Status | Detail |
|-------|--------|--------|
| Conventional Commits | PASS/FAIL | ... |
| Vitest for tests | PASS/FAIL | ... |
| No .planning/ commits | PASS/FAIL | ... |
### Security
| Finding | Severity | Detail |
|---------|----------|--------|
| ... | BLOCKING/WARNING/INFO | ... |
### Adversarial Findings
| Finding | Severity | Detail |
|---------|----------|--------|
| ... | BLOCKING/WARNING/INFO | ... |
### Competing PRs
- None / #<N> by <author> — [comparison notes]
### Verdict
**APPROVE** / **CHANGES REQUESTED** / **CLOSE** (with rationale)
### Required Changes (if any)
1. ...
### Non-blocking Observations
1. ...