
phira-impl-vs-claim-audit

Stars: 10

Forks: 3

Updated: March 1, 2026 at 11:38

Procedure to audit alignment between a document's claims (e.g., algorithm description) and the actual implementation and defaults.


Source

phira-ai

phira-ai/Phira



SKILL.md


name	phira-impl-vs-claim-audit
description	Procedure to audit alignment between a document's claims (e.g., algorithm description) and the actual implementation and defaults.

Use this skill when someone provides a document describing what was implemented (paper, design doc, README section, notes) and you need to check whether the code actually matches what the document claims.

Goal: detect "claim drift" (code does not implement the described algorithm/behavior) and "default drift" (defaults/baseline behavior changed unintentionally).

Inputs you should request (if missing)

The document excerpt(s): exact text, with section headings (and page/figure/equation numbers if applicable).
Implementation context: changed files/diff, new flags/config keys, and the intended entrypoint(s).
Any stated compatibility constraints (env versions, dependency policy).

Audit checklist

Extract and restate the document claims

Pull out 3-10 testable claims from the document.
Separate:
- Algorithmic steps ("do A then B")
- Definitions (symbols, losses, schedules)
- Defaults (what happens if no flags are set)
- Expected outputs/logs/metrics
Restate each claim as a single sentence in testable language.
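One way to keep the extracted claims organized is a small record per claim, tagged with one of the four categories above. The sketch below is only an illustration: the `Claim` and `ClaimKind` names, the citations, and the example statements are all hypothetical and do not come from any real document.

```python
from dataclasses import dataclass
from enum import Enum


class ClaimKind(Enum):
    """The four claim categories named in the checklist."""
    ALGORITHM_STEP = "algorithmic step"
    DEFINITION = "definition"
    DEFAULT = "default"
    EXPECTED_OUTPUT = "expected output"


@dataclass(frozen=True)
class Claim:
    citation: str    # where the claim appears (section/page/equation)
    kind: ClaimKind  # which category it belongs to
    statement: str   # one-sentence, testable restatement


def claim_count_ok(claims, low=3, high=10):
    """Return True if the number of extracted claims is within 3-10."""
    return low <= len(claims) <= high


# Hypothetical claims, invented for illustration only.
claims = [
    Claim("Sec. 3.1, Eq. 4", ClaimKind.DEFINITION,
          "The auxiliary loss is averaged over the batch."),
    Claim("Sec. 3.2", ClaimKind.ALGORITHM_STEP,
          "The target is computed before the parameter update."),
    Claim("Sec. 5", ClaimKind.DEFAULT,
          "With no flags set, training matches the baseline."),
]

print(claim_count_ok(claims))
```

Grouping claims by `kind` afterwards makes it easier to see whether any category, defaults in particular, was overlooked.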

Map document -> code

For each extracted claim, identify:
- where it is implemented (file + symbol)
- how it is triggered (config/flag/entrypoint)
- what the default does
- what is intentionally omitted or approximated (if any)
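The four questions above can be treated as required fields, so that an unanswered one is visible rather than silently skipped. This is a minimal sketch under assumptions: the field names, the convention of writing an explicit "none" for omissions, and the file and symbol names in the example are all hypothetical.

```python
# Hypothetical field names, one per question in the mapping step.
REQUIRED_FIELDS = ("location", "trigger", "default_behavior", "omissions")


def incomplete_mappings(mappings):
    """Return {claim: [missing fields]} for claims with unanswered questions.

    Assumed convention: "omissions" must be filled in explicitly, using the
    string "none" when nothing was omitted or approximated.
    """
    gaps = {}
    for claim, fields in mappings.items():
        missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
        if missing:
            gaps[claim] = missing
    return gaps


# Invented example; the paths and flags do not refer to a real codebase.
mappings = {
    "aux loss is batch-averaged": {
        "location": "losses.py:aux_loss",
        "trigger": "--use-aux-loss",
        "default_behavior": "disabled",
        "omissions": "none",
    },
    "target computed before update": {
        "location": "train.py:train_step",
        "trigger": "",  # not yet identified
        "default_behavior": "always on",
    },
}

print(incomplete_mappings(mappings))
```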

Default drift

Verify new code paths are gated when they should be.
Verify default config values preserve baseline behavior unless explicitly changed.

If the document describes a new algorithm but the feature is intended to be opt-in, explicitly confirm the gating and defaults.
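A default-drift check can be as simple as diffing the default configuration before and after the change. The sketch below assumes the defaults are available as plain dictionaries; the keys and values are hypothetical, and treating any new key whose default is not `False`, `None`, or zero as "ungated" is a heuristic, not a rule from the procedure.

```python
def default_drift(baseline, current, off_values=(False, None, 0)):
    """Compare two default-config dicts and report likely drift.

    Returns a dict with:
      changed: keys present in both whose default value differs
      removed: baseline keys that no longer exist
      ungated: new keys whose default looks "on" (heuristic assumption)
    """
    changed = {
        key: (baseline[key], current[key])
        for key in baseline
        if key in current and baseline[key] != current[key]
    }
    removed = sorted(key for key in baseline if key not in current)
    ungated = {
        key: value
        for key, value in current.items()
        if key not in baseline and value not in off_values
    }
    return {"changed": changed, "removed": removed, "ungated": ungated}


# Hypothetical defaults, invented for illustration.
baseline = {"lr": 1e-3, "reduction": "mean", "warmup_steps": 0}
current = {
    "lr": 1e-3,
    "reduction": "sum",    # silently changed default
    "use_aux_loss": True,  # new feature enabled by default
    "aux_weight": 0.0,     # new key, but inert by default
}

report = default_drift(baseline, current)
print(report)
```

Any entry under `changed` or `ungated` is worth confirming against the document: either the change is stated explicitly, or it is drift.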

Evidence quality (did we actually test the claim?)

Check whether any reported commands actually exercise the described algorithm path.
Prefer evidence that would fail if the algorithm were not implemented (assertions, characteristic logs, shape/value invariants).

If there is no evidence, propose the smallest decisive check(s): a tiny run, a unit test, or a "print the computed term" invariant.
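As an illustration of a "smallest decisive check", the sketch below tests a toy loss function in a way that fails if the claimed term is missing or mis-scaled, and also fails if the default path changed. The functions `total_loss` and `broken_loss` and the `aux_weight` parameter are hypothetical stand-ins for the code under audit.

```python
def total_loss(base, aux, aux_weight=0.0):
    """Hypothetical loss under audit: base loss plus a weighted auxiliary term."""
    return base + aux_weight * aux


def broken_loss(base, aux, aux_weight=0.0):
    """Hypothetical faulty variant: accepts the weight but never uses it."""
    return base


def check_aux_term(loss_fn, base=1.0, aux=2.0, weight=0.5, tol=1e-9):
    """Fail unless the default is the baseline and the aux term is applied."""
    off = loss_fn(base, aux)                     # default path, no flag set
    on = loss_fn(base, aux, aux_weight=weight)   # feature path
    assert abs(off - base) <= tol, "default path differs from baseline"
    assert abs((on - off) - weight * aux) <= tol, "aux term missing or mis-scaled"


check_aux_term(total_loss)  # passes silently

try:
    check_aux_term(broken_loss)
except AssertionError as err:
    print("caught:", err)
```

A command that merely runs to completion would pass for both variants, which is why it counts as weak evidence; the invariant above distinguishes them.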

Algorithmic fidelity traps

Look for the common mismatches between docs and code:
- wrong loss term sign/scaling/normalization
- missing detach/stop-gradient
- schedule differs (warmup, EMA, anneal)
- sampling differs (teacher forcing vs free running, masking)
- batch reduction differs (mean vs sum)
- randomness/seed placement differs
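To make one of these traps concrete, the sketch below compares a loss as documented (mean over the batch) with a hypothetical implementation that sums instead. A central-difference gradient shows that the two differ by a factor equal to the batch size, which silently rescales the effective learning rate. The functions and data are invented for illustration.

```python
def doc_loss(w, xs, ys):
    """Loss as the document describes it: mean squared error."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def impl_loss(w, xs, ys):
    """Hypothetical implementation under audit: sums instead of averaging."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))


def numeric_grad(f, w, eps=1e-6):
    """Central-difference estimate of df/dw at w."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 1.0

grad_doc = numeric_grad(lambda v: doc_loss(v, xs, ys), w)
grad_impl = numeric_grad(lambda v: impl_loss(v, xs, ys), w)

# The ratio should be close to the batch size (4 here) if the only
# difference between the two losses is mean versus sum reduction.
ratio = grad_impl / grad_doc
print(f"gradient ratio (impl / doc): {ratio:.3f}")
```

The same pattern, comparing a reference written directly from the document against the implementation on a tiny input, also exposes sign and scaling errors.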

Integration seams

Look for the classic breakpoints: config plumbing, checkpoint load/save, distributed wrappers, dtype/device moves, shape conventions.

Required output (table)

Return a table like this:

Doc claim (with citation) -> Implementation (file:symbol + gate/default) -> Evidence -> Risk / Mismatch
- <claim 1 @ section/page/eq> -> <where + how enabled + default> -> <command/log/assertion or "none"> -> <mismatch or remaining risk>
- <claim 2 @ ...> -> ...

End with:

Verdict: PASS | FAIL
Smallest fix list (if FAIL)
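The output format above can be produced mechanically once the rows are collected. The following is a sketch only: the `AuditRow` fields, the rule that any recorded mismatch yields FAIL, and the example rows are assumptions made for illustration, not part of the procedure.

```python
from dataclasses import dataclass

HEADER = ("Doc claim (with citation) -> Implementation (file:symbol + "
          "gate/default) -> Evidence -> Risk / Mismatch")


@dataclass
class AuditRow:
    claim: str            # claim text with its citation
    implementation: str   # where + how enabled + default
    evidence: str = "none"
    risk: str = "none"    # mismatch or remaining risk
    fix: str = ""         # hypothetical field: non-empty marks a mismatch


def render_report(rows):
    """Render rows in the arrow format, then the verdict and fix list."""
    lines = [HEADER]
    for row in rows:
        lines.append(
            f"- {row.claim} -> {row.implementation} -> {row.evidence} -> {row.risk}"
        )
    fixes = [row.fix for row in rows if row.fix]
    lines.append("")
    # Assumed rule: any row that needs a fix makes the whole audit FAIL.
    lines.append(f"Verdict: {'FAIL' if fixes else 'PASS'}")
    if fixes:
        lines.append("Smallest fix list:")
        lines.extend(f"- {fix}" for fix in fixes)
    return "\n".join(lines)


# Invented rows; file names and flags do not refer to a real codebase.
rows = [
    AuditRow("aux loss is batch-averaged @ Sec. 3.1",
             "losses.py:aux_loss, --use-aux-loss, default off",
             evidence="unit test on a 4-sample batch",
             risk="implementation sums instead of averaging",
             fix="divide the aux loss by the batch size"),
    AuditRow("defaults match baseline @ Sec. 5",
             "config.py:DEFAULTS, always active",
             evidence="config diff against baseline"),
]

print(render_report(rows))
```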
