| name | scafforge-audit |
| description | Run Scafforge's host-side diagnosis flow for an existing repository. Use when a repo needs workflow diagnosis, contract verification, professional codebase review, or a four-report diagnosis pack with evidence-backed ticket recommendations and no repair edits. |
Scafforge Audit
Use this skill to inspect an existing repository in full diagnostic mode without mutating it.
This is the host-side diagnosis surface. It replaces the old mixed doctor-plus-bridge behavior by keeping diagnosis, review evidence intake, and report generation together in one non-mutating skill.
Every audit run produces the full four-report diagnosis pack.
Every diagnosis pack also persists disposition-bundle.json and package-evidence-bundle.json so package work can be staged into active-audits/ without inventing a second artifact family.
Use ../../references/competence-contract.md as the package-level bar for whether the workflow is actually competent.
This skill now emits code-quality findings as well as workflow findings, including EXEC families for stack-specific execution failures and REF families for broken canonical references.
When to use this skill
- The user asks for diagnosis, review, audit, or report generation
- scaffold-kickoff reaches the retrofit audit step or an explicit diagnosis/review flow
- A managed repo needs a current-state diagnosis before any repair work
- A PR or review thread has findings that need evidence validation and ticket recommendations
- A generated repo needs a diagnosis pack that the user will manually carry into the Scafforge dev repo for package work
If the user explicitly asks to repair or refresh the managed workflow layer, route to ../scafforge-repair/SKILL.md instead.
Procedure
1. Establish scope and evidence
Read the repo state first.
- Inspect workflow surfaces, docs, ticketing, and managed state
- Inspect diagnosis/ and .opencode/meta/bootstrap-provenance.json to determine whether this is a repeat audit after a prior repair attempt
- If the repo already has repeated diagnosis packs with materially identical repair-routed findings and no newer Scafforge package or process-version change, treat that as audit churn and stop recommending another subject-repo audit-first loop
- Inspect .opencode/state/invocation-log.jsonl when it exists and treat coordinator-authored specialist artifacts there as suspect workflow evidence
- If session logs or transcript exports were supplied, inspect them before current-state reconciliation and treat them as first-class temporal evidence
- If the transcript shows the operator spending time figuring out how to move forward on basic workflow, treat that confusion as package evidence; the workflow should already expose one legal next move
- Treat coordinator narration inside supplied logs as candidate explanation, not ground truth; prefer concrete tool calls, tool outputs, tool errors, and current repo state when deciding what actually failed and why
- Reconstruct transcript chronology explicitly when logs are supplied:
  - repeated lifecycle errors
  - workaround or bypass attempts
  - broken or non-executable tool calls
  - deterministic tool-execution defects where the tool surface itself mis-parses a valid request before the intended command starts
  - verification failures
  - later executable recovery evidence
  - later PASS claims or artifact publication
  - coordinator-authored specialist artifacts
- If review comments, PR notes, or external findings were provided, treat them as candidate evidence only
- Apply the evidence and non-taint rules from references/review-contract.md
Do not convert an unverified claim into a canonical finding.
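The chronology reconstruction listed above can be sketched as a single pass over ordered log events. This is a minimal illustration, not the audit script's logic: the LogEvent shape, its source labels, and the assumption that a tool_output event represents successful executable evidence are all hypothetical, since the real log schema is not defined in this document.

```python
from dataclasses import dataclass


@dataclass
class LogEvent:
    # Hypothetical event shape, used for illustration only.
    order: int    # position in the transcript
    source: str   # "tool_call", "tool_output", "tool_error", or "narration"
    text: str


def unproven_pass_claims(events):
    """Return the order of each narrated PASS claim that lacks executable proof.

    A claim counts as unproven when no tool_output has been seen since the
    most recent tool_error (or since the start of the transcript).
    """
    proven = False
    flagged = []
    for event in sorted(events, key=lambda e: e.order):
        if event.source == "tool_error":
            proven = False          # a failure invalidates earlier proof
        elif event.source == "tool_output":
            proven = True           # assumed to be successful evidence
        elif event.source == "narration" and "PASS" in event.text and not proven:
            flagged.append(event.order)
    return flagged
```

Narration is only ever flagged or ignored here; it never sets the proven state, which mirrors the rule that coordinator narration is candidate explanation rather than ground truth.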
If this is a repeat audit, explain why the previous audit-to-repair cycle failed before recommending another repair run.
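The audit-churn check from step 1 can be approximated as a comparison of finding codes between the previous and current diagnosis packs. This is a rough sketch under stated assumptions: the function name and parameters are hypothetical, and plain set equality stands in for "materially identical", which the real audit may judge more carefully.

```python
def is_audit_churn(previous_codes, current_codes, package_or_process_changed):
    """Return True when a repeat audit only restates the prior repair-routed findings.

    package_or_process_changed should be True when a newer Scafforge package
    or process version exists since the previous diagnosis pack.
    """
    if package_or_process_changed:
        return False  # a newer package or process version justifies a fresh audit
    # Assumption: identical code sets count as "materially identical" findings.
    return bool(previous_codes) and set(previous_codes) == set(current_codes)
```

When this returns True, the next step is to explain why the previous audit-to-repair cycle failed rather than to recommend another subject-repo audit.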
2. Run the audit script
The script is evidence extraction, not the whole diagnosis.
Its rule logic should stay grouped by invariant family in code modules, not keep growing only as prose in references.
Its report generation and ticket-recommendation assembly should also stay in code modules rather than turning this skill doc into a second implementation surface.
New smell codes should land with rule implementation plus regression coverage, not just another narrative note.
For transcript-backed audits, the invoker must do all three steps in order:
- reconstruct chronology from the supplied logs before running the script
- run the script for deterministic candidate findings and repo evidence extraction
- reconcile the script output against the chronology and current repo truth before presenting final findings
When operating from the Scafforge package root, run:
python3 skills/scafforge-audit/scripts/audit_repo_process.py <repo-root> --format both --emit-diagnosis-pack
If your current working directory is the skill directory itself, the equivalent relative command is:
python3 scripts/audit_repo_process.py <repo-root> --format both --emit-diagnosis-pack
For standalone non-OpenCode usage, still prefer:
python3 skills/scafforge-audit/scripts/run_audit.py <repo-root> --format both
The canonical repo-root runner in Scafforge executes from the package root, so do not assume the skill directory is the shell cwd when copying commands from this document.
Pass --supporting-log <path> for each supplied session log or transcript export.
If the audited repo is outside the current host's writable roots, pass --diagnosis-output-dir <writable-path> so the diagnosis pack is still emitted in a host-writable location.
Use --diagnosis-kind initial_diagnosis for the first subject-repo diagnosis, --diagnosis-kind post_package_revalidation for the single fresh audit after Scafforge package changes land, and reserve post_repair_verification for the public repair runner.
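For callers that assemble the invocation programmatically, the flags above can be combined as in the following sketch. Only the script path and flag names come from this document; the helper name, its parameters, and the flag ordering are assumptions.

```python
def build_audit_command(repo_root, supporting_logs=(), diagnosis_kind=None, output_dir=None):
    """Assemble the package-root audit invocation as an argv list."""
    argv = [
        "python3",
        "skills/scafforge-audit/scripts/audit_repo_process.py",
        str(repo_root),
        "--format", "both",
        "--emit-diagnosis-pack",
    ]
    # One --supporting-log flag per supplied session log or transcript export.
    for log in supporting_logs:
        argv += ["--supporting-log", str(log)]
    # Only needed when the audited repo is outside the host's writable roots.
    if output_dir is not None:
        argv += ["--diagnosis-output-dir", str(output_dir)]
    if diagnosis_kind is not None:
        argv += ["--diagnosis-kind", diagnosis_kind]
    return argv
```

The resulting list can be passed to subprocess.run with the Scafforge package root as the working directory, since the script path is relative to that root.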
It produces:
- a markdown audit report
- a JSON audit report
- the timestamped diagnosis pack in <repo-root>/diagnosis/<YYYYMMDD-HHMMSS>/ or another writable host-selected output directory
The script diagnoses only. It does not modify files.
Treat every script finding as a candidate until the invoker has reconciled it against the supplied logs and the current repo.
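As a small illustration of the default pack location, the timestamped directory name can be derived as follows. This sketch covers only the default <repo-root>/diagnosis/ case; the helper name is hypothetical, and whether the script uses local or UTC time is not stated here, so the timestamp is passed in explicitly.

```python
from datetime import datetime
from pathlib import Path


def default_diagnosis_pack_dir(repo_root, now):
    """Return <repo-root>/diagnosis/<YYYYMMDD-HHMMSS> for the given timestamp."""
    return Path(repo_root) / "diagnosis" / now.strftime("%Y%m%d-%H%M%S")
```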
3. Interpret findings against the repair contract
Read:
- references/process-smells.md
- references/repair-playbook.md
- references/safe-stage-contracts.md
For each finding, identify:
- the concrete problem
- the root cause
- the safer target pattern
- whether the issue is workflow-layer drift, source-layer implementation drift, or review noise
- whether the issue belongs to EXEC or REF code-quality families rather than the managed workflow layer
- whether the issue is a host prerequisite blocker such as missing uv, pytest, rg, git identity, or diagnosis-pack output permissions
- when logs were supplied, whether the issue is a historical chronology failure, a current-state repo failure, or both
- whether the script output needs to be amended, merged, downgraded, or rejected after chronology review
- whether the repo-local workflow explainer, coordinator prompt, and tool contract agree on the same lifecycle semantics
- whether deterministic execution tools such as smoke_test can actually execute repo-standard explicit overrides, including shell-style KEY=VALUE cmd ... forms
- whether smoke_test honors ticket-specific acceptance commands before falling back to generic repo-level smoke detection
- whether failed bootstrap artifacts show command traces that contradict the repo's declared dependency layout, so a managed environment_bootstrap defect is not misclassified as operator-only rerun guidance
- whether the resume truth hierarchy keeps tickets/manifest.json plus .opencode/state/workflow-state.json canonical over derived restart surfaces
- whether pending backlog process verification is merely visible current state or an actual workflow defect that the repo is hiding or contradicting
- whether lease ownership is consistently coordinator-owned or still split across worker prompts
- whether the workflow exposed one clear legal next move or forced the operator to infer a workaround from contradictory surfaces
- which adjacent surfaces should be inspected before the issue is considered understood:
  - lifecycle thrash or bypass-seeking -> prompts, ticket_lookup, ticket-execution, and stage gates
  - smoke-test defects -> ticket acceptance scope, smoke_test, and team-leader guidance together
  - closeout contradictions -> handoff_publish, restart surfaces, and lease rules together
  - downstream PR or review gate confusion -> the adjacent orchestration wrapper contract, generated workflow docs, and repo-owned stage-gate evidence together
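The adjacent-surface routing above can be held as a lookup table so an invoker can list every surface to inspect for a set of symptoms. This is a sketch only: the symptom keys are hypothetical labels, and the surface names are copied from the list above.

```python
# Symptom keys are hypothetical labels; surface names come from the list above.
ADJACENT_SURFACES = {
    "lifecycle_thrash_or_bypass": ["prompts", "ticket_lookup", "ticket-execution", "stage gates"],
    "smoke_test_defect": ["ticket acceptance scope", "smoke_test", "team-leader guidance"],
    "closeout_contradiction": ["handoff_publish", "restart surfaces", "lease rules"],
    "downstream_review_gate_confusion": [
        "adjacent orchestration wrapper contract",
        "generated workflow docs",
        "repo-owned stage-gate evidence",
    ],
}


def surfaces_to_inspect(symptoms):
    """Return the de-duplicated surfaces for the given symptoms, in first-seen order."""
    seen = []
    for symptom in symptoms:
        for surface in ADJACENT_SURFACES.get(symptom, []):
            if surface not in seen:
                seen.append(surface)
    return seen
```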
4. Validate review findings when present
If the request includes PR comments, review notes, or claimed bugs:
- inspect the cited implementation directly
- compare the claim against the current repo contract and actual code
- reject unsupported, outdated, or tainted findings
- keep validated findings with tight file evidence
- when the input is a session transcript, explain stale early-state evidence separately from later reasoning failures
This skill owns professional review validation and ticket recommendation generation at the host layer. Do not route to a separate PR-bridge skill.
5. Generate the four-report diagnosis pack
Produce the report pack described by the diagnosis docs on every run.
Use:
- references/four-report-templates.md for report structure
- references/pr-review-workflow.md for review-triage procedure
- references/review-contract.md for evidence grades, ownership classification, and ticket-proposal rules
Required outputs:
- Report 1: 01-initial-codebase-review.md
- Report 2: 02-scafforge-process-failures.md
- Report 3: 03-scafforge-prevention-actions.md
- Report 4: 04-live-repo-repair-plan.md
- disposition-bundle.json
- package-evidence-bundle.json
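A quick completeness check over an emitted pack might look like the sketch below. The file names are the required outputs listed above; the helper name and the assumption that all six files sit at the top level of the pack directory are mine.

```python
from pathlib import Path

REQUIRED_PACK_FILES = (
    "01-initial-codebase-review.md",
    "02-scafforge-process-failures.md",
    "03-scafforge-prevention-actions.md",
    "04-live-repo-repair-plan.md",
    "disposition-bundle.json",
    "package-evidence-bundle.json",
)


def missing_pack_files(pack_dir):
    """Return the required file names that are absent from the pack directory."""
    pack = Path(pack_dir)
    # Assumption: required files live directly under the pack directory.
    return [name for name in REQUIRED_PACK_FILES if not (pack / name).is_file()]
```

An empty result means the pack contains every required output; it says nothing about whether the report contents are valid.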
At minimum, the pack must capture:
- validated findings, severity, evidence grade, and file references
- explicit split between workflow findings and code-quality findings
- supporting session logs or transcript exports when supplied
- whether a previous diagnosis and repair cycle already failed, and which workflow-layer findings persisted
- whether the transcript shows workflow thrash, bypass-seeking, or evidence-free PASS claims
- whether the transcript shows softer dependency-override or “close it anyway” reasoning even without literal bypass wording
- whether the transcript shows coordinator-authored specialist artifacts or a recovery run that clears an earlier verification failure
- ownership classification for each issue: package defect, repo-local defect, mixed defect, managed-surface drift, or source bug
- rejected or outdated external claims when review evidence was supplied
- Scafforge prevention actions needed in the package repo
- live-repo repair actions that can happen only after the package changes are available
- a clear split between historical session truth and current repo truth whenever the repo changed after the logged session
- the package-evidence bundle fields needed for package investigation: repo name, diagnosis kind, triggering finding codes, disposition summary, audit-pack path, report paths, restart-surface refs, workflow-state ref, provenance refs, and candidate package surfaces
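The explicit split between workflow findings and code-quality findings can be illustrated with a prefix-based partition. This is a simplification for illustration: it assumes family membership can be read from the code prefix (EXEC or REF), which this document implies but does not guarantee, and it places every other code in the workflow bucket.

```python
def split_findings(codes):
    """Partition finding codes into code-quality (EXEC/REF) and workflow buckets."""
    code_quality, workflow = [], []
    for code in codes:
        # Assumption: EXEC and REF families are identifiable by code prefix.
        if code.startswith(("EXEC", "REF")):
            code_quality.append(code)
        else:
            workflow.append(code)
    return {"code_quality": code_quality, "workflow": workflow}
```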
Report 4 must:
- distinguish safe repairs from intent-changing repairs
- identify whether scafforge-repair should run
- state whether Scafforge package work must happen first
- include ticket recommendations for post-repair or source-layer follow-up, especially when EXEC or REF findings require remediation tickets instead of repair-time source edits
- optionally emit a machine-readable recommended-ticket payload when useful
5a. Stage package evidence and investigation inputs
When Report 4 says package work must happen first, stop at the diagnosis pack and move into the package-side evidence chain.
- Stage the raw diagnosis pack into the package workspace:
python3 skills/scafforge-audit/scripts/stage_active_audit.py <diagnosis-dir-or-manifest> --agent-log <audit-log>
- Treat active-audits/<repo>/evidence-manifest.json as the machine-readable package investigation input. The current escalation matrix is intentionally conservative:
  - severity error managed-surface contradictions are always eligible package evidence
  - severity warning package-managed families escalate only when the same failure family is already present in at least one other distinct repo under active-audits/, or when the current diagnosis still says Scafforge package work is required first
  - repo-local EXEC* and REF* findings stay subject-repo follow-up unless the authoritative bundle marks them as package-managed
- Write the investigator outputs under active-audits/<repo>/investigator/:
python3 skills/scafforge-audit/scripts/write_investigator_report.py active-audits/<repo> --symptom-summary "..." --cause-hypothesis "..." --prevented-by "..." --surface <path> --revalidation-step "..."
- After a normal reviewed package PR exists, record the package-fix linkage under active-audits/<repo>/fixer/:
python3 skills/scafforge-audit/scripts/write_package_fix_record.py active-audits/<repo> --status open --package-pr <pr-ref> --package-issue <issue-ref> --validation "npm run validate:contract=passed" --validation "npm run validate:smoke=passed" --validation "python3 scripts/integration_test_scafforge.py=passed" --validation "python3 scripts/validate_gpttalker_migration.py=passed"
These investigator and fixer commands are bounded package-maintainer scripts, not public skills. They write evidence sidecars and review metadata under active-audits/; they do not mutate package code or generated repos directly, so separate skill-manifest registration is not required.
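The escalation matrix in this step can be read as a small decision function. The sketch below is one interpretation, not the staging script's code: the Finding fields and kind labels are hypothetical, and any combination the matrix does not mention is treated as not eligible, in keeping with its conservative intent.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical shape; kind labels are illustrative, not real bundle values.
    severity: str                       # "error" or "warning"
    kind: str                           # "managed_surface_contradiction",
                                        # "package_managed_family", or "repo_local_exec_ref"
    other_repos_with_family: int = 0    # distinct other repos under active-audits/
    marked_package_managed: bool = False


def is_package_evidence(finding, package_work_required_first):
    """Apply the escalation matrix to one finding."""
    if finding.kind == "repo_local_exec_ref":
        # Stays subject-repo follow-up unless the bundle marks it package-managed.
        return finding.marked_package_managed
    if finding.severity == "error" and finding.kind == "managed_surface_contradiction":
        return True
    if finding.severity == "warning" and finding.kind == "package_managed_family":
        return finding.other_repos_with_family >= 1 or package_work_required_first
    return False  # assumption: unlisted combinations do not escalate
```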
6. Decide the next workspace, then stop
This skill is non-mutating.
- If no repair is needed, record a clean diagnosis result
- If the diagnosis identifies a Scafforge package defect or prevention gap, stop after writing the diagnosis pack
- When an adjacent orchestration wrapper is in play, package-owned findings move the job into package-change-pending outside the repo, while safe repo-local repair findings move it into repo-repair-pending
- Stage the diagnosis pack from the subject repo into active-audits/<repo>/ with stage_active_audit.py; keep the copied raw evidence immutable
- Apply Scafforge package changes in the Scafforge dev repo before recommending repair
- Return to the subject repo and run exactly one fresh post_package_revalidation audit against the updated package before recommending repair; stage that pack again so active-audits/<repo>/revalidation/resume-ready.json can carry the downstream resume truth
- Route to ../scafforge-repair/SKILL.md only from that fresh revalidation pack once package work is no longer the next required step
- If safe workflow repair is needed but no package work is required and the current repair surface already matches Report 4, repair may be the next separate step
- If only source-layer follow-up is needed, recommend tickets through ticket-pack-builder
Do not modify the repo from this skill.
Do not tell the user to go straight from report generation to repair unless you have first confirmed that the required Scafforge-side changes already exist.
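The routing in this step reduces to a priority order: package work first, then safe repair, then source-layer tickets, then a clean handoff. A minimal sketch, with hypothetical parameter names and return labels:

```python
def next_workspace(package_work_required, safe_repair_needed, source_followup_needed):
    """Pick the next step after the diagnosis pack is written.

    Inputs should reflect the current pack. After package changes land,
    call this again with the fresh post_package_revalidation results.
    """
    if package_work_required:
        # Stop here: stage the pack into active-audits/ and fix the package first.
        return "package-work-first"
    if safe_repair_needed:
        return "scafforge-repair"
    if source_followup_needed:
        return "ticket-pack-builder"
    return "handoff-brief"  # clean audit
```

Checking package work before repair encodes the rule that repair is never recommended until the required Scafforge-side changes exist.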
How this differs from scafforge-repair
- scafforge-audit diagnoses, validates evidence, and recommends next actions
- scafforge-repair consumes diagnosis outputs and applies safe managed-surface repairs
Keep those responsibilities separate.
After this step
- Continue to ../handoff-brief/SKILL.md after a clean audit
- Ask the user to stage the diagnosis pack into active-audits/ when Report 4 identifies Scafforge package work
- Continue to ../scafforge-repair/SKILL.md only after those package changes are implemented and the subject repo is back in scope
Required outputs
- Full diagnosis scope
- Evidence-backed findings only
- Root cause for each validated finding
- Prior diagnosis/repair-cycle analysis when this is a repeat audit
- Four-report diagnosis pack
- Clear repair recommendation boundary
- Ticket recommendations for post-audit follow-up where needed
- Explicit statement that no repo edits were made
Rules
- Keep diagnosis non-mutating
- Treat missing host prerequisites as first-class findings instead of silent verification skips
- Do not preserve contradictory workflow semantics because they already exist
- Do not accept review claims without repo evidence
- Do not let PR comments taint canonical state
- Do not answer a supplied causal transcript question with current-state findings alone
- Do not treat script output as self-sufficient when the user asked about a supplied transcript or session log
- Do not keep recommending another subject-repo audit when repeated diagnosis packs show the same repair-routed findings and no newer package or process-version change exists
- Do not collapse the intended loop into audit -> package patch -> many more audits. After package work, one fresh post-package revalidation audit is the only normal subject-repo audit before repair.
- Do not treat repeated lifecycle retries, unsupported-stage probing, or PASS artifacts without executable proof as harmless transcript noise
- Do not collapse a transcript-backed tool-execution defect into a generic test failure when the tool never launched the requested command
- Do not collapse acceptance-command drift in smoke_test into a generic failing-test finding when the tool ran the wrong smoke scope
- Do not collapse repeated incompatible bootstrap command traces into a generic ENV002 rerun recommendation when the managed bootstrap surface is the reason the expected dependency flags never ran
- Do not treat pending_process_verification by itself as a package defect when restart surfaces and routing already expose it truthfully
- Treat missing or failed current-cycle handoff proof as a first-class workflow defect when restart surfaces still claim ready or complete state
- For Godot downstream reliability, keep EXEC-GODOT-013, EXEC-GODOT-014, and EXEC-GODOT-015 explicit instead of collapsing them into generic drift
- Do not collapse stale source/follow-up graph contradictions into generic ticket noise when the repo lacks a canonical ticket_reconcile path
- Do not treat open-parent child decomposition as ordinary remediation when split_scope routing is missing or drifted
- Do not treat operator confusion as mere user error when the workflow did not expose one clear legal next move
- Do not accept a later zero-finding verification audit as proof of repair if the earlier transcript-backed causal basis was dropped
- Keep workflow-layer findings separate from source-layer implementation findings
- Fold review validation into this skill instead of reviving a separate bridge
References
references/process-smells.md — workflow smells covered by the audit
references/repair-playbook.md — repair targets and safe-versus-intent boundary
references/safe-stage-contracts.md — stage contract definitions
references/review-contract.md — evidence validation and non-taint rules for review findings
references/pr-review-workflow.md — review-comment intake, validation, and routing workflow
references/four-report-templates.md — required structure for the diagnosis-pack reports