architectural-conformance-audit

Pre-R0 sprint gate that diffs implementation vs SOTA research output verbatim. Surfaces cited counter-examples and architectural mismatches before sprint hooks fire. Triggers: 'before R0', 'architectural audit', 'verify against research'. NOT for per-PR review or post-merge.

Install command

npx skills add https://github.com/EtanHey/golems --skill architectural-conformance-audit

Copy and paste this command into Claude Code to install the skill

Source

EtanHey/golems

Stars: 3

Forks: 2

Updated: June 1, 2026 at 10:25

name	architectural-conformance-audit
description	Pre-R0 sprint gate that diffs implementation vs SOTA research output verbatim. Surfaces cited counter-examples and architectural mismatches before sprint hooks fire. Triggers: 'before R0', 'architectural audit', 'verify against research'. NOT for per-PR review or post-merge.
argument-hint	<sprint-name | research-output-path>
disable-model-invocation	false

Skill: architectural-conformance-audit

Wave evidence (severity 10, 4 corroborating digests): AP1 root cause was architectural ground-truth blindness. Researcher's SOTA output at idx [380] LITERALLY cited Letta-on-FastAPI as a counter-example, yet R0→R5 sprint hooks optimized within the daemon assumption anyway. ~35h misdirected work + 2.5h explicit correction. PR #312 fixed it in code (FastAPI daemon deleted, socket-direct CLI added, merged May 22 11:39Z). See pain-points/_consolidated.md Pattern 4 for the full chronology. This skill prevents AP1-class recurrence procedurally.

WHEN TO ACTIVATE

This skill fires at a specific sprint moment: the kickoff of a new R0→R5 sprint OR a large-plan phase that builds atop a research output (whichever fires first in a given build arc). It is NOT a per-PR check; it is a per-sprint gate.

Tier 1 — Mandatory triggers (always invoke)

New sprint kickoff that references a prior researcher SOTA output (research/*.md or similar).
New sprint that modifies / extends architecture established in an earlier PR (PR-α, PR-β, etc.).
After any researcher dispatch whose output explicitly cites counter-examples.
When a sprint is rebooting after an architectural correction (e.g., resuming from an AP1-class failure).

Tier 2 — Recommended triggers (invoke unless explicitly skipped)

New skill development that wraps or extends an MCP/daemon/service.
When R5 evaluator score < parent-objective threshold (suggests local-optimum trap).
When implementation language / framework choice was inherited (not first-principles).

Tier 3 — Manual invocation

User asks "is the architecture correct?" / "verify against research" / "before we go further, let's audit"

THE AUDIT CONTRACT

The audit produces three artifacts before any R0 work begins:

Artifact 1 — SOTA Excerpt (verbatim)

For each cited research output:

Pull the SOTA output file (research/<topic>-research.md or equivalent).
Extract every section that mentions architecture, framework choice, OR counter-examples.
Quote the relevant passages verbatim with source indices.

Output → docs.local/audits/<sprint>/<date>-sota-excerpt.md.
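
The extraction step above can be sketched as a keyword filter that keeps each matching line verbatim together with its line number. This is a minimal sketch, not the skill's actual implementation: the keyword list, the function names, and the line-level granularity (the skill asks for whole sections) are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical keyword list; tune it to the research output's vocabulary.
KEYWORDS = re.compile(r"architect|framework|counter-example", re.IGNORECASE)


def extract_excerpts(text: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, verbatim line) for every matching line."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if KEYWORDS.search(line)
    ]


def render_excerpt(source: str, text: str) -> str:
    """Format matches as quoted lines, each tagged with its source index."""
    return "\n".join(
        f"> {line}  [{source}:{number}]"
        for number, line in extract_excerpts(text)
    )
```

Because the lines are copied rather than summarized, the excerpt file stays diffable against the research output it came from.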

Artifact 2 — Implementation Map (concrete)

For the current implementation:

List every architectural primitive (daemon, service, MCP, queue, socket, HTTP layer, persistence backend).
For each primitive: cite file path + line range that defines it.
For each primitive: note whether it was first-principles-derived in this sprint, inherited from a prior PR, or scaffolded by an external generator.

Output → docs.local/audits/<sprint>/<date>-impl-map.md.
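
One way to keep the implementation map concrete is to give every primitive a record that cannot be created without a file path, a line range, and a provenance. The sketch below assumes hypothetical names (`Primitive`, `Provenance`, `as_row`) and an illustrative row layout; the skill itself does not prescribe a schema.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    FIRST_PRINCIPLES = "first-principles (this sprint)"
    INHERITED = "inherited from a prior PR"
    SCAFFOLDED = "scaffolded by an external generator"


@dataclass(frozen=True)
class Primitive:
    kind: str                # e.g. "daemon", "socket", "HTTP layer"
    path: str                # file that defines the primitive
    first_line: int
    last_line: int
    provenance: Provenance

    def citation(self) -> str:
        """File path plus line range, as the audit requires."""
        return f"{self.path}:{self.first_line}-{self.last_line}"

    def as_row(self) -> str:
        """One line of the impl map (illustrative layout)."""
        return f"{self.kind} | {self.citation()} | {self.provenance.value}"
```

For example, `Primitive("daemon", "src/daemon.py", 1, 120, Provenance.INHERITED).as_row()` yields `daemon | src/daemon.py:1-120 | inherited from a prior PR`; the path and line range here are made up.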

Artifact 3 — Conformance Verdict

For each (SOTA excerpt × impl primitive) pair:

MATCH — implementation follows SOTA recommendation.
DIVERGE — UNJUSTIFIED — implementation contradicts SOTA without documented rationale.
DIVERGE — JUSTIFIED — implementation contradicts SOTA with documented rationale (cite the rationale).
N/A — SOTA silent on this primitive.

The gate rule: if ANY DIVERGE — UNJUSTIFIED exists → SPRINT R0 IS BLOCKED until the divergence is either reconciled (impl changed) or justified (rationale documented + brain_store'd).

Output → docs.local/audits/<sprint>/<date>-conformance-verdict.md.
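
The gate rule reduces to a single predicate over the verdict list. The following is a minimal sketch with hypothetical names (`Verdict`, `r0_blocked`); it encodes only the rule stated above and nothing about how the verdicts were produced.

```python
from enum import Enum, auto
from typing import Iterable


class Verdict(Enum):
    MATCH = auto()
    DIVERGE_UNJUSTIFIED = auto()
    DIVERGE_JUSTIFIED = auto()
    NOT_APPLICABLE = auto()


def r0_blocked(verdicts: Iterable[Verdict]) -> bool:
    """A single unjustified divergence blocks R0, whatever else matched."""
    return any(v is Verdict.DIVERGE_UNJUSTIFIED for v in verdicts)
```

Note that an empty verdict list is not blocked under this sketch. A real gate would probably treat an audit with no pairs as suspicious rather than cleared, but that policy is not specified here.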

WORKFLOW

Step 1 — Locate the SOTA research output

Canonical scan order (most-recent on tie via ls -lat):

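# Newest first within each location (-t sorts by modification time).
ls -lat research/*.md 2>/dev/null
ls -lat docs.local/research/*.md 2>/dev/null
ls -lat ~/Gits/orchestrator/docs.local/research/*.md 2>/dev/null
# The ** below recurses only in zsh, or in bash after `shopt -s globstar`.
ls -lat ~/Gits/orchestrator/docs.local/handoffs/**/research/*.md 2>/dev/null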

If multiple SOTA outputs conflict, the audit MUST list both and require Etan to pick the canonical source before proceeding. Do NOT auto-canonicalize by date — staleness is the AP1 root cause.
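
The scan order can also be walked programmatically. This sketch mirrors the four locations listed in this step and deliberately stops short of choosing a winner. The function names are hypothetical, and `may_conflict` is a conservative stand-in: real conflict detection requires reading the files, so it simply flags any case with more than one candidate.

```python
import glob
import os
from typing import List, Sequence

# Scan order copied from this step; "**" needs recursive=True in glob.
SCAN_PATTERNS = (
    "research/*.md",
    "docs.local/research/*.md",
    "~/Gits/orchestrator/docs.local/research/*.md",
    "~/Gits/orchestrator/docs.local/handoffs/**/research/*.md",
)


def find_candidates(patterns: Sequence[str] = SCAN_PATTERNS) -> List[str]:
    """Return matching files in scan order, newest first within each pattern."""
    found: List[str] = []
    for pattern in patterns:
        matches = glob.glob(os.path.expanduser(pattern), recursive=True)
        matches.sort(key=os.path.getmtime, reverse=True)
        found.extend(m for m in matches if m not in found)
    return found


def may_conflict(candidates: Sequence[str]) -> bool:
    """More than one candidate: list them all and ask a human to pick."""
    return len(candidates) > 1
```

Sorting by modification time only orders the listing for the reader; nothing in the sketch selects the newest file as canonical.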

Step 2 — Extract SOTA architectural claims

Read the SOTA output in full. For each architectural claim, extract:

The claim itself.
Source index (line number, section name, or [N] reference).
Whether the claim is positive ("use X"), negative ("avoid Y"), or comparative ("X over Y because Z").
Any cited counter-examples (the FastAPI case: SOTA cited Letta-on-FastAPI as the thing not to do).

If extraction grows large or repetitive, follow workflows/extract-claims.md.
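
The four fields extracted per claim map naturally onto a small record. The sketch below is illustrative only: `Claim`, `Polarity`, and `forbids` are hypothetical names, and the substring match in `forbids` is a deliberately naive way to test whether a technology appears in a cited counter-example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Polarity(Enum):
    POSITIVE = "use X"
    NEGATIVE = "avoid Y"
    COMPARATIVE = "X over Y because Z"


@dataclass(frozen=True)
class Claim:
    text: str                              # the claim, quoted verbatim
    source_index: str                      # line number, section name, or [N]
    polarity: Polarity
    counter_example: Optional[str] = None  # e.g. "Letta-on-FastAPI"


def forbids(claim: Claim, technology: str) -> bool:
    """True when the claim cites `technology` as a counter-example."""
    if claim.counter_example is None:
        return False
    return technology.lower() in claim.counter_example.lower()
```

Keeping `counter_example` as its own field, instead of leaving it inside the claim text, is what lets the diff step check it mechanically.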

Step 3 — Map the implementation

Per architectural primitive in scope:

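# Candidate daemon/service files with line counts (NUL-delimited, so paths with spaces survive).
find src packages \( -name "daemon.py" -o -name "service.py" -o -name "*.service.ts" \) -print0 2>/dev/null | xargs -0 wc -l
# Framework and transport imports.
grep -rnE "^from fastapi|^import fastapi|from socketio|import asyncio" src packages 2>/dev/null
# MCP tool and resource registrations.
grep -rnE "mcp__server__|@server\.tool|@server\.resource" src packages 2>/dev/null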

For each primitive found: file path + line range; direct vs. transitive (inherited) authorship; first-principles vs. scaffolded. If mapping grows large, follow workflows/map-impl.md.
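
When the map needs file paths and line numbers in a structured form, the import grep can be approximated in Python. This is a sketch under stated assumptions: it scans only `*.py` files (the grep in this step scans every file), the `scan` helper is hypothetical, and the pattern covers just the framework-import expression.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, Tuple

# Same alternatives as the framework-import grep in this step.
IMPORT_PATTERN = re.compile(
    r"^from fastapi|^import fastapi|from socketio|import asyncio"
)


def scan(
    roots: Iterable[str], pattern: re.Pattern = IMPORT_PATTERN
) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, 1-based line number, line) for each match under the roots."""
    for root in roots:
        if not Path(root).is_dir():
            continue  # missing root: skip quietly, like grep with 2>/dev/null
        for path in sorted(Path(root).rglob("*.py")):
            try:
                text = path.read_text(encoding="utf-8", errors="replace")
            except OSError:
                continue  # unreadable file
            for number, line in enumerate(text.splitlines(), start=1):
                if pattern.search(line):
                    yield str(path), number, line
```

Each yielded tuple already carries the file path and line number that the implementation map must cite; the line range of the whole primitive still has to be determined by reading the file.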

Step 4 — Diff

For each (SOTA claim, impl primitive):

SOTA says "use X" + impl uses X → MATCH.
SOTA says "use X" + impl uses Y → DIVERGE (look for documented rationale via git log -p --follow <impl-file> and brain_search "<primitive> chosen over <alternative>").
SOTA cites X as counter-example + impl uses X → DIVERGE — UNJUSTIFIED unless explicitly documented (this IS the AP1 pattern; treat as severity-10).
SOTA silent → N/A.
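
The four diff rules can be written as one total function over a (SOTA claim, impl primitive) pair. The sketch below uses hypothetical names and exact string comparison, which a real audit would likely replace with something more tolerant. One case is an assumption not covered by the rules above: when SOTA only names counter-examples and the implementation avoids them, the sketch returns N/A.

```python
from enum import Enum, auto
from typing import FrozenSet, Optional


class Verdict(Enum):
    MATCH = auto()
    DIVERGE_UNJUSTIFIED = auto()
    DIVERGE_JUSTIFIED = auto()
    NOT_APPLICABLE = auto()


def classify(
    recommended: Optional[str],
    counter_examples: FrozenSet[str],
    implemented: str,
    rationale: Optional[str] = None,
) -> Verdict:
    """Apply the diff rules to one (SOTA claim, impl primitive) pair."""
    # A divergence is justified only if a documented rationale was found.
    diverge = Verdict.DIVERGE_JUSTIFIED if rationale else Verdict.DIVERGE_UNJUSTIFIED
    if implemented in counter_examples:
        return diverge                  # the AP1 pattern
    if recommended is None:
        return Verdict.NOT_APPLICABLE   # SOTA silent (assumption: see lead-in)
    if implemented == recommended:
        return Verdict.MATCH
    return diverge
```

The counter-example check comes first on purpose: an implementation that uses a cited counter-example diverges even when SOTA makes no positive recommendation.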

Step 5 — Gate decision

ANY DIVERGE — UNJUSTIFIED → R0 BLOCKED. Surface to Etan + sprint LEAD with verbatim SOTA cite + impl divergence + proposed reconciliation path (change impl OR document rationale).
ALL MATCH | DIVERGE — JUSTIFIED | N/A → R0 CLEARED. brain_store the audit verdict at importance ≥8 with tags [architectural-audit, <sprint>, R0-cleared]. Composes with /brain-store-fallback for transport failures.

There is no --override flag. Per gen-8 decision: document or change impl — those are the only two paths. Footgun risk too high. If override is later deemed necessary, it must brain_store at importance 10 with tag [audit-override] + verbatim rationale.
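
The storage metadata for a verdict is small enough to build in one place. This sketch only assembles the tags and importance named in this skill; it does not call brain_store, whose interface is not described here, and the `verdict_record` helper and its dictionary shape are hypothetical.

```python
from typing import Dict, List, Union


def verdict_record(
    sprint: str, blocked: bool, importance: int = 8
) -> Dict[str, Union[int, List[str]]]:
    """Build the tags and importance for storing an audit verdict."""
    if importance < 8:
        raise ValueError("audit verdicts are stored at importance >= 8")
    status = "R0-blocked" if blocked else "R0-cleared"
    return {
        "importance": importance,
        "tags": ["architectural-audit", sprint, status],
    }
```

Rejecting importance below 8 at construction time keeps a low-importance verdict from being stored by accident.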

ANTI-PATTERNS

AP1 — Reading research output but not diffing against impl

The historical case. Researcher's output at idx [380] existed; multiple agents READ it (researcher, R5 evaluator, Codex workers, Cursor auditors). Nobody DIFFED it against daemon.py. The audit MUST produce a literal pairwise diff, not "I read the research."

AP2 — Skipping audit because "we already audited this last sprint"

Architectural assumptions decay. Each sprint must re-audit. If nothing changed, the audit is fast (re-cite the prior verdict). If something changed, the audit catches it.

AP3 — Treating "the researcher said use X" as gospel

SOTA outputs themselves can be wrong or stale. The audit's job is conformance — does impl match SOTA? — NOT validation that SOTA is correct. If SOTA is wrong, that's a separate research-correction sprint (and worth flagging).

AP4 — Confusing this with code review

Code review reads the diff and checks craft. This audit reads the architecture and checks first-principles alignment. They compose; this fires BEFORE R0, code review fires DURING R3.

AP5 — Letting the audit become a paperwork exercise

If audits routinely come back MATCH for everything with no friction, suspect false-pass. The audit's value comes from catching real DIVERGE cases. If 3 consecutive sprints show all-MATCH, run a meta-audit on the auditor (was it reading the right SOTA? was it checking the right primitives?).
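
The three-sprint heuristic is easy to check mechanically. In this sketch each sprint's audit is a list of verdict labels, oldest sprint first; the function name and the choice to count N/A as "not MATCH" are assumptions.

```python
from typing import Sequence


def needs_meta_audit(history: Sequence[Sequence[str]], window: int = 3) -> bool:
    """True when each of the last `window` sprint audits was all-MATCH."""
    if len(history) < window:
        return False
    return all(
        len(verdicts) > 0 and all(v == "MATCH" for v in verdicts)
        for verdicts in history[-window:]
    )
```

A `True` result is a prompt to audit the auditor, not evidence that anything is wrong.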

COMPOSITION

Research artifacts — Claude Desktop/Gemini/web research outputs feed the SOTA output this audit reads.
/never-fabricate — audit verdicts cite specific file paths + line ranges; never-fabricate enforces that those citations are real. This skill is the architectural-level fabrication guard; never-fabricate is the file-level guard.
/brain-store-fallback (SHIP-2, merged) — audit verdicts get stored at importance ≥8; brain-store-fallback handles transport failures during storage. Mandatory composition.
/coderabbit — composes downstream; coderabbit fires per-PR, this skill fires per-sprint.
/plan-validate — adjacent skill (general assumption checks); plan-validate is general, this is architecture-specific.
/large-plan/workflows/scaffold — the audit is a pre-R0 step in scaffold.md so it isn't skipped by oversight.
/orc — orc invokes this skill at sprint kickoff when Tier-1 triggers fire.

EVALS (summary — full scenarios in `evals/evals.json`)

#	Scenario	Without skill (baseline)	With skill (target)	Assertion
1	SOTA recommends socket-direct; impl uses FastAPI HTTP (AP1 re-creation)	R5 graded local-optimum 8.85/10; mismatch slips through	DIVERGE — UNJUSTIFIED; R0 blocks until reconciled	Verdict file lists FastAPI primitive with counter-example cite
2	SOTA recommends X; impl uses X	No-op	MATCH; R0 clears	Verdict shows MATCH; brain_store fires
3	SOTA silent on primitive Z; impl uses Z	No-op	N/A for Z (doesn't block)	Verdict file shows N/A for Z
4	Two SOTA outputs conflict	Stale SOTA used silently	Audit lists both; gate held pending Etan pick	Both files referenced; no auto-canonicalize
5	Mixed MATCH/DIVERGE across multiple primitives	Slips through	Lists all; ANY UNJUSTIFIED blocks	R0 blocked even if 9/10 MATCH

The AP1 re-creation eval (scenario 1) is load-bearing. Fixture: real-world excerpt from the May 2026 brainlayer-readpath research output + synthetic FastAPI daemon snippet mimicking the deleted PR-α daemon.py.

Smoke test (retrospective): run against the current brainlayer codebase POST-PR #312. Expected: MATCH on socket-direct primitive (the audit retrospectively confirms the fix held). Note: smoke is read-only against ~/Gits/brainlayer/ per cross-repo constraint.

DEFINITION OF DONE (per-invocation)

Audit produces 3 artifacts in docs.local/audits/<sprint>/<date>-{sota-excerpt,impl-map,conformance-verdict}.md.
R0 gate is enforced: ANY UNJUSTIFIED DIVERGE blocks proceeding.
Verdict is brain_store'd at importance ≥8 with tags [architectural-audit, <sprint>, R0-cleared|R0-blocked]. Use /brain-store-fallback if BL transport fails.
Skill composes cleanly with research artifacts, /coderabbit, /large-plan/workflows/scaffold.
AP1 re-creation eval passes (scenario 1 fixture).
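
The three artifact paths in the first item follow one template, so they can be derived instead of typed by hand. The sketch assumes `<date>` is an ISO 8601 date (YYYY-MM-DD), which this skill does not specify, and `artifact_paths` is a hypothetical helper.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Dict

ARTIFACTS = ("sota-excerpt", "impl-map", "conformance-verdict")


def artifact_paths(sprint: str, on: date) -> Dict[str, PurePosixPath]:
    """Map each artifact name to docs.local/audits/<sprint>/<date>-<name>.md."""
    base = PurePosixPath("docs.local/audits") / sprint
    # Assumption: <date> is rendered as YYYY-MM-DD.
    return {name: base / f"{on.isoformat()}-{name}.md" for name in ARTIFACTS}
```

A completeness check for the first done-criterion then amounts to confirming that all three returned paths exist on disk.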

R5 EVALUATOR EXTENSION — OUT OF SCOPE HERE

Per pain-points/_consolidated.md Pattern 4 system-fix:

"R5 evaluator skill change: must include 'goal-envelope check' — score against the original parent objective, not the sprint's local optimization."

This skill does NOT modify the R5 evaluator (it's a separate skill change, tracked as a future SHIP-9 candidate). This skill surfaces the parent-objective in the audit so the R5 evaluator has a referenceable target. The two compose; they ship independently.

ESCALATION

Multiple SOTA candidates conflict → list both, require Etan to canonicalize. Do NOT auto-pick by date.
AP1-class DIVERGE found → R0 BLOCKED message MUST include verbatim SOTA cite, file path of divergent impl, and explicit "change impl OR document rationale" path. No silent block.
BrainLayer transport fails during verdict storage → fall back via /brain-store-fallback and report the fallback file path in the verdict.
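
The "no silent block" requirement can be enforced by building the blocked message from its three mandatory parts and refusing to render it when one is missing. The message layout and the `blocked_message` name below are illustrative assumptions.

```python
def blocked_message(sota_cite: str, source_index: str, impl_path: str) -> str:
    """Render the R0 BLOCKED notice with its three required elements."""
    if not (sota_cite.strip() and impl_path.strip()):
        raise ValueError("a block must carry the SOTA cite and the impl path")
    return "\n".join(
        [
            "R0 BLOCKED",
            f'SOTA ({source_index}): "{sota_cite}"',
            f"Divergent impl: {impl_path}",
            "Resolve by: change impl OR document rationale",
        ]
    )
```

Raising on an empty cite or path means a block can never reach the sprint lead without the evidence needed to act on it.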
