| name | repo-cartographer |
| description | Produces or updates expert-level repository documentation engineered so a top 3% engineer needs zero explanation. Trigger AGGRESSIVELY on — document this repo, doc the codebase, generate docs, update README, build a README, architecture doc, ARCHITECTURE.md, ADR, onboarding docs, new engineer ramp, investor code due diligence, code DD, code audit, API documentation, OpenAPI, runbook, ops doc, what does this service do, explain this repo, walk me through this repo, doc drift, outdated docs, regenerate docs, post-merge docs, after merging to main, code/doc divergence, compliance dossier, HIPAA evidence, SOC 2 evidence, audit-ready repo, reading path, where do I start in this codebase. Auto-fires on GitHub push-to-main via Claude Code CI. Never write ad-hoc READMEs from memory — this skill encodes doc taxonomy, evidence-linking, compliance overlay, drift detection, cross-repo system map, and in-repo / Mintlify / Notion three-channel sync. Fire even when not explicitly invoked. |
repo-cartographer
Map a repository so completely that a top-3% engineer reading cold needs zero explanation.
The Contract
Output passes this bar or it doesn't ship:
- The map — system + module + data-flow diagrams generated from actual code, not folder structure.
- The why — every non-obvious decision has an ADR. Undocumented decisions get an inferred ADR marked status: inferred — founder review needed.
- The contracts — every public interface (HTTP, queue, CLI, library export) documented with semantics, error codes, idempotency, rate limits.
- The data model — ER + entity lifecycles + per-field PHI/PII classification + retention.
- The failure model — every external dependency has documented failure modes and a runbook.
- The deployment truth — where code runs, network paths, secrets flow, IaC pointer.
- The dependency story — every direct dep has a one-line rationale; transitive flagged if security-sensitive.
- The reading path — literal ordered file list for 30-min / 2-hr / 1-day onboarding.
Every factual claim links to an evidence anchor (file path + line range, commit SHA, or test name). No hand-waving.
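As a rough illustration of the evidence-anchor rule above, the three anchor kinds could be modelled as one small record. This is only a sketch: the class name, field names, and rendered string format are assumptions for illustration, not something the skill's scripts are known to use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvidenceAnchor:
    """Hypothetical record for one evidence anchor (all names are assumptions)."""
    path: Optional[str] = None        # file path relative to the repo root
    start_line: Optional[int] = None  # first line of the cited range
    end_line: Optional[int] = None    # last line of the cited range
    commit_sha: Optional[str] = None
    test_name: Optional[str] = None

    def _has_file_range(self) -> bool:
        # A file anchor needs a path plus a well-formed, 1-indexed line range.
        return (
            self.path is not None
            and self.start_line is not None
            and self.end_line is not None
            and 1 <= self.start_line <= self.end_line
        )

    def is_valid(self) -> bool:
        # Any one of the three anchor kinds is enough to back a claim.
        return self._has_file_range() or bool(self.commit_sha) or bool(self.test_name)

    def render(self) -> str:
        # Prefer the most specific anchor: file range, then commit, then test.
        if self._has_file_range():
            return f"{self.path}#L{self.start_line}-L{self.end_line}"
        if self.commit_sha:
            return f"commit {self.commit_sha}"
        if self.test_name:
            return f"test {self.test_name}"
        raise ValueError("claim has no evidence anchor")
```

A claim with no valid anchor raises instead of rendering, which mirrors the "no hand-waving" bar.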
Two-Mode Operation
Mode A — On-Demand (chat / Claude Code)
User invokes against a repo or workspace. Skill performs full pass.
Mode B — CI (GitHub Action)
Push-to-main triggers Claude Code with this skill loaded. The skill diffs the merged change against the last doc baseline and updates ONLY the affected sections, then opens a PR titled docs: sync from <SHA>.
Workflow file template: assets/github-actions/doc-sync.yml.
Workflow — On-Demand Mode
Step 1 — Reconnaissance
Run scripts/analyze_repo.py <repo-path> to produce a structural inventory:
- Languages + line counts
- Entry points (mains, server bootstraps, CLI defs)
- Manifest files (package.json, pyproject.toml, go.mod, Cargo.toml, requirements.txt)
- Test framework + coverage signals
- IaC presence (CDK, Terraform, Pulumi, Helm)
- CI/CD config
Run scripts/extract_deps.py <repo-path> for the dependency table with rationale slots.
Run scripts/detect_phi_paths.py <repo-path> to identify PHI/PII-touching code paths — the compliance overlay moat. Output drives COMPLIANCE-DOSSIER.md.
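To make the manifest part of the inventory concrete, here is a minimal sketch of how such a scan might work. It is not the actual scripts/analyze_repo.py; the function name and the skipped directories are assumptions, and only the manifest names come from the list above.

```python
from pathlib import Path

# Manifest names taken from the inventory list above.
MANIFEST_NAMES = {"package.json", "pyproject.toml", "go.mod", "Cargo.toml", "requirements.txt"}
# Assumption: vendored and VCS directories are not part of the inventory.
SKIP_DIRS = {".git", "node_modules", ".venv"}


def find_manifests(repo_root: str) -> list[str]:
    """Return sorted repo-relative POSIX paths of known manifest files."""
    root = Path(repo_root)
    found = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if any(part in SKIP_DIRS for part in relative.parts):
            continue
        if path.is_file() and path.name in MANIFEST_NAMES:
            found.append(relative.as_posix())
    return sorted(found)
```

Returning repo-relative paths keeps the output usable as evidence anchors regardless of where the repo is checked out.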
Step 2 — Diagram extraction
Read references/architecture-extraction.md for the methodology. Generate Mermaid diagrams inline:
- System diagram (services, queues, datastores, external systems)
- Sequence diagrams for the 3 most critical user-facing flows
- ER diagram derived from Drizzle / Prisma / SQLAlchemy / ORM schema
Diagrams live in the Markdown — no external diagramming service. Mermaid renders in GitHub, Mintlify, and Notion.
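One way to keep diagrams inline is to emit Mermaid text directly from whatever edge list the extraction step produces. The sketch below assumes a hypothetical (source, target, label) tuple shape; the real methodology lives in references/architecture-extraction.md and may differ.

```python
def to_mermaid_flowchart(edges: list[tuple[str, str, str]]) -> str:
    """Render (source, target, label) edges as a left-to-right Mermaid flowchart.

    Node names are used as Mermaid node IDs, so this sketch assumes they are
    plain identifiers (letters, digits, underscores) with no spaces.
    """
    lines = ["flowchart LR"]
    for source, target, label in edges:
        lines.append(f"    {source} -->|{label}| {target}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical services, for illustration only.
    print(to_mermaid_flowchart([
        ("api", "queue", "enqueue job"),
        ("worker", "queue", "poll"),
        ("worker", "postgres", "write result"),
    ]))
```

The returned string can be dropped into a Markdown code fence tagged mermaid and checked with scripts/validate_mermaid.py.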
Step 3 — Inferred ADR back-fill
Read references/adr-template.md. For each non-obvious architectural decision detectable from code (schema-per-tenant pattern, choice of Drizzle over Prisma, ECS vs Lambda, etc.) where no existing ADR exists, draft a candidate ADR and mark status: inferred — founder review needed. Mine commit history (git log --follow <path>) for context where useful.
Never ship an inferred ADR without that status tag. Lang reviews and either accepts (flips to status: accepted) or rewrites.
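A candidate ADR with the mandatory status tag could be rendered along these lines. This is a sketch under assumptions: the function names and the exact heading layout are hypothetical (the authoritative layout is references/adr-template.md), while the status string and the 0001-<decision>.md naming come from this document.

```python
import re

# Status tag copied verbatim from the rule above.
INFERRED_STATUS = "inferred — founder review needed"


def adr_filename(number: int, title: str) -> str:
    """Build a name of the form 0001-<decision>.md from an ADR title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.md"


def render_inferred_adr(number: int, title: str, context: str,
                        decision: str, consequences: str) -> str:
    """Render a Nygard-style ADR body that always carries the inferred status."""
    return "\n".join([
        f"# {number:04d}. {title}",
        "",
        f"status: {INFERRED_STATUS}",
        "",
        "## Context",
        context,
        "",
        "## Decision",
        decision,
        "",
        "## Consequences",
        consequences,
        "",
    ])
```

Because the status is a constant rather than a parameter, this helper cannot produce an accepted ADR, so flipping the status stays a human action.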
Step 4 — Generate doc set
Use templates in assets/templates/. Required outputs per repo:
docs/
├── README.md # entry point with reading path
├── READING-PATH.md # 30min / 2hr / 1day ordered file lists
├── ARCHITECTURE.md # system + module + sequence diagrams
├── DATA-MODEL.md # ER + lifecycles + PHI classification
├── API-CONTRACTS.md # every public interface
├── DEPLOYMENT.md # topology + IaC pointer + secrets map
├── SECURITY-MODEL.md # threat model, authn/authz, audit log
├── COMPLIANCE-DOSSIER.md # HIPAA/CFR/SOC 2 traces (the moat)
├── OBSERVABILITY.md # metrics, logs, traces, SLOs
├── adr/
│ ├── 0001-<decision>.md
│ └── ...
└── runbooks/
    ├── <failure-mode>.md
    └── ...
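A quick completeness check against the tree above might look like the following. The function name is an assumption, and the check only tests for presence, not content quality.

```python
from pathlib import Path

# File and directory names taken from the required doc tree above.
REQUIRED_FILES = (
    "README.md",
    "READING-PATH.md",
    "ARCHITECTURE.md",
    "DATA-MODEL.md",
    "API-CONTRACTS.md",
    "DEPLOYMENT.md",
    "SECURITY-MODEL.md",
    "COMPLIANCE-DOSSIER.md",
    "OBSERVABILITY.md",
)
REQUIRED_DIRS = ("adr", "runbooks")


def missing_doc_outputs(docs_root: str) -> list[str]:
    """List required files and directories that are absent under docs_root."""
    root = Path(docs_root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means every required output exists; anything else can be surfaced as a remediation list.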
Step 5 — Cross-repo map
Read references/multi-repo-map.md. If the --workspace flag is given, or if the target is bridge-os, reconcile against the canonical artificialBRIDGE system map at /home/claude/repo-cartographer/state/system-map.json (created on first run). Update it with any new dependencies discovered.
Step 6 — Mirror to Mintlify + Notion
Read references/docs-site.md and references/notion-sync.md. Run:
python scripts/site_sync.py docs/ --target <mintlify-repo-path>
python scripts/notion_sync.py docs/ --workspace artificialBRIDGE
In-repo /docs stays canonical. Mintlify is the cross-repo navigation layer. Notion is the ops handoff to Megan / Ryan / Fadner.
Step 7 — Evidence emission
Every run writes an audit-log entry to ~/.repo-cartographer/audit-log.jsonl:
{repo, commit_sha, timestamp, doc_files_changed, inferred_adrs_drafted, phi_paths_detected, drift_findings}
This feeds soc2-evidence-collector for CC8.1 (change management) evidence.
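Appending one entry per run to a JSON Lines file could be sketched as below. The key names come from the entry shape above; the function name, the value types (lists versus counts), and the ISO 8601 UTC timestamp format are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_audit_entry(log_path, repo, commit_sha, doc_files_changed,
                       inferred_adrs_drafted, phi_paths_detected, drift_findings):
    """Append one JSON object as a single line to the audit log and return it."""
    entry = {
        "repo": repo,
        "commit_sha": commit_sha,
        # Assumption: timestamps are recorded as ISO 8601 strings in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "doc_files_changed": doc_files_changed,
        "inferred_adrs_drafted": inferred_adrs_drafted,
        "phi_paths_detected": phi_paths_detected,
        "drift_findings": drift_findings,
    }
    path = Path(log_path).expanduser()  # resolves a leading "~"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier runs intact; one line per run keeps it valid JSONL.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry
```

Opening in append mode means a failed run never truncates earlier evidence.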
Step 8 — Open PR
Branch name: docs/cartographer-sync-<YYYYMMDD>. PR body:
- Summary of changes (sections added vs updated)
- List of inferred ADRs needing review
- List of drift findings
- Compliance dossier delta
- Evidence anchor count
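The branch name and PR body above are mechanical enough to template. A minimal sketch, assuming hypothetical function names and a plain bulleted Markdown body:

```python
from datetime import date


def sync_branch_name(day: date) -> str:
    """Format the branch name docs/cartographer-sync-<YYYYMMDD>."""
    return f"docs/cartographer-sync-{day.strftime('%Y%m%d')}"


def pr_body(summary: str, inferred_adrs: list[str], drift_findings: list[str],
            dossier_delta: str, anchor_count: int) -> str:
    """Assemble the five PR body items in the order listed above."""
    def bullets(items: list[str]) -> str:
        # Render an explicit "none" so reviewers can tell empty from missing.
        return "\n".join(f"  - {item}" for item in items) or "  - none"

    return "\n".join([
        f"- Summary of changes: {summary}",
        "- Inferred ADRs needing review:",
        bullets(inferred_adrs),
        "- Drift findings:",
        bullets(drift_findings),
        f"- Compliance dossier delta: {dossier_delta}",
        f"- Evidence anchor count: {anchor_count}",
    ])
```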
Workflow — CI Mode (Post-Merge)
Triggered by assets/github-actions/doc-sync.yml on push to main.
Step 1 — Diff scope
Run scripts/doc_drift_check.py --since <last-doc-sync-sha> to determine which doc sections are affected by the merged change. Read references/drift-detection.md for the algorithm.
Affected sections are scored:
- HIGH: schema change, new API endpoint, new external dependency, IaC change, new PHI touchpoint
- MED: refactor crossing module boundaries, new feature flag, observability change
- LOW: doc fix, comment-only, internal refactor
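The scoring tiers above can be read as a lookup from change kind to severity, with a section taking the highest severity among the changes that touch it. That "highest wins" rule and the change-kind labels below are assumptions for illustration; the actual algorithm is in references/drift-detection.md.

```python
# Hypothetical change-kind labels mapped to the tiers listed above.
SEVERITY_BY_CHANGE = {
    "schema_change": "HIGH",
    "new_api_endpoint": "HIGH",
    "new_external_dependency": "HIGH",
    "iac_change": "HIGH",
    "new_phi_touchpoint": "HIGH",
    "cross_module_refactor": "MED",
    "new_feature_flag": "MED",
    "observability_change": "MED",
    "doc_fix": "LOW",
    "comment_only": "LOW",
    "internal_refactor": "LOW",
}
SEVERITY_RANK = {"LOW": 0, "MED": 1, "HIGH": 2}


def score_section(change_kinds: list[str]) -> str:
    """Return the highest severity among the changes affecting one doc section.

    Raises ValueError for an empty list and KeyError for an unknown change kind,
    so unclassified changes fail loudly instead of being scored silently.
    """
    if not change_kinds:
        raise ValueError("a section needs at least one change to be scored")
    severities = [SEVERITY_BY_CHANGE[kind] for kind in change_kinds]
    return max(severities, key=SEVERITY_RANK.__getitem__)
```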
Step 2 — Surgical update
Regenerate only the affected doc sections and preserve hand-written nuance everywhere else. The drift detection algorithm is described in references/drift-detection.md.
Step 3 — Mirror + emit + PR
Same as on-demand steps 6–8.
Quality Gates (Non-Negotiable)
Before any PR opens, the skill verifies:
Any gate failure halts the PR and surfaces a remediation list.
Compliance Overlay (The Moat)
references/compliance-overlay.md is the playbook. Detects:
- PHI/PII handling code paths (regex + import graph)
- CFR / U.S.C. citations in comments → cross-checked against verified citation list
- BAA-relevant integrations (CMS HETS, Blue Button 2.0, carrier APIs, clearinghouses)
- SOC 2 control touchpoints (audit logging, access control, encryption boundary, key management)
Output → COMPLIANCE-DOSSIER.md per repo. This is what payers and Rhizome will actually read during procurement.
Compounds with tpmo-compliance-gate, hipaa-risk-assessor, cfr-citation-verifier, soc2-evidence-collector. If any of those flag issues during doc generation, halt and report.
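The regex half of PHI/PII detection could start from something like the sketch below. The identifier list is a small hypothetical sample, not the pattern set in references/compliance-overlay.md, and the import-graph half is not shown.

```python
import re

# Hypothetical sample of identifier names that often signal PHI/PII fields.
PHI_PATTERN = re.compile(
    r"\b(ssn|date_of_birth|dob|mrn|patient_name|member_id)\b",
    re.IGNORECASE,
)


def find_phi_signals(source: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_term) pairs for PHI-like identifiers.

    Matching is on whole words, so `record.ssn` matches but `patient_ssn`
    does not, because the underscore counts as a word character.
    """
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in PHI_PATTERN.finditer(line):
            hits.append((line_no, match.group(1).lower()))
    return hits
```

Line numbers in the result can feed directly into evidence anchors for COMPLIANCE-DOSSIER.md. A name-based scan like this will produce both false positives and misses, so it is a signal for review rather than a classification.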
Failure Modes
- Repo too large for single pass — chunk by package / workspace boundary, run sub-passes, reconcile.
- No tests, no types, ambiguous architecture — produce docs with confidence: low banner, flag heavy ADR back-fill needed.
- PHI sample data detected in repo — halt, raise critical finding, refuse to mirror to public docs site.
- Drift PR conflicts with hand-written docs — never auto-merge; PR stays open for human review.
- Notion / Mintlify sync fails — in-repo /docs still ships; mirror retried on next run; audit log notes partial sync.
What This Skill Does NOT Do
- Does not document code that doesn't exist yet (no speculative architecture).
- Does not edit application code — only doc files.
- Does not mark inferred ADRs as accepted.
- Does not push to public docs site for any repo flagged as PHI-touching unless explicitly cleared.
- Does not bypass tpmo-compliance-gate for any beneficiary-facing copy that ends up in docs.
References
Load these on demand, not up front:
references/doc-taxonomy.md — what each doc category must contain
references/architecture-extraction.md — how to derive diagrams from code
references/adr-template.md — Nygard format + inferred-status taxonomy
references/compliance-overlay.md — PHI/CFR/HIPAA detection patterns
references/drift-detection.md — code/doc divergence algorithm
references/reading-path.md — onboarding-path methodology
references/multi-repo-map.md — cross-repo dependency graph schema
references/docs-site.md — Mintlify sync conventions
references/notion-sync.md — Notion database schema + sync rules
Scripts (Deterministic Helpers)
scripts/analyze_repo.py — structural inventory
scripts/extract_deps.py — dependency table with rationale slots
scripts/detect_phi_paths.py — PHI/PII signal detection (moat)
scripts/doc_drift_check.py — code-vs-doc divergence diff
scripts/validate_mermaid.py — Mermaid syntax checker
scripts/site_sync.py — push to Mintlify repo
scripts/notion_sync.py — push to Notion workspace
scripts/ci_doc_updater.py — GitHub Action entry point
Templates
assets/templates/*.j2 — Jinja2 templates for each doc type. Templates are opinionated. Do not weaken them to fit a thin repo; instead flag the thin spots as TODO: section requires founder input.
GitHub Action
assets/github-actions/doc-sync.yml — drop into .github/workflows/ of any target repo.