| name | tech-debt-audit |
| description | Thorough, file-cited technical debt audit across 9 dimensions using AST-grep (tree-sitter), grep, language-native tooling, and optionally CodeGraph knowledge graph. Produces TECH_DEBT_AUDIT.md with severity, effort estimates, and prioritized fixes. Use when asked for codebase health check, tech debt audit, architecture review, code quality assessment, or cleanup planning. Triggers: 'tech debt', 'technical debt', 'debt audit', 'code health', 'technical debt audit', 'codebase health check', 'find tech debt', 'debt analysis', 'audit code quality'. |
Tech Debt Audit Protocol
Model-agnostic technical debt audit for oh-my-openagent (OMO). Uses OMO's built-in tools (grep, glob, bash with sg, read, lsp_diagnostics, task) plus optional CodeGraph MCP for enhanced code graph analysis when available. Produces a grounded, citable TECH_DEBT_AUDIT.md artifact.
CodeGraph Enhancement (Optional)
If you have CodeGraph installed (check with codegraph status), its MCP tools (codegraph_search, codegraph_callers, codegraph_callees, codegraph_impact, codegraph_explore, etc.) can supersede or augment the standard tool searches in the dimensions marked below. CodeGraph gives you:
- Symbol search — instant by-name lookup via FTS5
- Call graph analysis — callers/callees for any function
- Impact analysis — blast radius before changing any symbol
- Smart context building — entry points, related symbols, and snippets in one call
- Framework-aware routes — URL patterns linked to their handlers
To use CodeGraph, ensure the codegraph MCP server is configured in your project's .mcp.json or global MCP config. The skill will auto-detect CodeGraph by checking if codegraph MCP tools are available. Sub-agents spawned via task() cannot use CodeGraph — they use the standard tool fallback.
Output
Write results to TECH_DEBT_AUDIT.md in the repo root with:
- Executive Summary — 3-5 sentences: overall health, worst dimension, quick wins count
- Mental Model — the repo's architecture in 1 paragraph (what it does, stack, module boundaries)
- Findings Table — columns: ID, Category, File:Line, Severity (Critical/High/Medium/Low), Effort (Hours), Description, Recommendation
- Top 5 Priorities — ranked by impact/effort ratio
- Quick Wins Checklist — items under 30 minutes each
- "Looks Bad But Is Fine" — patterns that look like debt but are intentional
- Open Questions — things the maintainer should clarify
Phase 0: Orient
Standard (always run)
glob("**/*.ts") / glob("**/*.py") / etc — map the language stack
glob("**/package.json") + read() — dependencies and build tooling
bash("git log --oneline -200") — churn: find highest-change files
glob("**/*") + basic math — find largest files (>300 LOC are candidates)
- Cross-reference high-churn + large = debt hot zones
- Write the mental model paragraph in your own working context
CodeGraph Enhancement (if available)
Instead of guessing module boundaries, query the code graph:
codegraph_explore(query="architecture overview and main modules")
This returns symbol relationships and source grouped by file. Use the structure as your architectural mental model instead of hand-inferring it from directory names.
codegraph_explore(query="main entry points and execution flow")
This surfaces entry points and call chains. Use these to understand how the code actually flows vs how the directory layout suggests it flows.
Phase 1: Audit Across 9 Dimensions
Use OMO tools for each dimension. Run parallel tool calls within each dimension. Every finding MUST cite file:line:col.
1. Architectural Decay
Standard (always run)
bash("sg -p \"import { $$$ } from '$SRC'\" -l ts .") — map module graph, look for circular patterns
bash("sg -p \"class $NAME { $$$ }\" -l ts .") — check for god classes
grep("TODO|FIXME|HACK|XXX|WORKAROUND|TEMP") — tagged debt markers
grep("async|await") on sync-looking files — misplaced async boundaries
bash("wc -l <file>") on each large file found in Phase 0
CodeGraph Enhancement (if available)
Dead code detection:
codegraph_callers(symbol="<suspected-dead-function>")
codegraph_callers(symbol="<suspected-dead-class>")
Run codegraph_callers on suspected dead exports found via grep/glob. If the result shows zero callers (excluding test files), it's dead code.
Circular dependency detection:
codegraph_impact(target="<module-or-file>", direction="upstream")
Use codegraph_impact on key modules to trace their dependents. If A depends on B and B depends on A, that's a cycle.
Architecture boundaries:
codegraph_explore(query="module dependencies and architecture boundaries")
Use codegraph_explore to survey actual module structure.
What to flag
- Files > 500 LOC (god files)
- Functions > 80 LOC or > 4 nesting levels
- Classes with > 15 methods or > 400 LOC
- Import cycles (A → B → A)
- Dead exports: function/class defined but never imported elsewhere (CodeGraph:
codegraph_callers)
- Commented-out code blocks (>3 consecutive consecutive lines)
2. Consistency Rot
Standard (always run)
bash("sg -p \"import $CLIENT from '$PKG'\" -l ts .") — multiple HTTP clients
grep("console.log|console.error|console.warn") — direct console use vs logger
bash("sg -p \"try { $$$ } catch ($$$) { $$$ }\" -l ts .") — error handling patterns
grep("as any|@ts-ignore|@ts-expect-error|as unknown") — type escapes
grep("eslint-disable|prettier-ignore") — lint suppressions
What to flag
- 3+ ways of doing the same thing (HTTP, logging, validation, config)
- Mixed naming conventions (camelCase + snake_case + PascalCase)
- Multiple date/time handling libraries
- Mixed error response shapes across modules
3. Type & Contract Debt
Standard (always run)
bash("sg -p \"$VALUE as any\" -l ts .") — runtime type escapes
grep("@ts-expect-error") — suppressed errors
grep("@ts-ignore") — suppressed errors (legacy)
bash("sg -p \"$NAME: any\" -l ts .") — typed as any
lsp_diagnostics(filePath="<src-dir>") — current type errors
What to flag
any types on public APIs and exported interfaces
- Untyped function parameters
- Missing schema validation at API/IO boundaries
- LSP type errors grouped by file
4. Test Debt
Standard (always run)
glob("**/*.test.ts") — find all test files
bash("bun test 2>&1 | grep -E '(fail|skip|todo)'") — current test health
- Cross-reference Phase 0 high-churn files with test existence
What to flag
- Critical-path files with zero tests
- Skipped tests (
test.skip, describe.skip)
- Tests asserting implementation details vs behavior
- Slow tests (>1s each)
5. Dependency & Config Debt
Standard (always run)
bash("npm audit --omit=dev 2>&1 | head -40") — known CVEs (if node_modules present)
read("package.json") — check dependency count and stale deps
grep(".env|process.env|Bun.env") — env var usage
grep("API_KEY|SECRET|PASSWORD|TOKEN") in non-config files — hardcoded config
CodeGraph Enhancement (if available)
Blast radius of core dependencies:
codegraph_impact(target="<core-utility-function>", direction="upstream")
Run this on a few key internal modules (logger, config loader, HTTP client) to see how widely they're used. A widely-depended-on module with poor error handling or type safety is a high-priority refactor target because changes to it ripple everywhere.
What to flag
- Outdated major-version deps
- Dependencies that do the same thing (duplicate libraries)
- Referenced env vars not documented in README
- Hardcoded environment-specific values
6. Performance & Resource Hygiene
Standard (always run)
bash("sg -p \"for ($$$ of $$$) { $$$ await $$$ }\" -l ts .") — async-in-loop
grep("await.*map|await.*filter|await.*forEach") — sequential async iteration
grep("Promise\\.all|Promise\\.allSettled") — existing parallel patterns (good signal)
grep("addEventListener|on\\(|subscribe") without removeEventListener|off\\(|unsubscribe nearby — listener hygiene
What to flag
await inside for/of loops (sequential when parallel possible)
- N+1 query patterns
- Missing cleanup on event listeners, intervals, handles
- Unnecessary serialization/deserialization
7. Error Handling & Observability
Standard (always run)
bash("sg -p \"catch ($$$) { $$$ }\" -l ts .") — catch blocks
grep("catch.*{}|catch.*{\\s*}") — empty catch blocks
grep("console.error|logger\\.error|log\\.error") — actual error logging
bash("sg -p \"throw new $ERR($$$)\" -l ts .") — error types used
CodeGraph Enhancement (if available)
Trace error propagation through call chains:
codegraph_callers(symbol="<key-error-handler-or-middleware>")
codegraph_explore(query="how errors propagate through <key-error-handler>")
Use codegraph_callers to find who calls your error handlers. If errors are caught and swallowed at multiple levels, that's a finding.
Impact of changing error types:
codegraph_impact(target="<error-class-or-interface>", direction="upstream")
Check the blast radius of custom error classes. If changing an error type would break 20+ consumers, the error contract is too tight.
What to flag
- Empty catch blocks (worst offense)
- Generic
catch (e) { console.error(e) } without recovery
- Inconsistent error shapes across modules
- Missing structured logging on critical paths
- Errors swallowed in promise chains (
.catch(() => {}))
8. Security Hygiene
Standard (always run)
grep("api[Kk]ey|api_secret|password|secret|token|credential") in source files (not config or env)
grep("SELECT .* FROM|INSERT INTO|UPDATE.*SET|DELETE FROM") — SQL construction
grep("innerHTML|dangerouslySetInnerHTML") — XSS vectors
grep("eval\\(|Function\\(|setTimeout\\(.*string|setInterval\\(.*string") — code injection
What to flag
- Hardcoded secrets in source
- String-concatenated SQL
innerHTML / dangerouslySetInnerHTML usage
eval() or string-based setTimeout/setInterval
- Permissive CORS or auth middleware
9. Documentation Drift
Standard (always run)
read("README.md") — check if claims match reality
grep("@param|@returns|@throws") — docstring coverage
grep("FIXME|TODO|HACK|XXX|WORKAROUND") — fixme density
- Compare README API examples with actual signatures
What to flag
- README claiming features that don't exist
- Public functions without any doc comment
- Comments that contradict the code
- Stale architecture decision records (ADRs) if present
Phase 2: Deeper Dives (Parallel Sub-Agents)
For large codebases (>50k LOC), delegate heavy dimensions to parallel sub-agents. Sub-agents CANNOT use CodeGraph — they use standard tools only:
task(category="unspecified-low", run_in_background=true, load_skills=[], prompt="[CONTEXT] Tech debt audit. [GOAL] Audit dimensions 1 (Architecture) and 2 (Consistency). [REQUEST] Run ast_grep and grep searches for dimensions 1-2 from the tech-debt-audit skill. Report every finding with file:line:col. Tag severity: Critical/High/Medium/Low.")
task(category="unspecified-low", run_in_background=true, load_skills=[], prompt="[CONTEXT] Tech debt audit. [GOAL] Audit dimensions 3 (Type debt) and 7 (Error handling). [REQUEST] Run searches for dimensions 3 and 7 from the tech-debt-audit skill. Report every finding with file:line:col. Tag severity.")
Spawn 2-3 sub-agents for the heaviest dimensions, collect results in parallel, then synthesize. The main agent handles CodeGraph queries itself while sub-agents run the standard tool passes.
Phase 3: Synthesize & Deliver
- Collect all findings from direct tool calls, CodeGraph queries (if available), and sub-agent results
- Deduplicate — same issue mentioned by multiple dimensions
- Classify severity:
- Critical — Causes incorrect behavior, data loss, or security vulnerability
- High — Will cause problems in production; blocks maintenance
- Medium — Reduces maintainability; violates conventions
- Low — Cosmetic; should fix when in the area
- Estimate effort in hours per finding (conservative)
- Write
TECH_DEBT_AUDIT.md with all required sections
- Report summary to the user
Severity Rubric
Critical = actively causing bugs or security holes
High = will cause problems under normal operation; blocks changes
Medium = reduces maintainability; inconsistent; violates team conventions
Low = cosmetic; would be nice to fix when nearby
Quick Checks Before Finishing