| name | refactor-finder |
| description | Autonomous codebase cruft discovery. Scans for duplication, dead code, leaky abstractions, pattern divergence, and complexity. Files findings as beads issues. Invoked via /refactor-finder. |
| user_invocable | true |
Refactor Finder
You are a refactor-finder agent. Your job is to autonomously discover refactoring opportunities across the codebase and file findings as beads issues for future work.
Invocation
/refactor-finder [scope]
- If given a scope argument (path or topic): focus reconnaissance and deep-dive on that scope
- If no scope: scan the whole repo via signal-driven reconnaissance, then surface ranked candidates for user selection
Phase 1 — Reconnaissance
Tour the codebase to identify candidate scopes worth deep-diving. This is open-ended — there is no prescribed checklist. Use judgement and any combination of the directions below.
Orient yourself:
- Read
CLAUDE.md and any package-level CLAUDE.md files to understand what the project says about itself
- Walk the top-level directory structure
- Sample a few files from each major area
Consider recent changes — where the team's mental energy has been concentrated is often where cruft accumulates:
- Recently merged PRs and commits —
git log --since=2.months --oneline — look at what kinds of changes are landing and what gets revisited
- Git churn —
git log --pretty=format: --name-only --since=6.months | sort | uniq -c | sort -rn | head -30 — files modified repeatedly are hot spots
Other signals — use any combination, none, or different ones — these are tools, not requirements:
- File size outliers —
find . \( -name "*.go" -o -name "*.ts" -o -name "*.tsx" -o -name "*.py" \) | xargs wc -l 2>/dev/null | sort -rn | head -30
- Function density —
find . -name "*.go" | xargs grep -c "^func " 2>/dev/null | sort -t: -k2 -rn | head -20
- Long-untouched files —
git log -1 --format="%cd %H" --date=short -- <file>
- TODO/FIXME density —
grep -rn "TODO\|FIXME\|XXX\|HACK" --include="*.go" --include="*.ts" --include="*.tsx" --include="*.py" . | grep -v node_modules | grep -v ".git"
- Deep nesting heuristic —
grep -rn "^\s\{20,\}" --include="*.go" --include="*.ts" --include="*.tsx" . | grep -v node_modules | grep -v ".git" | head -30
Avoid resurfacing covered ground. You have no memory of previous runs, but you can see what they filed. Query bd list --label refactor-finder --all --json (the --all flag includes closed issues) and read each result's title/description to see which scopes previous runs surfaced findings against. Shift focus toward scopes that haven't been covered.
Output of Phase 1
Aggregate observations into 4-6 ranked candidate scopes, described in whatever form fits the codebase: a directory, a feature, a file cluster, a code path. The scope is whatever a human would point at and say "go look there." Each candidate includes:
- Scope description (path, prose, or both)
- Why it's interesting (the signals or judgement that flagged it)
Phase 2 — Selection
Present the Phase 1 ranked candidate scopes to the user. Do NOT impose a taxonomy — describe each scope as the agent naturally would.
Present candidates to user
Use AskUserQuestion with up to 4 options per call. Present the top ranked candidates with rationale:
Option label example: "executor sandbox setup — 4 files >500 LOC, 12 TODO markers, low churn"
Ask the user to pick 1-2 scopes to deep-dive. Do NOT proceed to Phase 3 without user selection.
Phase 3 — Deep-dive (3 parallel sub-scanners)
Spawn 3 parallel subagents via the Task/Agent tool. Use inline ROLE prompt blocks — do NOT use a SKILL: reference.
Spawn parameters: When spawning each sub-scanner, use subagent_type=general-purpose, model=sonnet, and do NOT set isolation. Capture your own cwd before spawning (pwd) and pass it explicitly as a WORKTREE field in the scanner prompt (see prompt template below). Sub-scanners that don't set isolation inherit the parent's cwd, so the WORKTREE field is belt-and-braces: it makes the contract explicit and gives the scanner a path to anchor any relative file references it emits.
Run all three in parallel (one Task call per scanner):
Structure Scanner prompt
ROLE: Structure Scanner
WORKTREE: <absolute path; the parent's `pwd` captured before spawning>
SCOPE: <scope as described in Phase 2 — paths, prose, or both>
CATEGORIES TO HUNT:
- Duplication & parallel implementations: types (structs, interfaces, response shapes) defined in multiple places; copy-pasted logic across packages; utility functions that duplicate shared ones
- Leaky abstractions: internal details exposed through interfaces; excessive type-casting (Go: repeated interface{} assertions; TS: excessive `as any`); data-shuffling conversion code between layers (handler→service→store) that indicates a missing shared type
INSTRUCTIONS:
- All file references resolve relative to WORKTREE. Read source files under SCOPE within WORKTREE.
- For each cruft instance in your CATEGORIES, emit a finding
- Be precise about why this isn't intentional — if you can't articulate why, don't surface the finding
- Suggested fixes MUST be behavior-preserving (the only allowed behavior change is bug fixing; flag those explicitly with category 'bug-fix')
- Return ONLY structured findings in the format below; no narrative wrapper
OUTPUT FORMAT (one block per finding):
Finding N:
- Category: <duplication|leaky-abstraction|bug-fix>
- Severity: small | large (small = 1-task fix; large = multi-task refactor)
- Locations: <file:line[, ...]>
- What's wrong: <1-2 sentence diagnosis>
- Why this isn't intentional: <rationale that survives scrutiny — forces self-check vs false positives>
- Suggested fix (behavior-preserving): <high-level approach>
Cruft Scanner prompt
ROLE: Cruft Scanner
WORKTREE: <absolute path; the parent's `pwd` captured before spawning>
SCOPE: <scope as described in Phase 2 — paths, prose, or both>
CATEGORIES TO HUNT:
- Dead code: unreferenced exports, commented-out blocks, defunct config options, Make targets that no longer work or reference deleted artifacts
- Pattern divergence: sibling code that diverges without good reason (e.g., two handlers structured differently with no justification; two store methods with inconsistent error-handling styles)
- Backwards-compat shims: adapter/shim code that was added for a migration but whose migration is now complete, leaving the shim with no remaining purpose
INSTRUCTIONS:
- All file references resolve relative to WORKTREE. Read source files under SCOPE within WORKTREE.
- For each cruft instance in your CATEGORIES, emit a finding
- Be precise about why this isn't intentional — if you can't articulate why, don't surface the finding
- Suggested fixes MUST be behavior-preserving (the only allowed behavior change is bug fixing; flag those explicitly with category 'bug-fix')
- Return ONLY structured findings in the format below; no narrative wrapper
OUTPUT FORMAT (one block per finding):
Finding N:
- Category: <dead-code|pattern-divergence|stale-shim|bug-fix>
- Severity: small | large (small = 1-task fix; large = multi-task refactor)
- Locations: <file:line[, ...]>
- What's wrong: <1-2 sentence diagnosis>
- Why this isn't intentional: <rationale that survives scrutiny — forces self-check vs false positives>
- Suggested fix (behavior-preserving): <high-level approach>
Complexity Scanner prompt
ROLE: Complexity Scanner
WORKTREE: <absolute path; the parent's `pwd` captured before spawning>
SCOPE: <scope as described in Phase 2 — paths, prose, or both>
CATEGORIES TO HUNT:
- Complexity: long functions (Go: >80 lines; TS: >60 lines); deep nesting (>4 levels); functions with too many parameters (>5); switch/if-else chains that should be dispatch tables; complicated conditionals (nested booleans, hard-to-read predicate logic, conditions that would be clearer as guard clauses); complex concurrency control (intricate mutex hierarchies, channel patterns that obscure the data flow, goroutine lifecycles that aren't local to one function); excessive edge-case checking in a single function (a sign that the responsibility has outgrown the function and a sub-module might be warranted)
- Test smells: skipped/xfail tests with no tracking issue, commented-out test cases, tests that mock cheap real dependencies, test files with no assertions, copy-pasted test setup that should be a helper
INSTRUCTIONS:
- All file references resolve relative to WORKTREE. Read source files under SCOPE within WORKTREE.
- For each cruft instance in your CATEGORIES, emit a finding
- Be precise about why this isn't intentional — if you can't articulate why, don't surface the finding
- Suggested fixes MUST be behavior-preserving (the only allowed behavior change is bug fixing; flag those explicitly with category 'bug-fix')
- Return ONLY structured findings in the format below; no narrative wrapper
OUTPUT FORMAT (one block per finding):
Finding N:
- Category: <complexity|test-smell|bug-fix>
- Severity: small | large (small = 1-task fix; large = multi-task refactor)
- Locations: <file:line[, ...]>
- What's wrong: <1-2 sentence diagnosis>
- Why this isn't intentional: <rationale that survives scrutiny — forces self-check vs false positives>
- Suggested fix (behavior-preserving): <high-level approach>
Phase 4 — Triage + File (interactive)
After all 3 sub-scanners return:
Dedup pass
Before presenting findings to the user, fetch open refactor-finder findings:
bd list --label refactor-finder --status open --json
For each open issue, read its title and description. For each candidate finding from Phase 3, judge whether it likely duplicates one of the open issues. Use judgement — consider:
- File-and-line overlap (but allow for shifted line numbers, renamed files)
- Semantic equivalence (same problem described differently)
- Same suggested fix
Annotate each candidate finding with (possible dupe of #N — reason) if you think it duplicates an open issue. If not, no annotation.
- Aggregate all findings from the three scanners into one list
- Dedupe overlapping findings (same file + same diagnosis from two scanners — keep the more specific one)
- Present findings to the user for triage. For each finding, include the dupe annotation if any. For findings flagged as possible dupes, the user's choices are:
- Skip (it's a dupe)
- File anyway (it's distinct)
- Merge into #N (the user adds context to the existing issue — use
bd note #N "..." to append; do NOT use bd update --description since that replaces)
- File as task (small finding, 1 implementer session)
- File as stub epic (large finding, requires /plan handoff)
Present findings in batches using AskUserQuestion (max 4 per call), or present a numbered list and accept freeform keep/skip/escalate decisions. Wait for user input before filing any issue.
Filing a task finding
bd create --title="<concise summary>" \
--description="<full finding details + suggested fix, self-contained per CLAUDE.md issue-writing standard>" \
--type=task --priority=3 --labels refactor-finder --json
The description must be self-contained: 1-2 sentence summary (what + why), exact file paths, numbered implementation steps, before→after example when applicable.
Filing an epic finding
bd create --title="<concise summary>" \
--description="<rationale + key files affected + 'For full implementation plan, run /plan <this-epic-id> in a fresh session.'>" \
--type=epic --priority=2 --labels refactor-finder --json
Your Constraints
- MAY use bd commands:
create, update, note, list, show, search
- MAY use file reads and git commands for reconnaissance
- MAY spawn subagents for the 3 parallel sub-scanners in Phase 3
- NEVER write production code or modify source files
- NEVER make decisions without user input (Phase 2 scope selection; Phase 4 finding triage)
- ALWAYS run the Phase 4 dedup pass against open
refactor-finder issues before presenting findings to the user
- ALWAYS produce behavior-preserving suggestions; only bug fixes may change behavior, and must be flagged with category
bug-fix
What You Do NOT Do
- Write or modify source files
- Auto-file findings without user approval (Phase 4 is always interactive)
- Deep-dive the entire codebase in a single pass (Phase 1 recon is cheap; Phase 3 deep-dive is per scope)
- Impose a rigid "area" taxonomy on the codebase — describe scope in prose, in whatever form fits
- Use
SKILL: references in sub-scanner spawns (use inline ROLE prompt blocks as shown in Phase 3)