| name | geniro:refactor |
| description | Use when restructuring code for better organization, reducing tech debt, or improving patterns while guaranteeing zero behavior change. Ideal for modularization, test refactoring, or pattern consolidation after implementation. |
| context | main |
| model | inherit |
| allowed-tools | ["Read","Write","Edit","Bash","Glob","Grep","Agent","AskUserQuestion","TodoWrite"] |
| argument-hint | [what to refactor and why] |
Refactor with Test Verification
Safe incremental refactoring that validates behavior is preserved at every step. Restructures code for better organization, reduces tech debt, and improves patterns without changing observable behavior.
When to use
- Extracting shared logic from multiple modules
- Restructuring a module for clarity or testability
- Consolidating similar patterns across files
- Reducing coupling between components
- Improving module organization within a package
When NOT to use
- For behavioral changes or feature additions (use /geniro:implement instead)
- To optimize performance (use /geniro:deep-simplify and measure first)
- To add error handling not previously present
- To reorganize without clear architectural benefit
Subagent Model Tiering
Follow the canonical rule in skills/_shared/model-tiering.md. Every Agent(...) spawn MUST pass model= explicitly. For plugin-defined subagents (refactor, relevance-filter, reviewer), also follow skills/_shared/spawn-agent.md — bare-name first; on Agent type '<name>' not found, degrade to general-purpose with the agent body inlined.
Skill-specific mapping — refactor work is mostly mechanical pattern application; Sonnet handles ~90% of cases:
| Spawn | Tier | When |
|---|---|---|
| refactor-agent (LOW or MEDIUM risk) | sonnet | Default: pattern application, file moves, rename, extract method |
| refactor-agent (HIGH risk) | opus | 15+ files OR cross-module architectural restructure OR public API surface changes |
| relevance-filter-agent | inherit | Orchestrator-grade reasoning to weigh repo-convention evidence against detected smells |
| Phase 5 reviewer (general-purpose) | sonnet | Independent diff review for Medium and Large tiers |
Agent Failure Handling
If any delegated agent fails (timeout, error, empty/garbage result): retry once with the same prompt. If the retry also fails:
- Phase 2 evidence-gathering agent (refactor-agent, relevance-filter-agent): proceed without the failed agent's output; note "Agent [name] failed: [dimension] not available" in the Phase 5 completion summary, and offer the user the choice via AskUserQuestion header "Partial evidence": "Abort refactor" / "Continue with partial evidence (risky)". Default: Abort.
- Phase 4 execution agent (refactor-agent): do NOT silently skip. Revert all changes (git checkout -- . with user confirmation per Phase 5 Step 1) and escalate to the user with failure context.
- Phase 5 reviewer-agent: note the failure in the completion summary and proceed (fail-open); warn the user that independent review did not complete.
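The retry-then-fallback rule above could be sketched as follows. This is only an illustration under stated assumptions: `run_with_one_retry`, `is_usable`, and the `FALLBACK` keys are hypothetical names, and `spawn` stands in for whatever actually invokes the agent.

```python
from typing import Callable, Optional

# Hypothetical labels summarizing the per-phase fallbacks listed above.
FALLBACK = {
    "phase2-evidence": "ask user: abort (default) or continue with partial evidence",
    "phase4-execution": "revert all changes (with confirmation) and escalate",
    "phase5-review": "note in completion summary and proceed (fail-open)",
}


def run_with_one_retry(spawn: Callable[[], str],
                       is_usable: Callable[[str], bool]) -> Optional[str]:
    """Call spawn() at most twice; return the first usable result, else None."""
    for _attempt in range(2):  # first try plus exactly one retry
        try:
            result = spawn()
        except Exception:
            continue  # a timeout or error counts as a failed attempt
        if is_usable(result):
            return result  # empty/garbage output falls through to the retry
    return None  # caller applies the FALLBACK entry for its phase
```

A `None` return means both attempts failed, at which point the caller would look up the behavior for its phase rather than retrying again.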
Complexity Gate
Refactor uses a deliberately different complexity rubric than the canonical one in ${CLAUDE_PLUGIN_ROOT}/skills/_shared/effort-scaling.md. Zero-behavior-change refactors have different risk drivers: test coverage in scope and public-surface footprint matter more than reversibility (which is always low for a refactor by definition). The hard escalation signals are also refactor-specific (e.g. "behavioral change required" forces escalation OUT of refactor entirely). Tiers below are refactor-local: Small / Medium / Large.
Match refactor depth to task risk. File count is a supporting signal, not the primary gate. Score the task across five dimensions, then check for any hard escalation signal.
Step 1: Score Complexity Dimensions
| Dimension | Low (0) | Medium (1) | High (2) |
|---|---|---|---|
| Task type | Mechanical (rename, extract method) | Pattern consolidation | Structural (module moves, cross-boundary) |
| Cross-boundary scope | 1 module | 2 modules | 3+ modules or cross-stack |
| Public surface touched | None | Internal module exports | Public API |
| Scale | ≤5 files | 6-14 files | 15+ files |
| Test coverage in scope | Strong | Partial | None-or-unknown |
Score: sum of all dimensions (0-10)
- 0-3 → Small
- 4-6 → Medium
- 7+ → Large
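As a rough sketch of the scoring rule above: sum the five 0-2 dimension scores and map the total to a tier. The dimension keys and the function name are illustrative assumptions, and hard escalation signals are checked separately and are not modeled here.

```python
from typing import Mapping

# Illustrative keys for the five rows of the scoring table (assumed names).
DIMENSIONS = (
    "task_type",
    "cross_boundary_scope",
    "public_surface",
    "scale",
    "test_coverage",
)


def complexity_tier(scores: Mapping[str, int]) -> str:
    """Map five 0-2 dimension scores to Small / Medium / Large."""
    for name in DIMENSIONS:
        if name not in scores:
            raise ValueError(f"missing dimension: {name}")
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2")
    total = sum(scores[name] for name in DIMENSIONS)  # range 0-10
    if total <= 3:
        return "Small"
    if total <= 6:
        return "Medium"
    return "Large"
```

For example, a pattern consolidation (1) across two modules (1) touching internal exports (1), with few files (0) and strong tests (0), totals 3 and stays Small.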
Step 2: Apply Tier Behavior
| Tier | Behavior |
|---|---|
| Small | Skip Phase 3 relevance-filter (scope too narrow to matter). Skip Phase 5 independent reviewer. Proceed through Phases 1-5 with lightweight gates. |
| Medium | Full pipeline as specified (relevance-filter + reviewer-agent). |
| Large | Recommend running /geniro:decompose first to split the refactor into independently shippable milestones; refactor then runs one milestone at a time against an approved plan. If the user wants to proceed without decomposition, require explicit confirmation via AskUserQuestion header "Scope": "Run /geniro:decompose first" (description: "Split the refactor into 3-7 milestones so each one can be reviewed and shipped independently") / "Proceed without a plan (risky)". On "Proceed without a plan", Large runs the Medium pipeline (full relevance-filter + independent reviewer-agent in Phase 5). The only difference is the user has accepted the added risk of proceeding without architectural review. |
Hard Escalation Signals (any ONE escalates)
| Signal | Why it escalates |
|---|---|
| Behavioral change required | Not a refactor — use /geniro:implement |
| New tests required to cover untested code | Add tests first via /geniro:implement or escalate |
| Signature/semantics change on public API | Cross-stack coordination — use /geniro:implement |
| Auth, crypto, or payment code touched | Owner review required — escalate |
| Ambiguous intent (multiple valid shapes) | Use Claude Code's built-in plan mode (Shift+Tab twice) to draft an approach first, or escalate to /geniro:implement |
| Config/migration regeneration needed | Runtime failure modes — use /geniro:implement |
| Touching test assertions (not just test imports) | Not refactoring — use /geniro:implement |
Any signal → AskUserQuestion header "Scope": "Escalate to suggested skill" / "Proceed anyway (treat as Large)" / "Reduce scope". Default to escalate.
State & Resume Semantics
On skill start, compute <slug> per ${CLAUDE_PLUGIN_ROOT}/skills/_shared/within-skill-state-handoff.md § Slug rules, then Glob(".geniro/state/refactor/state-<slug>.md"). If present, run the consumer flow per the helper § Consumer contract (Case A/B/C/D handling). After that returns "proceed", offer resume via AskUserQuestion header "Resume": "Resume from phase [N]" / "Start fresh (discard state)". Otherwise create the state file with an initial header after Phase 1.
State file schema (.geniro/state/refactor/state-<slug>.md):
Branch: <git branch --show-current OR detached-<short-sha>>
Worktree: <git rev-parse --show-toplevel>
Timestamp: <ISO-8601 UTC>
phase: 1|2|3|4|5
tier: Small|Medium|Large
scope-files: [...]
smells-detected: N
plan-approved: bool
steps-completed: [...]
steps-blocked: [...]
State is written on all tiers for consistency. Only strategic compact points are tier-gated (Medium/Large only). Capital Branch:/Worktree:/Timestamp: are mandatory per ${CLAUDE_PLUGIN_ROOT}/skills/_shared/within-skill-state-handoff.md § Producer contract — they let the consumer detect cross-branch collisions on resume.
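One way to picture the schema is as a small record rendered to text. This is a sketch only: the `RefactorState` class is hypothetical, and the serialization of lists (`[a, b]`) and booleans (`true`/`false`) is an assumption, since the schema above only says `[...]` and `bool`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class RefactorState:
    branch: str
    worktree: str
    phase: int
    tier: str
    scope_files: List[str] = field(default_factory=list)
    smells_detected: int = 0
    plan_approved: bool = False
    steps_completed: List[str] = field(default_factory=list)
    steps_blocked: List[str] = field(default_factory=list)

    def render(self, now: Optional[datetime] = None) -> str:
        """Render the state file body; capitalized header fields come first."""
        stamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)

        def as_list(items: List[str]) -> str:
            return "[" + ", ".join(items) + "]"  # assumed list format

        lines = [
            f"Branch: {self.branch}",
            f"Worktree: {self.worktree}",
            f"Timestamp: {stamp.strftime('%Y-%m-%dT%H:%M:%SZ')}",  # ISO-8601 UTC
            f"phase: {self.phase}",
            f"tier: {self.tier}",
            f"scope-files: {as_list(self.scope_files)}",
            f"smells-detected: {self.smells_detected}",
            f"plan-approved: {str(self.plan_approved).lower()}",  # assumed bool format
            f"steps-completed: {as_list(self.steps_completed)}",
            f"steps-blocked: {as_list(self.steps_blocked)}",
        ]
        return "\n".join(lines) + "\n"
```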
Strategic compact points:
- After Phase 2 (plan built, before Phase 3 approval): write checkpoint, tell user:
Plan ready. I recommend /compact now to free context for execution. After compacting, type /geniro:refactor continue to resume from Phase 3.
- After Phase 4 (execution complete, before Phase 5 review): same pattern, resume from Phase 5.
On resume: read .geniro/state/refactor/state-<slug>.md and the scope-files list, skip to the next incomplete phase. The state file is cleaned up at the end of Phase 5 per ${CLAUDE_PLUGIN_ROOT}/skills/_shared/within-skill-state-handoff.md § Cleanup contract — delete only the current branch's slug.
Process
Phase 1: Scope & Context
- Parse $ARGUMENTS to understand what is being refactored and why
- Use Grep and Glob to find all related files
- Read all files in scope to understand current organization, dependencies, imports, and test coverage
- Prior planning/knowledge context: scope follows ${CLAUDE_PLUGIN_ROOT}/skills/_shared/scope-anchor.md (anchor on cwd's worktree + current branch; no gh pr list / git checkout discovery). For cross-session artifacts, resolve the path prefix via ${CLAUDE_PLUGIN_ROOT}/skills/_shared/primary-worktree.md Mode A. Check these artifacts and load relevant ones: .geniro/planning/*/ (task-local, cwd-relative; match current branch; if found, read spec.md, plan-*.md, state.md, concerns.md, notes.md), .geniro/workflow/*.md (cwd-relative active integrations), <PRIMARY_ROOT>/.geniro/knowledge/learnings.jsonl (grep for scope-file keywords to surface relevant gotchas), and current git state (git rev-parse --show-toplevel, git branch --show-current, git log --oneline -5, git status --short).
- Read any project convention files referenced in CLAUDE.md (coding standards, architecture docs) — understanding project patterns prevents flagging intentional designs as smells
- Load custom instructions from .geniro/instructions/global.md and .geniro/instructions/refactor.md. Read any that exist. Apply their rules, additional steps (at the phases they specify), and hard constraints.
Step 7 (final): Baseline validation. Run the project's validation suite once (read command from CLAUDE.md).
- If red: AskUserQuestion header "Baseline": "Fix the broken tests first (stop refactoring)" / "Proceed anyway: existing failures are out of scope (risky)". Default to stop.
- If no tests exist at all: escalate immediately with "Cannot refactor safely without tests. Use /geniro:implement to add coverage first."
- If green: record the passing-state fingerprint (test count) in .geniro/state/refactor/state-<slug>.md and proceed.
Phase 2: Analyze (subagent) + Plan (orchestrator)
Spawn a refactor-agent to detect smells and count consumers — evidence only. The orchestrator then classifies risk, orders the plan, and marks HIGH-risk steps for user confirmation.
Agent(subagent_type="refactor-agent", model="sonnet", prompt="""
You are analyzing code for refactoring. Your task:
WHAT TO REFACTOR: $ARGUMENTS
FILES IN SCOPE:
[list the files you read in Phase 1]
WORKTREE: [from `git rev-parse --show-toplevel`]
BRANCH: [from `git branch --show-current`]
PROJECT CONVENTIONS:
[paste any relevant conventions from CLAUDE.md or project docs]
PHASE: EVIDENCE GATHERING ONLY.
- Execute ONLY your Phase 1 (Code Smell Detection). Skip all planning, risk scoring, and ordering.
- Skip Phase 2 (Refactoring Plan), Phase 3 (Atomic Application), and Phase 4 (Reporting) entirely.
- Do NOT use Write or Edit tools during this invocation. You are producing raw evidence, not a plan.
- Return smells + consumer counts as your final output.
- For every detected smell, also run the canonical **Existing Abstraction Audit** at `${CLAUDE_PLUGIN_ROOT}/skills/_shared/existing-abstraction-audit.md` — apply its Procedure (Grep designated helper directories, categorize REUSE-AS-IS / EXTEND / NO-ANALOGUE, force-fit guard, Rule of Three). Emit candidates inline alongside each smell using the audit's Output format (`reuse-as-is: <file:line>`, `extend-existing: <file:line> — <one-line justification>`, or `no-analogue: rule-of-three=<met|not-met>, call-sites=N`). If a viable extension exists, the orchestrator may prefer it over the smell-local transformation.
Run all 6 smell detection categories (duplication, long methods, god classes, dead code, tight coupling, type/import issues) AND the Deepening Opportunities lens (orthogonal to smells — asks "is this module shallow when it could be deep?"; uses the canonical vocabulary at `${CLAUDE_PLUGIN_ROOT}/skills/_shared/architecture-vocabulary.md`). For each finding, count consumers with Grep (files that import/reference the symbol).
Return as a flat list:
- Smell 1: [type, file:line references, proposed transformation, consumer count, files affected]
- Smell 2: ...
- Public surface notes: [smells that change public API signature, module export, or shared type — orchestrator will treat these as HIGH risk regardless of consumer count]
Do NOT classify risk (LOW/MEDIUM/HIGH). Do NOT order the smells. Do NOT flag steps for user confirmation. Those are orchestrator decisions.
Anchor: stay within WORKTREE on BRANCH — verify with `pwd && git branch --show-current` on first Bash call; abort if either differs. See `skills/_shared/scope-anchor.md` § Subagent spawn anchor.
""")
After the agent returns, the orchestrator builds the plan:
- Classify risk per smell (lookup rule):
- 1-3 consumers → LOW
- 4-9 consumers → MEDIUM
- 10+ consumers → HIGH
- Public API / module export / shared type change → HIGH (overrides consumer count)
- Order the plan: safer transformations first (LOW → MEDIUM → HIGH). Within the same tier, group by file to minimize re-reads.
- Mark HIGH-risk steps for user confirmation (presented via AskUserQuestion in Phase 3).
- Build the final plan with: smells, ordered steps, risk per step, consumer counts, files that will change, what will NOT change (public APIs, DB schema, test behavior), and max_risk (max across all step risks, used to select execution model in Phase 4).
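The lookup rule, ordering, and max_risk computation above might look like the sketch below. The `Step` record and function names are hypothetical. Treating 0 consumers as LOW and an empty plan as LOW are assumptions, since the rule only covers 1 or more consumers. The model choice mirrors the Phase 4 rule (opus when max_risk is HIGH, otherwise sonnet).

```python
from dataclasses import dataclass
from typing import List

RISK_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


@dataclass
class Step:
    file: str
    consumers: int
    changes_public_surface: bool = False  # public API / module export / shared type


def classify_risk(step: Step) -> str:
    """Apply the consumer-count lookup; public-surface changes override it."""
    if step.changes_public_surface or step.consumers >= 10:
        return "HIGH"
    if step.consumers >= 4:
        return "MEDIUM"
    return "LOW"  # 1-3 consumers; 0 is treated as LOW here (assumption)


def order_plan(steps: List[Step]) -> List[Step]:
    """Safer steps first; within a risk level, group by file."""
    return sorted(steps, key=lambda s: (RISK_RANK[classify_risk(s)], s.file))


def max_risk(steps: List[Step]) -> str:
    """Highest risk across all steps; empty plan defaults to LOW (assumption)."""
    if not steps:
        return "LOW"
    return max((classify_risk(s) for s in steps), key=RISK_RANK.__getitem__)


def execution_model(steps: List[Step]) -> str:
    """Pick the Phase 4 model from the plan's max_risk."""
    return "opus" if max_risk(steps) == "HIGH" else "sonnet"
```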
Update .geniro/state/refactor/state-<slug>.md: phase: 2, smells-detected: N, tier: <Small|Medium|Large>.
Phase 3: Approval
Relevance evidence + orchestrator tagging (Medium and Large only — Small skips this step): Before presenting the plan, spawn a relevance-filter-agent to gather evidence on detected smells against repo conventions, then you (the orchestrator) decide KEEP vs FILTER yourself from the dossier — do NOT delegate the tagging decision:
Agent(subagent_type="relevance-filter-agent", prompt="""
FINDINGS: [smells detected by refactor-agent, with file:line references and risk levels]
CHANGED FILES: [files in refactoring scope from Phase 1]
WORKTREE: [from `git rev-parse --show-toplevel`]
BRANCH: [from `git branch --show-current`]
PROJECT CONTEXT: [stack, conventions from CLAUDE.md]
CONVENTION FILES: [content of CONTRIBUTING.md, ADRs, architecture docs if they exist]
Gather evidence for each detected smell against this repo's actual patterns:
1. Convention alignment — is this "smell" actually the repo's chosen pattern?
2. Over-engineering — would fixing this smell introduce more complexity than it removes?
3. Intentional pattern — does the flagged pattern exist deliberately in 3+ other files?
Return an evidence dossier per smell (ALIGNS/CONTRADICTS/NEUTRAL, APPROPRIATE/OVER-ENGINEERED, ISOLATED/WIDESPREAD). Do NOT tag smells KEEP or FILTER — return evidence only; the orchestrator decides.
Anchor: stay within WORKTREE on BRANCH — verify with `pwd && git branch --show-current` on first Bash call; abort if either differs. See `skills/_shared/scope-anchor.md` § Subagent spawn anchor.
""")
After the dossier returns, synthesize it yourself: for each smell, weigh convention-alignment, over-engineering, and pattern-frequency evidence and tag KEEP or FILTER. Remove FILTERED smells from the plan before presenting to user; note them in the results. If the agent fails, pass all smells through as KEEP (fail-open).
Review the plan built in Phase 2:
- If any steps are HIGH risk: present them to the user via AskUserQuestion and wait for confirmation before proceeding
- If all steps are LOW/MEDIUM: present the plan summary and proceed
Update .geniro/state/refactor/state-<slug>.md: phase: 3, plan-approved: true.
Strategic Compact Point (Medium and Large only). See "State & Resume Semantics" above. After compaction, resume from Phase 4.
Phase 4: Execute
Refresh custom instructions (~5 sec): re-read .geniro/instructions/global.md, .geniro/instructions/refactor.md, and .geniro/instructions/code-style.md (if any are present). Their rules / additional steps / hard constraints still apply to this phase — re-load to ensure they survive any compaction since Phase 1.
Spawn the refactor-agent to execute the approved plan:
Pick model from approved plan: use model="opus" when plan.max_risk == "HIGH", otherwise model="sonnet".
Before spawning the refactor-agent, Read .geniro/instructions/code-style.md if it exists. Pre-inline its content into the agent prompt under ## Code-style instructions.
Agent(subagent_type="refactor-agent", model="<sonnet|opus per risk>", prompt="""
You are executing a refactoring plan. Your task:
APPROVED PLAN:
[paste the plan from Phase 2, marking any HIGH steps the user rejected]
WORKTREE: [from `git rev-parse --show-toplevel`]
BRANCH: [from `git branch --show-current`]
PER-STEP TEST COMMAND: [<test_cmd_affected> from CLAUDE.md if defined, else <test_cmd>] — agent uses this for per-step pre-condition and post-condition checks
REGRESSION TEST COMMAND: [<test_cmd> from CLAUDE.md] — full suite; the orchestrator runs this separately for Phase 1 baseline / Phase 4 final regression gate; the agent does NOT need to invoke it
AUTOFIX COMMAND: [autofix command from CLAUDE.md, if any]
BACKPRESSURE: source "${CLAUDE_PLUGIN_ROOT}/hooks/backpressure.sh" && run_silent "Tests" "<validation_cmd>". If unavailable, pipe through tail -80.
## Code-style instructions (pre-inlined from `.geniro/instructions/code-style.md`, if present)
[paste content here, OR omit section if file absent]
Execute each step following the Step Execution Protocol in your agent definition.
CRITICAL RULES:
- One logical transformation per step
- Run validation after each step
- If a step fails 3 times: REVERT it, mark as BLOCKED, and CONTINUE to the next step
- Do NOT stop the entire session because one step is blocked
- No git operations (no add, commit, push, checkout)
Return a structured report of what was applied, what was blocked, and final validation status.
Anchor: stay within WORKTREE on BRANCH — verify with `pwd && git branch --show-current` on first Bash call; abort if either differs. See `skills/_shared/scope-anchor.md` § Subagent spawn anchor.
""")
Session-level cap: After execution returns, count the ratio of BLOCKED to executed steps (post-user-rejection; i.e., denominator = approved plan steps minus user-rejected HIGH-risk steps). If ≥30% BLOCKED: stop and escalate via AskUserQuestion header "Stuck": "Keep what worked and escalate the rest" / "Revert all changes" / "Force-continue (not recommended)". Do NOT proceed to Phase 5 automatically when this cap triggers.
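The cap check above reduces to a small ratio test. In this sketch the function and parameter names are hypothetical, and returning `False` when nothing was attempted is an assumption (the text does not cover an empty denominator). Integer arithmetic avoids floating-point edge cases at exactly 30%.

```python
def blocked_cap_triggered(approved_steps: int,
                          rejected_high_steps: int,
                          blocked_steps: int,
                          threshold_percent: int = 30) -> bool:
    """Return True when BLOCKED steps reach the session-level cap."""
    # Denominator: approved plan steps minus user-rejected HIGH-risk steps.
    attempted = approved_steps - rejected_high_steps
    if attempted <= 0:
        return False  # nothing was attempted (assumed behavior)
    # blocked / attempted >= threshold, compared without floats.
    return blocked_steps * 100 >= threshold_percent * attempted
```

With 12 approved steps, 2 rejected HIGH-risk steps, and 3 blocked, the ratio is 3 of 10, which meets the 30% cap and stops before Phase 5.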
Update .geniro/state/refactor/state-<slug>.md: phase: 4, steps-completed: [...], steps-blocked: [...].
Strategic Compact Point (Medium and Large only). See "State & Resume Semantics" above. After compaction, resume from Phase 5.
Phase 5: Review Results
Refresh custom instructions (~5 sec): re-read .geniro/instructions/global.md, .geniro/instructions/refactor.md, and .geniro/instructions/code-style.md (if any are present). Their rules / additional steps / hard constraints still apply to this phase — re-load to ensure they survive any compaction since Phase 1.
Step 1: Diff sanity (all tiers)
Run git diff --name-only and git diff --stat. Cross-check the refactor-agent's self-reported file list against the actual diff — flag mismatches.
If the agent's final validation failed, fire AskUserQuestion header "Revert":
- "Revert all changes": safe default, matches current behavior
- "Show me the diff first": print git diff --stat and re-ask
- "Keep changes for debugging": leave uncommitted, print recovery guidance
Default: Revert all changes. On "Revert all changes", run git checkout -- . and report failure.
Step 2: Independent review (Medium and Large only — skip for Small)
Spawn a fresh reviewer-agent. The agent reads its own criteria — do NOT pre-read into orchestrator context.
Before spawning the reviewer, Read .geniro/instructions/code-style.md if it exists. Pre-inline its content alongside the CLAUDE.md conventions paste-block.
Agent(subagent_type="reviewer-agent", model="sonnet", prompt="""
## Review: Refactor Diff
This is a refactor — behavior MUST be unchanged. CI already passed. Focus on invariants, not style.
WORKTREE: [from `git rev-parse --show-toplevel`]
BRANCH: [from `git branch --show-current`]
DIFF: [paste git diff output]
AGENT SELF-REPORT: [refactor-agent's structured report]
PROJECT CONVENTIONS: [paste relevant conventions from CLAUDE.md]
## Code-style instructions (pre-inlined from `.geniro/instructions/code-style.md`, if present)
[content here]
## Focus Areas
- Accidental public-API changes
- Test assertion mutations (imports-only changes are fine; assertion changes are NOT)
- Invariant drift (error shapes, return types, null-vs-undefined, ordering)
- New coupling introduced by extraction/move
- Dead-code removal that actually had references
## Review Criteria
Read and apply these criteria files:
- `${CLAUDE_PLUGIN_ROOT}/skills/review/bugs-criteria.md`
- `${CLAUDE_PLUGIN_ROOT}/skills/review/architecture-criteria.md`
- `${CLAUDE_PLUGIN_ROOT}/skills/review/tests-criteria.md`
Report findings with severity (CRITICAL/HIGH/MEDIUM) and confidence. Return findings as evidence. Do NOT emit an overall verdict — the orchestrating skill synthesizes findings and decides disposition.
Anchor: stay within WORKTREE on BRANCH — verify with `pwd && git branch --show-current` on first Bash call; abort if either differs. See `skills/_shared/scope-anchor.md` § Subagent spawn anchor.
""", description="Review: refactor diff")
Orchestrator disposition logic:
- Any finding with decision: PRODUCT-DECISION → ESCALATE, do NOT fix in-skill. A multi-path finding implies multiple valid behaviors, and refactor guarantees zero behavior change. Surface every PRODUCT-DECISION finding to the user via AskUserQuestion per the canonical shape at ${CLAUDE_PLUGIN_ROOT}/skills/_shared/per-finding-question.md § Single-finding gate (set header: "Escalate"). The escalation menu replaces the finding's own Options: field; render four fixed options: "Run /geniro:implement on this finding (Recommended)" / "Revert this refactor and start over" / "Document and ship as-is: accept the open decision" / "Document as ADR: capture rejection rationale" (the 4th option fires only when the finding meets the ADR criteria below; otherwise omit it and present 3 options). Render the question text with the finding's severity / path:lines / short-title / decision-type / why-matters per the spec's Source-field map; render each option's preview with the finding body (Evidence / Suggested-fix / Confidence / Origin) pulled from in-memory reviewer-agent output (Phase 5 of /refactor runs in the same invocation that produced findings). Without the body in preview the user cannot tell which finding they're escalating, and title-only escalation rubber-stamps blindly. Do NOT spawn the refactor-agent fix loop for these findings; gate-and-fix would silently ship a product decision the user did not authorize. This gate is Always-WAIT in every tier (Small / Medium / Large; see ${CLAUDE_PLUGIN_ROOT}/skills/implement/implement-reference.md §Auto Mode Behavior, [PRODUCT-DECISION] finding encountered row). Fire one AskUserQuestion per finding (4 escalation options when ADR-eligible, 3 otherwise); chain across findings and never batch multiple findings into a single question.
ADR-eligibility check (before adding the 4th option): include the "Document as ADR" option ONLY when the rejected refactor candidate meets all three criteria from ${CLAUDE_PLUGIN_ROOT}/skills/_shared/improvement-routing.md § ADR target — when to use it (sparingly): (1) hard to reverse, (2) surprising without context, (3) result of genuine trade-offs. Examples that qualify: rejecting "split this god-class into 3 modules because the team prefers single-file feature ownership" (the rejection is the durable decision); rejecting "switch from inheritance to composition here because the existing inheritance is load-bearing for the plugin system." Examples that do NOT qualify: rejecting a duplicate-extraction smell because the duplication is intentional (Rule of Three not yet met) — that's a learning, not an ADR. If unsure, omit the ADR option; routing to Knowledge (the default escalation outcome) is always safe.
If user picks "Document as ADR": spawn a focused agent (model: sonnet) to draft the ADR using the template in _shared/improvement-routing.md § ADR template. Pre-inline: the rejected finding's evidence + the user's stated rationale + relevant codebase context. Write to docs/adr/NNNN-<slug>.md (next sequential N; create directory if missing, after AskUserQuestion confirmation). Then proceed as if user picked "Document and ship as-is" — the ADR captures the rejection durably, the working tree is unchanged.
- Any CRITICAL or HIGH (non-PRODUCT-DECISION) → fix loop (max 1 round): spawn fresh refactor-agent to address specific findings, re-review with fresh reviewer.
- Only MEDIUM → note in completion summary; proceed.
- None → proceed.
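The disposition rules above amount to routing each finding by decision type first, then by severity. The sketch below assumes findings arrive as simple mappings with `decision` and `severity` keys. That shape and the route names are illustrative, not the reviewer-agent's actual output format.

```python
from typing import Dict, Iterable, List, Mapping


def route_findings(
    findings: Iterable[Mapping[str, str]],
) -> Dict[str, List[Mapping[str, str]]]:
    """Split reviewer findings into escalate / fix_loop / note buckets."""
    routes: Dict[str, List[Mapping[str, str]]] = {
        "escalate": [],  # one AskUserQuestion per finding, never fixed in-skill
        "fix_loop": [],  # at most one fix round with a fresh refactor-agent
        "note": [],      # MEDIUM findings: recorded in the completion summary
    }
    for finding in findings:
        # PRODUCT-DECISION is checked first so it can never reach the fix loop.
        if finding.get("decision") == "PRODUCT-DECISION":
            routes["escalate"].append(finding)
        elif finding.get("severity") in ("CRITICAL", "HIGH"):
            routes["fix_loop"].append(finding)
        else:
            routes["note"].append(finding)
    return routes
```

Checking the decision type before severity is the important design choice here: a CRITICAL finding that is also a PRODUCT-DECISION still escalates instead of entering the fix loop.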
Step 3: Present Completion Summary
## Refactor Complete
### Transformations Applied (N)
- [file:line] — [what changed] — risk: [LOW/MEDIUM/HIGH] — consumers: N
### Blocked Steps (N)
- [file:line] — [what was attempted] — reason: [failure summary]
### Filtered by Relevance (N — omit section for Small tier; relevance filter not run)
- [smell] — [reason filtered]
### Review Findings
- CRITICAL: N, HIGH: M, MEDIUM: K
- Disposition: [proceeded / 1-round fix loop / escalated]
### Validation
- Tests: PASS/FAIL
- Baseline delta: [before→after test count]
### Files Modified: N
- [file path]: [one-line summary]
### Deferred
- [P3 item or user-rejected HIGH step]
Delete .geniro/state/refactor/state-<slug>.md at the very end of Phase 5 per ${CLAUDE_PLUGIN_ROOT}/skills/_shared/within-skill-state-handoff.md § Cleanup contract — delete only the current branch's slug, leave other branches' state files intact. Also clear two generations of legacy state files (best-effort; either may not exist):
rm -f ".geniro/refactor/state-${slug}.md" 2>/dev/null
rm -f .geniro/refactor/state.md 2>/dev/null
After deleting the state file, tell the user explicitly: "Refactor complete — the diff is in your working tree. Commit it yourself, or run /geniro:follow-up to ship with a review gate."
Git Constraint
Do NOT run git add, git commit, or git push. The orchestrating workflow handles version control. Exception: git checkout -- . is permitted in Phase 5 for reverting failed changes — this is an orchestration-level revert, not a version control operation.
Anti-rationalization constraints
| Your reasoning | Why it's wrong |
|---|---|
| "This smell is too small to fix" | If the plan says fix it, fix it. Small smells compound. |
| "I'll batch multiple transformations" | One atomic transformation at a time. Always. |
| "Tests are passing so I'll skip the blocked step protocol" | The protocol exists for the NEXT failure. Follow it. |
| "This refactoring needs a behavior change" | Then it's not a refactoring. Use /geniro:implement instead. |
| "I'll skip reading project conventions" | You'll flag intentional patterns as smells. Read first. |
| "This duplication needs a new shared helper" | Run the Existing Abstraction Audit first. If a utility / service / hook already exists nearby that could absorb this duplication via a small extension, prefer extending it. Only create a new shared helper when no analogue exists OR when extending the existing one would require adding a parameter or conditional that complicates it (Rule of Three: revisit at the third occurrence; until then prefer local duplication over forced abstraction). |
| "All detected smells are real issues" | Generic smell categories flag intentional repo patterns. Without filtering against THIS repo's conventions, you'll refactor code that was designed that way on purpose. |
| "This is just a refactor" | Refactors break things. Tests and review apply equally. |
| "I'll spawn agents one at a time" | All parallel agents MUST be spawned in ONE response — multiple Agent() calls in the same assistant turn. Separate turns = no concurrency, full wall-clock latency per agent. |
| "The user said go fast — skip phases" | Phase skipping is tied to Complexity Gate tier, not user impatience. Small-tier already skips appropriately. |
| "I noticed a bug mid-refactor, I'll fix it" | That's feature work. Note it for /geniro:follow-up or /geniro:implement and stay in refactor scope. |
| "This change is obviously safe" | "Obviously safe" is the #1 predictor of broken builds. Run validation. |
| "I'll upgrade this sonnet spawn to opus just to be safe" | Model tier is task-nature-matched, not risk-appetite-matched. Re-classify via Subagent Model Tiering table; don't silently upsize. |
| "Reviewer flagged a [PRODUCT-DECISION] finding, so I'll route it through the fix loop like any other CRITICAL/HIGH" | A [PRODUCT-DECISION] finding has multiple valid resolution paths by definition (see agents/reviewer-agent.md §Decision Type Guidance); picking one is a behavior change, which contradicts refactor's zero-behavior-change guarantee. Phase 5 Step 2 disposition logic ESCALATES PRODUCT-DECISION findings to /geniro:implement (always-WAIT) and never gates-and-fixes them in-skill. If you find yourself spawning the refactor-agent for a PRODUCT-DECISION finding, that's the rationalization. Stop and route the escalation. |
Learn & Improve
After refactoring is complete, extract knowledge and suggest improvements.
Extract Learnings
Follow the canonical rubric in skills/_shared/learnings-extraction.md. Bias hard toward flow, architectural, and recurring-mistake learnings; do NOT save narrow interface/field shapes, single-file behaviors, or facts re-derivable by reading the code. Apply the Reflect → Abstract → Generalize pre-pass before every save: if you cannot restate the finding one level up, drop it.
Refactor-specific triggers (supplemental bias, not replacement rubric):
- Blocked transformations → project memory (architectural pressure points)
- Convention discoveries from codebase reading → project memory
- User corrections ("don't refactor that, it's intentional") → feedback memory
- Surprising coupling revealed during execution → project memory
UPDATE existing memories instead of duplicating. Skip if nothing novel.
Suggest Improvements (project scope only)
Check if the refactoring revealed improvement opportunities. Follow the canonical routing in skills/_shared/improvement-routing.md — refactoring most often surfaces (a) undocumented coding conventions / style patterns used consistently across the codebase → route to .claude/rules/<scope>.md with paths: glob frontmatter (Anthropic-native, file-scoped — auto-loads when matching files are touched); (b) surprising coupling between modules → learnings.jsonl; (c) patterns that should be auto-enforced → project rules/hooks; (d) skill-behavior constraints the user enforced manually during refactor → .geniro/instructions/refactor.md. Plugin-internal paths (${CLAUDE_PLUGIN_ROOT}/…) are out of scope — use /improve-template.
Task Tracking
Use TodoWrite to expose per-phase progress. At skill start, create phase-level todos: Scope&Context, Analyze&Plan, Approval, Execute, Review. During Phase 4, add dynamic per-step todos derived from the approved plan. Mark in_progress → completed as phases run. At most ONE todo is in_progress at a time.
Troubleshooting
| Problem | Fix |
|---|---|
| Baseline validation never passes | Escalate: tests must be fixed before refactoring can proceed safely |
| Refactor-agent blocked on ≥30% of steps | Session-level cap hit — stop and escalate; likely scope too large or conventions misread |
| Relevance filter rejects >50% of smells | Likely scope-convention mismatch — confirm with user before proceeding |
| User rejects all HIGH-risk steps | Empty remaining plan → ask whether to proceed with LOW/MEDIUM only or abort |
| Cross-module coupling discovered mid-execution | Follow Blocked Step Protocol; do NOT expand scope mid-session — note for follow-up refactor |
Definition of Done
Example invocations
/geniro:refactor Extract shared validation logic from auth and user modules
/geniro:refactor Consolidate test helpers in utils/ to single module
/geniro:refactor Split 1000-line service into focused domain modules
/geniro:refactor Reduce coupling between database and business logic layers