| name | find-duplication |
| description | Find code duplication in the codebase. Supports two modes - scoped to current branch changes or a full codebase sweep. Use when the user asks to find duplicated code, copy-paste, repeated patterns, or wants to deduplicate before a PR. |
| disable-model-invocation | true |
Find Code Duplication
Detect duplicated or near-duplicate Go code and suggest consolidation candidates.
Rules
- Report findings, do not refactor. Refactoring is a separate task.
- Focus on production code (controller/, api/, cmd/, cli/). Skip test duplication unless explicitly asked.
- Group findings by severity: exact duplicates first, then near-duplicates.
- For each finding, state whether extraction is worth it or acceptable duplication.
Step 1: Determine Scope
Ask the user:
- Branch mode: only files changed on the current branch relative to upstream/main.
- Full mode: scan the entire codebase.
For branch mode:
git diff --name-only upstream/main...HEAD -- 'controller/' 'api/' 'cmd/' 'cli/' | grep '\.go$' | grep -v '_test\.go$'
For full mode, the target is controller/ api/ cmd/ cli/.
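The branch-mode filter above keeps non-test Go files under the production directories. A minimal Go sketch of the same predicate follows, in case the file list needs to be filtered programmatically. The function name isProductionGoFile and the sample paths are hypothetical, not part of the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// productionDirs mirrors the directories named in the Rules section.
var productionDirs = []string{"controller/", "api/", "cmd/", "cli/"}

// isProductionGoFile reports whether path is a non-test Go file under
// one of the production directories. Hypothetical helper for illustration.
func isProductionGoFile(path string) bool {
	if !strings.HasSuffix(path, ".go") || strings.HasSuffix(path, "_test.go") {
		return false
	}
	for _, dir := range productionDirs {
		if strings.HasPrefix(path, dir) {
			return true
		}
	}
	return false
}

func main() {
	// Sample paths are made up; in practice they would come from git diff.
	changed := []string{
		"controller/app_controller.go",
		"controller/app_controller_test.go",
		"docs/README.md",
		"cli/main.go",
	}
	for _, p := range changed {
		if isProductionGoFile(p) {
			fmt.Println(p)
		}
	}
}
```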
Step 2: Run dupl
Install and run dupl to find duplicate code blocks:
go install github.com/mibk/dupl@latest
dupl -threshold 15 <target>
-threshold 15 sets the minimum clone size to 15 tokens. Lower values report more matches and more noise.
Review output and filter false positives:
- Import blocks (common imports are not duplication)
- Error constant declarations (repeated pattern is intentional)
- Single-line patterns (logging, error wrapping)
- Kubebuilder boilerplate (RBAC markers, webhook setup)
Step 3: Semantic Duplication Search
dupl only catches blocks with matching syntactic structure. Also look for:
- Similar function signatures: functions with near-identical parameter lists doing similar work across different packages.
- Repeated error handling: the same if err != nil { return fmt.Errorf(...) } pattern with slight variations.
- Copy-pasted reconciler logic: similar reconciliation patterns across different controllers.
- Duplicated struct definitions: similar structs in different packages (candidate for shared types).
Search for patterns:
rg "fmt\.Errorf.*%w.*err\)" controller/ api/ cmd/ cli/ -A 1 -B 1
rg "func.*Reconcile" controller/ -l
rg "func Generate" controller/ -l
Step 4: Check for Repeated Utilities
Look for utility functions that appear in multiple packages:
rg "func.*GetSecret" controller/ api/ cmd/ cli/ -l
rg "func.*Generate.*Labels" controller/ api/ cmd/ cli/ -l
rg "func.*Equal\(" controller/ api/ cmd/ cli/ -l
These should live in a single shared place (e.g. a small helper file or package under controller/) rather than copy-pasted across packages.
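The rg searches above match by name. A rough Go sketch of the same idea using the standard go/parser package is shown below: it collects top-level function names and reports those declared in more than one package. The helper name funcPackages and the in-memory sources are assumptions for illustration; a real run would read files from disk.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// funcPackages returns top-level function names declared in two or more
// packages, mapped to the sorted list of those packages.
// Hypothetical helper; sources maps a file name to its contents.
func funcPackages(sources map[string]string) (map[string][]string, error) {
	fset := token.NewFileSet()
	seen := map[string]map[string]bool{}
	for filename, src := range sources {
		file, err := parser.ParseFile(fset, filename, src, 0)
		if err != nil {
			return nil, fmt.Errorf("parse %s: %w", filename, err)
		}
		for _, decl := range file.Decls {
			fn, ok := decl.(*ast.FuncDecl)
			if !ok || fn.Recv != nil {
				continue // skip non-function declarations and methods
			}
			if seen[fn.Name.Name] == nil {
				seen[fn.Name.Name] = map[string]bool{}
			}
			seen[fn.Name.Name][file.Name.Name] = true
		}
	}
	repeated := map[string][]string{}
	for name, pkgs := range seen {
		if len(pkgs) < 2 {
			continue
		}
		for pkg := range pkgs {
			repeated[name] = append(repeated[name], pkg)
		}
		sort.Strings(repeated[name])
	}
	return repeated, nil
}

func main() {
	// Made-up sources standing in for files under controller/ and cli/.
	sources := map[string]string{
		"controller/labels.go": "package controller\nfunc GenerateLabels() map[string]string { return nil }\n",
		"cli/labels.go":        "package cli\nfunc GenerateLabels() map[string]string { return nil }\nfunc run() {}\n",
	}
	repeated, err := funcPackages(sources)
	if err != nil {
		panic(err)
	}
	fmt.Println(repeated)
}
```

A shared name is only a hint; the bodies still need to be compared before classifying the pair as a duplicate.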
Step 5: Classify Findings
For each duplicate found, classify:
| Category | Action |
|---|---|
| Extract — identical logic in 3+ places | Recommend a shared helper (e.g. under controller/ or a small pkg/) |
| Parameterize — same structure, different values | Recommend a common function with parameters |
| Acceptable — similar but serving different domains (app server vs lcore) | Note it, no action needed |
| Boilerplate — kubebuilder/controller-runtime patterns | Skip, this is framework convention |
| Test-only — repeated test setup/fixtures | Recommend shared test fixture (only if user asked) |
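The table can be read as a decision procedure. The sketch below encodes one possible reading in Go. The Finding type, its fields, the order in which the checks run, and the fallback for identical logic in fewer than three places are all assumptions, since the table does not specify them.

```go
package main

import "fmt"

// Finding is a hypothetical record for one duplicate from Steps 2-4.
type Finding struct {
	Occurrences      int  // number of places the block appears
	Identical        bool // true for exact duplicates, false for near-duplicates
	Boilerplate      bool // kubebuilder/controller-runtime convention
	TestOnly         bool // appears only in test code
	DifferentDomains bool // similar code serving unrelated domains
}

// classify maps a finding to a category from the table.
// The precedence of the cases is an assumption.
func classify(f Finding) string {
	switch {
	case f.Boilerplate:
		return "boilerplate"
	case f.TestOnly:
		return "test-only"
	case f.DifferentDomains:
		return "acceptable"
	case f.Identical && f.Occurrences >= 3:
		return "extract"
	case f.Identical:
		// Assumption: the table only recommends extraction at 3+ places,
		// so a single exact pair is treated as acceptable here.
		return "acceptable"
	default:
		return "parameterize"
	}
}

func main() {
	fmt.Println(classify(Finding{Occurrences: 4, Identical: true}))
	fmt.Println(classify(Finding{Occurrences: 2}))
	fmt.Println(classify(Finding{Occurrences: 5, Identical: true, Boilerplate: true}))
}
```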
Step 6: Report
For each finding:
- Files and line ranges involved
- What is duplicated (brief description)
- Token/line count
- Classification (extract / parameterize / acceptable / boilerplate / test-only)
- Suggested location for shared code (e.g. next to callers or a small internal helper package)
Summary: total findings, how many actionable, estimated lines saved.
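One way to compute the summary figures is sketched below in Go. The ReportEntry and Summary types are hypothetical, and two rules are assumptions rather than part of this procedure: a finding counts as actionable when it is classified extract or parameterize, and lines saved are estimated as the block length times the number of copies beyond the first.

```go
package main

import "fmt"

// ReportEntry is a hypothetical shape for one reported finding.
type ReportEntry struct {
	Files          []string // locations such as "controller/a.go:10-30"
	Description    string
	Lines          int // length of the duplicated block
	Classification string
}

// Summary holds the totals listed at the end of the report.
type Summary struct {
	Total               int
	Actionable          int
	EstimatedLinesSaved int
}

// summarize aggregates entries into the summary totals.
func summarize(entries []ReportEntry) Summary {
	var s Summary
	for _, e := range entries {
		s.Total++
		// Assumption: only extract and parameterize findings are actionable.
		if e.Classification != "extract" && e.Classification != "parameterize" {
			continue
		}
		s.Actionable++
		// Assumption: consolidating n copies keeps one and removes n-1.
		if n := len(e.Files); n > 1 {
			s.EstimatedLinesSaved += e.Lines * (n - 1)
		}
	}
	return s
}

func main() {
	// Made-up findings for illustration.
	entries := []ReportEntry{
		{Files: []string{"controller/a.go:10-30", "controller/b.go:5-25", "cli/c.go:40-60"}, Lines: 20, Classification: "extract"},
		{Files: []string{"api/x.go:1-8", "api/y.go:1-8"}, Lines: 8, Classification: "boilerplate"},
		{Files: []string{"cmd/m.go:12-24", "cli/n.go:30-42"}, Lines: 12, Classification: "parameterize"},
	}
	fmt.Printf("%+v\n", summarize(entries))
}
```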