| name | hickey |
| description | Evaluate code (especially LLM-generated) for structural simplicity using Rich Hickey's "Simple Made Easy" framework. Use this skill whenever reviewing a PR, diff, or code snippet for accidental complexity — particularly when the code was generated by an AI coding assistant and line-by-line review isn't feasible. Also use when the user asks about complecting, simplicity vs. easiness, structural coupling, or concept deduplication. Trigger on phrases like "is this simple", "does this complect", "review for complexity", "structural analysis", or any reference to Hickey, Simple Made Easy, or grey-box review. |
| context | fork |
| agent | Explore |
| model | sonnet |
Hickey: Structural Simplicity Evaluation
Evaluate code for structural simplicity using Rich Hickey's "Simple Made Easy" framework. This skill is designed for the world where AI generates code faster than humans can read it — where functional tests pass but accidental complexity accumulates silently.
The core premise: tests tell you code works; they tell you nothing about whether it's simple. Hickey: "What's true of every bug found in the field? It passed the type checker... it passed all the tests." Complected code can be perfectly correct today. The damage shows up when you try to change it, reason about it, or extend it.
Source: Full talk transcript (also in transcript.md relative to this skill)
Key Definitions
Simple (sim-plex, "one fold/braid"): One concern, one role, one concept. Simplicity is objective — count the interleaved concerns. Hickey: "What matters for simplicity is that there is no interleaving, not that there's only one thing."
Easy (adjacens, "nearby"): Familiar, at hand, within our skillset. Easy is relative. Hickey: "If you want everything to be familiar, you will never learn anything new."
Complect (com-plectere, "to braid together"): Interleave independent concerns so they cannot be reasoned about in isolation. Hickey: "Every time I think I pull out a new part of the software I need to comprehend, and it's attached to another thing, I had to pull that other thing into my mind because I can't think about the one without the other."
Compose (com-ponere, "to place together"): Combine independent things side by side, preserving isolation. Hickey: "I'd rather have more things hanging nice, straight down, not twisted together, than just a couple of things tied in a knot."
Scope of Review
The trigger — "review this for X", "extract Y", "look at file Z", a /do diff with N touched files — is a starting point, not a frame. The structural questions in this skill (concept multiplication, fragmentation, complecting) are most legible at module boundaries; reviewing only the lines the user (or upstream issue) pointed at is how recurring patterns in the same file get missed.
Default to whole-module scope. When the trigger lives inside a single file or component, read the whole file — not just the cited region. When invoked on a multi-file diff, each touched file is in scope, and cross-file structural patterns (concept multiplication across modules, fragmentation that spans files) are in scope too. Adjacent files in the same directory are fair game when the trigger's pattern recurs there — concept multiplication often lives across siblings.
Don't let the user's framing define the scope. A trigger that says "extract <ValueInputMode>" implies a UI extraction; if the surrounding code shows the same fragmentation pattern recurring, name that — even when the implied fix is elsewhere (e.g., a discriminated data-model change rather than a sub-component split). The reviewer's job is to surface what the evidence on disk says, not to confirm the trigger's framing.
Push back when the evidence contradicts the trigger. If the prompt narrows the question to one symptom but the file shows the symptom is one instance of a broader structural issue, the broader finding is the headline, not a footnote. "Issue #N described an extraction; the actual leverage is the data model" is a valid first finding, not an out-of-scope tangent. Anchoring on the trigger's framing is itself a Layer 2 silence — a finding the review never let form.
The Evaluation Process
Work through these layers in order. Every finding must survive /fact-check — after completing all layers, invoke the fact-check skill on your own evaluation to catch wishful justifications and bogus dismissals.
Layer 1: Identify the Concerns
Name the independent concerns the code addresses. Write them out explicitly. If you can't cleanly name distinct concerns, that is itself a finding.
Layer 2: Fragmentation Check
Hickey's "don't complect" has a dual the rest of this skill doesn't cover: don't fragment what belongs together. When one domain concept is split across multiple fields, state locations, signals, modules, or call sites, and their coherence depends on an unenforced rule, you have the same structural bug as complecting — you just arrived at it from the opposite direction. The fix for complecting is separation. The fix for fragmentation is reunification at whatever layer the one thing naturally lives: one type, one signal, one module, one function, one file.
For every group of related fields, state locations, or entities in the changed code, ask: does the domain model this as one thing?
If yes, is the code representing it as one thing, or has it been shattered into parts whose coherence depends on an unenforced rule? The rule can live anywhere — type shape, code convention, doc comment, "we remember to update both", runtime ordering, a reactive pipeline that rebuilds the unity at read time.
- Enumerate invariants. Write out every rule coupling the parts, in plain language: "if A then B", "all Xs agree on Y", "when kind=X then field F is present", "at most one of A/B/C is set", "every update of P must update Q".
- Check consumption sites, not just definitions. Fragmentation is most visible at the reader. If every consumer projects the same value out of a per-entity structure, the per-entity part is lying — the projection is the fingerprint. If multiple modules derive the same value from different sources, the derivation is the fingerprint.
- Watch for reconciliation machinery. A memo that lifts one value out of a collection; a callback that writes back into the collection; an effect that copies one entity's state onto others; a config loader that cross-validates two files; a convention comment that says "keep X in sync with Y". All are the shape of the state is fragmented and we're rebuilding the unity somewhere downstream. The machinery is the bug, not the fix.
- Collapse at the natural layer. Pick the representation layer where the one thing naturally lives. Sometimes that's a discriminated union in a type; sometimes it's a module-level signal; sometimes it's moving two files into one. The layer is wherever the unity stops needing a rule to hold.
- Silence is the bug. A review that skips this layer because "no invariants came to mind" is the same review that misses the fragmentation. When the review genuinely finds nothing, writing "no invariants found" explicitly is the required cheap output. The enumeration is what forces the check, not the outcome.
Layer 3: Check for Concept Multiplication
Hickey: "Be particularly careful not to be fooled by code organization. There are tons of libraries that look — oh, look, there's different classes; there's separate classes. They call each other in these nice ways."
For each new abstraction (component, module, signal, type) the code introduces:
- Name what it represents at the domain level, not the implementation level.
- Search for existing abstractions serving the same domain concept. If one exists, extending it is the default — not creating a parallel one.
- "Mirror existing pattern" is an easiness judgment, not a simplicity judgment. Creating ComponentB because ComponentA exists and looks similar adds a concept. Extending ComponentA keeps concept count flat.
Two abstractions serving one user-level concern = accidental concept multiplication, even if each is internally clean. The structural layers below won't catch this — they check within abstractions, not across them.
Concept Multiplication vs. Fragmentation: these are close cousins but distinct bugs. Concept Multiplication is about duplicated wholes (two classes for one domain concept → delete one). Fragmentation (Layer 2) is about split wholes (one domain concept shattered across multiple locations → collapse to one). A single finding can trigger both layers from different angles; that's redundancy, not muddling.
Layer 4: Check the Structural Pattern Catalog
Scan for known structural patterns plus any additional patterns the project has declared. Before evaluating, read .agency/hickey.md if it exists — its content is project-specific (inline patterns, or a pointer to another file). Treat any patterns found there as additions to the catalog below. The catalog has two halves: complecting (things braided together that should be separate) and fragmentation (things split apart that should be one). Both directions are "interleaved vs. not-interleaved" — Hickey's principle is bidirectional.
Complecting patterns (things-that-should-be-separate braided together)
| Construct | What it complects | Simpler alternative |
|---|
| Mutable state | Value + time + identity | Immutable values, controlled state containers |
| Objects | State + identity + value + namespace | Plain functions + data + namespaces |
| Methods | Function + state; function + namespace | Free functions, interfaces |
| Inheritance | Types with types | Composition, interfaces, traits |
| Switch/case on type | Who + what | Dynamic dispatch, visitor pattern |
| Mutable variables | Value + time | const/final/let bindings, immutable data |
| Imperative loops | What + how + when | map/filter/reduce, declarative transforms |
| Actors | What + who | Queues + stateless handlers |
| ORM | Object identity + relational model + query | Plain data + declarative queries |
| Conditionals scattered across code | One decision braided across many sites | Rules, declarative policies, lookup tables |
| Callbacks/closures over mutable state | Control flow + state + time | Streams, queues, immutable values |
| Hand-rolled utility (tokenizer, parser, walker, normalizer, state machine, date/semver/URL helper, CLI arg parser) when a focused library solves the exact problem | Scope decision (how much to implement) with implementation choice (write it yourself) | Use the library. Hand-roll only when the library adds surface area you actively don't want, not when it would save you writing code you'd otherwise own. "Zero deps" is an easiness judgment dressed as a simplicity judgment — code you don't own is genuinely simpler than code you do own: it doesn't accumulate private test fixtures, it doesn't bitrot when requirements shift, and its edge cases are someone else's problem to fix. |
Fragmentation patterns (things-that-belong-together split apart)
| Construct | What it fragments | Simpler alternative |
|---|
| Parallel optional/nullable fields with coupled presence | "Thing exists + its shape" into independent slots whose combinations include illegal states | Discriminated union / single container encoding the coupling |
| Per-entity state the domain says must agree across entities | One value into N copies indexed by identity, with an unenforced "all agree" rule | Single source-of-truth at the containing scope (type, signal, module-level value) |
| Reactive pipeline projecting one value out of a per-entity structure (memo + prop-drill + effect) | Sharedness reconstituted at read time | Make the underlying state shared; delete the pipeline |
| Callback-down + value-up across module boundaries | One state location into a cycle across N modules | Lift the state to the layer all consumers share |
| Sum type modeled as parallel optional fields per variant | A discriminator scattered into its projections | Actual sum type with the discriminator as the key |
| Booleans whose combinations encode a state machine | One state into several independent flags | Enum / union naming each reachable state |
| "Convention: update X when Y changes" maintained by memory | A rule into documentation or code review discipline | Structure so the coupling is mechanical, not memorized |
| Duplicated derivations (same value computed in N places) | One computation into N copies | Compute once, read N times |
| Config or data split across files/modules by accident of history | A concept into shards held together by cross-reference | Collapse into the one file or module that owns the concept |
| Shared helper placed in whichever module first imported it (rather than the module that owns the underlying concept) | The helper's natural home into an authoring-order accident | Alternative-placement test. For every cross-module helper / type / constant the diff introduces or relocates, name the modules that consume it, then ask which placement leaves the more cohesive module behind. The default home is the module that owns the concept's generative side (filesystem layout, data store, identity, lifecycle), not the consumer that happens to read it first. "Primary consumer is X" is a tautology if the helper currently lives in X — try placing it in each alternative and compare which alternative leaves the chosen module with fewer interleaved concerns. |
When you find a catalog match in either half, do not dismiss it. Design the concrete alternative first (Layer 7), then evaluate whether the current approach is actually justified. The proof burden is on the current code, not on you to prove it's wrong. Hickey: "what matters are the artifacts not the authoring."
Layer 5: Structural Entanglement Analysis
For each file or module touched:
- Concern count per module. Multiple distinct concerns in one module = braiding. Flag it.
- Mutable binding count per function. Count
let bindings + reassignments. 3+ in a single scope = scrutinize.
- Closure depth. Closures over mutable state braid the inner function with the outer function's timeline.
- Data flow topology. Pipeline or DAG = good. Cycles (handler emits onto its own channel) = complected.
- Lifecycle nesting. New lifecycle scopes (AbortControllers, watchers, timers) inside a handler that already has its own lifecycle = braiding "when" concerns.
- Temporal coupling. Correctness depending on execution order beyond what types enforce = invisible braid.
Layer 6: Assess Severity
For every finding, assess — but severity does not grant dismissal. A low-severity finding is still a finding. Report it. The user decides what to act on, not you.
- Blast radius. How much of the system must be understood to diagnose a bug from this entanglement?
- Change friction. Can the complected concerns be modified independently?
- Reasoning load. Can you explain the code without "and then" or "but only if"?
Layer 7: Suggest Simplifications
For every finding — not just "significant" ones — propose a concrete structural alternative. Write out what it would look like, even as pseudocode. This is mandatory.
Never label a finding "essential complexity" without first designing the simplified version. Hickey: "Simplicity is a choice. It's your fault if you don't have a simple system." Assume the existing code came from someone who didn't know better. Design the simpler version. Only after you have it in hand can you evaluate whether the current approach is justified — and if so, explain exactly what makes the simplified version non-viable.
Follow Hickey's design questions:
- What: Cleaner abstraction boundary?
- Who: Subcomponents as arguments instead of hardwired?
- How: Implementation isolated so changes don't ripple?
- When/Where: Queue or channel instead of direct calls?
- Why: Declarative policies instead of scattered conditionals?
Fact-Check Your Own Evaluation
After completing all layers, invoke /fact-check on your own output. The fact-check skill catches:
- Findings you talked yourself out of ("However..." / "acceptable tradeoff" / "justified")
- "Low severity" used as a synonym for "ignore"
- Bogus "essential complexity" labels without a concrete simplified alternative
- Claims about code behavior that you didn't verify by reading the code
Also flag scope-based dismissals. "Out of scope for this PR", "pre-existing issue", "appropriate scope for a bug fix", "orthogonal", and "follow-up refactor" are not simplicity judgments — they are process judgments that let findings evaporate between hickey and implementation. There is no defer. Every finding's only forward-action disposition is Fix in this PR; if you find yourself writing a scope-based dismissal, fix the finding in this PR. The Actions section enforces this — if you can't fill it in, you're dismissing the finding.
Flag orphaned pre-existing/orthogonal findings. Any finding described as "pre-existing" or "orthogonal" during analysis that does not appear in the Actions section as Fix in this PR is a dismissal. The fact-check pass must catch these: scan the full evaluation for the words "pre-existing", "orthogonal", "already existed", or "not introduced by this PR" — each occurrence must have a corresponding Actions entry that fixes it in this PR. The PR's scope expands to absorb the finding; pre-existing-ness is not an exit. No exceptions.
Also flag your own output for phrase shapes that mean you stopped reasoning one step early. These aren't findings you talked yourself out of — they're findings you never let form. If any of these phrase shapes appear in your evaluation, re-open the question they're dismissing:
- "X and Y share Z but are separate concerns" — verified at the domain level, or just at the current implementation layout? Shared input is a precondition; work it through.
- "different consumers read different fields/signals/modules" — is the split justified by domain difference, or by how today's UI happens to be organized?
- "technically could diverge, but in practice doesn't" — if it's technical, the representation can fix it; "in practice" is a promise that won't hold across refactors.
- "each X could theoretically have its own Y" — theoretical divergence is the classic fragmentation cover story.
- "this is a convention, not a constraint" — conventions are fragmentation dressed as discipline.
- "we agree to update both" / "we remember to clear X when Y changes" — discipline is not a type system.
- "lift X to be shared" — if you're lifting X to make it shared, X wants to be shared at its home, not projected from elsewhere.
- "X is the natural home for Y" / "Y's primary consumer is X" / "X already imports the surrounding context, so Y belongs there" — these are circular if Y currently lives in X. Run the alternative-placement test from the Layer 4 fragmentation catalog: name each module that consumes Y, place Y in each one in turn, compare which placement leaves the more cohesive module behind. "Primary consumer" determined without trying the alternatives is the question you stopped before answering.
These catch fragmentation (Layer 2) and placement misses (Layer 4) that the reviewer glossed over. The prosecutor stance applies equally to findings never made and findings made-then-dismissed.
If fact-check finds issues with your evaluation, revise before presenting to the user.
Output Format
-
Concerns identified — Name the distinct concerns.
-
Fragmentation findings — Layer 2 findings. If none, write "no invariants found" explicitly.
-
Concept multiplication — Layer 3 findings.
-
Structural pattern matches — Layers 4–5 findings (both complecting and fragmentation halves of the catalog), with line references.
-
Severity — For each finding: blast radius, change friction, reasoning load.
-
Simplifications — Concrete alternative for every finding.
-
Fact-check result — Output of /fact-check on this evaluation, including the phrase-shape check.
-
Actions — One entry per finding, formatted so a downstream step (e.g. /do's PR comment composer) can lift each entry into a table row. Every finding from every layer must appear here — including findings labeled "pre-existing", "orthogonal", or "not introduced by this PR". A finding that never reaches this section has been dismissed.
Each entry starts with a short bolded finding label (≤8 words) that names what is wrong, then dispositions it as exactly one of:
- Fix in this PR: one-line description of what the implementation step must do. This is the only forward-action disposition. The PR's scope expands to absorb every finding, even when the fix grows the diff substantially —
/do is optimizing for the simpler artifact landing in master, not for diff size.
- No-op: reserved for findings that need no code action — the diff already deletes the offending code, or the finding is subsumed verbatim by another finding in this same review (cite the canonical entry by label). Treat this as the rare exception, not the escape hatch. Anything else is a Fix.
There is no Defer disposition. "Out of scope", "pre-existing", "follow-up refactor", "broader cleanup", "should be its own PR" are not dispositions — they are process judgments dressed up as engineering. If a finding seems too big for this PR, the PR is too small for the work; the answer is to fix it here, not to issue-and-forget. Findings that genuinely require coordinated work outside this repo (upstream library bugs, multi-PR migrations) shouldn't have surfaced as findings of this structural review in the first place — they belong in a separate strategic discussion, not a Defer line on this diff. If one slipped in, apply a local workaround or interface boundary in this PR rather than punt.
Example: **viewportDimensions complects current+default roles** — Fix in this PR: delete the signal, replace with per-tile FitAddon measurement.
"No findings" → "No actions." But if findings exist and the actions list is empty, the evaluation is incomplete.
Do NOT include a "What's simple" section. Praise biases toward positive framing and makes findings feel like minor quibbles. Report what you found. The absence of findings is its own praise.
Important Caveats
- Simple ≠ easy, simple ≠ short. Hickey: "This whole notion of programmer convenience... we are infatuated with it, not to our benefit."
- Simple ≠ familiar. Hickey: "If you want everything to be familiar, you will never learn anything new because it can't be significantly different from what you already know."
- Tests are necessary but not sufficient. Hickey: "I'm glad I've got these guardrails... do the guardrails help you get to where you want to go? No."
- This is not a replacement for functional review. Simplicity analysis complements correctness review. A perfectly simple program that does the wrong thing is useless.
- Do NOT run tests, builds, or any shell commands. This skill is purely analytical — read code, reason about structure, report findings. Running tests is the job of other workflow steps, not this one.
- For volatility-based decomposition (do boundaries encapsulate axes of change?), see
/lowy.