Extract an Allium specification from an existing codebase. Use when the user has existing code and wants to distil behaviour into a spec, reverse engineer a specification from implementation, generate a spec from code, turn implementation into a behavioural specification, or document what a codebase does in Allium terms.
Distillation guide
This guide covers extracting Allium specifications from existing codebases. The core challenge is the same as forward elicitation: finding the right level of abstraction. In elicitation you filter out implementation ideas as they arise. In distillation you filter out implementation details that already exist. Both require the same judgement about what matters at the domain level.
Code tells you how something works. A specification captures what it does and why it matters. The skill is asking "why does the stakeholder care about this?" and "could this be different while still being the same system?"
Scoping the distillation effort
Before diving into code, establish what you are trying to specify. Not every line of code deserves a place in the spec.
Questions to ask first
"What subset of this codebase are we specifying?"
Mono repos often contain multiple distinct systems. You may only need a spec for one service or domain. Clarify boundaries explicitly before starting.
"Is there code we should deliberately exclude?"
Legacy code: features kept for backwards compatibility but not part of the core system
Incidental code: supporting infrastructure that is not domain-level (logging, metrics, deployment)
Deprecated paths: code scheduled for removal
Experimental features: behind feature flags, not yet design decisions
"Who owns this spec?"
Different teams may own different parts of a mono repo. Each team's spec should focus on their domain.
The "Would we rebuild this?" test
For any code path you encounter, ask: "If we rebuilt this system from scratch, would this be in the requirements?"
Yes: include in spec
No, it is legacy: exclude
No, it is infrastructure: exclude
No, it is a workaround: exclude (but note the underlying need it addresses)
Documenting scope decisions
At the top of a distilled spec, document what is included and excluded:
The version marker (-- allium: N) must be the first line of every .allium file. Use the current language version number.
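A minimal sketch of checking this rule mechanically, assuming the marker is written exactly as `-- allium: N` with N an integer. The function name `version_marker` is hypothetical and not part of any Allium tooling.

```python
import re
from typing import Optional

# Assumed marker shape: "-- allium: N", where N is an integer version number.
VERSION_MARKER = re.compile(r"^-- allium: (\d+)\s*$")


def version_marker(spec_text: str) -> Optional[int]:
    """Return the version if the FIRST line is a version marker, else None."""
    first_line = spec_text.split("\n", 1)[0]
    match = VERSION_MARKER.match(first_line)
    return int(match.group(1)) if match else None
```

A marker that appears anywhere other than the first line yields `None`, which matches the rule that it must come first.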
Finding the right level of abstraction
Distillation and elicitation share the same fundamental challenge: choosing what to include. The tests below work in both directions, whether you are hearing a stakeholder describe a feature or reading code that implements it.
The "Why" test
For every detail in the code, ask: "Why does the stakeholder care about this?"
| Code detail | Why? | Include? |
|---|---|---|
| Invitation expires in 7 days | Affects candidate experience | Yes |
| Token is 32 bytes URL-safe | Security implementation | No |
| Sessions stored in Redis | Performance choice | No |
| Uses PostgreSQL JSONB | Database implementation | No |
| Slot status changes to 'proposed' | Affects what candidate sees | Yes |
| Email sent when invitation accepted | Communication requirement | Yes |
If you cannot articulate why a stakeholder would care, it is probably implementation.
The "Could it be different?" test
Ask: "Could this be implemented differently while still being the same system?"
If yes: probably implementation detail, abstract it away
If no: probably domain-level, include it
| Detail | Could be different? | Include? |
|---|---|---|
| secrets.token_urlsafe(32) | Yes, any secure token generation | No |
| 7-day invitation expiry | No, this is the design decision | Yes |
| PostgreSQL database | Yes, any database | No |
| "Pending, Confirmed, Completed" states | No, this is the workflow | Yes |
The "Template vs Instance" test
Is this a category of thing, or a specific instance?
| Instance (often implementation) | Template (often domain-level) |
|---|---|
| Google OAuth | Authentication provider |
| Slack webhook | Notification channel |
| SendGrid API | Email delivery |
| timedelta(hours=3) | Confirmation deadline |
Sometimes the instance IS the domain concern. See "The concrete detail problem" below.
The distillation mindset
Code is over-specified
Every line of code makes decisions that might not matter at the domain level:
Question: Is "Google OAuth" domain-level or implementation?
It is implementation if:
Google is just the auth mechanism chosen
It could be replaced with any OAuth provider
Users do not see or care which provider
The code is written generically (provider is a parameter)
It is domain-level if:
Users explicitly choose Google (vs Microsoft, etc.)
"Sign in with Google" is a feature
Google-specific scopes or permissions are used
Multiple providers are supported as a feature
How to tell: Look at the UI and user flows. If users see "Sign in with Google" as a choice, it is domain-level. If they just see "Sign in" and Google happens to be behind it, it is implementation.
Almost always implementation. The spec should say:
entity Candidate {
    skills: Set<String>
    metadata: String?    -- or model specific fields
}
The specific database is rarely domain-level. Exception: if the system explicitly promises PostgreSQL compatibility or specific PostgreSQL features to users.
Look for enum definitions, status or state columns, constants like STATUS_PENDING = 'pending', and state machine libraries (e.g. transitions, django-fsm).
Step 2.5: Identify candidate processes
After extracting entities and their states, scan for state machines that suggest end-to-end processes. Trace where each status value gets set across the codebase (where does status = 'interviewing' happen?). Present candidate processes to the user for validation: "I see an entity with states applied → screening → interviewing → deciding → hired/rejected. Is this a process the system is meant to support?"
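One rough way to trace where status values get set is a text scan for assignments. This is a sketch only: it assumes Python source that assigns string literals to an attribute named `status`, and it will miss ORM bulk updates, setters and dynamically built values. The name `find_status_assignments` is hypothetical.

```python
import re

# Matches "<name>.status = '<value>'" but not comparisons such as == or !=.
STATUS_ASSIGNMENT = re.compile(r"\b(\w+)\.status\s*=\s*['\"](\w+)['\"]")


def find_status_assignments(source: str):
    """Return (line_number, variable, status_value) for each status assignment."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in STATUS_ASSIGNMENT.finditer(line):
            findings.append((line_no, match.group(1), match.group(2)))
    return findings
```

Each finding is a place to read in full: the surrounding guards usually become `requires` clauses and the assignment itself an `ensures` clause.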
Also trace cross-entity data flow. If a rule on entity A requires a field from entity B, follow the chain: where does entity B's field get set, and what triggers that? Present the chain: "The hiring decision requires background_check_status = clear. This gets set by a webhook handler at /api/webhooks/background-check. Does this chain look right?"
Generate transition graphs from the extracted rules. The graph is a derived view of the code. If it has gaps (states with no outbound transitions that aren't terminal), flag them as potential issues.
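The gap check described above can be sketched as follows, assuming the extracted rules have already been reduced to `(from_state, to_state)` pairs and that the set of terminal states has been confirmed with the user. The function name and the `on_hold` state are hypothetical, added only to show a gap.

```python
from collections import defaultdict


def find_dead_ends(transitions, terminal_states):
    """Return states with no outbound transition that are not declared terminal."""
    outbound = defaultdict(set)
    states = set()
    for source, target in transitions:
        outbound[source].add(target)
        states.update((source, target))
    return sorted(
        state
        for state in states
        if not outbound.get(state) and state not in terminal_states
    )


# Transitions recovered from code; "on_hold" is an invented example of a gap.
transitions = [
    ("applied", "screening"),
    ("screening", "interviewing"),
    ("screening", "on_hold"),
    ("interviewing", "deciding"),
    ("deciding", "hired"),
    ("deciding", "rejected"),
]
dead_ends = find_dead_ends(transitions, terminal_states={"hired", "rejected"})
# dead_ends == ["on_hold"]
```

A non-empty result is a question for the user rather than a verdict: the state may be genuinely terminal, or the code that leaves it may live outside the distilled scope.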
Step 3: Extract transitions
Find where status changes happen:
def accept_invitation(invitation_id: int, slot_id: int):
    invitation = get_invitation(invitation_id)
    if invitation.status != 'pending':
        raise InvalidStateError()
    if invitation.expires_at < datetime.utcnow():
        raise ExpiredError()
    slot = get_slot(slot_id)
    if slot not in invitation.slots:
        raise InvalidSlotError()
    invitation.status = 'accepted'
    slot.status = 'booked'
    # Release other slots
    for other_slot in invitation.slots:
        if other_slot.id != slot_id:
            other_slot.status = 'available'
    # Create the interview
    interview = Interview(
        candidate_id=invitation.candidate_id,
        slot_id=slot_id,
        status='scheduled'
    )
    notify_interviewers(interview)
    send_confirmation_email(invitation.candidate, interview)
Extract:
rule CandidateAcceptsInvitation {
    when: CandidateAccepts(invitation, slot)
    requires: invitation.status = pending
    requires: invitation.expires_at > now
    requires: slot in invitation.slots
    ensures: invitation.status = accepted
    ensures: slot.status = booked
    ensures:
        for s in invitation.slots:
            if s != slot: s.status = available
    ensures: Interview.created(
        candidacy: invitation.candidacy,
        slot: slot,
        status: scheduled
    )
    ensures: Notification.created(to: slot.interviewers, ...)
    ensures: Email.created(to: invitation.candidate.email, ...)
}
Key extraction patterns:
| Code pattern | Spec pattern |
|---|---|
| if x.status != 'pending': raise | requires: x.status = pending |
| if x.expires_at < now: raise | requires: x.expires_at > now |
| if item not in collection: raise | requires: item in collection |
| x.status = 'accepted' | ensures: x.status = accepted |
| Model.create(...) | ensures: Model.created(...) |
| send_email(...) | ensures: Email.created(...) |
| notify(...) | ensures: Notification.created(...) |
Assertions, checks and validations found in code (e.g. assert balance >= 0, class-level validators) may map to expression-bearing invariants rather than rule preconditions. Consider whether they describe a system-wide property or a rule-specific guard.
Step 4: Find temporal triggers
Look for scheduled jobs and time-based logic:
# In celery tasks or cron jobs
@app.task
def expire_invitations():
    expired = Invitation.query.filter(
        Invitation.status == 'pending',
        Invitation.expires_at < datetime.utcnow()
    ).all()
    for invitation in expired:
        invitation.status = 'expired'
        for slot in invitation.slots:
            slot.status = 'available'
        notify_candidate_expired(invitation)

@app.task
def send_reminders():
    upcoming = Interview.query.filter(
        Interview.status == 'scheduled',
        Interview.slot.time.between(
            datetime.utcnow() + timedelta(hours=1),
            datetime.utcnow() + timedelta(hours=2)
        )
    ).all()
    for interview in upcoming:
        send_reminder_notification(interview)
Extract:
rule InvitationExpires {
    when: invitation: Invitation.expires_at <= now
    requires: invitation.status = pending
    ensures: invitation.status = expired
    ensures:
        for s in invitation.slots:
            s.status = available
    ensures: CandidateInformed(candidate: invitation.candidate, about: invitation_expired)
}

rule InterviewReminder {
    when: interview: Interview.slot.time - 1.hour <= now
    requires: interview.status = scheduled
    ensures: Notification.created(to: interview.interviewers, template: reminder)
}
Step 5: Identify external boundaries
Look for third-party API calls, webhook handlers, import/export functions, and data that is read but never written (or vice versa).
These often indicate external entities:
# Candidate data comes from Greenhouse, we don't create it
def import_from_greenhouse(webhook_data):
    candidate = Candidate.query.filter_by(
        greenhouse_id=webhook_data['id']
    ).first()
    if not candidate:
        candidate = Candidate(greenhouse_id=webhook_data['id'])
    candidate.name = webhook_data['name']
    candidate.email = webhook_data['email']
When repeated interface patterns appear across service boundaries (e.g. the same serialisation contract expected by multiple consumers), these suggest contract declarations for reuse rather than duplicated inline obligation blocks.
Step 5.5: Identify actors from auth patterns
After extracting surfaces from API endpoints, identify actors by examining authentication and authorisation patterns. Different auth contexts suggest different actors:
API key authentication → system actor (external service)
Role-based access (user.role == 'admin') → distinct actor per role
Scoped access (user.org_id == resource.org_id) → actor with within scoping
Unauthenticated endpoints → public-facing actor or system webhook
Ask the user to confirm: "This endpoint requires admin role authentication. Is 'Admin' a distinct actor, or is this the same person as the regular user with elevated permissions?"
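The mapping above can be written down as a first-pass classifier. This sketch assumes each endpoint has already been summarised by hand as a small dictionary; the keys `auth`, `role` and `scope` and the function name `suggest_actor` are hypothetical conventions, not something a framework provides.

```python
def suggest_actor(endpoint: dict) -> str:
    """Suggest a candidate actor from an endpoint's auth summary (a hypothesis to confirm)."""
    auth = endpoint.get("auth")
    if auth == "api_key":
        return "system actor (external service)"
    if auth == "role":
        return f"actor for role '{endpoint['role']}'"
    if auth == "scoped":
        return f"actor scoped within {endpoint['scope']}"
    # No recognised auth: public-facing actor or an inbound system webhook.
    return "public-facing actor or system webhook"
```

The output is a list of suggestions to put in front of the user, since only they can say whether two auth contexts are really the same person.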
Step 6: Abstract away implementation
Now make a pass through your extracted spec and remove implementation details.
candidate_id: Integer became candidacy: Candidacy (relationship, not FK)
token: String(32) removed (implementation)
DateTime became Timestamp (domain type)
Added derived is_expired for clarity
Config values that derive from other config values (e.g. extended_timeout = base_timeout * 2) should use qualified references or expression-form defaults in the config block rather than independent literal values.
Step 7: Validate with stakeholders
The extracted spec is a hypothesis. Validate it:
Show the spec to the original developers. "Is this what the system does?"
Show to stakeholders. "Is this what the system should do?"
Look for gaps. Code often has bugs or missing features; the spec might reveal them.
Common findings:
"Oh, that retry logic was a hack, we should remove it"
"Actually we wanted X but never built it"
"These two code paths should be the same but aren't"
Before running further checks, read assessing specs to gauge the distilled spec's maturity. This tells you whether the spec is ready for process-level analysis or still needs structural work.
If the Allium CLI is available, run allium check on the distilled spec to catch structural issues, then allium analyse to identify process-level gaps. Findings from analyse can drive validation questions: "The distilled spec has a rule that requires background_check.status = clear but no surface captures background check results. Is this handled by a part of the codebase we haven't looked at?" Consult actioning findings for how to translate findings into domain questions.
Recognising library spec candidates
During distillation, stay alert for code that implements generic integration patterns rather than application-specific logic. These belong in library specs. See recognising library spec opportunities for the full decision framework (questions to ask, how to handle, common extractions).
Signals in the code
Look for these patterns that suggest a library spec:
Generic patterns with specific providers: OAuth flows, payment processing, email delivery, calendar sync, ATS integrations, file storage.
Red flags: integration logic in your spec
If you find yourself writing spec like this, stop and reconsider:
-- TOO DETAILED - this is Stripe's domain, not yours
rule ProcessStripeWebhook {
    when: WebhookReceived(payload, signature)
    requires: verify_stripe_signature(payload, signature)
    let event = parse_stripe_event(payload)
    if event.type = "invoice.paid":
        ...
}
When you find two terms for the same concept (across specs, within a spec, or between spec and code), treat it as a blocking problem.
-- BAD: Acknowledges duplication without resolving it
-- Order vs Purchase
-- checkout.allium uses "Purchase" - these are equivalent concepts.
This is not a resolution. When different parts of a codebase are built against different specs, both terms end up in the implementation: duplicate models, redundant join tables, foreign keys pointing both ways.
What to do:
Choose one term. Cross-reference related specs before deciding.
Update all references. Do not leave the old term in comments or "see also" notes.
Note the rename in a changelog, not in the spec itself.
Warning signs in code:
Two models representing the same concept (Order and Purchase)
Join tables for both (order_items, purchase_items)
Comments like "equivalent to X" or "same as Y"
The spec you extract must pick one term. Flag the other as technical debt to remove.
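One of these warning signs, parallel join tables, can be surfaced with a simple heuristic over table names. This is a sketch under a strong assumption: tables are named `<model>_<suffix>`, split at the first underscore. The function name `parallel_tables` is hypothetical, and a match only means two models deserve a closer look.

```python
from collections import defaultdict


def parallel_tables(table_names):
    """Group tables by suffix and return suffixes shared by more than one prefix."""
    by_suffix = defaultdict(set)
    for name in table_names:
        prefix, separator, suffix = name.partition("_")
        if separator:  # skip names with no underscore, e.g. "users"
            by_suffix[suffix].add(prefix)
    return {
        suffix: sorted(prefixes)
        for suffix, prefixes in by_suffix.items()
        if len(prefixes) > 1
    }


suspects = parallel_tables(["order_items", "purchase_items", "users", "order_events"])
# suspects == {"items": ["order", "purchase"]}
```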
Challenge: Implicit state machines
Code often has implicit states that are not modelled:
# No explicit status field, but there's a state machine hiding here
class FeedbackRequest:
    interview_id = Column(Integer)
    interviewer_id = Column(Integer)
    requested_at = Column(DateTime)
    reminded_at = Column(DateTime, nullable=True)
    feedback_id = Column(Integer, nullable=True)  # FK to Feedback if submitted
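The hidden states can be made explicit by deriving them from the nullable columns. This is a sketch using a plain dataclass in place of the ORM model. The state names `pending`, `reminded` and `submitted` are assumptions read off the column names and should be confirmed with stakeholders before they go into the spec.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FeedbackRequest:
    requested_at: datetime
    reminded_at: Optional[datetime] = None
    feedback_id: Optional[int] = None


def implied_state(request: FeedbackRequest) -> str:
    """Derive the implicit state from which nullable fields are populated."""
    if request.feedback_id is not None:
        return "submitted"  # feedback exists, regardless of reminders
    if request.reminded_at is not None:
        return "reminded"
    return "pending"
```

Writing the derivation out forces the ordering question into the open: here a submitted request stays `submitted` even if a reminder was sent first, which is a choice the user should confirm.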
Whether the current implementation properly handles failures is separate from what the system should do.
Challenge: Over-engineered abstractions
Enterprise codebases often have abstraction layers that obscure intent:
public interface NotificationStrategy {
    void notify(NotificationContext context);
}

public class SlackNotificationStrategy implements NotificationStrategy {
    @Override
    public void notify(NotificationContext context) {
        // Actual Slack call buried 5 levels deep
    }
}
Cut through to the actual behaviour. The spec does not need strategy patterns, dependency injection or abstract factories. Just: ensures: Notification.created(channel: slack, ...)
Checklist: Have you abstracted enough?
Before finalising a distilled spec:
No database column types (Integer, VARCHAR, etc.)
No ORM or query syntax
No HTTP status codes or API paths
No framework-specific concepts (middleware, decorators, etc.)
No programming language types (int, str, List, etc.)
No variable names from the code (use domain terms)
No infrastructure (Redis, Kafka, S3, etc.)
Foreign keys replaced with relationships
Tokens/secrets removed (implementation of identity)
Timestamps use domain Duration, not timedelta/seconds
If any remain, ask: "Would a stakeholder include this in a requirements doc?"
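Part of this checklist can be approximated with a text scan of the spec. This is a sketch with an illustrative, deliberately short term list taken from the items above; it is not an Allium feature, it matches whole words case-sensitively, and it will produce false positives that need human judgement. The name `flag_implementation_terms` is hypothetical.

```python
import re

# Illustrative subset of the checklist; extend it for the codebase at hand.
IMPLEMENTATION_TERMS = ["VARCHAR", "Redis", "Kafka", "S3", "timedelta", "int", "str", "List"]


def flag_implementation_terms(spec_text: str):
    """Return (line_number, term) for each implementation term found as a whole word."""
    findings = []
    for line_no, line in enumerate(spec_text.splitlines(), start=1):
        for term in IMPLEMENTATION_TERMS:
            if re.search(rf"\b{re.escape(term)}\b", line):
                findings.append((line_no, term))
    return findings
```

Because matching is on word boundaries, a domain type such as `String` is not flagged by the `str` entry.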
Checklist: Terminology consistency
Each concept has exactly one name throughout the spec
No "also known as" or "equivalent to" comments
Cross-referenced related specs for conflicting terms
Duplicate models in code flagged as technical debt to remove
After distillation
The extracted spec is a starting point. If distillation reveals gaps that need structured discovery (unclear requirements, complex entity relationships, unstated business rules), use the elicit skill to fill them. For targeted changes as requirements evolve, use the tend skill. For checking ongoing alignment between the spec and implementation, use the weed skill.