| name | service-layer-architecture |
| description | Use when designing or refactoring an API surface where the route handler / Convex action / framework endpoint has accumulated runtime mechanics (provisioning, credential reads, validation, readiness checks, restart/teardown helpers, integration fan-out). Separates the orchestration layer (auth, ownership, status transitions, audit events, persistence, user-facing error policy) from service modules (reusable runtime mechanics that return structured results). Substrate-agnostic — applies to Convex actions, Next.js App Router API route handlers / Server Actions, FastAPI endpoints, NestJS controllers, Express handlers. |
| metadata | {"origin":"David Ondrej framing — codified for portfolio use after the Runtime Reconciliation / Deployment Lifecycle plan, 2026-05-18.","applies_to":"any backend where business logic and runtime mechanics live in the same function"} |
Service Layer Architecture
The pattern in one sentence: the route handler decides what to do and who is allowed to do it; the service module decides how to do it and returns a structured result the handler can interpret.
When to invoke this skill
Use this skill BEFORE touching code when any of these are true:
- A single route file is > 200 lines and mixes auth + persistence + integration calls + retries + telemetry + error mapping.
- You're about to add the same provisioning / credential-lookup / readiness-poll code to a second handler.
- A handler imports a runtime client (Stripe, Xero, Supabase service-role, a worker dispatcher) and immediately calls 4+ methods on it in sequence.
- A bugfix to "how something is provisioned" requires editing N route files in lock-step.
- Tests for a route require mocking auth + DB + every downstream integration — a tell that the route owns too many concerns.
If none of the above apply, this skill is overkill. Don't introduce a service layer for a 20-line CRUD route.
The two layers, precisely
Action layer (a.k.a. orchestration layer / use-case layer)
Lives in: the route handler (app/api/**/route.ts), the Server Action (app/_actions/*.ts), the Convex action (convex/*.ts), the FastAPI endpoint (app/server/routes/*.py).
Owns:
- Authentication. `getServerSession`, `auth.uid()`, bearer-token verification.
- Authorization / ownership. "Does this user own this resource?" "Does this workspace permit this transition?"
- Status transition policy. "An inspection can move from SUBMITTED → CLASSIFIED only when …"
- Audit events. `AuditLog.create({ action: "INSPECTION_SUBMITTED", actorId, … })`.
- Persistence orchestration. Calls Prisma / Convex mutations directly OR delegates to a service module that returns data to persist.
- User-facing error policy. "HTTP 402 if subscription expired", "HTTP 403 if not your inspection", "HTTP 409 if status mismatch", "HTTP 500 + log if anything else".
- Telemetry. `console.error("[Inspections]", …)` and structured logging.
Does not own:
- Network retries to third parties.
- Credential lookup from secret managers or env.
- Health/readiness probes against external systems.
- Worker / dispatcher / runtime provisioning (memory allocation, container spin-up).
- Pure validation logic (format checks, shape checks). Validation that requires a DB read is action-layer; pure validation is service-layer.
Service-module layer (a.k.a. domain-service layer / runtime layer)
Lives in: lib/services/<domain>/<concern>.ts (Next.js), services/<domain>.py (FastAPI), a separate convex/_services/ folder kept off the public action graph (Convex).
Owns:
- Gateway credential reads. "Get the Xero access token for org X" is a function, not inline code in every handler that talks to Xero.
- Runtime setup. Dispatcher init, memory allocation, container provisioning, model client warming.
- Validation helpers. Pure functions that return `{ ok: true }` or `{ ok: false, errors: [...] }`.
- Readiness probes. `await isXeroReady(orgId)` returns a structured `{ ready: boolean, reason?: string, retryAfterMs?: number }`.
- Restart / teardown helpers. Idempotent functions you can call from a cron job AND from a user-triggered route.
- Retry policy. Exponential backoff, jitter, circuit-breaker state.
- Structured-result contract. Every public service function returns a discriminated-union result type and never throws for expected outcomes.
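As a rough illustration of two items from the list above, the sketch below pairs a pure validation helper with a readiness probe. The names `validateInspectionInput` and `probeReadiness`, the input fields, and the injected `ping` dependency are all hypothetical; they are only there to show the return shapes, and a real `isXeroReady` would wrap whatever health check the integration actually offers.

```typescript
// Hypothetical validation result, matching the { ok } / { ok, errors } shape above.
export type ValidationResult = { ok: true } | { ok: false; errors: string[] };

// Pure validation: no I/O and no DB read, so it can live in the service layer.
export function validateInspectionInput(input: {
  address?: string;
  photoCount?: number;
}): ValidationResult {
  const errors: string[] = [];
  if (!input.address || input.address.trim().length === 0) {
    errors.push("address is required");
  }
  if (input.photoCount === undefined || input.photoCount < 1) {
    errors.push("at least one photo is required");
  }
  return errors.length === 0 ? { ok: true } : { ok: false, errors };
}

// Readiness result, mirroring the structured shape described above.
export interface Readiness {
  ready: boolean;
  reason?: string;
  retryAfterMs?: number;
}

// The probe receives its network call as an argument (assumed signature),
// so it can run from a route, a cron job, or a test without changes.
export async function probeReadiness(
  orgId: string,
  ping: (orgId: string) => Promise<{ reachable: boolean; throttledForMs?: number }>,
): Promise<Readiness> {
  try {
    const res = await ping(orgId);
    if (res.throttledForMs !== undefined) {
      return { ready: false, reason: "THROTTLED", retryAfterMs: res.throttledForMs };
    }
    return res.reachable ? { ready: true } : { ready: false, reason: "UNREACHABLE" };
  } catch {
    // Unexpected transport failures are folded into the structured result.
    return { ready: false, reason: "PROBE_FAILED" };
  }
}
```

Neither function touches a request, a session, or HTTP status codes; the caller decides what a `ready: false` means for the user.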
The structured-result shape (TypeScript):
export type ServiceResult<T, E extends string = string> =
| { ok: true; data: T }
| { ok: false; reason: E; detail?: string; retryAfterMs?: number };
The action layer reads result.ok and maps to HTTP semantics. The service layer never knows what HTTP is.
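One way the action side of this contract might look is sketched below. The reason codes, the `HttpReply` shape, and the chosen status codes are assumptions for illustration; a real handler would build its framework's response object instead.

```typescript
// Restated from the definition above so the sketch is self-contained.
export type ServiceResult<T, E extends string = string> =
  | { ok: true; data: T }
  | { ok: false; reason: E; detail?: string; retryAfterMs?: number };

// Hypothetical reason codes for a Xero credential service.
type XeroReason = "NOT_CONNECTED" | "REFRESH_FAILED" | "RATE_LIMITED";

// Framework-neutral reply; stands in for Response / NextResponse / etc.
interface HttpReply {
  status: number;
  body: unknown;
  headers?: Record<string, string>;
}

// The single place where service outcomes become HTTP semantics.
export function toHttpReply(result: ServiceResult<string, XeroReason>): HttpReply {
  if (result.ok) {
    return { status: 200, body: { connected: true } };
  }
  const reason = result.reason;
  switch (reason) {
    case "NOT_CONNECTED":
      return { status: 409, body: { error: "Connect Xero first" } };
    case "REFRESH_FAILED":
      return { status: 502, body: { error: "Could not refresh Xero credentials" } };
    case "RATE_LIMITED": {
      const seconds = Math.ceil((result.retryAfterMs ?? 1000) / 1000);
      return {
        status: 429,
        body: { error: "Try again shortly" },
        headers: { "Retry-After": String(seconds) },
      };
    }
    default: {
      // Exhaustiveness check: a new reason without a case fails to compile.
      const unhandled: never = reason;
      throw new Error(`Unhandled reason: ${String(unhandled)}`);
    }
  }
}
```

The `never` assignment in the default branch means that adding a reason to the union forces every mapping site to be updated at compile time.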
The cardinal rule
A service module never reads request, session, or cookies. It receives every dependency as a function argument. If it needs the workspace ID, the caller passes it in. If it needs a DB client, the caller passes it in. This is what makes service modules:
- Reusable — same function works from an API route, a cron job, an admin script, a test.
- Testable — no auth fakery required, just pass test inputs.
- Substrate-portable — swap Next.js for FastAPI, the service module moves untouched.
Conversely, an action never imports a third-party SDK directly. If you find import Stripe from "stripe" inside app/api/billing/route.ts, refactor that line into lib/services/billing/stripe-gateway.ts and let the action call a wrapper.
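A minimal sketch of the cardinal rule, assuming a hypothetical `CredentialDeps` bundle (token store, refresher, and clock). `getValidXeroAccessToken` is the name used elsewhere in this document, but its signature here is an assumption, not the project's actual API.

```typescript
type ServiceResult<T, E extends string = string> =
  | { ok: true; data: T }
  | { ok: false; reason: E; detail?: string; retryAfterMs?: number };

// Hypothetical shapes for illustration only.
interface StoredToken {
  accessToken: string;
  refreshToken: string;
  expiresAtMs: number;
}

interface CredentialDeps {
  loadToken(orgId: string): Promise<StoredToken | null>; // e.g. backed by the DB client
  refreshToken(refreshToken: string): Promise<StoredToken | null>; // e.g. backed by an HTTP client
  now(): number; // injected clock
}

// No request, session, or cookies: everything arrives as an argument.
export async function getValidXeroAccessToken(
  orgId: string,
  deps: CredentialDeps,
): Promise<ServiceResult<string, "NOT_CONNECTED" | "REFRESH_FAILED">> {
  const stored = await deps.loadToken(orgId);
  if (!stored) return { ok: false, reason: "NOT_CONNECTED" };

  // Token still valid: return it without touching the network.
  if (stored.expiresAtMs > deps.now()) return { ok: true, data: stored.accessToken };

  const refreshed = await deps.refreshToken(stored.refreshToken);
  if (!refreshed) return { ok: false, reason: "REFRESH_FAILED", detail: `org ${orgId}` };

  // A fuller version would also hand the refreshed token back so the caller can persist it.
  return { ok: true, data: refreshed.accessToken };
}
```

Because the clock and both I/O functions are parameters, the same function can be called from a route, a cron job, or a test with plain object literals as dependencies.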
Anti-patterns this prevents
| Smell | What's wrong | Service-layer fix |
|---|---|---|
| Fat action: 400+ line handler that does everything | Untestable; every change risks regressions in unrelated concerns | Split: action keeps auth + status; mechanics move to service module |
| Copy-paste credential reads: 8 handlers all call `getXeroToken()` inline | Token-refresh logic drifts; one handler updates, others break | One `lib/services/xero/credentials.ts` with `getValidXeroAccessToken(orgId)` |
| Throw-and-catch ladder: service throws `XeroAuthError`, handler catches it, maps to 401, then throws `XeroRateLimitError`, handler catches it, maps to 429 | Errors-as-control-flow; new error class needs a touch in every handler | Service returns `{ ok: false, reason: "RATE_LIMITED", retryAfterMs }`; handler has one switch on `reason` |
| Inline retry loops: `for (let i = 0; i < 3; i++) try { … } catch {}` in a route | Retry policy diverges per route; impossible to instrument | Service module owns retry; route just awaits one call |
| DB-write inside provisioning helper: `provisionWorker()` writes `Worker.status = "PROVISIONED"` | Helper can no longer be called from a dry-run / restart context | Helper returns the new status; action decides whether to persist |
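To make the "service module owns retry" fix concrete, here is a small sketch of a retry helper that uses exponential backoff with full jitter and reports exhaustion as a structured result. `withRetry`, `RetryDeps`, the default attempt count, and the base delay are hypothetical choices, not values taken from any particular project.

```typescript
type ServiceResult<T, E extends string = string> =
  | { ok: true; data: T }
  | { ok: false; reason: E; detail?: string; retryAfterMs?: number };

// Injected so tests neither wait on real timers nor depend on real randomness.
interface RetryDeps {
  sleep(ms: number): Promise<void>;
  random(): number; // expected to return a value in [0, 1)
}

export async function withRetry<T>(
  attempt: () => Promise<T>,
  deps: RetryDeps,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<ServiceResult<T, "RETRIES_EXHAUSTED">> {
  let lastError = "";
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return { ok: true, data: await attempt() };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
      if (i < maxAttempts - 1) {
        // Full jitter: a random delay in [0, base * 2^i).
        await deps.sleep(Math.floor(deps.random() * baseDelayMs * 2 ** i));
      }
    }
  }
  // Exhaustion is an expected outcome, so it is returned rather than thrown.
  return { ok: false, reason: "RETRIES_EXHAUSTED", detail: lastError };
}
```

A route that previously carried its own loop would then await a single service call and switch on the returned `reason`.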
How this maps across substrates
| Substrate | Action layer | Service-module layer |
|---|---|---|
| Convex | convex/<domain>.ts action / mutation | convex/_services/<domain>/<concern>.ts (folder convention; not exported from convex root) |
| Next.js App Router | app/api/<domain>/route.ts or app/_actions/<domain>.ts (Server Actions) | lib/services/<domain>/<concern>.ts |
| FastAPI | app/server/routes/<domain>.py endpoint | app/server/services/<domain>/<concern>.py |
| NestJS | <Domain>Controller | <Domain>Service (Nest already enforces this — the skill is about not letting the service grow into a god-class) |
| Express | routes/<domain>.ts handler | services/<domain>/<concern>.ts |
The principle is identical. The naming differs.
Test-design implications
Service-module tests
- Unit tests, fully mocked downstream. Inject the DB client. Inject the HTTP client. Inject the clock.
- Cover the structured-result space: every `reason` value has at least one test producing it.
- No request/session fixtures. No HTTP framework imports.
- Example file path: `lib/services/xero/__tests__/credentials.test.ts`.
Action tests
- Integration tests, mock at the service-module boundary only. Stub `getValidXeroAccessToken` to return `{ ok: true, data: "fake-token" }` or `{ ok: false, reason: "REFRESH_FAILED" }`.
- Cover HTTP-status mapping: every `result.reason` should produce a deterministic HTTP code.
- Use the real framework (Next.js / Convex / FastAPI test client) for routing + auth + serialisation.
- Example file path: `app/api/xero/__tests__/sync.test.ts`.
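The action-test bullets above can be sketched without any test framework. In this hypothetical version the service function is passed into the action as a parameter, which is one simple way to stub at the service-module boundary; a project using a test runner's module mocking would stub the import instead. `makeSyncAction`, `cases`, `runCases`, and the status codes are assumptions.

```typescript
type ServiceResult<T, E extends string = string> =
  | { ok: true; data: T }
  | { ok: false; reason: E; detail?: string; retryAfterMs?: number };

type TokenReason = "NOT_CONNECTED" | "REFRESH_FAILED";
type GetToken = (orgId: string) => Promise<ServiceResult<string, TokenReason>>;

// Hypothetical action core: only the mapping from service outcome to status.
export function makeSyncAction(getValidXeroAccessToken: GetToken) {
  return async (orgId: string): Promise<{ status: number }> => {
    const token = await getValidXeroAccessToken(orgId);
    if (token.ok) return { status: 200 };
    return { status: token.reason === "NOT_CONNECTED" ? 409 : 502 };
  };
}

// One row per service outcome, with the status the action should produce.
export const cases: Array<{ stub: GetToken; expected: number }> = [
  { stub: async () => ({ ok: true, data: "fake-token" }), expected: 200 },
  { stub: async () => ({ ok: false, reason: "NOT_CONNECTED" }), expected: 409 },
  { stub: async () => ({ ok: false, reason: "REFRESH_FAILED" }), expected: 502 },
];

// Returns true when every stubbed outcome maps to its expected status.
export async function runCases(): Promise<boolean> {
  for (const { stub, expected } of cases) {
    const reply = await makeSyncAction(stub)("org_1");
    if (reply.status !== expected) return false;
  }
  return true;
}
```

Nothing in these cases depends on how the credential service refreshes or retries, so changes inside the service module leave them untouched.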
What you do not test
- The framework itself. If Next.js correctly extracts `params.id` from the URL, that's the framework's job.
- Implementation details of service modules from action tests. If `credentials.ts` adds retry, action tests don't need updating.
Concrete refactor recipe (TDD)
When converting a fat action into action + service:
- Identify the seams. Read the action. Underline every line that does NOT belong in the action-layer list above. Those become the service module.
- Name the service. `lib/services/<noun>/<verb>.ts`, e.g. `lib/services/inspection/submit-orchestrator.ts`. One verb per file when possible.
- Write the failing service test first. Mock the DB client and any HTTP client. Assert the structured result for the happy path and one failure path.
- Extract the code. Move lines out of the action into the service. Make the action call `await submitOrchestrator({ inspectionId, prisma, fetch })`.
- Run the service test. Make it pass.
- Update or write the action test. Stub the service module's exported function. Assert HTTP status mapping for each `reason`.
- Run both test suites. Commit only when both green.
- Repeat for the next seam in the same action.
Frequent commits — one extracted concern per commit. If you extract 4 concerns from one action, that's 4 commits, not 1. This makes the diff reviewable and reversible.
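The end state of one pass through the recipe might look roughly like the sketch below. It is deliberately simplified: `classify` stands in for the HTTP call a real module would make with the injected `fetch`, `saveStatus` stands in for the Prisma write, and the reason code and status codes are assumptions.

```typescript
type ServiceResult<T, E extends string = string> =
  | { ok: true; data: T }
  | { ok: false; reason: E; detail?: string; retryAfterMs?: number };

// Hypothetical, narrow dependency shape for the extracted service.
interface SubmitDeps {
  inspectionId: string;
  classify(id: string): Promise<"PASS" | "FAIL">;
}

// Service: mechanics only. It reports an outcome and writes nothing.
export async function submitOrchestrator(
  deps: SubmitDeps,
): Promise<ServiceResult<{ outcome: "PASS" | "FAIL" }, "CLASSIFIER_UNAVAILABLE">> {
  try {
    const outcome = await deps.classify(deps.inspectionId);
    return { ok: true, data: { outcome } };
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { ok: false, reason: "CLASSIFIER_UNAVAILABLE", detail };
  }
}

interface ActionDeps extends SubmitDeps {
  currentStatus: string; // assumed to be loaded and ownership-checked already
  saveStatus(id: string, status: string): Promise<void>;
}

// Action after extraction: transition policy, persistence, and HTTP mapping.
export async function submitAction(deps: ActionDeps): Promise<{ status: number }> {
  if (deps.currentStatus !== "SUBMITTED") return { status: 409 };
  const result = await submitOrchestrator(deps);
  if (!result.ok) return { status: 502 };
  await deps.saveStatus(deps.inspectionId, "CLASSIFIED");
  return { status: 200 };
}
```

Each further seam extracted from the same action would add another service call here, with its own commit.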
What this skill does NOT prescribe
- Folder shape inside `lib/services/`. Flat vs nested is per-project taste. Suggestion: nest by domain (`lib/services/inspection/`, `lib/services/billing/`) once you have 3+ files per domain.
- DI container. No. Pass dependencies as function arguments. A 30-line constructor with 8 dependencies is a smell, not a feature.
- Naming a class. Service modules are usually a set of exported functions, not a class. Use a class only when the runtime state genuinely belongs together (a long-lived connection pool, a worker queue, a circuit breaker).
- Synchronous vs async. Both fine. Match the substrate's idiom.
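For the "use a class only when the runtime state genuinely belongs together" case, a circuit breaker is a reasonable example: the failure count and the reopen time have no meaning apart from each other. The sketch below is hypothetical and intentionally small (no half-open state); the clock is still injected as a plain argument rather than through a DI container.

```typescript
// Hypothetical circuit breaker holding state that belongs together.
export class CircuitBreaker {
  private failures = 0;
  private openUntilMs = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injected clock, so tests control time
  }

  // True while calls should be short-circuited.
  isOpen(): boolean {
    return this.now() < this.openUntilMs;
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  // Opens the circuit for cooldownMs once the threshold is reached.
  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.threshold) {
      this.openUntilMs = this.now() + this.cooldownMs;
      this.failures = 0;
    }
  }
}
```

A service function would check `isOpen()` before calling the third party and return a structured failure with `retryAfterMs` while the circuit is open.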
Ondrej's own framing (verbatim — for context)
In his "Decorate with Convex" walkthrough (May 2025), Ondrej explains why a Convex action is the orchestration seam:
"This is the generate decorated image function it's an internal action and on convex actions are effectual which means that they can run for I think 10 minutes but they can't directly access the database they have to go via mutations or queries to do that... I have to do this use nodes um directive on this convex action which means that this is all going to run within a node context as opposed to a um V8 isolate context"
In Next.js 16 / RSC content he names the pattern directly:
"To avoid 'Dependency Hell,' Server Actions should stay minimal. Treat them like Controllers in a modular backend — keep the business logic in a dedicated Service Layer."
In his Anthropic Agent SDK overview (Feb 2026) he ties the pattern to agent ergonomics:
"It was already there baked into the project because of the way convex is set up so claude code which is what I was using for the project really understood what was going on."
The takeaway: action = side-effectful gateway with extended runtime; service-module = pure business logic the agent can iterate on safely. The lower-level vocabulary in this skill (dispatcher runtime setup, readiness probes, teardown helpers) is not verbatim Ondrej — it's the standard enterprise framing this skill folds in for parity with worker-pool / model-server / cron-driven systems.
Cross-service transactions and Saga compensation
When one action calls two service modules in sequence and the second one fails, the action — not the service — owns the rollback. Two cases:
- Single-DB transaction. The action opens a Prisma `$transaction` and threads the transaction client to both service modules as a dependency. If module B returns `{ ok: false }`, the action throws inside the transaction callback so that Prisma rolls the transaction back; merely returning a failure value from the callback would commit. The action can then catch that error outside the callback and return its own structured failure. Service modules stay transaction-agnostic: they accept whatever client the action gives them.
- Cross-system / no shared transaction. When the modules touch separate systems (Stripe + Supabase, Stripe + a worker queue), use the Saga pattern: each successful module exposes a `compensate()` or inverse helper. If a downstream step fails, the action calls compensators in reverse order to restore consistency (e.g. `BillingService.refund(chargeId)` after `ProvisioningService.spinUp(workerId)` failed).
Both stay action-layer responsibilities; service modules expose the helpers but never decide when to invoke them.
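A minimal sketch of the cross-system case, assuming each step is described by a hypothetical `SagaStep` with a forward `run` and an inverse `compensate`. The runner itself would be called from the action, since the action decides when to compensate.

```typescript
type ServiceResult<T, E extends string = string> =
  | { ok: true; data: T }
  | { ok: false; reason: E; detail?: string; retryAfterMs?: number };

// A saga step pairs a forward operation with its inverse (hypothetical shape).
export interface SagaStep {
  name: string;
  run(): Promise<ServiceResult<unknown>>;
  compensate(): Promise<void>;
}

// Runs steps in order; on the first failure, undoes completed steps in reverse.
export async function runSaga(
  steps: SagaStep[],
): Promise<ServiceResult<string[], "STEP_FAILED">> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    const result = await step.run();
    if (!result.ok) {
      // Most recently completed step is compensated first.
      for (const done of completed.reverse()) {
        await done.compensate();
      }
      return { ok: false, reason: "STEP_FAILED", detail: `${step.name}: ${result.reason}` };
    }
    completed.push(step);
  }
  return { ok: true, data: completed.map((s) => s.name) };
}
```

This sketch lets a failing compensator propagate as an exception; a production version would need a policy for that case, such as logging and queuing the compensation for retry.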
Cross-references
- `superpowers:subagent-driven-development`: when implementing a multi-step refactor, dispatch a fresh subagent per service-module extraction so context stays focused.
- `superpowers:test-driven-development`: the recipe above is TDD; this skill is the layout, TDD is the discipline.
- Domain-Driven Design — this maps to the application-service vs domain-service distinction. Search "Vaughn Vernon application service" for the deepest formal treatment.
- Hexagonal Architecture / Ports & Adapters: actions are the inbound (driving) adapters; services are the use cases behind the ports; integrations are the outbound adapters. Same shape, different vocabulary.
- Clean Architecture (Uncle Bob) — actions are the "interface adapters" layer; services are the "use case interactors". Same shape again.
- Convex runtime model: Ondrej's framing leans on Convex's structural enforcement. Queries and mutations run deterministically in a V8 isolate, while actions are allowed side effects and can opt into a Node.js runtime with the "use node" directive. This is the same separation, enforced at runtime instead of by convention.
David Ondrej's framing is a practitioner's distillation of these traditions, with names tuned for the Convex / Next.js generation. Sources: Margot deep-research synthesis 2026-05-18 (interaction v1_ChdWMWtL…), Ondrej YouTube transcripts (Decorate with Convex, Anthropic Agent SDK overview).
Quick checklist (pin this above your refactor)