| name | serverless-development |
| description | AWS Lambda, Vercel Edge Functions, Cloudflare Workers, cold starts, deployment patterns, and infrastructure as code (SST, Serverless Framework). Use when building serverless applications or optimizing function-based architectures. |
Serverless Development
Overview
This skill covers building applications on serverless compute platforms where infrastructure management is abstracted away. It addresses AWS Lambda (Node.js, Python, Go runtimes), Vercel Edge Functions and Serverless Functions, Cloudflare Workers, cold start optimization, serverless design patterns (fan-out, Step Functions orchestration, event-driven), database connection management, deployment with SAM/SST/Serverless Framework, monitoring, and cost optimization.
Use this skill when building APIs, webhooks, background processing, scheduled tasks, or any workload that benefits from auto-scaling, pay-per-use pricing, and zero infrastructure management.
Core Principles
- Design for statelessness - Every function invocation is independent. Never rely on global variables to carry state between invocations (reusing connections or caches that can be safely rebuilt is fine). Use external state stores (databases, caches, queues) for persistence.
- Minimize cold starts - A cold start adds roughly 100ms-10s of latency to the first invocation served by each new function instance. Keep bundles small, use lightweight runtimes, and consider provisioned concurrency for latency-sensitive paths.
- Connection management is critical - Serverless functions can scale to thousands of concurrent instances. If each instance opens its own database connection, the database's connection limit is quickly exhausted. Put a connection pooler in front of the database (RDS Proxy, PgBouncer, Prisma Accelerate).
- Pay-per-invocation thinking - You pay for execution time multiplied by configured memory. Optimize hot paths, benchmark memory sizes (on Lambda, more memory also allocates more CPU, so a larger size can finish faster and sometimes cost less overall), and batch operations where possible.
- Embrace the constraints - Function timeouts, payload limits, and concurrency limits are features, not bugs. Design around them: use queues for long tasks, S3 for large payloads, and Step Functions for complex orchestration.
Key Patterns
Pattern 1: AWS Lambda with TypeScript
When to use: Backend APIs, event processing, scheduled tasks on AWS.
Implementation:
import {
APIGatewayProxyEventV2,
APIGatewayProxyResultV2,
SQSEvent,
ScheduledEvent,
Context,
} from "aws-lambda";
import { PrismaClient } from "@prisma/client";
let dbConnection: PrismaClient | null = null;
function getDb(): PrismaClient {
if (!dbConnection) {
dbConnection = new PrismaClient({
datasources: {
db: { url: process.env.DATABASE_URL },
},
});
}
return dbConnection;
}
export async function apiHandler(
event: APIGatewayProxyEventV2,
context: Context
): Promise<APIGatewayProxyResultV2> {
context.callbackWaitsForEmptyEventLoop = false;
try {
const method = event.requestContext.http.method;
const path = event.rawPath;
if (method === "GET" && path === "/api/users") {
const db = getDb();
const users = await db.user.findMany({ take: 50 });
return response(200, { users });
}
if (method === "POST" && path === "/api/users") {
const body = JSON.parse(event.body ?? "{}");
const db = getDb();
const user = await db.user.create({ data: body });
return response(201, { user });
}
return response(404, { error: "Not found" });
} catch (error) {
console.error("Handler error:", error);
return response(500, { error: "Internal server error" });
}
}
export async function sqsHandler(event: SQSEvent): Promise<{ batchItemFailures: Array<{ itemIdentifier: string }> }> {
const failures: string[] = [];
for (const record of event.Records) {
try {
const body = JSON.parse(record.body);
// processMessage is the application-specific message handler (not shown here).
await processMessage(body);
} catch (error) {
console.error(`Failed to process message ${record.messageId}:`, error);
failures.push(record.messageId);
}
}
return {
batchItemFailures: failures.map((id) => ({ itemIdentifier: id })),
};
}
export async function cronHandler(event: ScheduledEvent): Promise<void> {
console.log("Running scheduled task at:", event.time);
const db = getDb();
const deleted = await db.session.deleteMany({
where: { expiresAt: { lt: new Date() } },
});
console.log(`Cleaned up ${deleted.count} expired sessions`);
}
function response(statusCode: number, body: Record<string, unknown>): APIGatewayProxyResultV2 {
return {
statusCode,
headers: {
"Content-Type": "application/json",
"Access-Control-Allow-Origin": process.env.ALLOWED_ORIGIN ?? "*",
},
body: JSON.stringify(body),
};
}
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Parameters:
  # Assumed parameter: added so that !Ref DatabaseUrl below resolves.
  DatabaseUrl:
    Type: String
    NoEcho: true
Globals:
  Function:
    Runtime: nodejs20.x
    Timeout: 30
    MemorySize: 256
    Environment:
      Variables:
        DATABASE_URL: !Ref DatabaseUrl
        NODE_OPTIONS: "--enable-source-maps"
Resources:
  ApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: dist/handler.apiHandler
      Events:
        Api:
          Type: HttpApi
          Properties:
            Path: /api/{proxy+}
            Method: ANY
    Metadata:
      BuildMethod: esbuild
      BuildProperties:
        Minify: true
        Target: es2022
        Sourcemap: true
        EntryPoints:
          - src/handler.ts
  QueueProcessor:
    Type: AWS::Serverless::Function
    Properties:
      Handler: dist/handler.sqsHandler
      Events:
        Queue:
          Type: SQS
          Properties:
            Queue: !GetAtt ProcessingQueue.Arn
            BatchSize: 10
            FunctionResponseTypes:
              - ReportBatchItemFailures
  ScheduledCleanup:
    Type: AWS::Serverless::Function
    Properties:
      Handler: dist/handler.cronHandler
      Events:
        Schedule:
          Type: Schedule
          Properties:
            Schedule: rate(1 hour)
  ProcessingQueue:
    Type: AWS::SQS::Queue
    Properties:
      # 6x the 30s function timeout, so in-flight messages are not redelivered mid-processing.
      VisibilityTimeout: 180
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt DLQ.Arn
        maxReceiveCount: 3
  DLQ:
    Type: AWS::SQS::Queue
Why: AWS Lambda with SAM provides infrastructure as code, local testing (sam local invoke), and automated deployment. Reusing the client through a module-level variable means warm invocations skip the cost of reconnecting; only a cold start pays it. Partial batch failure reporting for SQS ensures that only the failed messages are retried rather than the whole batch.
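One gap in the handler above is that a malformed request body makes JSON.parse throw, which surfaces as a 500 rather than a 400. The sketch below shows one way to separate client errors from server errors. `parseJsonBody` and `ParseResult` are hypothetical names introduced for illustration; they are not part of aws-lambda or Prisma.

```typescript
// Either a parsed JSON object or an HTTP-style description of what was wrong.
type ParseResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; statusCode: number; error: string };

// Hypothetical helper: validates that the raw body is a JSON object.
function parseJsonBody(raw: string | undefined | null): ParseResult {
  if (!raw) {
    return { ok: false, statusCode: 400, error: "Request body is required" };
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, statusCode: 400, error: "Request body is not valid JSON" };
  }
  // Arrays and primitives are valid JSON but not a usable record payload.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, statusCode: 400, error: "Request body must be a JSON object" };
  }
  return { ok: true, value: parsed as Record<string, unknown> };
}
```

In the POST branch, the handler could call `parseJsonBody(event.body)` and return `response(result.statusCode, { error: result.error })` when `result.ok` is false. Field-level validation would still be needed before passing the value to the database.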
Pattern 2: Vercel Edge Functions
When to use: Low-latency API routes and middleware that need to run close to users globally.
Implementation:
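// Assumes the @vercel/kv client; getLocalizedContent is an application-specific helper (not shown).
import { kv } from "@vercel/kv";
export const runtime = "edge";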
export async function GET(request: Request) {
const country = request.headers.get("x-vercel-ip-country") ?? "US";
const city = request.headers.get("x-vercel-ip-city") ?? "Unknown";
const cachedData = await kv.get(`content:${country}`);
if (cachedData) {
return Response.json(cachedData);
}
const content = await getLocalizedContent(country);
await kv.set(`content:${country}`, content, { ex: 3600 });
return Response.json(content, {
headers: {
"Cache-Control": "public, s-maxage=3600, stale-while-revalidate=86400",
},
});
}
import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";
export const config = {
// "/pricing" must be matched, otherwise the A/B branch in the middleware never runs.
matcher: ["/dashboard/:path*", "/api/:path*", "/pricing"],
};
export function middleware(request: NextRequest) {
const token = request.cookies.get("session_token");
if (!token && request.nextUrl.pathname.startsWith("/dashboard")) {
return NextResponse.redirect(new URL("/login", request.url));
}
if (request.nextUrl.pathname === "/pricing") {
// Reuse a previously assigned variant so returning visitors see a consistent page.
const existing = request.cookies.get("ab-pricing")?.value;
const variant = existing ?? (Math.random() < 0.5 ? "control" : "new-layout");
// Build the final response first so the cookie is set on the response that is actually returned.
const response = variant === "new-layout" ? NextResponse.rewrite(new URL("/pricing-v2", request.url)) : NextResponse.next();
if (!existing) {
response.cookies.set("ab-pricing", variant, { maxAge: 60 * 60 * 24 * 30 });
}
return response;
}
return NextResponse.next();
}
Why: Edge Functions run in 30+ global locations on V8 isolates rather than containers, so cold starts are typically negligible compared with container-based functions. They're ideal for auth checks, geo-routing, A/B testing, and API responses that benefit from proximity to users. The tradeoff is a restricted runtime: only a subset of Node.js APIs, no native modules, and less memory.
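The A/B assignment logic is easier to test when it is separated from the request and response objects. The following is a minimal sketch under that assumption. `assignVariant` and `Assignment` are hypothetical names, and the random source is injected so the behavior can be checked deterministically.

```typescript
type Variant = "control" | "new-layout";

interface Assignment {
  variant: Variant;
  // True when the caller should persist the variant in a cookie.
  isNew: boolean;
}

// Hypothetical helper: keeps an existing valid assignment, otherwise picks one at random.
function assignVariant(
  cookieValue: string | undefined,
  random: () => number = Math.random
): Assignment {
  if (cookieValue === "control" || cookieValue === "new-layout") {
    return { variant: cookieValue, isNew: false };
  }
  // Missing or unrecognized cookie values are treated as unassigned.
  return { variant: random() < 0.5 ? "control" : "new-layout", isNew: true };
}
```

A middleware could call `assignVariant(request.cookies.get("ab-pricing")?.value)`, rewrite to the alternate page when the variant is "new-layout", and set the cookie only when `isNew` is true.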
Pattern 3: Cold Start Optimization
When to use: When latency-sensitive endpoints experience unacceptable cold start times.
Implementation:
import { build } from "esbuild";
await build({
entryPoints: ["src/handler.ts"],
bundle: true,
minify: true,
platform: "node",
target: "node20",
outdir: "dist",
external: [
"@aws-sdk/*",
],
treeShaking: true,
sourcemap: true,
});
let _sharp: typeof import("sharp") | null = null;
async function getSharp() {
if (!_sharp) {
_sharp = await import("sharp");
}
return _sharp;
}
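export async function imageHandler(event: APIGatewayProxyEventV2) {
if (event.rawPath.includes("/resize")) {
// Only the resize path pays the cost of loading sharp.
const sharp = await getSharp();
// Use sharp here; the resize logic is omitted from this example.
}
}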
export async function handler(event: unknown) {
if ((event as Record<string, unknown>).source === "serverless-warming") {
console.log("Warming invocation, returning early");
return { statusCode: 200, body: "warm" };
}
}
Why: Cold starts are often the main latency concern in serverless. Smaller bundles initialize faster. Lazy loading defers the cost of heavy imports until a request actually needs them. Provisioned concurrency removes cold starts for critical paths up to the provisioned level (at a cost). Warming invocations are a cheaper alternative that keeps an instance hot, though they do not help when a burst needs more instances than are currently warm.
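The lazy-loading idea generalizes beyond a single module. The sketch below is one possible helper, assuming the expensive dependency can be expressed as an async loader. `lazy` is a hypothetical name, not an esbuild or Lambda API. It caches the promise rather than the resolved value, so concurrent callers within one instance share a single load.

```typescript
// Wraps an async loader so it runs at most once per function instance.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = loader().catch((err) => {
        // Clear the cache on failure so a later invocation can retry.
        cached = undefined;
        throw err;
      });
    }
    return cached;
  };
}

// Stand-in for a heavy module; a real loader might be () => import("sharp").
const getHeavyModule = lazy(async () => ({
  describe: (width: number) => `resize to ${width}px`,
}));
```

A handler would `await getHeavyModule()` only on the code paths that need it, so other paths never pay the import cost.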
Pattern 4: Database Connection Pooling
When to use: Any serverless function that connects to a relational database.
Implementation:
datasource db {
provider = "postgresql"
url = env("DATABASE_URL")
directUrl = env("DIRECT_DATABASE_URL")
}
generator client {
provider = "prisma-client-js"
}
import { Pool, neonConfig } from "@neondatabase/serverless";
import { PrismaNeon } from "@prisma/adapter-neon";
import { PrismaClient } from "@prisma/client";
import ws from "ws";
neonConfig.webSocketConstructor = ws;
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const adapter = new PrismaNeon(pool);
const prisma = new PrismaClient({ adapter });
let client: PrismaClient | null = null;
export function getClient(): PrismaClient {
if (!client) {
client = new PrismaClient({
log: process.env.NODE_ENV === "development" ? ["query"] : [],
});
}
return client;
}
Why: Each concurrently running function instance holds its own database connection. With thousands of concurrent instances, this exhausts the database connection limit (typically 100-500 for managed databases). Connection pooling (RDS Proxy, PgBouncer, Neon serverless, Prisma Accelerate) multiplexes many serverless connections over fewer database connections.
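When a pooler such as PgBouncer sits in front of the database, it is common to also cap the pool that each function instance opens. The sketch below builds such a connection string. It assumes Prisma's `connection_limit` and `pgbouncer` connection-string parameters, and `withServerlessPoolParams` is a hypothetical helper name. Check the parameters against the documentation for the Prisma version and pooler in use.

```typescript
// Hypothetical helper: caps the per-instance pool and flags PgBouncer mode in the URL.
function withServerlessPoolParams(databaseUrl: string, connectionLimit = 1): string {
  const url = new URL(databaseUrl);
  // One connection per instance is usually enough, since an instance handles one request at a time.
  url.searchParams.set("connection_limit", String(connectionLimit));
  // Assumption: only appropriate when the URL points at PgBouncer in transaction mode.
  url.searchParams.set("pgbouncer", "true");
  return url.toString();
}
```

The result would be used as the pooled `DATABASE_URL`, while `DIRECT_DATABASE_URL` keeps pointing at the database itself for migrations.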
Cost Optimization Reference
| Strategy | Savings | Tradeoff |
|---|---|---|
| Right-size memory | 20-40% | Requires benchmarking |
| ARM64 (Graviton) | 20% | Minor compatibility concerns |
| Provisioned concurrency (instead of over-provisioned always-on servers) | 50-70% | Cold-start latency when traffic scales past the provisioned level |
| Reserved concurrency | Prevents runaway costs | Limits throughput |
| Batch processing (SQS batch size) | 50-80% | Higher latency |
| Tiered storage (S3 lifecycle) | 30-60% | Access latency for cold tiers |
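The "right-size memory" row can be made concrete with a little arithmetic. The sketch below estimates monthly Lambda cost from memory, average duration, and invocation count. The prices in `EXAMPLE_PRICING` and the durations in the usage note are assumptions for illustration only; current pricing varies by region and architecture.

```typescript
interface LambdaPricing {
  perGbSecond: number;
  perMillionRequests: number;
}

// Assumed example prices, not authoritative; check current pricing for your region.
const EXAMPLE_PRICING: LambdaPricing = {
  perGbSecond: 0.0000166667,
  perMillionRequests: 0.2,
};

// Compute cost = GB-seconds consumed * price per GB-second + request charge.
function monthlyCost(
  memoryMb: number,
  avgDurationMs: number,
  invocations: number,
  pricing: LambdaPricing = EXAMPLE_PRICING
): number {
  const gbSeconds = (memoryMb / 1024) * (avgDurationMs / 1000) * invocations;
  const requestCost = (invocations / 1_000_000) * pricing.perMillionRequests;
  return gbSeconds * pricing.perGbSecond + requestCost;
}
```

With these assumed prices, a function at 256 MB averaging 400 ms over 10 million invocations costs about 18.67, while the same function at 512 MB averaging 150 ms costs about 14.50. Doubling memory is cheaper here only because the hypothetical duration drops by more than half, which is why benchmarking is required.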
Anti-Patterns
| Anti-Pattern | Why It's Bad | Better Approach |
|---|---|---|
| Opening DB connection per invocation | Exhausts connection pool | Module-level connection reuse + RDS Proxy |
| Bundling entire node_modules | Massive cold start (5-10s) | Tree-shake with esbuild, external AWS SDK |
| Storing state in global variables | Lost between invocations (unreliable) | Use DynamoDB, Redis, or S3 for state |
| 15-minute timeout for API handlers | User waiting 15 minutes? | 30s for APIs, use Step Functions for long tasks |
| No concurrency limits | Runaway costs, downstream overload | Set reserved concurrency per function |
| Synchronous fan-out | Slow, timeout risk | Use SQS/SNS for fan-out, process async |
| console.log without structure | Unsearchable in CloudWatch | Structured JSON logging with correlation IDs |
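The last row recommends structured JSON logging with correlation IDs. A minimal sketch of what that could look like follows. `formatLog` and its field names are hypothetical choices for illustration, not a CloudWatch requirement.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Hypothetical helper: produces one JSON object per log line so fields are queryable.
function formatLog(
  level: LogLevel,
  message: string,
  correlationId: string,
  fields: Record<string, unknown> = {}
): string {
  return JSON.stringify({
    // Spread caller fields first so they cannot overwrite the reserved keys below.
    ...fields,
    level,
    message,
    correlationId,
    timestamp: new Date().toISOString(),
  });
}

// Example call; inside a Lambda handler the ID could come from context.awsRequestId.
console.log(formatLog("info", "user created", "req-123", { userId: 42 }));
```

Passing the same correlation ID through queue messages and downstream calls lets a single request be traced across functions.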
Checklist
Related Resources
- Skills: event-driven-architecture (SQS/SNS patterns), monitoring-observability (Lambda tracing), performance-engineering (cold start impact on latency)
- Rules: docs/reference/stacks/python.md (Python Lambda patterns)