| name | database-design |
| description | Designs database schemas, indexing strategies, query optimization, and migration patterns for SQL and NoSQL databases. Use when designing tables, optimizing queries, fixing N+1 problems, planning migrations, or when asked about database performance, normalization, ORMs, or data modeling. |
Database Design
When to Load
- Trigger: Schema design, migrations, query optimization, indexing strategies, data modeling, N+1 fixes
- Skip: No database work involved in the current task
Database Design Workflow
Copy this checklist and track progress:
Database Design Progress:
- [ ] Step 1: Identify entities and relationships
- [ ] Step 2: Normalize schema (3NF minimum)
- [ ] Step 3: Evaluate denormalization needs
- [ ] Step 4: Design indexes for query patterns
- [ ] Step 5: Write and optimize critical queries
- [ ] Step 6: Plan migration strategy
- [ ] Step 7: Configure connection pooling
- [ ] Step 8: Validate against anti-patterns checklist
Schema Design Principles
Normalization Forms
1NF: Atomic values, no repeating groups
2NF: 1NF + no partial dependencies (all non-key columns depend on full PK)
3NF: 2NF + no transitive dependencies (non-key columns don't depend on other non-key columns)
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_name TEXT,
customer_email TEXT,
product1_name TEXT,
product1_qty INT,
product2_name TEXT,
product2_qty INT
);
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(id),
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INT REFERENCES orders(id),
product_id INT REFERENCES products(id),
quantity INT NOT NULL CHECK (quantity > 0)
);
When to Denormalize
Denormalize only when you have measured proof of performance issues:
ALTER TABLE posts ADD COLUMN comment_count INT DEFAULT 0;
CREATE FUNCTION update_comment_count() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
UPDATE posts SET comment_count = comment_count + 1 WHERE id = NEW.post_id;
ELSIF TG_OP = 'DELETE' THEN
UPDATE posts SET comment_count = comment_count - 1 WHERE id = OLD.post_id;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
Indexing Strategy
Index Types and When to Use
B-tree (default): Equality, range, sorting, LIKE 'prefix%'
Hash: Equality only (rarely better than B-tree)
GIN: Full-text search, JSONB, arrays
GiST: Geometry, range types, full-text
BRIN: Large tables with naturally ordered data (timestamps)
Composite Indexes
CREATE INDEX idx_users_status_created ON users (status, created_at);
Partial and Covering Indexes
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';
CREATE INDEX idx_users_email_covering ON users (email)
INCLUDE (name, avatar_url);
Index Anti-patterns
CREATE INDEX idx_users_active ON users (is_active);
CREATE INDEX idx_users_active_created ON users (is_active, created_at DESC)
WHERE is_active = true;
Query Optimization
Reading EXPLAIN Plans
EXPLAIN ANALYZE SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.status = 'active'
GROUP BY u.name;
N+1 Query Detection and Prevention
users = db.query(User).all()
for user in users:
orders = db.query(Order).filter(Order.user_id == user.id).all()
users = db.query(User).options(joinedload(User.orders)).all()
user_ids = [u.id for u in users]
orders = db.query(Order).filter(Order.user_id.in_(user_ids)).all()
orders_by_user = defaultdict(list)
for order in orders:
orders_by_user[order.user_id].append(order)
const users = await prisma.user.findMany();
for (const user of users) {
const orders = await prisma.order.findMany({ where: { userId: user.id } });
}
const users = await prisma.user.findMany({
include: { orders: true },
});
const userIds = users.map((u) => u.id);
const orders = await prisma.order.findMany({
where: { userId: { in: userIds } },
});
Pagination
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
SELECT * FROM posts
WHERE created_at < '2024-01-15T10:30:00Z'
ORDER BY created_at DESC
LIMIT 20;
Migration Patterns
Safe Migration Rules
1. Never rename a column in one step (add new, migrate data, drop old)
2. Never drop a column that's still read by running code
3. Add columns as nullable or with defaults
4. Create indexes CONCURRENTLY to avoid locking
5. Test rollback before deploying
Zero-Downtime Migration Example
ALTER TABLE users ADD COLUMN display_name TEXT;
UPDATE users SET display_name = name WHERE display_name IS NULL AND id BETWEEN 1 AND 10000;
ALTER TABLE users DROP COLUMN name;
Index Creation
CREATE INDEX idx_orders_user ON orders (user_id);
CREATE INDEX CONCURRENTLY idx_orders_user ON orders (user_id);
Connection Pooling
Rule of thumb: connections = (CPU cores * 2) + disk spindles
For most apps: 10-20 connections per application instance
engine = create_engine(
DATABASE_URL,
pool_size=10,
max_overflow=20,
pool_timeout=30,
pool_recycle=1800,
pool_pre_ping=True,
)
ORM Best Practices
Select Only What You Need
users = db.query(User).all()
users = db.query(User.id, User.name).all()
const users = await prisma.user.findMany();
const users = await prisma.user.findMany({
select: { id: true, name: true, email: true },
});
Bulk Operations
for item in items:
db.add(Item(**item))
db.commit()
db.bulk_insert_mappings(Item, items)
db.commit()
for (const item of items) {
await prisma.item.create({ data: item });
}
await prisma.item.createMany({ data: items });
await prisma.$transaction([
prisma.user.create({ data: userData }),
prisma.profile.create({ data: profileData }),
]);
NoSQL Design Patterns
Document Database (MongoDB)
{
_id: ObjectId("..."),
name: "Alice",
addresses: [
{ street: "123 Main St", city: "NYC", type: "home" },
{ street: "456 Work Ave", city: "NYC", type: "work" }
]
}
Key-Value / Redis Patterns
# Cache-aside pattern
1. Check cache for key
2. If miss, query database
3. Store result in cache with TTL
4. Return result
# Cache invalidation
- TTL-based: SET key value EX 3600 (1 hour)
- Event-based: Delete key on write
- Write-through: Update cache on every write
Common Anti-Patterns Summary
AVOID DO INSTEAD
-------------------------------------------------------------------
SELECT * SELECT specific columns
OFFSET pagination Cursor-based pagination
N+1 queries Eager load or batch queries
Indexing every column Index based on query patterns
UUID v4 as primary key UUID v7 or BIGSERIAL (better locality)
Storing money as FLOAT Use DECIMAL / BIGINT (cents)
No foreign keys "for speed" Use foreign keys (data integrity)
Giant migrations Small, reversible steps
No connection pooling Always pool connections
Premature denormalization Normalize first, denormalize with data