| name | sql-optimization-patterns |
| description | Use when debugging slow queries, designing indexes, analyzing EXPLAIN output, resolving N+1 problems, or tuning database performance. Do NOT use for basic SQL, simple CRUD, or JPA entity design (use jpa-patterns). |
| paths | **/*.sql, **/migrations/**, **/flyway/**, **/liquibase/** |
SQL Optimization Patterns
ํ๋จ ๊ธฐ์ค๊ณผ ๊ท์น ์ค์ฌ. SQL ๋ฌธ๋ฒ์ด ์๋, ์ฌ๋ฐ๋ฅธ ์ต์ ํ ์ ๋ต ์ ํ์ ์๋ด.
Quick Start
CRITICAL Rules
- ALWAYS
EXPLAIN ANALYZE before optimizing -- ์ถ์ธก ๊ธ์ง, ์ค์ธก ๊ธฐ๋ฐ
- NEVER
SELECT * in production -- ํ์ํ ์ปฌ๋ผ๋ง ๋ช
์
- ALWAYS index columns in WHERE, JOIN, ORDER BY -- ํ ํ
์ด๋ธ ์ค์บ ๋ฐฉ์ง
- NEVER function on indexed column in WHERE --
WHERE LOWER(email) = x ๋ ์ธ๋ฑ์ค ๋ฌด์ (functional index ํ์)
- ALWAYS
ANALYZE after bulk data changes -- ํต๊ณ ๊ฐฑ์ ์ผ๋ก ์ตํฐ๋ง์ด์ ํ๋จ ์ ์ํ
- NEVER OFFSET for deep pagination -- 10๋ง ํ ์ดํ ๊ทน์ฌํ ์ฑ๋ฅ ์ ํ
- PREFER batch operations -- ๋ฃจํ ๋ด ๊ฐ๋ณ INSERT/UPDATE ๊ธ์ง
- ALWAYS covering index for hot queries -- Index Only Scan ์ ๋
- NEVER over-index -- ์ธ๋ฑ์ค๋ง๋ค INSERT/UPDATE/DELETE ๋น์ฉ ์ฆ๊ฐ
- ALWAYS monitor slow query log -- ๋ฌธ์ ๋ ์ฝ๋๊ฐ ์๋ ์ด์์์ ๋ฐ๊ฒฌ๋จ
Optimization Workflow
Slow query reported
+-- 1. EXPLAIN ANALYZE ์คํ
| +-- Seq Scan on large table? --> ์ธ๋ฑ์ค ํ์ (Index Decision Tree)
| +-- Nested Loop on large sets? --> JOIN ์ ๋ต ๋ณ๊ฒฝ ๋๋ ์ธ๋ฑ์ค ์ถ๊ฐ
| +-- High cost but low rows? --> ํต๊ณ ์ค๋๋จ, ANALYZE ์คํ
| +-- Sort with high cost? --> ORDER BY ์ปฌ๋ผ์ ์ธ๋ฑ์ค ์ถ๊ฐ
+-- 2. ์ฟผ๋ฆฌ ์์ฒด ์ต์ ํ
| +-- SELECT * ? --> ํ์ํ ์ปฌ๋ผ๋ง
| +-- Correlated subquery? --> JOIN์ผ๋ก ๋ณํ
| +-- N+1 pattern? --> JOIN ๋๋ IN clause batch
+-- 3. ์ธ๋ฑ์ค ์ถ๊ฐ/์กฐ์
+-- 4. EXPLAIN ANALYZE ์ฌ์คํ์ผ๋ก ๊ฐ์ ํ์ธ
Index Type Decision
What query pattern?
+-- Equality (=) + Range (<, >, BETWEEN)
| --> B-Tree (default, most common)
+-- Equality only, no range
| --> Hash (PostgreSQL, slightly faster for =)
+-- Full-text search (LIKE '%word%')
| --> GIN + to_tsvector
+-- JSONB containment (@>, ?)
| --> GIN
+-- Geometric/spatial data
| --> GiST
+-- Very large table, data correlates with physical order
| --> BRIN (10-100x smaller than B-Tree)
Composite Index Rules
Index column order matters!
+-- WHERE a = ? AND b = ? --> (a, b) or (b, a) - both work
+-- WHERE a = ? AND b > ? --> (a, b) - equality first, range last
+-- WHERE a = ? ORDER BY b --> (a, b) - covers both filter and sort
+-- WHERE a = ? AND b = ? AND c > ? --> (a, b, c) - equalities first
Leftmost prefix rule: INDEX(a, b, c) ๋ (a), (a, b), (a, b, c) ์ฟผ๋ฆฌ์ ์ฌ์ฉ ๊ฐ๋ฅ. (b, c) ๋จ๋
์ ๋ถ๊ฐ.
Index Selection Checklist
| Index Type | When | Example |
|---|
| Standard B-Tree | WHERE/JOIN/ORDER BY | CREATE INDEX idx_email ON users(email) |
| Composite | Multi-column filter | CREATE INDEX idx_user_status ON orders(user_id, status) |
| Partial | ํน์ ์กฐ๊ฑด ํ๋ง ์์ฃผ ์กฐํ | CREATE INDEX idx_active ON users(email) WHERE status = 'active' |
| Covering (INCLUDE) | Index Only Scan ์ ๋ | CREATE INDEX idx_email_cover ON users(email) INCLUDE (name, created_at) |
| Expression | ํจ์ ๊ฒฐ๊ณผ๋ก ํํฐ๋ง | CREATE INDEX idx_lower ON users(LOWER(email)) |
Pagination Decision
Need pagination?
+-- Small dataset (<10K rows)?
| --> Offset-based (simple, "jump to page N" ๊ฐ๋ฅ)
+-- Large dataset, infinite scroll?
| --> Cursor-based (consistent performance)
+-- Search results with page numbers?
| --> Offset + total count cache
+-- Real-time feed?
| --> Cursor (keyset) only
Cursor-Based Pattern
SELECT id, name, created_at FROM users
ORDER BY created_at DESC, id DESC
LIMIT 21;
SELECT id, name, created_at FROM users
WHERE (created_at, id) < (:last_created_at, :last_id)
ORDER BY created_at DESC, id DESC
LIMIT 21;
CREATE INDEX idx_users_cursor ON users(created_at DESC, id DESC);
Core Optimization Patterns
N+1 Query
SELECT u.id, u.name, o.id as order_id, o.total
FROM users u LEFT JOIN orders o ON u.id = o.user_id
WHERE u.id IN (1, 2, 3, 4, 5);
SELECT * FROM orders WHERE user_id IN (1, 2, 3, 4, 5);
Correlated Subquery
SELECT u.name, (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id)
FROM users u;
SELECT u.name, COUNT(o.id) as order_count
FROM users u LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name;
Aggregate Optimization
SELECT COUNT(*) FROM orders;
SELECT reltuples::bigint FROM pg_class WHERE relname = 'orders';
CREATE INDEX idx_orders_created ON orders(created_at);
SELECT COUNT(*) FROM orders WHERE created_at > NOW() - INTERVAL '7 days';
Batch Operations
INSERT INTO users (name) VALUES ('Alice');
INSERT INTO users (name) VALUES ('Bob');
INSERT INTO users (name) VALUES ('Alice'), ('Bob'), ('Carol');
COPY users (name, email) FROM '/tmp/users.csv' CSV HEADER;
Anti-Patterns
| Anti-Pattern | Why Bad | Fix |
|---|
SELECT * | ๋ถํ์ํ I/O, covering index ๋ฌดํจ | ํ์ํ ์ปฌ๋ผ๋ง ๋ช
์ |
WHERE LOWER(col) = ? | ์ธ๋ฑ์ค ์ฌ์ฉ ๋ถ๊ฐ | Expression index ๋๋ ์ ๊ทํ ์ ์ฅ |
LIKE '%abc' | ์๋ถ๋ถ ์์ผ๋์นด๋ = ํ ์ค์บ | GIN trigram index ๋๋ full-text search |
OR in WHERE | ์ธ๋ฑ์ค ๋นํจ์จ | UNION ALL ๋๋ ๋ณ๋ ์ฟผ๋ฆฌ |
| Implicit type cast | WHERE id = '123' ์ธ๋ฑ์ค ๋ฌด์ ๊ฐ๋ฅ | ํ์
์ผ์น์ํค๊ธฐ |
| CTE as optimization fence | PostgreSQL 12+์์ inline ๋์ง๋ง ์ฃผ์ | MATERIALIZED / NOT MATERIALIZED ๋ช
์ |
| Over-indexing | ์ฐ๊ธฐ ์ฑ๋ฅ ์ ํ, ๋์คํฌ ๋ญ๋น | ์ฌ์ฉ๋์ง ์๋ ์ธ๋ฑ์ค ์ฃผ๊ธฐ์ ์ ๊ฑฐ |
Troubleshooting
| Symptom | Cause | Solution |
|---|
| Seq Scan on indexed column | ํต๊ณ ์ค๋๋จ ๋๋ ์ ํ๋ ๋ฎ์ | ANALYZE table; ์คํ |
| Index ์๋๋ฐ ์ ์ฐ์ | ํ ๋น์จ ๋์ผ๋ฉด ์ตํฐ๋ง์ด์ ๊ฐ Seq Scan ์ ํ | ์ ์ ๋์. Partial index ๊ณ ๋ ค |
| Nested Loop ๋๋ฆผ | ํฐ ํ
์ด๋ธ ๊ฐ JOIN | ์กฐ์ธ ์ปฌ๋ผ ์ธ๋ฑ์ค ํ์ธ |
| ํ์ด์ง๋ค์ด์
๋๋ ค์ง | OFFSET ๊น์ด์ง | Cursor-based pagination |
| ๋ฒํฌ INSERT ๋๋ฆผ | ์ธ๋ฑ์ค ๋ง์ | ์ธ๋ฑ์ค DROP -> INSERT -> ์ฌ์์ฑ |
| Lock contention | ์ฅ์๊ฐ ํธ๋์ญ์
| ํธ๋์ญ์
์งง๊ฒ, batch ์ฒ๋ฆฌ |
| Planner ์๋ชป๋ ์ถ์ | ํต๊ณ ๋ถ์ ํ | ALTER TABLE SET STATISTICS ๋์ด๊ธฐ |
EXPLAIN Key Metrics
| Metric | Good | Bad | Action |
|---|
| Index Scan / Index Only Scan | O | | ์ ์ง |
| Seq Scan (small table) | O | | ๋ฌด์ |
| Seq Scan (large table) | | X | ์ธ๋ฑ์ค ์ถ๊ฐ |
| Nested Loop (small inner) | O | | ์ ์ง |
| Nested Loop (large inner) | | X | ์ธ๋ฑ์ค ์ถ๊ฐ ๋๋ Hash Join ์ ๋ |
| Sort (in-memory) | O | | ์ ์ง |
| Sort (on-disk) | | X | work_mem ์ฆ๊ฐ ๋๋ ์ธ๋ฑ์ค |
| Rows estimate vs actual 10x+ ์ฐจ์ด | | X | ANALYZE ์คํ |
Maintenance Commands
ANALYZE users;
VACUUM ANALYZE users;
VACUUM FULL users;
SELECT indexrelname, idx_scan FROM pg_stat_user_indexes
WHERE idx_scan = 0 ORDER BY pg_relation_size(indexrelid) DESC;
SELECT query, mean_exec_time, calls
FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 20;
Cross-References
References