| name | sql-optimization-patterns |
| description | Master SQL query optimization, indexing strategies, and EXPLAIN analysis to dramatically improve database performance and eliminate slow queries. Use when debugging slow queries, designing database schemas, or optimizing application performance. |
SQL Optimization Patterns
Transform slow database queries into lightning-fast operations through systematic optimization, proper indexing, and query plan analysis.
When to Use This Skill
- Debugging slow-running queries
- Designing performant database schemas
- Optimizing application response times
- Reducing database load and costs
- Improving scalability for growing datasets
- Analyzing EXPLAIN query plans
- Implementing efficient indexes
- Resolving N+1 query problems
Core Concepts
1. Query Execution Plans (EXPLAIN)
Understanding EXPLAIN output is fundamental to optimization.
PostgreSQL EXPLAIN:
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'user@example.com';
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT u.*, o.order_total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > NOW() - INTERVAL '30 days';
Key Metrics to Watch:
- Seq Scan: Full table scan (usually slow for large tables)
- Index Scan: Using index (good)
- Index Only Scan: Using index without touching table (best)
- Nested Loop: Join method (okay for small datasets)
- Hash Join: Join method (good for larger datasets)
- Merge Join: Join method (good for sorted data)
- Cost: Estimated query cost (lower is better)
- Rows: Estimated rows returned
- Actual Time: Real execution time
2. Index Strategies
Indexes are the most powerful optimization tool.
Index Types:
- B-Tree: Default, good for equality and range queries
- Hash: Only for equality (=) comparisons
- GIN: Full-text search, array queries, JSONB
- GiST: Geometric data, full-text search
- BRIN: Block Range INdex for very large tables with correlation
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
CREATE INDEX idx_users_email_covering ON users(email)
INCLUDE (name, created_at);
CREATE INDEX idx_posts_search ON posts
USING GIN(to_tsvector('english', title || ' ' || body));
CREATE INDEX idx_metadata ON events USING GIN(metadata);
3. Query Optimization Patterns
Avoid SELECT *:
SELECT * FROM users WHERE id = 123;
SELECT id, email, name FROM users WHERE id = 123;
Use WHERE Clause Efficiently:
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
SELECT * FROM users WHERE email = 'user@example.com';
Optimize JOINs:
SELECT u.name, o.total
FROM users u, orders o
WHERE u.id = o.user_id AND u.created_at > '2024-01-01';
SELECT u.name, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01';
SELECT u.name, o.total
FROM (SELECT * FROM users WHERE created_at > '2024-01-01') u
JOIN orders o ON u.id = o.user_id;
Detailed patterns and worked examples
Detailed pattern documentation lives in references/details.md. Read that file when the navigation tier above is insufficient.
Best Practices
- Index Selectively: Too many indexes slow down writes
- Monitor Query Performance: Use slow query logs
- Keep Statistics Updated: Run ANALYZE regularly
- Use Appropriate Data Types: Smaller types = better performance
- Normalize Thoughtfully: Balance normalization vs performance
- Cache Frequently Accessed Data: Use application-level caching
- Connection Pooling: Reuse database connections
- Regular Maintenance: VACUUM, ANALYZE, rebuild indexes
ANALYZE users;
ANALYZE VERBOSE orders;
VACUUM ANALYZE users;
VACUUM FULL users;
REINDEX INDEX idx_users_email;
REINDEX TABLE users;
Common Pitfalls
- Over-Indexing: Each index slows down INSERT/UPDATE/DELETE
- Unused Indexes: Waste space and slow writes
- Missing Indexes: Slow queries, full table scans
- Implicit Type Conversion: Prevents index usage
- OR Conditions: Can't use indexes efficiently
- LIKE with Leading Wildcard:
LIKE '%abc' can't use index
- Function in WHERE: Prevents index usage unless functional index exists
Monitoring Queries
SELECT query, calls, total_time, mean_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;
SELECT
schemaname,
tablename,
seq_scan,
seq_tup_read,
idx_scan,
seq_tup_read / seq_scan AS avg_seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;
SELECT
schemaname,
tablename,
indexname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;