optimizing-performance
Guides performance optimization, profiling techniques, and bottleneck identification. Use when improving application speed, reducing resource usage, or diagnosing performance issues.
| name | optimizing-performance |
| description | Guides performance optimization, profiling techniques, and bottleneck identification. Use when improving application speed, reducing resource usage, or diagnosing performance issues. |
| license | MIT |
| compatibility | opencode |
| metadata | {"category":"quality","audience":"developers"} |
Strategies for identifying, analyzing, and resolving performance bottlenecks.
As a rule of thumb (the Pareto principle), roughly 80% of performance problems come from about 20% of the code.
Focus on:
├── Hot paths (frequently executed code)
├── I/O operations (database, network, disk)
├── Memory allocation patterns
└── Algorithm complexity
| Type | What It Measures | Tools |
|---|---|---|
| CPU Profiling | Time spent in functions | pprof, py-spy, Chrome DevTools |
| Memory Profiling | Allocation patterns, leaks | Valgrind, memory_profiler, Chrome |
| I/O Profiling | Disk/network operations | strace, perf, Wireshark |
| Database Profiling | Query performance | EXPLAIN, slow query log, APM |
1. Establish baseline
   └─ Measure current performance with realistic load
2. Identify hotspots
   └─ Profile to find where time/resources are spent
3. Form hypothesis
   └─ Why is this slow? What would make it faster?
4. Implement fix
   └─ Make ONE change at a time
5. Measure again
   └─ Did it help? By how much?
6. Repeat
   └─ Until performance goals are met
# Node.js
node --prof app.js
node --prof-process isolate-*.log > profile.txt
# Python
python -m cProfile -s cumtime app.py
py-spy record -o profile.svg -- python app.py
# Go
go test -cpuprofile cpu.prof -memprofile mem.prof -bench .
go tool pprof cpu.prof
# Database (PostgreSQL)
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';
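For Python code, the baseline, profile, and re-measure steps can also be run in-process. The sketch below is a minimal illustration using the standard-library `cProfile`, `pstats`, and `timeit` modules. `slow_sum` and `fast_sum` are hypothetical stand-ins for the code before and after a candidate fix. Whether the candidate actually helps is what the second measurement is for.

```python
import cProfile
import io
import pstats
import timeit


def slow_sum(n):
    # Hypothetical hotspot: builds a throwaway list before summing.
    return sum([i * i for i in range(n)])


def fast_sum(n):
    # Hypothetical candidate fix: same result, no intermediate list.
    return sum(i * i for i in range(n))


def profile(func, *args):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


def measure(func, *args, repeat=5, number=20):
    """Return the best total time for `number` calls across `repeat` runs."""
    return min(timeit.repeat(lambda: func(*args), repeat=repeat, number=number))


if __name__ == "__main__":
    baseline = measure(slow_sum, 100_000)   # 1. Establish baseline
    print(profile(slow_sum, 100_000))       # 2. Identify hotspots
    after = measure(fast_sum, 100_000)      # 5. Measure again
    print(f"baseline={baseline:.4f}s after={after:.4f}s")
```

Taking the minimum over several runs reduces noise from other processes. It does not replace measuring under realistic load.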
BAD (N+1 queries):
SELECT * FROM posts; -- 1 query
SELECT * FROM users WHERE id=1; -- N queries
SELECT * FROM users WHERE id=2;
...
GOOD (2 queries):
SELECT * FROM posts;
SELECT * FROM users WHERE id IN (1, 2, 3, ...);
Detection: High query count relative to data returned
Fix: Eager loading, batch fetching, JOINs
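The difference in query count can be reproduced with a small script. This is a sketch only: it uses an in-memory SQLite database from the standard library as a stand-in for a real database, and the schema, sample rows, and function names are all hypothetical.

```python
import sqlite3


def setup():
    """Create a tiny in-memory database with hypothetical users and posts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
        INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO posts VALUES (1, 1, 'a'), (2, 2, 'b'), (3, 3, 'c'), (4, 1, 'd');
        """
    )
    return conn


def authors_n_plus_one(conn):
    """One query for posts, then one extra query per post."""
    queries = 1
    posts = conn.execute("SELECT id, user_id FROM posts").fetchall()
    authors = {}
    for post_id, user_id in posts:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        queries += 1
        authors[post_id] = row[0]
    return authors, queries


def authors_batched(conn):
    """One query for posts, one IN (...) query for all referenced users."""
    posts = conn.execute("SELECT id, user_id FROM posts").fetchall()
    user_ids = sorted({user_id for _, user_id in posts})
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
    ).fetchall()
    names = dict(rows)
    return {post_id: names[user_id] for post_id, user_id in posts}, 2


if __name__ == "__main__":
    conn = setup()
    print(authors_n_plus_one(conn)[1], "queries vs", authors_batched(conn)[1])
```

Both functions return the same mapping. Only the first grows its query count with the number of posts.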
BAD:
SELECT * FROM logs; -- Returns millions of rows
GOOD:
SELECT * FROM logs
WHERE created_at > NOW() - INTERVAL '1 day'
ORDER BY created_at DESC
LIMIT 100;
Detection: Memory spikes, timeouts
Fix: Pagination, limits, streaming
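One way to combine pagination and streaming is keyset pagination wrapped in a generator, so only one page is held in memory at a time. The sketch below assumes a hypothetical `logs` table with a positive, increasing integer `id`, and uses in-memory SQLite purely for illustration.

```python
import sqlite3


def make_demo_db(row_count):
    """Build an in-memory logs table (hypothetical schema) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT)")
    conn.executemany(
        "INSERT INTO logs (id, message) VALUES (?, ?)",
        [(i, f"event {i}") for i in range(1, row_count + 1)],
    )
    return conn


def iter_logs(conn, page_size=100):
    """Yield log rows page by page, resuming after the last id seen."""
    last_id = 0  # Assumes ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, message FROM logs WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield from rows
        last_id = rows[-1][0]


if __name__ == "__main__":
    conn = make_demo_db(250)
    print(sum(1 for _ in iter_logs(conn, page_size=100)))
```

Filtering on `id > last_id` keeps each page query cheap even deep into the table, which `OFFSET`-based paging generally does not.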
BAD (sequential):
const result1 = await fetch_api_1(); // Wait 200ms
const result2 = await fetch_api_2(); // Wait 200ms
return combine(result1, result2);    // Total: ~400ms
GOOD (parallel):
const [result1, result2] = await Promise.all([
  fetch_api_1(),
  fetch_api_2(),
]); // Total: ~200ms
return combine(result1, result2);
Detection: Sequential I/O in traces
Fix: Parallel execution, async/await
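The same comparison can be sketched in Python with `asyncio.gather`. The two fetch functions below are hypothetical and simulate a 200 ms network call with `asyncio.sleep`, so the timings are illustrative only.

```python
import asyncio
import time


async def fetch_api_1():
    await asyncio.sleep(0.2)  # Simulated 200 ms network call
    return {"a": 1}


async def fetch_api_2():
    await asyncio.sleep(0.2)  # Simulated 200 ms network call
    return {"b": 2}


async def sequential():
    """Await each call in turn: total is roughly the sum of both waits."""
    r1 = await fetch_api_1()
    r2 = await fetch_api_2()
    return {**r1, **r2}


async def parallel():
    """Run both calls concurrently: total is roughly the longest wait."""
    r1, r2 = await asyncio.gather(fetch_api_1(), fetch_api_2())
    return {**r1, **r2}


def timed(coro_func):
    """Run a coroutine function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for func in (sequential, parallel):
        _, seconds = timed(func)
        print(f"{func.__name__}: {seconds:.2f}s")
```

This only helps when the calls are independent. If the second request needs the first result, the two cannot overlap.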
BAD (allocates in loop):
for item in large_list:
    result = []  # New list every iteration; earlier results are discarded
    result.append(transform(item))
GOOD (allocate once):
result = []
for item in large_list:
    result.append(transform(item))
BEST (generator):
def transform_all(items):
    for item in items:
        yield transform(item)
Detection: GC pressure, memory profiling
Fix: Object pooling, pre-allocation, generators
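The memory effect of the generator approach can be checked with the standard-library `tracemalloc` module. This is a rough sketch: `transform` is a hypothetical placeholder, and the absolute numbers depend on the interpreter and platform.

```python
import tracemalloc


def transform(item):
    # Hypothetical per-item work.
    return item * 2


def total_with_list(items):
    """Materialize every transformed item before summing."""
    return sum([transform(i) for i in items])


def total_with_generator(items):
    """Stream transformed items into sum() one at a time."""
    return sum(transform(i) for i in items)


def peak_bytes(func, *args):
    """Return the peak traced memory (in bytes) while func runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


if __name__ == "__main__":
    data = range(100_000)
    print("list:     ", peak_bytes(total_with_list, data))
    print("generator:", peak_bytes(total_with_generator, data))
```

A lower peak does not imply faster code. Generators trade memory for per-item overhead, so timing should be measured separately.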
| Technique | When to Use | Impact |
|---|---|---|
| Indexing | Slow WHERE/JOIN queries | High |
| Query optimization | Complex queries | High |
| Connection pooling | Many short connections | Medium |
| Read replicas | Read-heavy workloads | High |
| Caching | Repeated queries | Very High |
| Denormalization | Complex JOINs | Medium |
-- Create index for frequently queried columns
CREATE INDEX idx_users_email ON users(email);
-- Composite index for multiple column queries
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Check if index is used
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';
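The before-and-after check can be scripted. The sketch below substitutes SQLite's `EXPLAIN QUERY PLAN` for PostgreSQL's `EXPLAIN ANALYZE` so that it runs without a server. The plan wording differs between engines and versions, so the exact strings here are an assumption about SQLite only.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # Last column holds the detail text


def demo():
    """Capture the plan for an email lookup before and after indexing."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    sql = "SELECT * FROM users WHERE email = ?"
    before = query_plan(conn, sql, ("test@example.com",))
    conn.execute("CREATE INDEX idx_users_email ON users(email)")
    after = query_plan(conn, sql, ("test@example.com",))
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before:", before)  # Expect a full table scan
    print("after: ", after)   # Expect a search using idx_users_email
```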
| Strategy | Use Case | Invalidation |
|---|---|---|
| Cache-aside | General purpose | Manual or TTL |
| Write-through | Strong consistency | On write |
| Write-behind | Write-heavy | Async batched |
| Read-through | Read-heavy | On miss |
Cache-aside pattern:
1. Check cache
2. If miss, query database
3. Store in cache
4. Return result
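The four steps above can be sketched as a small in-process cache with TTL invalidation. This is a minimal illustration, not a production cache: the `CacheAside` class, its `loader` callback, and the injectable `clock` are hypothetical names, and a real deployment would more likely put Redis or Memcached behind the same interface.

```python
import time


class CacheAside:
    """Cache-aside lookup with per-entry TTL (in-memory sketch)."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # Called on a miss, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock        # Injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                                    # 1. Cache hit
        value = self._loader(key)                              # 2. Miss: query source
        self._entries[key] = (self._clock() + self._ttl, value)  # 3. Store
        return value                                           # 4. Return result

    def invalidate(self, key):
        """Manual invalidation, e.g. after writing to the database."""
        self._entries.pop(key, None)


if __name__ == "__main__":
    cache = CacheAside(loader=lambda key: key.upper(), ttl_seconds=5)
    print(cache.get("user:1"), cache.get("user:1"))  # Second call is a hit
```

Calling `invalidate` right after a write keeps readers from seeing stale data for the rest of the TTL window.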
| Technique | When to Use |
|---|---|
| Object pooling | Frequent allocation of same type |
| Lazy loading | Large objects not always needed |
| Streaming | Processing large datasets |
| Weak references | Cache that can be evicted |
| Data structure choice | Right structure for access pattern |
| Metric | Target | What It Measures |
|---|---|---|
| LCP (Largest Contentful Paint) | < 2.5s | Load performance |
| INP (Interaction to Next Paint) | < 200ms | Interactivity |
| CLS (Cumulative Layout Shift) | < 0.1 | Visual stability |
Loading Performance:
- Code splitting (lazy load routes/components)
- Tree shaking (remove unused code)
- Minification (JS, CSS)
- Compression (gzip, brotli)
- Image optimization (WebP, srcset, lazy loading)
- CDN for static assets
Runtime Performance:
- Virtualized lists for large data
- Debounce/throttle event handlers
- Memoization of expensive computations
- Avoid layout thrashing (batch DOM reads/writes)
- Use CSS transforms for animations
- Web Workers for heavy computation
# Analyze bundle size
npx webpack-bundle-analyzer stats.json
npx source-map-explorer bundle.js
# Identify unused dependencies
npx depcheck
| Percentile | Target | User Experience |
|---|---|---|
| p50 | < 100ms | Fast |
| p95 | < 500ms | Acceptable |
| p99 | < 1s | Tolerable |
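Checking measured latencies against these targets takes only a few lines. The sketch below uses the nearest-rank percentile definition, which is one of several in common use, so monitoring tools may report slightly different values. The function names and the `TARGETS_MS` mapping are hypothetical.

```python
import math

# Targets from the table above, in milliseconds.
TARGETS_MS = {50: 100, 95: 500, 99: 1000}


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def check_targets(latencies_ms):
    """Return {percentile: (observed_ms, within_target)} for each target."""
    results = {}
    for p, limit in TARGETS_MS.items():
        observed = percentile(latencies_ms, p)
        results[p] = (observed, observed < limit)
    return results


if __name__ == "__main__":
    samples = [40, 55, 60, 80, 90, 120, 150, 300, 700, 1200]
    for p, (observed, ok) in check_targets(samples).items():
        print(f"p{p}: {observed}ms {'OK' if ok else 'over target'}")
```

Averages hide tail latency, which is why the targets are expressed as percentiles rather than means.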
| Technique | Benefit |
|---|---|
| Response compression | Reduce transfer size |
| Pagination | Limit response size |
| Field selection | Return only needed data |
| ETags/Caching headers | Reduce redundant requests |
| Connection keep-alive | Reduce handshake overhead |
| HTTP/2 | Multiplexing, header compression |
BAD (multiple requests):
GET /users/1
GET /users/2
GET /users/3
GOOD (batch):
POST /users/batch
{ "ids": [1, 2, 3] }
| Category | Metrics |
|---|---|
| Latency | p50, p95, p99 response times |
| Throughput | Requests per second |
| Errors | Error rate, error types |
| Saturation | CPU, memory, connections |
Critical (page immediately):
- Error rate > 5%
- p99 latency > 5s
- Service down
Warning (notify during business hours):
- Error rate > 1%
- p95 latency > 2s
- Resource utilization > 80%
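The thresholds above translate directly into a severity check. The function below is a sketch with hypothetical parameter names. Rates and utilization are fractions (0.05 means 5%), and latencies are in seconds.

```python
def classify(error_rate, p95_s, p99_s, utilization, service_up=True):
    """Map current metrics to 'critical', 'warning', or 'ok'."""
    # Critical: page immediately.
    if not service_up or error_rate > 0.05 or p99_s > 5:
        return "critical"
    # Warning: notify during working hours.
    if error_rate > 0.01 or p95_s > 2 or utilization > 0.80:
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(classify(error_rate=0.02, p95_s=0.4, p99_s=0.9, utilization=0.55))
```

Critical conditions are evaluated first so that a service breaching both levels is reported once, at the higher severity.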
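# Log slow operations
import functools
import logging
import time

def timed_operation(func):
    @functools.wraps(func)  # Preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # Monotonic clock, suited to measuring durations
        result = func(*args, **kwargs)
        duration = time.perf_counter() - start
        if duration > 1.0:  # Log if > 1 second
            logging.warning(f"{func.__name__} took {duration:.2f}s")
        return result
    return wrapper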
| Tool | Use Case |
|---|---|
| k6 | Modern, scriptable load testing |
| JMeter | Complex scenarios, GUI |
| Locust | Python-based, distributed |
| Artillery | YAML config, easy to start |
| wrk | Simple HTTP benchmarking |
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '1m', target: 50 }, // Ramp up
    { duration: '5m', target: 50 }, // Stay at 50 users
    { duration: '1m', target: 0 },  // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% under 500ms
    http_req_failed: ['rate<0.01'],   // Error rate < 1%
  },
};

export default function () {
  const res = http.get('https://api.example.com/users');
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
PROFILING FLOW:
Measure → Identify → Hypothesize → Fix → Measure → Repeat
COMMON BOTTLENECKS:
N+1 queries → Eager loading
Unbounded data → Pagination
Blocking I/O → Parallelization
Excessive allocation → Object pooling
DATABASE:
Index frequently queried columns
Use EXPLAIN ANALYZE
Add caching layer
CACHING:
Cache-aside for general use
TTL for time-based invalidation
Invalidate on write for consistency
TARGETS:
p50 < 100ms
p95 < 500ms
p99 < 1s
TOOLS:
CPU: pprof, py-spy
Memory: valgrind, memory_profiler
Load: k6, locust
DB: EXPLAIN, slow query log