Problem Statement
Your application is experiencing slow response times, high resource usage, or a poor user experience. You need to identify bottlenecks and implement optimizations systematically.
Performance Optimization Process
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Measure   │───▶│  Identify   │───▶│  Optimize   │───▶│   Verify    │
│  Baseline   │    │ Bottlenecks │    │   Target    │    │   Impact    │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
       ▲                                                        │
       └─────────────────────── Iterate ────────────────────────┘
1. Database Performance
Identifying Slow Queries
PostgreSQL
-- Enable slow query logging
ALTER SYSTEM SET log_min_duration_statement = '500'; -- Log queries > 500 ms
SELECT pg_reload_conf();
-- Find slow queries via pg_stat_statements (extension must be enabled;
-- column names below are for PostgreSQL 13+, older versions use total_time / mean_time)
SELECT
query,
calls,
total_exec_time / 1000 as total_seconds,
mean_exec_time / 1000 as mean_seconds,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
-- Find missing indexes
SELECT
schemaname,
relname,
seq_scan,
seq_tup_read,
idx_scan,
idx_tup_fetch,
n_tup_ins,
n_tup_upd,
n_tup_del
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;
MySQL
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL log_queries_not_using_indexes = 'ON';
-- Find slow queries using Performance Schema
SELECT
DIGEST_TEXT,
COUNT_STAR,
SUM_TIMER_WAIT/1000000000000 as total_seconds, -- TIMER_WAIT columns are in picoseconds
AVG_TIMER_WAIT/1000000000000 as avg_seconds,
SUM_ROWS_EXAMINED,
SUM_ROWS_SENT
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 20;
Query Optimization
Add Missing Indexes
-- Analyze query execution plan
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 123
AND status = 'pending'
AND created_at > '2024-01-01';
-- Create composite index
CREATE INDEX CONCURRENTLY idx_orders_user_status_date
ON orders(user_id, status, created_at);
-- Partial index for common queries
CREATE INDEX CONCURRENTLY idx_orders_pending
ON orders(user_id, created_at)
WHERE status = 'pending';
Optimize JOINs
-- Bad: Joining large tables without proper indexes
SELECT u.*, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id;
-- Better: Use a subquery with an index
SELECT u.*,
(SELECT COUNT(*) FROM orders WHERE user_id = u.id) as order_count
FROM users u;
-- Or use a materialized view for complex aggregations
CREATE MATERIALIZED VIEW user_stats AS
SELECT
user_id,
COUNT(*) as order_count,
SUM(total) as total_spent
FROM orders
GROUP BY user_id;
-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;
Connection Pooling
PgBouncer for PostgreSQL
# pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=myapp
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
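The core idea PgBouncer implements can be sketched in a few lines: a fixed set of connections shared by many clients, with waiters queued when the pool is exhausted. This is a minimal illustration, not a production pool (real pools add health checks, acquire timeouts, and min/reserve sizes like the settings above); `createConn` is a hypothetical factory.

```javascript
// Minimal connection-pool sketch: at most `maxSize` live connections,
// excess callers wait until one is released.
class Pool {
  constructor(createConn, maxSize) {
    this.createConn = createConn; // hypothetical connection factory
    this.maxSize = maxSize;
    this.idle = [];
    this.total = 0;
    this.waiters = [];
  }

  async acquire() {
    if (this.idle.length > 0) return this.idle.pop();
    if (this.total < this.maxSize) {
      this.total++;
      return this.createConn();
    }
    // Pool exhausted: queue until another client releases.
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  release(conn) {
    const waiter = this.waiters.shift();
    if (waiter) waiter(conn); // hand the connection directly to a queued client
    else this.idle.push(conn);
  }
}
```

This is why `default_pool_size = 20` can serve `max_client_conn = 1000`: clients share connections and only wait when all are busy.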
2. Caching Strategies
Multi-Layer Caching
┌─────────────────────────────────────────────────────────────────┐
│ Browser Cache │
│ Static assets, API responses with Cache-Control │
└─────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────┐
│ CDN Cache │
│ Static files, edge caching for API │
└─────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────┐
│ Application Cache │
│ Redis, Memcached, In-memory │
└─────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────┐
│ Database Cache │
│ Query cache, buffer pool │
└─────────────────────────────────────────────────────────────────┘
Redis Caching Implementation
// Laravel caching example
class ProductService
{
public function getProduct($id)
{
return Cache::tags(['products'])->remember(
"product:{$id}",
now()->addMinutes(60),
fn() => Product::with('category', 'images')->findOrFail($id)
);
}
public function getPopularProducts()
{
return Cache::remember(
'products:popular',
now()->addMinutes(15),
fn() => Product::popular()->limit(20)->get()
);
}
public function updateProduct($id, $data)
{
$product = Product::findOrFail($id);
$product->update($data);
// Invalidate related caches
Cache::tags(['products'])->forget("product:{$id}");
Cache::forget('products:popular');
return $product;
}
}
// Node.js caching example
const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL);
class CacheService {
async get(key) {
const data = await redis.get(key);
return data ? JSON.parse(data) : null;
}
async set(key, value, ttl = 3600) {
await redis.setex(key, ttl, JSON.stringify(value));
}
async getOrSet(key, fetchFn, ttl = 3600) {
let data = await this.get(key);
if (data) return data;
data = await fetchFn();
await this.set(key, data, ttl);
return data;
}
async invalidatePattern(pattern) {
// NOTE: KEYS is O(N) and blocks Redis; prefer SCAN for large keyspaces in production
const keys = await redis.keys(pattern);
if (keys.length > 0) {
await redis.del(...keys);
}
}
}
// Usage
const product = await cache.getOrSet(
`product:${id}`,
() => db.products.findUnique({ where: { id } }),
3600
);
Cache-Aside Pattern
async function getUser(userId) {
const cacheKey = `user:${userId}`;
// Try cache first
let user = await cache.get(cacheKey);
if (!user) {
// Cache miss: fetch from database
user = await db.users.findUnique({ where: { id: userId } });
if (user) {
// Store in cache
await cache.set(cacheKey, user, 3600);
}
}
return user;
}
// Cache invalidation on update
async function updateUser(userId, data) {
const user = await db.users.update({
where: { id: userId },
data,
});
// Invalidate cache
await cache.delete(`user:${userId}`);
return user;
}
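One gap in the plain cache-aside flow above: under load, many requests can miss the same key at once and all hit the database (a "thundering herd"). A common guard is to deduplicate in-flight fetches. Below is a hedged sketch using an in-memory Map in place of Redis; names are illustrative.

```javascript
// Cache-aside with in-flight deduplication: concurrent misses for the same
// key share a single fetch instead of each querying the database.
class DedupCache {
  constructor() {
    this.store = new Map();    // key -> cached value
    this.inFlight = new Map(); // key -> pending fetch Promise
  }

  async getOrSet(key, fetchFn) {
    if (this.store.has(key)) return this.store.get(key);
    if (this.inFlight.has(key)) return this.inFlight.get(key); // join the fetch already in progress
    const promise = fetchFn()
      .then((value) => {
        this.store.set(key, value);
        return value;
      })
      .finally(() => this.inFlight.delete(key)); // clean up on success or failure
    this.inFlight.set(key, promise);
    return promise;
  }
}
```

With Redis as the backing store, the same effect is usually achieved per-process (as here) or with a short-lived lock key.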
3. Frontend Performance
Bundle Optimization
// vite.config.js
export default defineConfig({
build: {
rollupOptions: {
output: {
manualChunks: {
// Vendor chunk for rarely changing dependencies
vendor: ['react', 'react-dom', 'react-router-dom'],
// UI library chunk
ui: ['@radix-ui/react-dialog', '@radix-ui/react-dropdown-menu'],
},
},
},
// Enable minification and tree-shaking
minify: 'terser',
terserOptions: {
compress: {
drop_console: true,
drop_debugger: true,
},
},
},
// Pre-bundle dependencies
optimizeDeps: {
include: ['react', 'react-dom'],
},
});
Code Splitting
// React lazy loading
import { lazy, Suspense } from 'react';
const Dashboard = lazy(() => import('./pages/Dashboard'));
const Settings = lazy(() => import('./pages/Settings'));
const Reports = lazy(() => import('./pages/Reports'));
function App() {
return (
<Suspense fallback={<LoadingSpinner />}>
<Routes>
<Route path="/dashboard" element={<Dashboard />} />
<Route path="/settings" element={<Settings />} />
<Route path="/reports" element={<Reports />} />
</Routes>
</Suspense>
);
}
Image Optimization
// Next.js Image component
import Image from 'next/image';
export function ProductImage({ src, alt }) {
return (
<Image
src={src}
alt={alt}
width={400}
height={300}
loading="lazy"
placeholder="blur"
blurDataURL="/placeholder.png"
sizes="(max-width: 768px) 100vw, 50vw"
/>
);
}
// Manual optimization with srcset
<picture>
<source srcset="image.webp" type="image/webp" />
<source srcset="image.jpg" type="image/jpeg" />
<img
src="image.jpg"
alt="Product"
loading="lazy"
decoding="async"
srcset="image-400.jpg 400w, image-800.jpg 800w"
sizes="(max-width: 600px) 400px, 800px"
/>
</picture>
4. API Performance
Response Compression
// Express with compression
const compression = require('compression');
app.use(compression({
filter: (req, res) => {
if (req.headers['x-no-compression']) return false;
return compression.filter(req, res);
},
threshold: 1024, // Only compress responses > 1 KB
}));
HTTP/2 and Keep-Alive
# nginx.conf
http {
# Gzip compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_proxied any;
gzip_types text/plain text/css application/json application/javascript;
# Keep-alive settings
keepalive_timeout 65;
keepalive_requests 1000;
server {
# Enable HTTP/2 (listen belongs in a server block, not directly under http)
listen 443 ssl http2;
# Static file caching
location ~* \.(jpg|jpeg|png|gif|ico|css|js|woff2)$ {
expires 1y;
add_header Cache-Control "public, immutable";
}
}
}
Pagination & Field Selection
// GraphQL field selection
const userResolvers = {
Query: {
users: async (_, { first, after, fields }, { db }) => {
// Only select requested fields
const select = buildSelectFromFields(fields);
return db.users.findMany({
take: first,
skip: after ? 1 : 0,
cursor: after ? { id: after } : undefined,
select,
});
},
},
};
// REST API with sparse fieldsets
app.get('/api/users', async (req, res) => {
const fields = req.query.fields?.split(',') || ['id', 'name', 'email'];
const page = parseInt(req.query.page) || 1;
const limit = Math.min(parseInt(req.query.limit) || 20, 100);
const users = await db.users.findMany({
select: Object.fromEntries(fields.map(f => [f, true])),
skip: (page - 1) * limit,
take: limit,
});
res.json({ data: users, meta: { page, limit } });
});
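The query-parsing logic in the handler above is worth isolating into a pure, testable function, and it should also whitelist fields so clients cannot select sensitive columns. A sketch, with an illustrative `ALLOWED_FIELDS` list:

```javascript
// Parse pagination + sparse-fieldset query params into a Prisma-style
// { select, skip, take } object, clamping the limit and whitelisting fields.
const ALLOWED_FIELDS = ['id', 'name', 'email', 'createdAt']; // illustrative whitelist

function parseListQuery(query, { defaultLimit = 20, maxLimit = 100 } = {}) {
  const page = Math.max(parseInt(query.page, 10) || 1, 1);
  const limit = Math.min(parseInt(query.limit, 10) || defaultLimit, maxLimit);
  const requested = query.fields ? query.fields.split(',') : ['id', 'name', 'email'];
  // Whitelist to prevent selecting sensitive columns (e.g. password hashes).
  const fields = requested.filter((f) => ALLOWED_FIELDS.includes(f));
  return {
    select: Object.fromEntries(fields.map((f) => [f, true])),
    skip: (page - 1) * limit,
    take: limit,
  };
}
```

Clamping `limit` server-side matters: without it, a single `?limit=100000` request can dominate the database.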
5. Background Processing
Queue Heavy Operations
// Laravel job
class ProcessOrderJob implements ShouldQueue
{
use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;
public function __construct(public Order $order) {}
public function handle()
{
// Heavy processing
$this->generateInvoice();
$this->sendNotifications();
$this->updateInventory();
$this->syncExternalSystems();
}
public function failed(\Throwable $exception)
{
Log::error('Order processing failed', [
'order_id' => $this->order->id,
'error' => $exception->getMessage(),
]);
}
}
// Dispatch job
ProcessOrderJob::dispatch($order)->onQueue('orders');
Node.js with BullMQ
import { Queue, Worker } from 'bullmq';
const orderQueue = new Queue('orders', {
connection: { host: 'redis', port: 6379 },
});
// Add job
await orderQueue.add('process-order', {
orderId: order.id,
}, {
attempts: 3,
backoff: { type: 'exponential', delay: 1000 },
});
// Worker
const worker = new Worker('orders', async (job) => {
const { orderId } = job.data;
await generateInvoice(orderId);
await sendNotifications(orderId);
await updateInventory(orderId);
return { success: true };
}, {
connection: { host: 'redis', port: 6379 },
concurrency: 5,
});
worker.on('completed', (job) => {
console.log(`Job ${job.id} completed`);
});
worker.on('failed', (job, err) => {
console.error(`Job ${job.id} failed:`, err);
});
Performance Monitoring
Key Metrics to Track
// Custom metrics with prom-client
const histogram = new client.Histogram({
name: 'http_request_duration_seconds',
help: 'HTTP request duration',
labelNames: ['method', 'route', 'status'],
buckets: [0.01, 0.05, 0.1, 0.5, 1, 5],
});
const dbQueryDuration = new client.Histogram({
name: 'db_query_duration_seconds',
help: 'Database query duration',
labelNames: ['operation', 'table'],
buckets: [0.001, 0.01, 0.1, 0.5, 1],
});
const cacheHitRatio = new client.Gauge({
name: 'cache_hit_ratio',
help: 'Cache hit ratio',
});
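To make the bucket semantics concrete: Prometheus histograms are cumulative, so each observation increments every bucket whose upper bound is at or above the value, plus an implicit +Inf bucket. A tiny sketch of that behavior (not the prom-client API itself):

```javascript
// Minimal cumulative histogram in the Prometheus style: counts[i] is the
// number of observations <= bounds[i]; the last slot is the +Inf bucket.
class MiniHistogram {
  constructor(buckets) {
    this.bounds = [...buckets].sort((a, b) => a - b);
    this.counts = new Array(this.bounds.length + 1).fill(0); // last = +Inf
    this.sum = 0;
  }

  observe(value) {
    this.sum += value;
    for (let i = 0; i < this.bounds.length; i++) {
      if (value <= this.bounds[i]) this.counts[i]++;
    }
    this.counts[this.bounds.length]++; // +Inf counts everything
  }
}
```

This is why bucket boundaries must match your latency targets: a 0.2 s request is invisible between the 0.1 and 0.5 buckets above, so put a bucket at each SLO threshold you care about.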
The Senior Performance Mindset
The "First 60 Seconds" on a Burning Server
When you SSH into a server under load, run these commands in order:
# 1. Load averages (increasing or decreasing?)
uptime
# 2. Kernel errors - OOM kills, disk errors?
dmesg | tail
# 3. System-wide view of processes, memory, swap, CPU
vmstat 1
# 4. Which process is causing the load?
pidstat 1
# 5. Disk latency and saturation
iostat -xz 1
# 6. Memory usage and cache
free -m
# 7. Network throughput
sar -n DEV 1
Database Performance Forensics
When the pager goes off because "the database is slow," you need immediate visibility:
-- PostgreSQL: What's running RIGHT NOW?
SELECT
pid,
usename,
state,
age(clock_timestamp(), query_start) AS duration,
query
FROM pg_stat_activity
WHERE state <> 'idle'
AND query NOT LIKE '%pg_stat_activity%'
AND age(clock_timestamp(), query_start) > interval '5 seconds'
ORDER BY duration DESC;
Senior Tip: A session stuck in the idle in transaction state prevents VACUUM from reclaiming dead rows, leading to table bloat. Kill these sessions if they exceed a threshold.
Killing Zombie Queries
-- Soft kill (try this first - safely stops query)
SELECT pg_cancel_backend(<pid>);
-- Hard kill (closes socket - may cause 500s if app doesn't handle reconnection)
SELECT pg_terminate_backend(<pid>);
Linux File Descriptor Limits
"Too many open files" (EMFILE) is a classic production error:
# Check current limits
ulimit -Sn # Soft limit
ulimit -Hn # Hard limit
# Fix permanently in /etc/security/limits.conf:
* soft nofile 200000
* hard nofile 500000
Designing for Failure: Circuit Breakers
In distributed systems, failure is latent. The network will blink.
The Retry Storm Problem:
- Service A calls Service B. Service B is slow.
- Naive: Service A retries 3 times immediately.
- Result: Service B was struggling with 100 req/s. Now it has 300 req/s. It crashes harder.
Senior Pattern: Exponential Backoff + Jitter
- Wait 1s, then 2s, then 4s.
- Add random noise (jitter) so all clients don't retry simultaneously.
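The pattern above can be sketched as a "full jitter" delay function: the base delay doubles per attempt up to a cap, and the actual wait is drawn uniformly from [0, delay) so clients spread out. Parameter names are illustrative; the `rng` parameter exists only to make the function testable.

```javascript
// Exponential backoff with full jitter: attempt 0 -> up to 1s, attempt 1 ->
// up to 2s, attempt 2 -> up to 4s, ... capped at capMs.
function backoffDelay(attempt, { baseMs = 1000, capMs = 30000, rng = Math.random } = {}) {
  const exp = Math.min(capMs, baseMs * 2 ** attempt); // doubling, capped ceiling
  return rng() * exp; // full jitter: uniform in [0, exp)
}
```

Jitter is the important part: with plain exponential backoff, every client that failed at t=0 retries at the same instants, recreating the spike it was meant to avoid.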
Circuit Breaker State Machine:
1. Closed (Normal): Requests pass through.
2. Open (Broken): Error rate > 50%. Fail immediately; don't call downstream.
3. Half-Open (Testing): After a timeout, let 1 request through. Success = Close. Fail = Open again.
Libraries: Resilience4j (Java), Polly (.NET), opossum (Node.js).
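The three-state machine above fits in a small class. This is a minimal sketch with illustrative thresholds (a consecutive-failure count rather than an error rate, and an injectable clock for testing); in production, reach for one of the libraries listed rather than hand-rolling.

```javascript
// Minimal circuit breaker: closed -> open after N consecutive failures,
// open -> half-open after resetTimeoutMs, half-open -> closed on success.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetTimeoutMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now; // injectable clock for testing
    this.state = 'closed';
    this.failures = 0;
    this.openedAt = 0;
  }

  async call(fn) {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('circuit open'); // fail fast, don't hit the downstream
      }
      this.state = 'half-open'; // timeout elapsed: allow one probe request
    }
    try {
      const result = await fn();
      this.state = 'closed'; // success closes the circuit
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```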
Bulkheads: Isolate Failures
If "Image Processing" uses 100% of threads, "Login" shouldn't die.
Solution: Thread pools per dependency.
- Connection Pool A (Payment): 10 threads
- Connection Pool B (Images): 10 threads
- If Pool B fills up, Pool A is unaffected
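The bulkhead idea reduces to a counting semaphore per dependency. A sketch that sheds load when a pool is full (a queue with a bound would be the alternative); class and pool names are illustrative:

```javascript
// Per-dependency bulkhead: at most `limit` concurrent calls; excess callers
// are rejected immediately so one slow dependency can't consume everything.
class Bulkhead {
  constructor(limit) {
    this.limit = limit;
    this.active = 0;
  }

  async run(fn) {
    if (this.active >= this.limit) {
      throw new Error('bulkhead full'); // shed load instead of queuing forever
    }
    this.active++;
    try {
      return await fn();
    } finally {
      this.active--; // always free the slot, even on failure
    }
  }
}
```

Separate instances per dependency (`payments`, `images`, ...) give the isolation described above: saturating one pool leaves the others untouched.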
Performance Checklist
Database
- [ ] Slow query logging enabled
- [ ] Missing indexes identified and added
- [ ] Connection pooling configured (PgBouncer, etc.)
- [ ] Query results cached appropriately
- [ ] N+1 queries eliminated
- [ ] idle in transaction sessions monitored and killed
Application
- [ ] Response caching implemented
- [ ] Heavy operations moved to queues
- [ ] Memory leaks identified and fixed
- [ ] Async/await used properly
- [ ] Circuit breakers on external dependencies
- [ ] Exponential backoff with jitter on retries
Frontend
- [ ] Bundle size optimized
- [ ] Code splitting implemented
- [ ] Images optimized and lazy-loaded
- [ ] Critical CSS inlined
- [ ] Service worker caching
Infrastructure
- [ ] CDN configured for static assets
- [ ] HTTP/2 enabled
- [ ] Gzip/Brotli compression enabled
- [ ] Connection keep-alive configured
- [ ] File descriptor limits increased
- [ ] System resource limits tuned
Related Wiki Articles