SQL Performance¶
Status: 🟢 Active | Owner: Data Engineering
Query Analysis¶
Always EXPLAIN ANALYZE new queries before deploying to production:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.status, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'PENDING'
AND o.created_at >= NOW() - INTERVAL '7 days';
Look for:
- Sequential scans on large tables — usually means a missing index.
- Nested Loop joins with large row estimates — may benefit from a hash or merge join.
- High actual rows >> estimated rows — stale statistics; run
ANALYZE. - Bitmap heap scan with many pages — consider a partial index.
Indexing Guidelines¶
-- B-tree (default): equality and range queries
CREATE INDEX CONCURRENTLY idx_orders_customer_status
ON orders (customer_id, status)
WHERE status != 'COMPLETED'; -- partial index excludes majority of rows
-- For LIKE prefix searches
CREATE INDEX idx_products_name_text_pattern
ON products (name text_pattern_ops);
-- Covering index: avoids heap fetch if all needed columns are in the index
CREATE INDEX idx_orders_covering
ON orders (customer_id, created_at DESC)
INCLUDE (status, total);
Use CREATE INDEX CONCURRENTLY in production to avoid locking the table.
Common Performance Anti-Patterns¶
Functions on Indexed Columns¶
-- ❌ Cannot use index on created_at — wrapping in LOWER prevents index use
SELECT * FROM orders WHERE LOWER(customer_email) = '[email protected]';
-- ✅ Use a functional index or store the email normalised
CREATE INDEX idx_orders_email_lower ON orders (LOWER(customer_email));
SELECT *¶
-- ❌ Fetches unnecessary columns, prevents index-only scans
SELECT * FROM orders WHERE customer_id = $1;
-- ✅ Select only what you need
SELECT id, status, total FROM orders WHERE customer_id = $1;
OFFSET Pagination on Large Tables¶
-- ❌ Must scan and discard all preceding rows
SELECT * FROM orders ORDER BY created_at DESC OFFSET 10000 LIMIT 20;
-- ✅ Keyset (cursor) pagination
SELECT * FROM orders
WHERE created_at < :last_seen_cursor
ORDER BY created_at DESC
LIMIT 20;
Implicit Type Casting¶
-- ❌ customer_id is UUID — passing a varchar forces a cast and may skip the index
SELECT * FROM orders WHERE customer_id = '123e4567-e89b-...';
-- ✅ Use the correct parameter type in your application driver
-- (pgx, JDBC, psycopg will handle this correctly when you use typed parameters)
Connection Pool Settings¶
PostgreSQL has a process-per-connection model. Keep connection counts reasonable:
| Service Type | Max Pool Size |
|---|---|
| Low-traffic service | 5–10 |
| Mid-tier API service | 10–20 |
| High-throughput service | 20–30 |
Use PgBouncer in transaction pooling mode when many services connect to the same database.
References¶
Last reviewed: 2025-Q4 | Owner: Data Engineering