Skip to content

SQL Performance

Status: 🟢 Active  |  Owner: Data Engineering

Query Analysis

Always EXPLAIN ANALYZE new queries before deploying to production:

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.status, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'PENDING'
  AND o.created_at >= NOW() - INTERVAL '7 days';

Look for:

  • Sequential scans on large tables — usually means a missing index.
  • Nested Loop joins with large row estimates — may benefit from a hash or merge join.
  • High actual rows >> estimated rows — stale statistics; run ANALYZE.
  • Bitmap heap scan with many pages — consider a partial index.

Indexing Guidelines

-- B-tree (default): equality and range queries
CREATE INDEX CONCURRENTLY idx_orders_customer_status
    ON orders (customer_id, status)
    WHERE status != 'COMPLETED';  -- partial index excludes majority of rows

-- For LIKE prefix searches
CREATE INDEX idx_products_name_text_pattern
    ON products (name text_pattern_ops);

-- Covering index: avoids heap fetch if all needed columns are in the index
CREATE INDEX idx_orders_covering
    ON orders (customer_id, created_at DESC)
    INCLUDE (status, total);

Use CREATE INDEX CONCURRENTLY in production to avoid locking the table.

Common Performance Anti-Patterns

Functions on Indexed Columns

-- ❌ Cannot use index on created_at — wrapping in LOWER prevents index use
SELECT * FROM orders WHERE LOWER(customer_email) = '[email protected]';

-- ✅ Use a functional index or store the email normalised
CREATE INDEX idx_orders_email_lower ON orders (LOWER(customer_email));

SELECT *

-- ❌ Fetches unnecessary columns, prevents index-only scans
SELECT * FROM orders WHERE customer_id = $1;

-- ✅ Select only what you need
SELECT id, status, total FROM orders WHERE customer_id = $1;

OFFSET Pagination on Large Tables

-- ❌ Must scan and discard all preceding rows
SELECT * FROM orders ORDER BY created_at DESC OFFSET 10000 LIMIT 20;

-- ✅ Keyset (cursor) pagination
SELECT * FROM orders
WHERE created_at < :last_seen_cursor
ORDER BY created_at DESC
LIMIT 20;

Implicit Type Casting

-- ❌ customer_id is UUID — passing a varchar forces a cast and may skip the index
SELECT * FROM orders WHERE customer_id = '123e4567-e89b-...';

-- ✅ Use the correct parameter type in your application driver
-- (pgx, JDBC, psycopg will handle this correctly when you use typed parameters)

Connection Pool Settings

PostgreSQL has a process-per-connection model. Keep connection counts reasonable:

Service Type Max Pool Size
Low-traffic service 5–10
Mid-tier API service 10–20
High-throughput service 20–30

Use PgBouncer in transaction pooling mode when many services connect to the same database.

References


Last reviewed: 2025-Q4  |  Owner: Data Engineering