
PostgreSQL Slow Query Diagnostic Flowchart

We ran EXPLAIN ANALYZE on the missing artwork. The bottleneck, it turns out, was the artist.

The Butler of Gold Lapel · March 26, 2026 · 22 min read

Before You Guess, Measure

I have seen more time lost to premature tuning than to the slow queries themselves. Adjusting work_mem, adding speculative indexes, increasing shared_buffers — all without first reading the query plan. This rarely helps. It occasionally makes things worse.

This article is a diagnostic decision tree. Follow the steps in order, and stop when you find the cause. Each step either resolves the issue or directs you to the next one.

How to use this guide: Start at Step 1. Run the diagnostic SQL. Check the result against the criteria listed. If the step identifies the problem, apply the fix and verify. If not, proceed to the next step. The answer is always in the plan — one simply has to read it.

For a broader treatment of why queries slow down in the first place, you may wish to visit Why Is My PostgreSQL Query Slow?. This guide, by contrast, is the runbook you follow while the problem is actively happening.

Step 1 — Run EXPLAIN (ANALYZE, BUFFERS)

Every slow query investigation begins with the query plan. No exceptions. I cannot stress this enough — the plan is not a suggestion, it is the starting point.

The starting point
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ... your slow query here ...;

A word of caution: ANALYZE actually executes the query. On a write query (UPDATE, DELETE), wrap it in a transaction and roll back:

Safe EXPLAIN for write queries
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
UPDATE orders SET status = 'shipped' WHERE id = 42;
ROLLBACK;

What to Look at First

Two numbers in the plan output deserve your immediate attention:

  1. Actual time vs. estimated cost. The actual time=X..Y values show real execution time in milliseconds. The cost=X..Y values are the planner's abstract estimates. When these diverge significantly, the planner is working with incomplete information.
  2. Actual rows vs. estimated rows. Each plan node shows rows=N (estimated) and actual ... rows=M. A mismatch of 10x or more is a reliable sign of stale statistics or a planning limitation.

Three Red Flags in Any Plan

Red flag 1 — Sequential scan on a large table with low selectivity. The plan reads every row in the table and returns only a handful:

Sequential scan red flag
Seq Scan on orders  (cost=0.00..25432.00 rows=15 width=120)
                    (actual time=312.44..4891.22 rows=12 loops=1)
  Filter: (customer_id = 9847)
  Rows Removed by Filter: 1248993
  Buffers: shared hit=8432 read=14200

This scan read 1.2 million rows to return 12. That is an extraordinary amount of effort for a modest result. Go to Step 2.

Red flag 2 — Nested loop with high loop count. The inner side of the loop executes thousands of times:

Nested loop red flag
Nested Loop  (cost=0.43..48291.50 rows=5000 width=200)
             (actual time=0.05..8842.31 rows=5000 loops=1)
  ->  Seq Scan on customers  (actual time=0.02..12.10 rows=5000 loops=1)
  ->  Index Scan on orders   (actual time=1.74..1.76 rows=1 loops=5000)
        Index Cond: (customer_id = customers.id)

The inner index scan runs 5,000 times. Multiply actual time by loops for the true cost: 1.76 ms × 5,000 = 8,800 ms. The per-loop time looks perfectly reasonable — it is the repetition that creates the problem.

Red flag 3 — Sort or hash spilling to disk. Look for Sort Method: external merge Disk or Batches: N in hash joins:

Disk sort red flag
Sort  (cost=89201.44..89451.44 rows=100000 width=64)
      (actual time=4521.22..5102.88 rows=100000 loops=1)
  Sort Key: created_at
  Sort Method: external merge  Disk: 15224kB

This sort exceeded available memory and resorted to disk. The word "external" in that output is never a welcome sight. Go to Step 6 to check work_mem.

Decision Points

  • The plan shows a sequential scan on a large table → go to Step 2
  • The plan shows bad row estimates (10x+ mismatch) → go to Step 3
  • The plan looks reasonable but the query is still slow → go to Step 4

Reading the Plan — The 60-Second Version

For a thorough treatment, see How to Read EXPLAIN ANALYZE Output. The minimum you need to follow this diagnostic:

Start from the innermost node. The most indented node executes first. Read inside out.

actual time=X..Y — Y is the cumulative time for that node, per loop. X is the time to the first row.

rows=N (estimated) vs. rows=M (actual) — A mismatch greater than 10x means the planner built its strategy on a faulty assumption.

Buffers: shared hit=X read=Y. Here hit is pages served from memory and read is pages fetched from the OS. A high read count suggests the working set does not fit in memory.

loops=N — The node executed N times. All time and row values are per-loop. Multiply actual time by loops for the true cost.

Step 2 — Is the Right Index Present?

If the query plan reveals a sequential scan on a large table, the first question to ask is whether an appropriate index exists.

Check Existing Indexes

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'orders'
ORDER BY indexname;

Common Index Mismatches

These are the situations I encounter most frequently. The index exists, the query looks correct, and yet the two do not connect.

Function on the indexed column. A B-tree index on email will not be used for WHERE LOWER(email) = 'user@example.com'. Fix: create an expression index on LOWER(email).

Type mismatch. Comparing a text column to an integer value forces an implicit cast that prevents index use.
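A quick sketch of the pattern (order_ref is a hypothetical text column holding numeric references; the column names are illustrative, not from a schema defined in this article):

Type mismatch example
-- Slow: the cast sits on the column, so the B-tree index on order_ref is unusable
SELECT * FROM orders WHERE order_ref::integer = 12345;

-- Fast: keep the column bare and cast the parameter instead
SELECT * FROM orders WHERE order_ref = 12345::text;

The rule of thumb: any function or cast applied to the column side of a comparison defeats a plain index on that column.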

OR conditions across columns. WHERE customer_id = 42 OR status = 'pending' cannot use a single B-tree index efficiently. Fix: rewrite as a UNION ALL:

UNION ALL for OR conditions
SELECT * FROM orders WHERE customer_id = 42
UNION ALL
SELECT * FROM orders WHERE status = 'pending' AND customer_id != 42;

Leading wildcard in LIKE. WHERE name LIKE '%smith' defeats a B-tree index. Fix: use a pg_trgm GIN index for substring matching:

Trigram index for LIKE patterns
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_name_trgm ON customers USING GIN (name gin_trgm_ops);

After Creating the Index

Re-run EXPLAIN (ANALYZE, BUFFERS) and confirm the plan changed from Seq Scan to Index Scan or Index Only Scan. Trust, but verify.

If the index exists and PostgreSQL still chooses a sequential scan, the problem lies elsewhere. Go to Step 3.

Partial and Expression Indexes

Partial indexes include only a subset of rows — smaller, faster to scan, and less costly to maintain:

Partial index
CREATE INDEX idx_active_orders ON orders (customer_id)
WHERE status = 'active';

Expression indexes match functions or casts in the WHERE clause:

Expression index
CREATE INDEX idx_lower_email ON users (LOWER(email));

Covering indexes include additional columns so that index-only scans can satisfy the query without visiting the table at all:

Covering index
CREATE INDEX idx_orders_covering ON orders (customer_id)
INCLUDE (total, created_at);

Step 3 — Are Table Statistics Current?

PostgreSQL's query planner relies on table statistics to estimate how many rows each operation will produce. When they are wrong, the planner builds a strategy for a table that no longer resembles reality.

Check When Statistics Were Last Updated

SELECT
  relname,
  last_analyze,
  last_autoanalyze,
  n_live_tup,
  n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';

If last_analyze and last_autoanalyze are both NULL, the table has never been analyzed — the planner has been working with no reliable information at all.

Check Row Estimate Accuracy

Compare estimated vs actual rows
-- What the planner thinks
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'orders';

-- What the table actually contains
SELECT count(*) AS actual_rows FROM orders;

If these differ by more than 10x, the statistics are stale.

Fix: Run ANALYZE

ANALYZE orders;

After running ANALYZE, re-run your EXPLAIN (ANALYZE, BUFFERS) query. If the row estimates improved and the plan changed, you have found your culprit — and the remedy was a single command.

When Autoanalyze Falls Behind

After bulk loads, large deletes, or COPY operations, the autovacuum daemon may not have triggered autoanalyze yet. You can tune this per-table:

Per-table autoanalyze tuning
ALTER TABLE orders SET (
  autovacuum_analyze_threshold = 1000,
  autovacuum_analyze_scale_factor = 0.02
);

Extended Statistics for Correlated Columns

If statistics are fresh but row estimates remain stubbornly wrong, the cause may be column correlation:

Extended statistics for correlated columns
CREATE STATISTICS customers_stats (dependencies)
ON city, zip_code FROM customers;

ANALYZE customers;

If statistics are fresh and estimates look reasonable but the query persists in its slowness, go to Step 4.

Step 4 — Is a Lock Blocking the Query?

A query with an excellent plan can still appear slow if it is waiting for a lock held by another session. The query itself is not the problem — it is simply queued behind someone who has not yet finished their business.

Diagnostic: Find Waiting Queries

Find lock-waiting queries
SELECT
  pid,
  usename,
  state,
  wait_event_type,
  wait_event,
  query_start,
  LEFT(query, 80) AS query_snippet
FROM pg_stat_activity
WHERE state = 'active'
  AND wait_event_type = 'Lock'
ORDER BY query_start;

Find What Is Blocking What

Identify blocker and blocked
SELECT
  blocked.pid AS blocked_pid,
  blocked.query AS blocked_query,
  blocker.pid AS blocker_pid,
  blocker.query AS blocker_query,
  blocker.state AS blocker_state
FROM pg_stat_activity AS blocked
JOIN LATERAL (
  SELECT unnest(pg_blocking_pids(blocked.pid)) AS pid
) AS blocking_pids ON true
JOIN pg_stat_activity AS blocker ON blocker.pid = blocking_pids.pid
WHERE blocked.wait_event_type = 'Lock';

Common Lock Scenarios

Long-running transaction holding row locks. An UPDATE or DELETE inside an uncommitted transaction blocks other sessions attempting to modify the same rows. The blocker's state will show idle in transaction.

ALTER TABLE waiting for AccessExclusiveLock. Schema changes require an exclusive lock on the table, which blocks all other queries — including SELECT.

Autovacuum blocked by long transactions. Autovacuum cannot clean dead tuples when a long-running transaction holds a snapshot that sees those tuples as potentially still visible.
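For the first scenario, a session listing filtered to idle in transaction usually identifies the offender directly (state values are as reported by pg_stat_activity):

Find idle-in-transaction sessions
SELECT
  pid,
  usename,
  now() - xact_start AS xact_age,
  LEFT(query, 80) AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;

The query column shows the last statement the session ran before going idle, which is often enough to trace it back to the application code that forgot to commit.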

Fix: Terminate the Blocker

Cancel or terminate the blocker
-- Try cancelling the query first (graceful)
SELECT pg_cancel_backend(12345);

-- If that doesn't work, terminate the connection
SELECT pg_terminate_backend(12345);

Prevention: Timeout Settings

Timeout guardrails
-- Cancel any statement that runs longer than 30 seconds
SET statement_timeout = '30s';

-- Abort if a lock cannot be acquired within 5 seconds
SET lock_timeout = '5s';

-- Kill sessions that sit idle in a transaction for more than 5 minutes
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
SELECT pg_reload_conf();

For a more thorough treatment of lock types and contention patterns, see PostgreSQL Lock Contention.

If no locks are present, the delay lies elsewhere. Go to Step 5.

Step 5 — Is Table Bloat the Problem?

Table bloat is the accumulation of dead tuples — rows that have been updated or deleted but not yet cleaned up by vacuum. They occupy space between live rows, forcing sequential scans to read more pages than necessary.

Diagnostic: Check Dead Tuple Count

Quick bloat check
SELECT
  relname,
  n_live_tup,
  n_dead_tup,
  ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
  last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'orders';

If dead_pct is above 20%, bloat is very likely contributing to the slowness you are investigating.

Detailed Diagnostic with pgstattuple

Precise bloat measurement
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT
  dead_tuple_count,
  dead_tuple_percent,
  free_space,
  free_percent
FROM pgstattuple('orders');

This scans the entire table, so run it during low-traffic periods.

Fix: Remove Bloat

VACUUM FULL rewrites the table without dead tuples, reclaiming disk space. It requires an AccessExclusiveLock, which means the table is completely unavailable for the duration:

VACUUM FULL orders;

pg_repack accomplishes the same result online, without holding a long exclusive lock.
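If pg_repack is available, a typical invocation looks like this (mydb and orders are placeholders; the extension must be created in the target database before the command-line tool will run):

Online repack with pg_repack
-- Install the extension once per database
CREATE EXTENSION pg_repack;

-- Then, from the shell:
-- $ pg_repack --dbname=mydb --table=orders

The table remains readable and writable during the repack; only brief locks are taken at the start and end.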

Index Bloat

Indexes accumulate their own form of bloat over time:

Check index sizes
SELECT
  indexrelname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY pg_relation_size(indexrelid) DESC;

If an index is significantly larger than expected, rebuild it without downtime:

Online index rebuild
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;

For a more thorough treatment, see Table Bloat and Autovacuum Tuning.

If bloat is not the issue, we continue. Go to Step 6.

Step 6 — Is It a Configuration Problem?

If you have arrived here — the plan is reasonable, the right index exists, statistics are fresh, no locks are present, and bloat is under control — the problem may reside in PostgreSQL's configuration. The defaults are conservative by design.

shared_buffers

SHOW shared_buffers;

PostgreSQL's dedicated buffer cache. A common starting point is 25% of total system RAM. If it is set too low, look for high Buffers: ... read= values relative to shared hit= in EXPLAIN output.
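To raise it (the 4GB value is illustrative, roughly 25% of a 16 GB server), note that unlike most settings in this step, shared_buffers only takes effect after a full server restart, not a reload:

Raise shared_buffers (restart required)
ALTER SYSTEM SET shared_buffers = '4GB';
-- pg_reload_conf() is NOT enough for this parameter; restart the server, e.g.:
-- $ pg_ctl restart -D /path/to/data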

work_mem

SHOW work_mem;

The default is 4 MB. If EXPLAIN shows Sort Method: external merge Disk, the sort exceeded work_mem. Test a higher value:

Test work_mem per-session
-- Test per-session first
SET work_mem = '64MB';
-- Then re-run the query with EXPLAIN (ANALYZE, BUFFERS)

Do not set work_mem globally to very high values. Each connection can use multiple work_mem allocations simultaneously: one per sort or hash node in an executing plan, so the worst case is roughly connections × nodes × work_mem.
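Once you have found a value that keeps the sort in memory, you can scope it to a single transaction with SET LOCAL, so the rest of the workload keeps the conservative default:

Scope work_mem to one transaction
BEGIN;
SET LOCAL work_mem = '64MB';
SELECT ... your memory-hungry query here ...;
COMMIT;
-- work_mem reverts to the session default automatically at COMMIT or ROLLBACK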

effective_cache_size

A hint to the planner about how much memory is available for caching. Set to approximately 75% of total system RAM. If set too low, the planner assumes most data will require disk access and biases toward sequential scans.
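Unlike shared_buffers, this setting allocates no actual memory; it is purely a planner hint, and it takes effect on reload (the 12GB value assumes roughly 75% of a 16 GB server):

Set effective_cache_size
ALTER SYSTEM SET effective_cache_size = '12GB';
SELECT pg_reload_conf();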

random_page_cost

The default is 4.0 — a value calibrated for spinning hard drives. On SSDs, set this to 1.1–1.5:

Fix random_page_cost for SSDs
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();

A high random_page_cost on SSD storage is one of the most common configuration-related causes of unnecessary sequential scans.

effective_io_concurrency

The default is 1. For SSDs:

ALTER SYSTEM SET effective_io_concurrency = 200;
SELECT pg_reload_conf();

If configuration is already well-tuned, there is one possibility remaining. Go to Step 7.

Step 7 — Is It the Query Itself?

If you have arrived at this step — indexes present, statistics fresh, no locks, bloat controlled, configuration tuned — then the problem may be the query itself. It is doing more work than necessary. No amount of indexing or configuration tuning will resolve a query that is fundamentally asking for too much.

Patterns That Indicate a Query-Level Problem

Selecting more columns than needed. SELECT * on a table with 50 columns when the application uses 3.

Missing LIMIT on exploratory queries. Queries that return unbounded result sets when the application displays only the first 20 rows.

Correlated subqueries that should be JOINs:

Correlated subquery vs JOIN
-- Slow: correlated subquery (runs for each customer)
SELECT c.name,
  (SELECT SUM(total) FROM orders o WHERE o.customer_id = c.id) AS lifetime_total
FROM customers c;

-- Faster: JOIN with aggregation
SELECT c.name, COALESCE(SUM(o.total), 0) AS lifetime_total
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

DISTINCT hiding a join problem. SELECT DISTINCT on a large result set is often a signal that the query has an unintended many-to-many join producing duplicate rows.
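When the join exists only to filter rows, an EXISTS subquery removes both the duplicates and the DISTINCT. A sketch, reusing the customers/orders schema from the earlier examples:

EXISTS instead of DISTINCT
-- Produces one row per matching order, then deduplicates after the fact
SELECT DISTINCT c.id, c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.status = 'pending';

-- Filters without producing duplicates in the first place
SELECT c.id, c.name
FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_id = c.id
    AND o.status = 'pending'
);

The EXISTS form also lets the planner stop probing orders as soon as it finds the first match for each customer.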

Rewrite Strategies

Lateral joins for correlated access:

Lateral join for top-N
SELECT c.name, latest.total, latest.created_at
FROM customers c
CROSS JOIN LATERAL (
  SELECT o.total, o.created_at
  FROM orders o
  WHERE o.customer_id = c.id
  ORDER BY o.created_at DESC
  LIMIT 3
) AS latest;

Materialized views for repeated expensive aggregations:

Materialized view
CREATE MATERIALIZED VIEW monthly_revenue AS
SELECT
  DATE_TRUNC('month', created_at) AS month,
  SUM(total) AS revenue,
  COUNT(*) AS order_count
FROM orders
GROUP BY DATE_TRUNC('month', created_at);

-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue;

When the answer is "this query needs a different approach" rather than "this query needs tuning," no amount of diagnostic work will resolve it. The query must be redesigned.

The Diagnostic Cheat Sheet

A summary of the decision tree above, designed to be bookmarked and consulted during an incident.

Step | Diagnostic SQL | What to Look For | Fix | If Not Resolved
-----|----------------|------------------|-----|----------------
1 | EXPLAIN (ANALYZE, BUFFERS) | Seq scan, bad row estimates, disk sorts | Identifies the category of problem | Go to Step 2, 3, or 4
2 | SELECT * FROM pg_indexes WHERE tablename = 't' | Missing or mismatched index | CREATE INDEX | Go to Step 3
3 | SELECT last_analyze FROM pg_stat_user_tables | Stale or missing statistics | ANALYZE tablename; | Go to Step 4
4 | SELECT * FROM pg_stat_activity WHERE wait_event_type = 'Lock' | Lock waits, blocked queries | pg_cancel_backend() + timeouts | Go to Step 5
5 | SELECT n_dead_tup FROM pg_stat_user_tables | Dead tuple ratio > 20% | VACUUM FULL or pg_repack | Go to Step 6
6 | SHOW work_mem; SHOW random_page_cost; | Disk sorts, high random_page_cost | Adjust configuration | Go to Step 7
7 | Read the query | SELECT *, missing LIMIT, correlated subqueries | Rewrite the query | (end of flowchart)
