EXPLAIN ANALYZE
If I may be direct: guessing why a query is slow is beneath you. This tool shows you exactly what happened.
I have never once resolved a performance issue by theorizing about it. EXPLAIN ANALYZE executes a SQL statement and returns the query plan that PostgreSQL used, annotated with actual execution times, actual row counts, and the number of loop iterations at each step. Where plain EXPLAIN shows you what the planner predicted would happen, EXPLAIN ANALYZE shows you what actually happened. The gap between the two is where slow queries hide.
What EXPLAIN ANALYZE does
EXPLAIN ANALYZE does two things: it runs the query, and it instruments the execution. Every node in the query plan is wrapped with timing logic that records when it started, how many rows it produced, and how many times it was invoked. After execution completes, PostgreSQL formats the plan as a tree, with each node showing both the planner's estimates and the actual numbers observed during execution.
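That per-node instrumentation is not free: on some systems the clock calls themselves add measurable overhead. When you only need row counts, the TIMING option (a real EXPLAIN option, shown here on the document's earlier example query) skips the per-node clock reads:

```sql
-- Per-node timing has overhead on some platforms; TIMING off keeps
-- actual row counts and loop counts but skips the clock calls
EXPLAIN (ANALYZE, TIMING off)
SELECT * FROM orders WHERE customer_id = 42;
```

Total Execution Time is still reported; only the per-node actual time figures are omitted.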
This is not a separate profiling tool. It is built into PostgreSQL and works on any query — SELECTs, INSERTs, UPDATEs, DELETEs, CTEs, subqueries, functions. The syntax is straightforward:
-- EXPLAIN shows the planner's estimated plan (does NOT run the query)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Output:
-- Index Scan using orders_customer_id_idx on orders (cost=0.43..8.45 rows=1 width=64)
-- Index Cond: (customer_id = 42)
Plain EXPLAIN shows the estimated plan without running the query. Add ANALYZE and it runs for real:
-- EXPLAIN ANALYZE actually runs the query and shows real numbers
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- Output:
-- Index Scan using orders_customer_id_idx on orders
-- (cost=0.43..8.45 rows=1 width=64)
-- (actual time=0.023..0.025 rows=1 loops=1)
-- Index Cond: (customer_id = 42)
-- Planning Time: 0.092 ms
-- Execution Time: 0.048 ms
The critical addition is the (actual time=... rows=... loops=...) line. That is the evidence. Everything else is the planner's best guess — and the planner, while competent, is not above being wrong.
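One caveat worth a moment of your attention: because ANALYZE really executes the statement, a data-modifying query really modifies data. The standard precaution is to wrap it in a transaction you roll back (table name borrowed from the examples below):

```sql
-- EXPLAIN ANALYZE executes the statement for real, so a DELETE
-- really deletes; roll back the transaction to discard the changes
BEGIN;
EXPLAIN ANALYZE DELETE FROM events WHERE status = 'pending';
ROLLBACK;
```

You get the full instrumented plan, and the data remains untouched.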
Why it matters
The PostgreSQL query planner is good, but it is not omniscient. It makes decisions — which index to use, which join strategy to pick, what order to process tables in — based on table statistics that may be stale, correlation assumptions that may be wrong, and cost estimates that may not reflect your hardware. When those decisions go wrong, a query that should take 5 milliseconds takes 5 seconds.
Without EXPLAIN ANALYZE, you know the query is slow. With it, you know where it is slow and why. Specifically:
- Row estimate mismatches reveal stale statistics or correlated columns the planner does not account for
- Sequential scans on large tables suggest missing indexes
- Nested loop joins with high loop counts show where the planner underestimated the outer side's row count
- Sort and hash operations spilling to disk indicate insufficient work_mem
- High buffer read counts point to data that does not fit in the cache
Every performance investigation starts the same way: run EXPLAIN ANALYZE, read the plan, find the node where estimates diverge from reality. One does not diagnose by intuition when the evidence is a single command away.
Practical examples
Adding BUFFERS for I/O visibility
The BUFFERS option adds block-level I/O information to each node. This tells you how much data came from PostgreSQL's buffer cache ("shared hit") versus disk or the OS page cache ("shared read"). Nodes with high read counts are where the I/O bottleneck lives.
-- BUFFERS shows how much data came from cache vs disk
EXPLAIN (ANALYZE, BUFFERS) SELECT *
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '7 days';
-- Output:
-- Nested Loop (cost=0.85..1234.56 rows=150 width=96)
-- (actual time=0.045..3.812 rows=147 loops=1)
-- Buffers: shared hit=892 read=34
-- -> Index Scan using orders_created_at_idx on orders o
-- (cost=0.43..456.78 rows=150 width=64)
-- (actual time=0.023..1.456 rows=147 loops=1)
-- Index Cond: (created_at > (now() - '7 days'::interval))
-- Buffers: shared hit=450 read=12
-- -> Index Scan using customers_pkey on customers c
-- (cost=0.42..5.18 rows=1 width=32)
-- (actual time=0.002..0.003 rows=1 loops=147)
-- Index Cond: (id = o.customer_id)
-- Buffers: shared hit=442 read=22
-- Planning Time: 0.215 ms
-- Execution Time: 3.971 ms
In this plan, 892 blocks came from cache and only 34 were read from disk. A healthy ratio — the household is running efficiently. If the reads were dominant, you would know the working set exceeds shared_buffers, or the query is scanning more data than it needs to.
Catching a row estimation mismatch
This is the single most common cause of slow queries: the planner expects a small number of rows and chooses a plan that is efficient only at that small volume (a nested loop join, for instance, or skipping an index it would otherwise use) but catastrophic at the actual one.
-- Row estimation mismatch: planner expected 1 row, got 12,847
EXPLAIN ANALYZE SELECT * FROM events WHERE status = 'pending';
-- Output:
-- Seq Scan on events (cost=0.00..25432.00 rows=1 width=128)
-- (actual time=0.012..187.345 rows=12847 loops=1)
-- Filter: (status = 'pending')
-- Rows Removed by Filter: 987153
-- Planning Time: 0.054 ms
-- Execution Time: 189.012 ms
-- The planner estimated 1 row but got 12,847 — stale statistics.
-- Fix: run ANALYZE events; then check the plan again.
The planner estimated 1 row. The query returned 12,847. I find this sort of discrepancy — off by four orders of magnitude — to be less a rounding error and more a cry for attention. Because the planner thought it was looking for a needle, it saw no harm in a sequential scan with a filter, and so it scanned nearly a million rows to find the matches. With accurate statistics — and an index on status — it would likely choose an index or bitmap scan instead. The fix here is to run ANALYZE events to update the statistics, then verify that the plan changes.
Key things to look for
When reading an EXPLAIN ANALYZE plan, focus on these areas:
Estimated rows vs actual rows
Compare rows=N in the cost estimate with rows=N in the actual output. When these differ by more than 10x, the planner is working with bad information. Common causes: stale statistics (fix with ANALYZE), correlated columns, or functions in WHERE clauses that the planner cannot estimate.
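The usual remedies, sketched against the events table from the earlier example (the statistics target value is illustrative):

```sql
-- Refresh the table's statistics so estimates match reality
ANALYZE events;

-- For a heavily skewed column, raise the per-column statistics
-- target (default 100) so the planner keeps a finer histogram
ALTER TABLE events ALTER COLUMN status SET STATISTICS 500;
ANALYZE events;
```

After either change, re-run EXPLAIN ANALYZE and confirm the estimated rows now track the actual rows.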
Scan types
A Seq Scan on a large table with a selective WHERE clause usually means a missing index. An Index Scan or Index Only Scan is generally what you want for selective queries. A Bitmap Index Scan followed by a Bitmap Heap Scan means PostgreSQL is combining multiple index lookups — normal for OR conditions or moderate selectivity.
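When a Seq Scan turns out to be a missing index, the fix is direct. A sketch, again using the illustrative events table; CONCURRENTLY is the production-friendly form because it avoids locking out writes:

```sql
-- Build the index without blocking concurrent writes
CREATE INDEX CONCURRENTLY events_status_idx ON events (status);

-- For very skewed columns, a partial index covers only the rows
-- you actually filter for, and stays much smaller
CREATE INDEX CONCURRENTLY events_pending_idx
    ON events (id) WHERE status = 'pending';
```

Then re-run EXPLAIN ANALYZE and verify the Seq Scan has become an Index Scan or Bitmap Heap Scan.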
Join methods and loop counts
A Nested Loop is efficient when the inner side is fast and the outer side has few rows. Check the loops value — if a nested loop's inner side runs thousands of times with high per-loop cost, the planner may have underestimated the outer side. A Hash Join or Merge Join handles larger result sets more predictably.
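When you suspect the planner picked the wrong join method, the session-local enable_* settings let you see the alternative plan. These are diagnostic levers, not a fix — the underlying cause is almost always a bad row estimate:

```sql
-- Temporarily forbid nested loops for this session only, then
-- compare the resulting plan and runtime against the original
SET enable_nestloop = off;
EXPLAIN ANALYZE
SELECT *
FROM orders o
JOIN customers c ON c.id = o.customer_id;
RESET enable_nestloop;
```

If the hash join version is dramatically faster, fix the statistics that led the planner astray rather than leaving the setting off.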
Sort and hash operations
Look for Sort Method: external merge or sort operations with large disk usage — these indicate the sort spilled to disk because work_mem was too small for the data volume. In-memory sorts show Sort Method: quicksort.
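The test for a work_mem problem is equally direct: raise it for the current session and see whether the sort moves back into memory (the value here is illustrative; remember work_mem is a per-sort-node, per-backend budget, not a global one):

```sql
-- Session-local increase; re-check whether "external merge"
-- becomes "quicksort" in the Sort Method line
SET work_mem = '64MB';
EXPLAIN ANALYZE SELECT * FROM events ORDER BY id;
RESET work_mem;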
Buffer statistics
With BUFFERS enabled, compare shared hit (cache) vs shared read (disk). High read counts on specific nodes tell you which part of the query is I/O-bound. Temp read/written values indicate operations spilling to disk.
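Block counts tell you where the I/O is; to learn how long it took, enable track_io_timing, after which BUFFERS output also includes I/O Timings lines (this setting requires appropriate privileges to change per-session):

```sql
-- With I/O timing on, each node reports time spent reading and
-- writing blocks, not just how many blocks it touched
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE created_at > now() - interval '7 days';
```

A node with few reads but high read time points at slow storage rather than excess data volume.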
How Gold Lapel relates
Allow me to describe what I do on your behalf. Gold Lapel reads every query plan that passes through its proxy — not once, but continuously. I compare row estimates against actuals over time, note patterns that indicate suboptimal plans, and when I recommend an index or create a materialized view, it is because the accumulated evidence leaves little room for doubt.
I should be clear: this does not replace EXPLAIN ANALYZE. When you are investigating a specific query, there is no substitute for reading the plan yourself. EXPLAIN ANALYZE gives you the full picture for one query at one moment. Gold Lapel gives you the aggregate picture across your entire workload over time. One is the magnifying glass; the other is the ongoing inventory. Both belong in a well-run household.