EXPLAIN ANALYZE
If I may be direct: guessing why a query is slow is beneath you. This tool shows you exactly what happened.
I have never once resolved a performance issue by theorizing about it. EXPLAIN ANALYZE executes a SQL statement and returns the query plan that PostgreSQL used, annotated with actual execution times, actual row counts, and the number of loop iterations at each step. Where plain EXPLAIN shows you what the planner predicted would happen, EXPLAIN ANALYZE shows you what actually happened. The gap between the two is where slow queries hide.
What EXPLAIN ANALYZE does
EXPLAIN ANALYZE does two things: it runs the query, and it instruments the execution. Every node in the query plan is wrapped with timing logic that records when it started, how many rows it produced, and how many times it was invoked. After execution completes, PostgreSQL formats the plan as a tree, with each node showing both the planner's estimates and the actual numbers observed during execution.
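That per-node instrumentation is not free: on some systems the clock calls themselves add measurable overhead. When you only need row counts, the TIMING option (a real EXPLAIN option, shown here on the document's earlier example query) skips the per-node clock reads:

```sql
-- Per-node timing has overhead on some platforms; TIMING off keeps
-- actual row counts and loop counts but skips the clock calls
EXPLAIN (ANALYZE, TIMING off)
SELECT * FROM orders WHERE customer_id = 42;
```

Total Execution Time is still reported; only the per-node actual time figures are omitted.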
This is not a separate profiling tool. It is built into PostgreSQL and works on any query — SELECTs, INSERTs, UPDATEs, DELETEs, CTEs, subqueries, functions. The syntax is straightforward:
-- EXPLAIN shows the planner's estimated plan (does NOT run the query)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Output:
-- Index Scan using orders_customer_id_idx on orders (cost=0.43..8.45 rows=1 width=64)
-- Index Cond: (customer_id = 42)
Plain EXPLAIN shows the estimated plan without running the query. Add ANALYZE and it runs for real:
-- EXPLAIN ANALYZE actually runs the query and shows real numbers
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- Output:
-- Index Scan using orders_customer_id_idx on orders
-- (cost=0.43..8.45 rows=1 width=64)
-- (actual time=0.023..0.025 rows=1 loops=1)
-- Index Cond: (customer_id = 42)
-- Planning Time: 0.092 ms
-- Execution Time: 0.048 ms
The critical addition is the (actual time=... rows=... loops=...) line. That is the evidence. Everything else is the planner's best guess — and the planner, while competent, is not above being wrong.
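One caveat worth a moment of your attention: because ANALYZE really executes the statement, a data-modifying query really modifies data. The standard precaution is to wrap it in a transaction you roll back (table name borrowed from the examples below):

```sql
-- EXPLAIN ANALYZE executes the statement for real, so a DELETE
-- really deletes; roll back the transaction to discard the changes
BEGIN;
EXPLAIN ANALYZE DELETE FROM events WHERE status = 'pending';
ROLLBACK;
```

You get the full instrumented plan, and the data remains untouched.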
Why it matters
The PostgreSQL query planner is good, but it is not omniscient. It makes decisions — which index to use, which join strategy to pick, what order to process tables in — based on table statistics that may be stale, correlation assumptions that may be wrong, and cost estimates that may not reflect your hardware. When those decisions go wrong, a query that should take 5 milliseconds takes 5 seconds.
Without EXPLAIN ANALYZE, you know the query is slow. With it, you know where it is slow and why. Specifically:
- Row estimate mismatches reveal stale statistics or correlated columns the planner does not account for
- Sequential scans on large tables suggest missing indexes
- Nested loop joins with high loop counts show where the planner underestimated the outer side's row count
- Sort and hash operations spilling to disk indicate insufficient work_mem
- High buffer read counts point to data that does not fit in the cache
Every performance investigation starts the same way: run EXPLAIN ANALYZE, read the plan, find the node where estimates diverge from reality. One does not diagnose by intuition when the evidence is a single command away.
Practical examples
Adding BUFFERS for I/O visibility
The BUFFERS option adds block-level I/O information to each node. This tells you how much data came from PostgreSQL's buffer cache ("shared hit") versus disk or the OS page cache ("shared read"). Nodes with high read counts are where the I/O bottleneck lives.
-- BUFFERS shows how much data came from cache vs disk
EXPLAIN (ANALYZE, BUFFERS) SELECT *
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '7 days';
-- Output:
-- Nested Loop (cost=0.85..1234.56 rows=150 width=96)
-- (actual time=0.045..3.812 rows=147 loops=1)
-- Buffers: shared hit=892 read=34
-- -> Index Scan using orders_created_at_idx on orders o
-- (cost=0.43..456.78 rows=150 width=64)
-- (actual time=0.023..1.456 rows=147 loops=1)
-- Index Cond: (created_at > (now() - '7 days'::interval))
-- Buffers: shared hit=450 read=12
-- -> Index Scan using customers_pkey on customers c
-- (cost=0.42..5.18 rows=1 width=32)
-- (actual time=0.002..0.003 rows=1 loops=147)
-- Index Cond: (id = o.customer_id)
-- Buffers: shared hit=442 read=22
-- Planning Time: 0.215 ms
-- Execution Time: 3.971 ms
In this plan, 892 blocks came from cache and only 34 were read from disk. A healthy ratio — the household is running efficiently. If the reads were dominant, you would know the working set exceeds shared_buffers, or the query is scanning more data than it needs to.
Catching a row estimation mismatch
This is the single most common cause of slow queries: the planner expects a small number of rows and chooses a plan that is efficient only at that small volume (a nested loop join, for instance, or skipping an index it would otherwise use) but catastrophic at the actual one.
-- Row estimation mismatch: planner expected 1 row, got 12,847
EXPLAIN ANALYZE SELECT * FROM events WHERE status = 'pending';
-- Output:
-- Seq Scan on events (cost=0.00..25432.00 rows=1 width=128)
-- (actual time=0.012..187.345 rows=12847 loops=1)
-- Filter: (status = 'pending')
-- Rows Removed by Filter: 987153
-- Planning Time: 0.054 ms
-- Execution Time: 189.012 ms
-- The planner estimated 1 row but got 12,847 — stale statistics.
-- Fix: run ANALYZE events; then check the plan again.
The planner estimated 1 row. The query returned 12,847. I find this sort of discrepancy — off by four orders of magnitude — to be less a rounding error and more a cry for attention. Because the planner thought it was looking for a needle, it saw no harm in a sequential scan with a filter, and so it scanned nearly a million rows to find the matches. With accurate statistics — and an index on status — it would likely choose an index or bitmap scan instead. The fix here is to run ANALYZE events to update the statistics, then verify that the plan changes.
Key things to look for
When reading an EXPLAIN ANALYZE plan, focus on these areas:
Estimated rows vs actual rows
Compare rows=N in the cost estimate with rows=N in the actual output. When these differ by more than 10x, the planner is working with bad information. Common causes: stale statistics (fix with ANALYZE), correlated columns, or functions in WHERE clauses that the planner cannot estimate.
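The usual remedies, sketched against the events table from the earlier example (the statistics target value is illustrative):

```sql
-- Refresh the table's statistics so estimates match reality
ANALYZE events;

-- For a heavily skewed column, raise the per-column statistics
-- target (default 100) so the planner keeps a finer histogram
ALTER TABLE events ALTER COLUMN status SET STATISTICS 500;
ANALYZE events;
```

After either change, re-run EXPLAIN ANALYZE and confirm the estimated rows now track the actual rows.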
Scan types
A Seq Scan on a large table with a selective WHERE clause usually means a missing index. An Index Scan or Index Only Scan is generally what you want for selective queries. A Bitmap Index Scan followed by a Bitmap Heap Scan means PostgreSQL is combining multiple index lookups — normal for OR conditions or moderate selectivity.
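When a Seq Scan turns out to be a missing index, the fix is direct. A sketch, again using the illustrative events table; CONCURRENTLY is the production-friendly form because it avoids locking out writes:

```sql
-- Build the index without blocking concurrent writes
CREATE INDEX CONCURRENTLY events_status_idx ON events (status);

-- For very skewed columns, a partial index covers only the rows
-- you actually filter for, and stays much smaller
CREATE INDEX CONCURRENTLY events_pending_idx
    ON events (id) WHERE status = 'pending';
```

Then re-run EXPLAIN ANALYZE and verify the Seq Scan has become an Index Scan or Bitmap Heap Scan.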
Join methods and loop counts
A Nested Loop is efficient when the inner side is fast and the outer side has few rows. Check the loops value — if a nested loop's inner side runs thousands of times with high per-loop cost, the planner may have underestimated the outer side. A Hash Join or Merge Join handles larger result sets more predictably.
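When you suspect the planner picked the wrong join method, the session-local enable_* settings let you see the alternative plan. These are diagnostic levers, not a fix — the underlying cause is almost always a bad row estimate:

```sql
-- Temporarily forbid nested loops for this session only, then
-- compare the resulting plan and runtime against the original
SET enable_nestloop = off;
EXPLAIN ANALYZE
SELECT *
FROM orders o
JOIN customers c ON c.id = o.customer_id;
RESET enable_nestloop;
```

If the hash join version is dramatically faster, fix the statistics that led the planner astray rather than leaving the setting off.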
Sort and hash operations
Look for Sort Method: external merge or sort operations with large disk usage — these indicate the sort spilled to disk because work_mem was too small for the data volume. In-memory sorts show Sort Method: quicksort.
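The test for a work_mem problem is equally direct: raise it for the current session and see whether the sort moves back into memory (the value here is illustrative; remember work_mem is a per-sort-node, per-backend budget, not a global one):

```sql
-- Session-local increase; re-check whether "external merge"
-- becomes "quicksort" in the Sort Method line
SET work_mem = '64MB';
EXPLAIN ANALYZE SELECT * FROM events ORDER BY id;
RESET work_mem;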
Buffer statistics
With BUFFERS enabled, compare shared hit (cache) vs shared read (disk). High read counts on specific nodes tell you which part of the query is I/O-bound. Temp read/written values indicate operations spilling to disk.
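Block counts tell you where the I/O is; to learn how long it took, enable track_io_timing, after which BUFFERS output also includes I/O Timings lines (this setting requires appropriate privileges to change per-session):

```sql
-- With I/O timing on, each node reports time spent reading and
-- writing blocks, not just how many blocks it touched
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE created_at > now() - interval '7 days';
```

A node with few reads but high read time points at slow storage rather than excess data volume.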
How Gold Lapel relates
Allow me to describe what I do on your behalf. Gold Lapel reads every query plan that passes through its proxy — not once, but continuously. I compare row estimates against actuals over time, note patterns that indicate suboptimal plans, and when I recommend an index or create a materialized view, it is because the accumulated evidence leaves little room for doubt.
I should be clear: this does not replace EXPLAIN ANALYZE. When you are investigating a specific query, there is no substitute for reading the plan yourself. EXPLAIN ANALYZE gives you the full picture for one query at one moment. Gold Lapel gives you the aggregate picture across your entire workload over time. One is the magnifying glass; the other is the ongoing inventory. Both belong in a well-run household.