Row estimation
The planner's prediction of how many rows a query step will return — and, if you'll permit me, the single most common reason it picks a bad plan.
Every time PostgreSQL plans a query, it estimates how many rows each step will produce. These estimates drive every downstream decision — which join strategy to use, whether to scan an index or the whole table, how much memory to allocate for sorts. The estimates come from table statistics collected by ANALYZE. When the estimates are close to reality, the planner produces excellent plans. When they are orders of magnitude off — estimated 50 rows, actual 247,000 — the planner makes choices that can turn a 50-millisecond query into a 15-second one. I find this personally alarming, and I suspect you will too.
What row estimation is
At every node in a query plan, the planner needs to answer one question: how many rows will this step produce? For a sequential scan with a WHERE status = 'pending' filter, the planner looks up the frequency of 'pending' in the column's statistics and multiplies it by the table's row count. For a join, it estimates how many rows the join condition will match. For an aggregation, it estimates the number of groups.
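For a single equality filter, that arithmetic is simple enough to reproduce by hand. A sketch using hypothetical numbers (the 1,000,000 row count and 0.048 frequency are illustrative, not taken from any example below):

```sql
-- The planner's estimate for WHERE status = 'pending' is roughly:
--   estimated rows = reltuples * selectivity
-- where the selectivity comes from the most-common-values list in pg_stats.
-- With reltuples = 1,000,000 and a 'pending' frequency of 0.048:
--   1,000,000 * 0.048 = 48,000 estimated rows

-- reltuples holds the planner's current row-count estimate for the table
SELECT reltuples::bigint AS estimated_table_rows
FROM pg_class
WHERE relname = 'orders';
```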
These estimates are stored in pg_statistic (exposed through the pg_stats view) and include the most common values and their frequencies, histogram buckets for range queries, the number of distinct values, and the fraction of nulls. The ANALYZE command collects this data by sampling rows from each table — 30,000 rows by default (300 times the default statistics target of 100).
The planner does not guess. It calculates. But the calculation is only as good as the statistics it has been given — and stale statistics are, I'm afraid, rather common in the wild.
-- See what the planner knows about a column's data distribution
SELECT
attname,
n_distinct,
most_common_vals,
most_common_freqs,
null_frac
FROM pg_stats
WHERE tablename = 'orders'
AND attname = 'status';
-- If most_common_vals is NULL or outdated, the planner is guessing.

Why estimates matter
Row estimates are the foundation of every plan decision. The planner uses them to compare the cost of alternative strategies, and even modest errors can flip the choice between a plan that finishes in milliseconds and one that overstays its welcome by several seconds.
- Join strategy — A nested loop join is efficient when the outer side has few rows (each row triggers one index lookup on the inner side). When the planner estimates 10 rows but the actual is 50,000, the inner side executes 50,000 times instead of 10. A hash join — which builds a hash table once and probes it for each outer row — would have been orders of magnitude faster. The planner chose correctly for the data it believed it had. The data it believed it had was fiction.
- Scan method — The planner chooses between a sequential scan and an index scan based on how many rows it expects to return. If it estimates a query will return 80% of a table's rows, a sequential scan is cheaper than thousands of random index lookups. If it estimates 0.01% but the real answer is 5%, it may choose an index scan when a bitmap scan would be more efficient, or skip the index entirely.
- Memory allocation — Sort and hash operations are sized based on estimated row counts. Underestimates cause spills to disk when the actual data volume exceeds work_mem.
- Join order — In multi-table joins, the planner arranges tables to minimize intermediate result sizes. Wrong row estimates at one table can cascade through the join order, producing millions of intermediate rows that a correct estimate would have avoided.
-- A bad estimate at one node cascades through the entire plan
EXPLAIN ANALYZE
SELECT o.*, p.name, c.email
FROM orders o
JOIN products p ON p.id = o.product_id
JOIN customers c ON c.id = o.customer_id
WHERE o.region = 'eu-west' AND o.created_at > now() - interval '1 day';
-- Nested Loop (cost=1.27..45.89 rows=3 width=192)
-- (actual time=0.045..14523.678 rows=8400 loops=1)
-- -> Nested Loop (cost=0.85..32.45 rows=3 width=160)
-- (actual time=0.034..9876.543 rows=8400 loops=1)
-- -> Index Scan using orders_created_at_idx on orders o
-- (cost=0.43..12.67 rows=3 width=128)
-- (actual time=0.021..123.456 rows=8400 loops=1)
-- Filter: (region = 'eu-west')
-- -> Index Scan using products_pkey on products p
-- (cost=0.42..6.59 rows=1 width=32)
-- (actual time=0.002..0.002 rows=1 loops=8400)
-- -> Index Scan using customers_pkey on customers c
-- (cost=0.42..4.48 rows=1 width=32)
-- (actual time=0.001..0.001 rows=1 loops=8400)
-- The planner estimated 3 rows from orders. It got 8,400.
-- It chose nested loops for both joins — fine for 3 rows,
-- catastrophic for 8,400 (the inner sides each execute 8,400 times).

How to spot bad estimates
EXPLAIN ANALYZE is the diagnostic tool — allow me to recommend it unreservedly. It runs the query and shows both the planner's estimated row count and the actual row count at every node in the plan. The comparison between these two numbers tells you whether the planner had accurate information or was operating on outdated intelligence.
-- Compare estimated rows vs actual rows
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
-- Output:
-- Seq Scan on orders (cost=0.00..25432.00 rows=1 width=128)
-- (actual time=0.015..203.456 rows=47823 loops=1)
-- Filter: (status = 'pending')
-- Rows Removed by Filter: 952177
-- Planning Time: 0.062 ms
-- Execution Time: 211.789 ms
-- The planner estimated 1 row. The query returned 47,823.
-- That 47,000x error caused the planner to skip an index scan
-- that would have finished in under 50 ms.

What to look for:
- Orders-of-magnitude differences — An estimate of 1 with an actual of 47,823 is the planner flying entirely blind. Anything beyond 10x warrants your attention.
- Consistent underestimates at the same node — If a filter or join consistently returns far more rows than estimated across multiple query executions, the statistics for that column are stale or insufficient.
- Cascading errors in joins — A bad estimate at an early node produces wrong cost calculations for every node above it. Look for the first node where estimates diverge — that is where the trouble begins, and where your fix should start.
- Seq Scan on a large table with a selective filter — If the planner chose a sequential scan and the actual rows are a tiny fraction of the table, the estimate was too high (the planner thought the filter was not selective) or statistics are missing entirely.
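Running EXPLAIN ANALYZE by hand does not scale to a whole workload. One way to catch misestimates continuously is the auto_explain module, which logs the full plan — with actual row counts — for any query slower than a threshold. A sketch; the 250 ms threshold is an arbitrary choice:

```sql
-- Log plans (with actual rows) for queries slower than 250 ms.
-- Session-level here; set these in postgresql.conf (with
-- shared_preload_libraries = 'auto_explain') to cover all sessions.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '250ms';
SET auto_explain.log_analyze = on;   -- include actual row counts
SET auto_explain.log_timing = off;   -- skip per-row timing to reduce overhead
```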
Common causes of bad estimates
Stale statistics
The most frequent cause — and, mercifully, the simplest to resolve. A table receives a bulk load, a mass update changes the distribution of a status column, or a new category appears in a previously stable dataset. Autovacuum triggers ANALYZE when roughly 10% of rows have changed, but that threshold can be too slow for rapidly changing tables. Until ANALYZE runs, the planner works with the old distribution and produces estimates that bear no resemblance to the data actually in the table.
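To check whether a table's statistics are stale, compare the number of rows modified since the last ANALYZE against the table size, and see when statistics were last refreshed. A sketch using pg_stat_user_tables; the 1% scale factor is an illustrative choice:

```sql
-- Rows modified since the last ANALYZE, and when statistics were last updated
SELECT
    relname,
    n_live_tup,
    n_mod_since_analyze,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';

-- For rapidly changing tables, trigger ANALYZE at ~1% churn instead of ~10%
ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.01);
```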
Correlated columns
The planner assumes that conditions on different columns are statistically independent. If you filter on WHERE country = 'US' AND state = 'California', the planner multiplies each selectivity separately: 40% (country) times 12% (state) equals 4.8%. But California only exists within the US, so the actual selectivity is closer to 15%. The planner's arithmetic is impeccable; its assumption is wrong. This independence assumption produces underestimates whenever columns are correlated — which is common in real-world data.
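The arithmetic behind that misestimate, worked through with the percentages above (the 1,000,000-row table size is illustrative):

```sql
-- Independence assumption vs reality for country = 'US' AND state = 'California'
SELECT
    1000000 * 0.40 * 0.12 AS planner_estimate,     -- 48,000 rows
    1000000 * 0.15        AS actual_rows,          -- 150,000 rows
    0.15 / (0.40 * 0.12)  AS underestimate_factor; -- ~3.1x too low
```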
Complex expressions
The planner has detailed statistics for simple column comparisons (status = 'pending') but falls back to default estimates for expressions it cannot analyze. WHERE extract(year from created_at) = 2026 gets a generic 0.5% selectivity estimate because the planner does not maintain statistics on computed expressions. Rewriting as a range condition (WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01') lets the planner use the histogram.
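When the expression cannot be rewritten — say, the year extraction is generated by an ORM — PostgreSQL 14 and later can collect statistics on the expression itself. A sketch; the statistics object name is mine:

```sql
-- PostgreSQL 14+: collect statistics on an expression the planner
-- would otherwise estimate with a hard-coded default selectivity
CREATE STATISTICS orders_created_year_stats
    ON (extract(year FROM created_at)) FROM orders;
ANALYZE orders;
-- WHERE extract(year from created_at) = 2026 now gets a real estimate
```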
Joins multiplying errors
Row estimation errors compound across joins. If the planner underestimates table A by 5x and table B by 3x, the join estimate can be off by 15x or more. Multi-table joins are particularly vulnerable — a moderate error at the first table propagates through every subsequent join, and by the time the planner estimates the final result, the number can be off by several orders of magnitude. It is, if I may say so, the infrastructural equivalent of a small navigational error at departure producing a landing several hundred miles from the intended airport.
Fixing bad estimates
Run ANALYZE
The first and most effective fix — and I appreciate when the remedy is proportionate to the problem. If statistics are stale, updating them is usually all it takes. One command, no downtime, immediate improvement. For tables that change rapidly, lower autovacuum_analyze_scale_factor so autovacuum runs ANALYZE more frequently.
-- Update statistics for the table
ANALYZE orders;
-- Verify the plan changes
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
-- Output after ANALYZE:
-- Bitmap Heap Scan on orders (cost=534.12..18923.45 rows=47500 width=128)
-- (actual time=12.345..45.678 rows=47823 loops=1)
-- Recheck Cond: (status = 'pending')
-- Heap Blocks: exact=8234
-- -> Bitmap Index Scan on orders_status_idx (cost=0.00..522.25 rows=47500 width=0)
-- (actual time=8.901..8.901 rows=47823 loops=1)
-- Index Cond: (status = 'pending')
-- Planning Time: 0.098 ms
-- Execution Time: 51.234 ms
-- 4x faster. Same query, same data. Better statistics.

Increase the statistics target
The default statistics target of 100 means PostgreSQL keeps up to 100 most-common values and 100 histogram buckets per column. For columns with many distinct values or heavily skewed distributions, this resolution may be too coarse. Increasing it to 500 or 1000 gives the planner finer-grained data to work with, at the cost of slightly slower ANALYZE and slightly more memory for the statistics catalog.
-- Increase the statistics target for a column with many distinct values
ALTER TABLE orders ALTER COLUMN product_id SET STATISTICS 1000;
ANALYZE orders;
-- Default target is 100 histogram buckets.
-- 1000 gives the planner a finer-grained picture of the distribution.
-- Useful for high-cardinality columns or skewed data.

Create extended statistics for correlated columns
Since PostgreSQL 10, CREATE STATISTICS lets you tell the planner about dependencies between columns. This is a precision tool — it directly addresses the independence assumption that causes underestimates on correlated filters, and I am rather fond of it.
-- The planner assumes independent columns and multiplies selectivities.
-- country = 'US' returns 40% of rows. state = 'California' returns 12%.
-- Planner estimates: 40% * 12% = 4.8% of rows.
-- Reality: 15% of rows, because California is entirely within the US.
-- Fix: create extended statistics (PostgreSQL 10+)
CREATE STATISTICS orders_country_state (dependencies)
ON country, state FROM orders;
ANALYZE orders;
-- The planner now knows that state depends on country
-- and adjusts its row estimate accordingly.

Extended statistics support three types: dependencies (functional dependencies between columns), ndistinct (distinct value counts for column combinations), and mcv (most common value combinations, PostgreSQL 12+). For most correlated-column problems, dependencies is sufficient.
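When dependencies alone is not enough — a GROUP BY over two columns is misestimated, or particular value combinations are heavily skewed — the other two types can be added to the same statistics object. A sketch; the object name and column pair are illustrative:

```sql
-- ndistinct improves GROUP BY region, status estimates;
-- mcv (PostgreSQL 12+) captures skewed value combinations
CREATE STATISTICS orders_region_status_stats (ndistinct, mcv)
    ON region, status FROM orders;
ANALYZE orders;
```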
Rewrite expressions the planner can analyze
Replace function calls and computed expressions in WHERE clauses with range conditions on the original column. WHERE date_trunc('month', created_at) = '2026-03-01' gets a default estimate. WHERE created_at >= '2026-03-01' AND created_at < '2026-04-01' uses the histogram. The results are identical; the estimates are not. The planner can only work with what it can see.
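The two forms compared directly; a sketch against the same orders table used throughout:

```sql
-- Opaque to the planner: computed expression, default selectivity estimate
EXPLAIN SELECT * FROM orders
WHERE date_trunc('month', created_at) = '2026-03-01';

-- Transparent to the planner: range condition, histogram-based estimate
EXPLAIN SELECT * FROM orders
WHERE created_at >= '2026-03-01' AND created_at < '2026-04-01';
```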
How Gold Lapel relates
Gold Lapel sits between your application and PostgreSQL as a query proxy. I mention this because row estimation is precisely the kind of problem that benefits from continuous observation rather than periodic investigation. Gold Lapel watches every query's execution plan over time, including the row estimates and the actual row counts. When a query pattern consistently shows large estimation errors — the kind that cause the planner to choose nested loops where it should choose hash joins — it flags the pattern and applies complementary optimizations.
These optimizations work with the planner, not against it. Gold Lapel may create a materialized view that pre-computes a problematic join, recommend an index that changes the set of plans available, or restructure a query so the planner receives better selectivity information. The goal is to change the inputs to the planner so that its row estimates — and therefore its plan choices — improve without you having to run EXPLAIN ANALYZE on every query yourself. Though I would never discourage you from doing so.