Fix PostgreSQL Sequential Scans: When and How to Force Indexes
In this illustration, an index exists. The planner has chosen to ignore it.
Not Every Sequential Scan Is a Problem
I should say this plainly at the outset: PostgreSQL chooses sequential scans deliberately. The query planner evaluates the cost of available access paths and picks the one it estimates to be cheapest. When it chooses a sequential scan, it is usually correct. The planner is not careless — it is making a cost-based decision with the information available to it.
When a Sequential Scan Is Optimal
Small tables. A table with fewer than roughly 1,000 rows fits in a handful of pages. Reading those pages sequentially is faster than traversing an index structure, dereferencing pointers, and fetching heap pages via random I/O.
High selectivity (returning most of the table). When a query returns more than 5–10% of the table's rows, sequential reads are faster than the random I/O pattern of an index scan.
No usable index exists. If no index matches the WHERE clause, a sequential scan is the only available option.
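The first case can be checked directly: the planner's own view of a table's size lives in pg_class (orders stands in for your table here):

```sql
-- Approximate row count and page count as the planner sees them.
-- A table that fits in a handful of pages is cheap to scan sequentially.
SELECT reltuples::bigint AS approx_rows, relpages
FROM pg_class
WHERE relname = 'orders';
```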
The Diagnostic Question
The question is not "is there a sequential scan?" — that alone tells you very little. The question is: is this scan reading significantly more rows than it returns?
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 8472;

Seq Scan on orders  (cost=0.00..28734.00 rows=18 width=120) (actual time=245.18..3912.44 rows=15 loops=1)
  Filter: (customer_id = 8472)
  Rows Removed by Filter: 1350012
  Buffers: shared hit=6200 read=18100

This scan read 1.35 million rows to return 15. The Rows Removed by Filter value is the most important number in this output — it tells you how much work was wasted. If it is orders of magnitude higher than the rows returned, an index should be handling this.
Why PostgreSQL Ignores Your Index
The planner estimates cost using data from pg_statistic and a set of cost model parameters. When it chooses a sequential scan despite an index being available, the planner is not being stubborn — it is responding to the information it has. One of six causes is almost always responsible.
1. Stale Table Statistics
PostgreSQL's planner relies on table statistics to estimate how many rows will match a filter. When those statistics are stale, the planner's row estimates are wrong — and wrong estimates produce wrong plans.
Diagnostic:
SELECT
relname,
last_analyze,
last_autoanalyze,
n_live_tup,
n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';

Cross-check row estimates:
-- Planner's estimate
SELECT reltuples::bigint FROM pg_class WHERE relname = 'orders';
-- Reality
SELECT count(*) FROM orders;

A difference of 10x or more confirms stale statistics.
Fix:
ANALYZE orders;

Then re-run EXPLAIN (ANALYZE, BUFFERS). If the plan changes from Seq Scan to Index Scan, the problem was stale statistics.
Prevention: Lower the autoanalyze threshold for high-churn tables:
ALTER TABLE orders SET (
autovacuum_analyze_threshold = 1000,
autovacuum_analyze_scale_factor = 0.02
);

For more details, see Autovacuum Tuning.
2. Cost Parameters Favor Sequential Reads
Two parameters heavily influence the planner's comparison between sequential and index scans.
random_page_cost defaults to 4.0. This tells the planner that a random page fetch (as used by index scans) is 4x more expensive than a sequential page fetch. On SSDs, this is not accurate.
effective_cache_size tells the planner how much data it can expect to find cached in memory. If set too low, the planner assumes most pages require disk access, inflating the estimated cost of random I/O.
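The trade-off can be sketched with back-of-the-envelope arithmetic. This is not PostgreSQL's actual cost model (which also weighs CPU costs, correlation, and caching); the table sizes and match counts below are invented for illustration, and the constants mirror the defaults:

```python
# Back-of-the-envelope sketch of the seq-scan vs index-scan trade-off.
# NOT PostgreSQL's real cost model; an illustrative table of 1,000,000 rows
# in 25,000 pages, with a query matching 15,000 of them.

def seq_scan_cost(pages, rows, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    """Read every page sequentially and examine every row."""
    return pages * seq_page_cost + rows * cpu_tuple_cost

def index_scan_cost(matching_rows, random_page_cost, cpu_index_tuple_cost=0.005):
    """Pessimistic case: one random heap fetch per matching row
    (poorly correlated data), plus per-tuple index CPU cost."""
    return matching_rows * random_page_cost + matching_rows * cpu_index_tuple_cost

seq = seq_scan_cost(25_000, 1_000_000)               # 35000.0
hdd = index_scan_cost(15_000, random_page_cost=4.0)  # ~60075: seq scan looks cheaper
ssd = index_scan_cost(15_000, random_page_cost=1.1)  # ~16575: index scan looks cheaper
print(seq, hdd, ssd)
```

The same query flips from sequential scan to index scan purely because random_page_cost changed — which is exactly what happens inside the planner when the parameter is tuned for SSDs.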
Diagnostic:
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('random_page_cost', 'seq_page_cost', 'effective_cache_size');

Fix: Test the impact by changing the parameter for a single session first:
SET random_page_cost = 1.1;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 8472;

If the plan switches to an index scan and performance improves, apply globally:
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();

Set effective_cache_size to approximately 75% of total system RAM:

ALTER SYSTEM SET effective_cache_size = '12GB'; -- on a 16 GB server
SELECT pg_reload_conf();

3. Function or Expression on the Indexed Column
A B-tree index on a column stores the column's raw values. When the query applies a function or cast, the planner sees an entirely different expression.
Common cases:
-- B-tree index on 'email' — not used
WHERE LOWER(email) = 'user@example.com'
-- B-tree index on 'created_at' — not used
WHERE created_at::date = '2026-01-15'
-- B-tree index on 'data' (jsonb) — not used
WHERE data->>'status' = 'active'

Fix: Create an expression index that matches the query exactly:
-- Matches WHERE LOWER(email) = ...
CREATE INDEX idx_lower_email ON users (LOWER(email));
-- Matches WHERE created_at::date = ...
CREATE INDEX idx_created_date ON events ((created_at::date));
-- Matches WHERE data->>'status' = ...
CREATE INDEX idx_data_status ON items ((data->>'status'));

Alternative: Rewrite the query to avoid the function. For date filtering on a timestamp column, use a range instead of a cast:
-- Instead of: WHERE created_at::date = '2026-01-15'
WHERE created_at >= '2026-01-15' AND created_at < '2026-01-16'

This uses the B-tree index on created_at directly. No expression index required.
4. Type Mismatch Between Column and Value
When the column type and the comparison value type differ, PostgreSQL inserts an implicit cast that can quietly prevent index use.
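A concrete sketch of the trap, using a hypothetical bigint id column: comparing it against a numeric literal makes PostgreSQL cast the column, not the literal, and a cast column cannot use its plain B-tree index (the Filter line shown is approximate; exact output varies by version):

```sql
-- id is bigint with a B-tree index.
-- A numeric literal (42.0) forces the column to be cast to numeric:
EXPLAIN SELECT * FROM orders WHERE id = 42.0;
--   Filter: ((id)::numeric = 42.0)   <- cast on the column side, index unusable

-- An integer literal matches the column type, so the index is usable:
EXPLAIN SELECT * FROM orders WHERE id = 42;
```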
Fix: Ensure the value in the query matches the column type:
-- Column is integer: pass an integer
WHERE integer_column = 42
-- Column is text: pass text
WHERE text_column = '42'

This is particularly common with ORMs and query builders, which may pass parameters as unexpected types without any visible indication in the application code.
5. High Selectivity — The Planner Is Right
When a query returns more than 5–10% of a table's rows, a sequential scan is genuinely faster than an index scan. This is not a bug. This is the planner doing precisely what it should.
If you remain unconvinced, test it explicitly:
-- Force PostgreSQL to avoid sequential scans
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'pending';
-- Reset
SET enable_seqscan = on;

Compare the actual time of both plans. If the sequential scan plan is faster, the planner was right.
6. OR Conditions and IN Lists
OR conditions across different columns present a structural challenge — no single B-tree index can serve both sides.
Fix: Rewrite as a UNION ALL of individually indexed queries:
SELECT * FROM orders WHERE customer_id = 42
UNION ALL
SELECT * FROM orders WHERE status = 'pending' AND customer_id != 42;

Each branch can use its own index. Before rewriting, check whether PostgreSQL is already handling the OR efficiently with bitmap scans. A BitmapOr node in the plan indicates the planner has found a way to combine multiple index scans on its own.
Creating the Right Index
When a sequential scan is genuinely the wrong choice and no suitable index exists, the next step is creating one. The index type must match the query pattern.
B-tree (Default)
Supports equality (=) and range (<, >, <=, >=, BETWEEN) queries. This is the correct choice for the vast majority of cases.
CREATE INDEX idx_orders_customer_id ON orders (customer_id);

GIN (Generalized Inverted Index)
Designed for values that contain multiple elements — full-text search with tsvector, JSONB containment queries (@>), and array overlap (&&).
-- Full-text search
CREATE INDEX idx_articles_fts ON articles USING GIN (to_tsvector('english', body));
-- JSONB containment
CREATE INDEX idx_items_data ON items USING GIN (data);

GiST (Generalized Search Tree)
Supports geometric data, range types, and nearest-neighbor searches. Also used by the pg_trgm extension for similarity and LIKE '%pattern%' queries.
-- Trigram index for LIKE '%pattern%'
CREATE INDEX idx_name_trgm ON customers USING GiST (name gist_trgm_ops);

BRIN (Block Range Index)
Stores summary information for ranges of physical table blocks. Remarkably small — effective only when column values are strongly correlated with physical storage order (e.g., an auto-incrementing id or append-only created_at).
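Whether a column qualifies can be checked before creating the index: pg_stats exposes the planner's measured correlation between column order and physical order (table and column names here match the running example):

```sql
-- correlation ranges from -1 to 1; values near +1 or -1 mean the
-- column's values closely track physical storage order, which is
-- the precondition for BRIN to be effective.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'events' AND attname = 'created_at';
```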
CREATE INDEX idx_events_created_brin ON events USING BRIN (created_at);

Partial Indexes — Index Only What You Query
A partial index includes only rows that satisfy a predicate — smaller, faster to scan, and less costly to maintain:
CREATE INDEX idx_active_orders ON orders (customer_id)
WHERE status = 'active';

This index is used only when the query includes the matching condition:
-- Uses the partial index
SELECT * FROM orders WHERE customer_id = 42 AND status = 'active';
-- Does NOT use the partial index
SELECT * FROM orders WHERE customer_id = 42 AND status = 'shipped';

Covering Indexes — Skip the Table Entirely
A covering index includes additional columns via INCLUDE, so the query can be satisfied entirely from the index without visiting the heap table at all:
CREATE INDEX idx_orders_cover ON orders (customer_id)
INCLUDE (total, created_at);

Expression Indexes — Match Your WHERE Clause
Expression indexes can be combined with partial indexes:
CREATE INDEX idx_active_lower_email ON users (LOWER(email))
WHERE active = true;

This creates a small, precisely targeted index for a specific query pattern. Minimal in size, maximal in purpose.
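For the combined index above, a query must both match the indexed expression and imply the partial predicate to qualify (table and column names as in the example):

```sql
-- Can use idx_active_lower_email: expression and predicate both match.
SELECT * FROM users
WHERE LOWER(email) = 'user@example.com' AND active = true;

-- Cannot use it: the WHERE clause does not imply active = true,
-- so rows outside the partial index might be needed.
SELECT * FROM users
WHERE LOWER(email) = 'user@example.com';
```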
The "Force Index" Question
PostgreSQL does not have a FORCE INDEX hint. This is a deliberate design decision, not an oversight. The proper response to a bad plan is to correct the information, not to override the planner.
enable_seqscan = off
This setting does not truly disable sequential scans — it assigns them an artificially high cost, making the planner prefer any alternative. It is a diagnostic tool for testing whether an index scan would be faster. It is not a production setting.
-- Plan with sequential scans enabled (default)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;
-- Plan with sequential scans penalized
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;
SET enable_seqscan = on;

If the forced index scan is significantly faster, one of the six causes above is responsible. Find and fix the root cause.
pg_hint_plan Extension
The pg_hint_plan extension provides MySQL-style query hints:
/*+ IndexScan(orders idx_orders_customer_id) */
SELECT * FROM orders WHERE customer_id = 42;

I would urge caution: hints bypass the planner's cost model entirely. If the underlying data distribution changes, the hint may force a plan that is slower than what the planner would have chosen. See pg_hint_plan: Direct Query Plan Control for details.
The Right Answer
If you find yourself reaching for a way to force an index, allow me to suggest a different approach. One of the six reasons above is the actual root cause. The fix is almost always: correct statistics, the right index, and the right cost parameters. Address the cause, and the planner will make the right choice on its own.
After the Fix — Verifying and Monitoring
Creating an index or adjusting configuration is not the end of the matter. Verify that the change had the intended effect, and set up monitoring to catch regressions.
Verify the Plan Changed
Re-run EXPLAIN (ANALYZE, BUFFERS) and confirm the plan now shows Index Scan, Index Only Scan, or Bitmap Index Scan instead of Seq Scan.
Check Index Usage Over Time
SELECT
indexrelname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY idx_scan DESC;

idx_scan shows how many times each index has been used since the last statistics reset. If the new index shows scans accumulating, it is earning its keep.
Check Table Scan Ratios
SELECT
relname,
seq_scan,
idx_scan,
ROUND(100.0 * idx_scan / NULLIF(seq_scan + idx_scan, 0), 1) AS idx_pct
FROM pg_stat_user_tables
WHERE relname = 'orders';

For tables that should be accessed primarily by index, idx_pct should be high (90%+). A low percentage indicates that many queries are still performing sequential scans.
Drop Unused Indexes
Indexes that are never scanned are actively costly. They consume disk space and slow down every INSERT, UPDATE, and DELETE.
SELECT
indexrelname,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'orders'
AND idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

If idx_scan = 0 after several weeks of normal traffic, the index is not being used. Confirm it is not needed for uniqueness constraints or foreign keys — and if it is not, let it go.
Verify Query Performance with pg_stat_statements
SELECT
LEFT(query, 80) AS query,
calls,
ROUND(mean_exec_time::numeric, 2) AS mean_ms,
ROUND(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
WHERE query LIKE '%orders%'
ORDER BY mean_exec_time DESC
LIMIT 10;

Compare mean_exec_time before and after the fix. If the value dropped, the fix achieved its purpose.
Set Up Monitoring
Track sequential scan counts for critical tables over time. A sudden increase in seq_scan on a table that should be accessed via indexes indicates a regression — likely caused by a new query pattern, stale statistics, or a dropped index.
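One low-dependency way to do this is to snapshot the counters yourself. A sketch, assuming a seq_scan_snapshots table you create for the purpose (the table name and the one-snapshot-per-day schedule are illustrative, e.g. driven by cron or pg_cron):

```sql
-- One-time setup: a table to hold periodic counter snapshots.
CREATE TABLE IF NOT EXISTS seq_scan_snapshots (
  captured_at timestamptz DEFAULT now(),
  relname     name,
  seq_scan    bigint
);

-- Scheduled job (e.g., once daily): record current counters.
INSERT INTO seq_scan_snapshots (relname, seq_scan)
SELECT relname, seq_scan FROM pg_stat_user_tables;

-- Compare today's counters against yesterday's to spot a jump
-- (assumes exactly one snapshot per day).
SELECT curr.relname,
       curr.seq_scan - prev.seq_scan AS new_seq_scans
FROM seq_scan_snapshots curr
JOIN seq_scan_snapshots prev USING (relname)
WHERE curr.captured_at::date = current_date
  AND prev.captured_at::date = current_date - 1
ORDER BY new_seq_scans DESC;
```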
The Diagnostic Cheat Sheet
A quick-reference decision tree for sequential scan diagnosis. Keep this near at hand.
Step 1 — Is the table small? Query pg_class for reltuples. If under 1,000 rows, the sequential scan is likely the right choice.
Step 2 — Is Rows Removed by Filter high? Run EXPLAIN (ANALYZE). If it is orders of magnitude higher than returned rows, continue to Step 3.
Step 3 — Are statistics current? Check last_analyze in pg_stat_user_tables. If stale, run ANALYZE tablename;.
Step 4 — Are cost parameters correct for your storage? Check random_page_cost. If using SSDs and the value is still 4.0, set it to 1.1.
Step 5 — Does the index match the query? Check for function calls, type mismatches, OR conditions. Create the appropriate expression index or rewrite the query.
Step 6 — Create the index. Choose the right index type (B-tree, GIN, GiST, BRIN). Consider partial and covering indexes.
Step 7 — Verify. Re-run EXPLAIN (ANALYZE, BUFFERS). Check pg_stat_user_indexes. Monitor with pg_stat_statements.
Quick-Reference SQL
-- Check table size
SELECT reltuples::bigint FROM pg_class WHERE relname = 'tablename';
-- Check last analyze
SELECT last_analyze, last_autoanalyze FROM pg_stat_user_tables WHERE relname = 'tablename';
-- Check cost parameters
SHOW random_page_cost;
SHOW effective_cache_size;
-- List indexes on a table
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'tablename';
-- Check index usage
SELECT indexrelname, idx_scan FROM pg_stat_user_indexes WHERE relname = 'tablename';
-- Check seq scan vs index scan ratio
SELECT seq_scan, idx_scan FROM pg_stat_user_tables WHERE relname = 'tablename';