Index bloat
Your index has grown rather larger than its contents warrant. If I may, I'd like to show you where the dead entries are hiding — and how to restore order.
Index bloat is wasted space within a PostgreSQL index — a filing system that has accumulated outdated entries nobody will ever look up again. When rows are updated or deleted, the old index entries are not immediately removed; they linger as dead pointers until VACUUM attends to them. Even after dead entries are cleared, B-tree indexes do not automatically compact their page structure. Leaf pages that were once full can end up half-empty or entirely empty, and those pages stay allocated. The result is an index that is physically larger than it needs to be, forcing index scans to read more pages, wasting buffer cache on sparse data, and inflating disk usage. REINDEX or pg_repack will rebuild the index from scratch and restore it to proper working order.
What index bloat is
A B-tree index in PostgreSQL is a balanced tree of 8 KB pages. Leaf pages hold the actual index entries — each one a (key, pointer) pair that maps a column value to a heap tuple. When a row is updated, PostgreSQL creates a new heap tuple and inserts a new index entry pointing to it. The old index entry, still pointing to the now-dead heap tuple, remains in the leaf page until VACUUM removes it. Think of it as a directory where someone has crossed out old entries but never torn out the pages.
This creates two forms of wasted space. First, dead index entries occupy bytes on leaf pages — entries that will never be followed but must be skipped during scans. Second, when VACUUM does remove dead entries, or when page splits leave behind unevenly filled pages, the leaf pages become sparse. A page that could hold 200 index entries might contain only 40. PostgreSQL does not merge underfilled pages or return empty pages to the free space map in normal operation. The pages stay allocated, and the index stays large.
Over time, on tables with heavy update or delete activity, B-tree indexes can grow to two, three, or even ten times the size that the live data warrants. The index still works — queries still find the right rows — but every scan must thumb through pages of stale references to reach the ones that matter.
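One can reproduce the effect on a scratch database in a minute or two. The sketch below is purely illustrative — the table bloat_demo and its contents are hypothetical, and exact sizes will vary with version and configuration:

```sql
-- Hypothetical scratch table; run only on a throwaway database
CREATE TABLE bloat_demo (id bigint PRIMARY KEY, val text);
INSERT INTO bloat_demo SELECT g, 'x' FROM generate_series(1, 100000) g;

SELECT pg_size_pretty(pg_relation_size('bloat_demo_pkey'));  -- baseline size

-- Full-table updates: the heap pages are full, so the new tuples land on
-- new pages and each update adds a fresh index entry for the same key
UPDATE bloat_demo SET val = 'y';
UPDATE bloat_demo SET val = 'z';

VACUUM bloat_demo;  -- removes dead entries but does not compact pages

SELECT pg_size_pretty(pg_relation_size('bloat_demo_pkey'));  -- typically much larger
```

The second size check typically shows the primary-key index well above its baseline even after VACUUM, which is the point: the dead entries are gone, the pages remain.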
Why index bloat matters
Allow me to be direct: index bloat is not merely a storage concern. It affects query performance in several compounding ways, each one quietly making the next worse:
- More I/O per index scan — an index scan traverses the tree from root to leaf and then follows pointers to the heap. Sparse leaf pages mean the scan reads more pages to find the same number of live entries. For range scans that touch many leaf pages, the overhead multiplies.
- Wasted buffer cache — PostgreSQL caches index pages in shared_buffers. Sparse pages full of dead or empty space occupy cache slots that could hold denser, more useful data. Effective cache capacity drops.
- Reduced index-only scan effectiveness — index-only scans avoid heap access by reading values directly from the index. When index pages are sparse, the scan reads more pages for the same data, reducing the I/O advantage that makes index-only scans fast.
- Inflated backups — physical backups (pg_basebackup, WAL archiving) copy index files at their full size, bloat included. A 10 GB index that should be 2 GB adds 8 GB to every base backup. One pays to store the emptiness, which I find rather dispiriting.
- Slower writes — each insert into a bloated index costs only slightly more on its own, but sparse pages can add an extra level to the tree, lengthening every traversal. More importantly, the buffer cache wasted on bloated indexes puts pressure on all other operations.
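The buffer-cache symptom can be observed directly from pg_statio_user_indexes, which tracks block reads and cache hits per index since the last statistics reset. A quick check one might run:

```sql
-- Per-index cache behaviour since the last statistics reset
SELECT
    indexrelname,
    idx_blks_read,   -- index blocks read from disk
    idx_blks_hit,    -- index blocks found in shared_buffers
    round(idx_blks_hit::numeric
          / nullif(idx_blks_hit + idx_blks_read, 0), 3) AS hit_ratio
FROM pg_statio_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_blks_read DESC
LIMIT 10;
```

A large index with a persistently low hit ratio is a candidate for the bloat measurements below — a sparse index simply does not fit in cache the way a dense one would.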
Measuring index bloat
Before one can fix what ails the index, one must know how far the situation has deteriorated. There are two approaches: precise measurement with pgstatindex() and quick estimation from catalog statistics.
pgstatindex (precise)
The pgstattuple extension provides pgstatindex(), which physically reads every page of a B-tree index and reports structural details. The key metric is avg_leaf_density — the percentage of leaf page space occupied by live index entries.
-- Measure index bloat precisely with pgstatindex
-- Note: pgstatindex() works only on B-tree indexes, so filter by access method
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT
    s.indexrelname,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
    i.version,
    i.tree_level,
    i.index_size AS raw_size,
    i.leaf_pages,
    i.empty_pages,
    i.deleted_pages,
    i.avg_leaf_density,
    i.leaf_fragmentation
FROM pg_stat_user_indexes s
JOIN pg_class c ON c.oid = s.indexrelid
JOIN pg_am am ON am.oid = c.relam AND am.amname = 'btree'
CROSS JOIN LATERAL pgstatindex(s.indexrelid) AS i
WHERE s.schemaname = 'public'
ORDER BY pg_relation_size(s.indexrelid) DESC
LIMIT 10;
-- avg_leaf_density: percentage of leaf page space used by live tuples
-- 90%+ = healthy
-- 50-70% = moderate bloat
-- below 50% = significant bloat, rebuild recommended
A word of caution: because pgstatindex() reads the entire index, it is slow on large indexes. I would not run this against a 50 GB index during peak hours. Off-hours or a replica, please.
Catalog estimation (quick)
For a faster but less precise check, compare index sizes from pg_stat_user_indexes against what you would expect for the table's row count. An index that is substantially larger than its table is often bloated. Indexes with zero scans since the last statistics reset may be both unused and bloated — candidates for DROP rather than rebuild.
-- Quick bloat estimation from pg_stat_user_indexes
-- Compares index size to the expected size for the number of rows
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 15;
-- An index that is larger than its table is a red flag
-- An index with zero idx_scan may be unused — and bloated
Ongoing monitoring
Track index sizes over time. A healthy index grows roughly in proportion to the table's live row count. An index whose size climbs steadily while the table's row count stays flat is accumulating bloat — the directory is expanding without new entries to justify it. Monitoring pg_stat_user_indexes at regular intervals catches this drift before it becomes a performance problem.
-- Monitor index usage alongside size to find bloated or unused indexes
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    idx_scan AS scans_since_reset,
    pg_stat_get_last_analyze_time(relid) AS last_analyze
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND pg_relation_size(indexrelid) > 1024 * 1024  -- > 1 MB
ORDER BY pg_relation_size(indexrelid) DESC;
-- Unused indexes that are also bloated are candidates for DROP, not REINDEX
Fixing index bloat
Once an index is bloated, there is no gentle remedy — the only fix is to rebuild it. VACUUM removes dead entries but does not compact pages. Rebuilding creates a fresh, dense index structure from the current live data. The good news: PostgreSQL provides civilized ways to do this.
REINDEX
The built-in approach. REINDEX INDEX drops the old index and builds a new one from scratch. The result is a compact, bloat-free index. The cost: it acquires an ACCESS EXCLUSIVE lock on the index, blocking all reads and writes to the table for the duration. Effective, but not what I would call graceful under pressure.
-- REINDEX: rebuild a single index (locks writes)
REINDEX INDEX my_table_pkey;
-- REINDEX: rebuild all indexes on a table
REINDEX TABLE my_table;
-- REINDEX CONCURRENTLY (PostgreSQL 12+): rebuild without locking writes
-- Takes longer and uses more disk space temporarily, but does not block DML
REINDEX INDEX CONCURRENTLY my_table_pkey;
REINDEX TABLE CONCURRENTLY my_table;
REINDEX CONCURRENTLY
Available since PostgreSQL 12, and — if I may say — the approach I would recommend for any system that values its uptime. It builds a new index alongside the old one without blocking writes. Once the new index is complete and consistent, PostgreSQL swaps it in and drops the old one. It uses roughly double the disk space temporarily and takes longer than a regular REINDEX, but it does not cause downtime. A proper renovation that never closes the house to guests.
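While a concurrent rebuild runs, its progress is visible from another session via the pg_stat_progress_create_index view (also PostgreSQL 12+). A simple way to watch it, roughly:

```sql
-- Observe a running CREATE INDEX / REINDEX CONCURRENTLY from another session
SELECT
    command,        -- e.g. 'REINDEX CONCURRENTLY'
    phase,          -- e.g. 'building index: scanning table'
    blocks_done,
    blocks_total,
    tuples_done,
    tuples_total
FROM pg_stat_progress_create_index;
```

This is particularly reassuring on large indexes, where the build can take hours; one can see which phase it is in rather than simply waiting.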
pg_repack
pg_repack can rebuild individual indexes online, similar to REINDEX CONCURRENTLY but available on older PostgreSQL versions. It can also rebuild an entire table and all its indexes in one operation. For indexes specifically, it offers the same benefit: a compact rebuild without blocking DML.
pg_squeeze
An extension that automates compaction based on configurable bloat thresholds. Register a table, and pg_squeeze monitors it and rebuilds when bloat exceeds the threshold — including its indexes. Useful for hands-off maintenance on tables with predictable bloat patterns.
-- pg_repack: rebuild specific indexes online
-- Run from the command line, not SQL
-- pg_repack -d mydb --index my_table_pkey
-- pg_repack -d mydb --table my_table -- rebuilds table and all its indexes
-- pg_squeeze: automatic compaction when bloat exceeds a threshold
CREATE EXTENSION pg_squeeze;
SELECT squeeze.start_worker();
INSERT INTO squeeze.tables (tabschema, tabname, free_space_extra)
VALUES ('public', 'my_table', 20);
How Gold Lapel relates
Gold Lapel notices when an index scan takes longer than it should for the number of rows returned — more page reads, lower cache hit ratios than the query plan warrants. I correlate that with index size relative to table size to identify bloat as the likely cause. The distinction matters: a slow scan caused by a poorly chosen index needs a different index; a slow scan caused by a well-chosen but bloated index needs a rebuild. Different diagnosis, different remedy, and getting it wrong wastes your time on the wrong fix.
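The size correlation described above can be approximated by hand. The query below — a rough heuristic, not a precise bloat measure — computes each index's size relative to its table; the 0.5 cutoff is an illustrative threshold, not a rule:

```sql
-- Rough heuristic: indexes that are large relative to their table
SELECT
    indexrelname,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    round(pg_relation_size(indexrelid)::numeric
          / nullif(pg_relation_size(relid), 0), 2) AS index_to_table_ratio
FROM pg_stat_user_indexes
WHERE pg_relation_size(indexrelid) > 0.5 * pg_relation_size(relid)
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 10;
```

A high ratio alone does not prove bloat — a multi-column index on a narrow table can legitimately rival its table in size — which is why confirming with pgstatindex() before rebuilding is worth the trouble.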