Bloat
Wasted space in tables and indexes caused by dead rows that VACUUM has not yet reclaimed — or cannot reclaim fast enough.
If you have read the entries on MVCC and VACUUM, you know the cause and the cure. Bloat is what happens when the cure does not arrive quickly enough. PostgreSQL's MVCC model never overwrites a row in place — every UPDATE creates a new row version and leaves the old one behind, and every DELETE marks a row as dead without removing it. These dead tuples occupy physical space until VACUUM reclaims it. When dead space accumulates faster than VACUUM cleans it, tables and indexes grow beyond their useful size. Sequential scans slow down because they must read through pages of dead rows. The buffer cache wastes memory on data no transaction will ever see. Disk usage climbs without a corresponding increase in live data. A well-run household does not tolerate this kind of accumulation.
What bloat is
Every table in PostgreSQL is stored as a sequence of 8 KB pages. Each page holds some number of row versions (tuples). When a row is updated, the old version remains on its page, marked with an xmax transaction ID to indicate it has been superseded. When a row is deleted, the tuple stays in place, similarly marked. These dead tuples are invisible to new transactions but they still occupy physical bytes on disk.
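The mechanics above are easy to observe directly. As a small sketch (the table `t` here is a throwaway demo, not anything from a real schema), the hidden system columns `ctid`, `xmin`, and `xmax` show a new row version appearing after an UPDATE while the old one lingers on the page:

```sql
-- Hypothetical demo table: watch an UPDATE leave a dead tuple behind
CREATE TABLE t (id int PRIMARY KEY, val text);
INSERT INTO t VALUES (1, 'original');

-- ctid is the tuple's physical address (page, offset);
-- xmin is the creating transaction; xmax = 0 means the version is live
SELECT ctid, xmin, xmax, val FROM t;

UPDATE t SET val = 'updated' WHERE id = 1;

-- The live version now sits at a new ctid. The superseded version is
-- still on the page, marked with an xmax, invisible, awaiting VACUUM.
SELECT ctid, xmin, xmax, val FROM t;
```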
Bloat is what happens when those dead tuples accumulate. A table that holds 1 million live rows but contains 500,000 dead tuples is roughly 50% larger than it needs to be. PostgreSQL must read through all of those pages during a sequential scan, regardless of whether the tuples on them are live or dead. Index entries pointing to dead tuples remain in the index until VACUUM removes them, inflating the index beyond its useful size as well. It is, if I may be direct, clutter — and clutter compounds.
Some amount of dead space is normal. PostgreSQL reuses it for future inserts and updates. Bloat becomes a problem when the dead-to-live ratio grows large enough to measurably affect performance — typically when 20-30% or more of a table's physical space is dead.
Why bloat matters
I should enumerate the consequences, because bloat is one of those problems that sounds merely untidy until you see the bill.
- Slower sequential scans — a sequential scan reads every page of a table. Dead tuples inflate page count, so the scan reads more data than necessary. A table with 50% bloat takes roughly 50% longer to scan.
- Wasted buffer cache — PostgreSQL caches pages in shared_buffers. Bloated pages full of dead tuples occupy cache slots that could hold live data, effectively reducing your usable cache size.
- Inflated disk usage — dead space consumes real disk. On managed databases where you pay per GB, bloat directly increases cost.
- Index inefficiency — dead index entries make B-tree indexes sparse and larger than necessary. Index scans traverse more pages, and index-only scans become less effective when the visibility map marks bloated pages as not all-visible.
- Slower backups and replication — physical backups (pg_basebackup, WAL archiving) copy the full table files, dead space included. Bloated tables mean larger backups and longer base backup times.
Measuring bloat
Before prescribing a remedy, one must establish the severity. There are several ways to assess bloat, from quick estimates to precise measurements.
pg_stat_user_tables
The fastest check — a glance around the room, not a formal inspection. The n_dead_tup column reports an estimate of dead tuples per table. Compare it to n_live_tup to get a dead tuple ratio. This is an estimate updated by autovacuum and ANALYZE, not a real-time count, but it is good enough for identifying which tables deserve closer attention.
-- Check dead tuple counts and bloat indicators per table
SELECT
schemaname,
relname,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
n_live_tup,
n_dead_tup,
round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
last_autovacuum,
last_vacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 15;
pgstattuple
For precise measurement, the pgstattuple extension scans the table physically and reports exact dead tuple counts, dead space, and free space. It reads every page, so it is slower than querying statistics views — but the numbers are real, not estimated. I appreciate a proper inventory.
-- Detailed physical bloat measurement with pgstattuple
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT
table_len,
tuple_count,
dead_tuple_count,
dead_tuple_len,
round(dead_tuple_len::numeric / nullif(table_len, 0) * 100, 2) AS dead_pct,
free_space,
round(free_space::numeric / nullif(table_len, 0) * 100, 2) AS free_pct
FROM pgstattuple('my_table');
Bloat estimation queries
A middle ground: estimate bloat by comparing the table's actual disk size against what you would expect from its live row count and average row width. These queries use planner statistics and do not scan the table.
-- Estimate table bloat by comparing actual size to expected size
-- Based on average row width and live tuple count
SELECT
t.schemaname,
t.relname,
pg_size_pretty(pg_relation_size(t.relid)) AS table_size,
t.n_live_tup,
pg_size_pretty(
pg_relation_size(t.relid) -
(t.n_live_tup * (SELECT sum(s.avg_width) FROM pg_stats s
WHERE s.schemaname = t.schemaname AND s.tablename = t.relname)::bigint)
) AS estimated_bloat
FROM pg_stat_user_tables t
WHERE t.n_live_tup > 0
ORDER BY pg_relation_size(t.relid) DESC
LIMIT 10;
Index bloat
Index bloat is measured separately. The pgstatindex function from pgstattuple reports leaf page density and fragmentation for B-tree indexes. An avg_leaf_density below 50% signals that the index has significant wasted space.
-- Check index bloat using pgstattuple
-- (pgstatindex works only on B-tree indexes, so filter by access method)
SELECT
s.indexrelname,
pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
s.idx_scan,
p.avg_leaf_density,
p.leaf_fragmentation
FROM pg_stat_user_indexes s
JOIN pg_class c ON c.oid = s.indexrelid
JOIN pg_am am ON am.oid = c.relam AND am.amname = 'btree'
CROSS JOIN LATERAL pgstatindex(s.indexrelid) AS p
WHERE s.schemaname = 'public'
ORDER BY pg_relation_size(s.indexrelid) DESC
LIMIT 10;
-- avg_leaf_density below 50% suggests significant index bloat
-- A healthy B-tree index typically has 90%+ leaf density
Fixing bloat
Once bloat has accumulated, you have several options to reclaim the space. Each involves a trade-off between thoroughness and disruption — the eternal tension between a deep clean and keeping the household running.
VACUUM
Standard VACUUM marks dead space as reusable within the table. It does not shrink the table file on disk — the OS-level file stays the same size — but PostgreSQL will fill the reclaimed gaps with future writes. This is the right tool for prevention: run it consistently and the clutter never accumulates beyond what the table can absorb.
VACUUM FULL
Rewrites the entire table to a new file, eliminating all dead space. The table physically shrinks on disk. The cost is an ACCESS EXCLUSIVE lock for the duration of the rewrite — no reads or writes until it completes. On a large table, this can mean minutes or hours of downtime.
-- Standard VACUUM: marks dead space as reusable
-- Does NOT return space to the OS — table stays the same size on disk
VACUUM VERBOSE my_table;
-- VACUUM FULL: rewrites the entire table, reclaims disk space
-- WARNING: acquires ACCESS EXCLUSIVE lock — blocks all reads and writes
VACUUM FULL my_table;
pg_repack
Rebuilds a table or index online, without the heavy lock of VACUUM FULL. It creates a copy of the table, replays changes that occurred during the copy, and swaps the old and new tables in a brief lock at the end. The result is the same as VACUUM FULL — a compact, bloat-free table — without closing the doors to guests. This is the preferred tool for fixing bloat on production tables, and I am rather fond of it.
pg_squeeze
An extension that automates compaction based on bloat thresholds. You configure a table to be monitored, and pg_squeeze rebuilds it automatically when bloat exceeds the threshold. Uses the same logical decoding approach as pg_repack but runs as a background worker without manual intervention.
-- pg_repack: rebuild table online without heavy locks
-- Install the extension first
CREATE EXTENSION pg_repack;
-- Repack a single table (runs online, minimal locking)
-- Run from the command line, not SQL:
-- pg_repack -d mydb -t my_table
-- Repack a specific index
-- pg_repack -d mydb -i my_table_pkey
-- pg_squeeze: automatic compaction triggered by bloat thresholds
CREATE EXTENSION pg_squeeze;
-- Register a table for automatic compaction
INSERT INTO squeeze.tables (tabschema, tabname, free_space_extra)
VALUES ('public', 'my_table', 20);
-- Start the background worker that processes registered tables
SELECT squeeze.start_worker();
REINDEX
For index-only bloat, REINDEX rebuilds an index from scratch. REINDEX CONCURRENTLY (PostgreSQL 12+) does this without locking writes, though it uses more disk space temporarily. pg_repack can also rebuild individual indexes online.
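A brief sketch of the REINDEX options described above (the index name `my_table_pkey` is the same placeholder used in the pg_repack example earlier, not a real object):

```sql
-- REINDEX CONCURRENTLY (PostgreSQL 12+): rebuild without blocking writes.
-- Needs roughly double the index's disk space while both copies exist.
REINDEX INDEX CONCURRENTLY my_table_pkey;

-- Rebuild every index on a table in one pass
REINDEX TABLE CONCURRENTLY my_table;

-- Plain REINDEX blocks writes for the duration; acceptable in a maintenance window
REINDEX INDEX my_table_pkey;
```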
Preventing bloat
Fixing bloat after the fact works, but I confess a strong preference for preventing it altogether. Three strategies cover most cases.
Tune autovacuum for write-heavy tables
The default autovacuum_vacuum_scale_factor of 0.2 means vacuum does not trigger until 20% of the table is dead tuples. For a table with 10 million rows, that is 2 million dead tuples before vacuum runs — far too many. Lower the scale factor to 0.01 or even 0.005 for high-write tables, so vacuum runs earlier and more often.
-- Lower autovacuum thresholds for write-heavy tables
ALTER TABLE hot_table SET (
autovacuum_vacuum_scale_factor = 0.01, -- trigger at 1% dead tuples
autovacuum_vacuum_threshold = 1000 -- minimum dead tuples
);
-- Increase autovacuum aggressiveness globally
-- In postgresql.conf:
-- autovacuum_vacuum_cost_delay = 1 -- ms (default 2 ms since PostgreSQL 12; lower = faster vacuum)
-- autovacuum_vacuum_cost_limit = 400 -- higher = more work per cycle
-- Check for long-running transactions that block VACUUM
SELECT
pid,
now() - xact_start AS duration,
state,
query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
AND state != 'idle'
ORDER BY xact_start
LIMIT 10;
Avoid long-running transactions
VACUUM can only reclaim tuples that are invisible to all active transactions. A long-running transaction holds the visibility horizon open, preventing cleanup of any tuples created after it started. A single forgotten BEGIN in a monitoring session can block vacuuming across the entire database. Monitor pg_stat_activity for transactions that have been open for hours and terminate them. I cannot stress this enough — one idle transaction, left unattended, can undo every other precaution on this list.
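Termination can be done manually or, better, automated. A sketch of both, assuming the pid comes from a pg_stat_activity check like the one above (the role name `monitoring_user` is a hypothetical example):

```sql
-- Terminate sessions stuck in an open transaction for over an hour
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - xact_start > interval '1 hour';

-- Better: let PostgreSQL enforce a ceiling automatically.
-- Globally in postgresql.conf, or per role for risky clients:
-- idle_in_transaction_session_timeout = '10min'
ALTER ROLE monitoring_user SET idle_in_transaction_session_timeout = '10min';
```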
Maximize HOT updates
HOT (Heap-Only Tuple) updates occur when an update changes no indexed columns and the new row version fits on the same page. HOT updates skip index updates entirely, which means no dead index entries and no index bloat. Two things help: keep fillfactor below 100 (e.g., 70) to leave room for in-page updates, and avoid indexing columns that are frequently updated.
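The fillfactor change mentioned above is a one-line table option. A minimal sketch, reusing the hypothetical `hot_table` from the autovacuum example:

```sql
-- Leave 30% free space on each heap page so updated versions
-- can land on the same page (enabling HOT updates)
ALTER TABLE hot_table SET (fillfactor = 70);

-- Note: this only affects newly written pages; rewrite the table
-- (VACUUM FULL or pg_repack) to restructure existing pages

-- Verify the current setting
SELECT reloptions FROM pg_class WHERE relname = 'hot_table';
```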
-- Check HOT update effectiveness per table
SELECT
schemaname,
relname,
n_tup_upd,
n_tup_hot_upd,
round(n_tup_hot_upd::numeric / nullif(n_tup_upd, 0) * 100, 1) AS hot_pct
FROM pg_stat_user_tables
WHERE n_tup_upd > 1000
ORDER BY n_tup_upd DESC
LIMIT 10;
-- High hot_pct (80%+) means most updates avoid index writes
-- Low hot_pct means updates are creating index bloat too
How Gold Lapel relates
Gold Lapel monitors table and index bloat as part of its continuous database analysis. When we detect that query performance has degraded on a table carrying significant dead space, we correlate the two — distinguishing between a query that is slow because it lacks an index and a query that is slow because the table it scans is three times larger than its live data warrants.
This distinction matters because the fix is different. A missing index calls for an index recommendation. A bloated table calls for vacuum tuning or a rebuild. Gold Lapel surfaces both the symptom (slow query) and the underlying cause (bloat, missing index, stale statistics, or something else), so you address the right problem rather than adding indexes to a table that simply needs a thorough cleaning.