
Dead tuple

Every UPDATE and DELETE leaves something behind. Allow me to explain what, and how the household is kept in order.

Concept · March 21, 2026 · 7 min read

A dead tuple is an old row version that no active transaction can see — a record in the filing cabinet that no one will ever consult again, yet which continues to occupy the drawer. PostgreSQL's MVCC model never overwrites a row in place. An UPDATE creates a new version of the row and marks the old one as superseded. A DELETE marks the row as removed but does not physically delete it. These obsolete row versions — dead tuples — remain in the table, consuming disk space and slowing down scans, until VACUUM tidies them away.

What a dead tuple is

In PostgreSQL, every row stored on disk is a tuple — a specific version of a row, stamped with the transaction IDs that created it (xmin) and invalidated it (xmax). A tuple is "live" when it is the current, visible version of a row. It becomes "dead" when a newer version has replaced it (via UPDATE) or the row has been deleted (via DELETE), and no running transaction still needs to see the old version.

If you have read our MVCC glossary entry, you will recognise the mechanism at work here. Rather than locking rows during updates, PostgreSQL lets readers see the old version while the writer creates the new one. This is elegant concurrency management — but it comes with an obligation. Once all transactions that could see the old version have finished, that version becomes dead. It will never be read again, yet it still occupies physical space in the table file. The old records accumulate, and someone must clear them out.

MVCC versioning
-- An UPDATE in PostgreSQL does not overwrite. It creates a new version.

-- Before UPDATE: one live tuple
-- | xmin=100 | xmax=0 | name='Alice' |   <-- visible

-- After: UPDATE users SET name = 'Bob' WHERE id = 1;
-- | xmin=100 | xmax=200 | name='Alice' |   <-- dead tuple (old version)
-- | xmin=200 | xmax=0   | name='Bob'   |   <-- live tuple (new version)

-- The old row is not removed. It is marked with xmax = the updating transaction.
-- It becomes a dead tuple once no active transaction can see it.

An UPDATE always produces one dead tuple (the old version) and one live tuple (the new version). A DELETE produces one dead tuple and nothing else. An INSERT does not create dead tuples — it only adds a new live tuple. The arithmetic is straightforward: update-heavy and delete-heavy workloads accumulate dead tuples the fastest, and they do so with a consistency one might almost admire, were it not so inconvenient.
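The arithmetic above can be watched live. A minimal sketch, assuming a throwaway table named counters on a scratch database:

SQL
-- Hypothetical demo table: watch dead tuples accumulate
CREATE TABLE counters (id int PRIMARY KEY, value int DEFAULT 0);
INSERT INTO counters (id) SELECT g FROM generate_series(1, 1000) g;

UPDATE counters SET value = value + 1;   -- 1,000 old versions become dead
DELETE FROM counters WHERE id <= 100;    -- 100 more dead tuples, nothing added

-- Statistics are collected asynchronously, so allow a moment before checking
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'counters';

Expect roughly 900 live and 1,100 dead tuples, give or take the lag in the statistics system.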

Why dead tuples matter

Dead tuples are invisible to queries but not to the storage engine. They occupy real space on disk and in memory, and their effects compound as they accumulate — dust on the mantelpiece that thickens into something rather more structural if left unattended.

  • Table bloat — dead tuples inflate the physical size of a table. A table with 1 million live rows and 500,000 dead tuples is roughly 50% larger than it needs to be. PostgreSQL stores data in 8 KB pages, and dead tuples occupy space on those pages that could hold live data.
  • Slower sequential scans — a sequential scan reads every page of a table. Pages containing dead tuples must still be read from disk and loaded into memory, even though the dead rows will be filtered out. More dead tuples means more pages, which means longer scans.
  • Wasted buffer cache — shared_buffers caches table pages, dead tuples included. Pages bloated with dead rows waste cache slots that could be holding live data, effectively reducing the usable cache size.
  • Index bloat — when a row is updated or deleted, the corresponding index entries are not immediately cleaned up. They continue to point to dead heap tuples, making indexes larger and sparser. Index scans must traverse more pages, and index-only scans lose effectiveness because the visibility map cannot mark pages with dead tuples as all-visible.
  • Increased I/O for backups and replication — physical backups copy the full table files, dead space included. A bloated table means a larger backup and longer replication lag during base backups. You are, in effect, archiving the rubbish along with the valuables.
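Table bloat is simple to demonstrate first-hand. A sketch, assuming a scratch table named bloat_demo — the point to observe is that the on-disk size does not change after the DELETE:

SQL
-- Hypothetical demo: deleted rows still occupy the table file
CREATE TABLE bloat_demo AS
  SELECT g AS id, repeat('x', 100) AS payload
  FROM generate_series(1, 100000) g;

SELECT pg_size_pretty(pg_relation_size('bloat_demo'));  -- size with 100,000 live rows

DELETE FROM bloat_demo WHERE id <= 90000;

SELECT pg_size_pretty(pg_relation_size('bloat_demo'));  -- same size: 90,000 dead tuples remain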

Monitoring dead tuples

PostgreSQL provides two levels of visibility into dead tuple counts: statistical estimates and physical measurements. I would recommend checking both, in the same way one might consult the thermostat and then also place a hand on the radiator.

pg_stat_user_tables

The fastest check. The n_dead_tup column reports an estimate of dead tuples per table, maintained incrementally by the statistics system and corrected whenever VACUUM or ANALYZE runs. Compare it to n_live_tup to get a dead tuple ratio, and check last_autovacuum to see when cleanup last ran. A high dead tuple count despite a recent autovacuum marks a table that needs attention — the cleaning is happening, but the mess is arriving faster.

SQL
-- Check dead tuple counts per table
SELECT
  schemaname,
  relname,
  n_live_tup,
  n_dead_tup,
  round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
  last_autovacuum,
  last_vacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

pgstattuple

For precise measurement, the pgstattuple extension physically scans the table and reports exact dead tuple counts, dead space in bytes, and free space. It reads every page, so it is slower than querying statistics views — but the numbers are real, not estimated. When the estimates raise an eyebrow, pgstattuple provides the definitive answer.

SQL
-- Precise dead tuple measurement with pgstattuple
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT
  tuple_count AS live_tuples,
  dead_tuple_count AS dead_tuples,
  dead_tuple_len AS dead_bytes,
  round(dead_tuple_len::numeric / nullif(table_len, 0) * 100, 2) AS dead_pct,
  free_space,
  round(free_space::numeric / nullif(table_len, 0) * 100, 2) AS free_pct
FROM pgstattuple('my_table');

How dead tuples are cleaned up

Dead tuples are removed by VACUUM — PostgreSQL's dedicated cleanup process. If dead tuples are the filing cabinet's expired records, VACUUM is the staff member who periodically opens each drawer and removes what is no longer needed.

Standard VACUUM

Marks dead tuple space as reusable within the table file. The file itself does not shrink — PostgreSQL will fill the reclaimed gaps with future inserts and updates. Think of it as clearing the drawers without replacing the cabinet. Standard VACUUM runs concurrently with reads and writes (it takes only a SHARE UPDATE EXCLUSIVE lock, which does not conflict with ordinary queries), making it safe to run at any time.

SQL
-- Standard VACUUM: reclaims dead tuple space for reuse
-- Does NOT return disk space to the OS
VACUUM VERBOSE my_table;

-- VACUUM ANALYZE: reclaim space and update planner statistics
VACUUM ANALYZE my_table;

Autovacuum

PostgreSQL's autovacuum daemon runs VACUUM automatically based on configurable thresholds. It triggers when the number of dead tuples in a table exceeds a threshold calculated from the table's size. The default formula allows up to 20% of a table to become dead before triggering — adequate for moderate workloads, but I must be candid: for write-heavy tables, those defaults are rather too permissive.

SQL / postgresql.conf
-- Autovacuum trigger formula:
-- vacuum when dead tuples > threshold + (scale_factor * live tuples)
--
-- Defaults:
--   autovacuum_vacuum_threshold = 50
--   autovacuum_vacuum_scale_factor = 0.2
--
-- For a 1M-row table: 50 + (0.2 * 1,000,000) = 200,050 dead tuples

-- Tune per-table for write-heavy tables
ALTER TABLE hot_table SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_vacuum_threshold = 1000
);

When autovacuum falls behind, dead tuples accumulate faster than they are cleaned. The symptoms are unmistakable: n_dead_tup in pg_stat_user_tables climbs steadily, sequential scans slow down, and the table grows on disk. The remedy is almost always per-table tuning — lowering the scale factor so vacuum triggers sooner, before the accumulation becomes unsightly.
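To identify which tables autovacuum is struggling with, one can compare each table's dead tuple count against its computed trigger point. A sketch using the global defaults (per-table reloptions overrides are not consulted here, and reltuples may be stale on never-analyzed tables):

SQL
-- Tables whose dead tuples already exceed the autovacuum trigger
SELECT
  s.relname,
  s.n_dead_tup,
  round(current_setting('autovacuum_vacuum_threshold')::numeric
      + current_setting('autovacuum_vacuum_scale_factor')::numeric * c.reltuples) AS trigger_at,
  s.last_autovacuum
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.n_dead_tup > current_setting('autovacuum_vacuum_threshold')::numeric
                   + current_setting('autovacuum_vacuum_scale_factor')::numeric * c.reltuples
ORDER BY s.n_dead_tup DESC;

A table that appears here persistently is a candidate for the per-table scale factor tuning shown above.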

VACUUM FULL

Rewrites the entire table to a new file, eliminating all dead space and physically shrinking the table on disk. I should note that this is the nuclear option: it acquires an ACCESS EXCLUSIVE lock that blocks all reads and writes for the duration of the rewrite. For large tables, I would direct your attention to pg_repack, which achieves the same result without closing the doors to your guests.

SQL / Shell
-- VACUUM FULL rewrites the table, physically shrinking it on disk
-- WARNING: acquires ACCESS EXCLUSIVE lock — blocks all reads and writes
VACUUM FULL my_table;

-- Prefer pg_repack for online compaction without the heavy lock
-- pg_repack -d mydb -t my_table

How Gold Lapel relates

I find accumulated dead tuples as offensive as dust on the mantelpiece, and Gold Lapel shares my disposition. It monitors dead tuple accumulation as part of its continuous database analysis, and when it detects a query scanning a table with a high dead tuple ratio, it draws the distinction that matters: is this query slow because it lacks an index, or because the table is bloated with dead rows that inflate every scan?

The distinction matters because the remedy is different. A missing index calls for an index recommendation. A bloated table calls for vacuum tuning. Gold Lapel surfaces the underlying cause alongside the symptom, so you attend to the right problem rather than treating the wrong one with great efficiency.

Frequently asked questions