
PostgreSQL VACUUM and Autovacuum: A Practical Guide

Garbage collection for PostgreSQL. When it falls behind, things get worse gradually, then suddenly.

The Butler of Gold Lapel · March 29, 2026 · 24 min read

Why VACUUM exists — MVCC and dead tuples

PostgreSQL uses MVCC (Multi-Version Concurrency Control) to allow readers and writers to operate without blocking each other. This concurrency model has a physical consequence that deserves your full attention.

When you execute an UPDATE in PostgreSQL, the database does not modify the existing row in place. Instead, it creates a new version of the row with the updated values and marks the old version as invisible to new transactions. The old version remains physically present in the table's data files.

DELETE works the same way. The row is not removed from disk — it is marked as invisible. Future transactions will not see it, but it still occupies space.

These invisible old row versions are called dead tuples. They consume disk space, occupy pages in the buffer cache (displacing useful data), and slow down sequential scans (the database must read and skip over them).

VACUUM is PostgreSQL's garbage collector. Its job is to scan tables, identify dead tuples that are no longer visible to any active transaction, and mark their space as reusable for future inserts and updates.

Without VACUUM, every table grows indefinitely. This is not a bug — it is the direct cost of MVCC's guarantee that reads never block writes and writes never block reads. The growth is gradual at first and then accelerates as dead tuples compound. A table that accumulates 20% dead space does not just waste 20% of its disk footprint; every sequential scan must also read roughly 25% more pages than a fully compacted table would require.
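The mechanics are easy to observe directly. A minimal sketch with a throwaway table (the table name and row counts are illustrative):

```sql
-- Throwaway demo table; every name here is illustrative.
CREATE TABLE t (id int PRIMARY KEY, v int);
INSERT INTO t SELECT g, 0 FROM generate_series(1, 1000) AS g;

-- The UPDATE writes 1,000 new row versions; the old 1,000 become dead tuples.
UPDATE t SET v = 1;

-- Statistics are reported asynchronously, so n_dead_tup may lag by a moment.
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 't';

-- VACUUM marks the dead tuples' space reusable (the file does not shrink).
VACUUM t;
```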

The key mental model: VACUUM is garbage collection for PostgreSQL. It runs in the background, it reclaims space, and when it falls behind, things get worse gradually, then suddenly. I have seen this pattern enough times to assure you it is worth understanding before it arrives at your door.

Regular VACUUM vs VACUUM FULL — the critical distinction

Regular VACUUM

Regular VACUUM (invoked as VACUUM table_name or triggered automatically by autovacuum) scans the table page by page, identifies dead tuples that no active transaction can see, marks their space as reusable within the existing table file, and updates the visibility map and free space map.

What regular VACUUM does not do:

  • It does not return space to the operating system. The table file stays the same size on disk.
  • It does not require an exclusive lock. Reads and writes continue normally. It acquires only a ShareUpdateExclusive lock.
  • It does not rewrite the table. It works in place, page by page.

Regular VACUUM is fast, lightweight, and safe to run on production tables at any time. It is the routine maintenance that keeps your household in order.

VACUUM FULL

VACUUM FULL is a fundamentally different operation: it rewrites the entire table into a new physical file, compacting all live tuples together, rebuilds every index from scratch, returns the reclaimed space to the operating system, and requires an ACCESS EXCLUSIVE lock for the entire duration — no reads or writes can touch the table.

Conceptually, VACUUM FULL is equivalent to:

VACUUM FULL — conceptually
CREATE TABLE new_table AS SELECT * FROM old_table;
DROP TABLE old_table;
ALTER TABLE new_table RENAME TO old_table;
-- (plus rebuild all indexes, constraints, and grants)

VACUUM FULL requires temporary disk space for the full copy of the table. If you have a 100GB table, you need at least 100GB of free disk space during the operation.

When VACUUM FULL is justified: only when the table has massive free space — 50% or more of the file is reusable but not reclaimable by regular VACUUM — and you genuinely need the disk space returned to the operating system.

The decision: which VACUUM do I need?

Dimension            | Regular VACUUM                                 | VACUUM FULL
---------------------|------------------------------------------------|------------------------------------------
Lock level           | ShareUpdateExclusive (allows reads and writes) | ACCESS EXCLUSIVE (blocks everything)
Disk space impact    | Marks space reusable within the file           | Returns space to the OS; file shrinks
OS space recovery    | No                                             | Yes
Duration             | Minutes for most tables                        | Hours for large tables with many indexes
Index rebuild        | No                                             | Yes — all indexes rebuilt from scratch
Production safety    | Safe to run anytime                            | Requires a maintenance window
Temporary disk space | Minimal                                        | Requires space for a full table copy

Before reaching for VACUUM FULL, consider whether pg_repack or pg_squeeze can accomplish the same compaction without an exclusive lock. VACUUM FULL is a maintenance window operation. Regular VACUUM is an always-on operation. The vast majority of systems — yours quite possibly included — never need VACUUM FULL if autovacuum is configured properly.

Autovacuum — PostgreSQL's built-in garbage collector

How autovacuum decides when to run

Autovacuum uses a formula to decide when a table qualifies for vacuuming:

Autovacuum trigger formula
vacuum threshold = autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor × estimated live rows)

With default settings (autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2), autovacuum triggers when dead tuples exceed 50 + 20% of the table's estimated row count.

For small tables, this works well. For large tables, the default scale factor becomes a problem:

  • 10,000,000-row table: vacuums after 2,000,050 dead tuples — potentially gigabytes of wasted space before autovacuum even triggers.
  • 100,000,000-row table: vacuums after 20,000,050 dead tuples. Twenty million dead rows.
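The arithmetic is simple enough to sanity-check yourself. A small Python sketch of the trigger formula, with defaults mirroring the settings above:

```python
def autovacuum_trigger(live_rows, threshold=50, scale_factor=0.2):
    """Dead tuples required before autovacuum vacuums a table."""
    return threshold + scale_factor * live_rows

# Default settings across table sizes:
for rows in (10_000, 10_000_000, 100_000_000):
    print(f"{rows:>11,} rows -> vacuum after {round(autovacuum_trigger(rows)):,} dead tuples")

# With a per-table override (scale_factor=0.01, threshold=1000):
print(round(autovacuum_trigger(100_000_000, threshold=1000, scale_factor=0.01)))
```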

The scale factor model, if you'll permit the observation, assumes all tables are small. For large tables, per-table overrides are essential.

The settings that matter

autovacuum_vacuum_threshold (default: 50) — the base number of dead tuples before autovacuum considers vacuuming. The default is fine for most cases.

autovacuum_vacuum_scale_factor (default: 0.2) — the fraction of estimated live rows added to the threshold. This is the value that causes problems on large tables.

autovacuum_naptime (default: 1 minute) — how frequently the autovacuum launcher wakes up to check for tables that need vacuuming. One minute is almost always fine.

autovacuum_max_workers (default: 3) — the maximum number of concurrent autovacuum worker processes. If you have many large tables, increase to 4-6.

autovacuum_vacuum_cost_delay / autovacuum_vacuum_cost_limit — the throttling mechanism. Autovacuum uses cost-based I/O throttling to avoid contending with production queries. The defaults are overly conservative on modern SSDs and cloud storage.

Per-table overrides for hot tables

For large, frequently-updated tables:

Per-table autovacuum override
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 1000
);

This triggers autovacuum after 1% of the table changes instead of 20%. For a 100-million-row table, this means vacuuming triggers after approximately 1,001,000 dead tuples instead of 20,000,050.

You can also tune the analyze frequency independently:

Per-table analyze override
ALTER TABLE orders SET (
    autovacuum_analyze_scale_factor = 0.005,
    autovacuum_analyze_threshold = 500
);

To identify candidates for per-table overrides, look for tables that are both large and frequently churned:

Find candidates for per-table overrides
SELECT
    schemaname || '.' || relname AS table_name,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
    last_autovacuum
FROM pg_stat_user_tables
WHERE n_live_tup > 1000000
ORDER BY n_dead_tup DESC
LIMIT 20;

Monitoring VACUUM — is it keeping up?

Key views and queries

Quick health check — tables where dead tuples exceed 10% of live tuples:

VACUUM health check
SELECT
    schemaname || '.' || relname AS table_name,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
    last_autovacuum,
    last_vacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > n_live_tup * 0.1
ORDER BY n_dead_tup DESC;

pg_stat_progress_vacuum (PostgreSQL 9.6+) shows currently-running VACUUM operations with their progress:

VACUUM progress monitoring
SELECT
    p.pid,
    a.query,
    p.relid::regclass AS table_name,
    p.phase,
    p.heap_blks_total,
    p.heap_blks_scanned,
    p.heap_blks_vacuumed,
    round(100.0 * p.heap_blks_vacuumed / NULLIF(p.heap_blks_total, 0), 1) AS pct_complete
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a ON a.pid = p.pid;

The phases you will see: initializing, scanning heap, vacuuming indexes, vacuuming heap, cleaning up indexes, truncating heap, and performing final cleanup. For most tables, "vacuuming indexes" is the longest phase.

Signs autovacuum is falling behind

  • n_dead_tup growing over time across multiple tables when queried repeatedly
  • last_autovacuum timestamps are old — hours or days behind on tables that are actively receiving updates
  • Table sizes growing on disk without corresponding growth in live data
  • Sequential scan performance degrading on large tables
  • Buffer cache hit ratio dropping on specific tables — dead tuples displace useful live data

pgstattuple for accurate measurement

The n_dead_tup value in pg_stat_user_tables is an estimate. For definitive measurement, pgstattuple reads actual table pages:

Exact bloat measurement
SELECT * FROM pgstattuple('orders');

This returns exact values for dead_tuple_count, dead_tuple_len, free_space, and free_percent — numbers you can trust for capacity planning.
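One practical note: pgstattuple ships as a contrib extension and must be enabled per database, and it has a cheaper sampling variant for very large tables:

```sql
-- pgstattuple is a contrib extension; enable it once per database.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- On very large tables, the sampling variant avoids reading every page:
SELECT * FROM pgstattuple_approx('orders');
```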

Diagnosing VACUUM problems

Problem: autovacuum runs but does not reclaim space

VACUUM can only remove dead tuples that are no longer visible to any active transaction. The most common cause: long-running transactions. A transaction open for 3 hours prevents VACUUM from cleaning up any rows modified after that transaction started. This applies database-wide, not just to the table the transaction is touching.

Find the offending transactions:

Find long-running transactions blocking VACUUM
SELECT
    pid,
    age(backend_xid) AS xid_age,
    now() - xact_start AS duration,
    state,
    substr(query, 1, 80) AS query_preview
FROM pg_stat_activity
WHERE state != 'idle'
  AND backend_xid IS NOT NULL
ORDER BY xid_age DESC NULLS LAST
LIMIT 10;

Pay special attention to transactions in the idle in transaction state — these hold back VACUUM just as effectively as actively-running queries.

Fixes:

Set idle_in_transaction_session_timeout
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
SELECT pg_reload_conf();

Also review application code for transaction management issues — unclosed transactions, connection leaks, long-running background jobs. Abandoned replication slots cause the same problem: drop unused slots.
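Replication slots are worth checking explicitly; a slot whose consumer has vanished pins old row versions indefinitely. A sketch (the slot name is a placeholder):

```sql
-- Inspect slots; an inactive slot with an old xmin or catalog_xmin
-- is holding back cleanup.
SELECT slot_name, slot_type, active, xmin, catalog_xmin
FROM pg_replication_slots;

-- Drop a slot only once you are certain its consumer is gone.
SELECT pg_drop_replication_slot('stale_slot');
```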

Problem: autovacuum is too slow

Even when autovacuum triggers at the right time, its default throttling makes it deliberately slow. On modern SSDs and cloud storage, these conservative defaults are unnecessary.

Make autovacuum work harder
# postgresql.conf — make autovacuum work harder
autovacuum_vacuum_cost_delay = 0     # or 1ms — remove the artificial slowdown
autovacuum_vacuum_cost_limit = 1000  # allow more work per cycle (default effective: 200)
autovacuum_max_workers = 5           # up from default 3
maintenance_work_mem = '1GB'         # more memory = fewer passes for large tables

Per-table throttling is also available:

Per-table throttling override
ALTER TABLE large_orders SET (
    autovacuum_vacuum_cost_delay = 0,
    autovacuum_vacuum_cost_limit = 2000
);

Problem: VACUUM FULL seems to take forever

VACUUM FULL rewrites the entire table and rebuilds every index. For a 100GB table with 5 indexes, the operation involves reading and copying all live data, rebuilding each index from scratch, and holding an ACCESS EXCLUSIVE lock the entire time.

Monitor progress with pg_stat_progress_cluster (PostgreSQL 12+):

VACUUM FULL progress monitoring
SELECT
    pid,
    relid::regclass AS table_name,
    phase,
    heap_blks_total,
    heap_blks_scanned,
    round(100.0 * heap_blks_scanned / NULLIF(heap_blks_total, 0), 1) AS pct_scanned
FROM pg_stat_progress_cluster;

Alternatives to VACUUM FULL: pg_repack does the same online compaction without an exclusive lock. pg_squeeze achieves the same using logical decoding, with even lower overhead. Both require only a brief exclusive lock at the final swap.
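For illustration, a typical pg_repack invocation might look like the following; the table and database names are placeholders, and the pg_repack extension must already be installed in the target database:

```shell
# Compact one table online; only a brief exclusive lock at the final swap.
# --jobs parallelizes the index rebuilds.
pg_repack --table=orders --jobs=2 mydb
```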

Transaction ID wraparound — the emergency that deserves your attention now

What it is

PostgreSQL uses 32-bit transaction IDs (XIDs) to track which transactions can see which row versions. Without intervention, transaction IDs would cycle back to the beginning after approximately 2.1 billion values, causing the database to lose track of visibility. I raise this not to alarm, but because this is the one VACUUM-related problem where prevention is not merely advisable — it is essential.

PostgreSQL prevents this through freezing: when a row version is old enough that it should be permanently visible, VACUUM marks it as "frozen" — permanently valid, no longer needing a transaction ID. The critical safety mechanism: when a table's oldest unfrozen transaction ID approaches the wraparound threshold, autovacuum forces a freeze operation regardless of dead tuple count.

Monitoring wraparound risk

Database-level check:

Database-level wraparound check
SELECT
    datname,
    age(datfrozenxid) AS xid_age,
    round(100.0 * age(datfrozenxid) / 2147483647, 2) AS pct_to_wraparound
FROM pg_database
ORDER BY xid_age DESC;

Table-level check:

Table-level wraparound check
SELECT
    s.schemaname || '.' || s.relname AS table_name,
    age(c.relfrozenxid) AS xid_age,
    pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size,
    s.last_autovacuum
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relkind = 'r'
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;

Thresholds:

  • Normal: xid_age under 100,000,000 — routine autovacuum is handling freezing
  • Warning: approaching 150,000,000 — anti-wraparound autovacuum kicks in at autovacuum_freeze_max_age (200,000,000 by default), so investigate why routine freezing is not keeping up
  • Critical: approaching 1,200,000,000 — PostgreSQL will start refusing new write transactions at approximately 2 billion

How to prevent it

  1. Do not disable autovacuum. If autovacuum is causing performance concerns, tune it — do not turn it off.
  2. Kill long-running transactions that prevent freezing.
  3. Monitor xid_age as part of your regular health checks. Add alerts at the warning threshold.
  4. Drop abandoned replication slots.
  5. If approaching the critical threshold, run a manual freeze:
Manual freeze with increased resources
SET maintenance_work_mem = '2GB';
VACUUM (FREEZE, VERBOSE) orders;

Honest counterpoint — when VACUUM is not the problem

I should be forthcoming about something. High dead tuple counts are a symptom. VACUUM is the mechanism that manages the symptom. But the root cause is the rate at which dead tuples are created — and sometimes the right answer is to address the root cause.

Schema design patterns that reduce dead tuple creation:

  • Append-only event tables. Instead of updating a status column (creating a dead tuple for every status change), insert events into a separate table.
  • Separate hot columns. If a table has 30 columns but only 2 are frequently updated, split the frequently-updated columns into a narrow companion table. VACUUM on a narrow table is dramatically faster.
  • Use HOT updates (Heap-Only Tuples). When an UPDATE modifies only columns not part of any index, PostgreSQL can perform a HOT update — the new row version is placed on the same page and the old version is pruned without requiring a full VACUUM. Keep frequently-updated columns out of indexes when possible.
  • Batch operations. Instead of updating 1 million rows in a single transaction, batch into chunks of 10,000 so autovacuum can clean up between batches.
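The batching pattern in the last point can be sketched as a statement the application repeats until it reports zero rows affected (table and column names are illustrative):

```sql
-- Repeat this statement until it reports "UPDATE 0".
-- Small batches let autovacuum reclaim dead tuples between rounds.
UPDATE orders
SET    archived = true
WHERE  id IN (
    SELECT id
    FROM   orders
    WHERE  archived = false
      AND  created_at < now() - interval '90 days'
    LIMIT  10000
);
```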

Sometimes the answer is not "tune VACUUM harder" but "reduce the churn that creates the dead tuples." The best-run systems address both. For broader performance tuning guidance, see the comprehensive tuning guide.

How Gold Lapel helps

VACUUM and autovacuum require ongoing attention — monitoring dead tuple ratios, adjusting per-table settings as workload patterns change, watching for long-running transactions that block cleanup. Gold Lapel observes your workload continuously and surfaces the queries and tables that are generating the most dead tuple churn, helping you focus vacuum tuning and schema design where it matters most.

Frequently asked questions