MVCC (Multi-Version Concurrency Control)
PostgreSQL's concurrency model. Readers never block writers. Writers never block readers. Someone, however, must attend to the dead tuples.
MVCC is the mechanism PostgreSQL uses to handle concurrent access to data. Rather than locking rows when they are read, each transaction sees a consistent snapshot of the database as it existed when the transaction began. When a row is updated, PostgreSQL does not overwrite it — it creates a new version and marks the old one as dead. Readers never block writers. Writers never block readers. The household keeps a record of everything that happened, but someone must periodically clear away the old records. That someone is VACUUM.
What MVCC is
If you'll permit me a moment of admiration for the architecture: MVCC is, at its core, a simple idea. Instead of one version of each row, keep multiple versions. Every INSERT creates a tuple. Every UPDATE creates a new tuple and marks the old one for eventual removal. Every DELETE marks a tuple as dead. No data is overwritten in place.
Each tuple carries two hidden fields: xmin and xmax. xmin records the transaction ID that created the tuple. xmax records the transaction ID that deleted it (or zero if it is still live). When a transaction reads a table, it uses these fields — combined with its own snapshot of which transactions have committed — to determine which version of each row is visible.
-- Every row in PostgreSQL carries hidden system columns
SELECT xmin, xmax, ctid, *
FROM your_table
LIMIT 5;
-- xmin: the transaction ID that created this row version
-- xmax: the transaction ID that deleted or locked this row version (0 if still live)
-- ctid: the physical location of the tuple (page, offset)

This is what makes PostgreSQL's concurrency model work. Every transaction sees a self-consistent view of the data, even while other transactions are actively modifying it. No dirty reads. No read locks. The foundation of the manor, if you will — everything else is built on top of this.
Why MVCC matters
In a traditional locking system, a reader that wants to access a row must wait if a writer is modifying it. A writer that wants to modify a row must wait if a reader is accessing it. Under load, this creates contention — transactions queue behind each other, and throughput drops. It is, to be frank, an unseemly arrangement.
MVCC eliminates read-write contention entirely. A SELECT never acquires a row-level lock. It reads from its snapshot, which is immutable. A concurrent UPDATE creates a new version of the row, but the reader does not see it — it continues reading the version that was current when its transaction began. Neither transaction waits for the other.
-- Each transaction sees a consistent snapshot
-- Session A:
BEGIN;
SELECT count(*) FROM orders; -- returns 1000
-- Session B (concurrently):
INSERT INTO orders (item) VALUES ('widget');
COMMIT;
-- Session A (still in its transaction):
SELECT count(*) FROM orders; -- still returns 1000
COMMIT;
-- Session A sees the snapshot from when its transaction began,
-- not the row Session B inserted.

Write-write conflicts still exist — I should be honest about that. Two transactions cannot update the same row at the same time; the second will block until the first commits or rolls back. But read-write conflicts, which are far more common in typical workloads, are gone entirely. This is why PostgreSQL handles mixed read/write workloads with such composure.
How it works in practice
Allow me to walk through the mechanism. Every transaction that modifies data is assigned a transaction ID (a 32-bit integer that increments monotonically; read-only transactions make do with a virtual ID). When a transaction starts, it takes a snapshot: a record of which transaction IDs are committed, which are in-progress, and which have not started yet.
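The snapshot itself can be inspected. A small sketch, assuming PostgreSQL 13 or later (older versions spell these functions txid_current() and txid_current_snapshot()):

```sql
-- Inspect the current transaction's ID and snapshot (PostgreSQL 13+)
SELECT pg_current_xact_id() AS my_txid,
       pg_current_snapshot() AS snapshot;
-- The snapshot prints as xmin:xmax:xip_list, e.g. 751:755:751,753
--   xmin: every transaction below this ID has finished
--   xmax: the next ID to be assigned; this ID and above are invisible
--   xip_list: transactions still in progress when the snapshot was taken
```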
To determine whether a tuple is visible, PostgreSQL checks:
- Is xmin committed and visible in my snapshot? If not, the tuple does not exist yet from my perspective.
- Is xmax zero, or is the deleting transaction not yet committed in my snapshot? If so, the tuple is still live.
- If both conditions pass, the tuple is visible. Otherwise, it is not.
This check happens for every tuple PostgreSQL accesses. It is fast — the snapshot comparison is essentially a set membership check — but it is not free. Nothing in a well-run household is truly free. The visibility map, which tracks pages where all tuples are known-visible to all transactions, lets PostgreSQL skip these checks entirely for clean pages. This is what makes index-only scans possible.
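The visibility map can be examined directly with the pg_visibility contrib extension. A sketch, assuming the extension is available and the orders table from earlier exists:

```sql
-- Requires the pg_visibility contrib extension
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- One row per heap page of the table
SELECT blkno, all_visible, all_frozen
FROM pg_visibility_map('orders');
-- all_visible: every tuple on the page is visible to all transactions,
--   so scans (including index-only scans) can skip visibility checks
-- all_frozen: every tuple on the page is frozen, so anti-wraparound
--   VACUUM can skip the page entirely
```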
PostgreSQL's default isolation level is Read Committed: each statement within a transaction gets a fresh snapshot. Under Repeatable Read or Serializable, the snapshot is taken at the transaction's first statement and reused for every statement after it. The MVCC machinery is the same in either case — only the snapshot timing changes.
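The difference in snapshot timing is easy to observe, reusing the orders table from the earlier example:

```sql
-- Session A:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM orders;  -- snapshot taken at this first statement

-- Session B (concurrently, autocommitted):
INSERT INTO orders (item) VALUES ('gadget');

-- Session A:
SELECT count(*) FROM orders;  -- unchanged: the snapshot is reused
COMMIT;
-- Under the default Read Committed, the second SELECT would take a
-- fresh snapshot and see Session B's committed row.
```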
The cost of MVCC
I should be forthcoming about the trade-off, because pretending it does not exist would be a disservice. MVCC's elegance has a cost, and that cost is dead tuples.
Every UPDATE leaves behind the old version of the row. Every DELETE leaves behind the entire row. These dead tuples are invisible to new transactions but still occupy physical space on disk — old records that no one will consult again, yet they remain in the filing cabinet, taking up space. They must be scanned over during sequential scans, and their index entries must be checked during index scans. Until VACUUM removes them, they are dead weight.
-- Check dead tuple counts per table
SELECT
schemaname,
relname,
n_live_tup,
n_dead_tup,
round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC
LIMIT 10;

If dead tuples accumulate faster than VACUUM removes them, tables and indexes grow — a condition called bloat. Bloated tables slow down sequential scans because PostgreSQL must read through pages full of dead rows. Bloated indexes slow down index scans and waste memory in the buffer cache.
The second cost is transaction ID management. Since transaction IDs are 32-bit integers, PostgreSQL has roughly 4 billion of them before the counter wraps around. To prevent old rows from suddenly appearing to be from the future, PostgreSQL must periodically "freeze" rows — replacing their xmin with a special frozen transaction ID that is always considered visible. This is what aggressive vacuuming and VACUUM FREEZE do.
-- Current transaction ID and age of the oldest unfrozen row
SELECT
txid_current() AS current_txid,
age(datfrozenxid) AS oldest_unfrozen_age,
datname
FROM pg_database
WHERE datname = current_database();
-- Autovacuum forces an aggressive, freezing VACUUM once a table's age
-- exceeds autovacuum_freeze_max_age (200 million transactions by default),
-- well before the hard wraparound limit near 2 billion.

If freezing falls behind — because autovacuum is too slow, or long-running transactions prevent it from making progress — PostgreSQL will eventually refuse new connections and shut down to protect against wraparound corruption. This is the most severe consequence of neglected MVCC maintenance. It is also entirely preventable. One simply has to pay attention to the vacuum health of the household.
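The database-wide figure above can be broken down per table using the system catalogs:

```sql
-- Tables closest to their anti-wraparound threshold
SELECT relname,
       age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'           -- ordinary tables only
ORDER BY age(relfrozenxid) DESC
LIMIT 10;
-- Compare xid_age against autovacuum_freeze_max_age (200 million by
-- default) to see which tables will trigger a forced vacuum next.
```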
How Gold Lapel relates
Gold Lapel operates at the proxy level, between your application and PostgreSQL. It does not alter how MVCC works — nor would you want it to; the architecture deserves respect, not revision. But it deals directly with MVCC's consequences.
Table bloat caused by dead tuple accumulation is one of the primary performance problems Gold Lapel monitors for. When bloat reaches a level that materially affects query performance, Gold Lapel surfaces it alongside the queries most affected. It also tracks vacuum health — whether autovacuum is keeping pace with write volume, and whether any tables are approaching transaction ID wraparound thresholds.
The materialized views Gold Lapel creates are themselves subject to MVCC. When a materialized view is refreshed, the old data becomes dead tuples that need vacuuming. Gold Lapel accounts for this in its refresh scheduling — it would be rather embarrassing to create the very problem one is hired to prevent.