← PostgreSQL Concepts

Write amplification

A single logical write that causes multiple physical writes — one of those costs PostgreSQL does not volunteer, but which I believe you deserve to see itemized.

Concept · March 21, 2026 · 8 min read

Write amplification is the ratio of physical I/O operations to logical data changes. In PostgreSQL, a single UPDATE writes a new heap tuple, inserts a new entry into every index on the table, and generates WAL records for each of those writes. A table with five indexes turns one logical update into eleven physical writes (heap + 5 indexes + 5 WAL entries for each). The application sees one row changed. The storage layer has been rather busier. HOT updates, fewer indexes, and WAL compression are the primary tools for bringing this cost under control.

What write amplification is

PostgreSQL never updates a row in place. Its MVCC model writes a new tuple for every UPDATE and marks the old one as dead. That alone is a 2x amplification — one logical change produces two physical artifacts (the new tuple and the dead one left behind). A modest surcharge. But the real cost comes from indexes.

Every index on the table must point to the current tuple. When a new tuple is written to a different location on the heap, every index needs a new entry pointing to that new location. A table with four indexes means four additional writes on top of the heap write. Each of those writes also generates a WAL record for crash recovery. The WAL records are sequential and batched, but they still consume I/O bandwidth and disk space. Each index you add is another name on the payroll — and every name gets paid on every update.

Anatomy of a single UPDATE
-- A single UPDATE on a table with 4 indexes:
--
-- UPDATE orders SET status = 'shipped' WHERE id = 42;
--
-- PostgreSQL performs:
-- 1. Write new heap tuple (new row version)
-- 2. Update index on id        → new index entry
-- 3. Update index on status    → new index entry
-- 4. Update index on customer  → new index entry
-- 5. Update index on created   → new index entry
-- 6. WAL record for the heap write
-- 7. WAL record for each index write (4 more)
--
-- Total: 5 physical writes + 5 WAL entries = 10 I/O operations
-- for a single-column change.

This is write amplification: a single UPDATE orders SET status = 'shipped' produces a cascade of physical writes proportional to the number of indexes, not the amount of data changed. The application sees one row modified. The storage layer sees ten or more I/O operations. I find it helpful to see the full invoice rather than just the total.

Sources of write amplification

Allow me to walk you through the four sources. They layer on top of one another, which is part of what makes the total so easy to underestimate.

Index maintenance

The largest contributor for most workloads. Every index on a table must be updated when a non-HOT update occurs. The cost is linear in the number of indexes: a table with N indexes produces N+1 physical writes per update (one heap write plus N index writes). Tables with 8 or 10 indexes — common in applications where an ORM has been left unsupervised — pay a steep write tax on every modification.

WAL (Write-Ahead Log)

Every data change — heap writes, index writes, sequence increments — generates a WAL record before the change is applied. WAL is essential for crash recovery and replication, and it cannot be disabled. Nor should it be — one does not dismiss the fire safety officer because the paperwork is inconvenient. But WAL volume tracks the number of physical writes, so high write amplification from indexes compounds into proportionally high WAL volume.

Full-page writes

After each checkpoint, the first modification to any 8 KB data page causes PostgreSQL to write the entire page image to WAL, not just the delta. This protects against torn pages during a crash. A checkpoint followed by a burst of updates across many pages generates a surge of WAL — each touched page contributes a full 8 KB image rather than the typical few-hundred-byte change record. The wal_fpi counter in pg_stat_wal (PostgreSQL 14+) tracks how often this happens. The numbers after a checkpoint can be startling.

TOAST

When a column value exceeds approximately 2 KB, PostgreSQL stores it out-of-line in a separate TOAST table. Updating a TOASTed value writes a new entry in the TOAST table and a new entry in the TOAST table's index, each with WAL records. If the large value did not change, PostgreSQL reuses the existing TOAST pointer and the overhead is zero. TOAST amplification is proportional to how often you modify large column values.

Measuring write amplification

PostgreSQL does not report a single "write amplification factor" metric. One assembles the picture from several views — a bit like appraising household expenses from the individual ledgers rather than a single summary.

HOT update ratio

The most actionable metric, and the one I recommend checking first. pg_stat_user_tables tracks total updates (n_tup_upd) and HOT updates (n_tup_hot_upd). HOT updates skip all index maintenance, so the ratio tells you what fraction of your updates pay the full amplification cost versus what fraction avoid it.

SQL
-- Check HOT update ratio per table
-- Higher hot_pct = less write amplification
SELECT
  schemaname,
  relname,
  n_tup_upd,
  n_tup_hot_upd,
  CASE WHEN n_tup_upd > 0
    THEN round(n_tup_hot_upd::numeric / n_tup_upd * 100, 1)
    ELSE 0
  END AS hot_pct
FROM pg_stat_user_tables
WHERE n_tup_upd > 1000
ORDER BY n_tup_upd DESC;

WAL volume

The total WAL generated over a time window gives you the aggregate cost of all writes. Compare this against the number of logical row changes to approximate the amplification factor.

SQL
-- Measure WAL volume generated over a time window
-- Take a snapshot, wait, take another, and compare.

-- Snapshot 1:
SELECT pg_current_wal_lsn() AS lsn_before;

-- ... run workload ...

-- Snapshot 2:
SELECT
  pg_current_wal_lsn() AS lsn_after,
  pg_size_pretty(
    pg_wal_lsn_diff(pg_current_wal_lsn(), '<lsn_before_value>')
  ) AS wal_generated;

pg_stat_wal

In PostgreSQL 14+, pg_stat_wal provides cumulative WAL statistics including total records, total bytes, and the number of full-page images. A high ratio of full-page images to total records indicates that checkpoint-triggered full-page writes are a significant source of amplification.

SQL (PostgreSQL 14+)
-- PostgreSQL 14+: cumulative WAL statistics
SELECT
  wal_records,
  wal_fpi,
  pg_size_pretty(wal_bytes) AS wal_total,
  pg_size_pretty(wal_fpi * 8192) AS full_page_writes_est,
  round(wal_fpi::numeric / nullif(wal_records, 0) * 100, 1) AS fpi_pct
FROM pg_stat_wal;

-- wal_fpi = full-page images written after checkpoints
-- A high fpi_pct means full-page writes dominate WAL volume

Reducing write amplification

Four strategies, presented in order of typical impact. The first two address the root cause; the latter two manage the symptoms.

HOT updates via fillfactor

When an UPDATE changes no indexed column and the new tuple fits on the same heap page as the old one, PostgreSQL performs a HOT (Heap-Only Tuple) update — it writes only the heap tuple and skips all index updates. This collapses N+1 writes down to one. It is, in my experience, the single most effective remedy for write amplification on update-heavy tables.

The default fillfactor of 100 packs pages completely full, leaving no room for new tuple versions. Lowering it to 80 or 90 reserves 10-20% of each page as free space, dramatically increasing the likelihood that updates stay on the same page and qualify for HOT. You are trading a modest amount of disk space for a substantial reduction in write cost — a trade I have never seen regretted on a table with frequent updates.

SQL
-- Reserve page space so updates can stay on the same page (HOT eligible)
ALTER TABLE orders SET (fillfactor = 80);

-- Rewrite existing pages with the new fillfactor
VACUUM FULL orders;
-- Or, without blocking reads/writes:
-- pg_repack -d mydb -t orders

-- Verify fillfactor setting
SELECT
  relname,
  reloptions
FROM pg_class
WHERE relname = 'orders';

Fewer indexes

Every index you add increases write amplification by one write per non-HOT update. Removing an unused or redundant index reduces it by the same amount. Audit your indexes periodically: drop indexes that serve no queries, replace overlapping single-column indexes with composite indexes, and use partial indexes to limit the rows indexed. An index that nobody queries is not free — it is staff drawing a salary for work that is never requested.

SQL
-- Tables with the most indexes (more indexes = more write amplification)
SELECT
  schemaname,
  tablename,
  count(*) AS index_count,
  string_agg(indexname, ', ' ORDER BY indexname) AS indexes
FROM pg_indexes
WHERE schemaname = 'public'
GROUP BY schemaname, tablename
HAVING count(*) > 3
ORDER BY count(*) DESC;

Partial indexes

A partial index with a WHERE clause only indexes rows that match the condition. Rows that do not match are not in the index and do not need index updates when modified. If 90% of your updates affect rows outside the partial index's predicate, those updates do not pay any write cost for that index. Partial indexes reduce both the size of the index and the write amplification it causes. They are precision instruments — indexing exactly what is needed, nothing more.

WAL compression

Enabling wal_compression compresses full-page images before writing them to WAL. This does not reduce the number of writes, but it reduces the volume of WAL data — typically by 50-70% for full-page images. The CPU cost is minimal on modern hardware, and the I/O savings are significant for workloads with high full-page write ratios. I should be clear: compression manages the symptom, not the cause. But managing a symptom this cheaply is perfectly reasonable.

postgresql.conf
-- Enable WAL compression to reduce WAL volume
-- postgresql.conf:
-- wal_compression = on        -- PostgreSQL 15+: on, off, pglz, lz4, zstd
-- wal_compression = pglz      -- PostgreSQL 9.5-14: on/off only

-- Check current setting
SHOW wal_compression;

How Gold Lapel relates

This is a matter I take some care with. When Gold Lapel evaluates whether to recommend a new index, it does not consider read performance in isolation. Every index has a write cost: one additional physical write for every non-HOT update on the table. Gold Lapel weighs the read benefit of a proposed index against the write amplification it would introduce, factoring in the table's update rate, current HOT ratio, and existing index count.

On a table with a high update rate and a healthy HOT ratio, Gold Lapel may decline to recommend an index that would technically improve read performance for a small number of queries — because the write cost across thousands of updates per second would exceed the read savings. Conversely, on a read-heavy table with few updates, the write cost of an additional index is negligible and the recommendation threshold is lower. The goal is net performance improvement, not read optimization at the expense of writes. A recommendation that speeds up reads while quietly doubling the write bill would be, in my view, no recommendation at all.

Frequently asked questions