
Checkpoint

The periodic operation that flushes dirty pages from memory to disk. Checkpoints are where PostgreSQL's in-memory changes become permanent in the data files.

Concept · March 21, 2026 · 8 min read

A checkpoint is a periodic operation where PostgreSQL writes all dirty pages (modified data that exists only in the buffer cache) to the underlying data files on disk. It then updates the control file to record where in the WAL stream the checkpoint occurred. After a crash, PostgreSQL replays WAL only from the last checkpoint forward — everything before it is already safely on disk. Frequent checkpoints mean less WAL to replay but more I/O during normal operation. Infrequent checkpoints reduce I/O overhead but increase the time needed to recover from a crash.

What a checkpoint is

PostgreSQL does not write data changes directly to tables and indexes on disk at commit time. Instead, it writes a WAL record (ensuring durability) and modifies the page in the shared buffer cache. The actual data files lag behind — they contain a mix of current and stale data, with the WAL holding the authoritative record of recent changes.

A checkpoint closes this gap. The checkpointer process scans the buffer cache, identifies every dirty page (one that has been modified since it was last written to disk), and flushes all of them to the data files. Once all dirty pages are written and an fsync confirms they are on stable storage, PostgreSQL updates the pg_control file with the checkpoint's WAL position. This position is the recovery starting point — the guarantee that everything before it is safely persisted in the data files.

After a checkpoint completes, the WAL segments that predate it are no longer needed for crash recovery. PostgreSQL recycles these segments, reusing the disk space for new WAL writes. This is how WAL disk usage stays bounded rather than growing without limit.
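You can observe the WAL footprint directly. This query is a sketch: pg_ls_waldir() is available from PostgreSQL 10 onward and is restricted to superusers and members of predefined monitoring roles.

SQL
-- Count WAL segment files and their total size
-- (recycling keeps this roughly between min_wal_size and max_wal_size)
SELECT count(*) AS segments,
       pg_size_pretty(sum(size)) AS total_size
FROM pg_ls_waldir();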

Why checkpoints matter

Checkpoints sit at the center of a fundamental trade-off: crash recovery speed versus steady-state I/O load.

  • Crash recovery — after an unclean shutdown, PostgreSQL replays all WAL generated since the last checkpoint. A checkpoint that completed 2 minutes ago means 2 minutes of WAL to replay. A checkpoint that completed 30 minutes ago means 30 minutes of WAL. The checkpoint interval directly controls your worst-case recovery time.
  • I/O overhead — each checkpoint flushes potentially thousands of dirty pages to disk. More frequent checkpoints mean this I/O burst happens more often. On I/O-constrained systems, checkpoint activity can compete with query workloads and cause periodic latency spikes.
  • WAL volume — PostgreSQL performs a full-page write (the entire 8 KB page) to WAL the first time a page is modified after a checkpoint. Frequent checkpoints mean more full-page writes, which increases WAL volume and can affect replication bandwidth and archive storage.

The goal is to find a checkpoint frequency that keeps recovery time acceptable without creating noticeable I/O interference during normal operation. Most production systems land on a checkpoint interval between 5 and 15 minutes.
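To reason about this trade-off concretely, it helps to know your WAL generation rate. A rough sketch: capture the current WAL position, wait a representative period, then diff the two positions (the LSN shown below is illustrative; substitute the value you captured).

SQL
-- Snapshot the current WAL position...
SELECT pg_current_wal_lsn();

-- ...wait a representative interval (say 5 minutes), then measure
-- how much WAL was generated; replace '0/3A017A88' with the LSN above
SELECT pg_size_pretty(
  pg_wal_lsn_diff(pg_current_wal_lsn(), '0/3A017A88')
) AS wal_generated;

That rate, multiplied by your checkpoint interval, approximates the worst-case amount of WAL crash recovery would need to replay.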

Key configuration

Three settings control when and how checkpoints happen. A fourth controls visibility into their behavior.

postgresql.conf
# Key checkpoint settings (shown with typical production values)

# Maximum time between automatic checkpoints
checkpoint_timeout = 10min

# WAL size that triggers an early checkpoint
max_wal_size = 4GB

# Minimum WAL retained after a checkpoint
min_wal_size = 1GB

# Spread writes over this fraction of the checkpoint interval
checkpoint_completion_target = 0.9

# Log checkpoint start, completion, and statistics
log_checkpoints = on

checkpoint_timeout

The maximum time between automatic checkpoints. The default is 5 minutes. When this timer expires, PostgreSQL starts a new checkpoint regardless of how much WAL has been generated. Increasing this to 10 or 15 minutes reduces checkpoint frequency and the associated I/O, at the cost of longer crash recovery.
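The setting can be changed without a restart; a reload is enough. For example:

SQL
-- Raise the checkpoint interval (takes effect on configuration reload)
ALTER SYSTEM SET checkpoint_timeout = '15min';
SELECT pg_reload_conf();

-- Verify the new value
SHOW checkpoint_timeout;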

max_wal_size

The soft limit on WAL size between checkpoints. If WAL accumulates beyond this threshold before checkpoint_timeout fires, PostgreSQL forces an early checkpoint. These forced checkpoints appear as checkpoints_req in pg_stat_bgwriter. If you see many requested checkpoints, raise max_wal_size so the timeout triggers first.
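A quick way to check whether this is happening is to compute the share of requested checkpoints. (Note: on PostgreSQL 17 and later these counters moved to the pg_stat_checkpointer view, as num_timed and num_requested.)

SQL
-- Fraction of checkpoints forced by WAL volume rather than the timer
SELECT checkpoints_timed,
       checkpoints_req,
       round(100.0 * checkpoints_req
             / nullif(checkpoints_timed + checkpoints_req, 0), 1)
         AS pct_requested
FROM pg_stat_bgwriter;

A pct_requested in the single digits is healthy; much higher suggests raising max_wal_size.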

checkpoint_completion_target

Controls how aggressively the checkpointer writes dirty pages. A value of 0.9 means PostgreSQL spreads checkpoint writes over 90% of the checkpoint interval. This smooths out the I/O load rather than writing everything in a burst at the start.

SQL
-- checkpoint_completion_target controls I/O spreading
-- With checkpoint_timeout = 10min and completion_target = 0.9:
-- PostgreSQL spreads dirty page writes over 9 minutes (90% of interval)
-- This prevents a burst of I/O at checkpoint time

-- A lower value (e.g., 0.5) writes faster but creates I/O spikes
-- A higher value (e.g., 0.9) spreads writes more evenly — recommended

-- Check current setting
SHOW checkpoint_completion_target;

The default changed from 0.5 to 0.9 in PostgreSQL 14, reflecting the consensus that spreading writes is almost always preferable. If you are on an older version, setting this to 0.9 manually is one of the easiest performance improvements available.
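On those older versions, the change is a one-liner followed by a reload:

SQL
-- On PostgreSQL 13 and earlier, override the old 0.5 default
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
SELECT pg_reload_conf();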

log_checkpoints

When enabled, PostgreSQL logs detailed information at the start and completion of every checkpoint — how many buffers were written, how long the write and sync phases took, and the WAL distance covered. This is essential for understanding checkpoint behavior.

PostgreSQL log output
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 8234 buffers (50.2%);
      0 WAL file(s) added, 3 removed, 2 recycled;
      write=53.012 s, sync=0.089 s, total=53.241 s;
      sync files=412, longest=0.014 s, average=0.001 s;
      distance=524288 kB, estimate=524288 kB

Monitoring checkpoints

The pg_stat_bgwriter view is the primary source of checkpoint statistics. It accumulates counters since the last statistics reset. (On PostgreSQL 17 and later, the checkpoint counters live in the separate pg_stat_checkpointer view.)

SQL
-- Check checkpoint activity
SELECT
  checkpoints_timed,
  checkpoints_req,
  checkpoint_write_time / 1000 AS write_seconds,
  checkpoint_sync_time / 1000 AS sync_seconds,
  buffers_checkpoint,
  buffers_clean,
  buffers_backend
FROM pg_stat_bgwriter;

-- checkpoints_timed:  scheduled checkpoints (normal, triggered by checkpoint_timeout)
-- checkpoints_req:    requested checkpoints (triggered by max_wal_size or manual CHECKPOINT)
-- buffers_backend:    pages written directly by backends (should be low)

checkpoints_timed vs checkpoints_req — in a well-tuned system, nearly all checkpoints should be timed (scheduled). A high ratio of requested checkpoints means WAL is hitting max_wal_size before the timeout, which creates more frequent and potentially less efficient checkpoints. Increase max_wal_size or checkpoint_timeout to correct this.

buffers_checkpoint vs buffers_backend — dirty pages should be flushed by the checkpointer (buffers_checkpoint) or the background writer (buffers_clean), not by backend processes running queries (buffers_backend). High buffers_backend indicates that the buffer cache is under pressure and backends are being forced to evict dirty pages themselves, which adds latency to the queries those backends are running.
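The ratio is easy to compute. This query applies to PostgreSQL 16 and earlier; from version 17 onward, per-backend write statistics are reported in pg_stat_io instead.

SQL
-- Who is writing dirty pages? Backend writes should be a small share.
SELECT buffers_checkpoint,
       buffers_clean,
       buffers_backend,
       round(100.0 * buffers_backend
             / nullif(buffers_checkpoint + buffers_clean + buffers_backend, 0), 1)
         AS pct_backend_writes
FROM pg_stat_bgwriter;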

SQL
-- Force an immediate checkpoint (use sparingly)
CHECKPOINT;

-- Check when the last checkpoint occurred
SELECT
  checkpoint_time,
  redo_lsn,
  redo_wal_file
FROM pg_control_checkpoint();

The pg_control_checkpoint() function shows when the last checkpoint occurred and the WAL position it recorded. This tells you exactly where crash recovery would begin if the server went down right now.
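Combining that position with the current WAL insert location gives the replay distance directly:

SQL
-- How much WAL would crash recovery replay if the server died right now?
SELECT pg_size_pretty(
  pg_wal_lsn_diff(pg_current_wal_lsn(), redo_lsn)
) AS wal_since_checkpoint
FROM pg_control_checkpoint();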

How Gold Lapel relates

Gold Lapel operates above the checkpoint layer — it works at the query level, sitting between your application and PostgreSQL. It does not trigger, configure, or directly interact with checkpoints.

That said, query optimization has downstream effects on checkpoint behavior. When Gold Lapel routes a frequently executed aggregation to a materialized view instead of re-executing the underlying joins and sorts, the avoided work dirties fewer pages in the buffer cache. Fewer dirty pages mean less work for each checkpoint and less I/O contention. Similarly, when Gold Lapel recommends an index that replaces a sequential scan, the more targeted reads reduce buffer cache churn — pages stay useful longer and are less likely to be evicted and re-read in a pattern that amplifies checkpoint writes.

The relationship is indirect but real: more efficient queries mean a calmer buffer cache, and a calmer buffer cache means smoother checkpoints.

Frequently asked questions