pgstattuple
The difference between suspecting bloat and knowing it. Allow me to show you what your tables look like on the inside.
Most diagnostic tools offer estimates. pgstattuple offers the truth. It is a PostgreSQL contrib extension that physically examines every page of a table or index and reports exactly what it finds: live tuples, dead tuples, free space, index fragmentation, and leaf page density. No sampling, no inference — a full accounting of what is actually occupying your storage.
What pgstattuple does
pgstattuple scans the physical pages of a table or index and counts what it finds: live tuples, dead tuples (rows deleted or updated but not yet reclaimed by VACUUM), and free space. For B-tree indexes, pgstatindex reports tree depth, leaf page density, empty and deleted pages, and fragmentation.
This is a physical-level inspection. Where PostgreSQL's built-in statistics views (pg_stat_user_tables, pg_stat_user_indexes) estimate dead tuple counts based on activity counters — educated guesses, if we're being candid — pgstattuple reads every page and delivers exact numbers. The trade-off is cost: a full sequential scan of the relation. Precision has its price.
The extension also provides pgstattuple_approx, a lighter alternative that uses the visibility map to skip pages known to contain only visible tuples. This returns exact dead tuple counts with approximate live tuple and free space figures, at a fraction of the I/O cost.
When to use pgstattuple
pgstattuple is a diagnostic instrument, not something you run continuously. Think of it as a thorough physical examination — invaluable when symptoms appear, unnecessary as a daily ritual. Reach for it when:
- Investigating table bloat — when a table's disk size seems disproportionate to its row count, pgstattuple shows exactly how much space is occupied by dead tuples and free space
- Evaluating autovacuum effectiveness — a high
dead_tuple_percentafter autovacuum has run suggests the vacuum is not keeping up with the write workload - Deciding whether to VACUUM FULL or pg_repack — if
free_percentis high butdead_tuple_percentis low, the space has been reclaimed by VACUUM but not returned to the OS; only VACUUM FULL or pg_repack will shrink the file - Assessing index health —
pgstatindexreveals deleted pages, empty pages, and fragmentation, helping you decide whether REINDEX is worthwhile - Verifying maintenance operations — run pgstattuple before and after VACUUM, VACUUM FULL, REINDEX, or pg_repack to confirm the operation had the expected effect
Installation and setup
pgstattuple is a contrib module that ships with PostgreSQL. It does not require shared_preload_libraries — a simple CREATE EXTENSION is all you need, with no server restart.
-- pgstattuple is a contrib module — no shared_preload_libraries needed
CREATE EXTENSION pgstattuple;
-- Verify it's available
SELECT * FROM pgstattuple('pg_catalog.pg_class'); By default, only superusers and members of the pg_stat_scan_tables role can execute pgstattuple functions. To grant access to a non-superuser:
GRANT pg_stat_scan_tables TO your_monitoring_user; Output columns
The two primary functions — pgstattuple for tables and pgstatindex for B-tree indexes — return different column sets.
-- pgstattuple() output columns
table_len -- physical table size in bytes
tuple_count -- number of live tuples
tuple_len -- total size of live tuples in bytes
tuple_percent -- percentage of space used by live tuples
dead_tuple_count -- number of dead tuples (not yet vacuumed)
dead_tuple_len -- total size of dead tuples in bytes
dead_tuple_percent -- percentage of space used by dead tuples
free_space -- total free space in bytes
free_percent -- percentage of free space
-- pgstatindex() output columns
version -- B-tree version number
tree_level -- depth of the B-tree
index_size -- total index size in bytes
leaf_pages -- number of leaf pages
empty_pages -- number of empty pages
deleted_pages -- number of deleted pages
avg_leaf_density -- average fill percentage of leaf pages
leaf_fragmentation -- physical vs logical ordering mismatch (%) Note that table_len will always exceed the sum of tuple_len, dead_tuple_len, and free_space. The difference is accounted for by fixed page overhead, per-page tuple pointers, and alignment padding. The books never quite balance to the byte — but this is PostgreSQL's housekeeping, not an error in the accounting.
Practical examples
Measure table bloat
The core use case — and, if I may say, the reason most guests arrive at this page. This shows precisely how much of a table is occupied by live data, dead tuples, and free space.
-- Measure table bloat
SELECT
table_len,
tuple_count,
tuple_len,
round(tuple_percent::numeric, 2) AS live_pct,
dead_tuple_count,
dead_tuple_len,
round(dead_tuple_percent::numeric, 2) AS dead_pct,
free_space,
round(free_percent::numeric, 2) AS free_pct
FROM pgstattuple('my_table'); Measure index bloat and fragmentation
pgstatindex works on B-tree indexes. Low avg_leaf_density or high leaf_fragmentation suggests the index would benefit from REINDEX.
-- Measure B-tree index bloat and fragmentation
SELECT
version,
tree_level,
index_size,
leaf_pages,
empty_pages,
deleted_pages,
round(avg_leaf_density::numeric, 2) AS avg_leaf_density,
round(leaf_fragmentation::numeric, 2) AS leaf_frag_pct
FROM pgstatindex('my_table_pkey'); Lightweight scan with pgstattuple_approx
For large tables where a full scan would be imprudent, pgstattuple_approx uses the visibility map to skip pages that contain only visible tuples. Dead tuple counts remain exact; live tuple and free space figures are approximate. A reasonable compromise — not every examination requires opening every drawer.
-- Lightweight alternative — skips pages visible in the VM
-- Available since PostgreSQL 9.5
SELECT
table_len,
round(scanned_percent::numeric, 2) AS scanned_pct,
approx_tuple_count,
approx_tuple_len,
dead_tuple_count,
dead_tuple_len,
round(dead_tuple_percent::numeric, 2) AS dead_pct,
approx_free_space,
round(approx_free_percent::numeric, 2) AS free_pct
FROM pgstattuple_approx('my_table'); Inspect a GIN index
pgstatginindex reports the pending list size for GIN indexes. A large pending list means inserts are being buffered rather than merged into the index — this is normal for write-heavy workloads using fastupdate, but queries will be slower until the pending list is flushed.
-- Inspect a GIN index's pending list
SELECT
version,
pending_pages,
pending_tuples
FROM pgstatginindex('my_gin_index'); Find the most bloated tables
Scan all tables in the database to find the ones with the highest dead tuple percentage. This query calls pgstattuple for each table, so it performs a full scan of every relation in your database. I would not run this during peak hours. Quietly, after dinner.
-- Find the most bloated tables in a database
SELECT
relname,
pg_size_pretty(pg_relation_size(oid)) AS size,
(pgstattuple(oid)).dead_tuple_percent AS dead_pct,
(pgstattuple(oid)).free_percent AS free_pct
FROM pg_class
WHERE relkind = 'r'
AND pg_relation_size(oid) > 10 * 1024 * 1024 -- tables over 10 MB
ORDER BY (pgstattuple(oid)).dead_tuple_percent DESC
LIMIT 10; Cloud availability
| Provider | Status |
|---|---|
| Amazon RDS / Aurora | Available — install with CREATE EXTENSION |
| Google Cloud SQL | Available — install with CREATE EXTENSION |
| Azure Database for PostgreSQL | Available — full support on PostgreSQL 14+; limited pg_toast access on 11-13 |
| Supabase | Available — enable via the Extensions dashboard |
| Neon | Available — install with CREATE EXTENSION |
How Gold Lapel relates
I monitor table and index health as a matter of course — it is simply what attentive service looks like. When queries begin slowing on tables with high write churn, or when sequential scans appear on tables that ought to fit comfortably in cache, I flag those relations as candidates for maintenance: pg_repack, REINDEX, or a conversation about autovacuum settings.
pgstattuple provides the diagnosis. Gold Lapel provides the ongoing attention. Where pgstattuple gives you a point-in-time snapshot — precise, thorough, and entirely manual — I watch continuously and surface concerns before they become emergencies. The two pair rather well: I identify which tables warrant your attention, and pgstattuple lets you confirm the findings with your own eyes, down to the last dead tuple.