work_mem
Per-operation memory for sorts, hash joins, and hash aggregations. Set it too low and your queries spill to disk. Set it too high globally and, well — I trust you can imagine how that ends.
work_mem is a PostgreSQL configuration parameter that controls how much memory each individual sort, hash join, or hash aggregation operation can use before spilling to temporary files on disk. The default is 4MB. I should emphasize: this is not a per-connection setting — it is per-operation. A single complex query with multiple sorts and hash joins can allocate several times work_mem simultaneously. The prudent approach is to keep the global default conservative and raise it per-session for the queries that genuinely need the room.
What work_mem is
If you'll allow me a moment to be precise about this — work_mem sets the maximum amount of memory that a single internal operation can use before it starts writing intermediate results to temporary files on disk. The operations affected are:
- Sorts — ORDER BY, DISTINCT, merge joins, and index creation
- Hash joins — building the in-memory hash table for a hash join
- Hash aggregations — GROUP BY operations that use hashing
Each of these operations gets its own work_mem allocation independently. This is the detail that catches people: work_mem is not a per-connection limit. It is a per-operation limit within each backend process. A query plan with a hash join feeding into a sort feeding into another hash join can use up to three times work_mem for that single query. One query, three allocations. Multiply that by your connection count and you begin to see the arithmetic that keeps database administrators up at night.
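To make the multiplication concrete, you can do the arithmetic in psql itself. The 64MB figure below is a hypothetical session setting chosen for illustration, not a recommendation, and pg_size_bytes/pg_size_pretty are standard PostgreSQL functions (9.6 and later):

```sql
-- Hypothetical: a session with work_mem = 64MB running a plan with
-- two hash joins and one sort = three concurrent allocations
SELECT pg_size_pretty(3 * pg_size_bytes('64MB')) AS worst_case_per_query;
-- worst_case_per_query: 192 MB
```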
When the data for an operation fits within work_mem, it completes entirely in RAM. When it does not, PostgreSQL writes the overflow to temporary files on disk and processes the data in batches. The in-memory path is, as you might expect, dramatically faster.
Why it matters
When work_mem is too low for the data being processed, sorts switch from an in-memory quicksort to an external merge sort that reads and writes temporary files. Hash joins build partial hash tables that must be processed in multiple batches. The performance difference is not subtle — disk-based sorts can be 10x to 100x slower than in-memory sorts, depending on disk speed and data volume. That is the kind of slowdown a guest notices.
The other side deserves equal attention. When work_mem is set too high globally, every concurrent operation on every connection can claim that much memory. On a server with 200 connections and a global work_mem of 256MB, a worst-case scenario where every connection runs a query with two hash joins could demand over 100GB of memory. The operating system will start swapping, or the OOM killer will terminate PostgreSQL entirely. I would counsel against either outcome.
The tension between these two failure modes is what makes work_mem one of the trickier PostgreSQL parameters to get right. A blunt global value is a blunt instrument. The sophisticated approach — and the one I would recommend — is to keep the global default conservative and raise it selectively where it is actually needed.
Detecting disk spills
The most direct way to see whether a query is spilling to disk is EXPLAIN (ANALYZE, BUFFERS). It will tell you plainly which sort method was used and whether it fit in memory.
-- A sort that spills to disk (work_mem too low)
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders ORDER BY created_at;
-- Look for these in the output:
-- Sort Method: external merge Disk: 45328kB <-- spilled to disk
-- Sort Method: quicksort Memory: 2048kB <-- fit in memory

Sort Method: external merge Disk is the phrase you are looking for — it means the sort exceeded work_mem and wrote temporary files. The number after Disk: tells you how much data was written. By contrast, Sort Method: quicksort Memory means the sort completed entirely in RAM. The number after Memory: tells you how much was used. For hash joins, look for Batches: 1 (fits in memory) versus Batches: 8 or higher (spilling to disk, one batch at a time).
For a broader view, enable log_temp_files to have PostgreSQL log every temp file it creates. This catches disk spills across all queries — not just the one you happened to be examining at the time.
-- Cumulative temp file usage per database (disk spills)
-- pg_stat_database counts all temp files, regardless of log_temp_files
SELECT
datname,
temp_files,
pg_size_pretty(temp_bytes) AS temp_size
FROM pg_stat_database
WHERE datname = current_database();

Setting work_mem
There are three levels at which you can set work_mem, and the right approach usually involves at least two of them. Allow me to walk through each.
-- Global default (postgresql.conf, requires reload)
work_mem = '16MB'
-- Per-session: raise for a complex reporting query
SET work_mem = '256MB';
SELECT ... FROM large_join ORDER BY ...;
-- Per-transaction: automatically reverts at COMMIT/ROLLBACK
BEGIN;
SET LOCAL work_mem = '256MB';
SELECT ... FROM large_join ORDER BY ...;
COMMIT; -- work_mem reverts to the global default

The global default in postgresql.conf applies to every new connection. Keep this conservative — it is the baseline for your most common workload, which is typically OLTP queries that operate on small result sets and rarely need large sorts. This is your household budget. Modest and responsible.
The per-session SET is the workhorse. Before running a reporting query, a data export, or any query you know involves large sorts or joins, raise work_mem for that session. This is safe because you are only increasing memory for one connection, and you presumably know the query warrants it. Targeted generosity, not global recklessness.
The per-transaction SET LOCAL is the most disciplined option. It automatically reverts when the transaction ends, so there is no risk of leaving an elevated work_mem on a connection that returns to a pool. I am rather fond of this approach — it cleans up after itself, which is a quality I value highly.
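If you do use a plain SET on a pooled connection, remember that it persists for the life of the session; RESET (a standard PostgreSQL command) returns the parameter to its default when the heavy work is done:

```sql
SET work_mem = '256MB';
-- ... run the reporting query here ...
RESET work_mem;  -- back to the global (or role) default
```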
Sizing guidance
There is no single correct value — anyone who offers one without knowing your workload is guessing. But here is a framework for arriving at yours.
The rough ceiling for a global default: take the RAM available to PostgreSQL (total RAM minus shared_buffers minus OS needs), divide by max_connections, then divide by the typical number of memory-consuming operations per query (often 2 to 4). For a 64GB server with 16GB of shared_buffers, 200 connections, and an average of 2 operations per query: (64 - 16 - 8) / 200 / 2 = roughly 100MB as an absolute ceiling. Most people settle well below that.
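The same arithmetic, as a sketch you can adapt; the figures are this example's, not recommendations:

```sql
-- (RAM - shared_buffers - OS headroom) / max_connections / ops per query
-- Example: 64GB server, 16GB shared_buffers, 8GB for the OS,
-- 200 connections, 2 memory-consuming operations per query
SELECT pg_size_pretty(
  (pg_size_bytes('64GB') - pg_size_bytes('16GB') - pg_size_bytes('8GB'))
  / 200 / 2
) AS ceiling_per_op;
-- about 102 MB, i.e. the "roughly 100MB" ceiling described above
```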
Practical defaults for common workloads:
- OLTP (web applications) — 4MB to 16MB. Most queries are simple lookups and small joins that never approach these limits.
- Mixed workloads — 16MB to 64MB globally, with per-session increases for reporting queries.
- Analytics / data warehousing — 64MB to 256MB or higher, especially for dedicated reporting connections with controlled concurrency.
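For dedicated reporting connections, one way to avoid repeating SET in every session is a role-level default. ALTER ROLE ... SET is standard PostgreSQL; the role name here is hypothetical:

```sql
-- Hypothetical "reporting" role: its sessions get a larger default
ALTER ROLE reporting SET work_mem = '256MB';
-- New sessions that log in as "reporting" pick this up automatically
```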
To find where spills are actually occurring — rather than where you imagine they might be — enable log_temp_files and let it run for a representative period.
-- Log every temp file created (size threshold in kB, 0 = log all)
-- Add to postgresql.conf, then reload
log_temp_files = 0
-- Then check the PostgreSQL log for entries like:
-- LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp12345.0", size 47185920

Each log entry shows the size of a temp file that was created — each one a small confession from a query that needed more room than it was given. Cross-reference with pg_stat_statements to identify which queries are responsible, and you will know precisely where to direct your attention.
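If the pg_stat_statements extension is installed, its temp_blks_written column gives per-query attribution without any log parsing. This sketch assumes the default 8kB block size:

```sql
-- Requires the pg_stat_statements extension
SELECT query,
       calls,
       pg_size_pretty(temp_blks_written * 8192) AS temp_written
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC
LIMIT 10;
```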
How Gold Lapel relates
I should mention — Gold Lapel's query optimization can reduce the need for large work_mem values by eliminating the operations that require it in the first place. When Gold Lapel creates an index that turns a sequential scan plus sort into an index scan that returns rows in order, the sort operation disappears from the query plan entirely. No sort, no work_mem allocation. The problem is not solved — it is dissolved.
Similarly, when Gold Lapel replaces an expensive aggregation with a pre-computed materialized view, the hash aggregation that would have consumed work_mem is replaced by a simple read. The query that used to require 256MB of work_mem to avoid disk spills now works comfortably with the 4MB default.
I would not suggest you can ignore work_mem tuning — that would be a disservice. But the queries most likely to spill — large sorts and aggregations on substantial tables — are often the same queries that benefit most from the indexes and materialized views Gold Lapel creates. Tune work_mem for the workload you have today, and let Gold Lapel reduce the workload that needs tuning tomorrow.