← Connection Pooling & Resource Management

A Gentleman's Guide to PostgreSQL Connection Pooling

Your application is opening 200 connections. PostgreSQL would prefer 20. A pooler resolves this disagreement with the quiet efficiency of someone who has done it thousands of times before.

The Waiter of Gold Lapel · Updated Mar 20, 2026 Published Mar 5, 2026 · 18 min read
We prepared an illustration. It is waiting in the connection queue.

Good evening. Might we discuss your connection count?

PostgreSQL handles each client connection with a dedicated backend process. This is an elegant architecture — each connection gets its own memory space, its own transaction state, its own execution context. It is the database equivalent of a private room for every guest: attentive, isolated, and extraordinarily expensive when the guest list grows.

Each backend process consumes 5-10MB of RAM. Creating a new connection requires forking a process, setting up shared memory mappings, and (with TLS) completing a cryptographic handshake. On a busy application server with 200 connections, that is 2GB of RAM and 200 OS processes before a single query is executed.

A connection pooler sits between your application and PostgreSQL, maintaining a small pool of persistent connections that are shared among many clients. It is one of the few infrastructure changes that is universally beneficial and has essentially no downsides — provided you understand the two or three places where it requires attention, which we shall attend to presently.

I should be honest at the outset: connection pooling is not a performance optimization in the way that adding an index is. It is a resource management strategy. It makes your database infrastructure sustainable at scale, it eliminates wasteful connection overhead, and it prevents a class of outages that I have seen bring down production systems at three in the morning. But it will not make a slow query fast. That distinction matters, and we shall return to it.

What does a connection actually cost?

-- Each PostgreSQL connection forks a new backend process.
-- On Linux, each backend consumes:
--   ~5-10MB of RAM (shared_buffers mapping, work_mem, catalog cache)
--   1 OS process (context switches, scheduling overhead)
--   1 file descriptor set

-- 200 connections = ~2GB RAM + 200 OS processes.
-- 500 connections = ~5GB RAM + significant scheduling overhead.
-- 2000 connections = you are having a very bad day.

-- Creating a new connection: ~3-5ms (TLS: 10-20ms).
-- Using a pooled connection: ~0.1ms.

The per-query overhead of creating a new connection — 3-5ms for plain TCP, 10-20ms with TLS — may seem small. But consider the arithmetic. An application executing 500 queries per second, each on a fresh connection, spends 1,500-10,000ms per second on connection setup alone. That is between 1.5 and 10 CPU-seconds dedicated entirely to handshaking. Pooling eliminates this overhead entirely: borrowing a connection from the pool costs approximately 0.1ms.

The memory overhead is more insidious. Each backend process allocates its own work_mem for sorts, its own catalog cache, its own connection state. At 200 connections, the aggregate memory consumption is substantial — and much of it sits idle. A connection that executes one query per second is idle 99.9% of the time. The resources it holds are not.

If you will permit me a brief anatomy lesson, here is what PostgreSQL does each time a new connection arrives:

-- What actually happens when you open a PostgreSQL connection:

-- 1. TCP handshake (SYN, SYN-ACK, ACK)               ~0.5ms local, 1-50ms remote
-- 2. PostgreSQL startup message                        ~0.1ms
-- 3. Authentication (md5/scram-sha-256)                ~0.5ms
-- 4. TLS negotiation (if enabled)                      ~5-15ms
-- 5. Backend process fork()                            ~1-2ms
-- 6. Shared memory mapping                             ~0.5ms
-- 7. Catalog cache initialization                      ~0.5-1ms
-- 8. GUC (config) parameter loading                    ~0.1ms
-- 9. Connection ready for queries                      Total: 3-20ms

-- For comparison, a simple SELECT by primary key: ~0.1-0.3ms.
-- The connection setup can cost 100x the query itself.

Nine steps. Each one involves system calls, memory allocation, or cryptographic computation. The connection setup frequently costs more than the query it was created to serve. I find this — creating an entire operating system process to execute a 0.2ms primary key lookup — to be the infrastructural equivalent of hiring a moving company to carry a letter across the room.

Application-level pools: necessary but insufficient

Before we discuss external poolers, a word about the connection pool your application framework almost certainly provides. SQLAlchemy has one. HikariCP has one. Go's pgxpool has one. Even node-postgres has one. These pools are useful — they prevent your application from opening a new connection for every query, and they provide timeout and health-check mechanisms.

# Application-level pooling (Python/SQLAlchemy example)
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@localhost:5432/mydb",
    pool_size=10,              # persistent connections in the pool
    max_overflow=5,            # temporary connections above pool_size
    pool_timeout=30,           # seconds to wait for a connection
    pool_recycle=1800,         # recycle connections after 30 minutes
    pool_pre_ping=True,        # verify connection is alive before use
)

# This pool lives inside your application process.
# If you run 4 Gunicorn workers, you have 4 pools = 40-60 connections.
# If you run 10 Kubernetes pods with 4 workers each: 400-600 connections.
# Application-level pooling does NOT solve the total connection count problem.

The problem is scope. An application-level pool manages connections within a single process. If you run four Gunicorn workers, you have four pools. If you run ten Kubernetes pods with four workers each, you have forty pools. Each pool maintains its own set of persistent connections to PostgreSQL, and they know nothing about each other.

This is not a criticism of application-level pooling — it is doing exactly what it was designed to do. But it solves the per-process problem while leaving the total-connection-count problem untouched. For that, you need a pooler that sits between all your application processes and the database: an external connection pooler.

The two layers are complementary. The application pool prevents unnecessary churn within each process. The external pooler consolidates all those processes into a manageable number of backend connections. I recommend both.

How does PgBouncer work?

PgBouncer is the most widely deployed PostgreSQL connection pooler. It is lightweight (single-threaded, minimal memory footprint — typically under 10MB for thousands of connections), battle-tested (15+ years in production at organizations of every size), and straightforward to configure.

# pgbouncer.ini — transaction pooling (most common)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

pool_mode = transaction          # release connection after each transaction
default_pool_size = 20           # max connections per user/database pair
max_client_conn = 1000           # max incoming client connections
reserve_pool_size = 5            # emergency overflow connections
reserve_pool_timeout = 3         # seconds before using reserve pool

# Your app connects to port 6432.
# PgBouncer maintains 20 connections to PostgreSQL.
# 1000 application connections share those 20 backend connections.

The key insight: 1,000 application connections share 20 backend connections. Since most application connections are idle at any given moment — waiting for user input, rendering templates, making HTTP calls to other services — 20 backend connections can serve 1,000 clients without queueing. The condition is that no more than 20 clients are actively executing queries simultaneously, which, for a typical web application, is a generous allowance.

PgBouncer is single-threaded. This is both its limitation and its virtue. The limitation: on a very high-throughput system (tens of thousands of queries per second), a single PgBouncer instance can become a CPU bottleneck. The virtue: single-threaded means no lock contention, no thread synchronization overhead, and extraordinary predictability. For the vast majority of deployments, a single PgBouncer instance handles the load comfortably. For those that outgrow it, you can run multiple instances behind a TCP load balancer, or consider the multi-threaded alternatives we shall discuss shortly.

What are the pooling modes?

ModeConnection lifecycleTrade-offRestrictions
SessionConnection held for entire client sessionNo savings if clients stay connected. Good for interactive psql.None — full PostgreSQL feature support
TransactionConnection released after each transactionBest balance of pooling efficiency and compatibility.No SET, LISTEN/NOTIFY, prepared statements across transactions
StatementConnection released after each statementMaximum pooling but severe restrictions.No multi-statement transactions. Rarely used.

Transaction mode is the right choice for the vast majority of web applications. The connection is released back to the pool after each transaction completes, maximizing sharing. The restriction — no session-level state across transactions — is rarely a problem for applications that use parameterized queries and explicit transactions.

Session mode deserves a fair hearing. If your application relies on SET commands, LISTEN/NOTIFY, or prepared statements that persist across transactions, session mode avoids the compatibility headaches of transaction mode. The cost is that you lose most of the pooling benefit — each client holds a backend connection for its entire session, which for a long-lived web server process may be minutes or hours. Session mode is appropriate for interactive tools (psql, database administration interfaces) and applications with heavy session-state requirements. It is rarely appropriate for web APIs.

Statement mode I include for completeness. It releases the connection after every individual statement, prohibiting multi-statement transactions entirely. I have encountered it in production exactly once, in a read-only analytics dashboard where every query was a single SELECT. Unless your use case is similarly constrained, avoid it.

The session state trap

The most common pooling failure I encounter is not a configuration error — it is an application that unknowingly depends on session-level state. This warrants an example:

-- The session state problem with transaction pooling:

-- Connection A executes:
SET search_path TO tenant_42, public;
SELECT * FROM users WHERE id = 1;   -- reads from tenant_42.users
-- Transaction ends. PgBouncer releases connection A back to the pool.

-- Connection B (different backend) executes next query:
SELECT * FROM orders WHERE user_id = 1;
-- search_path is still the DEFAULT — not tenant_42.
-- This query reads from public.orders, not tenant_42.orders.
-- Data from the wrong tenant. A security incident.

-- Session-level state that does NOT survive transaction pooling:
--   SET / SET LOCAL (GUC parameters)
--   PREPARE / DEALLOCATE (prepared statements)
--   LISTEN / NOTIFY channels
--   DECLARE CURSOR (outside transactions)
--   Advisory locks (session-level)
--   Temporary tables

Multi-tenant applications that use SET search_path or SET ROLE are particularly vulnerable. The fix is straightforward: set the search path or role at the beginning of every transaction, not once per connection. Most ORMs provide a hook for this — SQLAlchemy's event.listen(engine, "checkout", set_search_path), for instance. But you must know to look for the problem, and transaction-mode pooling is where it surfaces.

The prepared statement problem

If there is one subject that generates more support tickets, confused Stack Overflow questions, and late-night debugging sessions than any other in the connection pooling world, it is prepared statements. I have written a thorough examination of the asyncpg variant of this problem, but the issue is universal to any client library that caches prepared statements when used with transaction-mode pooling.

The mechanism is simple and the failure is reliable: Client A prepares a statement on backend X. The transaction ends. PgBouncer returns backend X to the pool. Client B receives backend X and attempts to prepare the same statement name. PostgreSQL responds, quite reasonably, that a prepared statement with that name already exists. Your application crashes.

-- The prepared statement problem with transaction-mode pooling:
--
-- Client A prepares "stmt_1" on backend X.
-- Transaction ends. Backend X returns to pool.
-- Client B gets backend X. Client B prepares "stmt_1".
-- ERROR: prepared statement "stmt_1" already exists.
--
-- Solutions:

-- 1. Disable prepared statements in your client library:
-- asyncpg (Python):
conn = await asyncpg.connect(statement_cache_size=0)

-- psycopg3 (Python):
conn = psycopg.connect(prepare_threshold=None)

-- node-postgres (use unnamed/anonymous queries — this is the default):
-- Named prepared statements are only created when you explicitly set a "name"
-- property on your query config. The default behavior already avoids them.

-- 2. Use PgBouncer 1.21+ with prepared statement tracking:
-- pgbouncer.ini:
-- max_prepared_statements = 100

-- 3. Use pgcat or Supavisor, which handle this transparently.

The honest counterpoint: disabling prepared statement caching does cost you something. Prepared statements allow PostgreSQL to parse and plan a query once, then execute it many times with different parameters. Without caching, each execution incurs the parsing and planning overhead — typically 0.1-0.5ms per query. For most web applications, this overhead is negligible. For high-throughput systems executing the same query thousands of times per second, it is measurable. In such cases, PgBouncer 1.21+ with max_prepared_statements, or pgcat's transparent statement mapping, gives you both pooling and prepared statement performance.

How do the poolers compare?

PgBouncer is not the only option. pgcat (Rust, multi-threaded, built-in load balancing) and Supavisor (Elixir, multi-tenant, cloud-native) are newer alternatives with broader feature sets. Each makes different trade-offs, and the right choice depends on what you need beyond basic pooling.

FeaturePgBouncerpgcatSupavisor
Protocol supportv3 (basic)v3 (extended)v3 (named portals)
LanguageCRustElixir
Multi-tenancyLimitedBuilt-inBuilt-in
Load balancingNoYes (read replicas)Yes
Prepared statementsPgBouncer 1.21+ (protocol-level support)Yes (server-side mapping)Yes (named → unnamed)
MultithreadedNo (single-threaded)Yes (Tokio)Yes (BEAM)
Connection queueingBasic FIFOConfigurablePer-tenant
MaturityBattle-tested (15+ years)Production-ready (newer)Emerging
Query optimizationNoNoNo

PgBouncer remains my default recommendation for teams that need a connection pooler and nothing else. It is simple to operate, its behavior is well-documented and predictable, and it has been debugged by fifteen years of production use. There is something to be said for infrastructure that does one thing and does it faultlessly.

pgcat is the appropriate choice when you need read replica load balancing, multi-tenancy, or transparent prepared statement handling alongside pooling. It is written in Rust on Tokio, so it is multi-threaded and can handle higher throughput than a single PgBouncer instance. The trade-off is a younger project with a smaller community and less operational folklore.

Supavisor was built for Supabase's multi-tenant cloud environment and excels at per-tenant pool management. If you are running a SaaS platform with hundreds of tenant databases, Supavisor's architecture is purpose-built for that problem. For a single-database deployment, it is more machinery than required.

Note the last row: none of these poolers optimize queries. They manage connections — which is valuable but orthogonal to query performance. A pooler ensures your application can reach the database efficiently. What happens once the query arrives is a separate matter entirely.

What is the right pool size?

-- Rule of thumb for pool size:
-- pool_size = (num_cpu_cores * 2) + num_disks
-- For a 4-core server with SSD: (4 * 2) + 1 = 9

-- Why so small? PostgreSQL performs best with fewer,
-- busier connections. More connections mean more lock
-- contention, more context switching, more shared buffer
-- management overhead.

-- Common mistake: pool_size = 200.
-- You have recreated the problem pooling was meant to solve.

The most common mistake is setting the pool size too large. A pool of 200 connections does not make PostgreSQL faster — it makes it slower. More concurrent connections means more lock contention, more context switching between processes, and more shared buffer management overhead. PostgreSQL's throughput typically peaks at a connection count near (CPU cores * 2) + disk spindles.

For a typical cloud instance with 4 vCPUs and SSD storage, a pool size of 10-20 is optimal. If your application cannot function with 20 concurrent database connections, the bottleneck is in query duration (slow queries holding connections open), not in connection capacity. The correct response is to make the queries faster, not to add more connections.

This principle is counterintuitive, and I find it is the one that encounters the most resistance. Allow me to explain why it is true.

PostgreSQL uses a process-per-connection architecture. Each backend process shares access to shared buffers, the WAL, and lock tables. When 200 processes contend for these shared resources, the overhead of synchronization — lightweight locks, buffer pin management, process scheduling — can exceed the useful work being done. Reducing to 20 connections does not mean 10x less throughput. It means the same throughput with 10x less contention. The queries actually complete faster because they spend less time waiting for locks and more time executing.

Sizing across multiple applications

-- Real-world sizing with multiple applications:
-- Your database server: 8 cores, SSD, 32GB RAM.
-- Optimal backend connections: (8 * 2) + 1 = 17. Let's say 20.
--
-- But you have:
--   Web API:        4 pods × 4 workers = 16 app connections
--   Background jobs: 2 pods × 8 workers = 16 app connections
--   Admin panel:    1 pod × 2 workers  = 2 app connections
--   Monitoring:     1 connection
--   Migrations:     1 connection
--                                        = 36 total

-- Without a pooler: 36 backend connections (above optimal).
-- With PgBouncer (pool_size=20): 36 app connections share 20 backends.
-- With PgBouncer as apps scale to 100 pods: 800 app connections
--   still share 20 backends.

-- Set max_connections in PostgreSQL to pool_size + some headroom:
-- max_connections = 30  (20 pool + 5 reserve + 5 direct/admin)
-- Yes, 30. Not the default 100. Not 500. Thirty.

The most important number is not your pool size per application — it is the total backend connections across all applications. This is where the external pooler earns its keep. Without it, scaling your application to more pods or workers directly increases your PostgreSQL backend count. With it, you can scale application instances freely while keeping the backend count fixed at the optimal level.

I should note an honest limitation of this formula: it assumes a workload dominated by short queries. If your workload includes long-running analytical queries (5+ seconds), those queries hold backend connections for their entire duration, effectively reducing the pool available for transactional queries. In such cases, either use a separate pool for analytical queries or consider routing them to a read replica.

The benchmark: pooled vs unpooled

ScenarioAvg latencyBackend connectionsThroughput
No pooler (new connection per query)14.2ms200 backend70 qps
No pooler (persistent connections)1.1ms200 backend900 qps
PgBouncer (transaction mode, pool=20)1.2ms20 backend870 qps
PgBouncer (transaction mode, pool=50)1.1ms50 backend910 qps
PgBouncer (session mode, pool=20)1.1ms20 backend910 qps

The striking result: PgBouncer with 20 backend connections achieves 97% of the throughput of 200 persistent connections. The 180 eliminated connections freed approximately 1.8GB of RAM and reduced OS scheduling overhead — all without meaningful sacrifice in query performance.

The session-mode result is included to demonstrate that the pooling mode itself adds negligible latency. The difference between session mode and transaction mode is not performance — it is compatibility. Choose transaction mode for the pooling efficiency; choose session mode only when you need session-state compatibility.

The new-connection-per-query scenario is included for completeness. It is 13x slower and represents the worst possible approach. If your application is doing this — and serverless functions without connection reuse often do — a pooler is not optional. It is urgent.

Monitoring your pool: the numbers that matter

A connection pooler you cannot observe is a connection pooler you cannot trust. PgBouncer exposes its state through SHOW commands on its admin interface. These are the numbers I check first when diagnosing pool-related issues:

-- Essential PgBouncer monitoring queries via SHOW commands:

-- Current pool state:
SHOW POOLS;
-- database | user | cl_active | cl_waiting | sv_active | sv_idle
-- mydb     | app  |    47     |     0      |     8     |   12

-- cl_active:  clients executing queries (47 app connections active)
-- cl_waiting: clients waiting for a backend (0 = healthy)
-- sv_active:  backends running queries (8 of 20)
-- sv_idle:    backends available (12 of 20)
-- If cl_waiting > 0 for sustained periods, your pool is too small
-- or your queries are too slow.

-- Connection stats:
SHOW STATS;
-- total_xact_count | total_query_count | avg_xact_time | avg_query_time
-- Check avg_query_time trends. Rising averages mean slower queries,
-- which means connections are held longer, which causes pool pressure.

-- PostgreSQL side — who is connected:
SELECT usename, application_name, state, count(*)
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY usename, application_name, state
ORDER BY count DESC;

The single most important metric is cl_waiting — clients waiting for a backend connection. If this number is consistently above zero, your pool is saturated. But before you increase the pool size, ask why. The most common cause of pool saturation is not insufficient pool size — it is slow queries holding connections for too long. A query that takes 2 seconds holds a backend connection for 2,000ms. During that time, no other client can use it. Fix the slow query and the pool pressure disappears without changing any pool configuration.

This is, in my experience, the most important operational insight about connection pooling: pool saturation is usually a symptom of slow queries, not small pools. Increasing the pool size treats the symptom. Optimizing the query treats the cause.

"Pool saturation is usually a symptom, not a cause. Before you increase the pool size, ask what is holding the connections open for so long. The answer is almost always a slow query."

— from You Don't Need Redis, Chapter 17: Sorting Out the Connection Poolers

Serverless and connection pooling: a particular urgency

Serverless functions — AWS Lambda, Vercel Edge Functions, Cloudflare Workers — present the connection pooling problem in its most acute form. Each function invocation may create a new connection. Cold starts mean new TCP and TLS handshakes. And the autoscaling that makes serverless attractive is precisely what makes it dangerous to an unpooled database: a traffic spike that scales your functions from 10 to 500 simultaneously creates 500 new PostgreSQL connections.

I have seen this bring down production databases. The failure mode is not graceful. PostgreSQL hits max_connections, begins rejecting new connections, and every function invocation fails simultaneously. The autoscaler, observing failures, may scale up further — creating more functions that try to connect, fail, and trigger more scaling. A feedback loop that resolves only when someone manually intervenes.

The solution is an external pooler between your serverless functions and PostgreSQL. Every major cloud PostgreSQL provider now offers managed connection pooling for this reason. Supabase uses Supavisor. Neon offers a serverless WebSocket proxy. AWS RDS Proxy was built specifically for the Lambda-to-RDS use case.

The honest caveat: adding a pooler between serverless functions and the database introduces another network hop — typically 0.5-2ms of additional latency per query. For most applications this is invisible. For latency-critical paths already measured in single-digit milliseconds, it is worth measuring. The alternative — catastrophic connection exhaustion during traffic spikes — is almost always worse than 1ms of additional latency.

When you do not need an external pooler

I have spent considerable effort advocating for connection pooling, so it would be a disservice not to mention the cases where an external pooler is unnecessary overhead.

Single-process applications with bounded concurrency. If your application is a single process — a desktop tool, a CLI, a cron job — with a known maximum of 5-10 concurrent database operations, your application-level pool is sufficient. There is no consolidation benefit from an external pooler because there is nothing to consolidate.

Connection counts well below the optimal threshold. If your total application connections across all processes is 15 and your server can optimally handle 20, an external pooler adds complexity without material benefit. You are already within the optimal range.

Workloads dominated by long-running sessions. If your application opens a connection, uses it continuously for minutes (streaming replication, long ETL jobs, interactive analytics sessions), session-mode pooling provides no benefit and transaction mode would interfere with your session state. Direct connections with a sufficiently high max_connections may be simpler.

These cases are real, but they are the minority. Most production web applications — multiple processes, variable concurrency, short transactional queries — benefit substantially from an external pooler.

Pooling and optimization: complementary, not competing

Connection pooling and query optimization solve different problems. The pooler ensures efficient transport — connections are cheap, reused, and limited to what PostgreSQL can handle. Query optimization ensures the queries themselves are fast — proper indexes for slow filters, materialized views for repeated aggregations, batching for N+1 patterns.

The relationship between the two is closer than it first appears. A slow query does not merely waste time — it holds a pooled connection for the duration of its execution. A query that takes 2 seconds occupies a backend connection for 2 seconds. In a pool of 20 connections, that single slow query consumes 5% of your total database capacity for its entire duration. Fix the query to run in 20ms, and that backend connection is held for 1% of the time, available for other work for the remaining 99%.

This means connection pool sizing and query performance are not independent variables. Faster queries need fewer concurrent connections, which means smaller pools, which means less PostgreSQL resource consumption, which means faster queries. The virtuous cycle runs in both directions.

Gold Lapel operates at the same architectural position as a connection pooler — between the application and the database — but addresses the query side rather than the connection side. In practice, many deployments use both: PgBouncer (or pgcat) for connection management, Gold Lapel for query optimization. The pooler ensures your application can reach the database efficiently. Gold Lapel ensures what happens after arrival is efficient too. The two are complementary, not competing.

Frequently asked questions

Terms referenced in this article

If the question of which pooler to deploy interests you beyond the comparison table above, I have prepared a dedicated comparison of PostgreSQL poolers — PgBouncer, pgcat, Odyssey, and the newer entrants — with architecture diagrams, failure mode analysis, and honest assessments of where each falls short.