
Connection Pool Exhaustion in Vercel Fluid Compute: Why Your Timeouts Never Fire

Your idle timeout is configured correctly. It simply never executes. Permit me to explain.

The Waiter of Gold Lapel · Published Mar 5, 2026 · Updated Mar 20, 2026 · 20 min read
The illustration is frozen mid-render. It will resume when connections become available.

Good evening. Your connections are leaking, and it is not your fault.

If you are running a Next.js application on Vercel with a PostgreSQL database — Neon, Supabase, RDS, or self-hosted — and you have recently noticed "too many connections" errors that appear without warning and vanish just as mysteriously, you are experiencing one of the more elegant failure modes in modern serverless infrastructure.

I say "elegant" because there is nothing wrong with your code. Your pg.Pool configuration is textbook. Your idleTimeoutMillis is set to a reasonable value. Your max connections are conservative. You followed the documentation. You may have even read a blog post recommending exactly these settings.

Everything is correct. And everything is leaking.

The issue is that Vercel Fluid Compute — Vercel's own documentation describes the model, and it is worth understanding thoroughly — suspends your function between invocations, and when a Node.js process is suspended, setTimeout stops counting. Your idle timeout is configured to fire after 10 seconds. It will fire after 10 seconds of active execution — which, in a serverless function that handles one request every few minutes, might take hours of wall-clock time.

In the meantime, the connection sits open. PostgreSQL is holding a backend process for it. Your connection limit is being consumed by ghosts — connections that are neither in use nor being cleaned up, existing in a liminal state between "idle" and "dead" that no monitoring tool on the client side can observe.

I find this behaviour — creating connections with a mechanism for closing them that is then prevented from operating — to be the infrastructural equivalent of setting a kitchen timer and then unplugging the clock. The timer is not broken. The kitchen is frozen in time.

How does Vercel Fluid Compute actually work?

To understand why this happens, one must understand what Fluid Compute actually does — not the marketing summary, but the execution model. The distinction matters, because the symptoms only make sense once you see the mechanism clearly.

Traditional serverless — what Vercel calls "Standard" functions — boots a new instance per invocation and tears it down when the response is sent. Each invocation is isolated. Connection pools are useless because nothing persists between requests. Every request pays the cold start tax: process initialization, module loading, connection establishment. For a Next.js API route connecting to PostgreSQL with TLS, that cold start alone can add 50-200ms to every request.

Fluid Compute, which became Vercel's default execution model, takes a fundamentally different approach. The function instance persists between invocations. Global variables survive. Connection pools survive. In-memory caches work. Module-level initialization runs once. This is excellent for performance — subsequent requests skip the cold start entirely, and a warm connection pool means queries execute with sub-millisecond acquisition latency instead of 10-20ms of TCP and TLS negotiation.

The catch is in the word "persists." The function instance persists, but it does not run continuously. Between invocations, Vercel suspends the function. The Node.js event loop is frozen. Not "idle" — frozen. No timers fire. No I/O callbacks execute. No keepalive packets are sent. No garbage collection runs. The process is, from its own perspective, paused in time. A nanosecond or an hour — the function cannot tell the difference.

When the next request arrives, the function is thawed. The event loop resumes exactly where it stopped. Timers pick up where they left off — not from where wall-clock time says they should be, but from where they were when the event loop was frozen. A setTimeout(fn, 10000) that had 9,900ms remaining when the function was suspended still has 9,900ms remaining on thaw: it fires only after a further 9,900ms of active runtime, regardless of whether the suspension lasted 5 seconds or 5 hours.

But the outside world has moved on. TCP connections may have been closed by the server. PostgreSQL may have reaped idle backends. Neon may have scaled the compute endpoint to zero. The pool's internal state says "10 healthy connections." Reality says otherwise.
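If you cannot drain the pool before suspension, one client-side mitigation is to judge idleness by wall-clock time at the moment a connection is checked out, rather than trusting a timer. A minimal sketch (the helper names are mine, not part of pg's API):

```javascript
// Sketch: wall-clock staleness check at checkout, instead of a timer.
// Unlike idleTimeoutMillis, this logic runs on demand, at the moment a
// request actually needs a connection, so a suspension between requests
// cannot prevent it from executing. Names are illustrative, not pg API.

function isStale(lastUsedAtMs, maxIdleMs, nowMs = Date.now()) {
  // Wall-clock idle time keeps advancing while the process is frozen,
  // even though setTimeout callbacks do not.
  return nowMs - lastUsedAtMs > maxIdleMs;
}

class LastUseTracker {
  constructor(maxIdleMs) {
    this.maxIdleMs = maxIdleMs;
    this.lastUsed = new Map(); // connection -> last-use timestamp (ms)
  }
  touch(conn, nowMs = Date.now()) {
    this.lastUsed.set(conn, nowMs);
  }
  // True when the connection should be destroyed and replaced
  // rather than reused for the incoming query.
  shouldDiscard(conn, nowMs = Date.now()) {
    const last = this.lastUsed.get(conn);
    return last === undefined || isStale(last, this.maxIdleMs, nowMs);
  }
}
```

On checkout, a connection that fails this check is destroyed and replaced, trading one extra handshake for avoiding a "Connection terminated unexpectedly" error.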

The anatomy of a leaked connection

Allow me to walk through the timeline precisely, because the failure mode is subtle and the symptoms are intermittent — which makes it particularly unpleasant to debug.

Anatomy of a leaked connection
Timeline of a leaked connection in Vercel Fluid Compute:

 t=0ms     Function invoked, connection checked out from pool
 t=12ms    Query executes, connection returned to pool (idle)
 t=12ms    idleTimeoutMillis timer starts: fire in 10,000ms
 t=50ms    Response sent to client
 t=100ms   No more work — Vercel suspends the function
           ─── event loop frozen ───
           The 10,000ms timer is paused at 9,900ms remaining.
           The connection sits open. PostgreSQL sees it as active.
           ─── minutes pass ───
 t=180s    PostgreSQL's idle_session_timeout fires (if set; Postgres 14+)
           OR: the connection stays open until the server's TCP keepalive
           finally kills it (default: ~2 hours on most cloud providers)
 t=???     Next request arrives. Vercel thaws the function.
           The timer resumes with 9,900ms remaining.
           But the connection is already dead server-side.
           Pool doesn't know. Next query: "Connection terminated unexpectedly."

The critical detail: idleTimeoutMillis uses setTimeout internally. When Vercel freezes the event loop, setTimeout does not advance. A 10-second timer set at t=12ms will not fire until the function has accumulated 10 seconds of active runtime — which could be dozens or hundreds of invocations later, depending on traffic patterns.
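To put a number on "dozens or hundreds of invocations": a frozen timer advances only while the event loop is active, so it fires only once enough slices of active runtime have accumulated. A toy calculation, with illustrative figures:

```javascript
// Toy model: a setTimeout under Fluid Compute advances only during
// active runtime. If each invocation keeps the event loop busy for
// roughly activeMsPerInvocation, a timer of timerMs fires only after
// about this many invocations.
function invocationsUntilTimerFires(timerMs, activeMsPerInvocation) {
  return Math.ceil(timerMs / activeMsPerInvocation);
}
```

With the numbers above (a 10,000ms idle timeout and roughly 100ms of active runtime per request), the "10 second" timer needs about 100 invocations to fire.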

During low-traffic periods — nights, weekends, after a deploy with no immediate traffic — the function may be suspended for minutes or hours. Every connection in the pool sits open, consuming a PostgreSQL backend slot, completely invisible to the pool's health-check logic.

With a managed database like Neon (default limit: 100 connections — their connection pooling documentation is worth consulting for the specifics of your plan) or Supabase (depending on plan), it takes remarkably few function instances to exhaust the limit entirely. Five instances with max: 20 is 100 connections — and none of them will be released by idleTimeoutMillis until active traffic resumes.

It is not only setTimeout. It is everything.

I should be thorough here, because the frozen-timer problem extends beyond idleTimeoutMillis. Every timer-based mechanism in the Node.js connection lifecycle stops working under suspension.

TCP keepalive under container suspension
// What happens to TCP keepalive under Fluid Compute:
//
// pg.Pool default: keepAlive = true (Node.js socket option)
// Node.js sets SO_KEEPALIVE on the TCP socket.
// keepAliveInitialDelayMillis sets the delay before the FIRST probe
// (default: 0, i.e. the OS default); the probe interval is OS-controlled.
//
// But keepalive probes are sent by the OS network stack,
// which IS running even when the Node.js event loop is frozen.
//
// ...right?
//
// Wrong. Vercel Fluid Compute doesn't just pause the event loop.
// The entire container is suspended. The OS scheduler does not run.
// No keepalive probes are sent. The TCP connection goes silent.
//
// PostgreSQL sees: a connection that stopped sending data.
// If tcp_keepalives_idle is set (default varies by provider):
//   - Neon: aggressive — may close within 60-300s
//   - Supabase: depends on plan and pooler config
//   - RDS: default Linux tcp_keepalive_time = 7200s (2 hours)
//   - Self-hosted: whatever you configured (or didn't)
//
// The connection is dead. The pool doesn't know.
// This is not a Node.js problem. It is a container suspension problem.

TCP keepalive probes are sent by the operating system's network stack, which you might reasonably expect to continue operating independently of the Node.js event loop. Under traditional process suspension (e.g., SIGSTOP), you would be correct: the kernel continues sending probes on the process's behalf. But Vercel suspends the entire container, not just the Node.js process, and the kernel timers that drive keepalive probes for that container's sockets are frozen along with it. No probes are sent. The connection appears healthy from the client's perspective and appears dead from the server's.

Pool health checks — the pg library does not have an active health-check loop, but some ORMs (Sequelize, TypeORM) implement periodic connection validation via setInterval. These intervals freeze with everything else.

Tarn.js reaper — Knex's pool library runs a periodic reaper on setInterval to evict idle connections. Under suspension, the reaper sleeps alongside everything else. When it wakes, it checks connections that have been idle for idleTimeoutMillis of active runtime, not wall-clock time. Connections that have been idle for hours appear to have been idle for milliseconds.

The pattern is consistent: any mechanism that relies on the passage of time within the Node.js process is unreliable under Fluid Compute. Time does not pass when you are frozen. This is not a bug in Vercel — it is a fundamental property of process suspension that happens to interact poorly with connection lifecycle management.
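The one signal suspension cannot hide is a wall-clock discontinuity. A heartbeat whose ticks are compared against Date.now() will see an oversized gap on the first tick after a thaw, which is a reasonable trigger for invalidating pooled connections. A sketch (the helper names are my own):

```javascript
// Sketch: inferring a suspension from wall-clock gaps between heartbeat
// ticks. setInterval freezes with the event loop, but Date.now() reads
// real time once the process thaws, so the first tick after a thaw sees
// a gap far larger than the configured interval.

function gapLooksLikeSuspension(prevTickMs, thisTickMs, intervalMs, slackMs = 5000) {
  return thisTickMs - prevTickMs > intervalMs + slackMs;
}

function startSuspensionMonitor(intervalMs = 1000, onSuspected = () => {}) {
  let prev = Date.now();
  const timer = setInterval(() => {
    const now = Date.now();
    if (gapLooksLikeSuspension(prev, now, intervalMs)) {
      onSuspected(now - prev); // e.g. mark pooled connections as suspect here
    }
    prev = now;
  }, intervalMs);
  timer.unref?.(); // don't keep an otherwise-idle process alive for this
  return () => clearInterval(timer);
}
```

Note the pleasant irony: the monitor's own interval freezes during suspension, but that is fine. The callback only needs to run when the function is active again, which is precisely when the pool is about to be used.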

How to observe the leak in production

Before fixing the problem, you should be able to see it. The leak is invisible from the client side — your application has no way of knowing its connections are dead until it tries to use one. But PostgreSQL can see everything.

Querying pg_stat_activity for leaked connections
-- Diagnosing leaked connections from pg_stat_activity:
SELECT
  pid,
  usename,
  application_name,
  client_addr,
  state,
  state_change,
  now() - state_change AS idle_duration,
  query
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND state = 'idle'
ORDER BY idle_duration DESC;

-- What to look for:
-- idle_duration > 5 minutes from Vercel function IPs = leaked connections
-- Multiple connections from same client_addr, all idle = frozen pool
--
-- Expected output during a leak:
--  pid  | usename | state | idle_duration |  query
-- ------+---------+-------+---------------+---------------------------
--  1234 | app     | idle  | 00:47:23      | SELECT id FROM users ...
--  1235 | app     | idle  | 00:47:23      | SELECT * FROM orders ...
--  1236 | app     | idle  | 00:32:11      | SELECT count(*) FROM ...
--
-- These connections were returned to the pool inside the function,
-- but the function was suspended before idleTimeoutMillis fired.
-- PostgreSQL is holding 3 backend processes for connections
-- that will never send another query until the function thaws.

If you run this query against your production database and see idle connections from Vercel's IP ranges with idle_duration in the tens of minutes, you are looking at leaked connections from suspended function instances. The query column shows the last query that ran on each connection — useful for identifying which function or API route is responsible.

For Neon users: the Neon dashboard shows active connections in the "Monitoring" tab. If your connection count stays elevated during low-traffic periods (nights, weekends) rather than dropping to near-zero, your connections are leaking. Neon's auto-suspend feature for the compute endpoint will eventually kill these connections, but only after the idle timeout for the entire compute instance fires — which is measured in minutes, not the seconds your pool was configured for.

For Supabase users: the "Database" section in the dashboard shows connection counts. Supabase also provides Supavisor as a built-in pooler, which mitigates the thundering herd problem but does not fix the frozen-timer issue on the client side.

Server-side timeouts: the safety net you should already have

Before we discuss the proper fix, I want to address an immediate mitigation that many deployments are missing entirely. PostgreSQL has server-side timeout mechanisms that fire regardless of what the client is doing — or not doing.

PostgreSQL server-side timeout configuration
-- Server-side timeout configuration as a safety net:
-- These fire regardless of client-side suspension.

-- Close idle connections after 5 minutes (adjust to your traffic pattern):
ALTER SYSTEM SET idle_session_timeout = '300s';       -- Postgres 14+

-- Close idle-in-transaction connections after 30 seconds:
ALTER SYSTEM SET idle_in_transaction_session_timeout = '30s';

-- TCP keepalive settings (detect dead connections faster):
ALTER SYSTEM SET tcp_keepalives_idle = 60;     -- start probing after 60s
ALTER SYSTEM SET tcp_keepalives_interval = 10; -- probe every 10s
ALTER SYSTEM SET tcp_keepalives_count = 3;     -- give up after 3 probes
-- Total detection time: 60 + (10 × 3) = 90 seconds

SELECT pg_reload_conf();  -- apply without restart

-- Trade-off: aggressive timeouts will also kill legitimate
-- long-idle connections (e.g., admin sessions, monitoring).
-- Set per-role overrides if needed:
ALTER ROLE app_user SET idle_session_timeout = '300s';
ALTER ROLE admin_user SET idle_session_timeout = '0';  -- never timeout

These settings act as a safety net. Even if your client-side pool fails to clean up connections (because it is frozen, or because the function crashed, or because the network dropped), PostgreSQL will eventually reclaim the backend slots.

I should note the trade-off: aggressive idle_session_timeout values will also disconnect legitimate long-running sessions — your psql terminal, your monitoring connections, your migration tooling. Use per-role overrides to exempt administrative users. The goal is to be aggressive with application connections and lenient with everything else.

This is a safety net, not a solution. Setting idle_session_timeout = '300s' means leaked connections survive for up to 5 minutes before being reclaimed. During a traffic spike followed by a lull, that is 5 minutes of wasted backend slots. It limits the damage. It does not prevent it.

Vercel's fix: attachDatabasePool

Vercel recognized this problem and shipped the attachDatabasePool API in the @vercel/functions package. It is, to their credit, a clean solution to the frozen-timer problem specifically.

// Vercel's attachDatabasePool API (2025)
// Registers the pool with the runtime so Vercel can
// drain connections before suspending the function.
import { Pool } from 'pg';
import { attachDatabasePool } from '@vercel/functions';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 5,
  idleTimeoutMillis: 0,   // disable — it will never fire reliably
});

// Tell Vercel about the pool.
// On suspend: Vercel calls pool.end() before freezing.
// On thaw: a new pool is created on next invocation.
attachDatabasePool(pool);

export default async function handler(req, res) {
  const { rows } = await pool.query('SELECT * FROM orders WHERE status = $1', ['pending']);
  res.json(rows);
}

When you register a pool with attachDatabasePool, Vercel's runtime hooks into the suspend lifecycle. Before freezing the function, it calls pool.end(), which gracefully closes all connections and drains the pool. When the function is thawed on the next request, the pool lazily re-establishes connections as needed.

What happens during suspend and thaw
// What attachDatabasePool actually does under the hood:
//
// 1. Vercel's runtime maintains a registry of pools
// 2. Before suspending the function instance:
//    - Calls pool.end() on each registered pool
//    - pool.end() sends Terminate messages to PostgreSQL
//    - Waits for graceful disconnect (with a short timeout)
//    - Then freezes the container
//
// 3. On thaw (next request arrives):
//    - The pool object still exists in memory (global scope)
//    - But its internal connections array is empty
//    - First pool.query() triggers lazy connection creation
//    - New TCP + TLS handshake to PostgreSQL
//
// Cost of thaw reconnection:
//   Without TLS:  3-5ms per connection
//   With TLS:     10-20ms per connection (typical for cloud DBs)
//   With pgBouncer/proxy in same region: 1-3ms
//
// If your function handles bursty traffic (5 requests, then silence),
// you pay this reconnection cost after every silence period.
// With a proxy in the same region, the cost drops to near-zero.

This is a meaningful improvement. No more ghost connections. No more stale pool state. The pool starts clean on every thaw.

But it requires three things:

  1. Your ORM or query builder must expose the underlying pg.Pool instance. If the pool is internal and inaccessible (Prisma, Sequelize, TypeORM), attachDatabasePool cannot help.
  2. You must register every pool. If you have a read replica pool and a primary pool, both must be registered. If you initialize a pool in a module that is lazily imported, it must be registered before the first suspend — which may happen before that module is ever imported.
  3. You must accept the reconnection latency. After every suspend period, requests pay the cost of re-establishing connections: 3-20ms per connection depending on TLS configuration and network distance. Because connections are re-created lazily, a single request after a cold period pays for one handshake; a concurrent burst pays one handshake per request, up to the pool's max.

The honest counterpoint

I should be forthcoming about the limitations, because attachDatabasePool is frequently presented as "the fix" when it is more accurately "a fix for one of the two problems."

attachDatabasePool solves the frozen-timer problem. It does not solve the thundering-herd problem. During a deployment, all function instances are replaced simultaneously. Each new instance creates a fresh pool with zero registered connections. attachDatabasePool has nothing to drain because nothing has been created yet. The burst of new connections from dozens of simultaneously-booting instances hits PostgreSQL unmitigated.

It also introduces a new failure mode, though a minor one: if the pool.end() call hangs (because a connection is stuck in a long-running query), Vercel must either wait (adding latency to the suspend) or force-close (potentially interrupting the query). In practice, Vercel applies a short timeout to the drain, but this means connections in the middle of long queries may be terminated ungracefully.

For ORMs that manage their own connection pools internally — Prisma, Sequelize, TypeORM — attachDatabasePool cannot help at all. There is no pool to hand it. This is where a server-side connection pooler becomes not just helpful but necessary.

"The connection pooler landscape has expanded from a single incumbent into a competitive field of purpose-built tools, each making different architectural trade-offs."

— from You Don't Need Redis, Chapter 17: Sorting Out the Connection Poolers

Correct patterns by ORM

Each ORM handles connection pooling differently, which means each requires a different approach to Fluid Compute compatibility. I have tested these against Vercel's 2025-2026 runtime behavior and verified the patterns against each library's source code.

pg (node-postgres)

The default pattern — leaks connections under Fluid Compute
// The setup that works perfectly in development
// and leaks connections in Vercel Fluid Compute.
import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 10,
  idleTimeoutMillis: 10000,   // close idle connections after 10s
  connectionTimeoutMillis: 5000,
});

export default async function handler(req, res) {
  const { rows } = await pool.query('SELECT id, name FROM users WHERE active = true');
  res.json(rows);
  // Connection returns to the pool.
  // idleTimeoutMillis will close it after 10s of inactivity.
  // ...unless the function gets suspended before then.
}

This is the configuration you will find in most tutorials and most production codebases. It is correct for traditional server environments and correct for Vercel Standard functions (where the pool is destroyed after each invocation anyway). Under Fluid Compute, it leaks.

The fix is straightforward: register the pool with attachDatabasePool and disable the idle timeout entirely. The idle timeout will never fire reliably, so it should not be relied upon — and its presence creates false confidence that connections are being managed when they are not.

Drizzle ORM

// Drizzle ORM — correct pattern for Vercel Fluid Compute
import { drizzle } from 'drizzle-orm/node-postgres';
import { Pool } from 'pg';
import { attachDatabasePool } from '@vercel/functions';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 5,
  idleTimeoutMillis: 0,   // disable idle timeout entirely
});

attachDatabasePool(pool);

const db = drizzle(pool);

// Usage in API route:
export async function GET() {
  const users = await db.select().from(usersTable).where(eq(usersTable.active, true));
  return Response.json(users);
}

Drizzle wraps pg.Pool without hiding it, so the fix is identical to raw pg. Create the pool, pass it to both drizzle() and attachDatabasePool(). Drizzle's architectural decision to not abstract the pool away from you is, in this particular context, an advantage. You have full control over the pool lifecycle because Drizzle never took it from you.

Prisma

// Prisma — correct pattern for Vercel Fluid Compute
// Prisma manages its own connection pool internally.
// The fix: set connection_limit and pool_timeout in the URL.
//
// prisma/.env:
// DATABASE_URL="postgresql://user:pass@host:5432/db?connection_limit=5&pool_timeout=10"
//
// Prisma does NOT expose its pool to attachDatabasePool.
// Instead, rely on Prisma Accelerate (their own proxy) for serverless:

import { PrismaClient } from '@prisma/client';

// For Vercel: use Prisma Accelerate connection string
const prisma = new PrismaClient({
  datasourceUrl: process.env.PRISMA_ACCELERATE_URL,
});

// Accelerate handles the connection lifecycle server-side.
// Your function never holds a direct Postgres connection.

Prisma's connection pool is internal and inaccessible. The PrismaClient manages its own pool through the Prisma engine binary — a separate process that handles connection management. You cannot pass this pool to attachDatabasePool. There is no public API to access it.

The recommended path is Prisma Accelerate — the Prisma team documents it in their Accelerate guide — which moves the connection pool to a managed proxy. Your function never holds a direct PostgreSQL connection — it connects to Accelerate's edge endpoint, and Accelerate maintains the persistent pool to your database.

Alternative: Prisma with an external pooler
// If you cannot use Prisma Accelerate (cost, latency requirements,
// or preference for self-managed infrastructure), the alternative
// is an external pooler between Prisma and PostgreSQL:
//
// prisma/.env:
// DATABASE_URL="postgresql://user:pass@pgbouncer-host:6432/db?connection_limit=5&pgbouncer=true"
//
// The pgbouncer=true flag tells Prisma to:
//   - Use simple protocol instead of extended protocol
//   - Avoid named prepared statements (incompatible with transaction pooling)
//   - Disable implicit transactions for single queries
//
// This works with PgBouncer, pgcat, Supavisor, or Gold Lapel.
//
// The connection_limit still matters — it controls how many connections
// THIS Prisma instance opens to the pooler. Set it low (3-5).

If Prisma Accelerate is not an option — whether due to cost, latency requirements, data residency concerns, or a preference for self-managed infrastructure — the alternative is any server-side pooler between Prisma and PostgreSQL. The pgbouncer=true flag in the connection string is essential: it adjusts Prisma's query protocol to be compatible with transaction-mode pooling.

Knex.js

// Knex.js — correct pattern for Vercel Fluid Compute
import Knex from 'knex';
import { attachDatabasePool } from '@vercel/functions';

const knex = Knex({
  client: 'pg',
  connection: process.env.DATABASE_URL,
  pool: {
    min: 0,          // allow pool to empty completely
    max: 5,
    idleTimeoutMillis: 0,
    reapIntervalMillis: 0, // disable the reaper — it uses setInterval
  },
});

// Knex uses tarn.js for pooling, which wraps pg.Pool.
// Access the underlying pool for attachDatabasePool:
attachDatabasePool(knex.client.pool);

// Usage:
const orders = await knex('orders').where('status', 'pending').limit(100);

Knex uses tarn.js for pool management, which has its own reaper interval (reapIntervalMillis) running on setInterval. This is a second frozen timer that will also fail to fire under Fluid Compute. You must disable both idleTimeoutMillis and reapIntervalMillis.

Accessing the underlying pool for attachDatabasePool requires reaching into knex.client.pool, which is the tarn.js pool instance. This is not officially documented by Knex but is stable across versions and is the approach Vercel's own documentation recommends.

Sequelize

// Sequelize — the uncomfortable truth for Fluid Compute
import { Sequelize } from 'sequelize';

const sequelize = new Sequelize(process.env.DATABASE_URL, {
  dialect: 'postgres',
  pool: {
    max: 5,
    min: 0,
    acquire: 10000,
    idle: 0,           // disable idle eviction (frozen timers)
    evict: 0,          // disable periodic eviction (frozen setInterval)
  },
  dialectOptions: {
    // If using a pooler with transaction-mode:
    statement_timeout: 30000,
    idle_in_transaction_session_timeout: 30000,
  },
});

// Sequelize's pool is NOT compatible with attachDatabasePool.
// The internal pool (sequelize-pool) does not expose end() in
// the way Vercel expects.
//
// Your options:
//   1. Use a server-side pooler (PgBouncer, Gold Lapel) — recommended
//   2. Call sequelize.close() in a beforeSuspend hook (if Vercel adds one)
//   3. Accept leaked connections and set aggressive server-side timeouts
//
// Option 1 is the only reliable choice today.

Sequelize presents the most difficult case. Its internal pool (sequelize-pool) does not expose the interface that attachDatabasePool expects. There is no clean way to hook into Vercel's suspend lifecycle from Sequelize's API.

If you are using Sequelize on Vercel Fluid Compute, a server-side pooler is not optional — it is required. This is the only ORM in this comparison where I would say the proxy is strictly necessary rather than merely recommended.

TypeORM

TypeORM's situation mirrors Sequelize's. The internal pool is not compatible with attachDatabasePool, and there is no clean workaround. Use a server-side pooler. I will spare you the code example, as the configuration is identical to Sequelize: disable idle eviction, set min: 0, and rely on the proxy for connection lifecycle management.

Summary table

| ORM / Driver | Fluid Compute fix | Server-side pooler | Notes |
|---|---|---|---|
| pg (node-postgres) | attachDatabasePool(pool) + idleTimeoutMillis: 0 | Optional but recommended | Most control. Set max: 5 or lower. |
| Drizzle | Same as pg — Drizzle wraps pg.Pool | Optional but recommended | Pass pool to both drizzle() and attachDatabasePool(). |
| Prisma | Prisma Accelerate or external pooler | Strongly recommended | No pool access for attachDatabasePool. Use a proxy. |
| Knex | attachDatabasePool(knex.client.pool) + idleTimeoutMillis: 0 | Optional but recommended | Also disable reapIntervalMillis (tarn.js reaper). |
| Sequelize | External pooler only | Required | Internal pool incompatible with attachDatabasePool. |
| TypeORM | External pooler only | Required | Internal pool not compatible with Vercel lifecycle hooks. |

The pattern is clear: if your ORM exposes the pool, use attachDatabasePool and disable idle timeouts. If it does not, use a server-side pooler. In either case, idle timeouts should be disabled. They are not merely unreliable under Fluid Compute — they are actively misleading, because they create the appearance of connection management without actually providing it.

The thundering herd on deploy

There is a second connection exhaustion vector that attachDatabasePool does not address, and it occurs precisely when you are least able to debug it: during deployment.

// The thundering herd on deploy:
// 1. You push a new deployment to Vercel
// 2. All existing function instances are replaced
// 3. New instances boot cold — each creates a fresh pool
// 4. If 50 instances boot simultaneously, each with max=10:
//    → 500 new connections hit PostgreSQL at once
//
// PostgreSQL default max_connections = 100.
// You are now 5x over limit.
//
// Mitigation: use a server-side connection pooler.
//
// With PgBouncer / Gold Lapel between Vercel and Postgres:
//   50 instances × 10 pool connections = 500 client connections
//   PgBouncer holds 20 actual Postgres connections
//   500 clients multiplex onto 20 backends
//   PostgreSQL sees 20 connections. It is content.

When you push a new deployment, Vercel replaces all existing function instances with fresh ones. Each new instance initializes its connection pool from scratch. If traffic is high — or if a preview deployment triggers a burst of synthetic health checks — dozens of instances may boot simultaneously, each establishing max connections to PostgreSQL.

With max: 10 and 30 simultaneous instances, that is 300 connection requests hitting PostgreSQL in the span of a few hundred milliseconds. If PostgreSQL's max_connections is set to the typical cloud-provider default of 100, the first 100 connections succeed and the remaining 200 receive FATAL: too many connections for role.

The arithmetic of connection exhaustion
// Real-world thundering herd scenarios:
//
// Scenario A: Small Next.js app on Vercel Pro
//   Typical concurrent instances: 5-15
//   Pool max per instance: 10 (common default)
//   Peak connection demand: 150
//   Neon free tier max_connections: 100
//   Result: FATAL errors during deploy ✗
//
// Scenario B: Medium traffic API, Vercel Enterprise
//   Typical concurrent instances: 30-80
//   Pool max per instance: 10
//   Peak connection demand: 800
//   Supabase Pro max_connections: 200
//   Result: Sustained connection failures during deploy ✗
//
// Scenario C: Same medium traffic API, with server-side pooler
//   Client connections to pooler: 800
//   Pooler upstream to Postgres: 25
//   Postgres max_connections: 30
//   Result: No errors. Pooler queues briefly during burst. ✓
//
// The math always wins. Client-side pooling multiplies.
// Server-side pooling divides.

This is the thundering herd problem, and no amount of client-side configuration can solve it. You cannot coordinate between function instances — they do not know each other exist. You cannot predict how many instances Vercel will spawn — that is determined by traffic patterns and Vercel's autoscaling algorithm. You cannot rate-limit connection creation without adding latency to every request.
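If client-side pooling is all you have, the only safe lever is pessimistic sizing: divide the server's usable connection budget by the worst-case instance count you can imagine. A rule-of-thumb sketch (the function is mine; superuser_reserved_connections defaults to 3 in stock PostgreSQL):

```javascript
// Rule of thumb: size each instance's pool from the server's usable
// budget divided by a pessimistic instance count. `reserved` models
// superuser_reserved_connections (default 3 in stock PostgreSQL),
// which ordinary roles cannot use.
function maxPoolPerInstance({ maxConnections, peakInstances, reserved = 3 }) {
  return Math.max(1, Math.floor((maxConnections - reserved) / peakInstances));
}
```

With max_connections = 100 and a pessimistic 30 instances, that yields max: 3 per instance, far below the max: 10 defaults that produce the failures above. It is a stopgap, not a fix: instance counts under autoscaling are ultimately unbounded.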

You need something between the clients and the database that absorbs the burst: a connection pooler. I have prepared a PostgreSQL connection pooling guide that walks through the available options and their trade-offs.

The Neon and Supabase question

Both Neon and Supabase, the two most common managed PostgreSQL providers used with Vercel, provide built-in connection poolers. This is worth discussing specifically because the answer to "do I need an additional pooler?" depends on which provider you are using and how you are connecting.

Neon pooler configuration for Vercel
// Neon-specific considerations for Vercel Fluid Compute:
//
// Neon provides a built-in connection pooler (PgBouncer-based).
// Your Neon dashboard shows two connection strings:
//   1. Direct:  postgresql://user:pass@ep-cool-name.us-east-2.aws.neon.tech/mydb
//   2. Pooled:  postgresql://user:pass@ep-cool-name-pooler.us-east-2.aws.neon.tech/mydb
//
// For Vercel Fluid Compute, ALWAYS use the pooled endpoint.
//
// But: Neon's pooler runs in transaction mode by default.
// This means:
//   - No SET commands that persist across queries
//   - No LISTEN/NOTIFY
//   - No session-level prepared statements
//   - No advisory locks (session-level)
//
// If your ORM uses prepared statements (most do by default),
// you need to disable them:
//
// node-postgres (it uses unnamed prepared statements unless you pass
// a `name` in the query config — avoid named statements behind a
// transaction-mode pooler):
import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL, // pooled endpoint
  max: 5,
  idleTimeoutMillis: 0, // disabled — frozen timers make it unreliable anyway
});

// Drizzle + Neon pooler:
// No special config needed — Drizzle uses simple protocol by default.

// Prisma + Neon pooler:
// Add ?pgbouncer=true to DATABASE_URL in prisma/.env

Neon provides a PgBouncer-based pooler, reached through the pooled connection string shown in your dashboard. This pooler runs in transaction mode, which handles the thundering herd problem and limits backend connection count. However, the frozen-timer problem still applies to the client-side pool inside your function. You should still set idleTimeoutMillis: 0 and use attachDatabasePool.

Neon also has an additional layer of complexity: the compute endpoint itself can scale to zero. If your database has been idle and Neon has suspended the compute, the first connection after suspension must wake the compute — adding 500ms-2s of latency. This is separate from the Fluid Compute suspension issue and cannot be mitigated by attachDatabasePool. If this latency is unacceptable, disable Neon's auto-suspend or use their "always on" compute option.
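If disabling auto-suspend is not an option, one pragmatic mitigation is to retry the first query after a short backoff, so the wake-up latency is absorbed rather than surfaced as an error. A minimal sketch — `withRetry` is a hypothetical helper of my own, not part of pg or Neon's SDK:

```javascript
// Hypothetical retry helper for absorbing Neon compute wake-up latency.
// Wraps any async operation and retries it with exponential backoff.
async function withRetry(fn, { retries = 3, baseDelayMs = 250 } = {}) {
  let lastErr;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt === retries) break;
      // 250ms, 500ms, 1000ms... roughly spans a 500ms-2s wake-up window
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastErr;
}

// Usage, assuming `pool` is a pg Pool on the pooled endpoint:
// const { rows } = await withRetry(() => pool.query('SELECT 1'));
```

Keep the retry budget small — two or three attempts is enough to ride out a compute wake, and anything beyond that is masking a real outage.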

Supabase provides Supavisor, an Elixir-based pooler that supports both transaction and session modes. Like Neon's pooler, it handles server-side connection management and thundering herd absorption. The same client-side recommendations apply: disable idle timeouts, use attachDatabasePool where possible, and keep max low.
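Which Supavisor mode you get is selected by port — at the time of writing, Supabase documents port 6543 for transaction mode and port 5432 for session mode on the pooler host. A small illustrative helper for switching between them (the hostname in the usage note is a placeholder, not your real project's):

```javascript
// Rewrite a Supabase pooler URL to select Supavisor's mode by port.
// Port conventions per Supabase's documentation at time of writing:
//   6543 = transaction mode, 5432 = session mode.
function supavisorUrl(baseUrl, mode) {
  const url = new URL(baseUrl);
  url.port = mode === 'transaction' ? '6543' : '5432';
  return url.toString();
}

// Use transaction mode for short-lived serverless queries; switch to
// session mode only when you need LISTEN/NOTIFY or advisory locks:
// const txUrl = supavisorUrl(process.env.SUPABASE_POOLER_URL, 'transaction');
```

The point of centralizing this in one function is that the mode choice becomes explicit in code review, rather than buried in an environment variable nobody remembers setting.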

In both cases, the built-in pooler is sufficient for most applications. You do not need an additional external pooler unless you need features the built-in pooler does not provide — such as query-level optimization, cross-region routing, or connection management across multiple database providers.

Why a server-side proxy resolves both problems

The frozen-timer problem and the thundering-herd problem share a root cause: the connection lifecycle is being managed on the client side, in an environment where the client is unreliable. Vercel functions suspend unpredictably. They scale unpredictably. They are replaced without notice. Asking them to manage long-lived database connections is like asking a temporary employee to hold the keys to the building.

A server-side connection proxy — PgBouncer, pgcat, Supavisor, or Gold Lapel — moves the connection lifecycle to a stable process that does not suspend, does not scale to zero, and does not get replaced on deploy.

// With Gold Lapel as your connection endpoint
// (npm install goldlapel, then goldlapel.start() — connections route through GL):
import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.GOLDLAPEL_URL,
  max: 5,
  idleTimeoutMillis: 0,  // still disable — frozen timers still freeze
});

// Gold Lapel handles:
// - Session-mode connection pooling to upstream Postgres
// - Server-side keepalive (not dependent on client timers)
// - Connection lifecycle management when clients disconnect
// - Thundering herd absorption on deploy
//
// If the Vercel function suspends and the client connection dies,
// Gold Lapel detects it server-side and returns the upstream
// connection to the pool. No leaked connections. No stale state.

When your Vercel function connects to Gold Lapel instead of directly to PostgreSQL, several things change:

  • Frozen timers become irrelevant. Gold Lapel manages upstream connections with server-side keepalive. If a client connection goes silent because the function was suspended, Gold Lapel detects it and returns the upstream Postgres connection to the pool. No leaked connections. The detection happens on the proxy side, using timers that are not frozen because the proxy is not frozen.
  • Thundering herds are absorbed. Fifty function instances opening 5 connections each means 250 connections to Gold Lapel — which multiplexes them onto a much smaller set of actual Postgres connections. PostgreSQL sees a steady, manageable connection count regardless of how many Vercel instances are running. If all 250 connections arrive simultaneously during a deploy, Gold Lapel queues the excess and drains the queue as backend connections become available. The burst is invisible to PostgreSQL.
  • TLS handshake cost is paid once. The connection from Gold Lapel to Postgres is persistent and pre-established. Your Vercel function's connection to Gold Lapel can be a fast, regional TCP connection rather than a cross-region TLS handshake to the database. If Gold Lapel runs in the same region as your Vercel functions, the reconnection cost after a thaw drops from 10-20ms to 1-3ms.
  • Connection state is clean. Gold Lapel operates in session mode, so each client connection gets a clean upstream session. No stale prepared statements, no leaked transaction state, no SET commands from previous invocations bleeding through. When the function thaws and reconnects, it gets a pristine session — not one that was used by a previous invocation that may have set a different search_path or statement_timeout.
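To make the first point concrete, here is the shape of proxy-side idle detection in miniature. This is an illustrative sketch of the technique, not Gold Lapel's actual implementation — `markActive`, `sweepIdle`, and the registry are invented names:

```javascript
// Minimal sketch of proxy-side idle detection. The proxy records the
// last activity time for each client connection; a periodic sweep,
// running in a process whose timers never freeze, releases upstream
// connections whose clients have gone silent.
const clients = new Map(); // clientId -> { lastActiveMs, release }

function markActive(clientId, release, now = Date.now()) {
  clients.set(clientId, { lastActiveMs: now, release });
}

// Called from setInterval on the proxy, where intervals actually fire.
function sweepIdle(maxIdleMs, now = Date.now()) {
  const reaped = [];
  for (const [id, entry] of clients) {
    if (now - entry.lastActiveMs > maxIdleMs) {
      entry.release(); // return the upstream Postgres connection to the pool
      clients.delete(id);
      reaped.push(id);
    }
  }
  return reaped;
}
```

The same loop in a Vercel function would be useless — it only runs while a request is executing. Hosted in the proxy, it runs on schedule regardless of what the clients are doing.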

When a proxy is not the right answer

A waiter who overstates his case is no waiter at all. There are scenarios where a server-side proxy adds complexity without sufficient benefit:

  • Single-function applications with low traffic. If your entire application is one API route handling 10 requests per hour, the connection leak is real but manageable with attachDatabasePool alone. Adding a proxy introduces another running service, another failure point, and another line item on the bill. The juice may not be worth the squeeze.
  • Provider-managed poolers that already suffice. If you are using Neon's pooled endpoint or Supabase's Supavisor and your connection count is stable, you already have a server-side pooler. Adding another one in front of it is redundant.
  • Development and staging environments. The frozen-timer problem only manifests in Vercel's production runtime. Local development (even with vercel dev) does not suspend functions. Do not add operational complexity to environments where the problem does not exist.

The honest recommendation: if your managed database provider includes a pooler and you are using attachDatabasePool correctly, that combination handles the vast majority of cases. An external proxy becomes valuable when you outgrow the provider's pooler (connection limits, feature limitations), when you need cross-provider consistency, or when you want query-level optimization alongside connection management.

You should still disable idleTimeoutMillis on the client pool — frozen timers are frozen regardless of what sits on the other end. And attachDatabasePool is still worth using for the clean-shutdown benefit. But the catastrophic failure modes — leaked connections, exhausted limits, thundering herds — are handled by the proxy, not by timer arithmetic in a process that may be suspended at any moment.

A checklist for Vercel + Postgres deployments

For those who prefer their guidance in list form, here is what I recommend for any Next.js application on Vercel connecting to PostgreSQL:

  1. Set max to 5 or lower. Each Vercel instance should hold minimal connections. The pooler handles multiplexing. A pool of 5 provides headroom for parallel queries within a single request without consuming excessive backend slots across instances.
  2. Set idleTimeoutMillis to 0. Disable it entirely. It will not fire reliably under Fluid Compute and creates false confidence that connections are being managed. A disabled timer is honest about its limitations. An unreliable timer is not.
  3. Set min to 0. Allow the pool to empty completely. There is no benefit to maintaining minimum connections in a process that suspends. Pre-warmed connections become cold connections when the function freezes.
  4. Use attachDatabasePool if your ORM exposes the underlying pool. This handles graceful shutdown on suspend, draining connections before the container is frozen.
  5. Place a server-side pooler between Vercel and Postgres. PgBouncer, Supavisor, Neon's pooler, or Gold Lapel. This is the only reliable solution to the thundering herd. If your database provider includes a pooler, use it. If it does not, deploy one.
  6. Configure server-side timeouts in PostgreSQL. Set idle_session_timeout (available in PostgreSQL 14 and later), idle_in_transaction_session_timeout, and aggressive TCP keepalive settings. These are your safety net for connections that slip through every other mitigation.
  7. Monitor pg_stat_activity during deploys. Watch for connection spikes. If you see the count hit max_connections, your pooler is misconfigured or absent. Set up an alert for numbackends approaching 80% of max_connections.
  8. Monitor idle connection duration. Connections idle for more than 5 minutes from application IPs are almost certainly leaked. If you see them regularly, your client-side configuration is incomplete.
  9. Test under realistic suspend conditions. The Vercel CLI's local dev server does not suspend functions. These bugs only appear in production or preview deployments. If you have a staging environment on Vercel, deploy there first and watch the connection count during and after deploy.
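Items 6 and 7 of the checklist translate to concrete SQL. The values below are illustrative starting points, not prescriptions; note also that managed providers often restrict ALTER SYSTEM, in which case the same settings can usually be applied per role with ALTER ROLE ... SET:

```javascript
// Server-side safety net (checklist items 6-7), run once against Postgres:
//
//   -- Kill sessions idle longer than 5 minutes (PostgreSQL 14+):
//   ALTER SYSTEM SET idle_session_timeout = '5min';
//   -- Kill transactions left open by suspended clients:
//   ALTER SYSTEM SET idle_in_transaction_session_timeout = '30s';
//   -- Detect dead peers faster than typical OS defaults:
//   ALTER SYSTEM SET tcp_keepalives_idle = 60;
//   ALTER SYSTEM SET tcp_keepalives_interval = 10;
//   ALTER SYSTEM SET tcp_keepalives_count = 3;
//   SELECT pg_reload_conf();
//
//   -- Monitoring: backend count vs. the configured limit, during a deploy:
//   SELECT count(*) AS backends,
//          current_setting('max_connections') AS max
//   FROM pg_stat_activity;
```

Run the monitoring query in a loop while you ship a deploy; if backends approaches max, the thundering herd is reaching PostgreSQL and your pooler is not absorbing it.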

The broader pattern: serverless and stateful resources

The uncomfortable truth about Vercel Fluid Compute is that it is an excellent execution model with a non-obvious interaction with stateful connections. The function reuse is genuinely beneficial for performance. The suspension is genuinely beneficial for cost. But the combination means that any client-side timer — idle timeouts, keepalive intervals, reaper loops — becomes unreliable in a way that is silent, intermittent, and difficult to reproduce locally.

This is not unique to Vercel. AWS Lambda has the same freeze/thaw model with the same timer implications. Google Cloud Functions behaves similarly. Azure Functions in consumption plans exhibit the same pattern. Any platform that suspends processes between invocations will interact poorly with client-side connection lifecycle management.

The fix is architectural, and it is the same everywhere: move connection management to a process that does not suspend. Your application connects to the proxy. The proxy connects to the database. The proxy does not freeze, does not scale to zero, and does not lose track of time.

It is, if I may say so, the sort of arrangement that prevents unpleasant surprises at 2 AM. A household runs most smoothly when the staff who manage the keys are permanent, not seasonal. Your functions are seasonal workers — excellent at their specific task, but ill-suited to holding long-lived responsibilities. The connections to your database are long-lived responsibilities. A permanent member of staff should hold them.

In infrastructure, as in households, the best arrangements are the ones nobody notices. Connections that are managed properly are invisible. It is only when they leak, exhaust, or die unexpectedly that anyone is forced to think about them. The goal is to return to a state where you do not think about your connection lifecycle at all — not because you are ignoring it, but because it is genuinely handled.

The serverless connection problem you have just navigated is not unique to Vercel. I have written a comparison of serverless PostgreSQL platforms — Neon, Supabase, and PlanetScale — that examines how each handles connection pooling, cold starts, and the idle timeout behaviour that Fluid Compute makes so treacherous.