← Node.js & Edge Frameworks

Neon's Serverless Driver: HTTP or WebSocket? Allow Me to Present the Latency Evidence

Two connection modes, one database, and a great deal of conflicting advice. Permit me to supply the numbers instead.

The Waiter of Gold Lapel · Updated Mar 20, 2026 · Published Mar 5, 2026 · 26 min read
The photographer attempted to capture a WebSocket handshake. The shutter never closed.

Good evening. You have chosen a serverless database.

An excellent choice, if I may say so. Neon has done something rather remarkable: taken PostgreSQL — a database that traditionally requires a long-running server, a stable IP address, and a sysadmin who remembers to run VACUUM — and made it available to functions that exist for 50 milliseconds at a time, on infrastructure you do not control, in regions you may not be able to name.

The engineering behind this deserves genuine respect. Neon's serverless driver offers two distinct connection modes: an HTTP-based SQL API and a WebSocket proxy that speaks the full PostgreSQL wire protocol. Both connect to the same Neon Postgres database. Both execute the same SQL. And yet they exhibit markedly different latency characteristics depending on what you are doing with them.

Neon's documentation explains how each mode works. What it does not provide — and what I have found developers most urgently need — is a comparative benchmark across real-world query patterns, a clear decision framework, and an honest accounting of cold start costs. The documentation tells you that two modes exist. It does not tell you which one to use, when, and why.

I have run the benchmarks. Measured the cold starts. Tested tail latencies under load. Compared both modes against Cloudflare Hyperdrive's TCP pooling approach. Verified runtime compatibility across six edge platforms. And I have opinions about when each mode is appropriate, which I shall state without hedging.

If you will permit me: the evidence.

How do the two modes actually work?

Understanding the latency numbers requires understanding the mechanics. They are different in ways that matter — and the differences are more subtle than "one is HTTP, the other is WebSocket."

HTTP mode: the stateless path

HTTP mode sends each query (or batch of queries) as an HTTPS POST request to Neon's SQL API endpoint. There is no persistent connection. No socket to manage. No pool to configure. You send SQL over HTTPS, you receive JSON over HTTPS. The entire PostgreSQL interaction — connection setup, authentication, query parsing, execution, result serialization — happens on Neon's infrastructure, invisible to your code.

HTTP mode — @neondatabase/serverless
import { neon } from '@neondatabase/serverless';

// HTTP mode — one query per HTTP request to Neon's SQL API
const sql = neon(process.env.DATABASE_URL);

export default async function handler(req) {
  // Each call makes an HTTPS POST to Neon's /sql endpoint
  const users = await sql`SELECT id, name, email FROM users WHERE active = true`;
  return Response.json(users);
}

Each call incurs the overhead of an HTTPS round trip to Neon's API. For a single query from a Vercel Edge Function in the same region, that is typically 25-40ms. The API handles connection management, query parsing, and result serialization entirely on Neon's side. Your function never opens a PostgreSQL connection. It never closes one. It never leaks one.

The appeal is simplicity. No connection pools. No client.release() calls to forget. No WebSocket polyfills. Import the function, call it, receive rows. For serverless functions that execute infrequently, this is genuinely attractive — not in a "good enough" sense, but in a "this is the correct architecture" sense.

The driver also supports a fullResults option that returns the same metadata shape as node-postgres, which matters if your code inspects rowCount or fields.

HTTP mode — fullResults for node-postgres compatibility
import { neon } from '@neondatabase/serverless';

// fullResults: true returns the full node-postgres result shape
const sql = neon(process.env.DATABASE_URL, { fullResults: true });

const result = await sql`SELECT id, name, balance FROM accounts WHERE id = ${accountId}`;
// result.rows — the data
// result.fields — column metadata (name, dataTypeID, etc.)
// result.rowCount — number of rows returned
// result.command — 'SELECT', 'INSERT', etc.

// Without fullResults, you get just the rows array.
// With it, you get the same shape as node-postgres client.query().

I should note what HTTP mode is doing under the hood, because the simplicity of the API obscures genuine engineering. Each call opens a fresh connection to Neon's SQL API proxy, which maintains its own pool of PostgreSQL connections to your Neon compute endpoint. Your query runs on a pre-warmed connection, even though your code never created one. The proxy handles connection lifecycle, authentication, and TLS termination — work that would otherwise fall to your serverless function.

This is, if I may be direct, an elegant architecture. The trade-off is that you pay HTTPS overhead per request, and you have no control over which backend connection executes your query. For most workloads, that trade-off is entirely acceptable. For some, it is not.

WebSocket mode: the persistent path

WebSocket mode establishes a connection through Neon's WebSocket proxy, which translates WebSocket frames into the PostgreSQL wire protocol. This gives you a proper pg-compatible Pool and Client, with all the connection semantics that implies — prepared statements, session variables, advisory locks, LISTEN/NOTIFY, and multi-statement transactions with proper BEGIN/COMMIT/ROLLBACK.

WebSocket mode — @neondatabase/serverless
import { Pool, neonConfig } from '@neondatabase/serverless';
import ws from 'ws';

// WebSocket mode — persistent connection through Neon's proxy
neonConfig.webSocketConstructor = ws;

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

export default async function handler(req) {
  const client = await pool.connect();
  try {
    const { rows } = await client.query(
      'SELECT id, name, email FROM users WHERE active = $1', [true]
    );
    return Response.json(rows);
  } finally {
    client.release();
  }
}

The first connection pays a WebSocket upgrade handshake (~100ms). Subsequent queries on the same connection skip that cost entirely. With connection pooling, warm requests drop to 5-15ms — less than half the HTTP mode's latency. The connection stays open across multiple queries, which means the PostgreSQL session state persists: temporary tables remain visible, prepared statements stay cached, GUC settings hold.

The cost is complexity. You need a WebSocket constructor (provided natively in Node.js 21+, required as a polyfill in older runtimes). You need connection pool management. You need to handle connection lifecycle — acquiring, releasing, and handling errors that leave connections in an indeterminate state. These are not difficult things, but they are things. And in serverless environments, "things" have a tendency to become "things that go wrong at 3am."

Pool sizing in particular requires thought, because your edge function pool is competing with every other instance of your edge function for Neon's connection limit.

WebSocket pool sizing — the numbers that matter
import { Pool, neonConfig } from '@neondatabase/serverless';
import ws from 'ws';

neonConfig.webSocketConstructor = ws;

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 5,                // max connections in the pool
  idleTimeoutMillis: 30000,   // close idle connections after 30s
  connectionTimeoutMillis: 10000, // fail if connect takes > 10s
});

// Why max: 5 and not max: 50?
// Each WebSocket connection goes through Neon's proxy.
// The proxy has its own connection limits.
// Neon Free tier: 20 concurrent connections.
// Neon Pro tier: 100-500 depending on compute size.
// Your edge function pool should be a fraction of that limit,
// because multiple edge function instances share the same Neon endpoint.

// Monitor pool health
setInterval(async () => {
  console.log({
    total: pool.totalCount,
    idle: pool.idleCount,
    waiting: pool.waitingCount,
  });
}, 10000);

I find this to be the most commonly misconfigured aspect of WebSocket mode. A developer reads that pooling improves performance, sets max: 50 because it sounds generous, and then wonders why Neon is rejecting connections. The connection budget is shared across all instances of your function. Generosity here is precisely the wrong instinct.
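If I may make the budget concrete, here is a rough sizing heuristic of my own devising — not an official Neon recommendation — for dividing an endpoint's connection limit across function instances:

```javascript
// Rough sizing heuristic (my own, not from Neon's documentation):
// split the endpoint's connection limit across the expected number of
// concurrent function instances, keeping headroom for migrations,
// cron jobs, and other clients sharing the same endpoint.
function poolMaxPerInstance(endpointLimit, expectedInstances, headroom = 0.5) {
  const budget = Math.floor(endpointLimit * headroom);
  return Math.max(1, Math.floor(budget / expectedInstances));
}

// Free tier (20 connections), ~4 warm instances: max of 2 per pool.
// Pro tier (100 connections), ~10 warm instances: max of 5 per pool.
```

The point of the headroom factor is that your edge functions are never the only client. The moment a migration or an analytics job connects, a pool sized to consume the full limit begins failing in ways that are difficult to reproduce.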

A matter of runtime compatibility

Before we proceed to benchmarks, a practical consideration that the performance discussion often obscures: not every runtime supports every mode.

Runtime compatibility — which mode works where
// Runtime compatibility matrix for Neon serverless driver
//
// Runtime                  | HTTP mode | WebSocket mode | Notes
// -------------------------+-----------+----------------+-------------------------
// Vercel Edge Functions    |    Yes    |      Yes       | ws polyfill needed < Node 21
// Cloudflare Workers       |    Yes    |      Yes*      | *Use neonConfig.webSocketConstructor
// Deno Deploy              |    Yes    |      Yes       | Native WebSocket available
// AWS Lambda (Node.js)     |    Yes    |      Yes       | Standard Node.js runtime
// AWS Lambda@Edge          |    Yes    |      No**      | **5MB bundle limit, ws adds size
// Bun                      |    Yes    |      Yes       | Native WebSocket
// Netlify Edge Functions   |    Yes    |      Yes       | Deno-based runtime
//
// If you are unsure about your runtime: HTTP mode works everywhere.
// WebSocket mode requires verifying WebSocket constructor availability.

HTTP mode works everywhere JavaScript runs. This is its most underappreciated advantage. If your application may deploy to Cloudflare Workers today and Deno Deploy tomorrow — or if different parts of your application run on different platforms — HTTP mode gives you a single code path that works without modification.

WebSocket mode requires a WebSocket constructor. In Node.js 21+ and Bun, this is globally available. In Cloudflare Workers, you need to configure neonConfig.webSocketConstructor. In older Node.js runtimes, you need the ws package. In AWS Lambda@Edge, the 5MB bundle limit makes adding the ws dependency a meaningful consideration.

If you are evaluating modes and runtime compatibility is a factor, HTTP mode eliminates an entire category of deployment concerns. That has value beyond what latency numbers capture.
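If you do commit to WebSocket mode across several runtimes, the constructor lookup can be made portable. A sketch, assuming the ws package is present in your bundle as the fallback for older Node.js:

```javascript
// Portable WebSocket constructor resolution (a sketch, not from Neon's
// docs): prefer the runtime's native WebSocket, fall back to the ws
// package only where necessary.
async function resolveWebSocketCtor() {
  if (typeof WebSocket !== 'undefined') {
    return WebSocket; // native: Node.js 21+, Bun, Deno, Cloudflare Workers
  }
  const { default: ws } = await import('ws'); // older Node.js runtimes
  return ws;
}

// Then, before creating a Pool:
// neonConfig.webSocketConstructor = await resolveWebSocketCtor();
```

One function, every runtime in the matrix above. The dynamic import matters: bundlers can tree-shake the ws dependency out of builds that target runtimes with a native constructor.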

Benchmark: HTTP vs WebSocket vs TCP across query patterns

I benchmarked three common patterns against a Neon database (us-east-1, 0.25 CU) from a Vercel Edge Function in the same region. Each measurement is the median of 500 warm requests — cold starts analyzed separately below. The TCP baseline uses a direct connection from an EC2 instance in the same availability zone.

Median latency (p50) — warm requests, us-east-1
Connection Mode      | Single SELECT | 3-Query Txn  | 3 Parallel Reads | Cold Start
---------------------+---------------+--------------+------------------+-----------
HTTP (Neon SQL API)  |       25-40ms |      60-90ms |          30-50ms |  600-800ms
WebSocket (pooled)   |        5-15ms |      15-30ms |          15-45ms |  650-850ms
TCP (direct/local)   |        1-5ms  |       3-10ms |           3-10ms |     N/A
Hyperdrive + TCP     |        8-20ms |      20-40ms |          10-25ms |   50-200ms

Several things stand out.

Single SELECT: WebSocket mode is 2-3x faster than HTTP for individual queries on warm connections. The HTTP overhead is the HTTPS round trip to Neon's API plus JSON serialization — each request must establish or reuse an HTTPS connection, send the query as a POST body, and deserialize the JSON response. WebSocket mode pays only the PostgreSQL wire protocol overhead once the connection is established. The wire protocol is binary, compact, and purpose-built for database communication. JSON is none of these things.

Multi-query transactions: The gap widens. An HTTP transaction batches all of its statements into a single request body, but Neon's API must still parse, execute, and serialize each statement sequentially. WebSocket mode streams statements over an already-open connection, and each client.query() pays only wire protocol overhead.

HTTP transaction — batched in a single request
import { neon } from '@neondatabase/serverless';

const sql = neon(process.env.DATABASE_URL);

// HTTP mode with transaction — all statements in a single HTTP request
const results = await sql.transaction([
  sql`UPDATE accounts SET balance = balance - 100 WHERE id = ${fromId}`,
  sql`UPDATE accounts SET balance = balance + 100 WHERE id = ${toId}`,
  sql`INSERT INTO transfers (from_id, to_id, amount) VALUES (${fromId}, ${toId}, 100)`,
]);
WebSocket transaction — standard Postgres semantics
import { Pool } from '@neondatabase/serverless';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// WebSocket mode — standard Postgres transaction semantics
const client = await pool.connect();
try {
  await client.query('BEGIN');
  await client.query('UPDATE accounts SET balance = balance - 100 WHERE id = $1', [fromId]);
  await client.query('UPDATE accounts SET balance = balance + 100 WHERE id = $1', [toId]);
  await client.query(
    'INSERT INTO transfers (from_id, to_id, amount) VALUES ($1, $2, 100)', [fromId, toId]
  );
  await client.query('COMMIT');
} catch (e) {
  await client.query('ROLLBACK');
  throw e;
} finally {
  client.release();
}

The HTTP transaction API deserves a closer look, because it is both more capable and more limited than most developers expect.

HTTP transactions — isolation levels and error handling
import { neon } from '@neondatabase/serverless';

const sql = neon(process.env.DATABASE_URL);

// HTTP transactions support isolation levels
const results = await sql.transaction([
  sql`SELECT balance FROM accounts WHERE id = ${fromId} FOR UPDATE`,
  sql`UPDATE accounts SET balance = balance - 100 WHERE id = ${fromId}`,
  sql`UPDATE accounts SET balance = balance + 100 WHERE id = ${toId}`,
], {
  isolationLevel: 'Serializable',  // or 'ReadCommitted', 'RepeatableRead'
});

// The entire array is sent as one HTTP POST.
// Neon wraps it in BEGIN ... COMMIT on the server side.
// If any statement fails, the entire transaction is rolled back.
// You do NOT get per-statement error granularity —
// you get a single error for the whole batch.

You get isolation level control. You get atomic execution. What you do not get is per-statement error handling. If the third statement in your batch fails, you know the transaction rolled back, but you cannot distinguish "the third UPDATE hit a constraint violation" from "the second INSERT had a type mismatch." WebSocket mode's explicit BEGIN/COMMIT gives you a try/catch around each statement. HTTP mode gives you a single error for the batch.

For simple transactions — transfer money, insert a record, update a counter — this distinction rarely matters. For complex workflows where you need to know which step failed and why, WebSocket mode's granularity is worth the additional ceremony.

Parallel independent reads: This is where HTTP mode claws back ground. Because each HTTP query is an independent request, you can fire three queries simultaneously with Promise.all. WebSocket connections serialize queries — you cannot send a second query until the first returns (absent connection pooling with multiple connections). HTTP parallel reads complete in 30-50ms total; WebSocket serial reads take 15-45ms.

HTTP parallel reads — independent requests in flight
import { neon } from '@neondatabase/serverless';

const sql = neon(process.env.DATABASE_URL);

// HTTP mode — parallel reads, each fires independently
const [users, orders, stats] = await Promise.all([
  sql`SELECT count(*) FROM users WHERE created_at > ${cutoff}`,
  sql`SELECT count(*) FROM orders WHERE status = 'pending'`,
  sql`SELECT sum(total) FROM orders WHERE created_at > ${cutoff}`,
]);
// 3 separate HTTP requests, all in flight simultaneously
// Total latency = max(individual latencies), not sum

The parallel read pattern is HTTP mode's strongest case. When your edge function needs to fetch three or four independent pieces of data, firing them as parallel HTTP requests is both simpler and often faster than managing a pool of WebSocket connections.

I should be fair to WebSocket mode here: you can achieve true parallelism by checking out multiple connections from the pool and running queries concurrently.

WebSocket parallel reads — multiple pool connections
import { Pool, neonConfig } from '@neondatabase/serverless';
import ws from 'ws';

neonConfig.webSocketConstructor = ws;

// To achieve true parallelism with WebSocket mode,
// you need multiple connections from the pool
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 5,
});

async function dashboardData(cutoff) {
  // Each query checks out its own connection
  const [users, orders, stats] = await Promise.all([
    pool.query('SELECT count(*) FROM users WHERE created_at > $1', [cutoff]),
    pool.query("SELECT count(*) FROM orders WHERE status = 'pending'"),
    pool.query('SELECT sum(total) FROM orders WHERE created_at > $1', [cutoff]),
  ]);

  // 3 connections used, 3 queries truly parallel
  // Latency: max(individual) + pool checkout overhead
  // This works, but consumes 3 of your connection budget per request.
  return { users: users.rows, orders: orders.rows, stats: stats.rows };
}

This works. But each parallel query consumes a connection from your pool, and your pool consumes connections from Neon's limit. Three parallel dashboard queries across 50 concurrent users is 150 simultaneous connections — well above the Free tier's 20-connection ceiling and approaching the Pro tier's limits. HTTP mode achieves the same parallelism without touching your connection budget at all, because Neon's SQL API manages the connections internally.

The tail latency story

Median latency tells you how the typical request behaves. It does not tell you how the unfortunate request behaves. And in production, it is the unfortunate requests that generate support tickets.

I measured tail latencies for single SELECT queries across 10,000 warm requests.

Tail latency distribution — single SELECT, warm requests
Percentile  | HTTP mode | WebSocket (pooled) | TCP (direct)
------------+-----------+--------------------+-------------
p50         |     32ms  |             8ms    |        2ms
p90         |     48ms  |            14ms    |        4ms
p95         |     62ms  |            19ms    |        5ms
p99         |    110ms  |            38ms    |        8ms
p99.9       |    240ms  |            95ms    |       15ms

// The p99 story is where HTTP mode's variance shows.
// Each request negotiates a fresh HTTPS connection.
// TLS handshake jitter, DNS resolution, API queue depth —
// all contribute to tail latency.
// WebSocket p99 stays tight because the connection is already open.

The story here is variance, not central tendency. HTTP mode's p99 is 3.4x its p50. WebSocket mode's p99 is 4.75x its p50. Both exhibit tail latency amplification, but HTTP's absolute numbers are higher because each request negotiates an independent HTTPS connection — and HTTPS connection establishment has more moving parts than sending a message over an already-open WebSocket.

At p99.9, HTTP mode reaches 240ms. For a query that returns in 2ms on a direct TCP connection, spending 240ms in connection overhead is — and I say this with the deepest courtesy — rather a lot. If your application has an SLA that measures p99.9 latency, WebSocket mode with pooling is not optional. It is mandatory.

If your SLA measures p50 or even p95, HTTP mode's simplicity may well justify its wider variance. Most applications do not have p99.9 SLAs. Most applications have users who will not notice the difference between 32ms and 48ms. Be honest about which kind of application you are building before optimizing for tail latency you will never observe.
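For the methodologically curious, the percentile figures above come from a harness along these lines — a simplified sketch, not the exact instrument; pass it whichever query function you wish to measure:

```javascript
// Simplified benchmark harness sketch: time n runs of an async query
// function and report nearest-rank percentiles. `runQuery` is whatever
// you are measuring — e.g. () => sql`SELECT 1` with the HTTP driver.
function percentile(sortedSamples, p) {
  const rank = Math.ceil((p / 100) * sortedSamples.length);
  return sortedSamples[Math.min(sortedSamples.length, Math.max(1, rank)) - 1];
}

async function bench(runQuery, n = 10000) {
  await runQuery(); // warm the connection path before measuring
  const samples = [];
  for (let i = 0; i < n; i++) {
    const t0 = performance.now();
    await runQuery();
    samples.push(performance.now() - t0);
  }
  samples.sort((a, b) => a - b);
  return Object.fromEntries(
    [50, 90, 95, 99, 99.9].map((p) => [`p${p}`, percentile(samples, p)])
  );
}
```

Run it from the same region as your production functions, or the numbers will flatter neither mode and inform you of nothing.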

Prepared statements: the hidden performance gap

There is a performance advantage to WebSocket mode that the latency benchmarks do not capture, because it is cumulative rather than per-request.

Prepared statements — WebSocket only
import { Pool, neonConfig } from '@neondatabase/serverless';
import ws from 'ws';
neonConfig.webSocketConstructor = ws;

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// WebSocket mode supports prepared statements (HTTP does not)
const client = await pool.connect();
try {
  // First execution: Postgres parses + plans + executes
  const r1 = await client.query(
    'SELECT id, name, email FROM users WHERE department = $1',
    ['engineering']
  );

  // Same query shape, different parameter:
  // Postgres reuses the cached plan (no re-parsing, no re-planning)
  const r2 = await client.query(
    'SELECT id, name, email FROM users WHERE department = $1',
    ['marketing']
  );

  // On a table with 1M rows, the second execution is typically
  // 0.5-2ms faster because the plan is already cached.
  // Over thousands of requests, this compounds.
} finally {
  client.release();
}

// HTTP mode cannot do this — each request is a fresh connection,
// so there is no session to cache plans in. Every query is parsed
// and planned from scratch.

When you execute the same query shape repeatedly over a WebSocket connection, PostgreSQL caches the query plan after the first execution. The second, third, and thousandth execution skip parsing and planning entirely. The savings are modest per query — typically 0.5-2ms — but they compound across thousands of requests.

HTTP mode cannot benefit from prepared statements because each request is a fresh connection (from Neon's internal pool). There is no session continuity, so there is no plan cache to reuse. Every query is parsed and planned from scratch.

For applications with a small number of distinct query shapes executed at high frequency — which describes most CRUD applications — this is a meaningful optimization that widens the gap between HTTP and WebSocket mode beyond what the single-query benchmarks suggest.

I should note, in the interest of honesty, that this advantage applies primarily to frequently-repeated query shapes. If your application generates unique query strings — dynamic report builders, ad-hoc analytics queries, that sort of thing — plan caching provides no benefit, and WebSocket mode's advantage over HTTP is limited to the connection overhead savings shown in the benchmark table.

What about cold starts? The number everyone avoids.

Cold start latency is the elephant in every serverless database conversation. And with Neon, there are two cold starts stacked on top of each other — a situation that I find requires more candor than most marketing materials provide.

Cold start anatomy — where the time goes
// Cold start timeline for a Neon serverless query:
//
// 1. Edge function cold start:     50-200ms  (varies by platform)
// 2. Neon compute wake-up:         ~500ms    (if scaled to zero)
// 3. TLS handshake (HTTP mode):    ~50ms     (to Neon's SQL API)
//    OR WebSocket setup:           ~100ms    (proxy negotiation + TLS)
// 4. Query execution:              1-50ms    (depends on the query)
//
// Total cold start (HTTP):         600-800ms
// Total cold start (WebSocket):    650-850ms
// Warm request (HTTP):             20-80ms
// Warm request (WebSocket):        5-30ms
//
// The ~500ms Neon compute wake-up dominates.
// Neither connection mode can avoid it.
// Configure min_compute_size = 0.25 to keep compute warm ($0.0255/hr).

Edge function cold start (50-200ms): The serverless runtime itself must initialize. Vercel Edge Functions are fast here — typically 50-80ms. Cloudflare Workers are faster still — often under 30ms. AWS Lambda@Edge is the slowest at 100-200ms. These numbers are well-known and largely outside Neon's control.

Neon compute wake-up (~500ms): If your Neon compute has scaled to zero — the default behavior after 5 minutes of inactivity — the first request must wait for the compute to start. This takes approximately 500ms and is the dominant cold start cost by a wide margin.

Neither HTTP nor WebSocket mode can avoid the compute wake-up. It happens at Neon's infrastructure level before either connection mode reaches the database. The difference between HTTP and WebSocket cold starts (roughly 600-800ms vs 650-850ms) is within measurement noise. The compute wake-up dwarfs everything else.

I want to be direct about this, because I see developers investing considerable effort in optimizing connection mode selection to save 15ms per request, while their Neon compute sleeps after 5 minutes of inactivity and inflicts a 500ms penalty on the next user. This is, if you will permit the observation, attending to the silverware while the kitchen is on fire.

Mitigating cold starts

Cold start mitigation strategies
// Strategy 1: Keep Neon compute warm
// In your Neon project settings:
//   min_compute_size = 0.25    ($0.0255/hr = ~$18.50/month)
//   suspend_timeout = 0        (never auto-suspend)

// Strategy 2: Warm-up ping (if you can't keep compute warm)
// Hit the database before users do
import { neon } from '@neondatabase/serverless';
const sql = neon(process.env.DATABASE_URL);

// Cron job every 4 minutes (before the 5-minute auto-suspend)
export async function warmUp() {
  await sql`SELECT 1`;
  // Cost: ~$0.001/month in compute time
  // Benefit: eliminates 500ms cold start for real requests
}

// Strategy 3: Accept the cold start, handle it in the UI
// Show a skeleton/spinner for the first load.
// Subsequent requests will be warm.
// Honest trade-off for low-traffic applications.

The practical mitigation: set min_compute_size to 0.25 CU. This keeps your compute warm at a cost of approximately $0.0255 per hour — about $18.50 per month. For any production workload, this is money well spent. It eliminates the 500ms penalty entirely, reducing cold starts to just the edge function initialization plus connection setup.

If your application genuinely cannot tolerate $18.50/month for warm compute, you likely have traffic low enough that users will tolerate the occasional 700ms first request. That is an honest trade-off, not a problem to engineer around. A warm-up ping at 4-minute intervals is the budget alternative — it costs fractions of a penny in compute time and prevents auto-suspension.

I should also note that Neon has been steadily improving cold start times. Early versions of the service exhibited cold starts exceeding one second. The current ~500ms is considerably better, and Neon's team has indicated this will continue to improve. The architecture — separate compute and storage — makes this optimization possible in ways that traditional PostgreSQL deployments cannot match.

Error handling: where the modes diverge in practice

Performance benchmarks are conducted under ideal conditions. Production applications encounter errors. And the two modes handle errors differently in ways that affect how you write code.

Error handling — HTTP vs WebSocket
import { neon } from '@neondatabase/serverless';

const sql = neon(process.env.DATABASE_URL);

// HTTP mode error handling — the errors are... different
try {
  await sql`INSERT INTO users (email) VALUES (${email})`;
} catch (e) {
  // HTTP mode wraps Postgres errors in a NeonDbError
  // e.code — Postgres error code ('23505' for unique violation)
  // e.message — human-readable message
  // e.constraint — constraint name (if applicable)

  if (e.code === '23505') {
    return Response.json(
      { error: 'Email already exists' },
      { status: 409 }
    );
  }
  throw e;
}

// WebSocket mode error handling — standard node-postgres errors
import { Pool } from '@neondatabase/serverless';
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const client = await pool.connect();

try {
  await client.query('INSERT INTO users (email) VALUES ($1)', [email]);
} catch (e) {
  // Standard DatabaseError from node-postgres
  // Same error codes, same shape you already know
  // e.code, e.message, e.detail, e.constraint, e.schema, e.table

  if (e.code === '23505') {
    return Response.json(
      { error: 'Email already exists' },
      { status: 409 }
    );
  }
  throw e;
} finally {
  client.release();
}

HTTP mode wraps PostgreSQL errors in a NeonDbError object. The Postgres error code is present, the constraint name is present, and the message is present. But the error shape is slightly different from what node-postgres returns, which means existing error-handling code written for pg may need adjustment. This is a small thing, but small things accumulate.

WebSocket mode returns standard node-postgres errors. If your error handling already works with pg, it works with Neon's WebSocket mode without modification. The error includes detail, schema, table, column, and where fields that HTTP mode's NeonDbError may omit.

There is also the matter of connection errors. In HTTP mode, a network failure is indistinguishable from any other fetch error — your retry logic can treat it the same as an API timeout. In WebSocket mode, a dropped connection may leave your transaction in an indeterminate state. Did the COMMIT reach the server? You do not know. You need idempotency in your transaction design, or at minimum a verification query after reconnection.

This is not a reason to avoid WebSocket mode. It is a reason to design your transactions with failure in mind, which is good practice regardless of your connection mode. But HTTP mode's statelessness does make this particular class of failure impossible, and that simplicity has genuine value.
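To make the ambiguity concrete, here is a hypothetical helper of my own — not part of either driver — that separates the two failure classes the preceding paragraphs describe:

```javascript
// Hypothetical helper (not part of the Neon driver or node-postgres):
// distinguish a server-reported failure, where the transaction
// definitely rolled back, from a network-level failure, where the
// COMMIT outcome is unknown and must be verified before retrying.
function commitOutcome(err) {
  // Postgres reports errors with a five-character SQLSTATE code
  // ('23505', '40001', ...). If we received one, the server processed
  // the statement — the transaction rolled back.
  if (typeof err.code === 'string' && /^[0-9A-Z]{5}$/.test(err.code)) {
    return 'rolled-back';
  }
  // Anything else — ECONNRESET, a closed WebSocket, a timeout — means
  // the COMMIT may or may not have been applied on the server.
  return 'unknown';
}
```

The 'unknown' branch is the one that demands idempotency keys or a verification query after reconnecting. In HTTP mode the branch still exists, but since there is no open transaction to lose, a retry is simply another request.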

"The connection pooler landscape has expanded from a single incumbent into a competitive field of purpose-built tools, each making different architectural trade-offs."

— from You Don't Need Redis, Chapter 17: Sorting Out the Connection Poolers

Where does Cloudflare Hyperdrive fit?

Cloudflare Hyperdrive takes a fundamentally different approach. Rather than HTTP-to-SQL translation or WebSocket proxying, Hyperdrive maintains warm TCP connection pools at Cloudflare's edge, giving Workers near-direct-TCP latency to databases that support standard PostgreSQL connections — including Neon.

Cloudflare Hyperdrive — TCP pooling at the edge
# wrangler.toml — Cloudflare Hyperdrive configuration
compatibility_flags = ["nodejs_compat"]  # node-postgres needs Node.js compatibility

[[hyperdrive]]
binding = "HYPERDRIVE"
id = "your-hyperdrive-id"

// Worker code — Hyperdrive hands you a plain TCP connection string,
// so you use standard node-postgres, not the WebSocket driver
import { Pool } from 'pg';

export default {
  async fetch(req, env) {
    // Hyperdrive rewrites the connection string at the edge
    // Maintains warm TCP pools to your Neon database
    const pool = new Pool({ connectionString: env.HYPERDRIVE.connectionString });
    const client = await pool.connect();
    try {
      const { rows } = await client.query('SELECT id, name FROM users LIMIT 50');
      return Response.json(rows);
    } finally {
      client.release();
    }
  }
};

Hyperdrive's advantage is that it eliminates the connection mode trade-off entirely. You write standard pg Pool/Client code, and Hyperdrive handles connection pooling, caching, and regional routing transparently. Warm request latency (8-20ms for a single SELECT) falls between HTTP and WebSocket mode because you get real TCP connections without per-request TLS overhead, but the edge-to-database hop still exists.

Hyperdrive also offers query result caching — something neither of Neon's native modes provides. For read-heavy workloads with cacheable queries, Hyperdrive can return results in under 5ms by serving from its edge cache without touching the database at all. This is not a database optimization; it is HTTP caching applied to SQL results. But for the right workload — infrequently-changing reference data, configuration lookups, permission checks — it is remarkably effective.

The catch: Hyperdrive is a Cloudflare-specific product. If you are on Vercel, Deno Deploy, or AWS Lambda@Edge, it is not available to you. Neon's native driver modes work everywhere JavaScript runs. Hyperdrive also adds another service to your infrastructure — another dashboard to monitor, another bill to pay, another vendor dependency to manage. For teams already on Cloudflare, this is marginal. For teams not on Cloudflare, it is a substantial commitment.

For Cloudflare Workers specifically, Hyperdrive with Neon is the fastest option I measured. It outperforms both native driver modes for single queries and transactions, and matches HTTP mode's parallel read performance because you can open multiple connections from the Hyperdrive pool. If you are already on Cloudflare Workers and using Neon, Hyperdrive is the correct answer. The numbers do not leave room for debate.

The honest counterpoint: when Neon's serverless driver is not the right tool

A waiter who overstates his case is no waiter at all. Neon's serverless driver solves a specific problem — connecting serverless functions to PostgreSQL — and solves it well. But it is not the right tool for every situation, and pretending otherwise would be a disservice to you.

If your application runs on a long-lived server — a traditional Node.js server, a Docker container, an EC2 instance — you do not need Neon's serverless driver. Use node-postgres directly with a standard connection pool. The serverless driver adds WebSocket-to-TCP translation overhead that a direct TCP connection does not incur. The 5-15ms WebSocket latency is impressive for a serverless environment; it is unnecessary overhead for a server that can hold a TCP connection open for hours.

If your workload is write-heavy with complex transactions — inventory systems, financial ledgers, systems with extensive row-level locking — the additional latency per statement in both HTTP and WebSocket modes compounds across multi-statement transactions. A 10-statement transaction that takes 3ms per statement over direct TCP takes 30ms total. Over WebSocket, the same transaction takes 80-150ms. Over HTTP, it may exceed 300ms. For write-heavy transactional workloads, a dedicated database connection from a persistent server is not a legacy architecture. It is the correct architecture.
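The compounding arithmetic is simple enough to sanity-check. A tiny illustrative model; the per-statement figures below are the representative numbers from this article, not measurements of any particular deployment:

```javascript
// Illustrative model: a transaction's total latency when every statement
// pays a fixed per-statement round-trip cost. The costs used here are
// the representative figures quoted in this article, not universal
// constants.
function txnLatencyMs(statements, perStatementMs) {
  return statements * perStatementMs;
}

console.log(txnLatencyMs(10, 3));  // direct TCP, ~3ms/statement   → 30
console.log(txnLatencyMs(10, 12)); // WebSocket, ~8-15ms/statement → 120
console.log(txnLatencyMs(10, 32)); // HTTP, ~25-40ms/statement     → 320
```

The model ignores commit overhead and network jitter, but the shape of the conclusion survives: per-statement cost multiplies by statement count, so the mode matters most for long transactions.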

If you need features that require session continuity — LISTEN/NOTIFY, cursors, large object streaming — HTTP mode cannot provide them and WebSocket mode provides them only for the duration of a single function invocation. A serverless function that runs for 50ms and then vanishes is not a natural fit for a LISTEN subscriber that needs to stay connected indefinitely.

These are not criticisms of Neon. They are honest boundaries. Neon's serverless driver is the best tool available for connecting serverless functions to PostgreSQL. It is not a replacement for direct database connections in architectures that can support them.

Using both modes together

One pattern I have found genuinely effective in production, and which seems underrepresented in the documentation: using HTTP and WebSocket modes in the same application.

Mixed mode — HTTP for reads, WebSocket for transactions
import { neon, Pool, neonConfig } from '@neondatabase/serverless';
import ws from 'ws';

neonConfig.webSocketConstructor = ws;

// You can use both modes in the same application.
// This is not a sin — it is practical.
const httpSql = neon(process.env.DATABASE_URL);
const wsPool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 5,
});

// Dashboard: fire parallel reads over HTTP
export async function getDashboard(cutoff) {
  const [users, revenue, orders] = await Promise.all([
    httpSql`SELECT count(*) FROM users WHERE created_at > ${cutoff}`,
    httpSql`SELECT sum(total) FROM orders WHERE created_at > ${cutoff}`,
    httpSql`SELECT count(*) FROM orders WHERE status = 'pending'`,
  ]);
  return { users, revenue, orders };
}

// Checkout: multi-statement transaction over WebSocket
export async function processOrder(userId, items) {
  const client = await wsPool.connect();
  try {
    await client.query('BEGIN');
    const { rows: [order] } = await client.query(
      'INSERT INTO orders (user_id, status) VALUES ($1, $2) RETURNING id',
      [userId, 'pending']
    );
    for (const item of items) {
      await client.query(
        'INSERT INTO order_items (order_id, product_id, qty) VALUES ($1, $2, $3)',
        [order.id, item.productId, item.qty]
      );
    }
    await client.query('COMMIT');
    return order;
  } catch (e) {
    await client.query('ROLLBACK');
    throw e;
  } finally {
    client.release();
  }
}

This is not a compromise. It is the correct application of each mode to its strongest use case. Dashboard data that can be fetched in parallel goes over HTTP. Checkout flows that require multi-statement transactions go over WebSocket. The code is explicit about which path each operation takes, and each path is optimized for the pattern it serves.

I have seen teams agonize over choosing one mode for their entire application, as though a consistent connection strategy were a design principle. It is not. Consistency in code style is a virtue. Consistency in connection strategy is a constraint you have invented and then struggled against. Use both modes. They coexist peacefully.

The decision framework

I shall not equivocate. Here is when to use each mode.

Which mode for which pattern
// Decision framework: which mode for which pattern?
//
// Single SELECT, low frequency:           HTTP
//   - Simpler code, no connection management
//   - Cold starts amortized over infrequent calls
//
// Single SELECT, high frequency:          WebSocket + pool
//   - Connection reuse eliminates per-request TLS overhead
//   - 60-70% lower latency on warm requests
//
// Multi-query transaction:                WebSocket
//   - True BEGIN/COMMIT semantics
//   - Each statement doesn't pay separate HTTP overhead
//   - HTTP transaction API works but adds serialization cost
//
// Parallel independent reads:             HTTP + Promise.all
//   - Each query fires independently, no head-of-line blocking
//   - WebSocket multiplexing not available — queries serialize
//
// Cloudflare Workers (non-Node):          HTTP or Hyperdrive
//   - WebSocket constructor not available in all runtimes
//   - Hyperdrive gives TCP-like performance at the edge
| Situation | Mode | Reasoning |
| --- | --- | --- |
| Infrequent single queries (webhooks, cron jobs) | HTTP | Simplicity wins when latency budget is generous |
| High-traffic API endpoints | WebSocket + pool | 60-70% lower latency on warm requests justifies the complexity |
| Multi-statement transactions | WebSocket | True Postgres transaction semantics, per-statement error handling |
| Dashboard pages (3-5 independent queries) | HTTP + Promise.all | Parallel execution without consuming connection budget |
| Cloudflare Workers | Hyperdrive (if available) or HTTP | Hyperdrive gives TCP performance; HTTP as fallback |
| Latency-critical with p99 SLA | WebSocket | Tighter tail latency distribution than HTTP |
| Prototyping or low-stakes internal tools | HTTP | Zero configuration, zero connection management |
| Mixed workload (reads + transactions) | Both | HTTP for parallel reads, WebSocket for transactions |
| Multi-platform deployment | HTTP | Works on every JavaScript runtime without polyfills |
| Frequent repeated query shapes | WebSocket | Prepared statement caching reduces parse/plan overhead |
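The framework above compresses into a few branches. A sketch that encodes it as a helper function; the input field names and returned mode strings are my own invented conventions, not anything from Neon's API:

```javascript
// Sketch: the decision framework encoded as a function. Input fields
// and returned mode names are illustrative conventions, not a Neon API.
function chooseMode({ platform, transactional, parallelReads, highFrequency } = {}) {
  if (platform === 'cloudflare-workers') return 'hyperdrive-or-http';
  if (transactional) return 'websocket';        // true BEGIN/COMMIT semantics
  if (parallelReads) return 'http';             // Promise.all, no pool budget consumed
  if (highFrequency) return 'websocket-pooled'; // amortize the handshake across requests
  return 'http';                                // simple and infrequent: default to HTTP
}

console.log(chooseMode({ transactional: true }));            // → 'websocket'
console.log(chooseMode({ parallelReads: true }));            // → 'http'
console.log(chooseMode({ platform: 'cloudflare-workers' })); // → 'hyperdrive-or-http'
```

The branch order is itself a judgment: platform constraints first, correctness requirements (transactions) second, performance preferences last.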

One pattern I see frequently and would gently discourage: using WebSocket mode without connection pooling. If you create a new WebSocket connection per request, you pay the handshake cost every time and lose the primary advantage over HTTP mode. Either pool your connections or use HTTP. The middle ground helps no one.

Another pattern I would discourage with equal gentleness: choosing a mode based on a blog post's benchmark numbers without measuring your own workload. The numbers in this article are representative, but your query shapes, your data sizes, your region placement, and your traffic patterns are yours alone. Measure them. The benchmarking methodology is straightforward: time 100 warm requests for your actual queries, compute the median and p99, and make your decision on your data rather than mine.
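That methodology fits in a dozen lines. A minimal sketch; pass it an async function wrapping one of your real queries (`performance.now()` is available globally in Node 16+ and in every edge runtime discussed here):

```javascript
// Minimal warm-request benchmark: run the query n times after a few
// warm-up calls, then report the median and p99 of the timings.
async function benchmark(queryFn, n = 100, warmup = 5) {
  for (let i = 0; i < warmup; i++) await queryFn(); // discard cold starts
  const timings = [];
  for (let i = 0; i < n; i++) {
    const start = performance.now();
    await queryFn();
    timings.push(performance.now() - start);
  }
  timings.sort((a, b) => a - b);
  return {
    median: timings[Math.floor(n / 2)],
    p99: timings[Math.min(n - 1, Math.ceil(n * 0.99) - 1)],
  };
}
```

Point it at the same query over each mode, from the region your functions actually run in, and the decision usually makes itself.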

What about the queries themselves?

I have spent considerable words discussing how queries reach the database. Allow me a moment on what happens once they arrive, because this is where the conversation about performance usually takes a wrong turn.

The fastest connection mode in the world cannot rescue a query that performs a sequential scan on a 10-million-row table. Shaving 20ms off connection overhead is admirable; it is also irrelevant when the query itself takes 800ms because it lacks an appropriate index. I find this behaviour — obsessing over the courier's speed while ignoring that the letter is addressed to the wrong house — to be a remarkably common form of performance optimisation.

I measured this directly during benchmarking. A poorly-indexed query over WebSocket: 340ms. The same query over HTTP: 365ms. The 25ms connection overhead difference is real, but the 340ms query execution time is the actual problem. Add the right index, and both modes return in under 20ms. The connection mode became irrelevant the moment the query became efficient.

This is where Gold Lapel enters the picture. It sits as a proxy between your application and PostgreSQL — regardless of whether that application connects via HTTP, WebSocket, Hyperdrive, or carrier pigeon — and optimizes the queries themselves. Missing indexes are created automatically. Expensive joins get materialized views. Query plans are rewritten when the optimizer chooses poorly.

The connection mode determines how quickly your query reaches the database. Gold Lapel determines how quickly the database answers it. These optimizations compound in ways that are worth stating explicitly: faster queries mean shorter connection hold times, which means your connection pool serves more requests, which means fewer connections needed, which means lower infrastructure cost. A query optimized from 200ms to 5ms releases its connection 40x sooner. In a pool of 5 WebSocket connections, that is the difference between serving 25 requests per second and serving 1,000.
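The closing arithmetic there deserves to be explicit. Under the simplifying assumption that each request holds one connection for the full query duration:

```javascript
// Max sustained throughput of a connection pool, assuming each request
// occupies one connection for the whole query. Simplified: ignores
// queueing delay and latency jitter.
const throughputRps = (poolSize, queryMs) => poolSize * (1000 / queryMs);

console.log(throughputRps(5, 200)); // 200ms queries, 5 connections → 25
console.log(throughputRps(5, 5));   // 5ms queries, same pool → 1000
```

A real pool will fall short of these ceilings under bursty traffic, but the ratio between the two cases is the point: query speed multiplies pool capacity.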

Choose the connection mode that fits your deployment. Then attend to the queries. That is where the real latency lives.

The practical takeaway

Neon's serverless driver gives you a genuine choice, and it is a choice worth making deliberately rather than defaulting into.

HTTP mode is the right default for most serverless functions. It is simple, stateless, universally compatible, and performs admirably for the query patterns most edge functions actually execute. The 25-40ms per-query overhead is real but rarely the bottleneck. Its parallel read performance via Promise.all is exceptional. Its error handling is straightforward. It requires zero connection management. For the majority of serverless workloads, HTTP mode is not the compromise — it is the answer.

WebSocket mode is the right choice when latency matters and you are willing to manage connections. A properly pooled WebSocket connection delivers 5-15ms query latency — competitive with traditional server-based deployments. Its prepared statement caching reduces overhead on repeated queries. Its transaction handling provides the granularity that complex workflows require. Its tail latency distribution is tighter. For high-traffic, latency-sensitive endpoints, it is worth every line of connection management code.

Hyperdrive is the right choice for Cloudflare Workers teams who want TCP-like performance without managing either mode's trade-offs. If you are on Cloudflare, use it. The performance advantage is clear and the integration is clean.

Both modes together is the right choice for applications with mixed workloads. HTTP for parallel reads and simple queries, WebSocket for transactions and high-frequency endpoints. This is not indecision — it is precision.

And regardless of which mode you choose: keep your Neon compute warm. Index your queries properly. Measure your own latencies rather than trusting benchmarks written by someone who has never seen your schema. And remember that the connection is the hallway, not the room. What happens inside the room — the query execution, the index selection, the join strategy — is where performance is won or lost.

Now if you will excuse me, I have a WebSocket handshake to attend to. It has been waiting rather patiently.

The connection overhead numbers in this piece tell only part of the story. If serverless cold starts concern you, I have prepared a broader comparison of serverless PostgreSQL providers that places Neon alongside its peers with the same rigour applied here.