
Prepared statement

Parse once, execute many times. A well-run household does not re-read the same instructions to the staff each morning.

Concept · March 21, 2026 · 9 min read

A prepared statement is a query that has been parsed and analyzed once, then stored so it can be executed repeatedly with different parameter values. Rather than sending raw SQL each time — repeating yourself, as it were — the application sends only the parameter values. PostgreSQL skips parsing, and potentially skips planning, reusing the work it already did. Most database drivers handle this automatically through the extended query protocol. The main thing to understand is how PostgreSQL decides whether to reuse a query plan or build a fresh one, because that decision can go wrong when your data is not evenly distributed.

What a prepared statement is

When PostgreSQL receives a SQL query, it goes through several phases: parse (check syntax and resolve names), rewrite (apply rules and views), plan (choose indexes, join strategies, scan methods), and execute (do the actual work). For a one-off query, all four phases run every time. One finds this rather wasteful when the same query arrives thousands of times a minute.

A prepared statement separates these phases. You prepare the statement once — PostgreSQL parses and validates it, noting where parameters go — and then execute it as many times as you need with different values. The parsing work happens once, not once per execution. Standing instructions, rather than repeating yourself to the staff each time.

At the SQL level, this looks like PREPARE and EXECUTE:

SQL
-- Prepare a named statement
PREPARE user_by_email(text) AS
  SELECT id, name, created_at
  FROM users
  WHERE email = $1;

-- Execute it with a specific value
EXECUTE user_by_email('alice@example.com');

-- Execute again with a different value — no re-parsing
EXECUTE user_by_email('bob@example.com');

-- Deallocate when done (or it's freed when the session ends)
DEALLOCATE user_by_email;

In practice, most applications never write PREPARE directly. Database drivers use PostgreSQL's extended query protocol, which splits communication into Parse, Bind, and Execute messages. When you pass parameters separately from the query text — the way every modern driver encourages — many drivers handle preparation behind the scenes, though not all: some, such as psycopg2, bind parameters on the client side instead. You may already be benefiting from this without realizing it.

Application code
# Python (psycopg2) — parameters are bound client-side;
# psycopg2 does not use server-side prepared statements by default
cursor.execute(
    "SELECT id, name FROM users WHERE email = %s",
    ("alice@example.com",)
)

// Java (JDBC) — PreparedStatement is explicit
PreparedStatement ps = conn.prepareStatement(
    "SELECT id, name FROM users WHERE email = ?"
);
ps.setString(1, "alice@example.com");
ResultSet rs = ps.executeQuery();

// Node.js (node-postgres) — pass a name in the query config
// to create a named server-side prepared statement
const result = await client.query({
    name: 'user_by_email',
    text: 'SELECT id, name FROM users WHERE email = $1',
    values: ['alice@example.com']
});

The benefit is twofold. First, parsing overhead is eliminated on repeated executions. For simple queries on a hot path that runs thousands of times per second, the savings are not trivial. Second, because parameter values are never interpolated into the SQL text, prepared statements are the primary defense against SQL injection — a matter of security, not convenience.

Why prepared statements matter

Avoiding repeated work. Parsing and planning are not free. For a complex query with multiple joins and subqueries, planning alone can take several milliseconds. If that query runs 10,000 times per minute, you are asking PostgreSQL to re-read instructions it has already understood. Eliminating that redundancy saves real CPU time.

SQL injection prevention. When parameters are sent separately from the SQL structure, there is no way for a parameter value to alter the query's logic. The statement SELECT * FROM users WHERE id = $1 will always be a lookup by id, regardless of what value $1 contains. This is not a secondary benefit — it is the reason every security guide recommends parameterized queries.
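To make the contrast concrete, here is a small, self-contained sketch; the interpolate helper below is hypothetical, standing in for any naive string formatting, and is precisely what parameterized queries avoid:

```python
# Hypothetical helper standing in for naive string interpolation:
# the value is spliced directly into the SQL text.
def interpolate(template: str, value: str) -> str:
    return template.replace("%s", "'" + value + "'")

malicious = "x' OR '1'='1"
unsafe = interpolate("SELECT * FROM users WHERE email = %s", malicious)

# The value has rewritten the query's structure: the WHERE clause
# is now always true.
print(unsafe)
# SELECT * FROM users WHERE email = 'x' OR '1'='1'

# With a parameterized query, the driver sends the template and the
# value as separate protocol messages; the SQL text never changes,
# so the value remains data and cannot alter the query's logic.
safe_template = "SELECT * FROM users WHERE email = $1"
safe_params = [malicious]  # still just a value, never part of the SQL
```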

Plan reuse. After enough executions, PostgreSQL may settle on a generic plan that works for any parameter value, skipping the planning phase entirely. For queries that run frequently with varying parameters, this saves the planning cost on every subsequent execution. A rather clever optimization — though, as we shall see, cleverness has its edge cases.

Custom plans vs generic plans

This is where prepared statements become genuinely interesting — and where I should draw your attention, because most of the trouble lives here.

When you first execute a prepared statement, PostgreSQL generates a custom plan — a plan built using the actual parameter values you provided. The planner knows the specific value, can look up statistics for it, and can choose the best strategy for that particular execution.

After the first 5 executions, PostgreSQL considers switching to a generic plan — a plan generated without knowing the parameter values. If the average cost of the custom plans so far is not significantly better than the generic plan's estimated cost, PostgreSQL starts using the generic plan for all subsequent executions. This saves the cost of replanning every time.
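The decision can be sketched in a few lines; this is a deliberate simplification of PostgreSQL's actual plan-cache logic, reduced to the cost comparison just described:

```python
# Simplified model of the plan_cache_mode = 'auto' decision
# (assumption: real plancache logic has more nuance than this).
def choose_generic_plan(custom_costs, generic_cost, replan_cost=0.0):
    """Return True once the generic plan looks competitive.

    PostgreSQL builds custom plans for the first five executions,
    then compares the generic plan's estimated cost against the
    average custom-plan cost plus the cost of replanning each time.
    """
    if len(custom_costs) < 5:
        return False  # still in the initial custom-plan phase
    avg_custom = sum(custom_costs) / len(custom_costs) + replan_cost
    return generic_cost <= avg_custom

# After five executions costing ~100 each, a generic plan estimated
# at 105 wins once you account for a replanning cost of 10:
print(choose_generic_plan([100] * 5, generic_cost=105, replan_cost=10))
# True
```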

The plan_cache_mode setting (PostgreSQL 12+) controls this behavior:

SQL
-- Check the current plan_cache_mode setting
SHOW plan_cache_mode;

-- Options:
--   auto (default) — use custom plans for the first 5 executions,
--                     then switch to generic if cost is comparable
--   force_custom_plan — always re-plan with actual parameter values
--   force_generic_plan — always use the generic plan

-- Force custom plans for a session with skewed data
SET plan_cache_mode = 'force_custom_plan';

For most queries, the generic plan is perfectly adequate. If a query always uses an index scan regardless of the parameter value, the generic plan will also use an index scan, and you save planning time with no downside. The problem arises when different parameter values should lead to different plans — and that brings us to skewed data, which I am afraid requires a frank discussion.

The skewed data problem

Not all values are created equal. In many real-world tables, a small number of values account for the vast majority of rows — and the planner's one-size-fits-all generic plan begins to show its seams.

SQL
-- Imagine an orders table: 95% have status = 'completed',
-- 5% have status = 'pending'

-- With a generic plan, the planner estimates average selectivity.
-- For status = 'pending' (5% of rows), an index scan is ideal.
-- For status = 'completed' (95% of rows), a sequential scan is better.

-- The generic plan picks one strategy for both cases.
-- If it chooses a sequential scan (based on average), the
-- status = 'pending' query is unnecessarily slow.

PREPARE orders_by_status(text) AS
  SELECT * FROM orders WHERE status = $1;

-- First 5 executions: PostgreSQL builds a custom plan each time
-- using the actual parameter value — gets it right.
EXECUTE orders_by_status('pending');    -- index scan
EXECUTE orders_by_status('completed');  -- seq scan

-- After 5 executions: may switch to a generic plan that
-- uses one strategy for all values.

-- Check what plan is being used
EXPLAIN EXECUTE orders_by_status('pending');

When the data distribution is heavily skewed, the optimal plan for one parameter value may be completely wrong for another. An index scan on status = 'pending' (5% of rows) is fast. A sequential scan on status = 'completed' (95% of rows) is appropriate. But the generic plan has to pick one strategy and use it for both.
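The trade-off can be illustrated with a toy cost model; the constants below are invented for the sake of the example, and the real planner's costing involves many more factors:

```python
# Back-of-the-envelope scan costs (constants are illustrative only).
def seq_scan_cost(total_rows: int) -> float:
    # A sequential scan reads every row, cheaply per row.
    return total_rows * 1.0

def index_scan_cost(total_rows: int, selectivity: float) -> float:
    # An index scan touches only matching rows, but each touch is
    # more expensive (index traversal, random I/O).
    return total_rows * selectivity * 4.0

rows = 1_000_000
for status, sel in [("pending", 0.05), ("completed", 0.95)]:
    idx, seq = index_scan_cost(rows, sel), seq_scan_cost(rows)
    best = "index scan" if idx < seq else "seq scan"
    print(f"{status}: {best}")
# pending: index scan
# completed: seq scan
```

A generic plan must commit to one of these strategies for every future parameter value, which is exactly where skew hurts.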

The symptoms are subtle, which is what makes this particular issue so vexing. The query works fine most of the time, then occasionally takes 10x longer for specific parameter values. Or it switches to a generic plan after 5 executions and regresses for a subset of inputs — no code change, no deployment, no obvious cause. The sort of thing that surfaces at 2 AM and resists easy explanation.

Remedies:

  • Set plan_cache_mode = 'force_custom_plan' for sessions or transactions that query skewed columns. This forces replanning every time, which costs CPU but guarantees the planner always sees the actual parameter value.
  • Improve statistics. Increase default_statistics_target or set a column-specific target with ALTER TABLE ... ALTER COLUMN ... SET STATISTICS. More histogram buckets give the planner a better picture of the distribution.
  • Restructure the query. If one value dominates, consider splitting it into separate queries — one for the common case, one for the rare case — so each gets its own plan.

Prepared statements and connection poolers

Prepared statements are session-scoped. They exist on a specific PostgreSQL backend connection and vanish when that connection closes. Allow me to flag a consequence of this that has caught more than a few teams off guard: it creates a well-known friction with connection poolers like PgBouncer.

In session mode, the client holds the same server connection for the entire session. Prepared statements work exactly as they would with a direct connection. No issues.

In transaction mode — the most common production configuration — the server connection is returned to the pool between transactions. The next transaction may land on a different server connection where the prepared statement does not exist. The driver believes the statement is prepared; the server has never seen it. The result is an error, and one that can be genuinely puzzling if you are not expecting it.
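A toy model of the failure mode, with class and statement names invented purely for illustration; real poolers and drivers are considerably more involved:

```python
# Each server connection keeps its own prepared-statement namespace.
class ServerConnection:
    def __init__(self, name):
        self.name = name
        self.prepared = {}  # statement name -> SQL, per connection

    def prepare(self, stmt, sql):
        self.prepared[stmt] = sql

    def execute(self, stmt):
        if stmt not in self.prepared:
            raise LookupError(
                f'prepared statement "{stmt}" does not exist on {self.name}'
            )
        return f"running {self.prepared[stmt]} on {self.name}"

conn_a, conn_b = ServerConnection("A"), ServerConnection("B")

# Transaction 1 lands on connection A and prepares the statement.
conn_a.prepare("user_by_email", "SELECT ... WHERE email = $1")
conn_a.execute("user_by_email")  # works

# Transaction 2 lands on connection B, where it was never prepared.
try:
    conn_b.execute("user_by_email")
except LookupError as e:
    print(e)
# prepared statement "user_by_email" does not exist on B
```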

Configuration
-- In PgBouncer transaction mode (pre-1.21), prepared statements
-- break because each transaction may land on a different
-- server connection. The prepared statement exists on
-- connection A, but your next transaction runs on connection B.

-- Workaround 1: Use PgBouncer 1.21+ with prepared statement
-- tracking enabled (max_prepared_statements > 0; enabled by
-- default in recent releases).

-- Workaround 2: Use session mode (but you lose multiplexing).

-- Workaround 3: Disable prepared statements in your driver.
-- PostgreSQL JDBC:
--   prepareThreshold=0
-- psycopg2:
--   No action needed — psycopg2 doesn't use server-side
--   prepared statements by default
-- node-postgres:
--   Omit the `name` field in the query config; node-postgres
--   only creates named server-side prepared statements when
--   a name is supplied

PgBouncer 1.21+ resolved this with some welcome diligence: through its max_prepared_statements setting, it tracks which statements each client has prepared and transparently re-prepares them on new server connections. Recent releases enable this tracking by default. If you are running an older PgBouncer, the practical workarounds are: upgrade (the preferred path), use session mode (sacrificing multiplexing), or disable server-side prepared statements in your driver and accept the per-query parsing cost. None of these are ideal, which is precisely why the 1.21 fix was so well received.

Inspecting prepared statements

PostgreSQL provides the pg_prepared_statements system view to see what is currently prepared in your session.

SQL
-- See all prepared statements in the current session
SELECT name, statement, parameter_types, result_types
FROM pg_prepared_statements;

-- Check how many generic vs custom plans have been used
-- (PostgreSQL 14+)
SELECT name, generic_plans, custom_plans
FROM pg_prepared_statements;

I find this view indispensable for debugging driver behavior. Connect to the database, run a few queries through your application, and check whether the driver is actually using server-side prepared statements or falling back to simple protocol messages. The answer is not always what you would expect.

How Gold Lapel relates

Gold Lapel operates as a PostgreSQL proxy and handles prepared statements at the protocol level. When your application sends Parse, Bind, and Execute messages, Gold Lapel intercepts them, tracks the prepared statement by name, and forwards the appropriate messages to PostgreSQL. If the underlying connection changes — because Gold Lapel manages its own connection pool — it re-prepares statements transparently. The PgBouncer transaction-mode headache described above simply does not arise.

Gold Lapel also uses the query patterns it observes through prepared statements to build its workload model. Because prepared statements naturally normalize queries — the same SQL template with different parameters — they map directly to the pattern analysis that drives Gold Lapel's automatic optimizations. The more your application uses parameterized queries, the cleaner the signal. And clean signal, if you will permit the observation, is the foundation of good service.

Frequently asked questions