CTE (Common Table Expression)
A well-organized query is a courtesy to everyone who reads it after you. CTEs give your SQL proper structure — and since PostgreSQL 12, they cost you nothing for the privilege.
A Common Table Expression (CTE) is a named temporary result set defined with the WITH keyword. It exists only for the duration of a single query. In PostgreSQL 12 and later, simple CTEs are inlined into the main query and optimized normally by the planner — no performance penalty, no compromise. Use MATERIALIZED when you want a CTE to execute on its own terms, and WITH RECURSIVE for hierarchical and graph traversal queries.
What a CTE is
A CTE is a named subquery introduced by the WITH keyword. You define it once at the top of a statement, then reference it by name in the main query — as many times as you need. Think of it as labeling your work before presenting it. Clear instructions produce clear results.
-- A simple CTE: named subquery with WITH
SELECT department, avg_salary
FROM (
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
) sub
WHERE avg_salary > 80000;
-- Same query using a CTE — easier to read
WITH department_salaries AS (
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
)
SELECT department, avg_salary
FROM department_salaries
WHERE avg_salary > 80000; The two queries above produce the same result. The CTE version gives the intermediate result set a name — department_salaries — which makes the query's intent visible at a glance. For simple queries, this is a modest improvement. For queries with three or four levels of nesting, it is the difference between a query you can reason about and one you must simply trust.
You can define multiple CTEs in a single WITH clause, separated by commas. Each CTE can reference any CTE defined before it, which lets you build up a query in stages.
-- Multiple CTEs, each referencing earlier ones
WITH active_orders AS (
SELECT customer_id, order_id, total
FROM orders
WHERE status = 'active'
),
customer_totals AS (
SELECT customer_id, SUM(total) AS lifetime_total, COUNT(*) AS order_count
FROM active_orders
GROUP BY customer_id
)
SELECT c.name, ct.lifetime_total, ct.order_count
FROM customers c
JOIN customer_totals ct ON c.id = ct.customer_id
WHERE ct.lifetime_total > 1000
ORDER BY ct.lifetime_total DESC; This compositional style — small, named steps that feed into each other — is the main reason CTEs exist. The SQL is longer than a nested subquery, but it reads top to bottom instead of inside out. A well-run query, like a well-run household, follows a clear sequence of operations.
CTEs before and after PostgreSQL 12
Before PostgreSQL 12, CTEs were optimization fences. The planner materialized every CTE into a temporary result set, executed it completely, then fed those results into the outer query. This meant the planner could not push WHERE clauses from the outer query into the CTE, could not reorder joins across the CTE boundary, and could not eliminate columns the outer query did not need.
This was a deliberate design choice — it guaranteed the CTE executed exactly once — but it meant that using a CTE could make a query significantly slower than the equivalent subquery. For years, developers on older PostgreSQL versions learned to avoid CTEs in performance-sensitive code, choosing between readability and speed. An unfortunate trade-off.
PostgreSQL 12 resolved this rather decisively. Simple, non-recursive CTEs that are referenced only once are now inlined into the outer query automatically. The planner treats them as if you had written a subquery, applying all its normal optimizations — predicate pushdown, join reordering, column pruning. You may now have your readability and your performance. As it should be.
Two keywords give you explicit control over this behavior — and I appreciate that PostgreSQL gives you the choice rather than making it for you:
MATERIALIZED— forces the CTE to execute separately, as in pre-12 behaviorNOT MATERIALIZED— forces inlining, even when the CTE is referenced more than once
-- PostgreSQL 12+: force the CTE to execute separately
WITH expensive_scan AS MATERIALIZED (
SELECT id, compute_score(data) AS score
FROM large_table
)
SELECT * FROM expensive_scan WHERE score > 0.9;
-- Force inlining (the default for simple CTEs, but explicit)
WITH filtered AS NOT MATERIALIZED (
SELECT * FROM orders WHERE status = 'active'
)
SELECT * FROM filtered WHERE total > 100; If a CTE is referenced multiple times and you do not specify either keyword, the planner decides on your behalf — materializing to avoid redundant work, or inlining if it estimates that is cheaper. For CTEs referenced only once, inlining is the default. But a deliberate MATERIALIZED or NOT MATERIALIZED declaration tells the planner exactly what you intend, and I do respect a query that states its intentions clearly.
Recursive CTEs
A recursive CTE uses WITH RECURSIVE and contains two parts joined by UNION ALL (or UNION): a base case that seeds the result, and a recursive case that references the CTE itself. PostgreSQL evaluates the base case first, then repeatedly evaluates the recursive case — using the rows produced in the previous iteration — until no new rows are generated. It is, if you will, the database following a chain of command down through the household.
-- Recursive CTE: traverse an employee hierarchy
WITH RECURSIVE org_chart AS (
-- Base case: start from the CEO (no manager)
SELECT id, name, manager_id, 1 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: find each employee's direct reports
SELECT e.id, e.name, e.manager_id, oc.depth + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT depth, name, manager_id
FROM org_chart
ORDER BY depth, name; This is the standard approach for querying hierarchical data: organizational charts, category trees, bill-of-materials structures, threaded comments. Without recursive CTEs, these queries require either multiple round trips from the application — the infrastructural equivalent of sending a separate messenger to each floor of the house — or PostgreSQL-specific extensions like ltree.
Recursive CTEs also handle graph traversal, where you need to find all nodes reachable from a starting point. Cycle detection requires explicit handling — PostgreSQL does not prevent infinite loops automatically, and an unguarded recursive CTE will keep walking until it runs out of memory. Mind the cycles.
-- Find all reachable nodes in a directed graph
WITH RECURSIVE reachable AS (
SELECT end_node AS node, ARRAY[start_node, end_node] AS path
FROM edges
WHERE start_node = 'A'
UNION
SELECT e.end_node, r.path || e.end_node
FROM edges e
JOIN reachable r ON e.start_node = r.node
WHERE e.end_node <> ALL(r.path) -- prevent cycles
)
SELECT DISTINCT node, path FROM reachable; Recursive CTEs are always materialized. The planner cannot inline them because each iteration depends on the output of the previous one.
When to use CTEs
Allow me to be specific about when CTEs earn their place in a query:
- Readability — when a query has multiple levels of nesting, breaking it into named CTEs makes it dramatically easier to read and maintain
- Recursive queries — hierarchies and graph traversals cannot be expressed with plain subqueries;
WITH RECURSIVEis the only declarative option - Reusing a subquery — when the same derived table appears in multiple places in a query, a CTE lets you define it once and reference it by name
- Data-modifying statements — you can use
INSERT,UPDATE, orDELETEinside a CTE and capture the affected rows withRETURNINGfor use in the outer query - Staging complex logic — building a query in sequential, named steps makes it easier to debug because you can run each CTE independently
On PostgreSQL 12+, there is no performance reason to avoid CTEs for readability. They compile to the same plan as the equivalent subquery. Clarity at no cost — one of the finer things PostgreSQL offers.
Performance considerations
If I may be direct: the most important thing to know about CTE performance is which version of PostgreSQL you are running.
On PostgreSQL 11 and earlier, every CTE is an optimization fence. If you have a CTE that selects from a million-row table and the outer query filters it down to 10 rows, the planner cannot push that filter into the CTE. All million rows are materialized first. On these versions, rewrite performance-sensitive CTEs as subqueries — or, if I may suggest, consider whether it is time to upgrade.
On PostgreSQL 12+, simple CTEs are inlined and carry no penalty. But there are cases where MATERIALIZED is the better choice:
- Volatile functions — if a CTE calls
random(),now(), or any volatile function, inlining means the function may be evaluated multiple times with different results.MATERIALIZEDensures it runs once. - Side effects — data-modifying CTEs (
INSERT/UPDATE/DELETE) are always materialized regardless, but being explicit with the keyword makes the intent clear. - Expensive computations referenced multiple times — if a CTE is expensive to compute and referenced in several places,
MATERIALIZEDprevents re-evaluation. Without it,NOT MATERIALIZED(or the planner's default choice to inline) could cause the work to be done repeatedly.
Use EXPLAIN ANALYZE to see whether the planner inlined or materialized a CTE. A materialized CTE appears as a "CTE Scan" node in the plan; an inlined CTE disappears entirely — its logic folded into the surrounding plan nodes. If you cannot find your CTE in the plan, that is good news. It means the planner has absorbed it completely.
How Gold Lapel relates
Gold Lapel sits at the proxy level, between your application and PostgreSQL. Your CTEs pass through untouched — we do not rewrite them, strip them, or alter their semantics. The query that reaches PostgreSQL is the query your application sent. I would consider it rather presumptuous to rearrange your instructions without asking.
Where Gold Lapel earns its place is in the analysis layer. When it identifies a slow query that uses CTEs, the diagnostic information reflects the actual execution plan — including whether the planner chose to inline or materialize each CTE. If a materialized CTE is the bottleneck, you will see precisely that in the plan breakdown, alongside the time it consumed.