Lateral join
A correlated subquery that has been given proper standing in the FROM clause. If you need the top-N related rows per parent row, this is the tool I would reach for first.
A lateral join uses the LATERAL keyword to let a subquery in the FROM clause reference columns from tables that appear earlier in the same FROM clause. This makes it possible to run a row-dependent subquery — one that changes its behavior based on each row of the outer table. The most common use is fetching the top-N related rows per parent row, something that is awkward or impossible with standard joins and window functions alone. It is one of PostgreSQL's more expressive tools, and I find it is underused.
What a lateral join is
In standard SQL, a subquery in the FROM clause is independent — it cannot reference other tables in the same FROM list. The LATERAL keyword lifts that restriction. A LATERAL subquery can reference any column from any table that appears before it in the FROM clause, and PostgreSQL evaluates it once per row of the preceding table.
If you are familiar with correlated subqueries in the SELECT clause, LATERAL is the same idea — but promoted to the FROM clause, where it can do considerably more. Because it lives in FROM, it can return multiple rows and multiple columns, not just a single scalar value. A correlated subquery that has been given proper standing, if you will.
-- Without LATERAL: subquery cannot reference the outer table
SELECT c.name,
(SELECT MAX(o.total) FROM orders o WHERE o.customer_id = c.id) AS max_order
FROM customers c;
-- With LATERAL: subquery in FROM can reference preceding tables
SELECT c.name, latest.total, latest.created_at
FROM customers c
JOIN LATERAL (
SELECT o.total, o.created_at
FROM orders o
WHERE o.customer_id = c.id
ORDER BY o.created_at DESC
LIMIT 1
) latest ON true; The correlated subquery in the first example can return only one value per customer. The lateral join in the second returns a full row — and could return multiple rows if you removed the LIMIT 1. The SQL expresses exactly what you mean: for each customer, look up their most recent order. I appreciate a query that states its intentions this clearly.
When to use lateral joins
Allow me to be specific about where lateral joins earn their place. They solve a particular class of problems where each row of the outer table needs its own dependent subquery. The most common patterns:
- Top-N per group — fetch the 3 most recent orders per customer, the 5 highest-rated reviews per product, the latest log entry per server. Lateral joins handle this cleanly with
ORDER BY ... LIMIT Ninside the subquery. - Set-returning functions per row — call
unnest(),generate_series(),json_array_elements(), or any other set-returning function with arguments that depend on the current row. - Replacing multiple correlated subqueries in SELECT — when you need several aggregated values from the same related table, a single lateral subquery is cleaner and often faster than multiple correlated subqueries.
- Dependent joins — any situation where the inner side of a join needs to know something about the outer row before it can execute, such as using the outer row's value as a function argument or a
LIMIT.
Practical examples
Top-N per group
This is the pattern that makes lateral joins genuinely indispensable. For each customer, fetch their top 3 orders by amount. The LIMIT 3 inside the lateral subquery applies per customer, not globally.
-- Top 3 orders per customer, by total amount
SELECT c.name, top_orders.order_id, top_orders.total, top_orders.created_at
FROM customers c
JOIN LATERAL (
SELECT o.id AS order_id, o.total, o.created_at
FROM orders o
WHERE o.customer_id = c.id
ORDER BY o.total DESC
LIMIT 3
) top_orders ON true; This pattern is difficult to replicate with window functions. A ROW_NUMBER() approach computes the window over the entire orders table before filtering — the database equivalent of reading every receipt in the filing cabinet to find the three largest. A lateral join, backed by an index, stops after N rows per group. It knows when to stop looking. That is a meaningful difference on large tables.
Set-returning functions per row
Lateral joins pair naturally with set-returning functions — each invocation depends on the current row's data, and LATERAL makes that dependency explicit in the query structure.
-- Generate a row for each tag in a comma-separated column
SELECT p.id, p.title, tag.value AS tag
FROM posts p
JOIN LATERAL unnest(string_to_array(p.tags, ',')) AS tag(value) ON true;
-- Generate date series per subscription
SELECT s.id, s.plan, billing_date.d AS invoice_date
FROM subscriptions s
JOIN LATERAL generate_series(
s.start_date,
COALESCE(s.end_date, CURRENT_DATE),
interval '1 month'
) AS billing_date(d) ON true; Replacing correlated subqueries
When a query uses multiple correlated subqueries in the SELECT clause, each one scans the related table independently. This is rather like sending a separate messenger to the same department for each piece of information. A lateral join consolidates them into a single subquery — one trip, all the answers.
-- Correlated subquery in SELECT: one value per row
SELECT c.name,
(SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) AS order_count,
(SELECT SUM(o.total) FROM orders o WHERE o.customer_id = c.id) AS total_spent
FROM customers c;
-- Rewritten with LATERAL: one subquery, multiple columns
SELECT c.name, stats.order_count, stats.total_spent
FROM customers c
JOIN LATERAL (
SELECT COUNT(*) AS order_count, SUM(o.total) AS total_spent
FROM orders o
WHERE o.customer_id = c.id
) stats ON true; The lateral version scans the orders table once per customer instead of twice. It also scales gracefully — add a third aggregate and the lateral version costs the same, while the correlated approach adds another full scan. The cost structure, in other words, stays well-managed as the query's responsibilities grow.
LEFT JOIN LATERAL
Use LEFT JOIN LATERAL when the subquery might return no rows and you want to keep the outer row with NULLs.
-- LEFT JOIN LATERAL keeps customers with no orders
SELECT c.name, latest.total, latest.created_at
FROM customers c
LEFT JOIN LATERAL (
SELECT o.total, o.created_at
FROM orders o
WHERE o.customer_id = c.id
ORDER BY o.created_at DESC
LIMIT 1
) latest ON true; Customers with no orders appear in the result with NULL for total and created_at. A plain JOIN LATERAL would drop them entirely — which is occasionally what you want, but more often it is not, and discovering the omission in production is never a pleasant experience.
Performance considerations
A lateral join is executed as a nested loop: for each row from the outer table, PostgreSQL runs the lateral subquery. This is not inherently a concern — nested loops are perfectly efficient when the inner side is fast. But if I may be direct: it becomes a concern rather quickly when the inner side is not.
The single most important optimization is indexing the inner query. The lateral subquery typically filters on a column from the outer table (e.g., WHERE o.customer_id = c.id) and may include an ORDER BY and LIMIT. An index that covers the filter and sort columns turns each inner execution into an index scan that returns a handful of rows. This is the difference between a lateral join that performs beautifully and one that does not perform at all.
-- Index that supports the inner subquery of a lateral join
CREATE INDEX idx_orders_customer_created ON orders (customer_id, created_at DESC);
-- With this index, the LATERAL subquery per customer becomes
-- an index scan returning at most N rows — fast even for
-- millions of orders. Without this index, each inner execution is a sequential scan of the entire related table. For 10,000 customers and 1 million orders, that is 10,000 sequential scans. A query that should take milliseconds takes minutes. I have seen this pattern in production more often than I would like to admit, and it is nearly always a missing index — not a flaw in the lateral join itself.
Other performance considerations:
- Outer row count matters — the inner subquery runs once per outer row. If the outer table has millions of rows and you need all of them, even an indexed inner query adds up. Filter the outer table first when possible.
- EXPLAIN ANALYZE shows the nested loop — look for a Nested Loop node with the lateral subquery as the inner side. Check that the inner side shows an Index Scan, not a Seq Scan.
- Compare with window functions — for top-N per group, a
ROW_NUMBER()approach scans the inner table once and sorts. A lateral join scans it N times (once per outer row) but reads fewer rows each time. Which is faster depends on the data: lateral wins when N is small and the outer table is small; window functions win when the inner table is small relative to the outer.
How Gold Lapel relates
Gold Lapel sits at the proxy level, between your application and PostgreSQL. Your lateral joins pass through untouched — we do not rewrite them, rearrange them, or second-guess their structure. The query that reaches PostgreSQL is the query you wrote, and I would consider it presumptuous to do otherwise.
Where Gold Lapel earns its place is in the analysis. When a query pattern using LATERAL shows up with high execution time in Gold Lapel's workload analysis, the diagnostic surfaces the nested loop structure and flags whether the inner subquery is using an index or falling back to sequential scans. If a supporting index is missing, Gold Lapel can recommend one based on the filter and sort columns observed in the lateral subquery. The lateral join is rarely the problem. The missing index nearly always is.