
Sequelize's Cartesian Explosion: When findAll with Nested Includes Brings PostgreSQL to Its Knees

Your 500-row query is returning 52,800 intermediate rows. Allow me to show you where they came from and how to send them back.

The Waiter of Gold Lapel · Updated Mar 20, 2026 · Published Mar 5, 2026 · 22 min read
Fifty-two thousand rows attempting one door. The artwork was deemed too distressing.

Good evening. I see you have nested your includes.

There is a particular moment in every Sequelize developer's career — it arrives between month three and month eight, typically — when a page that loaded in 200ms begins taking 7 seconds. Nothing changed. The code is the same. The data grew by perhaps 30%.

The culprit, nearly without exception, is a findAll with nested include clauses. What appears to be a reasonable "give me orders with their items and products" query has been silently constructing a multi-table LEFT OUTER JOIN that produces a Cartesian product of breathtaking proportions.

I do not use the word "breathtaking" as praise.

This is Sequelize's most well-documented performance trap. GitHub issue #3682 describes it. So does #4868. And #1327. Developers report 5 to 15+ second queries on datasets that would fit comfortably in a spreadsheet. The fix has existed for years. It is one option, on one line. But you have to know it exists, and more importantly, you have to understand why it exists — because applying it incorrectly introduces a different category of problems.

Allow me to introduce you to the problem, the fix, the gotchas, and the deeper question of how to stop this from happening again.

The setup: a perfectly normal schema

Consider a modest e-commerce application. Users place orders. Orders contain items. Items reference products. Products have tags. Nothing exotic — five tables, standard associations, the sort of schema you could sketch on a napkin.

Sequelize models
const { Sequelize, DataTypes } = require('sequelize');
const sequelize = new Sequelize('postgres://user:pass@localhost:5432/myapp');

const User = sequelize.define('User', {
  name: DataTypes.STRING,
  email: DataTypes.STRING,
});

const Order = sequelize.define('Order', {
  total: DataTypes.DECIMAL(10, 2),
  status: DataTypes.STRING,
});

const OrderItem = sequelize.define('OrderItem', {
  quantity: DataTypes.INTEGER,
  price: DataTypes.DECIMAL(10, 2),
});

const Product = sequelize.define('Product', {
  name: DataTypes.STRING,
  sku: DataTypes.STRING,
});

const Tag = sequelize.define('Tag', {
  label: DataTypes.STRING,
});

// Associations
User.hasMany(Order);
Order.belongsTo(User);
Order.hasMany(OrderItem);
OrderItem.belongsTo(Order);
OrderItem.belongsTo(Product);
Product.belongsToMany(Tag, { through: 'ProductTags' });
Tag.belongsToMany(Product, { through: 'ProductTags' });

The dataset is small by any standard: 500 completed orders, 2,400 order items, 800 products, 400 tags with 1,200 product-tag assignments. This is a startup on day one, not a company at scale. The entire database fits in 8 MB.

I should note: none of these associations are unusual. User.hasMany(Order) and Order.hasMany(OrderItem) are textbook one-to-many relationships. Product.belongsToMany(Tag) is a standard many-to-many through a join table. You would find this schema in any Sequelize tutorial.

The schema is not the problem. The schema is perfectly sound. The problem is what happens when you ask Sequelize to load all of it at once.

The query that seems innocent

Now, the query that brings it to its knees:

The innocent query
// "Give me all orders with their items, products, and tags."
// Seems reasonable. It is not.
const orders = await Order.findAll({
  include: [
    { model: User },
    {
      model: OrderItem,
      include: [
        {
          model: Product,
          include: [{ model: Tag }],
        },
      ],
    },
  ],
  where: { status: 'completed' },
  limit: 100,
});

Three levels of nesting. A limit: 100 to keep things reasonable. Nothing about this code suggests danger. It reads like a polite request: "give me 100 completed orders with their items, products, and tags."

It is not a polite request. It is a demand that PostgreSQL construct the Cartesian product of five tables and return the flattened result.

The developer who wrote this was not careless. The Sequelize documentation actively encourages this pattern. The eager loading guide demonstrates nested includes as the primary mechanism for loading related data. The code is doing exactly what the documentation suggests. The documentation simply does not mention what happens next.

What SQL does Sequelize actually generate?

If you add logging: console.log to your Sequelize options — and you should, at least once, for every query that matters — you will see something close to this:

Generated SQL
-- What Sequelize generates (simplified):
SELECT
  "Order"."id",
  "Order"."total",
  "Order"."status",
  "User"."id"          AS "User.id",
  "User"."name"        AS "User.name",
  "OrderItems"."id"    AS "OrderItems.id",
  "OrderItems"."quantity" AS "OrderItems.quantity",
  "OrderItems->Product"."id"   AS "OrderItems.Product.id",
  "OrderItems->Product"."name" AS "OrderItems.Product.name",
  "OrderItems->Product->Tags"."id"    AS "OrderItems.Product.Tags.id",
  "OrderItems->Product->Tags"."label" AS "OrderItems.Product.Tags.label"
FROM "Orders" AS "Order"
LEFT OUTER JOIN "Users" AS "User"
  ON "Order"."UserId" = "User"."id"
LEFT OUTER JOIN "OrderItems" AS "OrderItems"
  ON "Order"."id" = "OrderItems"."OrderId"
LEFT OUTER JOIN "Products" AS "OrderItems->Product"
  ON "OrderItems"."ProductId" = "OrderItems->Product"."id"
LEFT OUTER JOIN (
  "ProductTags" AS "OrderItems->Product->Tags->ProductTags"
  INNER JOIN "Tags" AS "OrderItems->Product->Tags"
    ON "OrderItems->Product->Tags"."id" = "OrderItems->Product->Tags->ProductTags"."TagId"
) ON "OrderItems->Product"."id" = "OrderItems->Product->Tags->ProductTags"."ProductId"
WHERE "Order"."status" = 'completed';

-- That is a 5-table LEFT OUTER JOIN with a nested INNER JOIN.
-- The LIMIT 100 you specified? It applies to the flattened result,
-- not to the number of orders.

Five tables joined with LEFT OUTER JOINs. Every row from every level of the hierarchy is flattened into a single wide result set. An order with 24 items, each item's product having 4 tags, produces 24 x 4 = 96 rows in the output — all carrying duplicate copies of the order and user data.

And notice the LIMIT 100. It applies to the flattened row count, not to orders. If your first order has 96 flattened rows, your "100 orders" query returns exactly one complete order and 4 rows of a second. Sequelize reassembles the objects in JavaScript afterward, but the database still did all the work of producing tens of thousands of rows.

If you have not seen this before, I recommend sitting with it for a moment. Every include with a hasMany or belongsToMany association adds another LEFT JOIN. Every LEFT JOIN multiplies the result set by the average number of child rows. The multiplication is compounding. It is not additive. It is multiplicative. This distinction is the entire problem.
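The distinction can be made concrete in a few lines. A small sketch using this article's own figures (every number below comes from the text; none are new measurements):

```javascript
// Row growth under JOIN-based vs. separate-query eager loading.
// The cardinalities are the article's example figures, not measurements.

// A single JOIN: each level MULTIPLIES the running row count by its
// average cardinality.
function flattenedRows(baseRows, avgCardinalities) {
  return Math.round(avgCardinalities.reduce((rows, n) => rows * n, baseRows));
}

// Separate queries: each association is fetched once, so totals ADD.
function separateRows(...perQueryRows) {
  return perQueryRows.reduce((sum, n) => sum + n, 0);
}

// 500 orders, ~24 items per order (1:N), ~4.4 tags per product (M:N).
// The 1:1 joins (User, Product) multiply by 1 and change nothing.
console.log(flattenedRows(500, [1, 24, 1, 4.4])); // 52800
// Separate queries touch 500 orders + 2,400 items + 1,200 product-tag rows:
console.log(separateRows(500, 2400, 1200));       // 4100
```

Product versus sum: the same data, two orders of magnitude apart in rows produced.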

What does EXPLAIN ANALYZE reveal?

Numbers, not narratives, settle these matters. Here is the EXPLAIN ANALYZE output from the generated query:

EXPLAIN ANALYZE output
-- EXPLAIN ANALYZE on a dataset of:
-- 500 orders, 2,400 order items, 800 products, 1,200 product-tag pairs

QUERY PLAN
─────────────────────────────────────────────────────────────────
 Limit  (cost=0.00..58924.31 rows=100 width=312)
        (actual time=0.089..6847.221 rows=100 loops=1)
   ->  Hash Left Join  (cost=412.50..58924.31 rows=52800 width=312)
              (actual time=0.088..6847.194 rows=100 loops=1)
         Hash Cond: ("OrderItems->Product"."id" = "ProductTags"."ProductId")
         ->  Hash Left Join  (cost=198.00..24816.00 rows=12000 width=248)
                   (actual time=0.062..3412.108 rows=52800 loops=1)
               Hash Cond: ("OrderItems"."ProductId" = "OrderItems->Product"."id")
               ->  Hash Left Join  (cost=112.50..8640.00 rows=12000 width=184)
                         (actual time=0.041..891.442 rows=12000 loops=1)
                     Hash Cond: ("Order"."UserId" = "User"."id")
                     ->  Hash Left Join  (cost=56.25..4320.00 rows=12000 width=120)
                               (actual time=0.028..412.891 rows=12000 loops=1)
                           Hash Cond: ("Order"."id" = "OrderItems"."OrderId")
                           ->  Seq Scan on "Orders"  (cost=0.00..18.50 rows=500 width=56)
                                     (actual time=0.008..1.204 rows=500 loops=1)
                                 Filter: (status = 'completed')
                           ->  Hash  (cost=44.00..44.00 rows=2400 width=64)
                                     (actual time=0.891..0.891 rows=2400 loops=1)
                                 ->  Seq Scan on "OrderItems"
                                           (actual time=0.004..0.412 rows=2400 loops=1)
               ->  Hash  (cost=64.00..64.00 rows=800 width=64)
                         (actual time=0.018..0.018 rows=800 loops=1)
                     ->  Seq Scan on "Products"
                               (actual time=0.003..0.234 rows=800 loops=1)
         ->  Hash  (cost=168.00..168.00 rows=1200 width=64)
                   (actual time=0.024..0.024 rows=1200 loops=1)
               ->  Hash Join  (cost=36.00..168.00 rows=1200 width=64)
                         (actual time=0.012..0.198 rows=1200 loops=1)
                     ->  Seq Scan on "Tags"  (cost=0.00..22.00 rows=400 width=32)
                     ->  Hash  (cost=18.00..18.00 rows=1200 width=32)
                           ->  Seq Scan on "ProductTags"  (rows=1200 loops=1)

 Planning Time: 2.841 ms
 Execution Time: 6851.442 ms

-- 6.8 seconds. For 500 orders.
-- Note the row count: 52,800 intermediate rows from 500 orders.
-- That is the Cartesian product: 500 orders x ~24 items x ~4.4 tags per product.
-- Each LEFT JOIN multiplies the row count by the average relation cardinality.

6.8 seconds. On 500 orders. With an 8 MB dataset.

The key number is in the Hash Left Join at the top: 52,800 intermediate rows produced from 500 base rows. That is a 105x multiplication factor. Each LEFT JOIN compounds the previous one:

Join step          | Row count | Multiplier | Explanation
Orders (base)      | 500       | -          | WHERE status = 'completed'
+ Users (1:1)      | 500       | 1x         | Each order has one user
+ OrderItems (1:N) | 12,000    | 24x        | ~24 items per order
+ Products (1:1)   | 12,000    | 1x         | Each item has one product
+ Tags (M:N)       | 52,800    | 4.4x       | ~4.4 tags per product

The 1:1 joins (User, Product) do not multiply rows. They are well-behaved guests. The 1:N join (OrderItems) is the first explosion — 500 rows become 12,000. The M:N join (Tags through ProductTags) multiplies again — 12,000 rows become 52,800. If products averaged 10 tags instead of 4.4, you would be looking at 120,000 rows.

This is a Cartesian product. Not in the formal mathematical sense — the joins have conditions — but in the practical sense that matters: every combination of parent and child rows appears in the flattened result, and the row count grows as the product of the cardinalities at each level of nesting.

I find this behaviour — constructing 52,800 rows to represent 500 orders — to be the database equivalent of photocopying an entire book in order to quote a single paragraph.

The LIMIT problem deserves its own discussion

The Cartesian product is the performance problem. The broken LIMIT is the correctness problem. They are related but distinct, and the second is arguably more dangerous because it silently returns wrong results.

Why LIMIT breaks with nested includes
// You wrote: limit: 100, expecting 100 orders.
// What Sequelize does internally:

// Step 1: Generate SQL with LIMIT 100 on the flattened join
// Step 2: Send to PostgreSQL
// Step 3: Receive 100 flattened rows
// Step 4: De-duplicate into order objects

// The 100 flattened rows might contain:
//   Order #1: 24 items x 4 tags = 96 rows
//   Order #2: 4 rows (partial — only the first item's tags)
//
// Result: 1 complete order and 1 partial order.
// You asked for 100. You got 1.

// Sequelize v6 tries to fix this with a subquery:
//   SELECT * FROM (SELECT ... FROM "Orders" LIMIT 100) AS "Order"
//   LEFT JOIN ...
//
// But this subquery approach has its own problem:
// the LIMIT 100 inside the subquery correctly limits orders,
// but the outer query still produces the Cartesian product
// of those 100 orders' items and tags.
//
// 100 orders x 24 items x 4.4 tags = 10,560 rows.
// For 100 orders. Transferred, parsed, de-duplicated in JS.

Sequelize v6 introduced a subquery approach to mitigate this: it wraps the primary model SELECT in a subquery and applies the LIMIT there, so the LIMIT correctly targets orders rather than flattened rows. This fixes pagination. It does not fix the Cartesian product — the outer query still joins all five tables and produces the same explosion of intermediate rows.

You can have correct pagination and catastrophic performance simultaneously. The subquery ensures you get 100 orders. It does not prevent those 100 orders from producing 10,560 flattened rows (100 x 24 items x 4.4 tags) that PostgreSQL must construct, transmit, and that Sequelize must de-duplicate in JavaScript.

Why does Sequelize do this?

A fair question, and one that deserves a fair answer — not a dismissal.

Sequelize's eager loading was designed for simplicity. One query, one round trip, one response. The ORM reassembles the nested objects from the flattened rows in JavaScript. For simple, shallow includes — an Order with its User — this works beautifully. One LEFT JOIN, no row multiplication, minimal overhead. The design is not wrong. It is optimized for the common case.

The pathology only appears with nested 1:N or M:N includes. Sequelize cannot know at query-build time how many child rows each parent has. It generates the JOIN faithfully, and the explosion happens at execution time inside PostgreSQL, invisible to the ORM until the result set comes back over the wire — 52,800 rows that Sequelize then spends considerable time de-duplicating and reassembling into 500 order objects.
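To make that reassembly cost concrete, here is a minimal sketch of the de-duplication an ORM must perform on flattened rows. This is an illustration of the technique only, not Sequelize's actual internals, and the row shape is hypothetical:

```javascript
// Rebuild nested objects from flattened JOIN rows.
// Illustration only: not Sequelize's real implementation.
// Every flattened row repeats the order's columns, so all rows must be
// allocated and inspected just to recover a single order object.
function dedupe(flatRows) {
  const orders = new Map();
  for (const row of flatRows) {
    let order = orders.get(row.orderId);
    if (!order) {
      order = { id: row.orderId, total: row.total, items: new Map() };
      orders.set(row.orderId, order);
    }
    let item = order.items.get(row.itemId);
    if (!item) {
      item = { id: row.itemId, tags: [] };
      order.items.set(row.itemId, item);
    }
    if (row.tagId != null) item.tags.push(row.tagId);
  }
  // Convert the per-order item maps back into plain arrays.
  return [...orders.values()].map(o => ({ ...o, items: [...o.items.values()] }));
}

// One order with 24 items, each product carrying 4 tags: 96 flat rows.
const flat = [];
for (let item = 1; item <= 24; item++) {
  for (let tag = 1; tag <= 4; tag++) {
    flat.push({ orderId: 1, total: '99.00', itemId: item, tagId: tag });
  }
}
const result = dedupe(flat);
console.log(flat.length, result.length, result[0].items.length); // 96 1 24
```

Ninety-six rows in, one order out. The work is linear in the flattened row count, which is exactly the number the Cartesian product inflates.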

The Sequelize eager loading documentation covers the basic include syntax thoroughly. What it does not emphasize — not nearly enough — is the performance cliff you hit the moment you nest a hasMany inside another hasMany.

I should be fair to the Sequelize maintainers: this is not unique to Sequelize. Every ORM that implements eager loading via JOINs faces this problem. Hibernate's @EntityGraph produces the same Cartesian product. Django's select_related limits itself to foreign keys specifically to avoid it. Prisma added relationLoadStrategy to address the same issue. The Cartesian explosion is an inherent risk of JOIN-based eager loading across one-to-many relationships. Sequelize is not uniquely at fault — it is merely the ORM where the fix is least discoverable.

The fix: separate: true

The solution is one option. It has been available since Sequelize v4. It tells Sequelize to fetch the association in a separate query using an IN clause instead of joining it into the main query.

The fix
// The fix: separate: true on the associations that cause multiplication
const orders = await Order.findAll({
  include: [
    { model: User },
    {
      model: OrderItem,
      separate: true,          // <-- fires as a second query with IN clause
      include: [
        {
          model: Product,
          include: [{
            model: Tag,
            through: { attributes: [] },  // exclude join table columns
          }],
        },
      ],
    },
  ],
  where: { status: 'completed' },
  limit: 100,
});

// Instead of one massive 5-table JOIN, Sequelize now sends:
//
// Query 1: SELECT * FROM "Orders" WHERE status = 'completed' LIMIT 100
// Query 2: SELECT * FROM "OrderItems"
//          WHERE "OrderId" IN (1, 2, 3, ... 100)
// Query 3: SELECT * FROM "Products"
//          INNER JOIN ... WHERE "ProductId" IN (...)
//
// Three focused queries instead of one Cartesian monster.

One word. separate. Set it to true. That is the entire change.

Instead of one monolithic 5-table JOIN, Sequelize now sends three focused queries:

What the database sees
-- Query 1: fetch the orders (clean, indexable, fast)
SELECT "id", "total", "status", "UserId"
FROM "Orders"
WHERE "status" = 'completed'
LIMIT 100;
-- 0.8ms, 100 rows

-- Query 2: fetch items for those specific orders
SELECT "id", "quantity", "price", "OrderId", "ProductId"
FROM "OrderItems"
WHERE "OrderId" IN (1, 2, 3, 4, 5, ... 100);
-- 2.1ms, ~2,400 rows

-- Query 3: fetch products with tags for those items
SELECT "Product"."id", "Product"."name", "Product"."sku",
       "Tags"."id" AS "Tags.id", "Tags"."label" AS "Tags.label"
FROM "Products" AS "Product"
LEFT OUTER JOIN ("ProductTags"
  INNER JOIN "Tags" ON "Tags"."id" = "ProductTags"."TagId")
  ON "Product"."id" = "ProductTags"."ProductId"
WHERE "Product"."id" IN (12, 34, 56, 78, ...);
-- 3.4ms, ~1,200 rows

-- Total: ~6.3ms across 3 queries
-- vs. 6,851ms in the single Cartesian join

Three queries. 6.3 milliseconds total. The same 500 orders, the same 2,400 items, the same tags — but without the Cartesian multiplication.

The reason this is so much faster is structural, not incidental. With the single JOIN, row count grows as the product of cardinalities: 500 x 24 x 4.4 = 52,800. With separate queries, row count grows as the sum: 500 + 2,400 + 1,200 = 4,100. Product versus sum. That is the entire explanation, and it explains why the gap widens with every additional row of data.

How much faster is it, exactly?

I prefer to let numbers speak for themselves.

Approach                     | Queries | Intermediate rows | Wall time     | Memory peak
Single JOIN (no separate)    | 1       | 52,800            | 6,851 ms      | ~48 MB
separate: true on OrderItems | 3       | 3,700             | 6.3 ms        | ~4 MB
Difference                   |         | 14x fewer         | 1,087x faster | 12x less

1,087x faster. On 500 rows. The improvement is not linear — it is super-linear, because the Cartesian product grows as the product of cardinalities while the separate queries grow as the sum. As the dataset grows, the gap widens further.

Here is how both approaches scale:

Orders | JOIN rows | JOIN time         | Separate rows | Separate time
100    | 10,560    | ~1.2 s            | 3,700         | ~4 ms
500    | 52,800    | ~6.8 s            | 15,200        | ~12 ms
1,000  | 105,600   | ~28 s             | 28,400        | ~22 ms
5,000  | 528,000   | > 60 s (OOM risk) | 124,000       | ~85 ms
10,000 | 1,056,000 | Does not complete | 242,000       | ~160 ms

At 5,000 orders with the same relationship density, the single JOIN approach exceeds 60 seconds and begins consuming hundreds of megabytes of PostgreSQL work_mem. The separate: true approach takes approximately 85ms. At 10,000 orders, the single JOIN does not complete at all — PostgreSQL runs out of memory or the client times out. The separate approach finishes in 160ms.

The difference is not performance tuning. It is the difference between a working application and one that falls over.

The memory cost nobody mentions

Query time is the visible symptom. Memory consumption is the silent one, and in a Node.js server handling concurrent requests, it is often the one that actually brings the process down.

Memory analysis
// The memory cost of Cartesian products is worse than the query time.
//
// Consider the 52,800-row result set:
//
// Each row contains ALL columns from ALL 5 tables — even when most
// values are duplicates. The pg driver allocates a JavaScript object
// for each row before Sequelize de-duplicates them.
//
// Approximate memory per row: ~900 bytes (strings, numbers, object overhead)
// 52,800 rows x 900 bytes = ~47.5 MB for a SINGLE query
//
// After de-duplication:
//   500 Order objects          ~225 KB
//   2,400 OrderItem objects    ~1.1 MB
//   800 Product objects        ~360 KB
//   1,200 Tag assignments      ~540 KB
//   Total useful data:         ~2.2 MB
//
// Sequelize allocated 47.5 MB to produce 2.2 MB of useful data.
// 95.4% of the allocated memory is waste.
//
// On a Node.js server handling 10 concurrent requests,
// that is 475 MB of transient allocations — all garbage collected
// within seconds, all pressuring the V8 heap.
//
// With separate: true, total allocation is ~3.5 MB per request.
// Ten concurrent requests: 35 MB instead of 475 MB.

47.5 MB of transient allocations to produce 2.2 MB of useful data. That is a 95.4% waste ratio. Multiply by the number of concurrent requests your server handles, and the V8 garbage collector begins spending more time collecting Cartesian debris than executing your application code.

I have seen Node.js servers with 512 MB heap limits crash under load — not because they had a memory leak, but because three concurrent Sequelize queries with nested includes allocated 150 MB each. The queries returned correct results. The process exited with FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory.

With separate: true, the same three concurrent requests allocate approximately 10.5 MB total. The server stays upright. The garbage collector stays calm. Nobody gets paged at 3 AM.

What are the gotchas with separate: true?

I would be a poor waiter indeed if I presented only the virtues without the constraints. separate: true is not universally applicable, and misunderstanding its limitations causes a different category of confusion.

Constraints and edge cases
// Gotcha 1: separate only works on hasMany and belongsToMany
// This will NOT work:
Order.findAll({
  include: [{
    model: User,
    separate: true,  // Error: separate is not supported for belongsTo
  }],
});

// Gotcha 2: separate breaks top-level ordering on included models
// This will NOT work as expected:
Order.findAll({
  include: [{
    model: OrderItem,
    separate: true,
  }],
  order: [
    [OrderItem, 'price', 'DESC'],  // Cannot order parent by child when separate
  ],
});

// Instead, order within the include:
Order.findAll({
  include: [{
    model: OrderItem,
    separate: true,
    order: [['price', 'DESC']],  // Order items within their own query
  }],
});

// Gotcha 3: separate + limit on the parent
// With separate: true, the LIMIT applies correctly to orders.
// Without it, LIMIT 100 gives you 100 flattened rows — maybe 4 orders.

The key points:

  • separate only works on hasMany and belongsToMany associations. For belongsTo and hasOne, the standard LEFT JOIN is already optimal — no row multiplication occurs. Sequelize will throw if you try.
  • You cannot order the parent query by a column in a separated child. The child data arrives in a separate query, so PostgreSQL cannot sort the parent by it. Order within the child's own include block instead.
  • LIMIT now behaves correctly. Without separate, limit: 100 limits the flattened rows, potentially returning only a handful of complete parent objects. With separate, the limit applies to the parent table as intended. This is often the more important fix.

"I have observed, in production systems, pages generating over 400 database round trips for what appeared to be a simple list view."

— from You Don't Need Redis, Chapter 3: The ORM Tax

Nested separate: when one level is not enough

A subtlety that trips up even experienced Sequelize developers: separate: true does not propagate to nested includes. If you have a Cartesian product at two levels of nesting — OrderItems within Orders, and Tags within Products — you may need separate at both levels.

Separate at multiple levels
// Gotcha 4: separate does not propagate to nested includes.
// If Product -> Tags is also a many-to-many, you may need
// a second separate: true deeper in the tree.

const orders = await Order.findAll({
  include: [
    { model: User },
    {
      model: OrderItem,
      separate: true,
      include: [
        {
          model: Product,
          include: [{
            model: Tag,
            separate: true,  // <-- second separate for the M:N
            through: { attributes: [] },
          }],
        },
      ],
    },
  ],
  where: { status: 'completed' },
  limit: 100,
});

// This fires 4 queries instead of 3:
//   1. Orders (with User join — 1:1, no explosion)
//   2. OrderItems for those orders
//   3. Products for those items
//   4. Tags for those products
//
// Each query is flat. No Cartesian products anywhere.
// Total: ~8ms. Still 850x faster than the single JOIN.

Four queries instead of one. Each query is flat — no Cartesian products anywhere in the chain. The total time is approximately 8ms, still 850x faster than the single JOIN.

The rule of thumb: every hasMany or belongsToMany in your include tree is a candidate for separate: true. If the association connects a parent to multiple children, and those children will be joined against further grandchildren, the multiplicative explosion applies at each level independently.
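The rule of thumb lends itself to automation. Below is a hedged sketch of a tree walker that marks every multiplying association. The `typeOf` lookup and the string model names are simplifications introduced here for illustration; a real version would read `Model.associations` from the Sequelize models instead:

```javascript
// Walk an include tree and set separate: true on every hasMany or
// belongsToMany association. Sketch only: association types come from
// a caller-supplied lookup rather than real Sequelize metadata.
function markSeparate(includes, typeOf) {
  return includes.map(inc => {
    const marked = { ...inc };
    const type = typeOf(inc.model);
    if (type === 'HasMany' || type === 'BelongsToMany') {
      marked.separate = true; // fetched in its own query via an IN clause
    }
    if (inc.include) marked.include = markSeparate(inc.include, typeOf);
    return marked;
  });
}

// Hypothetical association types for the article's schema,
// keyed by model name for the sake of the sketch:
const types = {
  User: 'BelongsTo',    // Order.belongsTo(User): no multiplication
  OrderItem: 'HasMany', // Order.hasMany(OrderItem): multiplies
  Product: 'BelongsTo', // OrderItem.belongsTo(Product): no multiplication
  Tag: 'BelongsToMany', // Product.belongsToMany(Tag): multiplies
};
const tree = markSeparate(
  [
    { model: 'User' },
    { model: 'OrderItem', include: [{ model: 'Product', include: [{ model: 'Tag' }] }] },
  ],
  m => types[m]
);
console.log(tree[1].separate, tree[1].include[0].include[0].separate); // true true
```

OrderItem and Tag gain `separate: true`; User and Product, being 1:1 from their parent's perspective, are left alone. That reproduces the four-query plan described above.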

The honest counterpoint: when separate: true is worse

A waiter who overstates his case is no waiter at all. There are situations where separate: true performs worse than the single JOIN, and I would be doing you a disservice not to name them.

Very small datasets with low cardinality. If you have 10 orders with 2 items each and no tags, the single JOIN produces 20 rows — fast, cheap, no explosion. Adding separate: true turns one round trip into three, adding ~1ms of connection overhead for no benefit. On datasets this small, the Cartesian product is not a product at all — it is just a join. The fix is unnecessary and slightly slower.

Network-heavy environments. If your application server and database are separated by 15ms of network latency (common in multi-region deployments), three separate queries add 30ms of round-trip overhead. The single JOIN, even with 20x row multiplication, might complete in 25ms total. The threshold depends on your specific latency and cardinality — but the general rule is: high network latency favours fewer queries, high cardinality favours more queries. When both are high, separate: true still wins, but by a narrower margin.

Aggregation queries. If you need to aggregate across the joined result — "total revenue by product tag" — the Cartesian product in the database might actually be what you want, because PostgreSQL can aggregate it server-side. Splitting into separate queries means you must aggregate in JavaScript, which is slower and uses more memory. For aggregation queries, consider a raw SQL query instead of either approach.

These are real constraints. They do not change the recommendation for the vast majority of cases. If your findAll has nested hasMany includes and returns more than a few dozen parent rows, separate: true is the correct choice. But verify it with your actual data and your actual network topology. Numbers, not dogma.
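In that spirit, the latency trade-off reduces to arithmetic you can run yourself. A back-of-envelope sketch; every number below is an illustrative assumption, not a benchmark:

```javascript
// Break-even estimate: single JOIN vs. N separate queries.
// total time ~ round trips x network latency + rows produced x per-row cost.
// All inputs here are illustrative assumptions.
function estimateMs({ queries, rows, latencyMs, perRowMs }) {
  return queries * latencyMs + rows * perRowMs;
}

const perRowMs = 0.005; // assumed cost to produce, transfer, and parse one row

// Multi-region: 15ms latency, modest multiplication. The JOIN can win.
const joinFar = estimateMs({ queries: 1, rows: 2000, latencyMs: 15, perRowMs });
const sepFar  = estimateMs({ queries: 3, rows: 400,  latencyMs: 15, perRowMs });
console.log(joinFar < sepFar); // the single JOIN wins here

// Local database: 0.5ms latency, Cartesian row counts. separate wins.
const joinNear = estimateMs({ queries: 1, rows: 52800, latencyMs: 0.5, perRowMs });
const sepNear  = estimateMs({ queries: 3, rows: 4100,  latencyMs: 0.5, perRowMs });
console.log(sepNear < joinNear); // separate: true wins decisively here
```

Plug in your own latency, cardinalities, and per-row cost before deciding; the crossover point is specific to your topology.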

Transaction consistency with separate queries

A question that deserves a direct answer: if Sequelize sends three separate queries instead of one JOIN, can the data change between queries?

Transaction considerations
// Should you wrap separate queries in a transaction?
//
// Sequelize does NOT wrap separate: true queries in a transaction
// by default. Each query runs independently. In theory, data could
// change between Query 1 (orders) and Query 2 (items).
//
// In practice, this matters far less than you think:
//
// 1. The queries execute within milliseconds of each other.
// 2. For read-heavy pages (dashboards, lists, reports), momentary
//    inconsistency is invisible to the user.
// 3. For truly consistency-critical reads, wrap in a transaction:

const orders = await sequelize.transaction(async (t) => {
  return Order.findAll({
    include: [
      { model: User },
      {
        model: OrderItem,
        separate: true,
        include: [{ model: Product, include: [{ model: Tag }] }],
      },
    ],
    where: { status: 'completed' },
    limit: 100,
    transaction: t,
    // With READ COMMITTED (PostgreSQL default), each query sees
    // its own snapshot. For a true consistent snapshot, use:
    // isolationLevel: Sequelize.Transaction.ISOLATION_LEVELS.REPEATABLE_READ,
  });
});

// The performance cost of wrapping in a transaction is negligible.
// The queries still run separately — they just share a snapshot.

In the default case (no explicit transaction), yes — technically, data could change between Query 1 and Query 2. An order item could be added or deleted in the milliseconds between the queries. In practice, this is vanishingly unlikely for read-heavy pages, and the visual inconsistency (if it occurred) would be corrected on the next page load.

For truly consistency-critical reads — financial reports, audit logs, anything where momentary inconsistency is unacceptable — wrap the query in a transaction with REPEATABLE READ isolation. The queries still execute separately, but they share a consistent snapshot. The performance cost is negligible.

Seeing what Sequelize sends: logging done properly

The most valuable diagnostic tool for Sequelize performance is not a profiler or an APM dashboard. It is logging: console.log. The number of developers who have never seen the SQL their ORM generates is, in my experience, alarmingly high.

Enabling SQL logging
// Enable logging to see what Sequelize actually sends.
// Do this once for every query that matters.

const sequelize = new Sequelize(connectionString, {
  logging: (sql, timing) => {
    console.log(sql);
    // With benchmark: true, timing is the execution time in ms
  },
  benchmark: true,
});

// Or per-query:
const orders = await Order.findAll({
  include: [/* ... */],
  logging: console.log,
  benchmark: true,
});

// What to look for:
//   1. The number of LEFT OUTER JOINs (more than 2 = danger)
//   2. The presence of nested JOINs inside parentheses (M:N through tables)
//   3. Where the LIMIT clause appears (inside a subquery or on the outer query)
//
// If you see 4+ LEFT JOINs in a single statement, that query is a candidate
// for separate: true. No exceptions.

Read the generated SQL. Count the LEFT OUTER JOINs. If there are more than two, and any of them involve hasMany or belongsToMany associations, you have a Cartesian product candidate. This is not optional due diligence. It is the minimum standard of care for any query that runs in production.
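That standard of care is easy to automate. Here is a sketch of a logging callback that does the counting for you; the regex heuristic and the "more than 2" threshold are this article's rule of thumb, not anything Sequelize provides:

```javascript
// A Sequelize `logging` callback that flags Cartesian-product
// candidates by counting LEFT OUTER JOINs in each emitted statement.
// The threshold is the article's heuristic, not a Sequelize feature.
function warnOnWideJoins(sql, timingMs) {
  const joins = (sql.match(/LEFT OUTER JOIN/gi) || []).length;
  if (joins > 2) {
    console.warn(
      `[sequelize] ${joins} LEFT OUTER JOINs in one statement` +
      (typeof timingMs === 'number' ? ` (${timingMs} ms)` : '') +
      ' -- consider separate: true'
    );
  }
  return joins;
}

// Wire it into the connection (connection string is hypothetical):
// const sequelize = new Sequelize('postgres://user:pass@localhost:5432/myapp', {
//   logging: warnOnWideJoins,
//   benchmark: true, // makes Sequelize pass execution time as the 2nd argument
// });

console.log(warnOnWideJoins(
  'SELECT ... LEFT OUTER JOIN a ... LEFT OUTER JOIN b ... LEFT OUTER JOIN c ...'
)); // 3
```

A warning in development costs nothing. A Cartesian product in production costs a pager.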

The subQuery option: related but different

Sequelize has a second relevant option, subQuery, that occasionally appears in Stack Overflow answers alongside separate. They address different problems.

subQuery option
// Alternative: subQuery option for simpler cases
// When you only have one level of hasMany nesting
const orders = await Order.findAll({
  include: [
    { model: User },
    { model: OrderItem },
  ],
  where: { status: 'completed' },
  limit: 100,
  subQuery: false,  // Sequelize wraps the base query in a subquery by default
                    // Setting false can help or hurt — test both
});

// Sequelize's default behavior wraps the primary model SELECT in a subquery
// when limit is used with includes. This is usually correct.
// But with deep nesting, even the subquery approach produces Cartesian products
// in the outer join. separate: true is the reliable fix.

subQuery controls whether Sequelize wraps the base model's SELECT in a subquery when limit is used with include. The default (true) ensures the limit applies to the base model before joining. Setting it to false applies the limit after the join — which, with a Cartesian product, gives you far fewer parent rows than you requested.

subQuery is a band-aid for the LIMIT problem. separate: true is the structural fix for the Cartesian problem. They are complementary, not interchangeable. For deep nesting with 1:N and M:N associations, separate: true is the answer you want.

When the ORM is not enough: raw queries as a strategic option

There are queries where neither the single JOIN nor separate: true produces the result you need. Paginated orders with inline aggregations. Complex filtering across multiple relationship levels. Queries that need LATERAL JOIN or CTEs for correct pagination.

For these cases, a raw query is not a failure of ORM proficiency. It is a recognition that the ORM's query builder has limits, and those limits are well-defined.

Raw query with CTE + LATERAL JOIN
-- When separate: true is not enough, raw queries give full control.
-- This is not a retreat — it is a strategic escalation.

-- Approach: CTE + lateral join for paginated orders with items
WITH target_orders AS (
  SELECT id, total, status, "UserId", "createdAt"
  FROM "Orders"
  WHERE status = 'completed'
  ORDER BY "createdAt" DESC
  LIMIT 100
)
SELECT
  o.id, o.total, o.status,
  u.name AS user_name,
  items.id AS item_id,
  items.quantity,
  items.price,
  p.name AS product_name,
  p.sku
FROM target_orders o
JOIN "Users" u ON u.id = o."UserId"
LEFT JOIN LATERAL (
  SELECT oi.id, oi.quantity, oi.price, oi."ProductId"
  FROM "OrderItems" oi
  WHERE oi."OrderId" = o.id
) items ON true
LEFT JOIN "Products" p ON p.id = items."ProductId"
ORDER BY o."createdAt" DESC;  -- re-assert the order: joins don't preserve it

-- The CTE locks the pagination to orders, not flattened rows.
-- The LATERAL join scopes items per order.
-- No Cartesian product. 3.2ms on 100 orders.
--
-- You lose Sequelize's object hydration, but you gain
-- full control over the query plan.

You lose Sequelize's automatic object hydration. You gain complete control over the query plan, correct pagination without workarounds, and a query that PostgreSQL can optimize fully. The trade-off is explicit — you write more SQL, you get better performance and correctness.
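The hydration you lose is about a dozen lines of grouping. A minimal sketch, assuming the column aliases from the SQL above (`user_name`, `item_id`, and so on) and the plain-object rows that `sequelize.query(sql, { type: QueryTypes.SELECT })` returns:

Re-nesting flat rows
```javascript
// Hypothetical helper: re-nest the flat rows from the CTE + LATERAL query
// (one row per order item, order columns repeated) into the
// order -> items shape Sequelize would normally hydrate.
function nestOrders(flatRows) {
  const byId = new Map();
  for (const row of flatRows) {
    let order = byId.get(row.id);
    if (!order) {
      order = {
        id: row.id,
        total: row.total,
        status: row.status,
        userName: row.user_name,
        items: [],
      };
      byId.set(row.id, order);
    }
    // LEFT JOIN: an order with no items produces one row of NULL item columns
    if (row.item_id !== null) {
      order.items.push({
        id: row.item_id,
        quantity: row.quantity,
        price: row.price,
        productName: row.product_name,
        sku: row.sku,
      });
    }
  }
  return [...byId.values()];
}

// Usage with a raw query:
//   const rows = await sequelize.query(sql, { type: QueryTypes.SELECT });
//   const orders = nestOrders(rows);
```

The Map preserves insertion order, so the orders come out in the sequence the SQL returned them.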

I should note: reaching for raw SQL is not an indictment of Sequelize. It is an acknowledgement that ORMs are abstractions, and all abstractions have edges. Knowing where those edges are — and crossing them deliberately — is a sign of maturity, not defeat.

Can you detect Cartesian products in production?

Yes. pg_stat_statements is your starting point. Cartesian explosions have a distinctive signature: a query with multiple LEFT OUTER JOINs that returns a disproportionately high number of rows relative to the base table.

Diagnostic query
-- Find your Cartesian explosions in production.
-- Look for queries where rows_returned >> rows_in_base_table:

SELECT
  query,
  calls,
  rows / NULLIF(calls, 0) AS avg_rows_per_call,
  mean_exec_time::numeric(10,2) AS avg_ms,
  total_exec_time::numeric(10,2) AS total_ms
FROM pg_stat_statements
WHERE query ILIKE '%LEFT OUTER JOIN%'
  AND rows / NULLIF(calls, 0) > 1000
ORDER BY total_exec_time DESC
LIMIT 20;

-- If avg_rows_per_call is 10x-100x the number of base table rows,
-- you have a Cartesian product. The join is multiplying, not filtering.

If avg_rows_per_call is 10x to 100x what you expect from the base table, you have a Cartesian product. The EXPLAIN ANALYZE guide covers how to read the query plan to identify exactly which join is responsible for the multiplication.

There is also a simpler heuristic: if a Sequelize query has more than one level of include nesting and any of those levels involve hasMany or belongsToMany, it is a candidate for separate: true. Audit those queries proactively. Do not wait for the page load time to reach 7 seconds.
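That heuristic can also run as a tripwire inside the application, using Sequelize's own `benchmark` and `logging` options, which hand every statement and its duration to a callback. The thresholds below are illustrative, not canonical — tune them to your workload.

Runtime tripwire
```javascript
// Hypothetical detector: flag statements that match the Cartesian
// signature described above (several LEFT OUTER JOINs, slow execution).
// Thresholds are illustrative assumptions.
function looksCartesian(sql, ms, { joinThreshold = 2, msThreshold = 500 } = {}) {
  const joins = (sql.match(/LEFT OUTER JOIN/gi) || []).length;
  return joins > joinThreshold && ms > msThreshold;
}

// Wiring it up (connection string is a placeholder):
//   const sequelize = new Sequelize('postgres://...', {
//     benchmark: true,  // pass timing to the logging callback
//     logging: (sql, ms) => {
//       if (looksCartesian(sql, ms)) {
//         console.warn(`[cartesian?] ${ms}ms: ${sql.slice(0, 120)}`);
//       }
//     },
//   });
```

It is crude, but it catches the queries that grow into the signature after deployment, which is exactly when pg_stat_statements is not yet in front of you.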

An audit checklist for your codebase

If you have an existing Sequelize application and suspect Cartesian products lurk within it — they almost certainly do — here is a systematic approach to finding them.

Audit checklist
// A checklist for auditing Sequelize queries in your codebase.
//
// Run this mental test on every findAll with include:
//
// 1. Count the include nesting levels.
//    - 1 level with hasMany/belongsToMany? Monitor it.
//    - 2+ levels with hasMany/belongsToMany? Fix it now.
//
// 2. Check if limit is used.
//    - limit + nested include without separate = broken pagination.
//    - The limit applies to flattened rows, not parent objects.
//
// 3. Check if order references included models.
//    - Ordering by a child column forces Sequelize to include it
//      in the main query — separate: true will break that ordering.
//    - Move the ordering into the child include block instead.
//
// 4. Check the association cardinalities.
//    - 1:1 (belongsTo, hasOne): safe to join, no explosion.
//    - 1:N (hasMany): explosion proportional to N.
//    - M:N (belongsToMany): explosion proportional to M*N.
//    - Nested 1:N inside 1:N: explosion is N1 * N2. Dangerous.
//
// 5. Check data growth.
//    - A query safe at 10 orders with 3 items each (30 rows)
//      becomes lethal at 1,000 orders with 20 items (20,000 rows).
//    - If the relationship cardinality will grow, use separate.

Search your codebase for findAll calls that include nested include arrays. For each one, count the hasMany and belongsToMany associations in the include tree. If the count exceeds one, or if a single hasMany is nested inside another hasMany, that query is producing a Cartesian product. The only question is whether the dataset is large enough for you to notice yet.
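The cardinality rules in the checklist can be mechanized. A sketch under stated assumptions: the include tree here is annotated with each association's type in a format of my own invention (a real audit would read the types from `Model.associations`), and levels marked `separate: true` are exempt because they leave the join entirely.

Include-tree audit
```javascript
// Hypothetical audit helper for the checklist above. The `type` field is
// our own annotation, not Sequelize's include syntax.
const MULTIPLYING = new Set(['HasMany', 'BelongsToMany']);

function countMultiplying(includes) {
  let count = 0;
  for (const inc of includes || []) {
    // separate: true pulls the level out of the join, so it cannot multiply
    if (MULTIPLYING.has(inc.type) && !inc.separate) count += 1;
    count += countMultiplying(inc.include);
  }
  return count;
}

function auditIncludes(includes) {
  const n = countMultiplying(includes);
  if (n >= 2) return 'fix now';   // row count multiplies: N1 * N2 (or M * N)
  if (n === 1) return 'monitor';  // explosion proportional to N
  return 'safe';                  // 1:1 joins only
}

// Orders -> items (hasMany) -> product (belongsTo) -> tags (belongsToMany):
// two multiplying levels in one join tree, so this query needs fixing.
const risky = [
  { type: 'HasMany', include: [
    { type: 'BelongsTo', include: [{ type: 'BelongsToMany' }] },
  ]},
];
```

One simplification to note: a `separate: true` subtree really starts a fresh join tree of its own, so this sketch counts conservatively. For a codebase audit, conservative is the correct direction to err.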

Do not rely on "it's fast enough now." Cartesian products grow super-linearly with data. A query that takes 50ms today on 100 orders will take 50 seconds on 10,000 orders. The fix takes one line. The outage takes a weekend.

Where a query-aware proxy fits in

Sequelize's separate: true solves the problem when you know about it. The concern, as with most ORM performance issues, is the queries you do not know about — the ones buried three service layers deep, added six months ago by someone who has since moved on, slowly worsening as the data grows.

Gold Lapel detects Cartesian explosions at the PostgreSQL wire protocol level. The signature is unmistakable: a multi-table JOIN where the intermediate row count exceeds the base row count by an order of magnitude or more, repeated with predictable regularity. It is the query plan equivalent of a fire alarm.

When Gold Lapel identifies these patterns, it can materialize the expensive join as a materialized view and transparently rewrite subsequent queries to use it. The application sends the same Cartesian-producing SQL. PostgreSQL reads from a pre-computed, properly indexed result. The query that took 6.8 seconds takes 3 milliseconds.

I should be honest about the boundary here: a proxy cannot replace developer understanding. If you know your Sequelize queries have nested includes, the right fix is separate: true — it is simpler, it is zero-latency, and it works without any infrastructure changes. A proxy catches the queries you missed, the ones added after the audit, the ones that grow into problems over months of data accumulation. Both approaches serve the same goal: keeping the household running smoothly, whether the staff are under your direct supervision or not.

Fix the Sequelize queries you can find. Add separate: true where it belongs. And for the N+1 patterns and Cartesian explosions that will inevitably appear in the future — let the proxy handle what the ORM cannot see.

Terms referenced in this article

The Cartesian product you have just dismantled is one instance of a broader pattern. I have written a piece on ORM versus raw SQL that examines when the abstraction helps, when it hinders, and how to know which situation you are in — a question Sequelize raises more often than most.