
Spring Data JPA Pagination with JOIN FETCH: Why Hibernate Applies LIMIT in Memory

You asked for page 0, size 20. Hibernate fetched 847,000 rows, loaded them all into the JVM, and then handed you the first twenty. This is not a bug. It is a design constraint. And it has three solutions.

The Waiter of Gold Lapel · Updated Mar 20, 2026 · Published Mar 5, 2026 · 32 min read
You requested twenty rows. We seated the entire restaurant first. Our apologies.

Good evening. Your pagination is decorative.

Allow me to draw your attention to a warning that appears in your application logs roughly four hundred times per day, depending on traffic. You may have seen it. You may have dismissed it. You should not have dismissed it.

Hibernate log output
WARN  [org.hibernate.orm.query] HHH90003004: firstResult/maxResults
specified with collection fetch; applying in memory

-- Translation: Hibernate just told you it will load the ENTIRE
-- result set into memory and paginate in Java.
-- This is not a suggestion. It is a confession.

HHH90003004. The warning code that means your Pageable is a suggestion Hibernate has chosen to ignore. Not maliciously. Not carelessly. Because it genuinely cannot apply SQL-level pagination to a query that uses JOIN FETCH on a collection.

The numbers are stark. You have 100,000 orders. Each order has an average of 8.5 line items. You want page 0, size 20. Hibernate sends a query to PostgreSQL without LIMIT or OFFSET. PostgreSQL returns 847,000 rows. Hibernate loads every one of them into the JVM heap, assembles 100,000 Order entities with their LineItem collections, discards 99,980 of them, and returns 20.

For a /api/orders?page=0&size=20 endpoint, this means 310 MB of heap allocation, 895 ms of response time, and a garbage collector that would like a word with you.

I encounter this pattern with remarkable frequency. It is, if I may say so, the most common performance defect in Spring Data JPA applications that have grown past their first thousand records. The reason it persists is that it works perfectly in development, where your orders table has 47 rows and the in-memory pagination completes in 2 ms. The warning scrolls past in your console. The tests pass. The code review approves. And then production arrives with its hundred thousand rows and its opinions about what "pagination" should mean.

The entities: a familiar one-to-many

The setup is standard. An Order with a collection of LineItems. Nothing exotic. Nothing you would not find in any e-commerce codebase written in the last fifteen years.

Order.java
@Entity
@Table(name = "orders")
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    @Column(nullable = false)
    private String customerEmail;

    @Column(name = "created_at", nullable = false)
    private Instant createdAt;

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private List<LineItem> lineItems = new ArrayList<>();

    // getters, setters
}
LineItem.java
@Entity
@Table(name = "line_items")
public class LineItem {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "order_id", nullable = false)
    private Order order;

    @Column(nullable = false)
    private String productName;

    private int quantity;

    @Column(nullable = false)
    private BigDecimal price;

    // getters, setters
}

The relationship is FetchType.LAZY, which is correct. You are not eager-loading line items on every Order query. You know better. You want to load them only when the endpoint specifically needs them.

So you write a repository method that fetches orders with their line items in a single query, avoiding the N+1 problem. You use JOIN FETCH. And you add Pageable because the endpoint needs pagination.

OrderRepository.java — the trap
public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("SELECT o FROM Order o JOIN FETCH o.lineItems")
    Page<Order> findAllWithLineItems(Pageable pageable);
}

This compiles. This runs. This returns the correct data on page 0. And this is transferring your entire database into JVM memory on every request.

The reasoning that produces this code is sound. N+1 queries are a genuine problem. JOIN FETCH is the correct tool for solving them. Pageable is the correct interface for pagination in Spring Data. Each decision is individually defensible. It is only their combination that is catastrophic, and nothing in the Spring Data API warns you at compile time. You discover it in production. Or, if you are fortunate, in your logs.

Why Hibernate cannot apply LIMIT to a JOIN FETCH

The reason is not a missing feature. It is a logical impossibility rooted in how SQL JOINs interact with row-level pagination.

Consider a single order with 5 line items. The SQL JOIN produces 5 rows for that order — one per line item. If Hibernate adds LIMIT 20 to the query, PostgreSQL limits the rows to 20. But rows are not entities. An order with 5 line items occupies 5 rows. An order with 12 line items occupies 12 rows. LIMIT 20 might return 3 complete orders and a partial fourth — or 20 orders that each happen to have one line item.

Row multiplication — the fundamental problem
-- Order #1001 has 3 line items. The JOIN produces 3 rows:
-- | order_id | customer_email    | item_id | product        | qty |
-- |----------|-------------------|---------|----------------|-----|
-- | 1001     | alice@example.com | 5001    | Wool Scarf     | 1   |
-- | 1001     | alice@example.com | 5002    | Leather Gloves | 2   |
-- | 1001     | alice@example.com | 5003    | Silk Tie       | 1   |
--
-- Order #1002 has 11 line items. The JOIN produces 11 rows.
-- Order #1003 has 1 line item. The JOIN produces 1 row.
--
-- Three orders. 15 rows.
--
-- LIMIT 20 on this result gives you all of order #1001 (3 rows),
-- all of order #1002 (11 rows), all of order #1003 (1 row),
-- and 5 rows from order #1004.
--
-- That is 4 orders, one of them truncated.
-- Hibernate cannot return a partial entity.
-- The contract requires complete object graphs.

Hibernate needs to paginate by root entity (orders), not by joined rows (order + line_item combinations). SQL's LIMIT operates on rows. There is no LIMIT 20 ENTITIES clause. The two concepts are incompatible when a JOIN multiplies rows per entity.

So Hibernate does the only thing it can. It drops the LIMIT and OFFSET from the SQL, fetches every row, hydrates every entity in the JVM, and then applies pagination in Java. The warning is honest. The behavior is correct. The performance is catastrophic.
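To make the constraint concrete, here is a small standalone sketch of the only strategy available once the rows arrive multiplied. This is a toy model, not Hibernate's internals; the class and method names are invented. The point is the grouping step: you cannot know where entity #20 ends until every row has been grouped, which is why the whole result set must be resident in memory.

```java
import java.util.*;

// Toy model (NOT Hibernate internals): entity-level pagination over
// JOIN-multiplied rows forces a full load before any slicing can happen.
public class InMemoryPaginationDemo {

    // Each element is one JOIN row: {orderId, lineItemId}.
    static List<Long> paginateOrders(List<long[]> joinRows, int page, int size) {
        // Group ALL rows by root entity first. LinkedHashMap preserves
        // the ORDER BY of the SQL result.
        Map<Long, List<Long>> orders = new LinkedHashMap<>();
        for (long[] row : joinRows) {
            orders.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        // Only now, with every row materialized, can we slice by entity.
        List<Long> ids = new ArrayList<>(orders.keySet());
        int from = Math.min(page * size, ids.size());
        int to = Math.min(from + size, ids.size());
        return ids.subList(from, to);
    }

    public static void main(String[] args) {
        List<long[]> rows = new ArrayList<>();
        // Order 1 has 3 items, order 2 has 11, orders 3..100 have 1 each.
        for (long i = 0; i < 3; i++)  rows.add(new long[]{1, 100 + i});
        for (long i = 0; i < 11; i++) rows.add(new long[]{2, 200 + i});
        for (long o = 3; o <= 100; o++) rows.add(new long[]{o, 1000 + o});

        System.out.println(paginateOrders(rows, 0, 20).size()); // 20 orders...
        System.out.println(rows.size());                        // ...from 112 rows loaded
    }
}
```

Replace 112 rows with 847,000 and the toy becomes the production incident.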

I should note that this is not a Hibernate-specific decision. Any ORM that promises correct pagination of root entities while eagerly loading collections via a JOIN faces the same constraint. EclipseLink does the same thing. The JPA specification does not define how this interaction should work, which is precisely why each implementation is left to make its own compromise. Hibernate chose correctness over performance, and I cannot fault the reasoning. I can, however, fault the log level. This should be an ERROR, not a WARN.

Here is what PostgreSQL actually executes:

The actual SQL — no LIMIT
-- What Hibernate actually sends to PostgreSQL:
SELECT
    o1_0.id,
    o1_0.customer_email,
    o1_0.created_at,
    l1_0.order_id,
    l1_0.id,
    l1_0.product_name,
    l1_0.quantity,
    l1_0.price
FROM orders o1_0
JOIN line_items l1_0
    ON o1_0.id = l1_0.order_id

-- No LIMIT. No OFFSET. Every order, every line item, every row.
-- You asked for page 0, size 20.
-- PostgreSQL returned 847,000 rows.

And here is the EXPLAIN ANALYZE for that query:

EXPLAIN ANALYZE — full table join
EXPLAIN ANALYZE
SELECT o.id, o.customer_email, o.created_at,
       li.id, li.product_name, li.quantity, li.price
FROM orders o
JOIN line_items li ON o.id = li.order_id;

--                           QUERY PLAN
-- ────────────────────────────────────────────────────────────────
-- Hash Join  (cost=3542.00..28471.00 rows=847000 width=89)
--            (actual time=42.1..812.4 rows=847000 loops=1)
--   Hash Cond: (li.order_id = o.id)
--   ->  Seq Scan on line_items li
--          (cost=0.00..14820.00 rows=847000 width=52)
--          (actual time=0.02..198.3 rows=847000 loops=1)
--   ->  Hash
--          (cost=1893.00..1893.00 rows=100000 width=37)
--          (actual time=41.8..41.8 rows=100000 loops=1)
--         ->  Seq Scan on orders o
--                (cost=0.00..1893.00 rows=100000 width=37)
--                (actual time=0.01..18.2 rows=100000 loops=1)
-- Planning Time: 0.31 ms
-- Execution Time: 894.7 ms
--
-- 847,000 rows transferred to the JVM.
-- Hibernate then keeps 20 orders. Discards the rest.

847,000 rows. Sequential scans on both tables. Hash join. 895 ms of execution time. And this runs on every single page request, because every page requires fetching everything to determine what subset belongs to the requested page.

The collateral damage: GC pressure and connection pool starvation

The 895 ms response time is the visible symptom. The collateral damage is less obvious but equally concerning.

JVM heap allocation and GC impact
// What 310 MB of ephemeral allocation does to your JVM:
//
// Request arrives → Hibernate allocates ~310 MB for 100,000 Order entities
// Pagination applied in Java → 99,980 entities become garbage
// GC must collect 309.7 MB of short-lived objects
//
// With default G1GC settings on a 2 GB heap:
//   - Young generation: ~700 MB
//   - Each request fills ~44% of the young generation
//   - At 3 concurrent requests: young GC triggered
//   - Young GC pause: 15-40ms (acceptable)
//   - But: large objects may be allocated directly in old gen
//   - Full GC at sustained load: 200-800ms pause
//
// The allocation rate is the problem, not any single allocation.
// 10 requests/second × 310 MB = 3.1 GB/s of allocation rate.
// G1GC can handle ~1-2 GB/s before pause times degrade.
// ZGC handles this better, but you are still wasting bandwidth
// between PostgreSQL and the JVM.
//
// The database sent you 847,000 rows over a TCP connection.
// At ~89 bytes per row, that is ~75 MB of network transfer.
// For 20 rows of actual output.

At sustained load — say, 10 concurrent requests to this endpoint — the JVM allocates 3.1 GB per second of objects that become garbage within milliseconds. The G1 garbage collector handles ephemeral allocation well, but not at this rate. Young generation collections become frequent. Large object allocations spill into the old generation. Under sustained traffic, full GC pauses appear, adding 200-800 ms of latency to requests that were already taking 895 ms.

The connection pool impact is equally stark. While Hibernate is busy loading 847,000 rows and hydrating 100,000 entities, the JDBC connection is held. For the full 895 ms.

pg_stat_activity — connection pool impact
-- While the naive JOIN FETCH query runs, check pg_stat_activity:

SELECT pid, state, now() - query_start AS duration,
       LEFT(query, 80) AS query_preview
FROM pg_stat_activity
WHERE datname = 'mydb' AND state = 'active'
ORDER BY duration DESC;

-- pid   | state  | duration       | query_preview
-- ------+--------+----------------+-----------------------------------------------
-- 14891 | active | 00:00:00.847   | SELECT o1_0.id, o1_0.customer_email, o1_0.cre...
--
-- 847ms of active query time for a "paginated" endpoint.
-- During this time, this connection is unavailable to other requests.
-- With HikariCP's default pool of 10 connections, 10 concurrent
-- page requests consume the entire pool for nearly a second each.
-- Request 11 waits. Request 12 waits. The queue builds.

-- After the fix (two-query approach):
-- pid   | state  | duration       | query_preview
-- ------+--------+----------------+-----------------------------------------------
-- 14891 | active | 00:00:00.001   | SELECT o.id FROM orders o ORDER BY o.created_...
--
-- Connection held for 1ms instead of 847ms.
-- Same pool of 10 can now serve ~847x more page requests per connection.

With HikariCP's default pool of 10 connections, 10 concurrent requests to this endpoint saturate the pool. Request 11 waits. If the connection timeout is 30 seconds (the default), users see 30-second hangs followed by a SQLTransientConnectionException. If the connection timeout is shorter, they see faster failures. Neither outcome is what you intended when you added Pageable to your repository method.
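If you cannot fix the query immediately, you can at least fail fast instead of letting users queue for 30 seconds. A sketch of the relevant HikariCP settings under Spring Boot; the values are illustrative, not recommendations:

```yaml
# application.yml — illustrative values, tune for your workload
spring:
  datasource:
    hikari:
      maximum-pool-size: 10      # HikariCP default
      connection-timeout: 3000   # ms to wait for a free connection; default 30000
```

A shorter timeout does not make the endpoint faster. It makes the failure visible sooner, which is the honest alternative while the query is being repaired.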

Measuring the damage: Hibernate statistics

If you suspect this pattern exists in your codebase but are not certain, Hibernate's built-in statistics will confirm it. Enable them in your configuration and examine the output for any paginated endpoint.

Hibernate Session Metrics
// Enable Hibernate statistics to see the damage in your metrics:

// application.yml:
// spring.jpa.properties.hibernate.generate_statistics: true

// Then after any request, check the log:
//
// Session Metrics {
//     847000 nanoseconds spent acquiring 1 JDBC connections;
//     0 nanoseconds spent releasing 0 JDBC connections;
//     894712000 nanoseconds spent preparing 1 JDBC statements;
//     894712000 nanoseconds spent executing 1 JDBC statements;
//     0 nanoseconds spent performing 0 JDBC batches;
//     0 nanoseconds spent performing 0 L2C puts;
//     0 nanoseconds spent performing 0 L2C hits;
//     0 nanoseconds spent performing 0 L2C misses;
//     312000000 nanoseconds spent performing 1 flushes;
//     0 nanoseconds spent performing 0 partial-flushes;
// }
//
// 1 JDBC statement. 894ms executing. 312ms flushing.
// One statement, one second, one page of twenty orders.

One JDBC statement, 894 ms of execution time. For a paginated endpoint that should be returning in single-digit milliseconds. The statistics do not lie. If you see a single statement taking hundreds of milliseconds on a query that should be fast, HHH90003004 is the likely culprit.

You may also verify the behavior by enabling SQL logging — spring.jpa.show-sql=true or, more usefully, logging.level.org.hibernate.SQL=DEBUG — and observing the generated SQL. If the query contains a JOIN on a collection association and no LIMIT clause, pagination is happening in memory. Every time.

The @EntityGraph trap

Before we get to solutions, a brief note for those who have heard that @EntityGraph is somehow different from JOIN FETCH. It is not. Not for this problem.

@EntityGraph — same problem
// Alternative: @EntityGraph instead of JOIN FETCH in the query.
// Same problem. Same HHH90003004 warning.

public interface OrderRepository extends JpaRepository<Order, Long> {

    @EntityGraph(attributePaths = {"lineItems"})
    Page<Order> findAll(Pageable pageable);
    // ^ Still applies pagination in memory.
    // @EntityGraph and JOIN FETCH are different syntax
    // for the same operation: a SQL JOIN that defeats
    // row-level LIMIT.
}

@EntityGraph controls which associations to fetch eagerly. It does not change how they are fetched. Under the hood, Hibernate still generates a SQL JOIN. The same row multiplication happens. The same in-memory pagination applies. The same HHH90003004 warning appears.

I have seen teams spend considerable effort migrating from JOIN FETCH to @EntityGraph under the impression that it would resolve this problem. It does not. The two mechanisms are different syntax for the same SQL operation: a JOIN that produces one row per child entity. The difference between them is about where you declare which associations to load eagerly — in the query string or in an annotation. The row multiplication is identical.

Similarly, switching from Page to Slice does not help:

Slice — does not fix it either
// If you don't need total count, Slice avoids the COUNT query.
// But it does NOT fix the in-memory pagination problem.

public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("SELECT o FROM Order o JOIN FETCH o.lineItems")
    Slice<Order> findAllWithLineItems(Pageable pageable);
    // ^ Still HHH90003004. Still loads everything into memory.
    // Slice vs Page is about the COUNT query, not the LIMIT.
}

Slice avoids the COUNT(*) query that Page requires for total element count. That is a separate optimization and a valid one. But it has nothing to do with in-memory pagination. The JOIN FETCH still triggers a full table load regardless of the return type.

To be explicit about what does and does not trigger HHH90003004:

  • Triggers it: JOIN FETCH on a @OneToMany or @ManyToMany collection with Pageable.
  • Triggers it: @EntityGraph that includes a collection association with Pageable.
  • Does NOT trigger it: JOIN FETCH on a @ManyToOne or @OneToOne association. These do not multiply rows — each parent has exactly one joined row. LIMIT works correctly.
  • Does NOT trigger it: Any query without JOIN FETCH or @EntityGraph on a collection. Normal Spring Data pagination works as expected.

The distinction is row multiplication. If the JOIN can produce more than one row per root entity, LIMIT cannot paginate by entity. If the JOIN produces exactly one row per root entity, as with @ManyToOne, LIMIT works perfectly and Hibernate applies it at the SQL level.

The two-query workaround

The solution that the Hibernate and Spring Data communities have converged on is straightforward once you see it: separate the pagination query from the fetch query.

Query 1 paginates the IDs. No JOIN FETCH, no collection association, just a clean query on the root entity with proper LIMIT and OFFSET applied by the database.

Query 2 fetches the full entities — with their collections, using JOIN FETCH — for only the IDs returned by query 1.

OrderRepository.java — the two-query approach
public interface OrderRepository extends JpaRepository<Order, Long> {

    // Query 1: Paginate the IDs only — no JOIN FETCH, clean LIMIT/OFFSET
    @Query("SELECT o.id FROM Order o ORDER BY o.createdAt DESC")
    Page<Long> findOrderIds(Pageable pageable);

    // Query 2: Fetch the full entities for just those IDs
    @Query("SELECT o FROM Order o JOIN FETCH o.lineItems WHERE o.id IN :ids")
    List<Order> findAllWithLineItemsByIds(@Param("ids") List<Long> ids);
}
OrderService.java — assembling the result
@Service
public class OrderService {

    private final OrderRepository repo;

    public OrderService(OrderRepository repo) {
        this.repo = repo;
    }

    public Page<Order> getOrdersWithLineItems(Pageable pageable) {
        // Step 1: Get one page of IDs (clean SQL with LIMIT/OFFSET)
        Page<Long> idPage = repo.findOrderIds(pageable);

        if (idPage.isEmpty()) {
            return new PageImpl<>(List.of(), pageable, 0);
        }

        // Step 2: Fetch full entities for those IDs only
        List<Order> orders = repo.findAllWithLineItemsByIds(idPage.getContent());

        // Re-sort to match the original ordering
        Map<Long, Order> orderMap = orders.stream()
                .collect(Collectors.toMap(Order::getId, Function.identity()));

        List<Order> sorted = idPage.getContent().stream()
                .map(orderMap::get)
                .filter(Objects::nonNull)
                .toList();

        return new PageImpl<>(sorted, pageable, idPage.getTotalElements());
    }
}

The service layer does the coordination. It runs the ID query with the Pageable (which Spring Data handles cleanly, adding LIMIT and OFFSET to the SQL). Then it passes those IDs to the fetch query, which uses WHERE IN to retrieve exactly the entities it needs with their full collection graph.

Here is what PostgreSQL sees now:

PostgreSQL queries — two-query approach
-- Query 1: What PostgreSQL sees for the ID query
SELECT o.id FROM orders o ORDER BY o.created_at DESC LIMIT 20 OFFSET 0;
-- Returns: 20 IDs in 0.3ms

-- Query 2: What PostgreSQL sees for the fetch query
SELECT o.id, o.customer_email, o.created_at,
       li.id, li.product_name, li.quantity, li.price
FROM orders o
JOIN line_items li ON o.id = li.order_id
WHERE o.id IN (99842, 99841, 99840, 99839, ... 99823);
-- Returns: ~170 rows (20 orders * ~8.5 avg line items) in 0.8ms

-- Two queries. ~190 rows transferred. 1.1ms total.
-- Compare: one query, 847,000 rows, 895ms.

Two queries. 190 rows transferred instead of 847,000. 1.1 ms instead of 895 ms. The heap allocation drops from 310 MB to approximately 0.3 MB.

The re-sort step in the service is necessary because the WHERE IN query does not guarantee order. The Map lookup preserves the original sort order from query 1. This is cheap — 20 map lookups cost microseconds.

I should note a subtlety about the WHERE IN clause: when passing 20 IDs, Hibernate generates a query with 20 parameter placeholders. Each distinct parameter count creates a new entry in Hibernate's QueryPlanCache. If your page size is always 20, this is one cache entry. If your API allows clients to specify page size — ?size=10, ?size=25, ?size=50 — each produces a distinct query plan. For a small, fixed set of allowed sizes, this is negligible. For unbounded client-specified sizes, consult the QueryPlanCache guide.
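One mitigation worth knowing about is Hibernate's hibernate.query.in_clause_parameter_padding=true property, which pads IN lists up to the next power of two so that many distinct list sizes collapse onto a few cached plans. The following standalone sketch illustrates the padding idea only; it is my own toy, not Hibernate's implementation:

```java
import java.util.*;

// Sketch of the parameter-padding idea behind Hibernate's
// in_clause_parameter_padding setting: pad the IN list to the next
// power of two (repeating the last ID, which is harmless in IN (...))
// so that list sizes 5, 6, 7, 8 all share one 8-parameter query plan.
public class InClausePadding {

    static List<Long> padToPowerOfTwo(List<Long> ids) {
        if (ids.isEmpty()) return ids;
        int target = 1;
        while (target < ids.size()) target *= 2; // 5 -> 8, 25 -> 32 ...
        List<Long> padded = new ArrayList<>(ids);
        Long last = ids.get(ids.size() - 1);
        while (padded.size() < target) padded.add(last);
        return padded;
    }

    public static void main(String[] args) {
        System.out.println(padToPowerOfTwo(List.of(1L, 2L, 3L, 4L, 5L)).size()); // 8
    }
}
```

With padding, page sizes 10 through 16 all produce a 16-parameter query, and the QueryPlanCache holds a handful of entries instead of one per size.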

The COUNT query: an expense worth examining

The two-query approach uses Page<Long> for the ID query, which means Spring Data generates a COUNT query alongside the SELECT. This count is necessary for the Page response to report total elements and total pages.

COUNT query optimization
// The COUNT query: an expense worth examining.
//
// Spring Data's Page<Long> findOrderIds(Pageable pageable) generates:
//   SELECT COUNT(o.id) FROM orders o;
//
// On 100,000 rows, this costs ~2ms. Acceptable.
// On 10,000,000 rows, this costs ~180ms. Less acceptable.
//
// PostgreSQL must scan the entire table (or an index) for COUNT.
// There is no magic shortcut. MVCC means every transaction may
// see a different row count, so PostgreSQL cannot cache it.
//
// Options if COUNT becomes expensive:
//
// 1. Use Slice<Long> instead of Page<Long> — no COUNT, but no total.
//    Fine for infinite scroll. Unacceptable for "Page 3 of 147."
//
// 2. Cache the count with a short TTL:
//    @Cacheable(value = "orderCount", key = "'total'")
//    @Query("SELECT COUNT(o) FROM Order o")
//    long countOrders();
//
// 3. Use pg_class.reltuples for an approximate count:
//    SELECT reltuples::bigint FROM pg_class WHERE relname = 'orders';
//    -- Returns the estimate from the last ANALYZE.
//    -- Accurate within ~5% for regularly-vacuumed tables.
//    -- Good enough for "approximately 2.4 million orders."

For most applications under 1 million rows, the COUNT query costs 2-10 ms on an indexed table. Acceptable. For tables with tens of millions of rows, the count becomes the dominant cost in the two-query approach, potentially exceeding the data query itself.

If your UI only needs "next page" and "previous page" buttons — no total count, no "page 3 of 147" display — switch the ID query to return Slice<Long> instead of Page<Long>. This eliminates the COUNT entirely while preserving the pagination fix. The trade-off is honest: you gain performance, you lose the total count. For infinite scroll interfaces, this is the correct choice.

The multiple-collection trap

The pagination problem compounds when your entity has more than one collection. If Order has both lineItems and shipments, JOIN FETCH on both produces a Cartesian product.

Multiple collections — Cartesian product
// What if your entity has TWO collections?
// Order has lineItems AND shipments.
// JOIN FETCH on both produces a Cartesian product.

@Entity
public class Order {
    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private List<LineItem> lineItems = new ArrayList<>();

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private List<Shipment> shipments = new ArrayList<>();
}

// This is WRONG — Cartesian product:
@Query("SELECT o FROM Order o JOIN FETCH o.lineItems JOIN FETCH o.shipments")
List<Order> findAllWithEverything();
// An order with 8 line items and 3 shipments produces 24 rows (8 × 3).
// With two List collections, Hibernate throws MultipleBagFetchException;
// with Set collections the query runs and silently multiplies rows.

// Correct approach: fetch collections separately.
@Query("SELECT o FROM Order o JOIN FETCH o.lineItems WHERE o.id IN :ids")
List<Order> findWithLineItems(@Param("ids") List<Long> ids);

@Query("SELECT o FROM Order o JOIN FETCH o.shipments WHERE o.id IN :ids")
List<Order> findWithShipments(@Param("ids") List<Long> ids);

// Or use Set instead of List to avoid MultipleBagFetchException —
// but the Cartesian product still wastes rows.

An order with 8 line items and 3 shipments produces 24 rows in the JOIN result (8 multiplied by 3). The data duplication is multiplicative, not additive. For an entity with three collections of size 8, 3, and 5, the JOIN produces 120 rows per entity. Add pagination into this arrangement and you have the in-memory pagination problem multiplied by the Cartesian explosion — a combination that I can only describe as architecturally ambitious in the worst possible sense.

The two-query approach extends naturally to multiple collections: paginate the IDs first, then fetch each collection in a separate query. Three queries instead of one, but each is efficient and targeted. This is not a compromise. This is the correct design. One query per collection, each doing exactly the work it needs to do, no more.
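The multiplicative-versus-additive point can be checked with trivial arithmetic. A throwaway sketch (invented helpers, for illustration only) comparing row counts for the double JOIN FETCH against two targeted queries over the same orders:

```java
// Back-of-envelope check: rows produced by one JOIN across two
// collections vs. two separate collection queries, per order.
public class CartesianCost {

    // One JOIN FETCH on both collections: rows MULTIPLY per order.
    static long joinedRows(int[] lineItems, int[] shipments) {
        long rows = 0;
        for (int i = 0; i < lineItems.length; i++) {
            rows += (long) lineItems[i] * shipments[i];
        }
        return rows;
    }

    // Two targeted queries: rows ADD per order.
    static long separateRows(int[] lineItems, int[] shipments) {
        long rows = 0;
        for (int i = 0; i < lineItems.length; i++) {
            rows += lineItems[i] + shipments[i];
        }
        return rows;
    }

    public static void main(String[] args) {
        int[] items = {8, 8, 8}; // three orders, 8 line items each
        int[] ship  = {3, 3, 3}; // 3 shipments each
        System.out.println(joinedRows(items, ship));   // 72 rows (24 per order)
        System.out.println(separateRows(items, ship)); // 33 rows (11 per order)
    }
}
```

The gap grows with collection sizes; the separate queries never multiply.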

The @BatchSize alternative

If the two-query pattern feels like too much ceremony for your use case, there is a simpler alternative that trades a small amount of query efficiency for significantly less code.

@BatchSize — skip the JOIN entirely
// Alternative approach: skip JOIN FETCH entirely.
// Use Hibernate's @BatchSize to solve N+1 without a JOIN.

@Entity
@Table(name = "orders")
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    @BatchSize(size = 20)
    private List<LineItem> lineItems = new ArrayList<>();

    // ...
}

// Now this works with clean pagination:
public interface OrderRepository extends JpaRepository<Order, Long> {
    // No JOIN FETCH, no @EntityGraph, just a normal query
    Page<Order> findAllByOrderByCreatedAtDesc(Pageable pageable);
}

// When you access order.getLineItems(), Hibernate loads
// line items for up to 20 orders in a single SELECT...WHERE IN.
// Result: 1 query for orders (with LIMIT) + 1 query for line items.
// Not as efficient as JOIN FETCH, but pagination works correctly.

With @BatchSize(size = 20), you drop JOIN FETCH entirely and let Hibernate paginate the orders with a clean LIMIT/OFFSET query. When you access order.getLineItems() for the first time, Hibernate does not issue one query per order (the N+1 pattern). Instead, it issues a single SELECT ... WHERE order_id IN (?, ?, ?, ...) that loads line items for up to 20 orders at once.

The result is 2 queries instead of 1, but both are efficient. The first query paginates cleanly. The second loads exactly the line items needed. Total rows transferred: approximately 190, same as the two-query workaround.

The trade-off: @BatchSize is a Hibernate-specific annotation, not JPA standard. And the second query only fires when you access the lazy collection, which means it happens outside your repository method. For REST endpoints that always serialize the full object graph, this is fine. For complex service logic that conditionally accesses collections, the timing can surprise you.

There is also the question of batch size alignment. If your page size is 20 and your @BatchSize is 20, one batch query loads all line items. If your page size is 50 and your @BatchSize is 20, Hibernate issues 3 batch queries (20 + 20 + 10). Still far better than 50 individual queries, but worth understanding.
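The alignment arithmetic is simple enough to write down. A tiny standalone helper (hypothetical, for illustration) for the number of WHERE IN batch queries per page:

```java
// Batch queries Hibernate issues to load one page's collections:
// ceil(pageSize / batchSize), computed with integer arithmetic.
public class BatchAlignment {

    static int batchQueries(int pageSize, int batchSize) {
        return (pageSize + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        System.out.println(batchQueries(20, 20)); // 1 batch
        System.out.println(batchQueries(50, 20)); // 3 batches (20 + 20 + 10)
    }
}
```

Matching @BatchSize to your most common page size keeps this at one query per page.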

Global default_batch_fetch_size
// Hibernate 6.x: set a global default instead of annotating every collection.
// application.yml:
//   spring.jpa.properties.hibernate.default_batch_fetch_size: 25
//
// This applies @BatchSize(size = 25) to ALL lazy associations globally.
// Individual @BatchSize annotations override the global default.
//
// With this setting, you can remove JOIN FETCH from paginated queries
// entirely and rely on batch fetching for collection loading.
//
// The SQL Hibernate produces:
//
// Query 1 (paginated, clean LIMIT):
//   SELECT o.id, o.customer_email, o.created_at
//   FROM orders o ORDER BY o.created_at DESC LIMIT 20 OFFSET 0;
//
// Query 2 (batch fetch when lineItems accessed):
//   SELECT l.id, l.order_id, l.product_name, l.quantity, l.price
//   FROM line_items l
//   WHERE l.order_id IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
//                        ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);
//
// Two queries. Both efficient. No in-memory pagination.

For applications with many collections across many entities, the global default_batch_fetch_size setting is the most practical approach. Set it once, remove JOIN FETCH from paginated queries, and let Hibernate batch the collection loading automatically. I recommend a value between 20 and 50 — large enough to minimize the number of batch queries, small enough that the WHERE IN clause remains efficient.

"The ORM did not fail. It did exactly what was asked. It was simply asked poorly."

— from You Don't Need Redis, Chapter 3: The ORM Tax

The SUBSELECT alternative

Hibernate offers a third collection-loading strategy that sits between JOIN FETCH and @BatchSize: @Fetch(FetchMode.SUBSELECT).

FetchMode.SUBSELECT — replayed parent query
// Hibernate's @Fetch(FetchMode.SUBSELECT) — another alternative.
// Loads ALL line items for all fetched orders in one subquery.

@Entity
@Table(name = "orders")
public class Order {

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    @Fetch(FetchMode.SUBSELECT)
    private List<LineItem> lineItems = new ArrayList<>();
}

// When you access lineItems, Hibernate generates:
//   SELECT l.* FROM line_items l
//   WHERE l.order_id IN (
//       SELECT o.id FROM orders o ORDER BY o.created_at DESC LIMIT 20 OFFSET 0
//   );
//
// The subquery replays the original parent query.
// Elegant, but: the subquery is the FULL parent query with LIMIT,
// which can be expensive if the parent query itself is complex.
// For simple parent queries, this is very efficient.
// For complex parent queries with multiple JOINs or WHERE clauses,
// the replayed subquery adds measurable overhead.

SUBSELECT replays the original parent query as a subquery inside the collection-loading query. This guarantees that all line items for all fetched orders are loaded in a single query, regardless of page size. No batch alignment to worry about. No fixed batch size to configure.

The trade-off is that the subquery replays the parent query's full structure. For a simple SELECT o FROM Order o ORDER BY o.createdAt DESC LIMIT 20, the replayed subquery is cheap. For a parent query with multiple JOINs, WHERE clauses, and subqueries of its own, the replayed subquery executes all of that work again inside the collection query. For complex parent queries, @BatchSize is more predictable.

I consider SUBSELECT the most elegant of the three approaches when the parent query is simple. But elegance is not the only criterion. Predictability matters in production, and @BatchSize gives you a fixed, measurable cost per batch regardless of parent query complexity.

The benchmark: approaches compared

100,000 orders, 847,000 line items total. PostgreSQL 16. Spring Boot 3.2, Hibernate 6.4. Page size 20. All queries measured with warm caches.

Approach                           | SQL queries | Rows transferred | JVM heap | Response time
JOIN FETCH + Pageable (naive)      | 1           | 847,000          | ~310 MB  | 895 ms
Two-query workaround (page 0)      | 2 + 1 COUNT | ~190             | ~0.3 MB  | 1.4 ms
Two-query workaround (page 5,000)  | 2 + 1 COUNT | ~190             | ~0.3 MB  | 64 ms
@BatchSize(20), no JOIN FETCH      | 2           | ~190             | ~0.3 MB  | 2.1 ms
Two-query + keyset (any page)      | 2           | ~190             | ~0.3 MB  | 1.2 ms

The naive approach transfers 4,458x more rows and uses 1,033x more heap than any of the alternatives. At page 0. The gap widens as data grows, because the naive approach always loads everything while the alternatives remain constant.

The two-query workaround at page 5,000 shows the OFFSET degradation problem. 64 ms is acceptable for most applications. For APIs that expose very deep pagination — search results, admin dashboards, data exports — it will eventually become a bottleneck. That is where keyset pagination enters the picture; the dedicated keyset pagination guide addresses this exact concern.

A note on the numbers: 847,000 rows is our test dataset. In production systems I have observed, order tables with 500,000 to 2,000,000 rows are common for established e-commerce applications. The naive approach's response time scales linearly with row count. At 2 million orders with an average of 8.5 line items each, the naive query transfers 17 million rows and takes approximately 18 seconds. The corrected approaches remain at 190 rows and 1-2 ms regardless. The gap is not theoretical. It is the difference between a functioning endpoint and an incident.

When OFFSET still bites: deep pages

The two-query workaround solves the in-memory pagination problem completely. But the ID query still uses OFFSET, and OFFSET has a well-documented linear degradation curve.

EXPLAIN ANALYZE — deep OFFSET
-- The two-query workaround still uses OFFSET pagination.
-- At page 5,000 (OFFSET 100,000), PostgreSQL must:

EXPLAIN ANALYZE
SELECT o.id FROM orders o ORDER BY o.created_at DESC
LIMIT 20 OFFSET 100000;

--                           QUERY PLAN
-- ────────────────────────────────────────────────────────────────
-- Limit  (cost=4421.12..4421.98 rows=20 width=16)
--        (actual time=62.8..62.9 rows=20 loops=1)
--   ->  Index Scan Backward using idx_orders_created_at on orders
--          (cost=0.42..4421.98 rows=100020 width=16)
--          (actual time=0.03..58.7 rows=100020 loops=1)
-- Planning Time: 0.09 ms
-- Execution Time: 62.9 ms
--
-- PostgreSQL reads 100,020 index entries to return 20.
-- At OFFSET 500,000, this takes 310ms.
-- At OFFSET 900,000, this takes 580ms.
-- Linear degradation. No amount of indexing fixes OFFSET.

At page 5,000, PostgreSQL reads 100,020 index entries to return 20. The work grows linearly with the page number. At page 25,000, it reads 500,020 entries. At page 45,000, it reads 900,020 entries. Each entry requires an index lookup, and the time grows proportionally.

For applications where users realistically access pages 1 through 50, this is a non-issue. Most e-commerce order listings, admin panels, and user-facing tables fall into this category. I would go further: if your users are regularly navigating to page 5,000 of a result set, the problem is not OFFSET. The problem is that your search and filtering are inadequate. Nobody wants to browse page 5,000. They want to find order #48923 and your interface should let them.

For applications that expose deep pagination — infinite scroll feeds, data export APIs, background sync jobs that page through entire tables — keyset pagination eliminates the linear degradation:

Keyset pagination — constant time at any depth
-- Keyset pagination eliminates OFFSET entirely.
-- Instead of "skip 100,000 rows," you say "rows after this cursor."

SELECT o.id FROM orders o
WHERE o.created_at < '2025-08-14T09:32:11Z'
   OR (o.created_at = '2025-08-14T09:32:11Z' AND o.id < 48923)
ORDER BY o.created_at DESC, o.id DESC
LIMIT 20;

-- Constant time regardless of page depth.
-- See: /grounds/query-optimization/keyset-pagination

The keyset pagination guide covers the full implementation, including Spring Data integration, cursor encoding, and handling multi-column sort orders.
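One piece of that implementation can be sketched without any framework at all: encoding the last row's (createdAt, id) pair into an opaque cursor the client passes back. This is a minimal illustration under assumed conventions — the KeysetCursor class name and the pipe-delimited Base64 format are my own, not part of any library:

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Base64;

public class KeysetCursor {

    // Encode the last row's (createdAt, id) into an opaque cursor string.
    static String encode(Instant createdAt, long id) {
        String raw = createdAt.toString() + "|" + id;
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a cursor back into the (createdAt, id) pair that feeds the
    // WHERE clause of the next keyset query.
    static Object[] decode(String cursor) {
        String raw = new String(Base64.getUrlDecoder().decode(cursor),
                StandardCharsets.UTF_8);
        int sep = raw.lastIndexOf('|');
        return new Object[] {
                Instant.parse(raw.substring(0, sep)),
                Long.parseLong(raw.substring(sep + 1))
        };
    }

    public static void main(String[] args) {
        String cursor = encode(Instant.parse("2025-08-14T09:32:11Z"), 48923L);
        Object[] decoded = decode(cursor);
        System.out.println(decoded[0] + " " + decoded[1]);
        // prints: 2025-08-14T09:32:11Z 48923
    }
}
```

Base64-URL encoding keeps the cursor safe to embed in a query parameter; the client treats it as opaque and never parses it.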

An honest counterpoint: when in-memory pagination is acceptable

I have spent considerable space explaining why JOIN FETCH with Pageable is a performance catastrophe. I should be equally honest about when it is not.

If your table has 200 rows and will never have significantly more — a lookup table, a configuration table, a list of warehouse locations — the in-memory pagination costs approximately 0.5 ms and 0.1 MB of heap. The HHH90003004 warning is technically accurate but operationally irrelevant. Fixing it adds complexity for zero perceptible benefit.

If your endpoint is internal-only, called once per hour by a background job, and the table has 5,000 rows, the naive approach costs perhaps 15 ms and 5 MB. Wasteful, but not dangerous. The two-query workaround is still the better approach — it costs nothing extra to implement — but failing to implement it is not an incident waiting to happen. It is a minor inefficiency.

The pattern becomes dangerous when three conditions converge: the table is large (tens of thousands of rows or more), the endpoint is user-facing (concurrent requests), and the data grows over time (today's 10,000 rows become next quarter's 100,000). If all three are true, fix it. If none are true, you have bigger problems to attend to. If one or two are true, use your judgment, but lean toward fixing it. The fix is two queries and a Map. It is not a rewrite.

A waiter who overstates his case is no waiter at all. This problem is real and common, but it is not universal. The context matters.

Reproducing the problem in a test

If you would like to verify the behavior in your own codebase before committing to a fix, the test is straightforward.

Integration test — reproducing HHH90003004
import static org.assertj.core.api.Assertions.assertThat;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.domain.Sort;

@SpringBootTest
class PaginationMemoryTest {

    @Autowired
    private OrderRepository orderRepository;

    @Test
    void joinFetchWithPageableLoadsEverything() {
        // Given: 1,000 orders with ~8 line items each (8,500 line items total)
        // (smaller dataset for a test, same proportional behavior)

        Pageable pageable = PageRequest.of(0, 20, Sort.by(Sort.Direction.DESC, "createdAt"));

        // When: naive JOIN FETCH + Pageable
        long beforeHeap = usedHeap();
        Page<Order> page = orderRepository.findAllWithLineItems(pageable);
        long afterHeap = usedHeap();

        // Then: returns 20, but loaded everything
        assertThat(page.getContent()).hasSize(20);
        assertThat(page.getTotalElements()).isEqualTo(1000);

        // The heap grew by megabytes for 20 displayed orders
        long heapGrowth = afterHeap - beforeHeap;
        System.out.println("Heap growth: " + (heapGrowth / 1024 / 1024) + " MB");
        // Output: Heap growth: 31 MB (for 1,000 orders)
        // At 100,000 orders: ~310 MB

        // Check Hibernate's SQL log — you will see no LIMIT clause.
        // The HHH90003004 warning appears in test output.
    }

    private static long usedHeap() {
        return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    }
}

Run this test with SQL logging enabled (logging.level.org.hibernate.SQL=DEBUG) and observe two things: the generated SQL contains no LIMIT clause, and the HHH90003004 warning appears in the test output. The heap measurement is approximate — JVM heap is not transactional — but it will show growth in the megabyte range for a few hundred orders, confirming the full-table load.

For a more precise measurement, use -XX:+PrintGCDetails or a profiler like VisualVM attached to the test JVM. The allocation rate during the query execution will be visibly disproportionate to the result set size.

Choosing the right approach: a decision guide

You now have four approaches to paginated collection loading: the two-query workaround, @BatchSize, FetchMode.SUBSELECT, and keyset pagination for deep pages. Allow me to suggest when each is appropriate.

The two-query workaround is the correct default for most applications. It is explicit, predictable, JPA-standard (no Hibernate-specific annotations), and gives you full control over both queries. Use it when the endpoint always needs the collection loaded and you want the most efficient possible query plan. The cost is a small amount of service-layer code for coordination and re-sorting.
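That re-sorting step is small enough to show in isolation. The second query (WHERE id IN :ids) returns rows in whatever order the database prefers, so the service layer restores the page order from the first query. A minimal framework-free sketch — the PageReorder class and its nested Order record are illustrative names, not the article's repository code:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PageReorder {

    record Order(long id, String ref) {}

    // pageIds carries the order established by the paginated ID query;
    // fetched carries the same entities in arbitrary database order.
    static List<Order> sortByIdPage(List<Long> pageIds, List<Order> fetched) {
        Map<Long, Order> byId = fetched.stream()
                .collect(Collectors.toMap(Order::id, o -> o));
        // One map lookup per ID restores the page order in O(n).
        return pageIds.stream().map(byId::get).toList();
    }

    public static void main(String[] args) {
        List<Long> pageIds = List.of(42L, 7L, 19L);           // from the ID query
        List<Order> fetched = List.of(                         // from the IN query
                new Order(7L, "B"), new Order(19L, "C"), new Order(42L, "A"));
        System.out.println(sortByIdPage(pageIds, fetched));
        // prints: [Order[id=42, ref=A], Order[id=7, ref=B], Order[id=19, ref=C]]
    }
}
```

This is the "two queries and a Map" the article keeps referring to; the Map is the entire coordination cost.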

@BatchSize or default_batch_fetch_size is the correct choice when you want a global fix with minimal code changes. Set default_batch_fetch_size=25 in your configuration, remove JOIN FETCH from paginated queries, and let Hibernate handle the batch loading. Use this when you have many entities with collections and want to fix them all at once. The cost is a Hibernate-specific configuration and slightly less control over query timing.
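For concreteness, the two forms look roughly like this — the global property, plus the optional per-collection annotation on the article's running Order/LineItem example. A sketch, not a complete entity:

```java
// Global form, in application.properties:
//   spring.jpa.properties.hibernate.default_batch_fetch_size=25

import java.util.ArrayList;
import java.util.List;

import jakarta.persistence.Entity;
import jakarta.persistence.FetchType;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import jakarta.persistence.OneToMany;

import org.hibernate.annotations.BatchSize;

@Entity
public class Order {

    @Id
    @GeneratedValue
    private Long id;

    // Per-collection form: when one lazy lineItems collection is first
    // accessed, Hibernate initializes up to 25 of them with a single
    // IN query instead of one query per order.
    @BatchSize(size = 25)
    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private List<LineItem> lineItems = new ArrayList<>();
}
```

The global property covers every lazy association at once; the annotation is useful when one collection warrants a different batch size than the rest.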

FetchMode.SUBSELECT is the correct choice when your parent queries are simple and you want exactly two queries regardless of page size. Use it when @BatchSize alignment with page size is a concern. The cost is that complex parent queries get replayed as subqueries.

Keyset pagination replaces OFFSET in the ID query when deep pages are a requirement. It is an addition to any of the above approaches, not a replacement. Use it when your data set is large and your users — or your background jobs — need to traverse deep into the result set.

For most Spring Data JPA applications, I would recommend starting with default_batch_fetch_size=25 as a global setting and then applying the two-query workaround to the specific endpoints where collection loading performance is most critical. This gives you broad protection against N+1 queries with minimal effort, plus surgical optimization where it matters most.

What Gold Lapel sees in your pagination queries

Gold Lapel sits between your Spring Boot application and PostgreSQL as a transparent proxy. It observes every query on the wire protocol. It does not read your annotations or inspect your repository interfaces. It sees the SQL that Hibernate actually sends.

When Hibernate sends a JOIN query without LIMIT or OFFSET — the signature of in-memory pagination — Gold Lapel flags it. The query returns 847,000 rows where the table has 100,000 parents. The ratio is visible. The unbounded result set is unmistakable. Gold Lapel's dashboard marks this as an unlimited result set and estimates the memory cost to your application.

Gold Lapel also detects the deep OFFSET pattern in your ID queries. When an OFFSET value exceeds a configurable threshold — say, 10,000 — Gold Lapel logs a warning with the specific query, the OFFSET value, and the estimated row scan cost. For applications that have adopted the two-query workaround but not yet moved to keyset pagination, this is an early warning that deep pages are becoming expensive.

For query rewriting, Gold Lapel can transform certain unbounded queries into subquery-based pagination. When it detects a parent-child JOIN without LIMIT that matches a known pattern — the same pattern Hibernate produces for JOIN FETCH with Pageable — it can rewrite the query to use a subquery that paginates the parent IDs first, then JOINs the children. This is functionally equivalent to the two-query workaround, applied at the proxy layer without any code changes.
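A sketch of that subquery shape, using the article's orders/line_items tables. This shows the general pattern of ID-first pagination at the SQL level, not Gold Lapel's literal output:

```sql
-- Before: parent-child JOIN with no LIMIT (Hibernate's JOIN FETCH shape).
-- After: paginate the parent IDs in a subquery, then JOIN the children
-- against only those 20 parents.
SELECT o.*, li.*
FROM orders o
JOIN line_items li ON li.order_id = o.id
WHERE o.id IN (
    SELECT id FROM orders
    ORDER BY created_at DESC
    LIMIT 20 OFFSET 0
)
ORDER BY o.created_at DESC, o.id DESC;
```

The row count drops from the full cartesian result to roughly page size times average collection size — about 190 rows in this article's dataset.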

This is not a replacement for fixing the repository code. The two-query pattern or @BatchSize approach is the correct long-term fix. But for applications with dozens of JOIN FETCH + Pageable combinations across a large codebase, proxy-level rewriting provides immediate relief while the code changes are prioritized and rolled out.

Eight hundred and forty-seven thousand rows

Eight hundred and forty-seven thousand rows to display twenty orders. The database did its job. The ORM did its job. The wire protocol did its job. Everyone did exactly what was asked, and the result was a 310 MB allocation to render a table with twenty rows.

The fix is two queries and a Map. Or a single configuration property. Or an annotation on the collection. The approaches differ in their trade-offs, but they share a common principle: do not ask the database for data you intend to discard. If you want 20 orders, tell PostgreSQL to return 20 orders. If you need their line items, ask for those separately. Two efficient queries will always outperform one wasteful query that loads everything and then throws most of it away.

The HHH90003004 warning is still in your logs. It has been there since the day you deployed the JOIN FETCH + Pageable combination. It will be there tomorrow and the day after, at the default log level, scrolling past with every paginated request. Four hundred times a day. At 847,000 rows per occurrence.

The monitoring is automatic. The fix is straightforward. Your guests are waiting.

One further thought, if I may. The @EntityGraph that often accompanies JOIN FETCH has its own cartesian product problem, distinct from the pagination issue. I have written about it in the @EntityGraph cartesian product guide — a companion pitfall that tends to appear in the same codebase.