
Hibernate's QueryPlanCache and the PostgreSQL IN-Clause: The OOM You Didn't See Coming

Your query plan cache is consuming 89% of your heap. Here are the 247,319 entries responsible.

The Waiter of Gold Lapel · Published Mar 5, 2026 · Updated Mar 20, 2026 · 24 min read
The illustration was cached 247,319 times. We are presently clearing the heap.

Good evening. I regret to inform you that your SessionFactory is holding the heap hostage.

You have a Spring Boot application backed by Hibernate and PostgreSQL. It has been running in production for months without incident. Your heap is sized at 2 GB. Your -Xmx was set with care. You have dashboards. You have alerts. You have, if I may say so, a reasonable expectation that the application will continue to work.

And then, during a perfectly normal traffic spike, your application dies with an OutOfMemoryError pointing squarely at SessionFactoryImpl.

// Your Spring Boot application has been running fine for weeks.
// Then one Thursday at 14:23, during a batch-processing spike:

java.lang.OutOfMemoryError: Java heap space
    at org.hibernate.internal.SessionFactoryImpl$QueryPlanCacheAccess.get(SessionFactoryImpl.java:781)
    at org.hibernate.internal.SessionFactoryImpl.getQueryPlan(SessionFactoryImpl.java:602)
    at org.hibernate.internal.AbstractSharedSessionContract.getQueryPlan(AbstractSharedSessionContract.java:631)
    at org.hibernate.internal.AbstractSharedSessionContract.createQuery(AbstractSharedSessionContract.java:743)

// You take a heap dump. Eclipse MAT reports:
//
//   SessionFactoryImpl retains 1.4 GB of heap
//   89% of total heap consumed by QueryPlanCache
//   Problem suspect: org.hibernate.engine.query.spi.QueryPlanCache
//   with 247,319 entries

You open the heap dump. Eclipse MAT tells you that 89% of your heap is consumed by something called QueryPlanCache. It contains a quarter of a million entries. You were expecting perhaps fifty.

This is not a memory leak in the traditional sense. No object references are dangling. No resources are unclosed. Hibernate is caching query plans exactly as designed. The problem is that your IN-clause queries are generating thousands of distinct query strings, and Hibernate is faithfully caching every single one.

I have attended to this particular situation more often than I would prefer. It is, in my observation, one of the most common causes of OOM failures in production Hibernate applications, and one of the least diagnosed. The fix is three lines of configuration. The diagnosis, if you do not know what to look for, can consume days.

What follows should spare you those days.

Why this OOM is uniquely treacherous

Most memory leaks reveal themselves gradually. A connection leak shows increasing connection counts over hours. An unclosed stream shows a slow heap climb on your Grafana dashboard. You have time to notice. Time to react.

The QueryPlanCache OOM is different. It lies dormant.

During normal operations, your application may use 20 or 30 distinct IN-clause sizes. The cache holds a few hundred entries. Heap consumption is modest. Your monitoring shows nothing unusual. This state can persist for weeks or months.

Then a batch job runs. Or a report generates. Or a new API endpoint goes live that accepts a list of IDs from the client. Suddenly the application encounters IN clauses with sizes it has never seen before: 47, 193, 412, 783. Each new size creates a new cache entry. Each entry consumes 5 to 50 KB of heap. The cache balloons from 300 entries to 30,000 in the span of minutes.

Your monitoring may not catch this because the cache growth is rapid and the OOM follows shortly. There is no gradual ramp. There is "fine" and then there is java.lang.OutOfMemoryError. The GC logs will show a frantic escalation of full collections in the final minutes, but by the time you notice, the process is already dead.

The treachery is compounded by the fact that the heap dump, taken post-mortem, shows the result but not the trigger. You see 247,319 entries in the cache. You do not see which endpoint or which batch job introduced the novel IN-clause sizes. You are left to reconstruct the timeline from access logs and application metrics, if they exist.

I find this behaviour — caching with enthusiasm but evicting with reluctance — to be the infrastructural equivalent of a waiter who accepts every parcel delivered to the door but never once tidies the receiving room.

How one repository method becomes 1,000 cache entries

The mechanism is straightforward once you see it. Consider a typical Spring Data JPA repository method that accepts a list of IDs:

// You have a repository method like this:
@Query("SELECT o FROM Order o WHERE o.customerId IN :customerIds")
List<Order> findByCustomerIds(@Param("customerIds") List<Long> customerIds);

// When called with 3 customer IDs, Hibernate generates:
SELECT o FROM Order o WHERE o.customerId IN (?1, ?2, ?3)

// When called with 7 customer IDs:
SELECT o FROM Order o WHERE o.customerId IN (?1, ?2, ?3, ?4, ?5, ?6, ?7)

// When called with 12 customer IDs:
SELECT o FROM Order o WHERE o.customerId IN (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11, ?12)

// Hibernate treats each of these as a DISTINCT query.
// Each gets its own entry in the QueryPlanCache.
// If your application calls this method with lists ranging
// from 1 to 1,000 elements, that is 1,000 cache entries.
// For ONE method. Multiply by every IN-clause query you have.

This is not a contrived example. This is the findAllById pattern. The WHERE x IN (:list) pattern. It appears in virtually every non-trivial Hibernate application.

When the list has 3 elements, Hibernate generates an IN clause with 3 positional parameters. When it has 7 elements, the clause has 7 parameters. Each distinct parameter count produces a distinct query string. Each distinct query string gets its own entry in the QueryPlanCache.

If your application calls this method with list sizes ranging from 1 to 1,000 — which is entirely normal for batch processing, report generation, or bulk API endpoints — that single method generates 1,000 cache entries.

Now multiply by every repository method that takes a list parameter. Five such methods with similar variance: 5,000 entries. Ten methods: 10,000. The cache grows monotonically until it either hits the configured limit (and starts evicting useful entries) or exhausts the heap entirely.
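To make the keying concrete, here is a self-contained sketch that builds the query string Hibernate would cache for each list size and counts the distinct keys. The string construction is illustrative only; Hibernate's real cache keys also carry parameter metadata and query flags, but the one-key-per-list-size behaviour is the same.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class CacheKeyExplosion {

    // The JPQL string Hibernate would cache for an IN clause of `size` parameters.
    public static String queryFor(int size) {
        String params = IntStream.rangeClosed(1, size)
                .mapToObj(i -> "?" + i)
                .collect(Collectors.joining(", "));
        return "SELECT o FROM Order o WHERE o.customerId IN (" + params + ")";
    }

    public static void main(String[] args) {
        Set<String> cacheKeys = new HashSet<>();
        for (int size = 1; size <= 1000; size++) {
            cacheKeys.add(queryFor(size));    // one DISTINCT key per list size
        }
        System.out.println(cacheKeys.size()); // 1000 entries from one method
    }
}
```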

The hidden multiplier: Spring Data Specifications and Criteria queries

Direct @Query annotations are the obvious source of IN-clause variability. But there is a subtler source that I observe with increasing frequency in modern Spring Boot applications: the Criteria API and Spring Data JPA Specifications.

Specifications: combinatorial cache explosion
// Spring Data JPA Specifications: the hidden multiplier.
// Every dynamic filter combination generates a unique query string.

public class OrderSpecifications {

    public static Specification<Order> hasStatus(String status) {
        return (root, query, cb) ->
            cb.equal(root.get("status"), status);
    }

    public static Specification<Order> createdAfter(LocalDate date) {
        return (root, query, cb) ->
            cb.greaterThan(root.get("createdAt"), date);
    }

    public static Specification<Order> forCustomers(List<Long> ids) {
        return (root, query, cb) ->
            root.get("customerId").in(ids);
    }
}

// Usage in a service:
Specification<Order> spec = Specification
    .where(hasStatus("pending"))
    .and(createdAfter(someDate))
    .and(forCustomers(customerIds));  // variable-length list

orderRepository.findAll(spec);

// The Criteria API generates a unique JPQL string for each
// combination of active filters AND each distinct list size.
//
// status + date + 3 customers  -> unique query string
// status + date + 7 customers  -> another unique query string
// status + 3 customers (no date) -> yet another
//
// With 5 optional filters and variable-length IN clauses,
// the combinatorial explosion is severe.

The difficulty with Specifications is that the query string is not written by the developer — it is generated at runtime by the Criteria API. Each combination of active filters produces a unique JPQL string. Each unique list size in an in() predicate produces another variant on top of that.

Suppose you have a search endpoint with 5 optional filters, one of which is an IN clause. The combinatorial space is not 5. The four non-IN filters alone yield 2^4 = 16 combinations, and each combination is multiplied by the range of IN-clause sizes. With list sizes ranging from 1 to 100, you are looking at 1,600 distinct query strings from a single endpoint.

I have observed applications where the Specification-generated queries outnumber the explicit @Query methods by an order of magnitude in the plan cache. The developers had no idea. They had audited their repository interfaces and counted 40 methods. The cache contained 12,000 entries, and 11,500 of them were Criteria-generated.

If you use Specifications extensively, parameter padding becomes not merely advisable but essential. And you should consider whether the dynamic filter approach truly warrants the cache cost, or whether a finite set of named queries would serve your users equally well with dramatically better cache behaviour.
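If you cannot enable padding (for instance, on a Hibernate version that predates it), a complementary mitigation is to bound the variability at the source: chunk the ID list into fixed-size partitions before it reaches the `in()` predicate, so at most two distinct IN-clause sizes ever exist. A minimal sketch; the `partition` helper is hypothetical (Guava's `Lists.partition` does the same job), and the commented repository call refers back to the earlier example:

```java
import java.util.ArrayList;
import java.util.List;

public class InClauseChunking {

    // Split a list into fixed-size chunks; the final chunk may be smaller.
    // Querying chunk-by-chunk means at most TWO distinct IN-clause sizes
    // ever reach Hibernate: the chunk size and one remainder size.
    public static <T> List<List<T>> partition(List<T> source, int chunkSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < source.size(); i += chunkSize) {
            chunks.add(source.subList(i, Math.min(i + chunkSize, source.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 203; i++) ids.add(i);

        // 203 IDs in chunks of 50 -> sizes 50, 50, 50, 50, 3
        List<List<Long>> chunks = partition(ids, 50);
        System.out.println(chunks.size());        // 5
        System.out.println(chunks.get(4).size()); // 3

        // In a service you would then merge the per-chunk results:
        // List<Order> all = new ArrayList<>();
        // for (List<Long> chunk : chunks) {
        //     all.addAll(orderRepository.findByCustomerIds(chunk));
        // }
    }
}
```

The trade-off is more round trips for very large lists; padding remains the cheaper fix where it is available.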

What the QueryPlanCache actually stores

Understanding why this is expensive requires understanding what Hibernate caches per entry.

QueryPlanCache internals
// Hibernate's QueryPlanCache stores two things per entry:
//
// 1. The HQL/JPQL parse tree (the "query plan")
//    - HQL string -> AST -> validated semantic model
//    - This is expensive to compute: 0.5-2ms per parse
//    - The cache exists to avoid re-parsing identical HQL
//
// 2. The ParameterMetadata (parameter types, positions, nullability)
//    - Maps each ?N placeholder to its Java type and JDBC binding
//
// The cache is keyed by the EXACT query string, including
// the number and positions of all parameters.
//
// "SELECT o FROM Order o WHERE o.customerId IN (?1, ?2, ?3)"
// and
// "SELECT o FROM Order o WHERE o.customerId IN (?1, ?2, ?3, ?4)"
// are DIFFERENT keys. Different entries. Different memory.
//
// Default cache size: 2048 entries
//   (fixed before Hibernate 5.2.12; configurable via
//    hibernate.query.plan_cache_max_size from 5.2.12 onward)
//
// Each entry: 5-50 KB depending on query complexity
// 2048 entries * 50 KB = ~100 MB just for the plan cache
// But with IN-clause variants, you can blow past 2048 easily.

The cache exists for a good reason. Parsing HQL or JPQL is not cheap — it involves lexing, parsing to an AST, semantic validation against the entity model, and compilation to a SQL string. For a moderately complex query, this takes 0.5 to 2 milliseconds. Caching the result avoids repeating this work on every execution.

The problem is not that the cache exists. The problem is that IN-clause variability turns a bounded cache into an unbounded memory consumer. With 50 distinct queries (no IN-clause variance), the cache holds 50 entries and consumes perhaps 2.5 MB. With 50 queries, each varying across 500 IN-clause sizes, the cache holds 25,000 entries and consumes 500+ MB.

The math is unforgiving. And the cache's default size of 2,048 entries is large enough to cause real damage but small enough that many applications exceed it, triggering constant eviction churn that keeps GC pressure elevated even when the cache is nominally bounded.

I should also note what the cache does not store: the SQL execution plan that PostgreSQL generates. That lives on the server side, in PostgreSQL's own plan cache. Hibernate's cache and PostgreSQL's cache are independent, which means the same IN-clause variability causes cache bloat in both — but the symptoms manifest differently, and most teams only notice one.

The GC death spiral

The OOM itself is the final symptom. The degradation that precedes it is, in many ways, worse — because it affects every request, not just the one that triggers the crash.

GC log progression during cache bloat
// The GC death spiral in action:
//
// 1. QueryPlanCache grows, consuming heap
// 2. GC runs more frequently to free soft references
// 3. Under memory pressure, soft refs are collected...
//    but new queries immediately re-populate the cache
// 4. GC pause times increase:
//
//    [GC (Allocation Failure) 1.8G->1.6G(2.0G), 0.0847 secs]
//    [GC (Allocation Failure) 1.9G->1.7G(2.0G), 0.1203 secs]
//    [Full GC (Ergonomics)    1.9G->1.8G(2.0G), 2.4710 secs]  // <-- trouble
//    [Full GC (Ergonomics)    1.9G->1.8G(2.0G), 3.1022 secs]  // <-- serious
//    [Full GC (Allocation Failure) 1.9G->1.9G(2.0G)]           // <-- OOM imminent
//
// 5. Application threads spend more time in GC than doing work
// 6. Request latencies spike from 50ms to 5,000ms
// 7. Health checks fail. Load balancer pulls the instance.
// 8. Remaining instances absorb the traffic, accelerating
//    their own cache growth. Cascade failure.
//
// Time from "slightly elevated GC" to "full outage": often < 10 minutes.
// Time from "everything is fine" to "slightly elevated GC": days or weeks.

As the QueryPlanCache consumes more heap, the JVM's garbage collector works harder. Soft references — which the cache uses for its entries — are collected when memory is tight, but only at the collector's discretion. The JVM's heuristic for "when to collect soft references" does not account for a cache that refills as fast as it drains.

The result is a feedback loop. GC collects soft references, freeing a few hundred megabytes. New requests immediately create new cache entries, consuming that space. GC runs again, sooner this time. And again. The ratio of time spent doing useful work versus time spent collecting garbage deteriorates until the application is effectively frozen — responding to no requests, consuming 100% CPU on garbage collection, and eventually succumbing to an OutOfMemoryError that the GC could not prevent.

What makes this particularly insidious is the cascade effect. In a horizontally scaled deployment, when one instance dies, its traffic redistributes to the surviving instances. Those instances are now processing more requests with more varied IN-clause sizes. Their caches grow faster. Their GC pressure increases. One by one, they follow the first instance into the heap dump.

I have witnessed a four-instance deployment go from "all healthy" to "all dead" in under 7 minutes. The monitoring dashboard showed a single spike, a brief moment of elevated latencies, and then four simultaneous OOM alerts. The root cause was a marketing campaign that drove a burst of traffic to a search endpoint with an IN-clause filter. Each user's search had a different number of selected categories. Each distinct count created a new cache entry in every instance simultaneously.

"The ORM did not fail. It did exactly what was asked. It was simply asked poorly."

— from You Don't Need Redis, Chapter 3: The ORM Tax

Reading the heap dump

If you suspect this issue, a heap dump will confirm it in seconds. Trigger one with jmap -dump:live,format=b,file=heap.hprof <pid> or configure -XX:+HeapDumpOnOutOfMemoryError to capture one automatically when the OOM occurs.

Heap dump analysis with Eclipse MAT
-- What the heap dump reveals (Eclipse MAT, jmap, or VisualVM):
--
-- Class                                           | Retained Heap
-- ------------------------------------------------+---------------
-- SessionFactoryImpl                              | 1,412 MB
--   QueryPlanCache                                | 1,387 MB
--     BoundedConcurrentHashMap (entries)           | 1,384 MB
--       QueryPlanCache$HQLQueryPlanKey[247319]     |   892 MB
--       QueryPlanCache$ParameterMetadataKey[83201] |   492 MB
--
-- The smoking gun: 247,319 entries in a cache sized for 2,048.
--
-- How? Hibernate's BoundedConcurrentHashMap in versions < 5.2.12
-- did not reliably enforce the bound under memory pressure —
-- eviction relied on soft references, which the GC only collects
-- when memory runs low, allowing the cache to temporarily exceed
-- its configured size. Even in later versions, eviction under
-- GC pressure can lag behind allocation, allowing the cache
-- to grow far beyond its nominal limit.

The signature is unmistakable. SessionFactoryImpl at the top of the retained heap dominator tree, with QueryPlanCache immediately beneath it. The entry count in the BoundedConcurrentHashMap will be far higher than you expect.

For a more targeted investigation, Eclipse MAT's OQL (Object Query Language) lets you inspect individual cache entries:

OQL query for Eclipse MAT
-- Eclipse MAT OQL to find the largest QueryPlanCache entries:
--
-- Open your .hprof in Eclipse MAT, then:
-- 1. Click "Open Query Browser" (the magnifying glass icon)
-- 2. Select "OQL" tab
-- 3. Run:

SELECT c.query.toString(), c.@retainedHeapSize
FROM org.hibernate.engine.query.spi.QueryPlanCache$HQLQueryPlanKey c

-- (MAT's OQL has no ORDER BY clause; sort the retained-heap
--  column in the result grid instead.)

-- This shows you exactly which query strings are consuming
-- the most memory. You will likely see hundreds of entries
-- that differ only in their IN-clause parameter count.
--
-- For a quick triage without OQL, use the dominator tree:
-- 1. Open dominator tree (toolbar icon)
-- 2. Sort by "Retained Heap" descending
-- 3. Expand SessionFactoryImpl -> QueryPlanCache
-- 4. Look at the entry count on BoundedConcurrentHashMap

A note on Hibernate versions: prior to 5.2.12, the BoundedConcurrentHashMap used for the cache had a concurrency bug where the size bound was not reliably enforced under contention. The cache could grow without limit. This was fixed in HHH-12076, but if you are running an older version, the cache can grow to hundreds of thousands of entries regardless of configuration.

Even in fixed versions, the bound uses soft references, which means entries are eligible for GC under memory pressure but are not collected eagerly. The practical result: the cache can temporarily exceed its bound during traffic spikes, precisely when you can least afford the extra memory consumption.

Fix 1: in_clause_parameter_padding (start here)

This is the single most effective fix for the problem. One line of configuration. Available since Hibernate 5.2.18 and backported to several 5.1.x maintenance releases.

# application.properties — the single most effective fix
spring.jpa.properties.hibernate.query.in_clause_parameter_padding=true

# What this does:
# Instead of generating IN clauses with the exact parameter count,
# Hibernate pads to the next power of 2.
#
# Without padding:
#   IN (?1)                          -- 1 param
#   IN (?1, ?2)                      -- 2 params
#   IN (?1, ?2, ?3)                  -- 3 params
#   IN (?1, ?2, ?3, ?4)              -- 4 params
#   IN (?1, ?2, ?3, ?4, ?5)          -- 5 params
#   ...
#   IN (?1, ..., ?1000)              -- 1000 params
#   = 1,000 unique query strings = 1,000 cache entries
#
# With padding (powers of 2):
#   IN (?1)                          -- 1 param
#   IN (?1, ?2)                      -- 2 params
#   IN (?1, ?2, ?3, ?4)              -- 4 params  (3 padded to 4)
#   IN (?1, ..., ?8)                 -- 8 params  (5-8 padded to 8)
#   IN (?1, ..., ?16)                -- 16 params (9-16 padded to 16)
#   IN (?1, ..., ?32)                -- 32 params
#   IN (?1, ..., ?64)                -- 64 params
#   IN (?1, ..., ?128)               -- 128 params
#   IN (?1, ..., ?256)               -- 256 params
#   IN (?1, ..., ?512)               -- 512 params
#   IN (?1, ..., ?1024)              -- 1024 params (1000 padded to 1024)
#   = 11 unique query strings = 11 cache entries
#
# Padded positions are filled with the last real value repeated.
# PostgreSQL optimizes these duplicates away during planning.

The effect is dramatic. An IN clause that previously generated 1,000 distinct query strings (one per parameter count from 1 to 1,000) now generates exactly 11: one for each power of 2 from 1 to 1,024. That is a 99% reduction in cache entries from a single property change.

The padding works by rounding up the parameter count to the next power of 2, then filling the extra positions with a repetition of the last real value. If you pass [10, 20, 30], Hibernate generates IN (?, ?, ?, ?) and binds [10, 20, 30, 30]. PostgreSQL's query executor deduplicates internally, so the repeated value does not affect correctness or result set.

The overhead is negligible. A padded IN clause with 512 parameters when you only needed 300 adds approximately 0.05ms to plan time and zero measurable execution time (the duplicates are eliminated during the bitmap scan). Against the 1-2ms saved per cache miss, this is an exceptional trade-off.
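The arithmetic is easy to verify. The following is a self-contained sketch of the rounding and bind-repetition rules, not Hibernate's actual implementation (which lives inside its query translator):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class PaddingMath {

    // Round up to the next power of two, as in_clause_parameter_padding does.
    public static int padTo(int n) {
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Pad the bind values by repeating the last real value.
    public static List<Long> padValues(List<Long> values) {
        List<Long> padded = new ArrayList<>(values);
        long last = values.get(values.size() - 1);
        while (padded.size() < padTo(values.size())) padded.add(last);
        return padded;
    }

    public static void main(String[] args) {
        // Every list size from 1 to 1,000 collapses to these padded sizes:
        TreeSet<Integer> distinct = new TreeSet<>();
        for (int n = 1; n <= 1000; n++) distinct.add(padTo(n));
        System.out.println(distinct);        // [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
        System.out.println(distinct.size()); // 11

        // [10, 20, 30] binds as [10, 20, 30, 30]:
        System.out.println(padValues(List.of(10L, 20L, 30L)));
    }
}
```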

Edge cases worth understanding

Parameter padding edge cases
// Edge cases to be aware of with parameter padding:

// 1. PostgreSQL has a hard limit of 65,535 parameters per statement.
//    Padding 40,000 params rounds up to 65,536 — which exceeds the limit.
//    Hibernate does handle this (it caps at 65,535), but you should
//    not be sending 40,000-element IN clauses regardless.

// 2. Large padded IN clauses can trigger plan changes.
//    If you pass 33 values, padding rounds to 64.
//    If you pass 65 values, padding rounds to 128.
//    The jump from 64 to 128 parameters CAN cause PostgreSQL
//    to switch from an index scan to a sequential scan,
//    depending on table statistics and the selectivity estimate.
//    This is the plan_cache_mode interaction discussed later.

// 3. Padding with NULL values in the list:
//    If your list contains NULLs, the padded duplicates of the
//    last value will NOT be NULL — they'll be the last non-null
//    value. This is correct behavior but worth understanding.

// 4. Empty lists:
//    Hibernate handles an empty list by generating "1=0" (always false).
//    No IN clause, no padding, no cache entry. This is fine.

The most consequential edge case is the plan quality interaction. When padding rounds 33 parameters to 64, or 65 to 128, the jump in estimated result set size can cause PostgreSQL to choose a different access method. This is not a bug in padding — it is a consequence of how the PostgreSQL planner estimates selectivity for large IN lists. The mitigation is the plan_cache_mode setting discussed in a later section.

Despite these edge cases, there is no reason not to enable this. If your application uses IN-clause queries with variable-length lists, enable in_clause_parameter_padding immediately. The edge cases are manageable. The alternative — unbounded cache growth — is not.

Fix 2: plan_cache_max_size tuning

Parameter padding handles the IN-clause explosion, but you should also right-size the cache itself.

# Hibernate 5.2.12+ and Hibernate 6.x:
# Set an explicit maximum for the query plan cache

# application.properties
spring.jpa.properties.hibernate.query.plan_cache_max_size=256
spring.jpa.properties.hibernate.query.plan_parameter_metadata_max_size=128

# Defaults:
#   plan_cache_max_size = 2048
#   plan_parameter_metadata_max_size = 128 (Hibernate 5) / 2048 (Hibernate 6)
#
# Why lower it?
# If you have 50 unique queries (after padding), you don't need
# 2048 slots. A smaller cache means:
#   - Less memory consumed
#   - Faster eviction of stale entries
#   - More predictable memory profile
#
# The trade-off: cache misses trigger re-parsing (~1ms per miss).
# For most applications, 128-512 entries covers all hot-path queries.

The default of 2,048 entries was chosen when Hibernate applications had fewer distinct query patterns. Modern Spring Boot applications with dozens of repositories, specification queries, and dynamic criteria often have more. But "more" should mean 200, not 200,000.

After enabling parameter padding, audit your actual query diversity. Enable hibernate.generate_statistics=true temporarily in a staging environment and observe the QueryPlanCache hit/miss ratio. If you see fewer than 200 distinct query plans with a 95%+ hit ratio, set the cache to 256. If you see 500, set it to 768. Leave headroom, but not a football field of it.

The cost of a cache miss is a re-parse: typically 0.5-2ms. For a cold startup with 100 unique queries, that is 50-200ms of total parse time spread across the first few hundred requests. Entirely acceptable. For context, a single network round trip to PostgreSQL typically costs 0.2-0.5ms. You are trading one or two round trips worth of latency for a dramatically more predictable memory profile.

I should note that lowering the cache size too aggressively can cause its own problems. If your application has 300 distinct query patterns and you set the cache to 128, every request for the "cold" 172 patterns will trigger a re-parse. At 1ms per parse and 1,000 requests per second hitting those patterns, you are adding 1 second of parse time per second — a full CPU core consumed by parsing alone. Right-size based on measurement, not assumption.

Fix 3: batch fetch size for association loading

A second source of IN-clause variability comes from Hibernate's lazy association loading. When you access an unloaded collection and Hibernate batch-fetches the associated entities, it generates an IN clause sized to the batch. Controlling the batch size controls the IN-clause size, which controls the cache entry count.

// Another mitigation: control how many IDs go into IN clauses
// by using Hibernate's batch fetching instead of manual IN queries.

// Entity-level configuration:
@Entity
@BatchSize(size = 25)
public class Order {
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "customer_id")
    private Customer customer;
}

// This tells Hibernate: when loading lazy associations,
// batch them in groups of 25 instead of one at a time.
// Combined with parameter padding, this produces exactly:
//   IN (?1, ..., ?32)  -- 25 padded to 32
// One query pattern. One cache entry. Every time.

// Global default (application.properties):
spring.jpa.properties.hibernate.default_batch_fetch_size=25

// The sweet spot is usually 16-50.
// Too low: more queries. Too high: diminishing returns
// and PostgreSQL optimizer cost increases.

With default_batch_fetch_size=25 and in_clause_parameter_padding=true, association loading always generates IN (?1, ..., ?32) — 25 padded to the next power of 2. One query pattern. One cache entry. Regardless of how many associations are being loaded.

This has a secondary performance benefit: batch fetching reduces the number of round trips to PostgreSQL. Instead of 200 individual SELECT statements for 200 lazy-loaded associations (the N+1 pattern), Hibernate issues 8 queries of 25 each. Combined with padding, the query plan is cached on the first batch and reused for the remaining seven.

A word on choosing the batch size. Powers of 2 are slightly more efficient with padding enabled (a batch size of 16 or 32 pads to itself, wasting no parameters), but the difference is marginal. The more important consideration is the trade-off between round trips and per-query overhead. A batch size of 16 means 13 queries for 200 associations. A batch size of 64 means 4 queries for the same 200. But each query with 64 parameters is slightly more expensive to plan and execute than one with 16 parameters. In my experience, 25 is a sensible default. It balances round-trip reduction with plan cost, and it pads to 32 — a number that PostgreSQL handles with an efficient index scan on any properly indexed column.
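The round-trip arithmetic above can be checked in a few lines. A sketch under the stated assumptions (the timings in the prose are illustrative; only the counting is exact):

```java
public class BatchFetchMath {

    // Queries needed to batch-load `total` associations `batchSize` at a time.
    public static int queriesNeeded(int total, int batchSize) {
        return (total + batchSize - 1) / batchSize;  // ceiling division
    }

    // The padded IN-clause size each of those queries will use,
    // assuming in_clause_parameter_padding is enabled.
    public static int paddedSize(int batchSize) {
        int p = 1;
        while (p < batchSize) p <<= 1;
        return p;
    }

    public static void main(String[] args) {
        System.out.println(queriesNeeded(200, 25)); // 8 queries
        System.out.println(paddedSize(25));         // each padded to 32 params
        System.out.println(queriesNeeded(200, 16)); // 13 queries
        System.out.println(queriesNeeded(200, 64)); // 4 queries
    }
}
```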

A note on Hibernate 6 and Spring Boot 3

If you have migrated to Spring Boot 3.x, you are running Hibernate 6, and you may be wondering whether this problem has been solved.

Hibernate 6 changes (and what hasn't changed)
// Hibernate 6 (used in Spring Boot 3.x) changes the internals:
//
// Hibernate 5.x: QueryPlanCache with BoundedConcurrentHashMap
// Hibernate 6.x: SqmInterpretationsCache with a simpler map
//
// The class names changed, but the fundamental problem did NOT:
//
// Hibernate 6 still caches SQM (Semantic Query Model) interpretations
// keyed by the exact query string. IN-clause variants still produce
// distinct cache entries. The OOM mechanism is identical.
//
// What DID improve in Hibernate 6:
//   - Cache eviction is more reliable (no soft reference lag)
//   - Default cache sizing is slightly more conservative
//   - The internal representation is more memory-efficient per entry
//
// What did NOT improve:
//   - IN-clause variability still causes cache bloat
//   - in_clause_parameter_padding is still required
//   - The cache can still consume hundreds of MB without padding
//
// Configuration in Hibernate 6 / Spring Boot 3:
// spring.jpa.properties.hibernate.query.plan_cache_max_size=256
// spring.jpa.properties.hibernate.query.in_clause_parameter_padding=true
//
// Same properties. Same fix. Different internals.

The short answer: the internals changed, the fundamental problem did not. Hibernate 6 replaced the QueryPlanCache with SqmInterpretationsCache and the HQL parser with the SQM (Semantic Query Model) parser. The cache is more memory-efficient per entry, and eviction is more reliable. These are genuine improvements.

But the cache is still keyed by exact query string. IN-clause variants still produce distinct entries. An application that generates 10,000 distinct query strings due to IN-clause variability will still have 10,000 cache entries, each consuming somewhat less memory than in Hibernate 5 but still consuming memory.

The same three configuration properties apply in Hibernate 6: in_clause_parameter_padding, plan_cache_max_size, and default_batch_fetch_size. If you migrated from Spring Boot 2.x to 3.x without carrying over these settings, verify that they are still present in your new configuration. I have seen more than one migration where the Hibernate properties were lost in the transition, silently re-enabling the default 2,048-entry cache with no parameter padding.

The PostgreSQL side: plan_cache_mode and server-side caching

Hibernate's query plan cache is the application-side problem. But PostgreSQL has its own plan cache for prepared statements, and IN-clause variability affects it too.

-- PostgreSQL 12+ has its own plan cache for prepared statements.
-- This interacts with Hibernate's cache in ways that matter.

-- Check current setting:
SHOW plan_cache_mode;  -- default: 'auto'

-- Three modes:
--   'auto'             PostgreSQL decides per statement (default)
--   'force_custom_plan' Always re-plan with actual parameter values
--   'force_generic_plan' Always use the generic (parameterized) plan

-- The 'auto' behavior:
-- PostgreSQL runs custom plans for the first 5 executions,
-- then compares the custom plan cost against the generic plan.
-- If the generic plan is not significantly worse, it switches.
--
-- Why this matters for IN-clause queries:
-- With parameter padding, your IN clause might have 512 placeholders
-- but only 300 real values (212 duplicates). The generic plan for
-- a 512-element IN clause may choose a sequential scan, while
-- the custom plan with 300 actual values would use an index scan.
--
-- Monitor this:
SELECT query, calls, generic_plans, custom_plans,
       mean_exec_time, stddev_exec_time
FROM pg_stat_statements
WHERE query LIKE '%IN%'
ORDER BY total_exec_time DESC;

-- If generic_plans > 0 and mean_exec_time is high,
-- the generic plan may be suboptimal. Force custom plans
-- for that session or globally if needed:
SET plan_cache_mode = 'force_custom_plan';

When Hibernate uses JDBC prepared statements (which it does by default through your connection pool), PostgreSQL caches an execution plan per prepared statement. Each distinct IN-clause variant — even after Hibernate's parameter padding — gets its own server-side plan.

PostgreSQL's plan_cache_mode setting (available since PostgreSQL 12) controls how the server chooses between generic plans (using placeholder statistics) and custom plans (using actual parameter values). The default auto mode is usually correct, but padded IN clauses can confuse the heuristic.

Consider: a WHERE customer_id IN ($1, ..., $512) plan generated with 300 real values and 212 duplicates. The generic plan sees 512 distinct parameters and may estimate a larger result set than reality, choosing a sequential scan over an index scan. The custom plan sees the actual values, estimates correctly, and chooses the index.
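The padding itself is simple arithmetic: round the list size up to the next power of two, with Hibernate filling the extra placeholders by repeating the last bound value. A minimal sketch of that arithmetic, independent of Hibernate (the class and method names here are my own, for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class InClausePadding {
    // Round up to the next power of two, the padding target size.
    static int paddedSize(int n) {
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Pad the value list by repeating its last element, which is how
    // in_clause_parameter_padding fills the extra placeholders.
    static <T> List<T> pad(List<T> values) {
        List<T> padded = new ArrayList<>(values);
        T last = values.get(values.size() - 1);
        while (padded.size() < paddedSize(values.size())) padded.add(last);
        return padded;
    }

    public static void main(String[] args) {
        // 300 real values are padded to 512 placeholders (212 duplicates),
        // so every list size from 257 to 512 shares one cached query string.
        System.out.println(paddedSize(300));  // 512
        // Distinct query variants for every list size from 1 to 1000:
        long variants = IntStream.rangeClosed(1, 1000)
                .map(InClausePadding::paddedSize).distinct().count();
        System.out.println(variants);  // 11
    }
}
```

Eleven cache entries instead of a thousand, which is where the ~99% reduction in the later comparison comes from.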

Check pg_prepared_statements (PostgreSQL 14 and later) for statements where generic_plans > 0, and cross-reference pg_stat_statements for unexpectedly high mean_exec_time. If the generic plan is suboptimal, set plan_cache_mode = 'force_custom_plan' for the relevant sessions. The re-planning cost is typically 0.5-2 ms per query, less than the cost of a bad plan.

Monitoring for IN-clause thrashing

Prevention requires visibility. Here is how to detect IN-clause variant explosion before it becomes an OOM.

Detecting IN-clause variant explosion in pg_stat_statements
-- Detect IN-clause variant explosion in pg_stat_statements
-- This query finds parameterized queries with suspiciously high
-- plan counts relative to their structure.

SELECT
    left(query, 120) AS query_prefix,
    calls,
    rows,
    round(mean_exec_time::numeric, 2) AS avg_ms,
    round(total_exec_time::numeric, 0) AS total_ms,
    -- Count the parameter placeholders to estimate IN-clause size
    (length(query) - length(replace(query, '$', ''))) AS param_count
FROM pg_stat_statements
WHERE query LIKE '%IN (%'
  AND calls > 100
ORDER BY total_exec_time DESC
LIMIT 20;

-- What you are looking for:
-- Multiple entries with identical query structure but different
-- numbers of $N placeholders. Example:
--
--   "SELECT * FROM orders WHERE customer_id IN ($1, $2, $3)"          calls: 12,847
--   "SELECT * FROM orders WHERE customer_id IN ($1, $2, $3, $4)"      calls:  8,923
--   "SELECT * FROM orders WHERE customer_id IN ($1, $2, ..., $47)"    calls:  1,204
--
-- These are the same logical query cached separately
-- in both Hibernate and PostgreSQL.

On the Hibernate side, enable statistics temporarily:

Complete Spring Boot configuration
# Complete recommended configuration for Spring Boot + Hibernate + PostgreSQL

# application.properties (or application.yml equivalent)

# 1. Pad IN-clause parameters to powers of 2 (most important)
spring.jpa.properties.hibernate.query.in_clause_parameter_padding=true

# 2. Limit query plan cache size
spring.jpa.properties.hibernate.query.plan_cache_max_size=256
spring.jpa.properties.hibernate.query.plan_parameter_metadata_max_size=128

# 3. Set a reasonable default batch fetch size
spring.jpa.properties.hibernate.default_batch_fetch_size=25

# 4. Log cache statistics to verify (enable temporarily, not in prod)
# spring.jpa.properties.hibernate.generate_statistics=true
# logging.level.org.hibernate.stat=DEBUG
# logging.level.org.hibernate.engine.query.spi.QueryPlanCache=DEBUG

# 5. Heap monitoring — add to JVM args
# -XX:+HeapDumpOnOutOfMemoryError
# -XX:HeapDumpPath=/tmp/heap-dump.hprof

The hibernate.generate_statistics=true setting exposes the QueryPlanCache hit/miss ratio through JMX and log output. In production, enable it for 10 minutes during peak traffic, capture the output, then disable it. The overhead is measurable but modest — approximately 3-5% throughput reduction.

What you are looking for: a hit ratio below 90% suggests excessive cache churn. A cache size at or near the configured maximum suggests the limit is being hit and useful entries are being evicted. Both indicate IN-clause variability or excessive dynamic query generation.
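The arithmetic behind that threshold is worth making explicit, since Hibernate reports raw counts rather than a ratio. A small helper (the class and method names are hypothetical) that turns the hit and miss counts from Statistics into a churn signal:

```java
public class PlanCacheHealth {
    // Hit ratio as derived from Hibernate's statistics counters:
    // hits / (hits + misses). Below 0.90 suggests cache churn.
    static double hitRatio(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 1.0 : (double) hits / total;
    }

    static boolean churning(long hits, long misses) {
        return hitRatio(hits, misses) < 0.90;
    }

    public static void main(String[] args) {
        System.out.println(hitRatio(9_500, 500));   // 0.95 -> healthy
        System.out.println(churning(8_000, 2_000)); // true -> investigate
    }
}
```

Feed it getQueryPlanCacheHitCount() and getQueryPlanCacheMissCount() from the Statistics object during your 10-minute sampling window.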

Continuous monitoring with Micrometer

For ongoing visibility rather than periodic spot-checks, expose the cache metrics through Spring Boot Actuator and Micrometer:

Micrometer gauge for QueryPlanCache hit/miss
// Expose QueryPlanCache metrics via Micrometer (Spring Boot Actuator)

import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.binder.MeterBinder;
import jakarta.persistence.EntityManagerFactory;
import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HibernateMetricsConfig {

    private final EntityManagerFactory emf;

    public HibernateMetricsConfig(EntityManagerFactory emf) {
        this.emf = emf;
    }

    @Bean
    public MeterBinder hibernateQueryPlanCacheMetrics() {
        return registry -> {
            // The SessionFactory interface exposes everything we need;
            // no cast to the internal SessionFactoryImpl is required.
            SessionFactory sf = emf.unwrap(SessionFactory.class);
            Statistics stats = sf.getStatistics();
            stats.setStatisticsEnabled(true);

            Gauge.builder("hibernate.query.plan.cache.hit.count",
                    stats, Statistics::getQueryPlanCacheHitCount)
                .description("Query plan cache hits")
                .register(registry);

            Gauge.builder("hibernate.query.plan.cache.miss.count",
                    stats, Statistics::getQueryPlanCacheMissCount)
                .description("Query plan cache misses")
                .register(registry);
        };
    }
}

// Then in Grafana or your monitoring tool, alert on the miss ratio:
//   rate(hibernate_query_plan_cache_miss_count[5m])
//     / (rate(hibernate_query_plan_cache_hit_count[5m])
//        + rate(hibernate_query_plan_cache_miss_count[5m])) > 0.1
//
// A miss ratio above 10% sustained for 5 minutes means
// the cache is churning. Investigate immediately.

This gives you a continuous time series that you can alert on. A sustained miss ratio above 10% deserves investigation. A sudden spike in the miss rate — even if it returns to normal — indicates a burst of novel query strings that may recur. Both are easier to catch in a dashboard than in a heap dump after the fact.

The complete fix comparison

Every approach and its trade-offs, assembled for convenient reference:

Fix | Cache reduction | Perf impact | Effort | Scope
--- | --- | --- | --- | ---
in_clause_parameter_padding=true | ~99% | Negligible | 1 line of config | Hibernate only
plan_cache_max_size=256 | ~87% | Rare re-parse (~1ms) | 1 line of config | Hibernate only
default_batch_fetch_size=25 | Indirect (fewer variants) | Positive (fewer queries) | 1 line of config | Hibernate only
plan_cache_mode=force_custom_plan | None (addresses plan quality) | +0.5-2ms per query | SET per session or global | PostgreSQL only
Gold Lapel proxy normalization | ~99% (both layers) | Positive (shared plans) | Change connection string | Hibernate + PostgreSQL

The recommended approach for most applications: enable in_clause_parameter_padding, lower plan_cache_max_size to 256, and set default_batch_fetch_size to 25. Three lines of configuration. Total implementation time: less than it took to read this section. The combined effect eliminates the OOM risk and improves query performance through better cache utilization.

The honest counterpoint: when this is not your problem

I should be forthcoming about the boundaries of this advice, because a waiter who overstates his case is no waiter at all.

If your application uses only fixed-size IN clauses — say, a findByStatusIn method that always receives the same 3 or 4 status values — you do not have this problem. The cache will hold a handful of entries and behave perfectly. Enabling parameter padding in this case is harmless but unnecessary.

If your application does not use IN clauses at all — if your queries are all simple equality checks, range scans, and joins — the QueryPlanCache is unlikely to be your OOM culprit. Look instead at entity caches, result set caches, or connection pool sizing.

If your OOM occurs in a method that does not involve Hibernate at all, or if the heap dump shows the largest retained heap consumers are your own domain objects rather than SessionFactoryImpl, the plan cache is a red herring. Not every Hibernate OOM is a plan cache OOM. Some are genuine memory leaks — open sessions held across long-running operations, unbounded result sets loaded into memory, or ScrollableResults not properly closed.

And if you are considering moving away from Hibernate entirely to avoid this problem — to raw JDBC, jOOQ, or Exposed — I would not dissuade you from evaluating alternatives, but I would note that the IN-clause variability problem exists at the PostgreSQL level regardless of your ORM. Your application-side cache disappears, yes. The server-side plan cache bloat does not. You are trading one problem for a different, quieter version of the same problem. The right response is to understand the mechanism, not to flee the framework.

What nobody connects: Hibernate's cache and PostgreSQL's cache are fighting the same problem separately

Here is the observation that prompted this guide. Every article about this issue covers one side or the other. Hibernate documentation explains in_clause_parameter_padding and plan_cache_max_size. PostgreSQL documentation explains plan_cache_mode and prepared statement behavior. Nobody ties them together.

But they are the same problem at two layers of the stack. Hibernate is caching parsed HQL plans keyed by exact query string. PostgreSQL is caching execution plans for prepared statements, also keyed by exact query string. IN-clause variability creates cache explosion in both. Fixing only Hibernate leaves PostgreSQL's plan cache bloated. Fixing only PostgreSQL leaves Hibernate's heap consumption unchecked.
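The lockstep growth is easy to model. A toy sketch in plain Java, with nothing Hibernate-specific in it and an illustrative query string, showing both string-keyed caches accumulating one entry per IN-list size:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class DoubleCacheModel {
    // Build the query string a naive ORM would produce for n IN-list values.
    static String variant(int n) {
        String params = IntStream.rangeClosed(1, n)
                .mapToObj(i -> "?" + i).collect(Collectors.joining(", "));
        return "select o from Order o where o.customerId in (" + params + ")";
    }

    public static void main(String[] args) {
        // Both Hibernate's QueryPlanCache and PostgreSQL's prepared-statement
        // plan cache key on the exact query text, so they bloat together.
        Set<String> hibernateCache = new HashSet<>();
        Set<String> pgCache = new HashSet<>();
        for (int n = 1; n <= 1000; n++) {
            String q = variant(n);
            hibernateCache.add(q);  // one entry per distinct string
            pgCache.add(q);         // mirrored server-side
        }
        System.out.println(hibernateCache.size());  // 1000
        System.out.println(pgCache.size());         // 1000
    }
}
```

One source of variability, two caches, two thousand entries. Fixing only one of the maps fixes half the problem.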

The reason nobody connects them is that they require different expertise to observe. The Hibernate side shows up in heap dumps and JMX metrics. The PostgreSQL side shows up in pg_stat_statements and EXPLAIN ANALYZE. Different teams, different tools, different dashboards. The application team sees an OOM and fixes Hibernate. The DBA sees plan cache churn and tweaks PostgreSQL. Neither realizes they are treating symptoms of the same underlying cause.

This is, if you will permit the observation, a recurring pattern in infrastructure failures. The most consequential problems are the ones that span team boundaries. Not because they are technically difficult — three lines of configuration, as we have seen — but because no single team owns both halves of the diagnosis. The application team does not run pg_stat_statements. The DBA does not take heap dumps. And so the same IN-clause variability causes two caches to bloat, two teams to investigate independently, and two partial fixes to be applied without addressing the whole.

Where a proxy normalizes what neither layer can

Gold Lapel sits between Hibernate and PostgreSQL, observing every query as it crosses the wire. It sees the IN-clause variants before they reach the database.

# What Gold Lapel sees when Hibernate sends IN-clause variants:
#
# From the application (without parameter padding):
#   SELECT * FROM orders WHERE customer_id IN ($1, $2, $3)
#   SELECT * FROM orders WHERE customer_id IN ($1, $2, $3, $4)
#   SELECT * FROM orders WHERE customer_id IN ($1, $2, $3, $4, $5)
#   SELECT * FROM orders WHERE customer_id IN ($1, ..., $847)
#
# Gold Lapel normalizes these at the proxy layer:
#   All variants -> single prepared statement pattern
#   "SELECT * FROM orders WHERE customer_id IN ($1, ...)"
#
# The proxy maintains an LRU cache of 1,024 prepared statement slots per connection.
# IN-clause normalization means all variants of the same logical
# query share ONE slot instead of consuming hundreds.
#
# This benefits PostgreSQL in two ways:
#   1. Fewer unique plans to cache server-side
#   2. More cache hits -> plans stay warm -> faster execution
#
# And it benefits Hibernate by reducing the number of distinct
# query strings the server responds to, which feeds back into
# more predictable plan_cache_mode behavior.

The proxy normalizes IN-clause variants at the wire protocol level. All variants of SELECT * FROM orders WHERE customer_id IN ($1, $2, $3) and SELECT * FROM orders WHERE customer_id IN ($1, ..., $847) are recognized as the same logical query pattern. Gold Lapel's prepared statement promotion — an LRU cache of 1,024 slots per connection — maps all variants to a single slot.
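For intuition only, here is a regex-based sketch of this kind of normalization. The pattern, class name, and replacement marker are illustrative assumptions, not Gold Lapel's implementation, which operates on the wire protocol rather than on SQL text:

```java
import java.util.regex.Pattern;

public class InClauseNormalizer {
    // Collapse any run of $N placeholders inside IN (...) to a single
    // canonical marker so all variants map to one cache key.
    private static final Pattern IN_LIST =
            Pattern.compile("(?i)\\bIN\\s*\\(\\s*\\$\\d+(\\s*,\\s*\\$\\d+)*\\s*\\)");

    static String normalize(String sql) {
        // "\\$" in the replacement emits a literal dollar sign.
        return IN_LIST.matcher(sql).replaceAll("IN (\\$1, ...)");
    }

    public static void main(String[] args) {
        String a = normalize("SELECT * FROM orders WHERE customer_id IN ($1, $2, $3)");
        String b = normalize("SELECT * FROM orders WHERE customer_id IN ($1, $2, $3, $4, $5)");
        // Both variants collapse to the same normalized key,
        // so they share one prepared-statement slot.
        System.out.println(a.equals(b));  // true
    }
}
```

Once variants share a key, the LRU slot accounting becomes trivial: one slot per logical query instead of one per list size.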

This means PostgreSQL receives fewer unique prepared statement definitions, keeps its plan cache compact, and achieves higher cache hit rates. The server-side benefit is independent of whether you have enabled Hibernate's parameter padding. Though you should enable it regardless — reducing cache pressure at every layer of the stack is the correct approach.

The proxy does not replace the Hibernate-side fixes. You should still enable in_clause_parameter_padding and right-size your plan_cache_max_size to keep the JVM heap healthy. What the proxy provides is the connection-level normalization that neither Hibernate nor PostgreSQL can do independently — a consistent view of query patterns across all IN-clause variants, all pooled connections, and all application instances.

Change the JDBC connection string. Keep your Hibernate configuration. The cache explosion resolves at both layers simultaneously.

The three lines that would have prevented this

I should like to leave you with the configuration that, had it been present from the start, would have prevented the 14:23 page, the heap dump, the postmortem, and the uncomfortable conversation about why a quarter-million cache entries were consuming 89% of your heap.

application.properties
spring.jpa.properties.hibernate.query.in_clause_parameter_padding=true
spring.jpa.properties.hibernate.query.plan_cache_max_size=256
spring.jpa.properties.hibernate.default_batch_fetch_size=25

Three properties. No code changes. No schema migrations. No dependency updates. The first reduces cache entries by 99%. The second caps what remains. The third controls the source of variability in association loading.

Apply them to staging first. Verify with hibernate.generate_statistics=true that the cache hit ratio remains above 95%. Then apply to production. Monitor heap consumption through one full traffic cycle — daily, weekly, whatever your patterns are. You will observe a heap that is flatter, more predictable, and dramatically smaller than before.

The application that was crashing at 2 GB of heap will run comfortably at 512 MB. Not because it needed less memory — it always needed less memory. It was simply caching the same question, asked 247,319 slightly different ways.

In infrastructure, as in hospitality, the most valuable service is often the one that prevents the emergency from occurring in the first place. Quiet, unseen, and entirely worth your trust.

Allow me to suggest a companion reading. The prepared statement lifecycle that Hibernate's QueryPlanCache attempts to optimize is thoroughly documented in our practical guide to pg_stat_statements — which will show you precisely what PostgreSQL sees on its side of the conversation, and how to identify the queries consuming the most planning time.