@Async and Database Access in Spring Boot: How Async Threads Silently Drain Your PostgreSQL Pool
Your pool is not too small. Your @Async methods are checking out connections and never returning them. The fix is one annotation you already know.
Good evening. Your connection pool is not draining — it is being drained.
There is a class of production incident that presents with a very specific signature. HikariCP's active connection count climbs. Steadily. Monotonically. It does not come back down. The idle count falls to zero. The pending count begins to rise. And then, thirty seconds later, the first SQLTransientConnectionException appears in your logs: "Connection is not available, request timed out after 30000ms."
The instinct is to increase maximum-pool-size. You go from 10 to 20. The same thing happens, just slightly later. You go to 50. Now PostgreSQL itself is unhappy — 50 backend processes competing for 4 CPU cores is worse than the original problem. And the connections still do not come back.
The connections are not slow. They are not blocked on locks. They are not executing long-running queries. They are idle. Sitting in PostgreSQL's pg_stat_activity with state = 'idle', having executed their last query minutes ago. They are checked out from HikariCP's perspective but doing absolutely nothing from PostgreSQL's perspective.
I find this behaviour — connections that are simultaneously occupied and idle, claimed but unused, present but unavailable — to be the infrastructural equivalent of a dining room where every chair is pulled out but no one is sitting down. The table is technically full. The room is technically empty. And no one can be seated for dinner.
The cause, in every case I have seen with this exact signature, is the same: an @Async method that accesses lazy-loaded Hibernate associations without a @Transactional boundary. If you will permit me, I should like to explain precisely why this happens, why your tests do not catch it, why increasing the pool makes it worse, and how to fix it permanently with one annotation you already know.
The mechanics of the leak
To understand why this happens, you need to understand three things about Spring Boot's database access model and how they interact — or rather, how they fail to interact — on async threads.
Thing one: Open Session In View (OSIV). Spring Boot ships with spring.jpa.open-in-view=true by default. This creates an OpenSessionInViewInterceptor that opens a Hibernate Session at the beginning of each HTTP request and closes it when the response is written. Every database access during that request — whether inside a @Transactional method or not — shares that single Session and its single database connection.
OSIV is controversial. Spring Boot itself logs a startup warning when the property is enabled implicitly rather than set explicitly. Many experienced Spring developers disable it on principle. But one thing OSIV does reliably — and I will give it this much credit — is prevent connection leaks on request threads. One Session, one connection, one request. The lifecycle is clean. The connection always comes home.
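The flag itself is a one-liner. A minimal application.properties sketch, shown with the default value and the explicit opt-out:

```properties
# Spring Boot's default: OSIV is ON unless you disable it.
spring.jpa.open-in-view=true

# The opt-out. With OSIV disabled, lazy loads outside a transaction fail
# fast with LazyInitializationException on request threads as well, which
# forces explicit fetch strategies instead of a hidden request-scoped Session.
# spring.jpa.open-in-view=false
```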
Thing two: @Async runs on a different thread. When you annotate a method with @Async, Spring executes it on a TaskExecutor thread — not the HTTP request thread. The OpenSessionInViewInterceptor is a servlet filter. It applies to servlet request threads. It has no knowledge of, and no authority over, the thread pool managed by @EnableAsync.
This means @Async methods run without OSIV. No pre-opened Session. No pre-acquired connection. No safety net. The async thread is, from Hibernate's perspective, an uncharted wilderness. No one has set out the table linens. No one is managing the household.
Thing three: lazy loading without a Session creates temporary Sessions. When Hibernate encounters a lazy-loaded association and no Session is bound to the current thread, the behavior depends on configuration. With hibernate.enable_lazy_load_no_trans=true (an opt-in property, not a Spring Boot default, but common in codebases that once enabled it to silence LazyInitializationException), Hibernate creates a temporary Session, acquires a connection from HikariCP, executes the lazy-load query, and — here is the critical part — does not reliably close that temporary Session or return that connection on the same code path.
The connection may be returned when the temporary Session is finally garbage collected. HikariCP's leak detection can flag it (a logged warning with the stack trace of the checkout site), but it does not reclaim it. HikariCP's max-lifetime will retire the underlying connection, but only after it has been returned to the pool; an in-use connection is never forcibly retired. What never happens is a prompt return. And "not promptly," in the context of a connection pool with 10 slots, means "after your pool is already exhausted."
I should be precise about what "not reliably close" means, because this is the hinge on which the entire problem turns. It does not mean Hibernate has a bug. It means Hibernate's temporary Session lifecycle is tied to garbage collection, and garbage collection is nondeterministic. On a lightly loaded system, GC runs frequently enough that temporary Sessions are cleaned up before the pool feels any pressure. On a heavily loaded system — 200 concurrent @Async invocations, each with multiple lazy loads — connections leak faster than GC can reclaim them. The pool fills. The application stops.
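HikariCP cannot fix the leak, but it can make it visible. A sketch of the relevant pool settings (the threshold value here is illustrative):

```properties
# Log a warning, with the stack trace of the checkout site, whenever a
# connection has been out of the pool for more than 60 seconds. This does
# not reclaim the connection; it tells you which code path is holding it.
spring.datasource.hikari.leak-detection-threshold=60000

# Retire connections after 30 minutes (HikariCP's default). Note that an
# in-use connection is only retired once it is returned to the pool.
spring.datasource.hikari.max-lifetime=1800000
```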
What happens inside the temporary Session
Allow me to trace the exact lifecycle of a connection leak, from the moment your code calls a lazy-loaded getter to the moment that connection becomes unreachable. This level of detail matters because it explains why the leak is so difficult to detect through code review alone.
// What happens inside Hibernate when a lazy load fires without a Session
//
// Thread: async-3 (not a servlet thread — no OSIV interceptor active)
//
// 1. Your code calls order.getLineItems()
// 2. Hibernate's LazyInitializer detects the collection is uninitialized
// 3. LazyInitializer checks ThreadLocal for a bound Session — finds NONE
// 4. Hibernate opens a temporary Session from the SessionFactory
// 5. The new Session calls DataSource.getConnection() on HikariCP
// 6. HikariCP checks out a connection from the pool (active count: +1)
// 7. Hibernate executes: SELECT l.* FROM line_items l WHERE l.order_id = ?
// 8. Results are loaded into the PersistentBag
// 9. The temporary Session is NOT closed on this code path
// 10. The connection is NOT returned to HikariCP
//
// The Session object is referenced by the PersistentBag (the loaded collection).
// The PersistentBag is referenced by the Order entity.
// The Order entity is referenced by your local variable.
// As long as your method holds a reference to the Order, the Session stays alive,
// and the connection stays checked out.
//
// Eventually:
// - Your method completes, the Order goes out of scope
// - The PersistentBag becomes eligible for GC
// - The Session becomes eligible for GC
// - Session.finalize() or PhantomReference cleanup returns the connection
// - But GC timing is nondeterministic. It might run in 100ms. Or 10 minutes.
//
// In the meantime, your connection pool is one slot smaller.
// Multiply by every lazy load in every @Async invocation.
The reference chain is the key. The connection is held by the Session. The Session is held by the PersistentBag (the lazy-loaded collection). The PersistentBag is held by the entity. The entity is held by your local variable. As long as your method is executing and holding a reference to that entity, the connection cannot be garbage collected. It is alive, it is doing nothing, and it is occupying a pool slot.
For a method that processes 50 orders with 5 line items each and a product on each line item, that is potentially 50 + 250 = 300 temporary Sessions, each holding a connection. In practice, some connections will be returned by earlier GC cycles. But under load — when your async executor is running 16 threads concurrently and each thread is iterating through its own set of orders — the rate of connection checkout far exceeds the rate of GC-driven return. The pool fills monotonically. The familiar signature appears.
I want to be clear about something. This is not a Hibernate bug. Hibernate's contract for lazy loading assumes a managed persistence context — either a transaction-scoped Session or an OSIV-managed Session. Lazy loading outside a managed context is, from Hibernate's perspective, a courtesy. It creates a temporary Session rather than throwing a LazyInitializationException because hibernate.enable_lazy_load_no_trans=true tells it to. That property is off by default; someone enabled it, usually to make a LazyInitializationException go away. That courtesy is what creates the leak. Without it, you would get an exception. With it, you get a slow, silent pool exhaustion. I confess I am not certain which is worse, but at least the exception tells you something is wrong.
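For reference, the property lives under Spring Boot's JPA passthrough namespace. A sketch (note that Hibernate's own default for this flag is false):

```properties
# Off by default in Hibernate. Enabling it trades LazyInitializationException
# for temporary Sessions, and with them the connection leak described above.
spring.jpa.properties.hibernate.enable_lazy_load_no_trans=true
```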
The code that leaks
Here is a service that generates reports asynchronously. It is clean. It follows Spring conventions. It compiles. It passes tests. It passes code review. And it will exhaust your connection pool in production under any meaningful load.
@Service
public class ReportService {
@Autowired
private OrderRepository orderRepository;
@Async
public CompletableFuture<ReportDTO> generateReport(Long customerId) {
// This method runs on a TaskExecutor thread, NOT the request thread.
// Spring's OpenSessionInView filter does not apply here.
// There is no active transaction.
List<Order> orders = orderRepository.findByCustomerId(customerId);
for (Order order : orders) {
// order.getLineItems() is lazy-loaded.
// Hibernate opens a NEW connection from HikariCP to load them.
// That connection is never explicitly returned.
List<LineItem> items = order.getLineItems();
for (LineItem item : items) {
// item.getProduct() is also lazy.
// Another connection. Another leak.
String productName = item.getProduct().getName();
}
}
return CompletableFuture.completedFuture(buildReport(orders));
}
}
The method is called 200 times in a batch. Each invocation runs on an async thread. Each invocation loads orders — one connection, returned after the query completes because findByCustomerId is a repository method with its own transactional scope. So far, so good. Then it iterates and accesses order.getLineItems() — lazy-loaded. A new temporary Session is created. A new connection is checked out from HikariCP. The lazy-load query executes. Then item.getProduct() — another lazy load, another temporary Session, another connection.
For a customer with 10 orders, each having 5 line items, that is 1 + 10 + 50 = 61 connection checkouts. Most of those connections are not returned until garbage collection runs or HikariCP intervenes. Across 200 concurrent async invocations, the math is catastrophic. The pool has 10 slots. The application needs thousands.
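The arithmetic generalizes. A tiny, self-contained sketch of the checkout count (class and method names here are illustrative, not part of the service above):

```java
// Connection checkouts for one leaky report invocation:
// 1 for the initial repository query, 1 per order for the lazy
// line-item collection, 1 per line item for the lazy product.
public class CheckoutMath {

    static long checkoutsPerInvocation(long orders, long itemsPerOrder) {
        return 1 + orders + orders * itemsPerOrder;
    }

    public static void main(String[] args) {
        long perCall = checkoutsPerInvocation(10, 5);
        System.out.println(perCall);        // 61, as in the example above
        System.out.println(200 * perCall);  // 12200 checkouts for the batch
    }
}
```

Against a pool of 10 connections, the only thing standing between those 12,200 checkouts and an outage is garbage collection timing.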
Here is the configuration that makes it worse:
@Configuration
@EnableAsync
public class AsyncConfig implements AsyncConfigurer {
@Override
public Executor getAsyncExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(8);
executor.setMaxPoolSize(16);
executor.setQueueCapacity(100);
executor.setThreadNamePrefix("async-");
executor.initialize();
return executor;
}
}
// Your HikariCP pool has 10 connections.
// Your async executor has 16 threads.
// Each leaked @Async method can hold 1-N connections indefinitely.
// 16 threads * 3 lazy loads each = up to 48 connections held.
// HikariCP maximum-pool-size: 10.
//
// The math does not work. It was never going to work.
Sixteen async threads. Ten HikariCP connections. Each async thread can hold multiple connections through lazy loading. The pool is exhausted before the first report finishes generating.
I should note that the numbers in this example are conservative. In the production systems where I have observed this pattern, the object graphs were deeper — orders had shipping records, shipping records had tracking events, tracking events had carrier metadata. The lazy-load chains were four or five levels deep, each level multiplying the connection checkouts. A single @Async invocation could check out 200+ connections over the course of its execution.
Why increasing the pool size makes it worse
The natural response to a pool exhaustion incident is to increase the pool. It is a reasonable instinct. The pool ran out of connections, so give it more connections. The reasoning is sound in the general case. In this specific case, it is precisely wrong.
# The "just increase the pool" anti-pattern
#
# Incident timeline across three configuration changes:
#
# Configuration 1: maximum-pool-size = 10 (default)
# - Outage after 200 concurrent @Async calls
# - Time to pool exhaustion: 7 minutes
# - First timeout: 2:22 PM
#
# Configuration 2: maximum-pool-size = 30 (post-incident "fix")
# - Outage after 200 concurrent @Async calls
# - Time to pool exhaustion: 19 minutes
# - First timeout: 2:34 PM
# - Side effect: PostgreSQL now has 30 backend processes
# competing for CPU during normal request handling
#
# Configuration 3: maximum-pool-size = 50 ("this should be enough")
# - Outage after 200 concurrent @Async calls
# - Time to pool exhaustion: 31 minutes
# - First timeout: 2:46 PM
# - Side effect: PostgreSQL performance degraded for ALL queries
# 50 backends = context switching overhead on a 4-core machine
# pg_stat_activity shows 50 connections, 45 idle, 5 active
# The 5 active queries take 3x longer due to CPU contention
#
# You have not fixed the leak. You have given it a larger bucket.
# The bucket still has a hole in the bottom.
Increasing the pool does not fix a leak. It delays the symptoms. A pool of 10 exhausts in 7 minutes. A pool of 30 exhausts in 19 minutes. A pool of 50 exhausts in 31 minutes. The graph is the same shape — monotonically increasing active connections with no recovery — just stretched along the time axis.
But the side effects of the larger pool are immediate and harmful. PostgreSQL is not designed to handle 50 concurrent backend processes on a 4-core machine. Each backend is a full OS process with its own memory allocation, its own query planner state, its own sort buffers. A common sizing guideline for OLTP workloads, popularized by HikariCP's own pool-sizing analysis, is a pool on the order of 2-4x the number of CPU cores. On a 4-core machine, that is 8-16 connections. Pushing to 50 means that even the healthy, non-leaked connections experience degraded performance due to CPU contention and context switching overhead.
You have not fixed the problem. You have transferred it from a sharp failure (pool exhaustion, clear error message, obvious in logs) to a slow degradation (every query 2-3x slower, no clear error, invisible in logs until someone checks pg_stat_activity). Sharp failures are better than slow degradations. At least sharp failures wake someone up.
The timeline of an outage
This is what the HikariCP metrics look like during the incident. I have seen this exact pattern across four different production systems. The shape is always the same: active climbs, idle falls, pending rises, timeouts begin.
# HikariCP metrics during the incident — captured via /actuator/metrics
# 2:00 PM — normal traffic
hikaricp.connections.active: 3
hikaricp.connections.idle: 7
hikaricp.connections.pending: 0
hikaricp.connections.timeout: 0
# 2:15 PM — report generation batch started (200 @Async calls)
hikaricp.connections.active: 8
hikaricp.connections.idle: 2
hikaricp.connections.pending: 0
hikaricp.connections.timeout: 0
# 2:22 PM — active connections climbing, none being returned
hikaricp.connections.active: 10
hikaricp.connections.idle: 0
hikaricp.connections.pending: 6
hikaricp.connections.timeout: 0
# 2:24 PM — threads blocking on connection checkout
hikaricp.connections.active: 10
hikaricp.connections.idle: 0
hikaricp.connections.pending: 14
hikaricp.connections.timeout: 0
# 2:25 PM — first timeouts, users seeing errors
hikaricp.connections.active: 10
hikaricp.connections.idle: 0
hikaricp.connections.pending: 22
hikaricp.connections.timeout: 4
# 2:31 PM — full outage, every request timing out
hikaricp.connections.active: 10
hikaricp.connections.idle: 0
hikaricp.connections.pending: 47
hikaricp.connections.timeout: 183
# The active count hit 10 and NEVER CAME BACK DOWN.
# Those connections were checked out by lazy-load operations
# inside @Async methods — and never returned to the pool.
The critical observation: active connections reached 10 and never came back down. In a healthy system, active connections fluctuate — up when queries execute, down when they complete. A monotonically increasing active count with no recovery is the unmistakable signature of a connection leak. Not a slow query. Not a lock. A leak.
There is a second observation worth noting: the pending count. Pending connections are threads that are waiting to check out a connection from the pool. When pending rises while active is at maximum, every pending thread is blocked. It is doing nothing. It is consuming a thread from your servlet or async thread pool. If the pending threads are servlet threads, your application cannot serve any HTTP requests. If the pending threads are async threads, your async work has stalled entirely. Either way, the application is effectively down even before the first timeout is logged.
On the PostgreSQL side, the picture is equally diagnostic:
-- What PostgreSQL sees during the leak
-- These connections are "idle" — not executing queries, not in a transaction.
-- They are simply... held. By threads that finished their work and moved on.
SELECT pid,
state,
now() - state_change AS idle_duration,
now() - backend_start AS connection_age,
LEFT(query, 60) AS last_query
FROM pg_stat_activity
WHERE datname = 'myapp'
AND application_name LIKE '%HikariPool%'
ORDER BY state_change ASC;
-- pid | state | idle_duration | connection_age | last_query
-- 14201 | idle | 00:08:42 | 01:12:33 | SELECT p.* FROM products p WHERE p.id = $1
-- 14203 | idle | 00:07:19 | 01:12:33 | SELECT l.* FROM line_items l WHERE l.order_id = $1
-- 14207 | idle | 00:06:54 | 01:12:33 | SELECT p.* FROM products p WHERE p.id = $1
-- 14209 | idle | 00:06:11 | 01:12:33 | SELECT l.* FROM line_items l WHERE l.order_id = $1
-- 14211 | idle | 00:05:48 | 01:12:33 | SELECT p.* FROM products p WHERE p.id = $1
-- 14215 | idle | 00:05:22 | 01:12:33 | SELECT l.* FROM line_items l WHERE l.order_id = $1
-- 14217 | idle | 00:04:55 | 01:12:33 | SELECT p.* FROM products p WHERE p.id = $1
-- 14219 | idle | 00:04:12 | 01:12:33 | SELECT l.* FROM line_items l WHERE l.order_id = $1
-- 14221 | idle | 00:03:47 | 01:12:33 | SELECT p.* FROM products p WHERE p.id = $1
-- 14223 | idle | 00:02:59 | 01:12:33 | SELECT l.* FROM line_items l WHERE l.order_id = $1
--
-- 10 connections. All idle. Last queries are lazy-load fetches.
-- No transaction. No lock. Just... occupied. Like a hotel room
-- where the guest checked out but nobody told housekeeping.
Ten connections, all idle, all with last queries that are lazy-load fetches. No transactions. No locks. Just held connections. The queries finished minutes ago. The connections remain checked out because the temporary Hibernate Sessions that opened them were never explicitly closed.
Note the connection_age column: all ten connections have the same age (01:12:33). These are HikariCP's long-lived connections — checked out from the pool, not newly created. HikariCP is not opening new connections to PostgreSQL because its maximum-pool-size is already reached. It is waiting for one of these ten to be returned. They are not coming back.
Why OSIV normally masks this problem
If @Async methods leak connections, why does the same lazy-loading code work perfectly in your @Controller or @Service methods? Because OSIV is quietly doing the right thing on request threads. It is the household staff you never notice until they are absent.
# How OSIV (Open Session In View) normally masks this problem
#
# Normal HTTP request flow WITH spring.jpa.open-in-view=true (the default):
#
# 1. Request arrives on Tomcat thread
# 2. OpenSessionInViewInterceptor opens a Hibernate Session
# 3. Session obtains ONE connection from HikariCP
# 4. Controller calls service method
# 5. Service queries the database (uses the Session's connection)
# 6. Controller accesses lazy associations in the response
# 7. Lazy loads use the SAME Session, SAME connection
# 8. Response is written
# 9. OpenSessionInViewInterceptor closes the Session
# 10. Connection is returned to HikariCP
#
# One connection. One session. Entire request lifecycle. Clean.
#
# @Async method flow — OSIV DOES NOT APPLY:
#
# 1. @Async method runs on TaskExecutor thread (not the request thread)
# 2. No OpenSessionInViewInterceptor — it is a servlet filter, not a thread filter
# 3. No existing Hibernate Session
# 4. Each database access opens a NEW temporary Session
# 5. Each temporary Session obtains a NEW connection from HikariCP
# 6. The temporary Session may or may not close promptly
# 7. If it does close, the connection returns. If not: leak.
#
# The Spring documentation does not warn about this interaction.
# The Hibernate documentation does not warn about this interaction.
# You discover it at 2:25 PM on a Tuesday when your pool exhausts.
The asymmetry is subtle and easy to miss. On a request thread, OSIV provides a Session. All lazy loads share that Session and its connection. When the request completes, OSIV closes the Session and the connection is returned. You never think about connection management because you never have to.
Move the same code to an @Async method and the safety net is gone. Each lazy load is on its own. Each opens a temporary Session. Each acquires its own connection. And the return path for those connections is... optimistic at best.
This is why the leak is so insidious. Developers write and test code on request threads, where it works perfectly. They add @Async to improve response times — a sensible optimization. The code still works. Functionally. The queries still return correct results. The data is still accurate. The only difference is that connections are now leaking, and the leak does not manifest until the async call volume is high enough to exhaust the pool. In development, with one developer calling the endpoint manually, the pool never exhausts. In staging, with a handful of concurrent users, the pool might fluctuate but recovers during quiet periods. In production, with 200 concurrent batch invocations, the pool fills in minutes.
The progression from "works in development" to "catastrophic failure in production" is as smooth as it is invisible. There is no warning. There is no degradation curve. There is a threshold, and below it everything works, and above it everything stops. This is the nature of resource exhaustion failures. They are not gradual. They are binary.
Why your tests do not catch this
I feel it is important to address this directly, because the most frustrating aspect of this failure pattern is that it passes every test in your suite. Unit tests. Integration tests. End-to-end tests. They all pass. The application is functionally correct. It is only connection-management-incorrect, and no standard test pattern checks for that.
// Why your tests do not catch this
@SpringBootTest
class ReportServiceTest {
@Autowired
private ReportService reportService;
@Test
void testGenerateReport() {
// This test PASSES. Every time.
CompletableFuture<ReportDTO> future = reportService.generateReport(1L);
ReportDTO report = future.get(5, TimeUnit.SECONDS);
assertNotNull(report);
assertEquals(3, report.getOrderCount());
}
// Why it passes:
// 1. Test data is small: 3 orders, 5 line items, 5 products
// 2. That means ~13 lazy loads, ~13 connection checkouts
// 3. HikariCP default pool: 10 connections
// 4. Some connections ARE returned by GC during test execution
// 5. Single-threaded test — only ONE invocation at a time
// 6. No concurrency pressure — connections leak but pool never exhausts
//
// In production:
// 1. Real data: 50 orders, 200 line items, 150 products per customer
// 2. 200 concurrent @Async invocations (batch endpoint)
// 3. ~400 lazy loads per invocation * 200 concurrent = 80,000 connection checkouts
// 4. GC cannot keep up — connections leak faster than they are reclaimed
// 5. Pool exhausts in minutes
//
// The test proves the code is FUNCTIONALLY correct.
// It does not prove the code is CONNECTION-SAFE under concurrency.
}
Your test creates one @Async invocation. It waits for the result. It asserts the result is correct. The result is correct. The test passes. But during that single invocation, the method checked out 13 connections from the pool and returned none of them promptly. The pool had 10 connections, so 3 of those checkouts blocked briefly until GC freed some earlier ones. In a test environment, this resolves in milliseconds. In production, with 200 concurrent invocations, the blocking becomes permanent.
To catch this in testing, you would need a load test that:
- Calls the @Async method with production-scale concurrency (200+ simultaneous invocations)
- Uses production-scale data (50+ orders per customer, not 3)
- Monitors HikariCP metrics during the test, not just after
- Fails the test if active connections reach maximum-pool-size and stay there for more than 10 seconds
Most teams do not write tests like this. They are expensive to maintain, slow to run, and require infrastructure that mirrors production. I do not say this to criticize — the testing gap is genuine and structural. Connection leak detection is a monitoring concern, not a testing concern. The correct response is leak detection in production (which I will address shortly), not more unit tests.
The scenario matrix
Not every combination of thread type and transaction annotation leaks. The following table covers the six most common scenarios in Spring Boot applications. I keep this table pinned, if you will forgive the metaphor, beside the household inventory. It is the quickest way to determine whether a given code path is safe.
| Scenario | Thread | Hibernate Session | Connections | Returned? | Risk |
|---|---|---|---|---|---|
| @Transactional service, OSIV enabled | Request thread | Bound to transaction | 1 per request | Yes, at commit | None |
| No @Transactional, OSIV enabled | Request thread | OSIV session | 1 per request | Yes, at response | Low |
| @Async + @Transactional | Async thread | Bound to transaction | 1 per call | Yes, at commit | None |
| @Async, no @Transactional | Async thread | Temporary per access | 1 per lazy load | Maybe. Eventually. | LEAK |
| @Scheduled, no @Transactional | Scheduler thread | Temporary per access | 1 per lazy load | Maybe. Eventually. | LEAK |
| CompletableFuture.supplyAsync() | ForkJoinPool | Temporary per access | 1 per lazy load | Maybe. Eventually. | LEAK |
The pattern is clear. Any code that accesses Hibernate entities with lazy associations, running on a non-request thread, without @Transactional, is a connection leak. This includes @Async methods, @Scheduled methods, CompletableFuture.supplyAsync() callbacks, @EventListener handlers running on async dispatchers, and any custom thread pool that calls into JPA repositories.
The rule is simple enough to be a code review checklist item: if it runs off the request thread and touches the database, it needs @Transactional. No exceptions. Even if it only reads data. Even if it only calls one repository method. Even if the repository method itself is transactional — because the entities it returns may have lazy associations that your code subsequently traverses.
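The checklist item can even be enforced mechanically. Below is a self-contained sketch using plain reflection; the Async and Transactional annotations are stand-ins defined locally so the example runs without Spring on the classpath (a real build would point a rule at the org.springframework annotations instead, for instance via ArchUnit):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Stand-ins for org.springframework.scheduling.annotation.Async and
// org.springframework.transaction.annotation.Transactional.
@Retention(RetentionPolicy.RUNTIME) @interface Async {}
@Retention(RetentionPolicy.RUNTIME) @interface Transactional {}

public class AsyncTransactionalCheck {

    // Returns the names of @Async methods that are missing @Transactional.
    static List<String> violations(Class<?> clazz) {
        List<String> bad = new ArrayList<>();
        for (Method m : clazz.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Async.class)
                    && !m.isAnnotationPresent(Transactional.class)) {
                bad.add(m.getName());
            }
        }
        return bad;
    }

    // A sample service with one safe method and one leaky one.
    static class SampleService {
        @Async @Transactional void safeReport() {}
        @Async void leakyReport() {}
    }

    public static void main(String[] args) {
        System.out.println(violations(SampleService.class)); // [leakyReport]
    }
}
```

Run as a unit test over your service packages, this turns the code-review rule into a build failure instead of a 2 a.m. page.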
The fix: @Transactional on every @Async method that touches the database
The fix is counterintuitive. Adding @Transactional to a read-only report method feels wrong — there is nothing to commit, nothing to roll back, no data being modified. But @Transactional does more than manage transactions. It binds a Hibernate Session to the current thread for the duration of the method. That Session uses a single connection. All lazy loads within the method share that connection. And when the method completes — whether by return or by exception — Spring's transaction infrastructure closes the Session and returns the connection to HikariCP.
@Service
public class ReportService {
@Autowired
private OrderRepository orderRepository;
@Async
@Transactional(readOnly = true) // <-- THIS FIXES THE LEAK
public CompletableFuture<ReportDTO> generateReport(Long customerId) {
// Now this method has:
// 1. A transaction context
// 2. A single Hibernate Session bound to that transaction
// 3. One connection, checked out at method start, returned at method end
//
// All lazy loads use the SAME connection.
// The connection is returned when @Transactional commits/rolls back.
List<Order> orders = orderRepository.findByCustomerId(customerId);
for (Order order : orders) {
List<LineItem> items = order.getLineItems();
for (LineItem item : items) {
String productName = item.getProduct().getName();
}
}
return CompletableFuture.completedFuture(buildReport(orders));
}
}
One annotation. @Transactional(readOnly = true). The readOnly flag is not strictly necessary for fixing the leak, but it provides three benefits that make it worth including:
- Hibernate skips dirty checking. At the end of the transaction, Hibernate normally compares every managed entity against its original state to detect modifications. For a read-only method that loaded 50 orders, 250 line items, and 250 products, that is 550 entity comparisons. With readOnly = true, Hibernate skips this entirely. On a large report, this can save 50-100ms of CPU time.
- PostgreSQL enables read-only transaction optimizations. When the transaction is marked read-only, PostgreSQL never has to assign it a real transaction ID or write WAL (Write-Ahead Log) entries on its behalf. For read-heavy reporting queries, this is a measurable optimization.
- It communicates intent. A developer reading this code six months from now will understand immediately that this method reads data and does not modify it. The annotation is documentation that the framework enforces — if someone adds a save() call, it will not silently commit inside what was supposed to be a read-only operation.
The change in behavior is dramatic. Before: N connections checked out per lazy load, returned unpredictably. After: 1 connection checked out at method start, returned deterministically at method end. The pool stays healthy. The metrics stay flat. The 2:25 PM incident does not happen.
The proxy trap: @Async and @Transactional on the same class
There is a subtlety here that has caught more than a few teams who thought they had applied the fix correctly. Spring's @Async and @Transactional annotations work through AOP proxies. When you call a method on the same class using this.methodName(), the call bypasses the proxy entirely. Neither annotation takes effect.
// The proxy trap: @Async + @Transactional on the SAME class
@Service
public class ReportService {
@Autowired
private OrderRepository orderRepository;
// This is the public method called by the controller
public void triggerReportGeneration(Long customerId) {
// PROBLEM: calling an @Async method on 'this' bypasses the Spring proxy
this.generateReport(customerId); // <-- direct call, NOT through proxy
// Spring's @Async and @Transactional work via AOP proxies.
// When you call a method on 'this', you bypass the proxy entirely.
// The method runs synchronously, without @Async behavior,
// and WITHOUT the @Transactional session binding.
}
@Async
@Transactional(readOnly = true)
public CompletableFuture<ReportDTO> generateReport(Long customerId) {
// If called via this.generateReport(), neither @Async nor @Transactional
// is applied. The method runs synchronously on the request thread
// (which has OSIV, so it might work) — but the @Async annotation
// is silently ignored. No async execution at all.
List<Order> orders = orderRepository.findByCustomerId(customerId);
// ...
return CompletableFuture.completedFuture(buildReport(orders));
}
}
// Fix: inject the service into itself (self-injection) or extract to a separate class.
@Service
public class ReportOrchestrator {
@Autowired
private ReportWorker reportWorker; // separate bean, separate proxy
public void triggerReportGeneration(Long customerId) {
reportWorker.generateReport(customerId); // goes through Spring proxy
}
}
@Service
public class ReportWorker {
@Autowired
private OrderRepository orderRepository;
@Async
@Transactional(readOnly = true) // both annotations work correctly now
public CompletableFuture<ReportDTO> generateReport(Long customerId) {
List<Order> orders = orderRepository.findByCustomerId(customerId);
// ...
return CompletableFuture.completedFuture(buildReport(orders));
}
}
This is not a new problem — it is the well-documented "self-invocation" limitation of Spring AOP. But in the context of @Async + @Transactional, the symptoms are confusing. The method might appear to work (because OSIV is active on the calling request thread), but @Async is silently not applied, so the method runs synchronously. Or, if you are calling from another async context where OSIV is not available, both annotations are ignored and you get the original leak pattern.
The fix is architectural: put the @Async method in a separate Spring bean. The orchestrator bean calls the worker bean. The call crosses the proxy boundary. Both annotations are applied. The solution is two classes instead of one, which feels like unnecessary ceremony until you consider the alternative: a production outage caused by an annotation that looks correct but is silently ignored.
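The mechanism is easy to demonstrate outside Spring. Below is a self-contained sketch using a JDK dynamic proxy (Spring's CGLIB proxies behave the same way for this purpose); all class and method names are illustrative. The interceptor, standing in for @Async or @Transactional advice, fires only for calls that arrive through the proxy reference, so the internal this.inner() call never reaches it.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal plain-Java sketch of the self-invocation trap (names illustrative).
// A JDK dynamic proxy stands in for Spring's AOP proxy: advice runs only for
// calls that arrive through the proxy reference, never for 'this.method()'.
public class SelfInvocationDemo {
    interface Service { void outer(); void inner(); }

    static class ServiceImpl implements Service {
        @Override public void outer() { this.inner(); } // direct call — bypasses the proxy
        @Override public void inner() { }
    }

    /** Returns how many method calls the "advice" actually intercepted. */
    static int interceptedCalls() {
        AtomicInteger intercepted = new AtomicInteger();
        Service target = new ServiceImpl();
        Service proxy = (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[] { Service.class },
                (p, method, args) -> {
                    intercepted.incrementAndGet();   // where @Async/@Transactional advice would apply
                    return method.invoke(target, args);
                });
        proxy.outer(); // intercepted once; the nested this.inner() is NOT intercepted
        return intercepted.get();
    }

    public static void main(String[] args) {
        System.out.println(interceptedCalls()); // prints 1, not 2
    }
}
```

One interception for two method executions: that missing second interception is exactly where @Async and @Transactional silently fail to apply.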
"Connection pooling before query optimization. Query optimization before indexes. Skip a step, and you will optimize the wrong layer."
— from You Don't Need Redis, Chapter 18: The PostgreSQL Performance Decision Framework
Alternative fix: eliminate lazy loading with EntityGraph
If your @Async method needs the full object graph — orders, line items, products — there is an argument for loading everything eagerly in a single query rather than relying on lazy loading at all. This eliminates the leak by eliminating the mechanism that causes it.
// Alternative fix: use EntityGraph to eliminate lazy loading entirely
@Repository
public interface OrderRepository extends JpaRepository<Order, Long> {
@EntityGraph(attributePaths = {"lineItems", "lineItems.product"})
List<Order> findByCustomerId(Long customerId);
}
// With this EntityGraph:
// - One SQL query with JOINs fetches orders, line items, AND products
// - No lazy loading needed
// - No extra connections opened
// - Works with or without @Transactional
// - Works on any thread
//
// The query is larger, but it is ONE query, ONE connection, ONE round trip.
// This is often the better fix when your @Async method needs
// the full object graph anyway.
An @EntityGraph tells Hibernate to fetch the specified associations in the same query using JOINs. One query, one connection checkout, one round trip. No lazy loading means no risk of the leak pattern, regardless of whether @Transactional is present.
The trade-off is query complexity. An EntityGraph that spans three levels of associations generates a query with multiple JOINs, and the result set may contain significant duplication. Each order's columns are repeated once per line item; the line-item-to-product join is many-to-one, so it widens each row rather than multiplying the row count. For a customer with 10 orders, 50 line items, and 50 products, the query returns ~50 rows (one per line item). For 100 orders with 500 line items, the result set contains ~500 rows with significant column duplication: the database sends each order's columns once per line item, and each line item's and product's columns once.
For small-to-medium result sets — hundreds of rows — this is efficient. The single round trip and single connection checkout more than compensate for the data duplication. For large result sets — tens of thousands of rows with deep nesting — the Cartesian product can become expensive, both in network transfer and in Hibernate's deduplication logic.
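The row-count arithmetic is worth making concrete. This sketch encodes the rules, not any Hibernate API: a one-to-many fetch join yields one row per child, a many-to-one join leaves the row count unchanged, and a second collection join (the hypothetical tags below) would multiply rows.

```java
// Result-set size for the fetch shapes discussed above (illustrative numbers).
public class JoinRowMath {
    /** one-to-many fetch join: one result row per child */
    static int rowsAfterCollectionJoin(int parents, int childrenPerParent) {
        return parents * childrenPerParent;
    }

    /** many-to-one fetch join: widens each row, leaves the count unchanged */
    static int rowsAfterToOneJoin(int rows) {
        return rows;
    }

    public static void main(String[] args) {
        int rows = rowsAfterCollectionJoin(10, 5); // 10 orders, 5 line items each -> 50 rows
        rows = rowsAfterToOneJoin(rows);           // join to product -> still 50 rows
        // A HYPOTHETICAL second collection join (3 tags per order) would multiply:
        int twoCollections = rowsAfterCollectionJoin(rows, 3); // 150 rows
        System.out.println(rows + " " + twoCollections);       // prints: 50 150
    }
}
```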
For most @Async reporting use cases, the EntityGraph approach is the better fix. It is explicit about what data is loaded, it works on any thread, and it makes the N+1 query problem structurally impossible rather than merely contained. I recommend it as the default approach when the full object graph is needed. Use @Transactional as the fix when lazy loading is intentional — when you want to load some associations conditionally based on runtime logic.
Third fix: DTO projections — bypass entities entirely
There is a third approach that I find particularly elegant for reporting use cases, and it is the one I would recommend when the @Async method does not need entities at all — when it needs data, shaped for a specific purpose.
// Third fix: DTO projections — bypass entities and lazy loading entirely
// 1. Define an interface projection
public interface OrderReportProjection {
Long getOrderId();
String getCustomerName();
LocalDateTime getCreatedAt();
BigDecimal getTotalAmount();
int getLineItemCount();
}
// 2. Write a repository method that returns the projection
@Repository
public interface OrderRepository extends JpaRepository<Order, Long> {
@Query("""
SELECT o.id AS orderId,
c.name AS customerName,
o.createdAt AS createdAt,
o.totalAmount AS totalAmount,
(SELECT COUNT(li) FROM LineItem li WHERE li.order.id = o.id) AS lineItemCount
FROM Order o
JOIN o.customer c
WHERE o.customer.id = :customerId
ORDER BY o.createdAt DESC
""")
List<OrderReportProjection> findReportByCustomerId(@Param("customerId") Long customerId);
}
// 3. Use it in your @Async method — no entities, no lazy loading, no leak
@Async
public CompletableFuture<ReportDTO> generateReport(Long customerId) {
List<OrderReportProjection> rows = orderRepository.findReportByCustomerId(customerId);
// One query. One connection checkout. Returned immediately after the query.
// No Hibernate Session needed. No entity tracking. No lazy loading possible.
return CompletableFuture.completedFuture(buildReport(rows));
}
// Trade-off: You write the query yourself. You define the shape of the result.
// Benefit: There is nothing to leak. The connection lifecycle is trivial.
A DTO projection returns exactly the data you need, in the shape you need it, with no entity management overhead. There are no managed entities, so there is no dirty checking. There are no lazy associations, so there are no temporary Sessions. There is one query, one connection, and the connection is returned the moment the query completes.
The trade-off is that you write the query yourself. You define the shape of the result. You maintain the projection interface and the JPQL query as your schema evolves. This is more work than relying on entities and lazy loading, but it is also more explicit, more predictable, and entirely immune to connection leaks.
I have a mild preference for this approach in reporting contexts because it aligns the data access pattern with the use case. A report does not need a managed Order entity with all its associations and lifecycle hooks. It needs a row of data: order ID, customer name, date, total, line item count. The DTO projection delivers exactly that, at the cost of writing 10 lines of JPQL. That strikes me as a fair trade.
The three fixes, ranked by my preference for @Async reporting methods:
- DTO projection — eliminates entities, lazy loading, and Session management entirely. Most thorough.
- EntityGraph — keeps entities but eliminates lazy loading. Good when you need the entity graph for business logic.
- @Transactional — keeps entities and lazy loading but manages the Session lifecycle. Good when lazy loading is intentional. The quickest fix to apply.
All three work. All three fix the leak. The difference is in how far upstream you solve the problem. @Transactional fixes the symptom (unmanaged Sessions). EntityGraph fixes the cause (lazy loading). DTO projections fix the premise (using entities for a reporting use case).
The @Scheduled trap: same leak, different trigger
@Scheduled methods have the exact same vulnerability. They run on Spring's scheduling thread pool, which is not a servlet thread, which means no OSIV, which means no Session binding. The leak mechanism is identical. The consequences are, if anything, worse.
@Component
public class NightlyReportJob {
@Autowired
private OrderRepository orderRepository;
@Scheduled(cron = "0 0 2 * * *") // runs at 2 AM
// NO @Transactional — same leak pattern as @Async
public void generateNightlyReport() {
List<Order> allOrders = orderRepository
.findByCreatedAtAfter(LocalDate.now().minusDays(1));
for (Order order : allOrders) {
// Every lazy load opens a temporary Session + connection
BigDecimal total = order.getLineItems().stream()
.map(li -> li.getProduct().getPrice().multiply(
BigDecimal.valueOf(li.getQuantity())))
.reduce(BigDecimal.ZERO, BigDecimal::add);
}
// At 2:01 AM, your pool is exhausted.
// At 2:02 AM, the health check fails.
// At 2:03 AM, Kubernetes restarts the pod.
// At 2:04 AM, the cron job fires again on the fresh pod.
// Repeat.
}
// Fix: add @Transactional(readOnly = true)
// Or: use an EntityGraph to load everything eagerly
// Or: use a DTO projection that avoids entities entirely
}
This is arguably worse than the @Async case, because scheduled jobs often process larger datasets. A nightly job that touches every order from the past day might trigger thousands of lazy loads. And because it runs at 2 AM, the symptoms may not be noticed until the morning health check — or until the next production deploy restarts the pod and the job fires again on fresh connections.
The Kubernetes interaction is particularly unkind. The scheduled job exhausts the pool. The health check fails (because the health check endpoint needs a database connection, and there are none available). Kubernetes marks the pod as unhealthy and restarts it. The fresh pod starts up, Spring's scheduler fires the cron job again (it missed its 2 AM slot, so depending on your cron configuration, it may re-trigger immediately). The job runs. The pool exhausts. The health check fails. Kubernetes restarts. The cycle repeats until a human intervenes or the data set is small enough to complete before the pool fills.
The fix is identical: add @Transactional(readOnly = true). Or use EntityGraphs. Or use DTO projections that bypass entity loading entirely. The principle is the same: if you are on a non-request thread, you must explicitly manage the Hibernate Session lifecycle. Spring will not do it for you.
CompletableFuture.supplyAsync() and custom thread pools
The leak is not limited to @Async methods. Any code that runs on a non-servlet thread and accesses lazy-loaded Hibernate associations is vulnerable. CompletableFuture.supplyAsync() is the most common alternative path I encounter.
// The same leak, without @Async annotation
// CompletableFuture.supplyAsync() runs on ForkJoinPool.commonPool()
// — no servlet context, no OSIV, no transaction.
@Service
public class DashboardService {
@Autowired
private OrderRepository orderRepository;
@Autowired
private CustomerRepository customerRepository;
@Autowired
private InventoryRepository inventoryRepository;
public DashboardDTO buildDashboard() {
CompletableFuture<List<Order>> ordersFuture =
CompletableFuture.supplyAsync(() ->
orderRepository.findRecentOrders() // runs on ForkJoinPool
);
CompletableFuture<List<Customer>> customersFuture =
CompletableFuture.supplyAsync(() ->
customerRepository.findActiveCustomers() // another ForkJoinPool thread
);
CompletableFuture<InventorySummary> inventoryFuture =
CompletableFuture.supplyAsync(() -> {
List<Product> products = inventoryRepository.findAll();
for (Product p : products) {
// Lazy load: p.getWarehouseAllocations()
// Each one: new temp Session, new connection, no return
p.getWarehouseAllocations().size();
}
return summarize(products);
});
// Join all three futures
return CompletableFuture.allOf(ordersFuture, customersFuture, inventoryFuture)
.thenApply(v -> new DashboardDTO(
ordersFuture.join(),
customersFuture.join(),
inventoryFuture.join()
)).join();
}
// The ForkJoinPool has parallelism = Runtime.availableProcessors() (typically 4-8).
// If the inventory future leaks connections, those ForkJoinPool threads
// are shared across the entire JVM — other CompletableFuture work blocks too.
}
CompletableFuture.supplyAsync() runs on the ForkJoinPool.commonPool() by default. This pool is shared across the entire JVM — every parallelStream(), every CompletableFuture without an explicit executor, every framework that uses the common pool. If your lazy-loading lambda holds connections on common pool threads, those threads are unavailable for all other async work in the application.
The fix for CompletableFuture.supplyAsync() is less elegant than the @Async fix because you cannot simply add @Transactional to a lambda. You have two options:
- Move the database access to a separate @Transactional method in a Spring bean and call that method from the lambda. The lambda calls through the proxy, the transactional context is established, and the Session is managed.
- Use EntityGraphs or DTO projections so that no lazy loading occurs inside the lambda at all.
I prefer the second option. If you are writing database access code inside a CompletableFuture.supplyAsync() lambda, you have already made a choice to manage concurrency outside of Spring's framework. Embrace that choice fully by also managing your data access explicitly — fetch exactly what you need, in one query, and do not rely on lazy loading.
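Whichever option you choose, it is also worth isolating database-bound async work on a dedicated, bounded executor rather than the shared common pool, so that a future leak starves only its own threads instead of every async task in the JVM. A runnable sketch (pool sizes and names are illustrative):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: pass an explicit executor to supplyAsync() instead of defaulting to
// ForkJoinPool.commonPool(), so database work cannot starve the shared pool.
public class DedicatedExecutorDemo {
    /** Runs a task on a small dedicated pool and reports which thread ran it. */
    static String dedicatedThreadName() throws Exception {
        ExecutorService dbExecutor = Executors.newFixedThreadPool(2); // bounded, dedicated
        try {
            return CompletableFuture
                    .supplyAsync(() -> Thread.currentThread().getName(), dbExecutor)
                    .get();
        } finally {
            dbExecutor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Default: runs somewhere on the JVM-wide shared pool.
        String shared = CompletableFuture
                .supplyAsync(() -> Thread.currentThread().getName())
                .get();
        // Explicit executor: database work is confined to this bounded pool.
        System.out.println(shared + " / " + dedicatedThreadName());
    }
}
```

The isolation does not fix a leak, but it changes its blast radius: exhausted dedicated threads block only your database-bound futures, not every parallelStream() in the application.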
The @EventListener variant
There is one more common manifestation of this pattern that warrants specific attention, because it is the one that is most easily missed in code review.
// The @EventListener variant — the one nobody suspects
@Component
public class OrderEventHandler {
@Autowired
private NotificationService notificationService;
@Async // often added "for performance" without understanding the implications
@EventListener
public void handleOrderPlaced(OrderPlacedEvent event) {
Order order = event.getOrder();
// order is a detached entity — the original transaction is long gone
// order.getCustomer() triggers a lazy load on the async thread
Customer customer = order.getCustomer(); // temp Session, new connection
// customer.getPreferences() — another lazy load
NotificationPreference pref = customer.getPreferences(); // another connection
if (pref.isEmailEnabled()) {
// customer.getEmail() — yet another lazy load if Email is a separate entity
notificationService.sendEmail(customer.getEmail(), order);
}
// Three lazy loads. Three connections. None returned promptly.
// And this fires for EVERY order placed.
}
// Fix: @Transactional(readOnly = true) on this method
// Better fix: pass only the data you need in the event (customer email, preferences)
// rather than passing the entity and lazy-loading on the async thread
}
The @Async @EventListener combination is deceptive. The event handler looks like a simple method that responds to domain events. The @Async annotation is often added later, as a performance optimization — "we do not need to send the notification synchronously, let's make it async." The optimization is sound. The implementation leaks connections.
The particularly tricky aspect of event listeners is that the entity passed in the event was loaded in a different transaction. The original transaction (the one that saved the order and published the event) has committed and its Session is closed. The entity is now detached. Any lazy association accessed on the detached entity in the async event handler triggers the temporary Session mechanism. The leak proceeds as described.
The better fix for event handlers is often not @Transactional but rather redesigning the event to carry data instead of entities. Pass OrderPlacedEvent(Long orderId, String customerEmail, boolean emailEnabled) instead of OrderPlacedEvent(Order order). The event handler receives all the data it needs without touching the database at all. No lazy loading, no Session, no connection. This is also better event design — events should be self-contained, not dependent on the persistence context of their publisher.
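A sketch of the redesigned event as a plain data carrier; names are illustrative, and in real code the event would be published via Spring's ApplicationEventPublisher:

```java
// Sketch: an event that carries data, not entities. The async handler can read
// every field without a repository, a Hibernate Session, or a connection.
public class DataCarryingEventDemo {
    record OrderPlacedEvent(Long orderId, String customerEmail, boolean emailEnabled) {}

    /** Stand-in for the body of an @Async @EventListener handler. */
    static String handle(OrderPlacedEvent event) {
        // No lazy loading is possible: there is no entity here to traverse.
        return event.emailEnabled()
                ? "send to " + event.customerEmail()
                : "skip";
    }

    public static void main(String[] args) {
        OrderPlacedEvent event = new OrderPlacedEvent(42L, "jane@example.com", true);
        System.out.println(handle(event)); // prints: send to jane@example.com
    }
}
```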
Detection: HikariCP leak detection and what to look for
If you suspect this pattern in your application but are not sure, HikariCP has a built-in leak detector that will tell you exactly where the problem is. I consider this configuration non-negotiable for any Spring Boot application in production.
# HikariCP leak detection — your first line of defense
# Add to application.yml:
spring:
  datasource:
    hikari:
      leak-detection-threshold: 30000  # 30 seconds
# When a connection is held longer than 30 seconds without being returned,
# HikariCP logs a full stack trace showing WHERE the connection was checked out.
# Example output during the leak:
# WARN HikariPool-1 - Connection leak detection triggered for
# org.postgresql.jdbc.PgConnection@3a1f4b2c on thread async-3,
# stack trace follows:
# java.base/java.lang.Thread.getStackTrace
# com.zaxxer.hikari.pool.ProxyLeakTask.run
# ...
# org.hibernate.internal.SessionFactoryImpl.openSession
# org.springframework.orm.jpa.JpaTransactionManager...
# com.example.service.ReportService.generateReport(ReportService.java:42)
#
# That stack trace is the diagnosis. Line 42. The @Async method.
# The connection was checked out by a lazy load inside generateReport()
# and held by a temporary Hibernate Session with no transaction boundary.
Set leak-detection-threshold to 30 seconds (or even 10 seconds in staging). When a connection is held longer than this threshold, HikariCP logs the full stack trace of the thread that checked it out. The stack trace will point directly to the @Async or @Scheduled method, and the line within that method where the lazy load triggered the connection checkout.
Four diagnostic signals to watch for:
1. Active connections equal to maximum-pool-size for more than 30 seconds. Healthy pools fluctuate. Pools at capacity that stay at capacity are leaking or blocked. If you use Micrometer with Prometheus or Grafana, create an alert: hikaricp_connections_active == hikaricp_connections_max for more than 60 seconds. This single alert would have caught every instance of this bug I have ever seen.
2. pg_stat_activity showing idle connections with lazy-load queries. Connections in state = 'idle' whose query column shows SELECT ... WHERE foreign_key = $1 patterns are the telltale. These are lazy loads that completed but whose connections were not returned.
3. Pool exhaustion that correlates with async batch operations. If the outage happens every time the report-generation endpoint is called, or every night at 2 AM when the scheduled job runs, the correlation is your diagnosis.
4. The connection checkout stack trace points to Hibernate's lazy loading internals. If the leak detection stack trace includes AbstractPersistentCollection, DefaultInitializeCollectionEventListener, or SessionImpl.initializeCollection, you are looking at a lazy load that opened a temporary Session. The fix is one of the three approaches described above.
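The first of these signals can be turned into a standing alert rather than something you look for during an incident. A hedged sketch of a Prometheus alerting rule, assuming the metric names exported by Spring Boot's Micrometer HikariCP binding and a conventional Prometheus rules file:

```yaml
# Hedged sketch — metric names assume Spring Boot's Micrometer HikariCP binding.
groups:
  - name: hikaricp-pool
    rules:
      - alert: HikariPoolSaturated
        expr: hikaricp_connections_active >= hikaricp_connections_max
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "HikariCP pool {{ $labels.pool }} at capacity for 60 seconds"
          description: "Sustained saturation usually means leaked connections, not load."
```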
Reading the stack trace: a diagnostic walkthrough
Allow me to walk through a real leak detection stack trace, because knowing how to read it is as important as knowing it exists. The stack trace is the diagnosis. Every frame tells you something specific about what happened and where.
// Reading the leak detection stack trace — a diagnostic walkthrough
// The HikariCP leak detection log entry:
//
// WARN 2026-03-11 14:25:33.441 HikariPool-1 - Connection leak detection triggered
// for org.postgresql.jdbc.PgConnection@3a1f4b2c on thread async-3,
// stack trace follows:
//
// java.lang.Exception: Apparent connection leak detected
// at com.zaxxer.hikari.pool.ProxyLeakTask.run(ProxyLeakTask.java:100)
// at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
// ...
// at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162) // <-- HikariCP checkout
// at com.zaxxer.hikari.pool.HikariProxyConnection.prepareStatement(...) // <-- JDBC prepared statement
// at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.prepareStatement(...) // <-- Hibernate preparing the query
// at org.hibernate.loader.plan.exec.internal.AbstractLoadPlanBasedLoader.executeLoad(...)
// at org.hibernate.loader.collection.plan.AbstractLoadPlanBasedCollectionInitializer.initialize(...)
// at org.hibernate.persister.collection.AbstractCollectionPersister.initialize(...)
// at org.hibernate.event.internal.DefaultInitializeCollectionEventListener.onInitializeCollection(...)
// at org.hibernate.internal.SessionImpl.initializeCollection(...) // <-- lazy collection init
// at org.hibernate.collection.internal.AbstractPersistentCollection.forceInitialization(...)
// at org.hibernate.collection.internal.AbstractPersistentCollection.iterator(...) // <-- YOUR CODE called .iterator()
// at java.base/java.lang.Iterable.forEach(Iterable.java:74)
// at com.example.service.ReportService.generateReport(ReportService.java:42) // <-- THE LINE
// at com.example.service.ReportService$$SpringCGLIB$$0.generateReport(...)
// at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(...)
// at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(...)
// at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(...)
// at java.base/java.lang.Thread.run(Thread.java:833)
//
// Read it bottom-up:
// 1. Thread.run → ThreadPoolExecutor → this is an async thread, not a servlet thread
// 2. CompletableFuture$AsyncRun → confirms async execution
// 3. ReportService.generateReport line 42 → YOUR code, the @Async method
// 4. AbstractPersistentCollection.iterator → a lazy collection was iterated
// 5. SessionImpl.initializeCollection → Hibernate opened a Session to load it
// 6. HikariPool.getConnection → that Session checked out a connection
//
// Diagnosis complete.
The bottom-up reading order matters. You start with the thread identity (async thread, not servlet thread — so OSIV is not active). You move through the framework layers (CompletableFuture, Spring proxy, Hibernate Session internals). You arrive at your code — the specific line where the lazy load was triggered. That line is where the connection was checked out.
One subtlety: the stack trace shows where the connection was checked out, not where it was leaked. The connection was checked out by a lazy load on line 42. It was leaked because there is no @Transactional to close the Session when the method completes. The fix is not on line 42 — it is on the method declaration, where @Transactional needs to be added.
I recommend keeping a few of these stack traces in your team's runbook. Once you have seen the pattern — AbstractPersistentCollection.iterator inside a thread named async-* — you will recognize it instantly the next time it appears. The diagnosis goes from "two hours of investigation" to "ten seconds of reading a stack trace."
An honest counterpoint: when this does not apply
I should be forthcoming about the cases where this entire discussion does not apply, because overapplying a fix is its own form of error.
If your @Async methods do not access lazy-loaded associations, there is no leak. An @Async method that calls a repository's query method — findByCustomerId() — and returns the result without traversing any lazy associations is safe. The repository method opens a Session, executes the query, closes the Session, and returns the result. One connection, properly returned. The leak only occurs when your code triggers lazy loading outside of a managed Session.
If you have disabled hibernate.enable_lazy_load_no_trans, Hibernate throws a LazyInitializationException instead of opening a temporary Session. This is a different kind of failure — louder, more obvious, and arguably better because it tells you exactly what is wrong. No temporary Sessions means no leaked connections. You get an exception instead of a leak. Many teams consider this the safer default.
If your entities have no lazy associations — either because you have set fetch = FetchType.EAGER on all associations (not recommended for other reasons, but it does prevent this specific issue) or because your entities are simple with no relationships — then there is nothing to lazy-load, and the temporary Session mechanism is never triggered.
If you are using Spring Data JDBC instead of Spring Data JPA, this entire article does not apply. Spring Data JDBC does not have lazy loading, does not have Sessions, and does not have the OSIV interceptor. It loads the aggregate root and its members in a single query. The connection lifecycle is simple and deterministic. I have a certain admiration for this approach, though it trades Hibernate's flexibility for a more constrained — and more predictable — data access model.
The general rule: if you are not sure whether your @Async methods are vulnerable, add @Transactional(readOnly = true) anyway. The cost is negligible — a single connection per method invocation, properly managed. The benefit is certainty. And certainty, in connection management, is worth a great deal.
A checklist for auditing your application
If you have arrived at this section having read the preceding analysis, you are likely wondering whether your application has this problem. Allow me to provide a systematic approach to finding out.
- Search for @Async on any method that directly or indirectly accesses JPA entities. Grep your codebase for @Async. For each method, trace whether it calls a repository method that returns entities, and whether it subsequently accesses any association on those entities. If yes, and @Transactional is absent, the method is a leak candidate.
- Search for @Scheduled with the same criteria. The same analysis applies. Any @Scheduled method that touches entities with lazy associations without @Transactional is vulnerable.
- Search for CompletableFuture.supplyAsync and CompletableFuture.runAsync. These lambdas run on the common pool or a custom executor. If they access entities, they are vulnerable.
- Check for @Async @EventListener combinations. Event handlers that receive entities and access lazy associations are the most commonly missed variant.
- Enable HikariCP leak detection in staging. Set leak-detection-threshold: 10000 (10 seconds) and run your integration tests. If any stack traces appear, they point directly to the vulnerable code paths.
- Check your async executor configuration. If maxPoolSize on your ThreadPoolTaskExecutor is greater than HikariCP's maximum-pool-size, you have more async threads than connections. This is fine when connections are properly managed. It is catastrophic when connections leak.
The audit takes an hour. The fix — adding @Transactional(readOnly = true) to every @Async and @Scheduled method that touches the database — takes another hour. Two hours of preventive work to avoid a production outage that, in my experience, costs between 4 and 40 hours of incident response, root cause analysis, and post-mortem documentation.
What Gold Lapel does when your pool is leaking
The correct fix for this problem is at the application level: @Transactional on your @Async methods, EntityGraphs to eliminate lazy loading, or DTO projections to bypass entity management entirely. No proxy can fix a connection leak caused by application code holding references to connections it should have released.
What Gold Lapel can do is limit the blast radius.
# What Gold Lapel sees during a connection leak
#
# Without Gold Lapel:
# App -> HikariCP (10 connections) -> PostgreSQL
# Leaked connections sit idle in PostgreSQL, holding slots.
# New requests queue behind the HikariCP checkout timeout.
# Total outage in minutes.
#
# With Gold Lapel:
# App -> HikariCP (10 connections) -> Gold Lapel -> PostgreSQL
#
# Gold Lapel's connection health checks detect connections that have been
# idle (no queries, no transaction) for longer than the configured threshold.
# It can release the backend PostgreSQL connection while keeping the
# frontend connection to HikariCP alive.
#
# This does not fix the leak — your @Async methods still hold connections
# from HikariCP's perspective. But it prevents the leak from cascading
# to PostgreSQL. Your pg_stat_activity stays clean. Other applications
# sharing the same PostgreSQL instance are not affected.
#
# The real fix is still @Transactional on your @Async methods.
# Gold Lapel buys you time. It does not buy you absolution. Gold Lapel sits between HikariCP and PostgreSQL. When it detects that a frontend connection (from your application) has been idle — no queries, no open transaction — for longer than a configurable threshold, it can reclaim the backend PostgreSQL connection. The frontend connection to HikariCP remains open, so your application does not see an error. But the actual PostgreSQL backend slot is freed for other work.
This is not a fix. It is damage containment. Your @Async methods are still holding HikariCP connections. Your application pool is still exhausted. But PostgreSQL itself — and any other applications connecting to the same database — are protected from the cascade. Gold Lapel's connection pooling absorbs the impact of leaked connections by multiplexing a larger number of application connections onto a smaller, correctly-sized backend pool.
The Spring Boot integration takes one dependency. Add the goldlapel-spring-boot starter to your project and it configures HikariCP automatically. Everything else — your transaction management, your @Async configuration, your leak detection — stays exactly the same. You gain a layer of protection that buys you time to find and fix the leak, rather than discovering it through a production outage at 2:25 PM on a Tuesday.
But please do fix the leak. Gold Lapel buys you time. It does not buy you absolution. Add the annotation. Your connections will be grateful for the escort home.
There is a companion problem worth knowing about. The open-in-view pool exhaustion guide covers the other common path to HikariCP starvation in Spring Boot — one where connections are held for the duration of the HTTP request rather than leaked by async threads. Different cause, same symptom, equally fixable.