Spring Batch on PostgreSQL: Why saveAll() Is 100× Slower Than COPY
Four approaches to writing 100,000 rows. One of them finishes before the others have negotiated their first batch flush. The gap is not subtle.
Good evening. Your batch job is not batching.
Allow me to present a familiar scene. You have a Spring Batch job that processes a CSV file, an API response, or a message queue backlog. The reader works fine. The processor is fast. The writer calls repository.saveAll() because that is what every tutorial shows. The chunk size is 1,000. Everything looks correct.
Then you run it against 100,000 rows and it takes fourteen minutes.
You check PostgreSQL. pg_stat_statements shows 200,000 SQL statements executed. Two hundred thousand. For a hundred thousand rows. Half are SELECT queries you did not write — Hibernate checking whether each entity already exists before deciding to INSERT or UPDATE. The other half are individual INSERT statements, one per row, each waiting for a response before the next one fires.
Your chunk size of 1,000 is a lie. Spring Batch hands 1,000 items to the writer. The writer hands them to JPA. JPA processes them one at a time. Batching in name only.
I have seen this pattern in every Spring Batch deployment that uses JPA for writes. The framework's abstraction layers — which serve admirable purposes for CRUD operations — become a liability at bulk-write volumes. The overhead is not theoretical. It is measurable, it is substantial, and it compounds with every row.
There is a better way. There are, in fact, three better ways, each faster than the last. The fastest is 105 times faster than saveAll(). That is not a misprint. If you will permit me, I should like to walk through each approach, explain precisely where the time goes, and let you decide which trade-offs suit your situation.
Approach 1: naive saveAll() — the default everyone ships
Here is what most Spring Batch writers look like in production. It is the approach that Spring Data JPA encourages, that Stack Overflow answers recommend, and that works fine for 500 rows.
@Component
public class EventItemWriter implements ItemWriter<Event> {

    @Autowired
    private EventRepository repository;

    @Override
    public void write(Chunk<? extends Event> chunk) {
        // The default approach: JPA saveAll()
        // Looks clean. Performs terribly.
        repository.saveAll(chunk.getItems());
    }
}
// What actually happens per entity:
// 1. SELECT to check if entity exists (merge vs persist)
// 2. INSERT INTO events (...) VALUES (...) — individual row
// 3. Wait for response
// 4. Repeat for next entity
//
// For 100,000 rows: 200,000+ SQL statements.
// Round trips dominate. The database is mostly waiting.

The problem is not saveAll() itself. The problem is what saveAll() does under the hood when your entity uses @GeneratedValue(strategy = GenerationType.IDENTITY) — which is what Spring Data JPA tutorials overwhelmingly demonstrate.
For each entity, Hibernate calls entityManager.merge(). Merge first executes a SELECT to check if the entity exists. Finding nothing, it issues an INSERT. With IDENTITY generation, that INSERT must execute immediately and return the database-generated ID before Hibernate can proceed. No batching is possible.
Two SQL statements per entity. 100,000 entities. 200,000 round trips. On a localhost connection with 0.1ms latency, that is 20 seconds of pure wire protocol overhead before PostgreSQL does any real work. Over a network to a managed database — 1-2ms latency — multiply accordingly.
Result: ~1,200 rows/second.
Why IDENTITY generation defeats batching entirely
This deserves its own explanation, because the mechanism is subtle and the consequence is severe.
When you annotate an entity with @GeneratedValue(strategy = GenerationType.IDENTITY), you are telling Hibernate that the database generates the primary key — typically via a SERIAL or BIGSERIAL column, which is syntactic sugar for a PostgreSQL sequence with a DEFAULT nextval(). The key is generated inside the INSERT statement itself, as a side effect of the row being written.
Hibernate needs that key. The persistence context is keyed on (entity type, primary key). Without the key, Hibernate cannot register the entity, cannot manage its lifecycle, and cannot process any cascaded associations that reference it. So Hibernate executes the INSERT, reads the generated key from the JDBC getGeneratedKeys() response, sets it on the entity, and only then moves to the next entity.
This is inherently sequential. There is no way to batch IDENTITY-generated INSERTs, because the ID of entity N is not known until entity N's INSERT has completed, and Hibernate refuses to proceed until it has that ID.
// IDENTITY generation — disables batching entirely
@Entity
public class Event {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // ...
}
// With IDENTITY, Hibernate MUST execute each INSERT immediately
// and read back the database-generated ID before proceeding.
// It cannot batch because it needs each ID to populate the
// entity's @Id field for the persistence context.
//
// The JDBC driver sends:
// INSERT INTO events (...) VALUES (...)
// -- wait for response, read generated key --
// INSERT INTO events (...) VALUES (...)
// -- wait for response, read generated key --
// ... 100,000 times
// SEQUENCE generation — enables batching
@Entity
public class Event {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE,
                    generator = "event_seq")
    @SequenceGenerator(name = "event_seq",
                       sequenceName = "event_id_seq",
                       allocationSize = 50)
    private Long id;

    // ...
}
// With SEQUENCE + allocationSize=50, Hibernate pre-allocates
// 50 IDs per database round trip (SELECT nextval('event_id_seq')).
// It assigns IDs locally, then batches the INSERTs.
//
// For 100,000 rows: 2,000 sequence calls + ~2,000 batch executions
// vs 200,000 individual INSERT round trips.

SEQUENCE generation solves this by separating ID allocation from row insertion. Hibernate calls nextval() to pre-allocate a block of IDs (controlled by allocationSize), assigns them locally to entities in memory, and then batches the INSERTs. The sequence call is one round trip per block. With allocationSize=50 and 100,000 entities, that is 2,000 sequence calls instead of 100,000 immediate INSERT-and-read-back cycles.
I should note that switching from IDENTITY to SEQUENCE is not always free. If your application has existing data with SERIAL columns, you need to create a sequence, set its starting value above the current maximum ID, and update the entity mapping. If other applications or raw SQL scripts insert into the same table, they must also use the sequence. These are solvable problems, but they are not zero-cost, and I would be doing you a disservice to pretend otherwise.
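A migration of that shape is short. Here is a sketch using the events table from the examples and the sequence name from the entity mapping above — treat the names and starting values as illustrative, and note that the sequence's INCREMENT BY should match the allocationSize on the Hibernate side (the pooled optimizer assumes they agree):

```sql
-- Create a sequence whose increment matches Hibernate's allocationSize,
-- then start it safely above the current maximum id.
CREATE SEQUENCE event_id_seq INCREMENT BY 50;
SELECT setval('event_id_seq', (SELECT COALESCE(MAX(id), 0) + 1 FROM events));

-- Point the column default at the new sequence so that raw SQL inserts
-- from other applications keep working.
ALTER TABLE events ALTER COLUMN id SET DEFAULT nextval('event_id_seq');
```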
The persistence context tax
Even after fixing the IDENTITY problem, JPA still imposes a per-entity overhead that becomes visible at bulk-write volumes. I find it instructive to enumerate exactly what happens for each entity that passes through persist(), because the aggregate cost is often larger than teams expect when they have only benchmarked at small scale.
// What happens inside the persistence context for each entity
// during a saveAll() / persist() call:
// 1. Cascade check
// - Walk every @OneToMany, @ManyToOne, @OneToOne, @ManyToMany
// - Check cascade type (PERSIST, MERGE, ALL)
// - For each cascaded association, recursively enter this flow
// Typical cost: 2-8 μs per entity (depends on mapping complexity)
// 2. Entity state snapshot
// - Allocate Object[] array sized to the entity's property count
// - Deep-copy every property value into the snapshot
// - Store in the persistence context's entityEntries map
// Typical cost: 1-3 μs per entity, plus ~200 bytes heap per entity
// 3. First-level cache registration
// - HashMap.put(entityKey, entity)
// - HashMap.put(entityKey, snapshot)
// Typical cost: <1 μs per entity, but memory adds up
// 4. Lifecycle callbacks
// - @PrePersist methods on the entity
// - EntityListeners callbacks
// - Interceptor.onSave() if registered
// Typical cost: 0-5 μs depending on listener complexity
// 5. Dirty check at flush time (the big one)
// - For EVERY managed entity (not just new ones):
// compare current field values to the saved snapshot
// - Uses DeepComparator for each field
// - For 100,000 entities: 100,000 dirty checks
// Typical cost: 3-10 μs per entity
// Total JPA overhead per entity: ~10-25 μs
// For 100,000 entities: 1-2.5 seconds of pure framework overhead
// This is BEFORE any SQL touches the wire.

The numbers in that breakdown are approximate, but the pattern is consistent across every profiling session I have observed. For 100,000 entities, the persistence context alone — before any SQL reaches PostgreSQL — adds 1 to 2.5 seconds of pure CPU time on the application side. That is time spent allocating objects, copying field values, traversing association graphs, and comparing snapshots.
The em.clear() call in the tuned writer is essential. Without it, the persistence context grows monotonically through the chunk, holding 100,000 entity references and 100,000 dirty-checking snapshots simultaneously. That is 20-40 MB of heap for the entities plus another 20-40 MB for the snapshots. The garbage collector notices. Especially if your batch job processes multiple chunks — by the third chunk, you may be triggering full GC pauses.
The flush-and-clear pattern mitigates the memory problem but does not eliminate the per-entity overhead. Every entity still enters the persistence context, has its cascade graph walked, and has a snapshot allocated — even though the snapshot will be cleared moments later. The work is done and then discarded. A tidy waiter might describe this as straightening a room that no one will occupy.
Approach 2: tuned JPA batching — fixing the obvious mistakes
The first improvement requires three changes: switch from IDENTITY to SEQUENCE generation, configure Hibernate batching, and use persist() instead of merge() to eliminate the existence-check SELECT.
# application.properties — JPA batching tuned
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
spring.datasource.url=jdbc:postgresql://localhost:5432/myapp\
?reWriteBatchedInserts=true
# And the writer must use persist(), not merge():
@Component
public class EventItemWriter implements ItemWriter<Event> {

    @PersistenceContext
    private EntityManager em;

    @Override
    public void write(Chunk<? extends Event> chunk) {
        for (Event event : chunk.getItems()) {
            em.persist(event);  // no existence-check SELECT, unlike merge()
        }
        em.flush();  // push the batched INSERTs to the driver now
        em.clear();  // drop entities and snapshots before the next chunk
    }
}
// Better. Hibernate now batches INSERTs in groups of 50.
// With reWriteBatchedInserts, pgjdbc collapses those into
// multi-row VALUES clauses.
//
// But JPA still parses each entity through the persistence
// context, dirty-checks, cascades, lifecycle callbacks...
// That overhead adds up at scale.

This gets you real batching. Hibernate groups INSERTs by entity type and sends them in batches of 50. With reWriteBatchedInserts=true on the JDBC URL, the pgjdbc driver rewrites those batches into multi-row VALUES clauses, collapsing 50 individual Execute messages into a single multi-row INSERT.
The SELECT-before-INSERT problem is gone. The one-row-at-a-time INSERT problem is gone. For 100,000 rows, you get roughly 1,700 JDBC statements instead of 200,000.
But JPA is still in the path. Every entity passes through the persistence context. Hibernate allocates dirty-checking snapshots, processes lifecycle callbacks, cascades relationships, and manages the first-level cache. For a bulk write operation where you are creating new entities from flat data, all of that machinery is overhead. Pure overhead.
There is a subtlety with batch size that merits attention. Setting hibernate.jdbc.batch_size=50 does not mean Hibernate always sends batches of 50. If your step writes multiple entity types — say, Event and EventMetadata — Hibernate batches them separately and may interleave the batches, breaking the rewrite optimization. The order_inserts=true property fixes this by sorting entities by type before flushing, ensuring that all Event INSERTs are grouped together and all EventMetadata INSERTs are grouped together.
Result: ~18,000 rows/second. Fifteen times faster than naive saveAll(). Respectable. Not fast.
Approach 3: JdbcBatchItemWriter — dropping the ORM for writes
Spring Batch ships a writer that bypasses JPA entirely: JdbcBatchItemWriter. It uses raw JDBC addBatch()/executeBatch() with named parameters. No entity manager. No persistence context. No dirty checking. No existence check.
@Bean
public JdbcBatchItemWriter<Event> jdbcWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Event>()
            .dataSource(dataSource)
            .sql("INSERT INTO events (name, category, payload, created_at) " +
                 "VALUES (:name, :category, :payload, :createdAt)")
            .beanMapped()
            .build();
}
// JdbcBatchItemWriter uses JDBC addBatch()/executeBatch()
// directly — no entity manager, no persistence context,
// no dirty checking, no SELECT-before-INSERT.
//
// With reWriteBatchedInserts=true in the JDBC URL,
// the driver rewrites batches into multi-row INSERTs.
//
// For 100,000 rows with chunk-size=1000:
// 100 batches * ~17 multi-row statements each = ~1,700 statements
// vs saveAll()'s 200,000+

The SQL is explicit. The parameter binding is straightforward. Spring Batch calls executeBatch() once per chunk, and with reWriteBatchedInserts=true, the pgjdbc driver collapses the batch into multi-row INSERTs just as it does with Hibernate batching.
The difference is everything that does not happen. No reflection to read entity fields. No snapshot allocation for dirty checking. No cascade traversal. No persistence context bookkeeping. The writer receives a list of objects, binds their properties to SQL parameters, and sends the batch. That is all.
For teams already using Spring Batch, this is a one-bean change. You keep your reader and processor. You replace the writer. Everything else stays the same.
The beanMapped() configuration uses BeanPropertySqlParameterSource, which resolves named parameters by calling getters on your item object via reflection. This is convenient but occasionally insufficient — particularly when your schema uses PostgreSQL-specific types like JSONB, arrays, or custom enums. For those cases, provide a custom parameter source.
// For complex mappings, use ItemSqlParameterSourceProvider:
@Bean
public JdbcBatchItemWriter<Event> jdbcWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Event>()
            .dataSource(dataSource)
            .sql("INSERT INTO events (name, category, payload, created_at) " +
                 "VALUES (:name, :category, :payload::jsonb, :createdAt)")
            .itemSqlParameterSourceProvider(event -> {
                MapSqlParameterSource params = new MapSqlParameterSource();
                params.addValue("name", event.getName());
                params.addValue("category", event.getCategory());
                params.addValue("payload", event.getPayload(),
                        Types.OTHER); // PostgreSQL JSONB
                params.addValue("createdAt", event.getCreatedAt());
                return params;
            })
            .build();
}
// This gives you full control over type mapping.
// PostgreSQL-specific types (JSONB, arrays, custom enums)
// work correctly because you control the JDBC type hint.

I should be direct about the trade-off here. JdbcBatchItemWriter means you are writing raw SQL. If your table schema changes — a column is renamed, a type changes — your SQL string breaks at runtime, not at compile time. JPA entity mappings, for all their overhead, give you a layer of compile-time safety (with annotation processing) and schema migration awareness. For batch jobs that run unattended at 3 AM, a runtime SQL error at row 80,000 is a different kind of pain than a slow-but-correct saveAll(). Choose with that in mind.
Result: ~42,000 rows/second. 2.3 times faster than tuned JPA. 35 times faster than saveAll(). The JPA overhead was real, and it was larger than you might expect.
Approach 4: COPY protocol — the freight entrance
Everything so far has been variations on INSERT. Individual INSERTs. Batched INSERTs. Multi-row INSERTs. They all use the same PostgreSQL extended query protocol: Parse, Bind, Execute, Sync. The database parses the SQL, plans the query, binds parameters, executes, and returns a result. For every batch.
The COPY protocol is a different animal entirely. It is a dedicated bulk-loading pathway that bypasses the SQL parser, the query planner, and the per-row execution overhead. The client opens a COPY stream, sends rows as formatted data (CSV or binary), and PostgreSQL writes them directly into the heap.
@Component
public class CopyItemWriter implements ItemWriter<Event> {

    @Autowired
    private DataSource dataSource;

    @Override
    public void write(Chunk<? extends Event> chunk) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            PGConnection pgConn = conn.unwrap(PGConnection.class);
            CopyManager copyManager = pgConn.getCopyAPI();

            String copySql = "COPY events (name, category, payload, created_at) " +
                    "FROM STDIN WITH (FORMAT csv)";

            // copyIn(String) returns a CopyIn operation; PGCopyOutputStream
            // adapts it to a java.io stream we can wrap in a Writer.
            try (Writer writer = new OutputStreamWriter(
                    new PGCopyOutputStream(copyManager.copyIn(copySql)),
                    StandardCharsets.UTF_8)) {
                for (Event event : chunk.getItems()) {
                    writer.write(escapeCsv(event.getName()));
                    writer.write(',');
                    writer.write(escapeCsv(event.getCategory()));
                    writer.write(',');
                    writer.write(escapeCsv(event.getPayload()));
                    writer.write(',');
                    writer.write(event.getCreatedAt().toString());
                    writer.write('\n');
                }
            }
        }
    }

    private String escapeCsv(String value) {
        if (value == null) return "";
        // Quote when the value contains a delimiter, quote, or line break
        // (carriage returns must be quoted too, or COPY rejects the row).
        if (value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }
}

The API is lower-level. You are formatting CSV rows and writing them to a stream. There is no parameter binding, no ORM mapping, no batch size to tune. You open the pipe and pour data through it.
-- The PostgreSQL COPY protocol is fundamentally different
-- from INSERT. Here is what happens on the wire:
-- INSERT (even multi-row):
-- Client: Parse → Bind → Execute → Sync
-- Server: ParseComplete → BindComplete → CommandComplete → ReadyForQuery
-- Repeat per batch. Each batch = 4 messages out, 4 back.
-- COPY:
-- Client: sends "COPY ... FROM STDIN" query
-- Server: CopyInResponse (ready to receive)
-- Client: CopyData (stream of rows — one TCP message per ~8KB chunk)
-- Client: CopyData (next chunk)
-- Client: CopyData (next chunk)
-- ...
-- Client: CopyDone
-- Server: CommandComplete → ReadyForQuery
--
-- For 100,000 rows:
-- INSERT batches: ~1,700 statement round trips
-- COPY: 1 protocol handshake + ~150 CopyData messages + 1 CopyDone
--
-- COPY also bypasses the SQL parser for each row.
-- The executor reads the binary/CSV stream directly into the heap.
-- No per-row planning. No per-row parse overhead.
-- WAL writes are batched internally by PostgreSQL.

The wire protocol difference is dramatic. A batch INSERT of 100,000 rows requires approximately 1,700 statement round trips (with optimal batching and reWriteBatchedInserts). COPY requires one protocol handshake, roughly 150 CopyData messages (8KB each, streaming), and one CopyDone. Total wire messages: about 152 versus about 3,400.
But the wire protocol is only half the story. Inside PostgreSQL, COPY also wins:
- No per-row SQL parsing. The COPY command is parsed once. Individual rows are parsed as CSV/binary data, not as SQL.
- No per-row query planning. There is no plan to generate. The target table and columns are fixed for the entire stream.
- Batched WAL writes. PostgreSQL buffers WAL records internally during COPY and flushes them in larger batches than individual INSERTs produce.
- Reduced lock overhead. One relation lock for the entire COPY, not one per batch.
- Trigger optimization. Row-level triggers still fire, but statement-level triggers fire once, not per-batch.
Result: ~127,000 rows/second. Three times faster than JdbcBatchItemWriter. 105 times faster than saveAll().
COPY and transactions: the fine print
Before you rewrite your writers, there is an important detail about how COPY interacts with Spring Batch's transaction management. Getting this wrong will cost you data integrity, which is considerably more expensive than slow writes.
// COPY and Spring Batch transaction management
//
// IMPORTANT: COPY FROM STDIN runs inside the current transaction.
// If the transaction rolls back, the COPYed rows are rolled back too.
// This is exactly what you want for Spring Batch chunk processing.
// However, there is a subtlety with DataSource.getConnection():
// Option A: Let Spring manage the transaction (preferred)
@Component
public class CopyItemWriter implements ItemWriter<Event> {

    @Autowired
    private DataSource dataSource;

    @Override
    public void write(Chunk<? extends Event> chunk) throws Exception {
        // DataSourceUtils gets the connection bound to the
        // current Spring transaction — NOT a new connection.
        Connection conn = DataSourceUtils.getConnection(dataSource);
        try {
            PGConnection pgConn = conn.unwrap(PGConnection.class);
            CopyManager copyManager = pgConn.getCopyAPI();
            // ... COPY logic ...
        } finally {
            DataSourceUtils.releaseConnection(conn, dataSource);
        }
    }
}

// Option B: dataSource.getConnection() — DANGER
// This gets a NEW connection from the pool, outside Spring's
// transaction. The COPY runs in its own auto-committed transaction.
// If the chunk's processor threw an exception and Spring wants
// to roll back, the COPY data is already committed.
//
// Use DataSourceUtils. Always.

The critical point is this: DataSource.getConnection() and DataSourceUtils.getConnection(dataSource) are not the same call. The first gives you a new connection from the pool. The second gives you the connection bound to Spring's current transaction. If you use the first, your COPY runs in a separate auto-committed transaction, and Spring Batch's chunk retry/rollback mechanism cannot touch it. Rows 1 through 999 are permanently committed even if row 1,000 throws an exception that should have rolled back the entire chunk.
This is not an academic concern. I have observed production systems where COPY writers used dataSource.getConnection() and ran without issues for months — until a constraint violation on a single row exposed the transaction isolation gap. The resulting data inconsistency was far more costly to remediate than the original performance problem.
DataSourceUtils. Always.
Error handling: what happens when row 47,832 fails
One of the questions I am asked most frequently about COPY is this: what happens when a single row has a constraint violation? With INSERT, each row fails independently. With COPY, the answer is different, and it shapes how you design your error-handling strategy.
@Component
public class CopyItemWriter implements ItemWriter<Event> {

    @Autowired
    private DataSource dataSource;

    @Override
    public void write(Chunk<? extends Event> chunk) throws Exception {
        // Transaction-bound connection — see the previous section.
        Connection conn = DataSourceUtils.getConnection(dataSource);
        try {
            PGConnection pgConn = conn.unwrap(PGConnection.class);
            CopyManager copyManager = pgConn.getCopyAPI();

            String copySql = "COPY events (name, category, payload, created_at) " +
                    "FROM STDIN WITH (FORMAT csv)";

            try {
                long rowsCopied = copyManager.copyIn(copySql,
                        new CopyDataInputStream(chunk.getItems()));
                // rowsCopied == chunk.size() if successful
            } catch (PSQLException e) {
                // COPY fails atomically — if row 47,832 has a
                // constraint violation, ALL rows in this chunk
                // are rolled back. No partial state.
                //
                // The error message includes the line number:
                //   "ERROR: duplicate key value violates unique
                //    constraint ... CONTEXT: COPY events, line 47832"
                //
                // Spring Batch's retry/skip mechanism works here:
                // configure skip-limit and retryable-exceptions
                // in your step definition.
                throw e;
            }
        } finally {
            DataSourceUtils.releaseConnection(conn, dataSource);
        }
    }
}
// For upsert-like behavior with COPY, use a staging table:
//
// 1. COPY into a temp table (no constraints = no failures)
// 2. INSERT INTO events SELECT ... FROM temp_events
// ON CONFLICT (id) DO UPDATE SET ...
// 3. DROP the temp table
//
// This gives you COPY speed + upsert semantics.

COPY is atomic at the statement level. If any row fails — a NOT NULL violation, a unique constraint violation, a check constraint failure — the entire COPY command fails and all rows are rolled back. There is no partial success. This is PostgreSQL's transaction semantics working exactly as designed, but it means your chunk-level retry strategy matters more with COPY than with individual INSERTs.
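A note on CopyDataInputStream, which the writer above hands to copyIn(String, InputStream): it is not a pgjdbc class but a helper you supply. A minimal, non-streaming sketch follows — it assumes items arrive pre-rendered as String[] column values (mapping Event fields to columns is the part elided here), and it reuses the CSV quoting rules from escapeCsv() in the Approach 4 writer:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch of the CopyDataInputStream helper assumed above.
// Rows are pre-rendered String[] column values; a production version
// would stream rows lazily rather than materializing the whole chunk.
class CopyDataInputStream extends ByteArrayInputStream {

    CopyDataInputStream(List<String[]> rows) {
        super(render(rows).getBytes(StandardCharsets.UTF_8));
    }

    private static String render(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                if (i > 0) sb.append(',');
                sb.append(escape(row[i]));
            }
            sb.append('\n');  // one CSV line per row
        }
        return sb.toString();
    }

    // Same CSV quoting rules as escapeCsv() in the writer above.
    private static String escape(String v) {
        if (v == null) return "";
        if (v.contains(",") || v.contains("\"")
                || v.contains("\n") || v.contains("\r")) {
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }
}
```

For chunk sizes in the low thousands, materializing the chunk up front costs little; for very wide rows, a lazy InputStream over the item iterator is the tidier choice.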
Spring Batch's skip and retry mechanisms still work. When a chunk fails, Spring Batch can retry by falling back to single-item processing — writing each item in the chunk individually, skipping the ones that fail, and committing the rest. With COPY, this fallback means the retry path uses individual INSERTs, not COPY. The fast path handles the happy case; the slow path handles the exceptions. This is an entirely reasonable design for workloads where failures are rare.
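That fallback is configured on the step, not the writer. A sketch of a fault-tolerant step definition, assuming the Spring Batch 5 builder API — the bean names, the skipped exception type, and the skip limit are all illustrative:

```java
// Sketch: fault-tolerant step wiring (Spring Batch 5 StepBuilder API).
// On a chunk failure, Spring Batch re-processes the chunk one item at a
// time, skipping offenders up to skipLimit and committing the rest.
@Bean
public Step loadEventsStep(JobRepository jobRepository,
                           PlatformTransactionManager txManager,
                           ItemReader<Event> reader,
                           ItemWriter<Event> copyWriter) {
    return new StepBuilder("loadEvents", jobRepository)
            .<Event, Event>chunk(1_000, txManager)
            .reader(reader)
            .writer(copyWriter)
            .faultTolerant()
            .skip(PSQLException.class)
            .skipLimit(100)
            .build();
}
```

Note that the single-item retry path goes through the same writer; if your writer only speaks COPY, the per-item retries are one-row COPYs, which work but are slow — acceptable precisely because failures are rare.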
For workloads where failures are common — upserts, merge operations, data with known quality issues — the staging table pattern is your friend. COPY into a temporary table with no constraints (which cannot fail), then INSERT-SELECT into the target table with ON CONFLICT handling. You get COPY speed for the data transfer and SQL flexibility for the conflict resolution. The temporary table adds one extra step, but since it has no indexes and no constraints, COPY into it is actually faster than COPY into the target table.
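The staging pattern itself fits in three statements. A sketch with illustrative names — the ON CONFLICT target assumes a unique constraint on name, which you would replace with your table's real key:

```sql
-- 1. Staging table with the target's shape but no constraints or indexes.
--    ON COMMIT DROP handles cleanup automatically.
CREATE TEMP TABLE staging_events (LIKE events INCLUDING DEFAULTS) ON COMMIT DROP;

-- 2. From the application, COPY into the staging table:
--    COPY staging_events (name, category, payload, created_at)
--      FROM STDIN WITH (FORMAT csv)

-- 3. Merge into the target with conflict resolution.
INSERT INTO events (name, category, payload, created_at)
SELECT name, category, payload, created_at
FROM staging_events
ON CONFLICT (name) DO UPDATE
    SET category   = EXCLUDED.category,
        payload    = EXCLUDED.payload,
        created_at = EXCLUDED.created_at;
```

Because all three statements run inside the chunk's transaction, a failure in the merge step rolls back the COPY as well.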
The honest counterpoint: when COPY is not the answer
I have spent considerable time praising COPY, and every word of it is earned. But a waiter who overstates his case is no waiter at all, so allow me to be forthcoming about the scenarios where COPY is the wrong choice — or at least not clearly the right one.
Small volumes. If your batch job writes 500 rows, the performance difference between JdbcBatchItemWriter and COPY is approximately 12 milliseconds versus 4 milliseconds. Eight milliseconds saved per chunk is not worth the added complexity of CSV formatting, escape handling, and custom writer code. JdbcBatchItemWriter is the right tool for small-to-medium volumes — say, under 10,000 rows per chunk. The crossover point where COPY's setup overhead is amortized varies by row width, but 10,000 is a reasonable heuristic.
Complex types. COPY's CSV format handles strings, numbers, timestamps, and booleans cleanly. It handles JSONB if you are careful with escaping. It handles arrays with PostgreSQL's array literal syntax. But if your table has composite types, range types, or deeply nested JSONB with embedded CSV-hostile characters (newlines, quotes, NUL bytes), the escaping logic becomes fragile. A single escaping bug produces corrupt data silently — COPY does not validate semantics, only syntax. For complex types, JdbcBatchItemWriter's parameter binding is safer because the JDBC driver handles type conversion.
Connection poolers. PgBouncer in transaction-pooling mode does not support the COPY protocol. Full stop. If your Spring Batch application connects through PgBouncer — which is common in cloud-hosted PostgreSQL deployments — COPY will fail with a protocol error. You need either session-pooling mode (which largely defeats PgBouncer's purpose) or a direct connection that bypasses the pooler. This is a deployment constraint, not a code constraint, but it is one that many teams discover at the worst possible moment.
Computed columns and defaults. With INSERT, PostgreSQL can compute DEFAULT values, call functions in DEFAULT expressions, and populate generated columns. With COPY, you must provide values for every column in the COPY column list — or omit columns that have defaults, in which case PostgreSQL applies the defaults. But if your DEFAULT expression involves a function that reads other tables (a pattern I have seen more than I would like), COPY still executes that function per row, and the performance advantage shrinks. Know your defaults.
Audit triggers. COPY fires row-level triggers, so audit triggers work. But AFTER triggers on COPY are deferred until the COPY completes, which means your audit trigger cannot abort the COPY based on the result of a lookup. If your audit logic requires synchronous validation per row, COPY may not behave as you expect. Test your triggers with COPY explicitly before deploying.
"This is not a hardware problem. It is not a budget problem. It is a knowledge problem. And knowledge problems, if you'll permit me, are the very best kind."
— from You Don't Need Redis, Chapter 1: Good Evening. We Have a Problem.
The benchmark
All four approaches tested against PostgreSQL 16, same hardware, same table structure, same data, same JVM. Table has four columns plus a SERIAL primary key. No foreign keys, no triggers, one B-tree index on created_at. 100,000 rows per run, median of five runs.
| Approach | Rows/sec | SQL stmts | Wire round trips | DB CPU | App CPU |
|---|---|---|---|---|---|
| saveAll() — naive JPA | 1,200 | 200,000+ | ~200,000 | Low | High |
| Tuned JPA batching | 18,000 | ~1,700 | ~3,400 | Low | Medium |
| JdbcBatchItemWriter | 42,000 | ~1,700 | ~3,400 | Medium | Low |
| COPY protocol | 127,000 | 1 | ~152 | Medium | Low |
| COPY vs saveAll() | 105x | — | 1,315x fewer | — | — |
A few things worth noting. The JdbcBatchItemWriter and tuned JPA batching produce the same number of SQL statements — the difference is entirely JPA's client-side overhead (persistence context management, dirty checking, snapshot allocation). That overhead costs you 2.3x throughput.
COPY's advantage over JdbcBatchItemWriter is partly wire protocol (152 messages vs 3,400) and partly server-side (no per-row parsing or planning). The 3x gap is consistent across different row sizes. With wider rows — more columns, larger text fields — COPY's advantage grows because it avoids repeated parameter binding overhead.
The "App CPU" column is telling. saveAll() burns the most CPU on the application side — Hibernate's persistence context is doing enormous amounts of work that produces no value for a bulk write. COPY burns the least, because there is no ORM, no reflection, no framework. Just string formatting and a stream.
One variable this benchmark holds constant is network latency — all tests were run on localhost. Over a network, the gap between COPY and everything else widens further, because COPY's streaming protocol sends data without waiting for per-batch acknowledgments. At 2ms network latency, COPY's advantage over JdbcBatchItemWriter grows from 3x to approximately 5x, because the 3,400 round trips each accumulate 2ms of dead time. That is 6.8 seconds of pure latency that COPY does not pay.
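The arithmetic behind that claim is worth making explicit. A back-of-the-envelope calculation, using the round-trip counts from the benchmark table:

```java
// Back-of-the-envelope: dead time spent waiting on acknowledgments.
// Each synchronous round trip stalls for the full network latency.
class LatencyCost {

    static double deadTimeSeconds(int roundTrips, double latencyMs) {
        return roundTrips * latencyMs / 1000.0;
    }

    public static void main(String[] args) {
        // Batched INSERTs: ~3,400 synchronous round trips at 2 ms each.
        System.out.println(deadTimeSeconds(3_400, 2.0)); // 6.8
        // COPY streams CopyData without waiting for acknowledgments,
        // so it pays roughly the handshake plus the final confirmation.
        System.out.println(deadTimeSeconds(2, 2.0));     // 0.004
    }
}
```

Six point eight seconds of stall versus a few milliseconds, before a single byte of row data is counted.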
How to verify which strategy is actually in use
I have lost count of the number of times a team has told me "we configured Hibernate batching" only to discover, upon inspection, that the batching was never active. The configuration was in place, but the entity used IDENTITY generation, or reWriteBatchedInserts was misspelled, or the order_inserts property was missing and Hibernate was interleaving entity types. The configuration and the reality were strangers to one another.
PostgreSQL gives you the tools to verify exactly what is happening on the wire. Use them.
-- How to verify which write strategy is actually in use
-- (because configuration and reality often disagree)
-- Check statement counts per query pattern:
SELECT query,
calls,
total_exec_time::numeric(10,1) as total_ms,
mean_exec_time::numeric(10,3) as mean_ms,
rows
FROM pg_stat_statements
WHERE query LIKE '%events%'
ORDER BY total_exec_time DESC
LIMIT 10;
-- If you see this, you are in saveAll() territory:
-- query: SELECT ... FROM events WHERE id = $1
-- calls: 100000
-- query: INSERT INTO events (...) VALUES ($1,$2,$3,$4)
-- calls: 100000
-- If you see this, JPA batching is working:
-- query: INSERT INTO events (...) VALUES ($1,$2,$3,$4),($5,$6,$7,$8),...
-- calls: 2000
-- (no SELECT queries against the events table)
-- If you see this, COPY is working:
-- No INSERT statements for the events table at all.
-- COPY statements do not appear in pg_stat_statements.
-- Instead, check pg_stat_user_tables:
SELECT relname,
       n_tup_ins,
       n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'events';
-- n_tup_ins shows total inserts since last stats reset.
-- If this number grows but pg_stat_statements shows no INSERT,
-- your data is arriving via COPY.

The pg_stat_statements extension is your ground truth. If your configuration says "batching is enabled" but pg_stat_statements shows 200,000 individual INSERT calls, your configuration is wrong. Believe the database, not the properties file.
For COPY specifically, there is an additional subtlety: COPY statements do not appear in pg_stat_statements. This is a known limitation — the COPY protocol bypasses the query execution path that pg_stat_statements hooks into. To confirm that COPY is working, check pg_stat_user_tables for the n_tup_ins counter on your target table. If inserts are happening but pg_stat_statements shows no INSERT queries for that table, the rows are arriving via COPY.
If you want to see COPY activity in real time, enable log_statement = 'all' temporarily in postgresql.conf. You will see the COPY command in the log, though the individual CopyData messages are not logged. For production monitoring, this is too verbose — but for a one-time verification that your writer is using the protocol you think it is using, it is invaluable.
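If you want to automate this check, the decision logic above can be captured in a small helper. This is a heuristic of my own, not a Spring Batch or PostgreSQL API; the three counts would come from the pg_stat_statements and pg_stat_user_tables queries shown earlier:

```java
// Hypothetical helper: classify the write strategy for one table from
// its pg_stat_statements / pg_stat_user_tables counters. Thresholds
// mirror the signatures described above and are heuristics only.
public final class WriteStrategyDetector {
    public static String detect(long selectByPkCalls, long insertCalls, long rowsInserted) {
        if (insertCalls == 0 && rowsInserted > 0) {
            return "COPY";                  // rows arrived, but no INSERTs were logged
        }
        if (selectByPkCalls > 0 && insertCalls >= rowsInserted) {
            return "saveAll (no batching)"; // one SELECT plus one INSERT per row
        }
        if (insertCalls > 0 && insertCalls < rowsInserted / 10) {
            return "batched INSERT";        // multi-row VALUES, far fewer calls than rows
        }
        return "unclear, inspect pg_stat_statements directly";
    }

    public static void main(String[] args) {
        System.out.println(detect(100_000, 100_000, 100_000)); // prints "saveAll (no batching)"
        System.out.println(detect(0, 2_000, 100_000));         // prints "batched INSERT"
        System.out.println(detect(0, 0, 100_000));             // prints "COPY"
    }
}
```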
The metadata time bomb: GET_LAST_STEP_EXECUTION and issue #3634
There is a second performance problem in Spring Batch on PostgreSQL that has nothing to do with your data tables. It is in Spring Batch's own metadata tables, and it gets worse every day your application runs. I confess that this problem exercises my patience more than most, because it is entirely preventable and yet Spring Batch has shipped without a fix for years.
Spring Batch maintains its execution state in a set of tables: BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_STEP_EXECUTION, and others. Every job launch inserts rows. Every step execution inserts rows. These tables grow monotonically. They are never cleaned up by default.
Before every job launch, Spring Batch queries these tables to check for previous executions, determine restartability, and find the last step execution. The critical query is GET_LAST_STEP_EXECUTION:
-- The query that destroys Spring Batch at scale.
-- From Spring Batch's JdbcJobExecutionDao:
SELECT JOB_EXECUTION_ID, START_TIME, END_TIME, STATUS,
       EXIT_CODE, EXIT_MESSAGE, CREATE_TIME, LAST_UPDATED, VERSION
FROM BATCH_JOB_EXECUTION
WHERE JOB_INSTANCE_ID = ?
ORDER BY JOB_EXECUTION_ID DESC;
-- And the step execution query (GET_LAST_STEP_EXECUTION):
SELECT STEP_EXECUTION_ID, STEP_NAME, START_TIME, END_TIME,
       STATUS, COMMIT_COUNT, READ_COUNT, FILTER_COUNT,
       WRITE_COUNT, EXIT_CODE, EXIT_MESSAGE, READ_SKIP_COUNT,
       WRITE_SKIP_COUNT, PROCESS_SKIP_COUNT, ROLLBACK_COUNT,
       LAST_UPDATED, VERSION, CREATE_TIME
FROM BATCH_STEP_EXECUTION
WHERE JOB_EXECUTION_ID = ?
  AND STEP_NAME = ?
ORDER BY STEP_EXECUTION_ID DESC;
-- With 500,000 job executions and 2 million step executions,
-- these queries scan millions of rows. Every. Single. Job. Launch.
--
-- GitHub issue #3634 (15+ upvotes):
-- "GET_LAST_STEP_EXECUTION becomes catastrophically slow"
-- https://github.com/spring-projects/spring-batch/issues/3634

The BATCH_STEP_EXECUTION table has no index on (JOB_EXECUTION_ID, STEP_NAME). The BATCH_JOB_EXECUTION table has no index on JOB_INSTANCE_ID. Spring Batch's default schema scripts create primary keys and a few foreign keys, but not the covering indexes these queries need.
With 500 jobs running daily, after a year you have 182,500 job executions and potentially millions of step executions. The GET_LAST_STEP_EXECUTION query does a sequential scan of the entire BATCH_STEP_EXECUTION table on every job launch. GitHub issue #3634 documents this — 15+ upvotes, open since 2022, still unresolved in the framework.
At two million step execution rows, that sequential scan takes 1.8 seconds. Per job launch. If you launch 500 jobs per day, that is 15 minutes of pure waste — every day — scanning metadata tables. And the problem compounds: the tables grow, the scans get slower, and eventually the metadata queries take longer than the actual batch processing. I have observed production systems where job launch time was dominated not by reading, processing, or writing data, but by Spring Batch checking its own bookkeeping.
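Before assuming you have this problem, measure it. A quick diagnostic (a sketch; the table names assume Spring Batch's default BATCH_ prefix, which PostgreSQL folds to lowercase):

```sql
-- How big have the Spring Batch metadata tables grown?
SELECT relname,
       n_live_tup,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE relname LIKE 'batch_%'
ORDER BY n_live_tup DESC;
```

If BATCH_STEP_EXECUTION is in the hundreds of thousands of rows or more, the indexes in the next section are not optional.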
The fix: three indexes Spring Batch forgot
The solution is three indexes. They are straightforward, they are safe to add to a live system, and they reduce the metadata query time from 1.8 seconds to 0.05 milliseconds.
-- The indexes that Spring Batch does NOT create by default:
-- Fix for GET_LAST_STEP_EXECUTION:
CREATE INDEX idx_batch_step_execution_job_step
    ON BATCH_STEP_EXECUTION (JOB_EXECUTION_ID, STEP_NAME, STEP_EXECUTION_ID DESC);
-- Fix for job execution lookups:
CREATE INDEX idx_batch_job_execution_instance
    ON BATCH_JOB_EXECUTION (JOB_INSTANCE_ID, JOB_EXECUTION_ID DESC);
-- Fix for job instance lookups by name:
CREATE INDEX idx_batch_job_instance_name
    ON BATCH_JOB_INSTANCE (JOB_NAME);
-- Before indexes:
-- EXPLAIN ANALYZE on GET_LAST_STEP_EXECUTION with 2M rows:
-- Seq Scan on batch_step_execution
-- cost=0.00..89247.32 rows=2 width=312
-- actual time=1847.233..1847.891 rows=1 loops=1
-- Planning Time: 0.2 ms
-- Execution Time: 1848.1 ms
--
-- After indexes:
-- Index Scan using idx_batch_step_execution_job_step
-- cost=0.43..8.45 rows=1 width=312
-- actual time=0.028..0.029 rows=1 loops=1
-- Planning Time: 0.3 ms
-- Execution Time: 0.051 ms
--
-- 1,848 ms → 0.05 ms. That is not a typo.
-- That is a 36,000x improvement on the metadata query alone.

The composite index on (JOB_EXECUTION_ID, STEP_NAME, STEP_EXECUTION_ID DESC) gives PostgreSQL a direct path to the single row it needs. The index sort order matches the ORDER BY clause, so PostgreSQL can return the first matching row straight from the index without a sort step.
Why does Spring Batch not include these indexes? The schema scripts are designed to be database-agnostic. The primary key indexes are sufficient for H2 and HSQLDB — the embedded databases used in Spring Batch's test suite and most tutorials. PostgreSQL's query planner makes different choices at scale, and the metadata table access patterns were not optimized for production workloads with hundreds of thousands of historical executions.
There is an uncomfortable truth here. Spring Batch is primarily tested against embedded databases at small scale. The PostgreSQL schema is a translation of the H2 schema, not a design optimized for PostgreSQL's behavior at volume. This is not a criticism of the Spring Batch team — they are maintaining a framework that targets a dozen databases — but it is a reality that PostgreSQL users must account for.
Metadata cleanup: the problem that follows the problem
Adding indexes is necessary but not sufficient. The metadata tables will continue to grow, and even with proper indexes, multi-million-row tables impose costs: larger backups, slower VACUUM operations, increased WAL volume, and larger index maintenance overhead on every INSERT.
Spring Batch does not include a built-in metadata cleanup mechanism. There is no max-age property, no auto-purge, no retention policy. If you want old execution records removed, you must build it yourself.
// Spring Batch metadata cleanup strategies.
// (spring.batch.jdbc.table-prefix only renames the metadata tables;
// it does not manage their retention.)
//
// Option 1: a scheduled cleanup task. No Spring Batch release,
// including 5+, ships a built-in max-age, auto-purge, or retention
// policy, so you must write one yourself:
@Scheduled(cron = "0 0 3 * * *") // 3 AM daily
public void cleanBatchMetadata() {
    // Remove completed executions older than 30 days
    jdbcTemplate.update("""
        DELETE FROM BATCH_STEP_EXECUTION_CONTEXT
        WHERE STEP_EXECUTION_ID IN (
            SELECT se.STEP_EXECUTION_ID
            FROM BATCH_STEP_EXECUTION se
            JOIN BATCH_JOB_EXECUTION je
              ON se.JOB_EXECUTION_ID = je.JOB_EXECUTION_ID
            WHERE je.STATUS = 'COMPLETED'
              AND je.END_TIME < NOW() - INTERVAL '30 days'
        )
        """);
    // Then delete step executions, then job execution context,
    // then job executions, then job instances.
    // Order matters — foreign key constraints.
}
// Option 2: partitioned tables (PostgreSQL 12+).
// Partition BATCH_STEP_EXECUTION by month and drop old partitions
// instead of running DELETE. Dramatically faster cleanup with no
// long-running DELETEs, but it requires custom schema scripts,
// because Spring Batch's default DDL does not support partitioning.

The deletion order matters because of foreign key constraints. You must delete step execution contexts before step executions, step executions before job execution contexts, job execution contexts before job executions, and job executions before job instances. If you get the order wrong, PostgreSQL will tell you — loudly — with a foreign key violation. This is PostgreSQL doing exactly what it should: protecting referential integrity.
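Written out in full, the FK-safe order looks like the sketch below. One caution: the standard schema also includes BATCH_JOB_EXECUTION_PARAMS, which the snippet above does not mention but which also references BATCH_JOB_EXECUTION and must be purged before it. Table names here are Spring Batch's defaults; each entry would get a DELETE shaped like the one in the scheduled task:

```java
import java.util.List;

// FK-safe deletion order for Spring Batch metadata tables:
// children first, parents last.
public final class MetadataCleanupOrder {
    public static final List<String> DELETION_ORDER = List.of(
        "BATCH_STEP_EXECUTION_CONTEXT", // references BATCH_STEP_EXECUTION
        "BATCH_STEP_EXECUTION",         // references BATCH_JOB_EXECUTION
        "BATCH_JOB_EXECUTION_CONTEXT",  // references BATCH_JOB_EXECUTION
        "BATCH_JOB_EXECUTION_PARAMS",   // references BATCH_JOB_EXECUTION
        "BATCH_JOB_EXECUTION",          // references BATCH_JOB_INSTANCE
        "BATCH_JOB_INSTANCE"            // parent of everything
    );

    public static void main(String[] args) {
        DELETION_ORDER.forEach(System.out::println);
    }
}
```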
For high-volume deployments, consider partitioned tables. Partitioning BATCH_STEP_EXECUTION by month lets you drop entire partitions instead of running DELETE queries. A DROP TABLE on a partition is instantaneous regardless of row count, produces no dead tuples, and does not require a subsequent VACUUM. The trade-off is that Spring Batch's default DDL does not create partitioned tables, so you need custom schema scripts — which means you are taking ownership of the schema rather than letting the framework manage it.
I find this to be the correct posture for production PostgreSQL deployments regardless. Frameworks should suggest schema; production databases should own it.
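For illustration, a partitioning sketch under stated assumptions: the column list is abridged (the full list comes from Spring Batch's schema-postgresql.sql), and PostgreSQL requires the partition key to be part of the primary key on a partitioned table:

```sql
-- Sketch only: a partitioned replacement for BATCH_STEP_EXECUTION.
CREATE TABLE batch_step_execution (
    step_execution_id BIGINT       NOT NULL,
    job_execution_id  BIGINT       NOT NULL,
    step_name         VARCHAR(100) NOT NULL,
    create_time       TIMESTAMP    NOT NULL,
    -- remaining columns from Spring Batch's schema-postgresql.sql
    PRIMARY KEY (step_execution_id, create_time)
) PARTITION BY RANGE (create_time);

CREATE TABLE batch_step_execution_2024_01
    PARTITION OF batch_step_execution
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Cleanup becomes instantaneous: drop the month, not the rows.
DROP TABLE batch_step_execution_2024_01;
```

Note one knock-on effect: because a unique constraint on a partitioned table must include the partition key, the foreign key from BATCH_STEP_EXECUTION_CONTEXT can no longer reference step_execution_id alone and must be adjusted as part of the custom schema.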
The migration path: from saveAll() to COPY without rewriting everything
If you are reading this with a production Spring Batch application that currently uses saveAll(), the prospect of rewriting to COPY may feel daunting. Allow me to suggest a staged approach that delivers value at each step, so you are never all-or-nothing.
Stage 1: Fix the metadata tables. This is a ten-minute task with immediate payoff. Run the three CREATE INDEX statements from the previous section. No code changes. No redeployment. Just DDL. If your metadata tables have more than 50,000 rows, you will see the improvement immediately in your job launch times.
Stage 2: Switch from IDENTITY to SEQUENCE generation. This requires an entity change and a one-time database migration (create the sequence, set the start value). It is the prerequisite for any batching improvement, and it benefits all JPA operations, not just batch writes. Deploy this independently and verify with pg_stat_statements that SELECT-before-INSERT queries have disappeared.
Stage 3: Enable Hibernate batching. Three properties in application.properties and the reWriteBatchedInserts=true JDBC parameter. Update your writer to use persist() and add flush()/clear(). This gets you from 1,200 rows/second to 18,000. Deploy and verify.
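For reference, the Stage 3 configuration in one place. The property names are Hibernate's and pgJDBC's standard ones; the batch size of 1,000 is illustrative and simply mirrors the chunk size used throughout this article:

```properties
# Hibernate-side batching:
spring.jpa.properties.hibernate.jdbc.batch_size=1000
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
# Driver-side rewrite: collapses the JDBC batch into
# multi-row INSERT ... VALUES (...),(...) statements.
spring.datasource.url=jdbc:postgresql://localhost:5432/myapp?reWriteBatchedInserts=true
```

Without the last parameter, Hibernate's batching still sends one INSERT per row over the wire; with it, the driver rewrites each batch into a single multi-row statement.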
Stage 4: Switch to JdbcBatchItemWriter. Replace the JPA writer with a JDBC writer for your highest-volume steps. Keep JPA writers for low-volume steps where the safety of entity mapping outweighs the performance cost. This gets your high-volume steps from 18,000 to 42,000 rows/second. Deploy and verify.
Stage 5: COPY for the workloads that justify it. For steps that write more than 50,000 rows per run, implement a COPY-based writer. Use DataSourceUtils for transaction safety. Test error handling with intentional constraint violations. This gets your highest-volume steps to 127,000 rows/second. Deploy and verify.
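The riskiest part of a hand-rolled COPY writer is the data formatting, so here is that piece isolated as a pure helper. This is a sketch under stated assumptions: default COPY CSV options (comma delimiter, double-quote quoting, NULL represented as an empty unquoted field). The resulting rows would be streamed to PostgreSQL via the JDBC driver's CopyManager.copyIn(); that wiring is omitted here:

```java
// Formats one row for COPY ... FROM STDIN WITH (FORMAT csv).
// Assumes PostgreSQL's default CSV options: ',' delimiter, '"' quote,
// NULL as an empty unquoted field (a quoted "" is an empty string).
public final class CopyCsvFormatter {
    static String field(String v) {
        if (v == null) return "";  // NULL -> empty, unquoted field
        if (v.isEmpty() || v.contains(",") || v.contains("\"")
                || v.contains("\n") || v.contains("\r")) {
            // Quote the field and double any embedded quotes.
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }

    static String row(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(field(fields[i]));
        }
        return sb.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(row("42", "hello, world", null)); // prints: 42,"hello, world",
    }
}
```

Getting this escaping wrong does not fail loudly; it silently corrupts data or aborts mid-stream, which is why the formatting logic deserves its own unit tests before it ever touches a CopyManager.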
Each stage is independently deployable, independently valuable, and independently reversible. You do not need to reach stage 5 to benefit. Many teams stop at stage 3 or 4 and find the performance sufficient for their volumes. The right stage is the one where the performance meets your requirements, not the one with the largest number.
What Gold Lapel does for Spring Batch
Gold Lapel is a self-optimizing PostgreSQL proxy. It sits between your Spring Batch application and PostgreSQL, observing every query on the wire protocol. For Spring Batch workloads, three things happen automatically.
First: COPY passes through transparently. The COPY protocol is a binary-level stream. Gold Lapel detects the CopyIn/CopyData/CopyDone message sequence and passes it through without buffering, rewriting, or adding latency. Your 127,000 rows/second throughput is preserved. This is not trivial — many database proxies and connection poolers either do not support COPY or add meaningful overhead to the stream. Gold Lapel's Rust-based protocol handler treats COPY as a first-class protocol operation.
# Gold Lapel sits between Spring Batch and PostgreSQL.
# No query changes. Add the Spring Boot starter and Gold Lapel
# handles the rest — your datasource, your queries, your COPY streams.
spring.datasource.url=jdbc:postgresql://localhost:2345/myapp\
?reWriteBatchedInserts=true
# Gold Lapel:
# - Passes COPY protocol through transparently
# - Detects the batch metadata table scan pattern
# - Auto-creates the missing indexes on BATCH_STEP_EXECUTION,
# BATCH_JOB_EXECUTION, and BATCH_JOB_INSTANCE
# - Monitors INSERT throughput and flags saveAll() anti-patterns
#
# The COPY protocol is binary-level pass-through.
# Gold Lapel does not buffer, rewrite, or interfere with the
# CopyData stream. Your 100K rows/sec COPY throughput is preserved.

Second: batch metadata indexes are created automatically. When Gold Lapel observes a sequential scan on BATCH_STEP_EXECUTION or BATCH_JOB_EXECUTION that exceeds a configurable duration threshold, it identifies the missing index, validates it against the table schema, and creates it. The next time Spring Batch launches a job, the metadata query hits the index. 1.8 seconds becomes 0.05 milliseconds. No manual intervention. No DDL scripts to remember. No EXPLAIN ANALYZE sessions to diagnose the problem.
Third: saveAll() anti-patterns are flagged. Gold Lapel sees the pattern — rapid individual INSERTs interleaved with SELECT-by-primary-key, targeting the same table, within the same transaction. That signature is unmistakable: an ORM doing merge() on new entities with IDENTITY generation. Gold Lapel surfaces this in your dashboard with the specific table, the statement count, and the estimated time cost of the anti-pattern versus the alternatives.
The data writing itself is your decision. saveAll(), JdbcBatchItemWriter, COPY — Gold Lapel works with all of them. But the metadata table problem is something a proxy is uniquely positioned to solve, because it sees the sequential scan happening in real time and can fix it before your next job launch.
The freight entrance is open
One hundred thousand rows should not take fourteen minutes. They should not take eighty-three seconds. They should take less than one second. And the framework's own metadata queries — the bookkeeping that nobody thinks to check — should not be the silent bottleneck that degrades every subsequent job launch.
The four approaches I have presented are not theoretical alternatives. They are a progression, each removing a layer of abstraction that was adding cost without adding value for bulk-write operations. saveAll() at 1,200 rows/second. Tuned JPA at 18,000. JdbcBatchItemWriter at 42,000. COPY at 127,000. The right choice depends on your volume, your error-handling requirements, your deployment constraints, and your team's comfort with lower-level APIs.
What is not a valid choice is ignorance of the options. Now that you know what saveAll() actually does — two SQL statements per entity, zero batching with IDENTITY generation, a persistence context that costs more than the database work — you can make an informed decision about how many of those layers your workload needs.
Your rows are waiting. The freight entrance is open.
If you are tuning bulk writes, the JPA side deserves its own treatment. I have prepared a guide on Spring Boot JPA batch insert performance that covers that path in depth — useful when COPY is not an option and you need the ORM route to be as fast as it can be.