
Spring Data JPA Interface Projections: The Hidden SELECT * That Tanks Your PostgreSQL Performance

Your interface projection declares four getters. Hibernate fetches twelve columns. The seven large ones you did not ask for contain 75 KB of TOAST data per row.

The Waiter of Gold Lapel · Published Mar 5, 2026 · Updated Mar 20, 2026 · 38 min read
The projection promised restraint, then packed the entire wardrobe. We are sorting through the luggage.

Good evening. Your projection is not projecting.

Allow me to bring an uncomfortable matter to your attention. You have a Spring Data JPA repository. You have defined an interface-based projection with four getter methods: getId(), getName(), getPrice(), getSku(). You expect PostgreSQL to receive a query selecting those four columns.

It does not.

Hibernate fetches the full entity — all twelve columns, including three TEXT fields averaging 5 KB each, three JSONB documents averaging 7 KB each, and a BYTEA thumbnail at 40 KB. Your interface projection is a Java-side wrapper around a complete entity. The database did all the work of retrieving every column. The interface merely hides the ones you did not declare.

This is not a bug. It is documented behavior. But the documentation is buried in the Spring Data JPA reference under a section most developers skim past, and every Spring Boot tutorial teaches interface projections first because they require less code.

On narrow tables with small columns, the performance difference is negligible. On wide tables with TOAST-stored data — the kind of tables that exist in every production application of sufficient age — the difference is 16x to 22x. With a covering index, the gap widens to 160x.

Those are not hypothetical numbers. They come from the benchmarks in this article, run against a PostgreSQL 16 instance with 50,000 rows of realistic product data.

I should be direct about why this matters more than most ORM performance issues. Interface projections are not a niche feature. They are the default recommendation in the Spring ecosystem. They appear in the official documentation examples, in Baeldung tutorials, in Stack Overflow answers, in Spring Boot Starter guides. They are the first approach a developer encounters. And on any table that accumulates TEXT or JSONB columns — which is to say, on any table that survives its first year in production — they silently transform a 3 ms query into a 480 ms one.

If you will permit me, I shall explain precisely how this happens, demonstrate the damage with EXPLAIN ANALYZE, and present every remedy available in the Spring Data ecosystem. I shall also tell you when interface projections are perfectly acceptable, because a waiter who overstates his case is no waiter at all.

The wide table: where projection performance matters

Narrow tables hide the problem. A table with six VARCHAR(50) columns and no TEXT or JSONB fields will show almost no difference between interface and DTO projections. The full row is small enough that fetching extra columns is lost in the noise. I would not trouble you about a 2% difference. I am troubling you because this is a 16,000% difference.

The problem emerges on tables like this:

Product.java — a wide entity with TOAST-heavy columns
@Entity
@Table(name = "products")
public class Product {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    @Column(nullable = false)
    private String name;

    @Column(nullable = false)
    private BigDecimal price;

    @Column(length = 50)
    private String sku;

    @Column(length = 100)
    private String category;

    @Column(columnDefinition = "TEXT")
    private String description;         // avg 2 KB

    @Column(columnDefinition = "TEXT")
    private String longDescription;     // avg 8 KB

    @Column(columnDefinition = "TEXT")
    private String specifications;      // avg 4 KB

    @Column(columnDefinition = "JSONB")
    private String metadata;            // avg 3 KB

    @Column(columnDefinition = "JSONB")
    private String searchIndex;         // avg 6 KB

    @Column(columnDefinition = "JSONB")
    private String auditLog;            // avg 12 KB

    @Column(columnDefinition = "BYTEA")
    private byte[] thumbnail;           // avg 40 KB

    // 12 columns. 5 small. 7 large.
    // Average row size: ~75 KB.
    // A SELECT * on 500 rows transfers ~37 MB.
}

Twelve columns. Five are small — id, name, price, sku, category — of which the four the projection wants total about 52 bytes per row. Seven are large — text descriptions, JSONB documents, a binary thumbnail — together averaging 75 KB per row.

This is not an unusual schema. I have attended to production databases across e-commerce platforms, content management systems, analytics dashboards, healthcare records, and financial services applications. Every one of them had at least three tables wider than this. The pattern is universal: a table starts with five sensible columns, then accumulates descriptions, metadata, audit trails, and binary attachments over months and years. Nobody notices the SELECT * growing heavier because the columns are added one at a time, and each individual column feels small.

Three kilobytes of JSONB metadata. Surely that is nothing? Until you multiply it by seven TOAST columns and 500 rows per request and 200 requests per minute: 75 KB per row × 500 rows × 200 requests comes to roughly 7.5 GB per minute of data that no one asked for and no one uses.

How to measure your own table width

Before proceeding, you may wish to check whether your own tables are wide enough to care about. This query separates the main heap size from TOAST storage, which tells you how much data lives in the annex versus the main building:

Inspecting TOAST storage sizes
-- Inspect TOAST storage for your own tables
SELECT
    c.relname                                    AS table_name,
    pg_size_pretty(pg_relation_size(c.oid))      AS main_heap,
    pg_size_pretty(
        pg_total_relation_size(c.oid)
        - pg_relation_size(c.oid)
        - pg_indexes_size(c.oid)
    )                                            AS toast_size,
    pg_size_pretty(pg_indexes_size(c.oid))       AS indexes,
    pg_size_pretty(pg_total_relation_size(c.oid)) AS total
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;

--  table_name | main_heap | toast_size | indexes | total
-- ------------+-----------+------------+---------+---------
--  products   | 42 MB     | 3.6 GB     | 18 MB   | 3.7 GB
--  orders     | 210 MB    | 1.1 GB     | 85 MB   | 1.4 GB
--  articles   | 12 MB     | 890 MB     | 22 MB   | 924 MB
--
-- When toast_size dwarfs main_heap, SELECT * is fetching data
-- from the annex, not the main building. Every column you do not
-- need is a trip you did not have to make.

When the TOAST size dwarfs the main heap, you have a wide table. The products table above stores 42 MB of inline data and 3.6 GB of TOAST data. A SELECT * on this table is fetching 85x more data from TOAST than from the heap itself.

For a more granular view — which specific columns are TOASTed and how large they are on average — this query against pg_stats is illuminating:

Per-column TOAST analysis
-- Which columns are actually TOASTed? Check pg_stats.
SELECT
    attname                          AS column,
    avg_width                        AS avg_bytes,
    CASE
        WHEN avg_width > 2000 THEN 'TOAST (external)'
        WHEN avg_width > 500  THEN 'possibly compressed inline'
        ELSE 'inline'
    END                              AS storage_likely
FROM pg_stats
WHERE tablename = 'products'
  AND schemaname = 'public'
ORDER BY avg_width DESC;

--  column           | avg_bytes | storage_likely
-- ------------------+-----------+-------------------
--  thumbnail        |     41230 | TOAST (external)
--  audit_log        |     12440 | TOAST (external)
--  long_description |      8192 | TOAST (external)
--  search_index     |      6120 | TOAST (external)
--  specifications   |      4096 | TOAST (external)
--  metadata         |      3072 | TOAST (external)
--  description      |      2048 | TOAST (external)
--  category         |        18 | inline
--  name             |        28 | inline
--  sku              |        12 | inline
--  price            |         8 | inline
--  id               |         8 | inline

Seven columns with average widths above 2 KB. Every one of those will be stored in the TOAST table. Every one of those will be fetched by a SELECT * query. And every one of those will be silently requested by your interface projection.

What Hibernate actually sends: the full dissection

Here is the interface projection that every Spring Data tutorial will teach you:

Interface projection — four getters
// Interface-based projection — looks innocent
public interface ProductSummary {
    Long getId();
    String getName();
    BigDecimal getPrice();
    String getSku();
}

// Repository method
public interface ProductRepository extends JpaRepository<Product, Long> {

    List<ProductSummary> findByCategory(String category);
}

// Usage
List<ProductSummary> summaries = productRepository.findByCategory("electronics");
// Returns only id, name, price, sku... right?

It looks clean. It looks minimal. Four methods, four columns. The repository method returns List<ProductSummary> and Spring Data handles the mapping. The code reads as a promise: I only want these four fields.

Here is what actually happens on the wire:

What PostgreSQL actually receives
-- What you THINK Hibernate sends:
SELECT p.id, p.name, p.price, p.sku
FROM products p
WHERE p.category = 'electronics';

-- What Hibernate ACTUALLY sends:
SELECT p.id,
       p.name,
       p.price,
       p.sku,
       p.category,
       p.description,
       p.long_description,
       p.specifications,
       p.metadata,
       p.search_index,
       p.audit_log,
       p.thumbnail
FROM products p
WHERE p.category = 'electronics';

-- Every column. Every TOAST-compressed text field.
-- Every JSONB document. Every thumbnail blob.
-- The interface methods filter in Java, not in SQL.

All twelve columns. Hibernate loads the full Product entity into the persistence context, constructs a JDK dynamic proxy implementing the ProductSummary interface, and delegates getter calls to the underlying entity. The interface is not a projection instruction to Hibernate — it is a view restriction on the Java side.

Why Hibernate does this

This is not laziness or an oversight. Hibernate's architecture requires it. The proxy pattern used by interface projections works by intercepting method calls on the interface and forwarding them to a managed entity. That managed entity must be fully hydrated — all columns loaded, all state tracked — because Hibernate cannot know at query time which entity fields might be needed later. The persistence context tracks the entity for dirty checking, caching, and lazy relationship loading. A partially hydrated entity would break these guarantees.

The Hibernate documentation is unambiguous about this: interface-based projections backed by entity queries result in full entity hydration. The proxy pattern requires a managed entity to delegate to. This is by design, not by accident.

The Spring Data team has discussed this behavior in various GitHub issues. The general position is that interface projections are a convenience feature — they simplify the API contract, not the database query. For query optimization, the documentation recommends class-based DTOs or native queries.

I mention this not to assign blame but to clarify intent. The framework is doing what it was designed to do. The problem is that what it was designed to do and what most developers expect it to do are materially different things.
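If you wish to verify what your own repository sends, Hibernate's SQL logging will print the full column list. These are the standard Spring Boot logger properties (the bind logger name is the Hibernate 6 one):

```properties
# application.properties — log the SQL Hibernate actually generates
logging.level.org.hibernate.SQL=DEBUG
# Hibernate 6+: also log the bound parameter values
logging.level.org.hibernate.orm.jdbc.bind=TRACE
```

Run the interface projection query once and count the columns in the logged SELECT. That settles the question in thirty seconds.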

The class-based DTO: what actual projection looks like

Now here is the class-based DTO projection:

Class-based DTO projection — actual column selection
// Class-based DTO projection — what you should use instead
public record ProductSummaryDto(
    Long id,
    String name,
    BigDecimal price,
    String sku
) {}

// Repository method with @Query
public interface ProductRepository extends JpaRepository<Product, Long> {

    @Query("""
        SELECT new com.example.dto.ProductSummaryDto(
            p.id, p.name, p.price, p.sku
        )
        FROM Product p
        WHERE p.category = :category
        """)
    List<ProductSummaryDto> findSummaryByCategory(@Param("category") String category);
}

// What Hibernate sends:
// SELECT p.id, p.name, p.price, p.sku
// FROM products p
// WHERE p.category = 'electronics'
//
// Four columns. No entity instantiation.
// No persistence context overhead.

Four columns in the JPQL constructor expression. Four columns in the SQL that Hibernate generates. No entity instantiation. No persistence context entry. No TOAST data fetched. The ProductSummaryDto record is constructed directly from the ResultSet.

The difference is not syntactic. It is architectural. One approach fetches 75 KB per row and creates a managed entity plus a proxy object. The other fetches 52 bytes per row and creates a single immutable record. The database executes a materially different query. The JVM allocates materially different objects. The garbage collector faces a materially different workload.

Two lines of additional code — the @Query annotation and the constructor expression. That is the cost of a 160x performance improvement.

TOAST: the reason SELECT * is uniquely expensive on wide tables

Understanding why the performance gap is so large requires understanding how PostgreSQL stores wide rows. If you already know TOAST internals, I beg your patience — I shall be brief, and the specifics are necessary for what follows.

TOAST storage mechanics
-- How TOAST works (and why SELECT * is expensive on wide tables)

-- PostgreSQL stores rows in 8 KB pages.
-- When a column value exceeds ~2 KB, PostgreSQL compresses it
-- and/or moves it to a separate TOAST table.

-- Main heap row for a product:
--   id:       8 bytes    (inline)
--   name:     ~30 bytes  (inline)
--   price:    8 bytes    (inline)
--   sku:      ~12 bytes  (inline)
--   category: ~20 bytes  (inline)
--   description:     -> TOAST pointer (18 bytes in heap, 2 KB in TOAST table)
--   long_description: -> TOAST pointer (18 bytes in heap, 8 KB in TOAST table)
--   specifications:  -> TOAST pointer (18 bytes in heap, 4 KB in TOAST table)
--   metadata:        -> TOAST pointer (18 bytes in heap, 3 KB in TOAST table)
--   search_index:    -> TOAST pointer (18 bytes in heap, 6 KB in TOAST table)
--   audit_log:       -> TOAST pointer (18 bytes in heap, 12 KB in TOAST table)
--   thumbnail:       -> TOAST pointer (18 bytes in heap, 40 KB in TOAST table)
--
-- Main heap tuple: ~186 bytes (fits in one page)
-- TOAST data: ~75 KB (spread across ~10 TOAST pages)
--
-- SELECT id, name, price, sku -> reads 1 heap page, 0 TOAST pages
-- SELECT * -> reads 1 heap page + ~10 TOAST pages per row
--
-- For 500 rows:
--   Narrow query: ~500 heap page reads, 0 TOAST reads
--   SELECT *: ~500 heap page reads + ~5,000 TOAST page reads
--
-- TOAST fetches are random I/O. Each one is a separate page lookup.
-- On spinning disks, this is catastrophic.
-- On SSDs, it is merely expensive.

TOAST — The Oversized-Attribute Storage Technique — is PostgreSQL's mechanism for handling column values larger than about 2 KB. When a column value exceeds this threshold, PostgreSQL compresses it (using pglz or lz4, depending on your storage parameter) and, if it still does not fit in the 8 KB page, slices it into chunks stored in a separate TOAST table associated with the main table.

The main heap tuple stores an 18-byte pointer to the TOAST data. When a query selects that column, PostgreSQL follows the pointer, reads the TOAST chunks (potentially from multiple pages), decompresses the data, and reassembles the original value.

For a single row, this adds a few hundred microseconds. For 500 rows, each with seven TOAST-stored columns, it adds thousands of random page reads. On a cold cache, each page read can take 100-200 microseconds on SSD (or 5-10 ms on spinning disk). The aggregate cost is substantial.
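The page counts quoted above can be sanity-checked with simple arithmetic. The sizes are this article's assumed averages, not universal constants: TOAST slices a large value into roughly 2 KB chunks, four of which fit in an 8 KB page.

```java
// Back-of-envelope TOAST page math for the ~75 KB example row.
public class ToastPageMath {
    public static void main(String[] args) {
        int toastBytesPerRow = 75 * 1024; // seven large columns, ~75 KB combined
        int pageSize = 8 * 1024;          // PostgreSQL's default page size

        // Ceiling division: pages needed to hold one row's TOAST data.
        int toastPagesPerRow = (toastBytesPerRow + pageSize - 1) / pageSize;
        int rows = 500;

        System.out.println("TOAST pages per row: ~" + toastPagesPerRow);
        System.out.println("TOAST page reads for 500 rows: ~" + toastPagesPerRow * rows);
    }
}
```

Ten pages per row, about 5,000 page reads for 500 rows — which is the figure in the comment block above.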

The critical insight: lazy TOAST fetching

Here is the point that makes this entire article necessary: if your SELECT list does not include the TOASTed column, PostgreSQL never follows the pointer. The 18-byte TOAST pointer sits inert in the heap tuple, and the TOAST table is never touched. A narrow SELECT on a wide table is fast precisely because TOAST data is fetched lazily — only when the query explicitly asks for the column.

This is not the same as Hibernate's lazy loading of relationships. Hibernate fetches all scalar columns eagerly (by default) and loads relationships lazily. PostgreSQL fetches all requested columns — but only the columns in the SELECT clause. The database already has the mechanism for column-level projection. It is Hibernate that bypasses it by requesting every column.

Interface projections defeat PostgreSQL's lazy TOAST fetching by requesting every column, including the ones the interface does not expose. The database dutifully follows every TOAST pointer, decompresses every large value, transmits it across the wire to the JDBC driver, allocates Java objects for every byte array and string, and hands them to Hibernate for entity hydration — only for the proxy to ignore eight of the twelve columns.

This is, if I may observe, the infrastructural equivalent of requesting the entire cellar inventory to select a single bottle of wine.

TOAST compression: not the rescue you might hope for

A reasonable objection: does TOAST compression not mitigate the cost? After all, a 12 KB audit log might compress to 3 KB in TOAST storage.

Compression reduces storage I/O, which is helpful. But it adds decompression CPU cost, and the decompressed value must still be allocated in memory, transmitted over the JDBC connection, and hydrated into a Java object. On a 75 KB row where 40 KB is a BYTEA thumbnail, compression saves perhaps 30% on I/O but zero on memory allocation. The decompressed value is the full 40 KB regardless.

For TEXT and JSONB columns, pglz typically achieves 2-4x compression. The I/O savings are real but insufficient. The dominant costs — memory allocation, wire transfer, GC pressure — operate on the decompressed size.
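If you are curious which algorithm your stored values were actually compressed with, PostgreSQL 14 and later expose it directly. A quick check (the column name comes from this article's schema, and lz4 availability depends on your build):

```sql
-- Which compression algorithm was applied to stored audit_log values?
SELECT pg_column_compression(audit_log) AS algorithm, count(*)
FROM products
GROUP BY 1;

-- Opt a column into lz4: cheaper decompression, affects newly written rows only
ALTER TABLE products ALTER COLUMN audit_log SET COMPRESSION lz4;
```

lz4 trades a little compression ratio for markedly lower decompression CPU. It softens the cost of fetching TOAST data; it does not remove the reason to stop fetching it.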

The EXPLAIN plans: seeing the damage in full

Numbers convey what narratives cannot. Here are the actual EXPLAIN ANALYZE outputs for both approaches on a products table with 50,000 rows, PostgreSQL 16, 256 MB shared_buffers, SSD storage.

EXPLAIN ANALYZE — interface projection (full entity)
-- EXPLAIN ANALYZE for the interface projection query
-- (full entity fetch, all columns)

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT p.id, p.name, p.price, p.sku, p.category,
       p.description, p.long_description, p.specifications,
       p.metadata, p.search_index, p.audit_log, p.thumbnail
FROM products p
WHERE p.category = 'electronics';

--                           QUERY PLAN
-- ---------------------------------------------------------------
-- Seq Scan on products  (cost=0.00..28471.00 rows=512 width=76832)
--                        (actual time=0.891..247.3 rows=500 loops=1)
--   Filter: (category = 'electronics')
--   Rows Removed by Filter: 49500
--   Buffers: shared hit=1204 read=23891
-- Planning Time: 0.09 ms
-- Execution Time: 312.7 ms
--
-- width=76832 — that is ~75 KB per row.
-- Buffers read=23891 — massive I/O to fetch TOAST data.
-- 500 rows x 75 KB = 37.5 MB transferred to the JVM.

Three numbers tell the story:

width=76832: PostgreSQL estimates 75 KB per row. This is the average decompressed row width — the amount of data that must be allocated in memory for each tuple.

Buffers: shared hit=1204 read=23891: nearly 24,000 page reads, mostly TOAST fetches. The read=23891 means those pages were not in shared_buffers and required disk I/O (or OS page cache reads). Each TOAST chunk is a separate page lookup. For seven TOAST columns across 500 rows, that is approximately 3,500 TOAST fetches — with some chunks spanning multiple pages.

Execution Time: 312.7 ms: over 300 milliseconds for 500 rows. On a cold cache with actual disk I/O, this could easily be 800 ms or more.

Now the DTO projection:

EXPLAIN ANALYZE — class-based DTO projection (4 columns)
-- EXPLAIN ANALYZE for the class-based DTO projection query
-- (only 4 small columns)

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT p.id, p.name, p.price, p.sku
FROM products p
WHERE p.category = 'electronics';

--                           QUERY PLAN
-- ---------------------------------------------------------------
-- Seq Scan on products  (cost=0.00..2847.00 rows=512 width=52)
--                        (actual time=0.041..18.3 rows=500 loops=1)
--   Filter: (category = 'electronics')
--   Rows Removed by Filter: 49500
--   Buffers: shared hit=1204 read=0
-- Planning Time: 0.08 ms
-- Execution Time: 19.1 ms
--
-- width=52 — just 52 bytes per row.
-- Buffers read=0 — NO TOAST fetches. All data in the main heap.
-- 500 rows x 52 bytes = 26 KB transferred to the JVM.
-- 37.5 MB vs 26 KB. That is a 1,442x difference in data transfer.

The width dropped from 76,832 to 52 bytes. That is a 1,478x reduction in per-row data volume. Buffer reads dropped from 23,891 to zero — the small columns fit entirely in the shared buffer cache, and no TOAST pages were touched. Execution time: 19.1 ms. That is 16x faster, and we have not added an index yet.

The Buffers: read=0 line is the tell. Zero TOAST pages fetched. Zero disk I/O beyond what is already cached. The narrow query reads only the main heap pages, which for small inline columns are compact and cache-friendly. The main heap for 50,000 products with small inline columns fits in roughly 1,200 pages (about 9.6 MB) — comfortably within a 256 MB shared_buffers pool.

If the plan nodes and buffer counts above are unfamiliar, I have written an EXPLAIN ANALYZE guide that walks through every field in detail.

The benchmark: interface vs DTO vs covering index

A controlled comparison. Same table: 50,000 products, 12 columns, average row width 75 KB. Same query: filter by category, return 500 rows. Four approaches measured: interface projection (full entity fetch), class-based DTO (four columns), DTO with a covering index, and native query with interface projection.

Environment: PostgreSQL 16.2, Spring Boot 3.3.1, Hibernate 6.5, OpenJDK 21, 8 vCPUs, 16 GB RAM, shared_buffers 256 MB, NVMe SSD. Each test: 1,000 iterations after 200 warmup iterations, single thread, p50 latency reported for PG time, end-to-end includes JDBC overhead and JVM object allocation.

| Approach | SQL cols | Per row | Total data | PG time | JVM overhead | End-to-end |
|---|---|---|---|---|---|---|
| Interface projection (full entity) | 12 | ~75 KB | 37.5 MB | 312.7 ms | Entity + proxy | ~480 ms |
| Class-based DTO projection | 4 | 52 B | 26 KB | 19.1 ms | Record only | ~22 ms |
| DTO + covering index | 4 | 52 B | 26 KB | 1.2 ms | Record only | ~3 ms |
| Native query + interface | 4 | 52 B | 26 KB | 19.1 ms | Proxy (no entity) | ~24 ms |
| Improvement (interface vs best) | — | 1,442x less | 1,442x less | 260x faster | — | ~160x faster |

480 milliseconds down to 3 milliseconds. A 160x improvement end-to-end. The PostgreSQL execution time alone improved 260x. The JVM overhead — entity hydration, persistence context management, proxy construction — added roughly 170 ms on top of the interface projection's database time, while the DTO record construction added less than 2 ms.

Two separate wins compound here. Switching from interface to DTO projection eliminates TOAST fetches and entity overhead: 480 ms to 22 ms. Adding a covering index eliminates heap access entirely: 22 ms to 3 ms. Both changes are simple. Neither requires a framework migration.
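The covering index itself deserves to be shown. The article's benchmark does not include its definition, so here is a plausible one; the index name and exact INCLUDE list are my assumption, using PostgreSQL 11+ syntax:

```sql
-- Hypothetical covering index: the filter column as the key,
-- the projected columns carried as payload in the leaf pages.
-- An Index Only Scan can then answer the DTO query without visiting
-- the heap at all (assuming a reasonably fresh visibility map;
-- VACUUM after bulk changes).
CREATE INDEX idx_products_category_inc
    ON products (category)
    INCLUDE (id, name, price, sku);
```

Note that this only pays off for the narrow query. The interface projection still needs the other eight columns, so it must visit the heap and the TOAST table regardless of any index.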

The native query row is worth noting. It achieves the same PostgreSQL performance as the DTO projection — because the SQL is identical — but adds slightly more JVM overhead due to the proxy construction. The native query approach is a pragmatic middle ground: keep the interface contract, control the SQL.

Under concurrency: the gap widens

Single-threaded benchmarks are useful but incomplete. Production applications serve many requests concurrently, and the cost of wide fetches compounds under contention. Here are the p99 latencies at increasing concurrency levels, same test configuration:

| Concurrency | Interface p99 | DTO p99 | DTO+Index p99 | Throughput (interface / DTO / DTO+index) |
|---|---|---|---|---|
| 1 thread | 487 ms | 23 ms | 3.1 ms | 2.1 / 43 / 322 req/s |
| 10 threads | 1,240 ms | 48 ms | 6.2 ms | 8.1 / 208 / 1,613 req/s |
| 50 threads | 3,891 ms | 112 ms | 14 ms | 12.8 / 446 / 3,571 req/s |
| 100 threads | 8,420 ms | 234 ms | 28 ms | 11.9 / 427 / 3,571 req/s |

At 100 concurrent threads, the interface projection's p99 reaches 8.4 seconds. The DTO with covering index holds at 28 ms. The throughput gap is 300x.

The cause is straightforward. Interface projections hold database connections longer (312 ms vs 1.2 ms per query), which exhausts the HikariCP connection pool faster. At 100 threads with a default pool size of 10, interface projection queries queue behind each other. DTO queries with covering indexes complete fast enough that 10 connections can serve all 100 threads without meaningful queueing.
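Little's law makes the queueing claim concrete. This back-of-envelope sketch (my arithmetic, not a benchmark output) caps throughput at pool size divided by per-query connection hold time:

```java
// Throughput ceiling for a fixed-size connection pool (Little's law sketch).
public class PoolCeiling {
    public static void main(String[] args) {
        int poolSize = 10;                 // HikariCP's default maximum pool size
        double interfaceHoldSec = 0.3127;  // 312.7 ms per interface-projection query
        double dtoIndexHoldSec = 0.0012;   // 1.2 ms per DTO + covering-index query

        System.out.printf("interface projection ceiling: ~%.0f req/s%n",
                poolSize / interfaceHoldSec);
        System.out.printf("DTO + covering index ceiling: ~%.0f req/s%n",
                poolSize / dtoIndexHoldSec);
    }
}
```

Roughly 32 req/s versus roughly 8,300 req/s before a single request has to wait for a connection. At 100 threads the interface path is far past its ceiling, which is why its p99 collapses; the DTO path never approaches its own.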

This is where the HikariCP pool sizing conversation begins. But I would suggest that reducing query execution time by 260x is more productive than adding database connections.

Beyond the database: JVM memory and garbage collection

The costs of interface projections do not end when the data arrives at the JVM. The entity hydration process — constructing 500 Product objects, each holding references to seven large strings and a byte array — has measurable impact on heap allocation and garbage collection.

GC impact comparison
// GC impact: interface projection vs DTO projection
// Measured with -Xmx512m, G1GC, 500 rows per request

// Interface projection (full entity hydration):
//   Heap after query: +38.2 MB
//   Young GC pauses:  12 per request (avg 4.2 ms each)
//   Old GC pauses:    1 every ~8 requests (avg 48 ms)
//   Entity objects:   500 Product + 500 ProductSummary proxy
//   Total objects:    ~7,500 (entity fields, TOAST byte arrays, proxy handlers)

// DTO projection (record):
//   Heap after query: +0.04 MB
//   Young GC pauses:  0 additional per request
//   Old GC pauses:    none attributable
//   Record objects:   500 ProductSummaryDto
//   Total objects:    ~2,000 (records + String fields)

// The interface projection allocates 955x more memory per request.
// Under sustained load, this is the difference between
// a GC pause every second and a GC pause every minute.

The interface projection allocates 38.2 MB of heap per request execution. The DTO allocates 0.04 MB. Under sustained load — say, 200 requests per minute — the interface projection produces 7.6 GB of garbage per minute. The DTO produces 8 MB.

On a JVM with a 512 MB heap, the interface projection triggers a dozen young collections per request and an old collection roughly every eight requests. The DTO projection can serve thousands of requests between GC pauses. The difference in tail latency is not subtle.

Measuring entity hydration with Hibernate Statistics
// Enable Hibernate Statistics to measure entity hydration cost
// application.properties
spring.jpa.properties.hibernate.generate_statistics=true

// After running your projection query, check the log:
//
// For interface projection (full entity):
// Session Metrics {
//     1204021 nanoseconds spent acquiring 1 JDBC connections;
//     0 nanoseconds spent releasing 0 JDBC connections;
//     4821039 nanoseconds spent preparing 1 JDBC statements;
//     312700000 nanoseconds spent executing 1 JDBC statements;
//     0 nanoseconds spent executing 0 JDBC batches;
//     167200000 nanoseconds spent performing 500 L2C puts;
//     0 nanoseconds spent performing 0 L2C hits;
//     500 entities loaded;    <-- 500 full entities hydrated
//     0 collections loaded;
// }
//
// For DTO projection (record):
// Session Metrics {
//     312000 nanoseconds spent acquiring 1 JDBC connections;
//     0 nanoseconds spent releasing 0 JDBC connections;
//     891000 nanoseconds spent preparing 1 JDBC statements;
//     19100000 nanoseconds spent executing 1 JDBC statements;
//     0 entities loaded;      <-- zero entities. Just records.
//     0 collections loaded;
// }
//
// 167 ms of entity hydration overhead — gone.

Hibernate Statistics provides a precise accounting. The 500 entities loaded line confirms full entity hydration. The 167200000 nanoseconds spent performing 500 L2C puts line shows 167 ms spent populating the second-level cache with entities that will never be accessed through the entity API — only through the proxy interface.

The DTO projection shows 0 entities loaded. Zero. The ResultSet is mapped directly to records without touching the persistence context. No L2C puts, no dirty checking registration, no entity lifecycle callbacks. The Hibernate session is barely aware the query happened.

Every way to fix it: a complete inventory

Spring Data JPA offers multiple paths to actual column-limited queries. Not all of them are obvious, and some that appear to be fixes are not. Here they are, ranked by recommendation, with the code for each approach.

All Spring Data JPA projection options
// Spring Data derived query — also fetches full entity
// then wraps it in the interface proxy
List<ProductSummary> findByCategory(String category);

// Spring Data @Query with interface — STILL fetches full entity
@Query("SELECT p FROM Product p WHERE p.category = :category")
List<ProductSummary> findByCategory(@Param("category") String category);

// The ONLY way to get a real column-limited SELECT
// with Spring Data JPA is one of these:

// 1. Class-based DTO with constructor expression
@Query("""
    SELECT new com.example.dto.ProductSummaryDto(
        p.id, p.name, p.price, p.sku
    )
    FROM Product p WHERE p.category = :category
    """)
List<ProductSummaryDto> findSummaryByCategory(@Param("category") String category);

// 2. Native query with interface projection
@Query(
    value = "SELECT id, name, price, sku FROM products WHERE category = :category",
    nativeQuery = true
)
List<ProductSummary> findSummaryNative(@Param("category") String category);

// 3. Tuple projection (Hibernate 6+)
@Query("SELECT p.id, p.name, p.price, p.sku FROM Product p WHERE p.category = :category")
List<Tuple> findSummaryTuples(@Param("category") String category);

1. Class-based DTO with constructor expression (recommended)

Use a Java record or class with a constructor matching your column list. Write a @Query with a JPQL SELECT new expression. Hibernate generates a SQL SELECT with exactly those columns. No entity. No proxy. No persistence context entry. This is the approach you should reach for first.

The record syntax in Java 16+ makes this particularly clean. A single line declares the constructor, accessors, equals, hashCode, and toString. The fully-qualified class name in the JPQL is the only friction — and it is a small price for a 160x performance improvement.

One limitation: constructor expressions do not support nested projections. If you need product.category.name where category is a relationship, you will need a JOIN in the JPQL and a flattened DTO. This is a JPQL limitation, not a fundamental one — native queries can express any projection shape.
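A sketch of that flattened shape, assuming a hypothetical Category relationship with a name field — which the Product entity in this article does not actually have:

```java
// Hypothetical: flattening p.category.name into the DTO with an explicit JOIN.
public record ProductWithCategoryDto(Long id, String name, String categoryName) {}

// In the repository interface:
@Query("""
    SELECT new com.example.dto.ProductWithCategoryDto(p.id, p.name, c.name)
    FROM Product p
    JOIN p.category c
    WHERE c.name = :categoryName
    """)
List<ProductWithCategoryDto> findFlattenedByCategory(@Param("categoryName") String categoryName);
```

The JOIN runs in SQL and the DTO stays flat. What you cannot do is nest one DTO inside another within a single constructor expression.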

2. Native query with interface projection

If you prefer the interface style — perhaps for API compatibility or because callers already depend on the interface type — use a nativeQuery = true with an explicit SELECT clause. Spring Data maps the ResultSet columns to interface methods by name.

Native query with interface projection
// Native query with interface projection — actually works
// because you control the SELECT clause

@Query(
    value = "SELECT id, name, price, sku FROM products WHERE category = :category",
    nativeQuery = true
)
List<ProductSummary> findSummaryNative(@Param("category") String category);

// This sends exactly those 4 columns to PostgreSQL.
// The interface proxy wraps the raw ResultSet, not an entity.
// No entity instantiation. No persistence context.
//
// Trade-off: you lose JPQL portability and Hibernate's
// type-safe query validation. For projection queries
// on wide tables, that is a trade worth making.

The interface works as expected here — it wraps a raw row, not an entity. The proxy delegates to the ResultSet values, not to a managed entity. No persistence context entry. No entity hydration.

The trade-off is losing JPQL validation and Hibernate's dialect abstraction. For projection queries on a specific database — which, if you are reading this article, is PostgreSQL — that is rarely a meaningful cost. You gain explicit control over the SQL, which is precisely what projection queries need.

3. Criteria API with CriteriaBuilder.construct()

For dynamic projections — where the selected columns depend on runtime conditions — the Criteria API offers programmatic column selection:

Criteria API projection
// Criteria API — programmatic column selection
// Verbose, but generates a genuine narrow SELECT

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<ProductSummaryDto> query =
    cb.createQuery(ProductSummaryDto.class);
Root<Product> product = query.from(Product.class);

query.select(
    cb.construct(
        ProductSummaryDto.class,
        product.get("id"),
        product.get("name"),
        product.get("price"),
        product.get("sku")
    )
);
query.where(cb.equal(product.get("category"), category));

List<ProductSummaryDto> results =
    entityManager.createQuery(query).getResultList();

// Generates: SELECT p.id, p.name, p.price, p.sku
//            FROM products p WHERE p.category = ?
//
// Correct column selection. No entity hydration.
// The cost is twenty lines instead of three.

The generated SQL is correct: only the specified columns appear in the SELECT clause. The cost is verbosity — twenty lines where a DTO projection takes three. For static projections, a @Query annotation is simpler. For dynamic ones, the Criteria API earns its keep.

4. Tuple projection (Hibernate 6+)

Select individual columns in JPQL and receive them as Tuple objects. Type-safe column access without a DTO class. Useful for ad-hoc queries where creating a record feels heavy. Less readable than a named DTO for well-known projections, and the tuple.get(0, Long.class) syntax is fragile if the query changes.

5. Spring Data derived query with interface (avoid for wide tables)

The findByCategory method-name pattern with an interface return type. Convenient but always fetches the full entity. Fine for narrow tables. Dangerous for wide ones. If you see this pattern in a codebase with TOAST-heavy tables, flag it in code review.

What about @EntityGraph?

I include this because I have seen it suggested as a fix for the interface projection problem. It is not.

@EntityGraph — not a projection fix
// @EntityGraph — controls relationship fetching,
// NOT column selection. A common misconception.

@EntityGraph(attributePaths = {"category"})
List<Product> findByCategory(String category);

// This still fetches ALL scalar columns on the Product entity.
// @EntityGraph only affects which @ManyToOne, @OneToMany, etc.
// relationships are JOIN FETCHed vs lazy-loaded.
//
// It does NOT limit the SELECT clause to specific columns.
// It does NOT prevent TOAST column fetches.
//
// @EntityGraph solves the N+1 problem for relationships.
// It does not solve the SELECT * problem for scalar columns.
// Different problems. Different tools.

@EntityGraph controls which relationships are eagerly fetched — which @ManyToOne, @OneToMany associations get JOIN FETCHed versus lazy-loaded. It has no effect on scalar column selection. An @EntityGraph on a Product query still fetches all fourteen scalar columns. It solves the N+1 relationship problem. It does not solve the SELECT * scalar column problem. Different symptoms, different remedies.

"I have observed, in production systems, pages generating over 400 database round trips for what appeared to be a simple list view. The ORM did not fail. It did exactly what was asked."

— from You Don't Need Redis, Chapter 3: The ORM Tax

How to verify what Hibernate is sending

Trust but verify. Before and after making projection changes, confirm the actual SQL reaching PostgreSQL. There are three places to inspect this, one in the application and two in the database, each with different trade-offs.

Application-level: Hibernate SQL logging

Enabling Hibernate SQL logging
# Enable Hibernate SQL logging to see what is actually sent
# application.properties

# Show the SQL (formatted)
spring.jpa.properties.hibernate.format_sql=true
spring.jpa.show-sql=true

# Or use logging (preferred — less noisy in production)
logging.level.org.hibernate.SQL=DEBUG
logging.level.org.hibernate.orm.jdbc.bind=TRACE

# What you will see for an interface projection:
# Hibernate:
#     select
#         p1_0.id,
#         p1_0.name,
#         p1_0.price,
#         p1_0.sku,
#         p1_0.category,
#         p1_0.description,
#         p1_0.long_description,
#         p1_0.specifications,
#         p1_0.metadata,
#         p1_0.search_index,
#         p1_0.audit_log,
#         p1_0.thumbnail    <-- there it is
#     from
#         products p1_0
#     where
#         p1_0.category=?
#
# Every column. The interface is a Java-side filter, not a SQL one.

Enable org.hibernate.SQL at DEBUG level. Every SQL statement Hibernate generates will appear in your logs. For interface projections, you will see all columns listed. For DTO projections, you will see only the columns in your constructor expression. This is the fastest way to confirm a fix during development.

The org.hibernate.orm.jdbc.bind=TRACE level additionally shows bound parameter values — useful for verifying that query parameters are being passed correctly, less useful for projection analysis.

Database-level: pg_stat_statements

For production verification, pg_stat_statements is more reliable than application-level logging. It shows exactly what PostgreSQL received, including any query rewriting by connection pools or proxies.

Finding wide queries in pg_stat_statements
-- Find your widest SELECT * queries using pg_stat_statements
SELECT
    substring(query, 1, 80)          AS query_preview,
    calls,
    round(mean_exec_time::numeric, 2) AS avg_ms,
    round(total_exec_time::numeric, 0) AS total_ms,
    rows                              AS total_rows
FROM pg_stat_statements
WHERE query ILIKE '%SELECT%products%'
  AND query NOT ILIKE '%pg_stat%'
ORDER BY mean_exec_time DESC
LIMIT 10;

--  query_preview                                          | calls | avg_ms | total_ms | total_rows
-- --------------------------------------------------------+-------+--------+----------+-----------
--  SELECT p1_0.id, p1_0.name, ... p1_0.thumbnail FROM pr  |  4821 | 314.20 |  1514635 |    2410500
--  SELECT p.id, p.name, p.price, p.sku FROM products p W  |   312 |   1.18 |      368 |     156000
--
-- The first query: 314 ms average, called 4,821 times.
-- 1.5 million ms of total database time — 25 minutes.
-- All because an interface projection fetches thumbnails
-- that no one looks at.

The mean_exec_time column tells you the average execution time per call. A query selecting all fourteen columns with a 314 ms average, called 4,821 times, has consumed 25 minutes of database time. That is one query, on one table, eating a non-trivial fraction of your database's capacity.

Database-level: auto_explain

auto_explain for production EXPLAIN plans
-- Auto-explain: catch wide queries in production logs
-- Add to postgresql.conf or ALTER SYSTEM:

ALTER SYSTEM SET shared_preload_libraries = 'auto_explain';
-- NOTE: shared_preload_libraries only takes effect after a full
-- server restart; pg_reload_conf() is not enough for this one.
ALTER SYSTEM SET auto_explain.log_min_duration = '100ms';
ALTER SYSTEM SET auto_explain.log_analyze = on;
ALTER SYSTEM SET auto_explain.log_buffers = on;

-- Once the library is loaded (the auto_explain.* settings themselves
-- can be changed with a plain pg_reload_conf()), any query exceeding
-- 100ms gets its full EXPLAIN ANALYZE plan written to the log.
-- Look for width= values above 1000 — those are your
-- SELECT * queries on wide tables.

-- Sample log output:
-- LOG: duration: 314.20 ms  plan:
--   Seq Scan on products (width=76832)
--     Buffers: shared hit=1204 read=23891
--
-- width=76832 on a query you thought was projecting
-- four columns. There is your interface projection.

PostgreSQL's auto_explain extension logs the EXPLAIN plan for every query exceeding a time threshold. When you see width=76832 on a query you expected to be narrow, you have found your interface projection. The Buffers: read= count tells you how much TOAST I/O it caused.

I recommend setting log_min_duration to 100 ms initially, then lowering it as you fix the worst offenders. The goal is to identify the queries where projection width is the dominant cost factor.

Why tutorials teach the slow approach first

This is not a conspiracy. It is not incompetence. It is a reasonable pedagogical choice that happens to create performance problems at scale, and I think it is worth understanding why.

Interface projections are easier to teach. No @Query annotation. No constructor expression syntax. No fully-qualified class name in JPQL. You define an interface, change the return type, and Spring Data handles the rest. For a tutorial author trying to demonstrate Spring Data's power in 500 words, interface projections are the obvious choice. They show the magic — you declare what you want, and the framework provides it.

The Spring Data documentation lists interface projections first and class-based projections second. The interface section has more examples. The performance implications are mentioned in a brief aside that says, roughly, "class-based DTOs avoid the proxy overhead." It does not say "interface projections fetch the full entity and will make your wide-table queries 160x slower." Understatement is a British virtue; in documentation, it is a hazard.

The Baeldung projections guide — likely the most-read resource on this topic — demonstrates interface projections first with a small entity. The performance section is brief. It would take a reader with prior TOAST knowledge to connect the dots.

The result is a pipeline. Developers learn interface projections from tutorials. They use them in production. The tables start narrow and grow wide over time — new TEXT columns for descriptions, JSONB for metadata, BYTEA for attachments. By the time the performance impact is noticeable, the pattern is entrenched across dozens of repositories. Nobody connects "we added a JSONB audit_log column" to "our API response times doubled."

This article exists to make that connection explicit.

Covering indexes: the final multiplier

The DTO projection eliminated TOAST fetches. The query still reads the heap — the main table pages — to retrieve the four small columns. A covering index eliminates that too.

EXPLAIN ANALYZE — DTO with covering index
-- With a covering index, PostgreSQL skips the heap entirely

CREATE INDEX idx_products_category_covering
ON products (category)
INCLUDE (id, name, price, sku);

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT p.id, p.name, p.price, p.sku
FROM products p
WHERE p.category = 'electronics';

--                           QUERY PLAN
-- ---------------------------------------------------------------
-- Index Only Scan using idx_products_category_covering on products
--                        (cost=0.42..18.71 rows=512 width=52)
--                        (actual time=0.029..0.94 rows=500 loops=1)
--   Index Cond: (category = 'electronics')
--   Heap Fetches: 0
--   Buffers: shared hit=7
-- Planning Time: 0.07 ms
-- Execution Time: 1.2 ms
--
-- Index Only Scan — no heap access at all.
-- Heap Fetches: 0 — the index contains everything needed.
-- Buffers: shared hit=7 — seven pages, all from cache.
-- 312.7 ms -> 1.2 ms. That is a 260x improvement.

The INCLUDE clause stores copies of id, name, price, and sku inside the index structure itself. When PostgreSQL can satisfy the entire query from the index — both the WHERE filter and the SELECT list — it performs an Index Only Scan. No heap access. No TOAST access. Seven buffer hits instead of 1,200.

312.7 ms to 1.2 ms. That is the full journey from interface projection on a wide table to DTO projection with a covering index.

The anatomy of INCLUDE columns

Covering index mechanics
-- The anatomy of a covering index

-- Standard B-tree index on category:
CREATE INDEX idx_products_category ON products (category);

-- PostgreSQL uses this index to find rows matching
-- category = 'electronics', then fetches the heap tuple
-- to read id, name, price, sku. Heap access required.

-- Covering index with INCLUDE:
CREATE INDEX idx_products_category_covering
ON products (category)
INCLUDE (id, name, price, sku);

-- The INCLUDE columns are stored IN the index leaf pages.
-- PostgreSQL satisfies the entire query from the index.
-- No heap access. No TOAST access. Index Only Scan.

-- Why not put all columns in the index key?
-- CREATE INDEX idx ON products (category, id, name, price, sku);
--
-- This ALSO enables Index Only Scan, but the key columns
-- affect the B-tree sort order. INCLUDE columns do not.
-- INCLUDE columns are cheaper: stored only in leaf pages,
-- not in internal pages. The index is smaller and faster.

-- When does a covering index NOT help?
-- 1. If the visibility map is stale (recent UPDATEs without VACUUM),
--    PostgreSQL must check the heap anyway -> Heap Fetches > 0
-- 2. If the query selects columns not in the index
-- 3. If the table is small enough that a sequential scan is cheaper

The distinction between index key columns and INCLUDE columns matters. Key columns determine the B-tree sort order and appear in both internal and leaf pages. INCLUDE columns are stored only in leaf pages — they are payload, not structure. This makes the index smaller and more cache-friendly than if you put all columns in the key.

Use INCLUDE for columns that appear in the SELECT list but not in WHERE or ORDER BY. Use key columns for columns that appear in WHERE, ORDER BY, or JOIN conditions. For our query, category is the key (used in WHERE), and id, name, price, sku are INCLUDE columns (used in SELECT).

The covering index is only useful with narrow projections

The covering index is only useful if the query selects a subset of columns. This is precisely what DTO projections enable. An interface projection that fetches all columns cannot benefit from a covering index: the index would need to include every column, which defeats the purpose, and in any case B-tree index entries are capped at roughly a third of a page (about 2.7 KB), so the TOAST-sized columns could not be indexed at all.

This is the compounding effect: DTO projections unlock covering indexes, and covering indexes multiply the DTO projection's advantage. The two optimizations are synergistic. Interface projections disable both.

The composite index and ORM vs raw SQL topics cover this dynamic in detail. ORMs that generate narrow SELECT clauses — or that can be guided to do so — unlock the same index optimization opportunities as hand-written SQL. Interface projections deliberately close that door.

Visibility maps and VACUUM

One caveat that catches people: Index Only Scans require a clean visibility map. If the table has recent updates that VACUUM has not yet processed, PostgreSQL must check the heap to verify row visibility, producing Heap Fetches > 0 in the EXPLAIN output.

Ensuring clean visibility maps for Index Only Scans
-- Covering indexes need a clean visibility map
-- Run VACUUM to ensure Index Only Scans actually skip the heap

VACUUM products;

-- Check visibility map coverage:
SELECT
    n_live_tup,
    n_dead_tup,
    n_mod_since_analyze,
    last_vacuum,
    last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'products';

--  n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum
-- ------------+------------+---------------------+---------------------
--       50000 |         47 |                  47 | 2026-03-05 14:22:01
--
-- n_dead_tup is low. The visibility map should be mostly clean.
-- If n_dead_tup is high, VACUUM before expecting Heap Fetches: 0.

If your covering index shows Heap Fetches: 312 instead of Heap Fetches: 0, the visibility map is stale. Run VACUUM or verify that autovacuum is keeping up. The autovacuum tuning parameters control this. A well-tuned autovacuum keeps the visibility map current, which keeps your Index Only Scans genuinely index-only.

When interface projections are the right choice

I should be forthcoming about the limits of this article's thesis, because pretending they do not exist would be a disservice to you and an embarrassment to me.

Interface projections are not always wrong. They are wrong on wide tables with TOAST-stored columns, under load, where the projection is on a hot path. That describes a significant number of production scenarios — but not all of them.

When interface projections are acceptable
// When interface projections ARE the right choice:

// 1. Narrow tables with no TOAST columns
// All columns are small. The full row is under 100 bytes.
// The overhead of interface proxy vs DTO record: negligible.
@Entity
@Table(name = "currencies")
public class Currency {
    @Id private String code;          // "USD"
    private String name;              // "US Dollar"
    private String symbol;            // "$"
    private Integer decimalPlaces;    // 2
    // Total row: ~50 bytes. No TOAST. SELECT * is harmless.
}

// 2. Prototyping and early development
// You are iterating on the domain model daily.
// A DTO record for every view is overhead you do not need yet.
// Use interface projections now, profile later, migrate when
// a table grows wide enough to matter.

// 3. Closed projections on narrow subsets of narrow tables
// The Spring Data docs call these "closed projections" —
// every getter maps to a property. On a narrow table,
// the full entity fetch is fast and the code is simpler.

Narrow tables: the cost is negligible

On a table like currencies — four small columns, total row width under 100 bytes, no TEXT or JSONB — the interface projection fetches four columns. A DTO projection also fetches four columns (or fewer). The difference in PostgreSQL execution time is under 1 ms. The JVM overhead difference is under 100 microseconds. The convenience of the interface — no @Query, no constructor expression — is worth more than the negligible performance cost.

The threshold is not precise, but a useful heuristic: if the table has no columns with avg_width > 500 in pg_stats, interface projections are fine. If any column exceeds 2 KB average width, measure before deciding.
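That heuristic can be checked directly against pg_stats after an ANALYZE; the table name here is illustrative:

```sql
-- Average stored width per column, widest first.
-- Anything above ~500 bytes deserves a second look;
-- anything above ~2 KB deserves a benchmark.
SELECT attname, avg_width, null_frac
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'products'
ORDER BY avg_width DESC;
```

If the top rows are TEXT, JSONB, or BYTEA columns with four-digit widths, the table has crossed the threshold where interface projections stop being harmless.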

Prototyping: speed of iteration over speed of execution

In early development, the domain model changes daily. Creating a DTO record for every view adds friction to schema evolution — every entity change requires updating the DTOs that reference it. Interface projections adapt automatically because they delegate to the entity. During prototyping, this flexibility matters more than TOAST optimization.

The migration path is clear: start with interface projections, profile when the tables grow wide, migrate the hot paths to DTOs. This is a pragmatic approach, and I would not fault any team for choosing it.

Read-heavy systems on narrow result sets

If a query returns a small number of rows — under 20 — the absolute cost difference is small even on wide tables. Twenty rows at 75 KB each is 1.5 MB of TOAST data: noticeable but not catastrophic. The ROI of migrating to DTOs is lower for single-row lookups than for list queries returning hundreds of rows.

Focus your migration effort on the queries that return the most rows from the widest tables. Those are where the 160x improvement lives.

How other frameworks handle projections

The interface projection problem is specific to Spring Data JPA and Hibernate. It is instructive to see how other frameworks in the Java ecosystem — and beyond — approach the same challenge, because the comparison illuminates what is unique about Hibernate's design decision.

Spring Data JDBC: the query is the query

Spring Data JDBC — no hidden SELECT *
// Spring Data JDBC — no JPA, no entity hydration, no surprises
// Each query maps directly to a SQL SELECT

// Define the projection as a simple record
public record ProductSummary(Long id, String name, BigDecimal price, String sku) {}

// Repository using Spring Data JDBC (not JPA)
public interface ProductRepository extends CrudRepository<Product, Long> {

    @Query("SELECT id, name, price, sku FROM products WHERE category = :category")
    List<ProductSummary> findSummaryByCategory(@Param("category") String category);
}

// What reaches PostgreSQL:
// SELECT id, name, price, sku FROM products WHERE category = ?
//
// No ORM. No persistence context. No TOAST surprises.
// The query is the query. What you write is what you send.

Spring Data JDBC, the sibling project to Spring Data JPA, takes a fundamentally different approach. There is no ORM. No persistence context. No entity hydration. The SQL you write (or that Spring Data derives) is the SQL that reaches PostgreSQL. If you SELECT four columns, PostgreSQL receives four columns.

The trade-off is explicit: you lose lazy loading, dirty checking, cascading, and the second-level cache. You gain transparency. For read-heavy services where projection queries dominate, Spring Data JDBC often delivers better performance with less debugging.

jOOQ: type-safe SQL with genuine projection

jOOQ — type-safe projection
// jOOQ — type-safe SQL with genuine projection
// The query builder IS the SQL. No impedance mismatch.

Result<Record4<Long, String, BigDecimal, String>> result =
    dsl.select(
        PRODUCTS.ID,
        PRODUCTS.NAME,
        PRODUCTS.PRICE,
        PRODUCTS.SKU
    )
    .from(PRODUCTS)
    .where(PRODUCTS.CATEGORY.eq("electronics"))
    .fetch();

// Generates exactly:
// SELECT id, name, price, sku FROM products WHERE category = ?
//
// The type system enforces the projection at compile time.
// You cannot accidentally SELECT * — there is no * to select
// unless you explicitly write DSL.asterisk().

jOOQ's approach eliminates the impedance mismatch entirely. The query builder generates SQL directly, with no intermediate entity layer. The type system enforces the projection at compile time — you cannot accidentally SELECT * unless you explicitly use DSL.asterisk(). The result type reflects the exact columns selected.

jOOQ's projection documentation is worth reading even if you do not use jOOQ. It demonstrates what SQL projection looks like when the framework treats it as a first-class concern rather than an afterthought.

Other ecosystems

Django's ORM has .values() and .only(), which generate genuine column-limited SELECTs. Rails' ActiveRecord has .select() and .pluck(). SQLAlchemy has explicit column loading with load_only() and defer(). In each case, the framework translates the projection instruction into a narrower SQL query. The interface projection pattern — where the framework fetches everything and the application-side wrapper hides the excess — is, as far as I am aware, unique to JPA.

This is not to say those frameworks are without projection pitfalls. Django's defer() can cause N+1 queries when deferred fields are accessed. Rails' .select() creates partial model instances that raise errors on missing attributes. Every ORM has its own category of projection surprise. But the specific pattern of "declare four fields, fetch fourteen" is a JPA specialty.

Migrating from interface to DTO projections: a step-by-step approach

For a codebase with dozens of interface projections, a wholesale replacement is neither necessary nor advisable. The impact varies enormously by table width and query frequency. A methodical approach starts with measurement and targets the highest-impact changes first.

Migration checklist: interface to DTO
// Migration checklist: interface to DTO projection
// For each repository method returning an interface projection:

// Step 1: Enable Hibernate SQL logging
logging.level.org.hibernate.SQL=DEBUG

// Step 2: Run the query. Inspect the SELECT clause.
// If it lists columns you do not need, continue.

// Step 3: Create a record matching the interface getters
public record ProductSummaryDto(Long id, String name, BigDecimal price, String sku) {}

// Step 4: Add a @Query with constructor expression
@Query("""
    SELECT new com.example.dto.ProductSummaryDto(
        p.id, p.name, p.price, p.sku
    )
    FROM Product p
    WHERE p.category = :category
    """)
List<ProductSummaryDto> findSummaryByCategory(@Param("category") String category);

// Step 5: Update callers — record getters match interface getters
//   summary.getId()    -> dto.id()      (record accessor)
//   summary.getName()  -> dto.name()
//   summary.getPrice() -> dto.price()
//   summary.getSku()   -> dto.sku()

// Step 6: Run EXPLAIN ANALYZE on the new query.
// Confirm width dropped. Confirm TOAST reads dropped.

// Step 7: Remove the interface if nothing else uses it.

// Step 8: Consider a covering index if the query is hot.

Prioritization: which projections to migrate first

Rank your interface projections by the product of three factors: table width (average row size from pg_stats), result set size (rows returned per invocation), and call frequency (from pg_stat_statements or application metrics).

A projection returning 500 rows from a 75 KB-wide table called 200 times per minute is producing 7.5 GB of unnecessary data transfer per minute. That is your first migration candidate. A projection returning 3 rows from a 200-byte-wide table called twice per day can wait indefinitely.
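pg_stat_statements supplies two of the three factors in one query; the width factor still comes from pg_stats by hand. A rough first-pass ranking, assuming the extension is installed:

```sql
-- First-pass migration priority: queries returning the most rows,
-- weighted by how often they run. Cross-reference the top entries
-- with column widths from pg_stats before committing to a migration.
SELECT substring(query, 1, 60)            AS query_preview,
       calls,
       rows / GREATEST(calls, 1)          AS rows_per_call,
       rows                               AS total_rows,
       round(total_exec_time::numeric, 0) AS total_ms
FROM pg_stat_statements
WHERE query ILIKE 'select%'
ORDER BY rows DESC
LIMIT 10;
```

The queries near the top that also touch wide tables are your first migration candidates.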

API compatibility

If callers depend on the interface type — particularly if the interface is part of a published API — the native query approach preserves the contract. Callers continue to receive ProductSummary instances. The implementation changes from entity-backed proxy to ResultSet-backed proxy. No caller code changes required.

For internal projections where you control all callers, the record-based DTO is cleaner. Record accessors (dto.name()) replace interface getters (projection.getName()), which is a straightforward search-and-replace across the codebase.

Testing after migration

After each projection migration, verify three things. First, the SQL: enable Hibernate logging and confirm the SELECT clause contains only the expected columns. Second, the query plan: run EXPLAIN ANALYZE and confirm the width dropped and TOAST reads disappeared. Third, the behavior: run your existing tests — the DTO should return identical data to the interface projection, just without the entity overhead.

If you have integration tests that compare projection results, they will catch data mismatches. If you do not have such tests, this is a fine occasion to write them.

What Gold Lapel does with your projection queries

Gold Lapel sits between your Spring Boot application and PostgreSQL as a transparent proxy, observing every query on the wire protocol. It does not read your annotations or inspect your JPA configuration. It sees the SQL that Hibernate generates — which, as we have established, may differ significantly from what your Java code suggests.

When a query arrives selecting 14 columns from a wide table and only 4 of those columns ever appear in subsequent queries, API responses, or downstream processing, Gold Lapel identifies the excess. More importantly, it identifies the opportunity: if this query were narrowed to those 4 columns, a covering index with INCLUDE columns could enable an index-only scan.

Gold Lapel's covering index recommendations are built for exactly this pattern. It analyzes which columns are actually selected, which columns appear in WHERE clauses, and whether an index configuration exists that could eliminate heap access entirely. For projection queries on wide tables — the scenario where the gain is measured in orders of magnitude — the recommendation is particularly valuable.

The projection change is yours to make in the Java code. The covering index is something Gold Lapel can recommend, validate with EXPLAIN simulation, and monitor after deployment. When your 312 ms query drops to 1.2 ms, Gold Lapel confirms the improvement is real, sustained, and not regressing.

Your interface projection packed the entire wardrobe for a day trip. A DTO projection packs what you need. A covering index means you do not even need to open the suitcase — everything is in your jacket pocket.


Your patience has earned you a recommendation. The hidden SELECT * problem is part of a larger pattern I have documented in the Spring Boot PostgreSQL optimization guide — projection misconfiguration, N+1 queries, missing indexes, and the other defaults that JPA leaves for you to discover in production.