Doctrine DBAL vs ORM for PostgreSQL Reads: When to Bypass the Entity Layer
Your read queries are spending more time building PHP objects than fetching data. Allow me to show you the receipts.
Good evening. Your ORM is doing far more work than you asked for.
You have a Symfony application backed by Doctrine and PostgreSQL. Your read endpoints return JSON. Your dashboard queries aggregate numbers. Your reports produce CSV exports. None of these need change tracking. None of these will ever call $entityManager->flush().
Yet every row passes through Doctrine's full hydration pipeline: identity map lookup, entity instantiation via reflection, property population through metadata mappings, proxy creation for lazy associations, and registration with the UnitOfWork. For every single row. Even the ones you immediately serialize to JSON and discard.
I have observed this pattern in Symfony applications of every size. An API endpoint returning a list of orders — a perfectly routine operation — spending 155ms constructing PHP objects that exist only to be serialized back into JSON 200 microseconds later. The database finished its work in 5ms. The other 155ms was the ORM carefully, meticulously building a rich object graph that nobody asked for and nobody will use.
This is not a bug. Doctrine's ORM is designed for managing entity state. It tracks changes so it can persist them. It maintains an identity map so the same database row always resolves to the same PHP object. These are valuable features — for writes. For reads, they are overhead. Measurable, significant, avoidable overhead.
Doctrine ships with the solution built in. The DBAL layer sits directly beneath the ORM, providing a query builder that speaks SQL instead of DQL and returns arrays instead of entities. Same connection. Same transaction management. No hydration.
If you will permit me a brief tour of both layers, I believe the appropriate division of labor will become quite clear.
A brief orientation: what DBAL and ORM actually are
Before we compare the two, it may be helpful to understand their relationship. These are not competing libraries. They are layers of the same stack, and understanding which layer you need for a given operation is the central skill this article aims to develop.
<?php
// The three layers of Doctrine — and where they live
// Layer 1: DBAL (Doctrine Database Abstraction Layer)
// - Wraps PDO/native drivers
// - QueryBuilder that produces SQL strings
// - Connection management, transactions, schema introspection
// - Returns: arrays, scalars
use Doctrine\DBAL\Connection;
// Layer 2: ORM (Object-Relational Mapper)
// - Builds on top of DBAL
// - DQL parser (entity-aware query language)
// - Hydration engine (rows → entities)
// - UnitOfWork (identity map, change tracking, flush)
// - Returns: managed entity objects
use Doctrine\ORM\EntityManagerInterface;
// Layer 3: Your application code
// - Uses ORM for domain operations (persist, flush, remove)
// - Uses DBAL for read-heavy queries (reports, exports, analytics)
// - Both share the same underlying database connection
// To get the DBAL connection from the EntityManager:
$connection = $entityManager->getConnection();
// They are not competing tools. They are different floors
// of the same building.

The DBAL — Database Abstraction Layer — is Doctrine's foundation. It wraps PDO and native database drivers, provides connection management, transaction handling, schema introspection, and a query builder that produces SQL strings. When you call $connection->executeQuery(), you are speaking directly to PostgreSQL through this layer. The result is an array of arrays. No objects. No state management. Just data.
The ORM sits atop the DBAL. It adds DQL (Doctrine Query Language), a parser that translates entity-aware queries into SQL. It adds the hydration engine, which transforms database rows into PHP entity objects. It adds the UnitOfWork, which tracks every property change on every managed entity so that flush() knows exactly which SQL statements to execute.
The critical insight: every ORM query passes through the DBAL to reach PostgreSQL. The ORM does not have its own database connection. It adds processing on top of the DBAL's result. When you use DBAL directly for a read query, you are not bypassing Doctrine — you are using Doctrine's own foundation, without the upper floors.
The EntityManager even provides access to the underlying connection: $entityManager->getConnection() returns the same Doctrine\DBAL\Connection you would inject directly. There is no separation to bridge. They share a connection, a transaction scope, and a configuration. The only question is how much processing you want between the database result and your PHP code.
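Since the two layers share one connection, they also share one transaction. A minimal sketch of what that means in practice; the Order entity, its markShipped() method, and the order_events table are illustrative names, not part of any real schema:

```php
<?php
// Hedged sketch: a single DBAL transaction spanning both layers.
// Order, markShipped(), and order_events are illustrative names.

use Doctrine\DBAL\Connection;

$connection = $entityManager->getConnection();

$connection->transactional(function (Connection $conn) use ($entityManager): void {
    // ORM write, issued through the same underlying connection
    $order = $entityManager->find(Order::class, 42);
    $order->markShipped();
    $entityManager->flush();

    // DBAL write: same connection, same open transaction
    $conn->executeStatement(
        'INSERT INTO order_events (order_id, type) VALUES (:id, :type)',
        ['id' => 42, 'type' => 'shipped']
    );
});
// If either statement throws, both roll back together: there is
// only one transaction, because there is only one connection.
```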
The identity map: a feature masquerading as overhead
Before we examine hydration costs, I should explain why the ORM does all this work. It is not caprice. The identity map is a genuinely useful data structure — in the right context.
<?php
// The identity map: Doctrine's memory of every entity it has seen
// First query — loads Order #42 from the database
$order = $em->find(Order::class, 42);
// SQL: SELECT * FROM orders WHERE id = 42
// Entity created, registered in identity map
// Second query — returns the SAME PHP object, no SQL
$sameOrder = $em->find(Order::class, 42);
// No SQL executed. Returned from identity map.
// $order === $sameOrder → true (same object in memory)
// Third query — DQL that includes Order #42 in results
$orders = $em->createQueryBuilder()
->select('o')
->from(Order::class, 'o')
->where('o.status = :status')
->setParameter('status', 'shipped')
->getQuery()
->getResult();
// SQL is executed, but when row id=42 is hydrated,
// Doctrine returns the existing object from identity map.
// This is a correctness guarantee: one row = one object.
// The cost: every entity stays in memory for the lifetime
// of the EntityManager. For a web request, fine.
// For a long-running worker processing 100,000 rows: fatal.

The identity map guarantees that a given database row corresponds to exactly one PHP object within a request. If you load Order #42 through a find() call and then encounter it again in a query result, you get the same object instance. Not a copy. The same reference. This means that modifications to the entity are visible everywhere it is referenced, and flush() can detect changes by comparing the entity's current state to a snapshot taken when it was first loaded.
For write operations, this is genuinely valuable. It prevents the classic bug where two parts of your code modify the same entity through different copies and the last write silently wins. The identity map makes entity state consistent within a request.
For read operations, the identity map is a ledger that nobody will ever audit. Every entity is registered, snapshotted, and tracked — and then the request ends, the response is sent, and PHP garbage-collects the entire object graph. The tracking was wasted work.
This is the core of the DBAL-vs-ORM decision. Not "which is faster" in the abstract, but "does this specific query need the features that make the ORM slower?" If the answer is no — and for reads that serialize, aggregate, or export data, the answer is almost always no — then the overhead is avoidable.
The hydration tax: what ORM reads actually cost
To understand the overhead, consider the same query expressed through both layers. First, the ORM approach:
<?php
// Doctrine ORM — fetching orders with customer data
$orders = $entityManager->createQueryBuilder()
->select('o', 'c')
->from(Order::class, 'o')
->join('o.customer', 'c')
->where('o.status = :status')
->andWhere('o.createdAt >= :since')
->setParameter('status', 'shipped')
->setParameter('since', new \DateTime('-30 days'))
->getQuery()
->getResult();
// Each result is a fully hydrated Order entity with a Customer proxy.
// Doctrine tracks every property for change detection.
// Memory per row: ~2.4 KB (entity + UnitOfWork identity map entry)
// Time: 160ms for 5,000 rows

Now the same data retrieved through DBAL:
<?php
// Doctrine DBAL — same query, raw result
$rows = $connection->createQueryBuilder()
->select('o.id', 'o.total', 'o.shipped_at', 'c.name AS customer_name')
->from('orders', 'o')
->join('o', 'customers', 'c', 'c.id = o.customer_id')
->where('o.status = :status')
->andWhere('o.created_at >= :since')
->setParameter('status', 'shipped')
->setParameter('since', (new \DateTime('-30 days'))->format('Y-m-d'))
->executeQuery()
->fetchAllAssociative();
// Each result is a plain associative array.
// No identity map. No change tracking. No proxy objects.
// Memory per row: ~0.3 KB
// Time: 8ms for 5,000 rows

Both execute identical SQL against PostgreSQL. The database does the same work: parse, plan, execute, return rows. The difference is entirely in what PHP does with the result.
The ORM spends 155ms hydrating 5,000 rows into managed entities. The DBAL spends 3ms mapping the same rows to associative arrays. End to end, that is a 20x difference — 160ms against 8ms — on a query that is otherwise identical at the database level.
Here is what happens during hydration, broken down per row:
<?php
// What happens during ORM hydration (simplified):
//
// 1. Execute SQL and receive PDO result set ~5ms
// 2. For each row:
// a. Check identity map — is this entity loaded? ~0.002ms
// b. Create entity instance via reflection ~0.008ms
// c. Populate properties via metadata mapping ~0.012ms
// d. Register in UnitOfWork identity map ~0.003ms
// e. Create proxy objects for lazy associations ~0.005ms
// f. Mark entity as MANAGED state ~0.001ms
// 3. Total per-row overhead: ~0.031ms
//
// For 5,000 rows: 5ms (SQL) + 155ms (hydration) = 160ms
// DBAL for 5,000 rows: 5ms (SQL) + 3ms (array build) = 8ms
//
// The SQL is identical. The difference is purely PHP-side.

At 0.031ms per row, the overhead looks trivial. But it compounds. At 1,000 rows, you have added 31ms of pure PHP processing. At 5,000 rows, 155ms. At 50,000 rows — a background report, an export job — the hydration alone takes 1.5 seconds, and the identity map has consumed roughly 120 MB of RAM.
The SQL execution time stays constant. The hydration cost scales linearly with row count. For any query returning more than a few hundred rows that will be serialized, displayed, or exported without modification, hydration is wasted work.
I should note that the per-row overhead varies with entity complexity. An entity with 5 scalar properties hydrates faster than one with 15 properties and 4 associations. The 0.031ms figure represents a moderately complex entity — two associations, ten mapped properties. Your entities may be simpler or more elaborate. The principle, however, does not change: every mapped property costs reflection time, every association costs proxy creation time, and every entity costs identity map registration time. These costs are per-row and linear.
The DQL parsing overhead: the cost before hydration
Hydration is the most visible cost, but it is not the only one. Before the ORM can execute a query, it must translate DQL into SQL. This is a non-trivial compilation step.
<?php
// DQL parsing: the hidden cost before hydration even begins
// This DQL string:
$dql = 'SELECT o, c FROM App\Entity\Order o JOIN o.customer c
WHERE o.status = :status AND o.createdAt >= :since';
// Goes through this pipeline:
// 1. Lexer tokenizes the DQL string ~0.3ms
// 2. Parser builds an Abstract Syntax Tree (AST) ~1.2ms
// 3. SQL walker converts AST to SQL string ~0.8ms
// 4. Metadata lookups resolve entity mappings ~0.5ms
// 5. SQL is sent to PostgreSQL for execution (variable)
//
// Total parse overhead: ~2.8ms per query
//
// With query cache enabled (APCu or Redis):
// 1. Hash the DQL string ~0.01ms
// 2. Cache lookup ~0.1ms
// 3. Return cached SQL ~0.01ms
//
// Total with cache: ~0.12ms
//
// DBAL QueryBuilder: builds SQL string directly.
// No lexer. No parser. No AST. No walker.
// String concatenation: ~0.05ms

DQL is not SQL with different syntax. It is a separate language that operates on entities and their mapped associations rather than tables and columns. The parser must tokenize the DQL string, build an Abstract Syntax Tree, walk that tree to resolve entity metadata and association mappings, and finally emit a SQL string that PostgreSQL can execute.
For a single query, 2.8ms of parse time is invisible. For an endpoint that executes 12 DQL queries per request — which is not unusual in a Symfony application rendering a complex page — that is 33ms of pure parsing overhead before a single byte reaches the database.
Doctrine provides a query cache to mitigate this. With APCu or Redis configured as the query cache backend, the compiled SQL is stored and reused. Subsequent requests skip the parse step entirely, reducing the overhead to a cache lookup — roughly 0.12ms. If your application uses DQL extensively and you have not configured query caching, I would encourage you to do so immediately, regardless of whether you adopt DBAL for reads. It is free performance.
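For reference, wiring that cache in a Symfony application looks roughly like the following. Recent DoctrineBundle versions configure a query cache automatically in production; the explicit form is shown only to make the mechanism visible, and the pool name is illustrative:

```yaml
# config/packages/doctrine.yaml (sketch; pool name is illustrative)
doctrine:
    orm:
        # Stores compiled DQL-to-SQL translations between requests
        query_cache_driver:
            type: pool
            pool: app.doctrine_query_cache

framework:
    cache:
        pools:
            app.doctrine_query_cache:
                adapter: cache.adapter.apcu
```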
DBAL's QueryBuilder, by contrast, does not parse anything. It builds a SQL string through method calls — select(), from(), where() — and produces the final string via concatenation. The "compilation" cost is negligible. There is no AST, no metadata resolution, no walker. The SQL string you see in your code is, with parameter placeholders, what PostgreSQL receives.
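You can verify this yourself: the DBAL QueryBuilder will hand back the string it assembled. A small sketch, reusing the orders table from the earlier examples:

```php
<?php
// The DBAL QueryBuilder is a string assembler, not a compiler.
$qb = $connection->createQueryBuilder()
    ->select('o.id', 'o.total')
    ->from('orders', 'o')
    ->where('o.status = :status');

// getSQL() returns the assembled statement: no lexer, no AST,
// no metadata resolution happened along the way.
echo $qb->getSQL();
// e.g. "SELECT o.id, o.total FROM orders o WHERE o.status = :status"
```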
This distinction matters most when you are already bypassing entity hydration. If you use DQL with ->getScalarResult() or ->getArrayResult() to avoid full hydration, you have eliminated the per-row object construction — but the DQL parse step remains. DBAL eliminates both.
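For completeness, here is what that intermediate position looks like: DQL with array hydration. A sketch using the same Order entity as the earlier examples:

```php
<?php
// DQL with array hydration: entity construction is skipped,
// but the DQL lexer/parser/AST pipeline still runs (unless cached).
$rows = $entityManager->createQueryBuilder()
    ->select('o.id', 'o.total')
    ->from(Order::class, 'o')
    ->where('o.status = :status')
    ->setParameter('status', 'shipped')
    ->getQuery()
    ->getArrayResult();   // nested arrays, no managed entities

// Per the figures above: most of the hydration cost is gone,
// but the ~2.8ms parse cost per uncached query remains.
```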
The benchmarks: ORM vs DBAL across seven query patterns
PostgreSQL 16, PHP 8.3, Doctrine ORM 3.x / DBAL 4.x, 2 million orders, 200,000 customers. Measured with Blackfire, median of 50 runs, opcache warm.
| Operation | ORM | DBAL | Overhead | Note |
|---|---|---|---|---|
| Simple fetch (100 rows) | 4.2ms | 2.1ms | 2.0x | Hydration dominates |
| Join + filter (1,000 rows) | 28ms | 6ms | 4.7x | Identity map + proxies |
| Join + filter (5,000 rows) | 160ms | 8ms | 20x | Hydration scales linearly |
| Aggregation (GROUP BY) | 14ms | 9ms | 1.6x | Scalar — less hydration |
| CTE + window function | N/A | 12ms | - | DQL cannot express this |
| JSONB containment query | N/A | 7ms | - | Requires native SQL |
| Bulk report (50,000 rows) | 3,400ms | 85ms | 40x | ORM memory limit risk |
Three patterns emerge from these numbers.
First: hydration overhead is proportional to row count, not query complexity. A simple fetch of 5,000 rows costs more in hydration than a complex aggregation returning 10 rows. The bottleneck is not the SQL — it is the PHP object construction loop. This means that optimizing the SQL (adding indexes, rewriting joins) only helps when the database is the bottleneck. For high-row-count reads where the database finishes quickly, the bottleneck has moved to PHP, and no amount of index tuning will help.
Second: aggregation queries show the smallest gap (1.6x) because they return scalar results. Doctrine's scalar hydration is lighter than entity hydration — no identity map, no change tracking, no proxy creation. If your ORM queries already use ->getScalarResult() or ->getArrayResult(), you have captured most of the benefit. But at that point, you are writing DQL to avoid hydration, when DBAL would let you write SQL directly. The question becomes: why maintain a DQL layer that produces the same SQL as DBAL, just to immediately bypass the feature (hydration) that justifies DQL's existence?
Third: CTEs, window functions, and JSONB operations are not expressible in DQL at all. The ORM column shows N/A because these queries require NativeQuery or dropping to DBAL. If your read patterns rely on PostgreSQL's analytical features, the ORM is not an option — it is an obstacle.
The 50,000-row bulk report deserves particular attention. At 40x overhead, the ORM approach takes 3.4 seconds and consumes approximately 120 MB of additional RAM for the identity map and entity objects. In a PHP-FPM worker with a 128 MB memory limit, this query will fail with a fatal error. The DBAL approach completes in 85ms and uses roughly 15 MB. Same query. Same data. The difference is where the work happens.
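At export scale there is one more DBAL lever worth knowing: iterating the result instead of materializing it. A sketch; the file path and column list are illustrative, and note that whether the database driver itself buffers the full result depends on the driver, so this guarantees only that PHP-side arrays do not accumulate:

```php
<?php
// Streaming a bulk export row by row instead of fetchAllAssociative().
// Keeps PHP memory flat regardless of row count.
$result = $connection->executeQuery(
    'SELECT id, customer_id, total, created_at
     FROM orders
     WHERE status = :status',
    ['status' => 'shipped']
);

$out = fopen('/tmp/orders-export.csv', 'wb');
foreach ($result->iterateAssociative() as $row) {
    fputcsv($out, $row);   // one row held in PHP at a time
}
fclose($out);
```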
Where DQL cannot follow: CTEs, window functions, LATERAL, and JSONB
DQL is a deliberate subset of SQL. It operates on entities and their mapped associations, not on database tables and their columns. This abstraction is useful until you need PostgreSQL features that have no DQL equivalent. And PostgreSQL, if you will forgive me for saying so, has rather a lot of features.
Common Table Expressions (CTEs)
Cohort analysis, recursive hierarchies, and multi-step transformations all rely on WITH clauses. DQL has no syntax for them. This is not an oversight — CTEs operate on result sets, not on entity associations, and DQL's entity-centric model has no way to represent them.
<?php
// DBAL — monthly cohort retention with a CTE
// (Not expressible in DQL at all)
$sql = <<<SQL
WITH cohorts AS (
SELECT
customer_id,
DATE_TRUNC('month', MIN(created_at)) AS cohort_month
FROM orders
GROUP BY customer_id
),
activity AS (
SELECT
c.cohort_month,
DATE_TRUNC('month', o.created_at) AS activity_month,
COUNT(DISTINCT o.customer_id) AS active_customers
FROM orders o
JOIN cohorts c ON c.customer_id = o.customer_id
GROUP BY c.cohort_month, DATE_TRUNC('month', o.created_at)
)
SELECT
cohort_month,
activity_month,
active_customers,
ROUND(
active_customers::numeric /
FIRST_VALUE(active_customers) OVER (
PARTITION BY cohort_month ORDER BY activity_month
) * 100, 1
) AS retention_pct
FROM activity
ORDER BY cohort_month, activity_month
SQL;
$results = $connection->executeQuery($sql)->fetchAllAssociative();
// CTEs, window functions, FIRST_VALUE — none of this exists in DQL.
// DBAL is not optional here. It is the only path.

This query defines two intermediate result sets (cohorts and activity) and then combines them with a window function. The PostgreSQL planner optimizes the entire CTE as a single unit, often materializing intermediate results or inlining them depending on cost estimates. The PHP application receives the final result directly — no intermediate processing, no multiple round trips.
The alternative without CTEs would require either multiple separate queries with PHP-side joining (slow, memory-intensive) or a deeply nested subquery that PostgreSQL's planner handles less efficiently. CTEs exist precisely because they make complex analytical queries both readable and performant.
Window functions
Running totals, rankings, lead/lag comparisons, and moving averages use OVER() clauses. DQL cannot express them.
<?php
// DBAL — running total and rank per customer using window functions
$sql = <<<SQL
SELECT
o.id,
o.customer_id,
o.total,
o.created_at,
SUM(o.total) OVER (
PARTITION BY o.customer_id
ORDER BY o.created_at
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
ROW_NUMBER() OVER (
PARTITION BY o.customer_id
ORDER BY o.total DESC
) AS order_rank
FROM orders o
WHERE o.created_at >= :since
ORDER BY o.customer_id, o.created_at
SQL;
$results = $connection->executeQuery($sql, [
'since' => (new \DateTime('-90 days'))->format('Y-m-d'),
])->fetchAllAssociative();
// Window functions are PostgreSQL's finest analytical tool.
// DQL cannot express them. Period.

Window functions operate across a set of rows related to the current row, without collapsing them into a group. The running total accumulates; the rank orders. PostgreSQL computes both in a single pass over the data. Replicating this in PHP would require fetching all rows, sorting them in memory, and iterating with accumulators — more code, more memory, worse performance.
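To make the comparison concrete, here is a sketch of the PHP-side replacement for just the running-total half of that query. Everything below is plain PHP with illustrative row shapes; it exists to show the extra code and memory, not as a recommendation:

```php
<?php
// Anti-pattern sketch: per-customer running totals computed in PHP.
// The SUM(...) OVER (...) window function above does this in one
// pass inside PostgreSQL, without sorting or accumulating in PHP.

/**
 * @param list<array{customer_id: int, total: float, created_at: string}> $rows
 * @return list<array{customer_id: int, total: float, created_at: string, running_total: float}>
 */
function addRunningTotals(array $rows): array
{
    // Re-do the sorting PostgreSQL's ORDER BY would have done
    usort($rows, fn (array $a, array $b): int =>
        [$a['customer_id'], $a['created_at']] <=> [$b['customer_id'], $b['created_at']]
    );

    $out = [];
    $sums = [];
    foreach ($rows as $row) {
        $cid = $row['customer_id'];
        $sums[$cid] = ($sums[$cid] ?? 0.0) + $row['total'];
        $row['running_total'] = $sums[$cid];
        $out[] = $row;
    }
    return $out;
}
```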
LATERAL joins
When you need "the top N items per group" — top 3 orders per customer, most recent 5 log entries per user — a LATERAL join is PostgreSQL's elegant answer.
<?php
// DBAL — top 3 orders per customer using LATERAL JOIN
// Another PostgreSQL feature with no DQL equivalent
$sql = <<<SQL
SELECT
c.id AS customer_id,
c.name AS customer_name,
top_orders.id AS order_id,
top_orders.total,
top_orders.created_at
FROM customers c
JOIN LATERAL (
SELECT o.id, o.total, o.created_at
FROM orders o
WHERE o.customer_id = c.id
ORDER BY o.total DESC
LIMIT 3
) top_orders ON true
WHERE c.created_at >= :since
ORDER BY c.name, top_orders.total DESC
SQL;
$results = $connection->executeQuery($sql, [
'since' => (new \DateTime('-1 year'))->format('Y-m-d'),
])->fetchAllAssociative();
// LATERAL JOIN: correlated subquery in the FROM clause.
// Far more efficient than the PHP alternative of loading
// all orders and slicing in application code.
// PostgreSQL executes the subquery once per customer row,
// using the index on (customer_id, total DESC).

Without LATERAL, the standard approach is either a window function with ROW_NUMBER() and a filtering outer query, or loading all rows into PHP and slicing per group. LATERAL lets PostgreSQL execute a correlated subquery for each row in the outer table, with full access to the outer row's columns. It is SQL doing what SQL does best: set operations with declarative constraints.
DQL has no concept of LATERAL. It has no concept of subqueries in the FROM clause at all. This is not a temporary limitation — it is a fundamental mismatch between DQL's entity-graph model and SQL's relational algebra.
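For comparison, here is the window-function formulation of the same top-3-per-customer query mentioned above. It stays in SQL (and therefore in DBAL); the planner may pick a different strategy than the LATERAL form, so which wins is a matter of measurement:

```php
<?php
// The ROW_NUMBER() alternative to the LATERAL join:
// rank orders within each customer, then filter on the rank.
$sql = <<<SQL
SELECT customer_id, id AS order_id, total, created_at
FROM (
    SELECT
        o.customer_id,
        o.id,
        o.total,
        o.created_at,
        ROW_NUMBER() OVER (
            PARTITION BY o.customer_id
            ORDER BY o.total DESC
        ) AS rn
    FROM orders o
) ranked
WHERE rn <= 3
ORDER BY customer_id, total DESC
SQL;

$results = $connection->executeQuery($sql)->fetchAllAssociative();
```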
JSONB operations
PostgreSQL's JSONB operators — containment (@>), arrow extraction (->>), path queries (#>>) — are not available in DQL without custom function extensions.
<?php
// DBAL — querying JSONB metadata stored on orders
$sql = <<<SQL
SELECT
o.id,
o.metadata->>'source' AS acquisition_source,
o.metadata->'tags' AS tags,
o.total,
jsonb_array_length(o.metadata->'items') AS item_count
FROM orders o
WHERE o.metadata @> :filter
AND o.created_at >= :since
ORDER BY o.total DESC
LIMIT 100
SQL;
$results = $connection->executeQuery($sql, [
'filter' => json_encode(['priority' => 'high']),
'since' => (new \DateTime('-30 days'))->format('Y-m-d'),
])->fetchAllAssociative();
// JSONB containment (@>), arrow operators (->>),
// jsonb_array_length — native PostgreSQL, inaccessible from DQL.
// The ORM would require a NativeQuery or custom DQL function.

You can work around these limitations with $entityManager->createNativeQuery() and ResultSetMapping, but at that point you are writing raw SQL with extra ceremony. Which brings us to an important comparison.
NativeQuery vs DBAL: why the escape hatch costs more than the exit
When Doctrine developers need raw SQL, they often reach for NativeQuery first. It is, after all, still within the ORM — it feels like staying in the family. Allow me to suggest that this familiarity comes at a cost.
<?php
// NativeQuery: the ORM's escape hatch for raw SQL
// It works, but compare the ceremony to DBAL.
// === NativeQuery approach ===
use Doctrine\ORM\Query\ResultSetMapping;

$rsm = new ResultSetMapping();
$rsm->addScalarResult('region', 'region');
$rsm->addScalarResult('revenue', 'revenue');
$rsm->addScalarResult('customer_count', 'customerCount');
$query = $em->createNativeQuery(
'SELECT region, SUM(total) AS revenue,
COUNT(DISTINCT customer_id) AS customer_count
FROM orders
WHERE created_at >= :since
GROUP BY region
ORDER BY revenue DESC',
$rsm
);
$query->setParameter('since', (new \DateTime('-90 days'))->format('Y-m-d'));
$results = $query->getResult();
// === DBAL approach ===
$results = $connection->executeQuery(
'SELECT region, SUM(total) AS revenue,
COUNT(DISTINCT customer_id) AS customer_count
FROM orders
WHERE created_at >= :since
GROUP BY region
ORDER BY revenue DESC',
['since' => (new \DateTime('-90 days'))->format('Y-m-d')]
)->fetchAllAssociative();
// Same SQL. Same result. But NativeQuery requires:
// 1. A ResultSetMapping (manual column-to-alias mapping)
// 2. Scalar result registration for each column
// 3. A Query object wrapping the SQL
// 4. The result still passes through Doctrine's hydration layer
//
// DBAL: one call, direct array return.
// NativeQuery exists for when you need raw SQL but want
// entity hydration. If you don't need entities — and for
// reads, you usually don't — DBAL is the cleaner path.

NativeQuery requires a ResultSetMapping that manually declares every column in the result set. For scalar results, each column needs an addScalarResult() call specifying the SQL column name and the PHP alias. For entity results, the mapping requires class metadata, discriminator columns, and association mappings. This configuration is fragile — add a column to your SQL without updating the mapping, and you get silent data loss or cryptic hydration errors.
DBAL's executeQuery() returns whatever the database returns. Add a column, and it appears in the array. Remove one, and it disappears. The mapping between SQL and PHP is implicit: column names become array keys. No configuration to maintain, no mapping to keep synchronized.
NativeQuery does have one legitimate use case: when you need raw SQL and entity hydration. If your query uses a CTE to compute results but you need managed entities in the return — perhaps because you intend to modify and flush them — NativeQuery with entity-level ResultSetMapping is the tool for the job. This is a genuinely narrow case. For the vast majority of read queries, the result is data, not entities, and DBAL is the cleaner path.
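If you do land in that narrow case, ResultSetMappingBuilder can at least derive the column mapping from entity metadata instead of hand-registering every field. A hedged sketch, assuming the Order entity's mapped columns line up with what the CTE returns:

```php
<?php
use Doctrine\ORM\Query\ResultSetMappingBuilder;

// Raw SQL that must return *managed* Order entities, because the
// caller intends to modify them and flush().
$rsm = new ResultSetMappingBuilder($em);
$rsm->addRootEntityFromClassMetadata(Order::class, 'o');

// generateSelectClause() emits the column list the mapping expects,
// so adding a mapped field later cannot silently desynchronize it.
$sql = 'WITH recent AS (
            SELECT * FROM orders WHERE created_at >= :since
        )
        SELECT ' . $rsm->generateSelectClause() . ' FROM recent o';

$orders = $em->createNativeQuery($sql, $rsm)
    ->setParameter('since', '2024-01-01')
    ->getResult();   // hydrated, managed entities
```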
The CQRS-lite pattern: ORM for writes, DBAL for reads
CQRS — Command Query Responsibility Segregation — is an architecture pattern where writes and reads use different models. The "full" version involves separate databases, event sourcing, and eventual consistency. That is, if you will forgive the observation, rather a lot of machinery for most applications.
CQRS-lite is simpler: same database, same Doctrine connection, but different interfaces for writes and reads. The EntityManager handles writes. The DBAL Connection handles reads. Both are already available in any Symfony application — you are not adding a dependency, you are using one you already have.
<?php
// The CQRS-lite pattern: ORM for writes, DBAL for reads
// === WRITE SIDE (ORM) ===
// Entities, change tracking, flush — the ORM earns its keep here.
class PlaceOrderHandler
{
public function __construct(
private EntityManagerInterface $em,
) {}
public function handle(PlaceOrderCommand $cmd): void
{
$customer = $this->em->find(Customer::class, $cmd->customerId);
$order = new Order($customer, $cmd->items, $cmd->total);
$this->em->persist($order);
$this->em->flush();
// Identity map, cascade, lifecycle events — all useful here.
}
}
// === READ SIDE (DBAL) ===
// No entities. No hydration. Just data.
class OrderReportQuery
{
public function __construct(
private Connection $connection,
) {}
public function revenueByRegion(\DateTime $since): array
{
return $this->connection->createQueryBuilder()
->select('region', 'SUM(total) AS revenue', 'COUNT(*) AS orders')
->from('orders')
->where('created_at >= :since')
->setParameter('since', $since->format('Y-m-d'))
->groupBy('region')
->orderBy('revenue', 'DESC')
->executeQuery()
->fetchAllAssociative();
}
}
// Symfony service wiring — both injected, both available:
// EntityManagerInterface for writes
// Doctrine\DBAL\Connection for reads

This separation is not just a performance optimization. It clarifies intent. When you see EntityManagerInterface in a constructor, you know this class modifies state. When you see Connection, you know it only reads. The dependency declaration becomes documentation.
Symfony's service container makes this wiring trivial. Both EntityManagerInterface and Doctrine\DBAL\Connection are autowirable. No factory methods, no configuration blocks, no abstraction layers. Inject what you need.
Structuring read results with DTOs
Associative arrays from DBAL are fast but loosely typed. For domain boundaries — controller responses, API serialization, Twig templates — a readonly DTO provides structure without ORM overhead:
<?php
// For structured read results, map DBAL rows to DTOs
readonly class OrderSummary
{
public function __construct(
public int $id,
public string $customerName,
public float $total,
public string $status,
public \DateTimeImmutable $createdAt,
) {}
public static function fromRow(array $row): self
{
return new self(
id: (int) $row['id'],
customerName: $row['customer_name'],
total: (float) $row['total'],
status: $row['status'],
createdAt: new \DateTimeImmutable($row['created_at']),
);
}
}
// Usage:
$rows = $connection->executeQuery($sql, $params)->fetchAllAssociative();
$summaries = array_map(OrderSummary::fromRow(...), $rows);
// Type-safe. Immutable. No ORM overhead.
// Memory: ~0.4 KB per DTO vs ~2.4 KB per hydrated entity.

The DTO adds roughly 0.1 KB per row over raw arrays — a fraction of the 2.4 KB per hydrated entity. For 5,000 rows, that is 2 MB of DTO objects versus 12 MB of managed entities. The memory difference matters for long-running processes and high-concurrency endpoints.
PHP 8.2's readonly classes are particularly well-suited to this pattern. The readonly modifier on the class ensures all properties are immutable once constructed — the same guarantee Doctrine entities provide through managed state, but without the overhead of tracking changes that will never happen.
The ReadRepository pattern
As your DBAL queries accumulate, you will want a home for them. Doctrine's EntityRepository is tied to the ORM — it expects an entity class and returns entities. For DBAL reads, I would suggest a parallel structure: dedicated read repositories that inject Connection and return DTOs or arrays.
<?php
// Organizing DBAL read queries: the ReadRepository pattern
// Doctrine's EntityRepository is tied to the ORM.
// For DBAL queries, create dedicated read repositories.
final class OrderReadRepository
{
public function __construct(
private Connection $connection,
) {}
/**
* @return list<OrderSummary>
*/
public function findShippedSince(\DateTimeImmutable $since, int $limit = 100): array
{
$rows = $this->connection->createQueryBuilder()
->select('o.id', 'o.total', 'o.status', 'o.created_at', 'c.name AS customer_name')
->from('orders', 'o')
->join('o', 'customers', 'c', 'c.id = o.customer_id')
->where('o.status = :status')
->andWhere('o.created_at >= :since')
->setParameter('status', 'shipped')
->setParameter('since', $since->format('Y-m-d'))
->setMaxResults($limit)
->orderBy('o.created_at', 'DESC')
->executeQuery()
->fetchAllAssociative();
return array_map(OrderSummary::fromRow(...), $rows);
}
public function revenueByRegion(\DateTimeImmutable $since): array
{
return $this->connection->createQueryBuilder()
->select('region', 'SUM(total) AS revenue', 'COUNT(*) AS order_count')
->from('orders')
->where('created_at >= :since')
->setParameter('since', $since->format('Y-m-d'))
->groupBy('region')
->orderBy('revenue', 'DESC')
->executeQuery()
->fetchAllAssociative();
}
public function topCustomersByRevenue(int $limit = 20): array
{
$sql = <<<SQL
SELECT
c.id,
c.name,
COUNT(o.id) AS order_count,
SUM(o.total) AS total_revenue,
MAX(o.created_at) AS last_order_at
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY total_revenue DESC
LIMIT :limit
SQL;
return $this->connection->executeQuery($sql, [
'limit' => $limit,
], [
'limit' => \Doctrine\DBAL\ParameterType::INTEGER,
])->fetchAllAssociative();
}
}
// Register in Symfony services.yaml:
// App\ReadRepository\OrderReadRepository:
// arguments:
// $connection: '@doctrine.dbal.default_connection'

This pattern keeps your DBAL queries organized, testable, and discoverable. Each read repository corresponds to a domain concept — OrderReadRepository, CustomerReadRepository, ReportReadRepository — and contains the queries that serve read operations for that concept. The naming convention mirrors Doctrine's OrderRepository for ORM operations, making the codebase navigable.
"The ORM did not fail. It did exactly what was asked. It was simply asked poorly."
— from You Don't Need Redis, Chapter 3: The ORM Tax
Identifying which queries to migrate
Converting every ORM read to DBAL would be as wasteful as leaving them all in the ORM. The effort should be proportional to the overhead. You need measurement, not intuition.
<?php
// Identifying which queries to migrate: use Symfony Profiler
// In config/packages/dev/doctrine.yaml:
// doctrine:
// dbal:
// profiling_collect_backtrace: true
//
// This logs every query with its execution time, row count,
// and the PHP call stack that triggered it.
// In the Symfony profiler toolbar:
// - Sort queries by execution time (descending)
// - Look for queries returning > 500 rows
// - Look for queries where hydration time > SQL time
// - Look for queries on read-only endpoints (GET requests)
// Programmatic profiling with Blackfire:
// blackfire run php bin/console app:generate-report --period=90
//
// Look for:
// - Doctrine\ORM\Internal\Hydration\ObjectHydrator::hydrateAllData
// - High call count on UnitOfWork::createEntity
// - Memory spikes during result hydration
// Quick measurement in code:
$start = microtime(true);
$result = $entityManager->createQueryBuilder()
->select('o', 'c')
->from(Order::class, 'o')
->join('o.customer', 'c')
->where('o.status = :status')
->setParameter('status', 'shipped')
->getQuery()
->getResult();
$ormTime = (microtime(true) - $start) * 1000;
$start = microtime(true);
$rows = $connection->createQueryBuilder()
->select('o.id', 'o.total', 'c.name AS customer_name')
->from('orders', 'o')
->join('o', 'customers', 'c', 'c.id = o.customer_id')
->where('o.status = :status')
->setParameter('status', 'shipped')
->executeQuery()
->fetchAllAssociative();
$dbalTime = (microtime(true) - $start) * 1000;
// Log both times. If $ormTime / $dbalTime > 3, the query
// is a strong candidate for migration.

The Symfony Profiler is your first tool. In development mode, it records every database query with timing, row count, and the PHP stack trace that triggered it. Sort by execution time, then look for queries where the total response time far exceeds the SQL execution time. The gap is hydration.
Blackfire provides more granular data for production profiling. Look specifically for time spent in ObjectHydrator::hydrateAllData and call counts on UnitOfWork::createEntity. If a single request spends more time in hydration methods than in all other application code combined, you have found your candidates.
A pragmatic heuristic that works well in practice: any ORM query that returns more than 500 rows on a read-only endpoint is a candidate. Any query that appears in a background worker or CLI command processing batches is a strong candidate. Any query that uses ->getScalarResult() or ->getArrayResult() to avoid hydration is already telling you that it should be DBAL.
Start with the top three by combined row-count-times-frequency. Measure before and after. The gains compound — not just in wall time, but in memory reduction, which translates to more requests per PHP-FPM worker pool.
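The row-count-times-frequency ranking can be sketched as a small scoring helper. This is a minimal sketch under one assumption: that you have already exported per-query statistics from the profiler or a query log into the (hypothetical) array shape shown here.

```php
<?php
// Rank ORM read queries by migration priority: average rows returned
// multiplied by executions per day. The stats array shape is an
// assumption; adapt it to whatever your profiler actually exports.
function rankMigrationCandidates(array $queryStats, int $top = 3): array
{
    $scored = array_map(
        fn (array $s): array => $s + ['score' => $s['avg_rows'] * $s['calls_per_day']],
        $queryStats
    );
    // Highest score first: these queries pay the largest hydration tax.
    usort($scored, fn (array $a, array $b): int => $b['score'] <=> $a['score']);

    return array_slice($scored, 0, $top);
}
```

Feeding it the profiler export for a day of traffic surfaces the handful of queries worth converting first, without relying on intuition.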
Memory management: the identity map in long-running processes
Web requests have a natural memory boundary: the request ends, PHP tears down the process (or recycles it), and the identity map is discarded. Long-running processes — Symfony Messenger consumers, CLI commands, queue workers — have no such boundary. The identity map grows without limit.
<?php
// Memory management for long-running processes
// Problem: identity map grows without bound in workers/consumers
$em = $this->entityManager;
// Process 100,000 orders in a Messenger consumer:
foreach ($orderIds as $batch) {
$orders = $em->getRepository(Order::class)->findBy(['id' => $batch]);
foreach ($orders as $order) {
// Process each order...
}
// Without this, every entity stays in memory forever
$em->clear();
// Detaches ALL managed entities. Identity map is empty.
// Any references to old entity objects are now stale.
}
// DBAL alternative — no identity map to manage:
foreach ($batches as $batch) {
$rows = $connection->executeQuery(
'SELECT id, total, status FROM orders WHERE id IN (?)',
[$batch],
[\Doctrine\DBAL\ArrayParameterType::INTEGER] // Connection::PARAM_INT_ARRAY before DBAL 3.6
)->fetchAllAssociative();
foreach ($rows as $row) {
// Process each row — plain array, no managed state
}
// Nothing to clear. Arrays are garbage collected normally.
}
// For background jobs processing large datasets,
// DBAL eliminates an entire class of memory management problems.

The ORM approach requires explicit memory management. You must call $em->clear() periodically to reset the identity map. But clear() detaches all managed entities, which means any references you are holding to entity objects become stale — accessing their lazy associations will throw EntityNotFoundException, and attempting to flush them will fail. This creates a class of bugs that only appears under load, in production, in your worker processes.
DBAL queries return plain arrays. Arrays are garbage collected normally when they go out of scope. There is no identity map to clear, no managed state to detach, no stale references to guard against. For batch processing, this is not just faster — it is architecturally simpler. An entire category of memory management bugs simply does not exist.
I have seen Symfony applications where the Messenger consumer was restarted every 500 messages because the identity map leaked memory beyond the process limit. The fix was not tuning the restart threshold — it was moving the read queries in the consumer to DBAL, eliminating the identity map growth entirely. The consumer now runs indefinitely.
When the ORM is the right choice (even for reads)
I should be forthcoming about the boundaries of this argument, because a recommendation without honest limits is not a recommendation — it is a sales pitch. And I am no salesman.
DBAL is not universally better for reads. The ORM provides genuine value in several read scenarios.
Identity map consistency within a request. When a request reads an entity, modifies it, and reads it again, the identity map ensures the second read reflects the uncommitted changes. DBAL queries would return stale data from the database. If your read-then-write flow depends on seeing pending modifications, the ORM's identity map is not overhead — it is a correctness guarantee. This matters most in complex write workflows where validation depends on reading related entities.
Lazy loading within entity graphs. When rendering an entity's associations in a template or serializer, the ORM's proxy objects load related data on demand. Replacing this with DBAL means manually joining every association upfront or executing multiple DBAL queries. For complex entity graphs with conditional display logic — "show the customer's address only if they have a premium plan, and if so, also load their subscription details" — the ORM's lazy loading is genuinely more ergonomic. The N+1 cost is real, but for small result sets with complex traversal patterns, it may be the correct trade-off.
Small result sets in CRUD operations. Fetching a single entity by ID for a form edit page costs perhaps 0.5ms in hydration overhead. The ORM's form binding, validation, and flush cycle saves far more development time than 0.5ms of runtime. Optimizing a single-entity fetch from 0.5ms to 0.1ms is not a productive use of engineering effort.
Doctrine lifecycle events. If your entities use @PrePersist, @PostLoad, or other lifecycle callbacks that run logic after loading from the database, DBAL bypasses these entirely. Any behavior that depends on lifecycle events will need to be replicated in your DTO mapping code or read repository. For some applications, this is straightforward. For others, it represents significant duplicated logic.
Second-level cache. Doctrine's second-level cache stores hydrated entities across requests. If you have configured it for high-traffic read entities, the ORM serves them from cache without database access — faster than DBAL, which always hits the database. This is an advanced feature that most applications do not use, but if yours does, the ORM's read performance changes substantially.
The decision framework is straightforward:
| Scenario | Use | Why |
|---|---|---|
| CRUD — create, update, delete single entities | ORM | Change tracking, cascades, lifecycle events, validation |
| Form handling and entity persistence | ORM | Symfony forms bind directly to entities |
| Dashboard aggregations and reports | DBAL | No entities needed — arrays or DTOs are lighter |
| API list endpoints returning > 500 rows | DBAL | Hydration cost grows linearly; DTO mapping is constant |
| CTEs, window functions, recursive queries | DBAL | DQL does not support these PostgreSQL features |
| JSONB queries with containment operators | DBAL | Native PostgreSQL operators unavailable in DQL |
| Background jobs processing large datasets | DBAL | Memory pressure from 50K+ hydrated entities is real |
| Complex writes with business rules | ORM | Domain model, events, and flush atomicity matter |
Most Symfony applications settle around 60-70% ORM writes and single-entity reads, 30-40% DBAL for list views, reports, and analytics. The ratio varies, but the principle holds: use the ORM where it adds value, bypass it where it adds weight.
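As a concrete instance of the "DQL does not support these" rows in the table above, here is a hedged sketch of a window-function query routed through DBAL. The table and column names follow the article's schema; the per-region ranking requirement itself is illustrative.

```php
<?php
// A per-region ranking of orders by total. RANK() is a window function,
// which DQL cannot express; raw SQL through DBAL handles it directly.
function regionRankSql(): string
{
    return <<<SQL
    SELECT
        id,
        region,
        total,
        RANK() OVER (PARTITION BY region ORDER BY total DESC) AS region_rank
    FROM orders
    WHERE created_at >= :since
    SQL;
}

// Executed through the same DBAL connection the read repositories use:
// $rows = $connection->executeQuery(regionRankSql(), ['since' => '2026-01-01'])
//     ->fetchAllAssociative();
```

The SQL lives in a plain function so it can be asserted on in tests and reused by more than one caller.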
Testing DBAL queries against real PostgreSQL
A common objection to DBAL queries is testability. ORM entities can be tested with mocks and fixtures. DBAL queries produce SQL that must run against a real database. I regard this as a feature, not a limitation.
<?php
// Testing DBAL read repositories
use Doctrine\DBAL\Connection;
use Symfony\Bundle\FrameworkBundle\Test\KernelTestCase;
class OrderReadRepositoryTest extends KernelTestCase
{
private Connection $connection;
private OrderReadRepository $repository;
protected function setUp(): void
{
// Boot the test kernel, then reuse the DBAL connection it configured
self::bootKernel();
$this->connection = self::getContainer()->get(Connection::class);
$this->repository = new OrderReadRepository($this->connection);
// Insert test fixtures via DBAL — faster than ORM fixtures
$this->connection->executeStatement(
'INSERT INTO customers (id, name) VALUES (1, :name)',
['name' => 'Test Customer']
);
$this->connection->executeStatement(
'INSERT INTO orders (id, customer_id, total, status, region, created_at)
VALUES (1, 1, 99.50, :status, :region, :created)',
['status' => 'shipped', 'region' => 'eu', 'created' => '2026-03-01']
);
}
public function testRevenueByRegion(): void
{
$results = $this->repository->revenueByRegion(
new \DateTimeImmutable('2026-01-01')
);
self::assertCount(1, $results);
self::assertSame('eu', $results[0]['region']);
self::assertEquals(99.50, (float) $results[0]['revenue']);
}
protected function tearDown(): void
{
$this->connection->executeStatement('DELETE FROM orders');
$this->connection->executeStatement('DELETE FROM customers');
}
}
// Tests run against real PostgreSQL — no mocks, no SQLite.
// DBAL queries must be tested against the same database engine
// they will run against in production. PostgreSQL-specific syntax
// (JSONB, CTEs, window functions) will simply fail on SQLite.

DBAL queries that use PostgreSQL-specific features — JSONB operators, CTEs, window functions, LATERAL joins — will fail against SQLite. This is correct behavior. Your tests should run against the same database engine as your production application. A test that passes against SQLite and fails against PostgreSQL has not tested anything useful.
Symfony's test kernel can be configured with a dedicated PostgreSQL test database. Fixtures inserted via DBAL are faster than ORM fixtures because they bypass hydration and the UnitOfWork — a pleasant symmetry. The test creates data with DBAL, reads it with DBAL, and asserts on the result. No ORM overhead in the test path either.
For integration tests that verify the full stack — controller to database and back — the same approach applies. The DBAL read repository is a concrete class with a real database connection. There is nothing to mock. The test verifies that the SQL is correct, that the parameters bind properly, and that the result structure matches expectations. If the SQL has a syntax error, the test fails. If a column name changes in a migration, the test fails. These are the failures you want to catch.
Avoiding the common migration mistakes
Moving read queries from ORM to DBAL is not a rewrite. It is a targeted refactor. But several mistakes show up repeatedly, and I would be remiss not to address them.
Mistake 1: converting all queries at once. Start with the queries that return the most rows or appear in the hottest code paths. Profile first. Your /api/orders endpoint returning 1,000 hydrated entities is a better candidate than your /admin/settings page loading three config rows. Batch conversions introduce risk without proportional benefit — a single mistyped column name in one of twenty converted queries can take hours to trace.
Mistake 2: losing Doctrine's type conversion. ORM entities apply type mappings automatically — a datetime column becomes a PHP \DateTime object, a json column becomes an array, a boolean column becomes a PHP bool. DBAL returns the raw values from the database driver: strings for timestamps, strings for JSON, and — depending on your PostgreSQL driver configuration — strings or booleans for boolean columns. Your DTO factory methods need to handle this conversion explicitly. The fromRow pattern shown earlier handles it cleanly, but forgetting it produces subtle type bugs that only manifest at serialization boundaries.
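A minimal sketch of the explicit conversion a DTO factory must perform. It assumes the pgsql driver's common behavior: timestamps and JSON arrive as strings, and booleans arrive either as native bools or as 't'/'f' strings depending on driver configuration. The OrderRow name and column set are illustrative.

```php
<?php
// A DTO that performs the type conversions the ORM would otherwise do.
final class OrderRow
{
    private function __construct(
        public readonly int $id,
        public readonly \DateTimeImmutable $createdAt,
        public readonly array $metadata,
        public readonly bool $flagged,
    ) {}

    /** @param array<string, mixed> $row raw DBAL associative row */
    public static function fromRow(array $row): self
    {
        return new self(
            id: (int) $row['id'],
            // pgsql returns timestamps as strings; the ORM's datetime
            // type mapping would have produced this object for you.
            createdAt: new \DateTimeImmutable($row['created_at']),
            // json/jsonb columns arrive as raw JSON strings.
            metadata: json_decode($row['metadata'], true, flags: \JSON_THROW_ON_ERROR),
            // booleans may arrive as PHP bools or as 't'/'f' strings.
            flagged: \is_bool($row['flagged']) ? $row['flagged'] : $row['flagged'] === 't',
        );
    }
}
```

Concentrating the conversion in one factory method means the serialization layer never sees a raw driver value.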
Mistake 3: bypassing the ORM for writes. DBAL can execute INSERT and UPDATE statements, and you might be tempted to use it for bulk writes to avoid ORM overhead. Resist this for domain entities. The ORM's lifecycle events, validation hooks, and flush atomicity exist because write operations have business rules that raw SQL bypasses. Use DBAL for writes only in infrastructure contexts: audit logs, analytics events, bulk imports of external data that bypasses domain logic intentionally.
Mistake 4: duplicating query logic. If an ORM query and a DBAL query both need to filter orders by the same complex status rules — "shipped in the last 30 days, not cancelled, not flagged for review" — that logic now exists in two places. Any change to the business rule must be applied to both. The mitigation is to centralize filter logic in shared methods or SQL builder helpers that both the ORM repository and the DBAL read repository can use. Do not allow the same business rule to drift between two independent implementations.
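One way to centralize the rule, sketched under assumptions: the filter is expressed once as a SQL fragment plus its parameters, which the DBAL side consumes directly. The ORM side can at minimum share the parameter values and date arithmetic through the same helper, so the two implementations cannot silently drift. The helper name and the exact status rules shown here are hypothetical.

```php
<?php
/**
 * The "active shipped orders" rule, defined in exactly one place.
 * Returns [sqlFragment, params]. The DBAL QueryBuilder applies the
 * fragment via ->andWhere(); a DQL repository consuming the same
 * params keeps at least the values and cutoff arithmetic in sync.
 */
function activeShippedOrdersFilter(\DateTimeImmutable $now): array
{
    $sql = 'status = :status'
        . ' AND cancelled_at IS NULL'
        . ' AND review_flag = false'
        . ' AND created_at >= :cutoff';

    return [$sql, [
        'status' => 'shipped',
        'cutoff' => $now->modify('-30 days')->format('Y-m-d'),
    ]];
}

// DBAL consumer:
// [$where, $params] = activeShippedOrdersFilter(new \DateTimeImmutable());
// $qb->andWhere($where);
// foreach ($params as $name => $value) { $qb->setParameter($name, $value); }
```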
Mistake 5: ignoring column name differences. ORM entities use camelCase property names that Doctrine maps to snake_case column names through naming strategy configuration. DBAL queries use actual column names from the database. If your naming strategy transforms createdAt to created_at, every DBAL query must use created_at. This seems obvious, but after an hour of converting queries, you will reach for createdAt out of muscle memory. The resulting error is at least clear about which query to fix, though note that PostgreSQL folds unquoted identifiers to lowercase, so it will read "column createdat does not exist" rather than echoing your camelCase spelling.
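A quick guard against the muscle-memory trap: a tiny converter mirroring the transformation of a lowercase underscore naming strategy, which is useful in a test that checks your DBAL query strings against real column names. This sketch assumes the default camelCase-to-snake_case convention with no custom strategy.

```php
<?php
// camelCase property name -> snake_case column name, mirroring what a
// lowercase underscore naming strategy produces for simple names.
function toColumnName(string $property): string
{
    return strtolower(preg_replace('/(?<=[a-z0-9])([A-Z])/', '_$1', $property));
}
```

Running every column reference in a converted query through this helper in a unit test catches the createdAt slip before PostgreSQL does.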
The honest counterpoint: what you lose
I have spent most of this article explaining what DBAL gains you. Permit me a moment of candor about what it costs.
You lose schema validation at query time. DQL validates entity property names against your mapping metadata. If you mistype a property name, Doctrine throws an exception with a helpful message. DBAL queries validate against the database at execution time. Misspell a column name, and you get a PostgreSQL error that may be less immediately clear. This is a development-time cost, not a runtime one, but it is real.
You lose the abstraction layer. DQL is theoretically database-agnostic — switch from PostgreSQL to MySQL, and your DQL queries still work (modulo dialect differences). DBAL's QueryBuilder maintains some portability, but raw SQL passed to executeQuery() is PostgreSQL-specific. If you ever migrate databases, every raw SQL string needs review. In practice, very few applications migrate databases after launch, and the PostgreSQL-specific features (CTEs, JSONB, window functions) are precisely why you chose PostgreSQL in the first place. But the coupling is real.
You lose automatic relationship resolution. DQL knows that o.customer means "join the customers table through the customer_id foreign key." DBAL requires you to specify the join condition explicitly: c.id = o.customer_id. For complex entity graphs with multiple association paths, reconstructing the join conditions manually is tedious and error-prone. The ORM's metadata is doing real work here — work you must replicate when using DBAL.
You add surface area for SQL injection. Doctrine's DQL parser inherently prevents SQL injection because it controls the entire query construction pipeline. DBAL's QueryBuilder with parameterized queries is equally safe. But raw SQL strings passed to executeQuery() can contain injection vulnerabilities if parameters are concatenated rather than bound. Always use parameter binding. Always. This is not negotiable.
These are real costs. They are manageable costs, in my estimation, and the performance and capability benefits outweigh them for read-heavy queries. But I would be a poor guide indeed if I presented only the favorable side of the ledger.
When the proxy makes the whole debate quieter
The DBAL-vs-ORM choice exists because ORM queries carry overhead that matters at scale. But the calculus changes when the SQL layer itself becomes faster.
Gold Lapel sits between your Symfony application and PostgreSQL, analyzing the queries that both ORM and DBAL send. When it detects a read query that runs repeatedly — a dashboard aggregation, a list endpoint, a report — it can create a materialized view transparently. The next time that query arrives, it reads from the precomputed view instead of scanning the base tables.
Consider the 5,000-row join query from our benchmark. With the ORM, it took 160ms: 5ms SQL execution, 155ms hydration. With DBAL, it took 8ms: 5ms SQL, 3ms array construction. With Gold Lapel's materialized view, the SQL drops to 1ms. The ORM total becomes 156ms. The DBAL total becomes 4ms.
The ORM overhead is still there — hydration costs what it costs. But the relative pain changes. When SQL execution is 5ms, spending 155ms on hydration feels absurd. When SQL execution is 1ms, spending 155ms on hydration is still wasteful, but the absolute wall time of 156ms might be acceptable for your use case. You are no longer rewriting queries to shave 5ms off the database — the proxy handled that.
Gold Lapel's auto-indexing helps both approaches equally. If your ORM-generated WHERE clause would benefit from a composite index, the proxy identifies it from traffic patterns and recommends (or creates) it. The same applies to DBAL queries. The optimization operates at the SQL level, agnostic to which PHP layer generated the query.
This does not eliminate the case for DBAL. Bulk exports, CTEs, window functions, and JSONB operations still require it. But for the middle ground — the list endpoints and aggregation queries where you are considering DBAL purely for speed — a proxy that makes the underlying SQL faster can make the rewrite unnecessary. The ORM query that took 160ms now takes 30ms. Good enough? For many teams, yes.
The honest position: use DBAL where it provides capabilities the ORM lacks. Use DBAL where hydration overhead is genuinely unacceptable. And for everything else, let the database run faster instead of rewriting the PHP.
A summary of the division of labor
The ORM and DBAL are not in competition. They are colleagues with different responsibilities, and the well-run application assigns each to its strengths.
The ORM excels at managing entity state: creating, modifying, and persisting domain objects with change tracking, cascade operations, and lifecycle events. It ensures consistency through the identity map and atomicity through the UnitOfWork's flush mechanism. For write operations and single-entity reads within transactional workflows, it earns its overhead many times over.
The DBAL excels at retrieving data: list views, aggregations, reports, analytics, and any query that uses PostgreSQL's advanced features. It returns arrays or DTOs at a fraction of the ORM's memory and time cost. For read operations that will be serialized, displayed, or exported, it is the appropriate tool.
The division is not a compromise. It is an architecture — one that Doctrine itself was designed to support, given that it ships both layers and makes both injectable in Symfony applications.
Your entities deserve the ORM's careful attention when their state is changing. Your read queries deserve the DBAL's efficient directness when they are simply fetching data. There is no conflict in this arrangement. There is only the recognition that different operations have different needs, and the tools to serve them are already in your composer.json.
Before you take your leave — the N+1 problem that plagues Doctrine's entity hydration is not unique to PHP. I have prepared a guide on detecting and resolving N+1 queries that applies the same discipline across any persistence layer.