Django PostgreSQL Optimization: Beyond the ORM
I regret to inform you that your ORM has been submitting queries on your behalf that may require a frank discussion.
How Django talks to PostgreSQL
Django's ORM translates Python code into SQL. Understanding what SQL it generates is the first optimization step — you cannot optimize what you cannot see.
A QuerySet is lazy. No SQL is sent to PostgreSQL until the QuerySet is evaluated:
# No SQL executed here — this is just building a query description
qs = Order.objects.filter(status='pending').select_related('customer')
# SQL is executed here, when you iterate
for order in qs:
    print(order.customer.name)
To see the actual SQL Django generates, use django.db.connection.queries in development:
from django.db import connection
# After executing some queries...
for query in connection.queries:
    print(query['sql'])
    print(query['time'])
For a more practical approach, install django-debug-toolbar. Its SQL panel shows every query with timing, duplicate detection, and the ability to run EXPLAIN on each query directly from the browser.
Django PostgreSQL optimization starts with understanding the SQL your ORM generates. select_related and prefetch_related eliminate N+1 queries, database functions push computation to PostgreSQL, and connection pooling prevents exhaustion under load.
Solving the N+1 query problem
The N+1 query problem is the single most common performance issue in Django applications. Address this first, before anything else.
Identifying N+1 queries
An N+1 query happens when you load a list of objects (1 query), then access a related object on each one (N additional queries):
# models.py
class Author(models.Model):
    name = models.CharField(max_length=200)
    email = models.EmailField()

class Book(models.Model):
    title = models.CharField(max_length=300)
    author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name='books')
    published_date = models.DateField()
    page_count = models.IntegerField()

class Tag(models.Model):
    name = models.CharField(max_length=100)
    books = models.ManyToManyField(Book, related_name='tags')
# This generates 1 + N queries (N = number of books)
books = Book.objects.all() # Query 1: SELECT * FROM books
for book in books:
    print(book.author.name) # Query 2..N+1: SELECT * FROM authors WHERE id = ?
If there are 100 books, this produces 101 queries. Each book.author triggers a separate database round trip. I have seen production views generating over 400 database round trips through precisely this pattern.
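To make the arithmetic concrete, here is a hypothetical in-memory sketch (plain Python, no Django) that counts round trips for the lazy and eager access patterns:

```python
# Hypothetical in-memory "database" that counts round trips
AUTHORS = {1: "Jane Smith", 2: "Alex Petrov"}
BOOKS = [(1, "Book A", 1), (2, "Book B", 2), (3, "Book C", 1)]  # (id, title, author_id)

class FakeDB:
    def __init__(self):
        self.queries = 0

    def fetch_books(self):
        self.queries += 1  # SELECT * FROM books
        return list(BOOKS)

    def fetch_author(self, author_id):
        self.queries += 1  # SELECT * FROM authors WHERE id = ?
        return AUTHORS[author_id]

    def fetch_authors(self, ids):
        self.queries += 1  # SELECT * FROM authors WHERE id IN (...)
        return {i: AUTHORS[i] for i in set(ids)}

# Lazy per-object access: 1 query for books, then 1 per book
db = FakeDB()
for _, _, author_id in db.fetch_books():
    db.fetch_author(author_id)
naive_queries = db.queries  # 1 + 3 = 4

# Eager loading: 1 query for books, 1 bulk query for all authors
db = FakeDB()
books = db.fetch_books()
authors = db.fetch_authors(a for _, _, a in books)
eager_queries = db.queries  # 2, however many books exist

print(naive_queries, eager_queries)  # 4 2
```

The eager count stays at 2 no matter how large BOOKS grows, which is exactly the property select_related and prefetch_related buy you.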
nplusone detects N+1 queries automatically during development:
# settings.py
INSTALLED_APPS = [
    ...
    'nplusone.ext.django',
]

MIDDLEWARE = [
    'nplusone.ext.django.NPlusOneMiddleware',
    ...
]

NPLUSONE_RAISE = True # Raise an exception on N+1 (dev only)
auto_explain on the PostgreSQL side logs execution plans for slow queries:
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 100 # Log queries slower than 100ms
auto_explain.log_analyze = true
auto_explain.log_buffers = true
See: N+1 query guide | N+1 query problem | DRF nested serializer N+1
select_related — JOIN-based eager loading
select_related follows ForeignKey and OneToOneField relationships, generating a SQL JOIN:
# 1 query instead of 101
books = Book.objects.select_related('author').all()
for book in books:
    print(book.author.name) # No additional query — author is already loaded
The generated SQL:
SELECT books.id, books.title, books.author_id, books.published_date, books.page_count,
authors.id, authors.name, authors.email
FROM books
INNER JOIN authors ON books.author_id = authors.id;
Chaining fetches multiple relationships in one query:
# Fetch both author and publisher in one JOIN
books = Book.objects.select_related('author', 'publisher').all()
# Follow author → company (if Author has a ForeignKey to Company)
books = Book.objects.select_related('author__company').all()
for book in books:
    print(book.author.company.name) # No extra queries
prefetch_related — separate query eager loading
prefetch_related handles ManyToManyField and reverse ForeignKey relationships with a separate query:
# 2 queries instead of N+1
books = Book.objects.prefetch_related('tags').all()
for book in books:
    for tag in book.tags.all(): # No additional query — tags are prefetched
        print(tag.name)
The generated SQL:
-- Query 1: Fetch all books
SELECT * FROM books;
-- Query 2: Fetch all tags for those books via the junction table
SELECT tags.*, books_tags.book_id
FROM tags
INNER JOIN books_tags ON tags.id = books_tags.tag_id
WHERE books_tags.book_id IN (1, 2, 3, 4, 5, ...);
Two queries total, regardless of how many books or tags exist.
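Under the hood, Django stitches the second query's rows back onto their parent objects in Python. A sketch of that grouping step, using hypothetical rows shaped like the junction-table JOIN above:

```python
from collections import defaultdict

# Hypothetical rows from query 2: tags.* plus books_tags.book_id
tag_rows = [
    (10, "python", 1),
    (11, "sql", 1),
    (11, "sql", 2),
    (12, "tuning", 3),
]  # (tag_id, tag_name, book_id)

# Django groups the prefetched rows by book_id in Python...
tags_by_book = defaultdict(list)
for tag_id, tag_name, book_id in tag_rows:
    tags_by_book[book_id].append(tag_name)

# ...so each book's tags.all() is served from this cache, not the database
print(tags_by_book[1])  # ['python', 'sql']
```

This is why iterating book.tags.all() after a prefetch costs zero queries: the result cache was filled when the queryset was first evaluated.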
Prefetch objects allow filtering the prefetched queryset:
from django.db.models import Prefetch
# Only prefetch tags that are "featured"
books = Book.objects.prefetch_related(
Prefetch('tags', queryset=Tag.objects.filter(name__startswith='featured'))
).all()
Choosing between select_related and prefetch_related
| Relationship Type | Use | Why |
|---|---|---|
| ForeignKey | select_related | One row to one related row. JOIN is efficient. |
| OneToOneField | select_related | One row to one related row. JOIN is efficient. |
| ManyToManyField | prefetch_related | One row to many related rows. JOIN causes row explosion. |
| Reverse ForeignKey | prefetch_related | One row to many related rows. Separate query with IN clause. |
The hybrid pattern combines both:
books = (
Book.objects
.select_related('author') # FK — JOIN is efficient
.prefetch_related('tags') # M2M — separate query
.prefetch_related('reviews') # Reverse FK — separate query
)
# Total: 3 queries regardless of result count
QuerySet optimization patterns
only() and defer() — column projection
By default, Django fetches all columns (SELECT *). When you only need a few fields:
# SELECT id, title FROM books (instead of SELECT *)
books = Book.objects.only('id', 'title')
# SELECT id, title, author_id, published_date, page_count FROM books
# (excludes the large "description" text field)
books = Book.objects.defer('description')
Accessing a deferred field triggers an additional query per object. Use only() and defer() when you're confident the omitted fields won't be accessed.
values() and values_list() — lightweight results
# Returns list of dictionaries — no model instantiation
Book.objects.values('id', 'title', 'author__name')
# [{'id': 1, 'title': 'The Art of SQL', 'author__name': 'Jane Smith'}, ...]
# Returns list of tuples — even lighter
Book.objects.values_list('id', 'title')
# [(1, 'The Art of SQL'), (2, 'Database Internals'), ...]
# Returns flat list of a single field
Book.objects.values_list('id', flat=True)
# [1, 2, 3, 4, 5, ...]
No model methods, no related object access, no save(). Use them for exports, API responses that don't need model logic, and feeding data into other queries.
Aggregation and annotation
aggregate() computes a summary across an entire queryset:
from django.db.models import Count, Sum, Avg, Max
# Single query: SELECT COUNT(*), SUM(page_count), AVG(page_count) FROM books
stats = Book.objects.aggregate(
total_books=Count('id'),
total_pages=Sum('page_count'),
avg_pages=Avg('page_count'),
longest_book=Max('page_count')
)
annotate() adds per-row computed values, pushing computation to PostgreSQL:
from django.db.models import Count
# Each author gets a book_count computed in SQL
authors = Author.objects.annotate(
book_count=Count('books')
).order_by('-book_count')
for author in authors:
    print(f"{author.name}: {author.book_count} books")
Subquery and OuterRef for correlated subqueries:
from django.db.models import Subquery, OuterRef
# Annotate each author with their most recent book's title
latest_book = (
Book.objects
.filter(author=OuterRef('pk'))
.order_by('-published_date')
.values('title')[:1]
)
authors = Author.objects.annotate(
latest_book_title=Subquery(latest_book)
)
Avoid computing aggregates in Python when PostgreSQL can do it in SQL. Counting 100,000 rows in Python means fetching and iterating over 100,000 objects. COUNT(*) in SQL returns a single integer.
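To pin down what the aggregate() call asks PostgreSQL to compute, here is a pure-Python restatement over a hypothetical page_count column (as an illustration of the semantics, not a recommendation to compute this way):

```python
# Hypothetical page_count values for four books
page_counts = [320, 540, 180, 410]

stats = {
    'total_books': len(page_counts),                   # COUNT(id)
    'total_pages': sum(page_counts),                   # SUM(page_count)
    'avg_pages': sum(page_counts) / len(page_counts),  # AVG(page_count)
    'longest_book': max(page_counts),                  # MAX(page_count)
}
print(stats['total_pages'], stats['avg_pages'])  # 1450 362.5
```

The difference in production is that PostgreSQL performs these loops over its own storage and returns only the four scalar results.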
See: Django annotate, subquery, and CTE
exists() and count() over len() and bool()
# Bad: fetches all rows, builds all objects, then counts them
total = len(Book.objects.filter(status='published'))
# Good: SELECT COUNT(*) — returns an integer, no object instantiation
total = Book.objects.filter(status='published').count()
# Bad: evaluates the entire queryset to check truthiness
if Book.objects.filter(author_id=5):
    print("Has books")
# Good: SELECT 1 FROM books WHERE author_id = 5 LIMIT 1
if Book.objects.filter(author_id=5).exists():
    print("Has books")
exists() generates SELECT 1 ... LIMIT 1. count() generates SELECT COUNT(*). Both are dramatically faster than fetching and counting all results in Python.
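The two query shapes can be demonstrated with the standard library, using SQLite as a stand-in for PostgreSQL (the table and values here are hypothetical):

```python
import sqlite3

# In-memory SQLite stand-in for PostgreSQL
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE books (id INTEGER, author_id INTEGER)')
conn.executemany('INSERT INTO books VALUES (?, ?)',
                 [(i, 5) for i in range(1000)])

# The .count() shape: only a single integer crosses the wire
(total,) = conn.execute(
    'SELECT COUNT(*) FROM books WHERE author_id = ?', (5,)
).fetchone()

# The .exists() shape: at most one row, stopping at the first match
has_books = conn.execute(
    'SELECT 1 FROM books WHERE author_id = ? LIMIT 1', (5,)
).fetchone() is not None

print(total, has_books)  # 1000 True
```

Either way, the database does the scanning and the application receives a scalar, not a thousand rows.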
Bulk operations
bulk_create inserts many objects in a single INSERT:
from datetime import date

books = [
    Book(title=f"Book {i}", author=author, published_date=date(2026, 1, 1), page_count=200)
    for i in range(10000)
]
# 1 INSERT with multiple value sets (or batched)
Book.objects.bulk_create(books, batch_size=1000)
QuerySet update() and delete() operate at the SQL level without loading objects:
# SQL-level UPDATE — no objects loaded
Book.objects.filter(status='draft').update(status='archived')
# SQL-level DELETE — no objects loaded
Book.objects.filter(status='archived', updated_at__lt='2025-01-01').delete()
Database functions and expressions
Pushing computation to PostgreSQL
Let the database do the work it was designed for. F() expressions are particularly important for atomic updates:
from django.db.models import Avg, F
# F() expressions for database-level arithmetic. avg_page_count is not a
# field on Book, so annotate it first, then filter against it with F()
Book.objects.annotate(
    avg_page_count=Avg('author__books__page_count')
).filter(page_count__gt=F('avg_page_count') * 2)
# Update with F() — no race conditions
Book.objects.filter(id=1).update(view_count=F('view_count') + 1)
update(view_count=F('view_count') + 1) generates SET view_count = view_count + 1 in SQL, which is atomic. The Python alternative — book.view_count += 1; book.save() — is a race condition under load.
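The lost update is easiest to see as a deterministic replay of the interleaving, sketched here without any database at all:

```python
# Deterministic replay of the read-modify-write race that
# book.view_count += 1; book.save() allows under concurrency
view_count = 5  # value stored in the database

# Two requests each read the row before either has saved
read_by_request_a = view_count
read_by_request_b = view_count

# Each computes +1 in Python and writes the result back;
# the second save overwrites the first, losing one increment
view_count = read_by_request_a + 1
view_count = read_by_request_b + 1

print(view_count)  # 6, not the 7 that two atomic SQL increments would give
```

With SET view_count = view_count + 1, PostgreSQL performs the read and the write under the row lock, so both increments survive.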
PostgreSQL-specific features in Django
Full-text search:
from django.contrib.postgres.search import (
SearchVector, SearchQuery, SearchRank, SearchHeadline
)
# Basic full-text search
results = Book.objects.annotate(
search=SearchVector('title', 'description')
).filter(search=SearchQuery('database optimization'))
# Ranked search with headline
query = SearchQuery('database optimization')
results = Book.objects.annotate(
rank=SearchRank(SearchVector('title', 'description'), query),
headline=SearchHeadline('description', query)
).filter(rank__gte=0.1).order_by('-rank')
Trigram similarity for fuzzy matching (requires pg_trgm):
from django.contrib.postgres.search import TrigramSimilarity
# Find books with titles similar to the search term
results = Book.objects.annotate(
similarity=TrigramSimilarity('title', 'postgre optimzation') # note the typo
).filter(similarity__gt=0.3).order_by('-similarity')
GIN indexes on ArrayField, JSONField, and SearchVectorField:
from django.contrib.postgres.fields import ArrayField
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVectorField
from django.db import models

class Book(models.Model):
    title = models.CharField(max_length=300)
    tags = ArrayField(models.CharField(max_length=50), default=list)
    metadata = models.JSONField(default=dict)
    search_vector = SearchVectorField(null=True)

    class Meta:
        indexes = [
            GinIndex(fields=['tags']),
            GinIndex(fields=['metadata']),
            GinIndex(fields=['search_vector']),
        ]
See: Django full-text search | GIN index | JSONB
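The scoring behind the trigram example above can be approximated in plain Python. This is a simplified sketch of pg_trgm's approach (the real extension also strips non-alphanumeric characters and handles padding slightly differently):

```python
def trigrams(s):
    # Simplified pg_trgm-style extraction: lowercase each word and
    # pad it with two leading spaces and one trailing space
    grams = set()
    for word in s.lower().split():
        padded = f"  {word} "
        grams |= {padded[i:i + 3] for i in range(len(padded) - 2)}
    return grams

def similarity(a, b):
    # pg_trgm scores by shared trigrams over the union of trigrams
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta or tb else 0.0

print(round(similarity('postgresql', 'postgre'), 2))  # 0.58, above the 0.3 cutoff
```

Because the score depends on shared three-character windows rather than exact equality, typos like the one in the example still land above the similarity threshold.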
Custom SQL with connection.cursor()
When the ORM cannot express the query you need, drop to raw SQL with parameterized queries:
from django.db import connection

def get_monthly_revenue(year):
    with connection.cursor() as cursor:
        cursor.execute("""
            WITH monthly AS (
                SELECT
                    DATE_TRUNC('month', created_at) AS month,
                    SUM(total) AS revenue,
                    COUNT(*) AS order_count
                FROM orders
                WHERE EXTRACT(YEAR FROM created_at) = %s
                  AND status = 'completed'
                GROUP BY DATE_TRUNC('month', created_at)
            )
            SELECT
                month,
                revenue,
                order_count,
                revenue - LAG(revenue) OVER (ORDER BY month) AS revenue_change
            FROM monthly
            ORDER BY month;
        """, [year])
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
Connection pool sizing for Django
Django's default connection behavior
By default, Django opens a new database connection at the start of each request and closes it at the end. Setting CONN_MAX_AGE controls reuse:
# settings.py — default behavior (new connection per request)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'myapp',
        'HOST': 'db.example.com',
        'PORT': '5432',
        'CONN_MAX_AGE': 0, # Close after each request (default)
    }
}
PgBouncer with Django
PgBouncer in transaction mode is the most common setup with Django. Set CONN_MAX_AGE=0 when using PgBouncer: connecting to the pooler is cheap, so persistent Django connections buy you little, and holding them open needlessly ties up PgBouncer client slots:
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'myapp',
        'HOST': '127.0.0.1',
        'PORT': '6432', # PgBouncer port, not PostgreSQL
        'CONN_MAX_AGE': 0, # Let PgBouncer manage persistence
    }
}
Read replica routing
Django's DATABASE_ROUTERS
Django supports multiple database connections and a router system to direct queries:
# routers.py
class PrimaryReplicaRouter:
    def db_for_read(self, model, **hints):
        return 'replica'

    def db_for_write(self, model, **hints):
        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return db == 'default'
With this router, all SELECT queries go to the replica and all writes go to the primary. Register it via the DATABASE_ROUTERS setting; no application code changes are required.
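The router only takes effect once both database aliases are defined and the router is listed in settings. A sketch, where the module path myapp.routers is hypothetical:

```python
# settings.py — routers are consulted in order for every query
DATABASES = {
    'default': {...},  # primary (writes)
    'replica': {...},  # read replica
}
DATABASE_ROUTERS = ['myapp.routers.PrimaryReplicaRouter']
```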
Replication lag considerations
After writing to the primary, reading from the replica immediately may return stale data. Session-sticky routing solves this:
# routers.py
import time
from threading import local
_thread_locals = local()
class LagAwareRouter:
    STICKY_SECONDS = 2 # Route reads to primary for 2s after a write

    def db_for_read(self, model, **hints):
        last_write = getattr(_thread_locals, 'last_write_time', 0)
        if time.time() - last_write < self.STICKY_SECONDS:
            return 'default' # Read from primary during sticky window
        return 'replica'

    def db_for_write(self, model, **hints):
        _thread_locals.last_write_time = time.time()
        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return db == 'default'
Indexing for Django models
Meta.indexes — declarative index definition
Proper indexing is the single most underused optimization in Django applications:
from django.db import models
from django.contrib.postgres.indexes import GinIndex
class Order(models.Model):
    customer = models.ForeignKey('Customer', on_delete=models.CASCADE)
    status = models.CharField(max_length=20, db_index=True)
    total = models.DecimalField(max_digits=10, decimal_places=2)
    created_at = models.DateTimeField(auto_now_add=True)
    metadata = models.JSONField(default=dict)

    class Meta:
        indexes = [
            # Composite index for common query pattern
            models.Index(
                fields=['status', 'created_at'],
                name='idx_order_status_created'
            ),
            # Partial index — only index active orders
            models.Index(
                fields=['created_at'],
                name='idx_order_active_created',
                condition=models.Q(status='active')
            ),
            # GIN index on JSONB field
            GinIndex(fields=['metadata'], name='idx_order_metadata_gin'),
        ]
Zero-downtime index creation
Use AddIndexConcurrently from django.contrib.postgres.operations:
from django.contrib.postgres.operations import AddIndexConcurrently
from django.db import migrations, models

class Migration(migrations.Migration):
    atomic = False # Required for CONCURRENTLY

    operations = [
        AddIndexConcurrently(
            model_name='order',
            index=models.Index(
                fields=['status', 'created_at'],
                name='idx_order_status_created'
            ),
        ),
    ]
See: Composite indexes | Partial indexes | Django zero-downtime migrations
Monitoring Django's database performance
Development tools
django-debug-toolbar is indispensable:
# settings.py (development only)
INSTALLED_APPS = [
    ...
    'debug_toolbar',
]

MIDDLEWARE = [
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    ...
]

INTERNAL_IPS = ['127.0.0.1']
Production monitoring
pg_stat_statements — the most important production monitoring tool:
-- Top 10 queries by total execution time
SELECT
LEFT(query, 100) AS query_preview,
calls,
total_exec_time / 1000 AS total_seconds,
mean_exec_time AS mean_ms,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
A query that runs in 50ms but is called 100,000 times per hour consumes more total time than a 5-second query called once per hour. The quiet, frequent query is often the one that deserves your attention most.
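The arithmetic behind that claim, using the numbers from the text:

```python
# Hourly workload: one chatty fast query, one rare slow query
calls_fast, mean_fast_ms = 100_000, 50  # 50 ms query, 100,000 calls/hour
calls_slow, mean_slow_ms = 1, 5_000     # 5 s query, 1 call/hour

total_fast_s = calls_fast * mean_fast_ms / 1000  # 5000 s of DB time/hour
total_slow_s = calls_slow * mean_slow_ms / 1000  # 5 s of DB time/hour
print(total_fast_s / total_slow_s)  # 1000.0
```

This is why the query above orders by total_exec_time rather than mean_exec_time: totals surface the queries that actually dominate the database's hour.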
See: pg_stat_statements | auto_explain
The ORM is a capable tool, and Django's in particular is one of the finest available. But it works best when you understand what it is doing on your behalf. The techniques in this guide — eager loading, column projection, bulk operations, database functions, proper indexing, connection pooling — are not advanced. They are foundational. And the performance difference between a Django application that applies them and one that does not is, in my experience, measured not in percentages but in orders of magnitude.
The database was not slow. The ORM was speaking on your behalf without your full instructions. Now that you know what to ask for, the conversation will go rather differently. For PostgreSQL-level tuning that complements these ORM patterns, see the PostgreSQL performance tuning guide.