Django PostgreSQL Optimization: Beyond the ORM
I regret to inform you that your ORM has been submitting queries on your behalf that may require a frank discussion.
How Django talks to PostgreSQL
Django's ORM translates Python code into SQL. Understanding what SQL it generates is the first optimization step — you cannot optimize what you cannot see.
A QuerySet is lazy. No SQL is sent to PostgreSQL until the QuerySet is evaluated:
# No SQL executed here — this is just building a query description
qs = Order.objects.filter(status='pending').select_related('customer')
# SQL is executed here, when you iterate
for order in qs:
    print(order.customer.name)
To see the actual SQL Django generates, use django.db.connection.queries in development:
from django.db import connection
# After executing some queries...
for query in connection.queries:
    print(query['sql'])
    print(query['time'])
For a more practical approach, install django-debug-toolbar. Its SQL panel shows every query with timing, duplicate detection, and the ability to run EXPLAIN on each query directly from the browser.
Django PostgreSQL optimization starts with understanding the SQL your ORM generates. select_related and prefetch_related eliminate N+1 queries, database functions push computation to PostgreSQL, and connection pooling prevents exhaustion under load.
Solving the N+1 query problem
The N+1 query problem is the single most common performance issue in Django applications. Address this first, before anything else.
Identifying N+1 queries
An N+1 query happens when you load a list of objects (1 query), then access a related object on each one (N additional queries):
# models.py
class Author(models.Model):
    name = models.CharField(max_length=200)
    email = models.EmailField()

class Book(models.Model):
    title = models.CharField(max_length=300)
    author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name='books')
    published_date = models.DateField()
    page_count = models.IntegerField()

class Tag(models.Model):
    name = models.CharField(max_length=100)
    books = models.ManyToManyField(Book, related_name='tags')
# This generates 1 + N queries (N = number of books)
books = Book.objects.all() # Query 1: SELECT * FROM books
for book in books:
    print(book.author.name) # Query 2..N+1: SELECT * FROM authors WHERE id = ?
If there are 100 books, this produces 101 queries. Each book.author triggers a separate database round trip. I have seen production views generating over 400 database round trips through precisely this pattern.
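To make the arithmetic concrete, here is a hypothetical in-memory sketch (plain Python, no Django) that counts round trips for the lazy and eager access patterns:

```python
# Hypothetical in-memory "database" that counts round trips
AUTHORS = {1: "Jane Smith", 2: "Alex Petrov"}
BOOKS = [(1, "Book A", 1), (2, "Book B", 2), (3, "Book C", 1)]  # (id, title, author_id)

class FakeDB:
    def __init__(self):
        self.queries = 0

    def fetch_books(self):
        self.queries += 1  # SELECT * FROM books
        return list(BOOKS)

    def fetch_author(self, author_id):
        self.queries += 1  # SELECT * FROM authors WHERE id = ?
        return AUTHORS[author_id]

    def fetch_authors(self, ids):
        self.queries += 1  # SELECT * FROM authors WHERE id IN (...)
        return {i: AUTHORS[i] for i in set(ids)}

# Lazy per-object access: 1 query for books, then 1 per book
db = FakeDB()
for _, _, author_id in db.fetch_books():
    db.fetch_author(author_id)
naive_queries = db.queries  # 1 + 3 = 4

# Eager loading: 1 query for books, 1 bulk query for all authors
db = FakeDB()
books = db.fetch_books()
authors = db.fetch_authors(a for _, _, a in books)
eager_queries = db.queries  # 2, however many books exist

print(naive_queries, eager_queries)  # 4 2
```

The eager count stays at 2 no matter how large BOOKS grows, which is exactly the property select_related and prefetch_related buy you.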
nplusone detects N+1 queries automatically during development:
# settings.py
INSTALLED_APPS = [
    ...
    'nplusone.ext.django',
]

MIDDLEWARE = [
    'nplusone.ext.django.NPlusOneMiddleware',
    ...
]

NPLUSONE_RAISE = True # Raise an exception on N+1 (dev only)
auto_explain on the PostgreSQL side logs execution plans for slow queries:
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 100 # Log queries slower than 100ms
auto_explain.log_analyze = true
auto_explain.log_buffers = true
See: N+1 query guide | N+1 query problem | DRF nested serializer N+1
select_related — JOIN-based eager loading
select_related follows ForeignKey and OneToOneField relationships, generating a SQL JOIN:
# 1 query instead of 101
books = Book.objects.select_related('author').all()
for book in books:
    print(book.author.name) # No additional query — author is already loaded
The generated SQL:
SELECT books.id, books.title, books.author_id, books.published_date, books.page_count,
authors.id, authors.name, authors.email
FROM books
INNER JOIN authors ON books.author_id = authors.id;
Chaining fetches multiple relationships in one query:
# Fetch both author and publisher in one JOIN
books = Book.objects.select_related('author', 'publisher').all()
# Follow author → company (if Author has a ForeignKey to Company)
books = Book.objects.select_related('author__company').all()
for book in books:
    print(book.author.company.name) # No extra queries
prefetch_related — separate query eager loading
prefetch_related handles ManyToManyField and reverse ForeignKey relationships with a separate query:
# 2 queries instead of N+1
books = Book.objects.prefetch_related('tags').all()
for book in books:
    for tag in book.tags.all(): # No additional query — tags are prefetched
        print(tag.name)
The generated SQL:
-- Query 1: Fetch all books
SELECT * FROM books;
-- Query 2: Fetch all tags for those books via the junction table
SELECT tags.*, books_tags.book_id
FROM tags
INNER JOIN books_tags ON tags.id = books_tags.tag_id
WHERE books_tags.book_id IN (1, 2, 3, 4, 5, ...);
Two queries total, regardless of how many books or tags exist.
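Under the hood, Django stitches the second query's rows back onto their parent objects in Python. A sketch of that grouping step, using hypothetical rows shaped like the junction-table JOIN above:

```python
from collections import defaultdict

# Hypothetical rows from query 2: tags.* plus books_tags.book_id
tag_rows = [
    (10, "python", 1),
    (11, "sql", 1),
    (11, "sql", 2),
    (12, "tuning", 3),
]  # (tag_id, tag_name, book_id)

# Django groups the prefetched rows by book_id in Python...
tags_by_book = defaultdict(list)
for tag_id, tag_name, book_id in tag_rows:
    tags_by_book[book_id].append(tag_name)

# ...so each book's tags.all() is served from this cache, not the database
print(tags_by_book[1])  # ['python', 'sql']
```

This is why iterating book.tags.all() after a prefetch costs zero queries: the result cache was filled when the queryset was first evaluated.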
Prefetch objects allow filtering the prefetched queryset:
from django.db.models import Prefetch
# Only prefetch tags that are "featured"
books = Book.objects.prefetch_related(
Prefetch('tags', queryset=Tag.objects.filter(name__startswith='featured'))
).all()
Choosing between select_related and prefetch_related
| Relationship Type | Use | Why |
|---|---|---|
| ForeignKey | select_related | One row to one related row. JOIN is efficient. |
| OneToOneField | select_related | One row to one related row. JOIN is efficient. |
| ManyToManyField | prefetch_related | One row to many related rows. JOIN causes row explosion. |
| Reverse ForeignKey | prefetch_related | One row to many related rows. Separate query with IN clause. |
The hybrid pattern combines both:
books = (
Book.objects
.select_related('author') # FK — JOIN is efficient
.prefetch_related('tags') # M2M — separate query
.prefetch_related('reviews') # Reverse FK — separate query
)
# Total: 3 queries regardless of result count
QuerySet optimization patterns
only() and defer() — column projection
By default, Django fetches all columns (SELECT *). When you only need a few fields:
# SELECT id, title FROM books (instead of SELECT *)
books = Book.objects.only('id', 'title')
# SELECT id, title, author_id, published_date, page_count FROM books
# (excludes the large "description" text field)
books = Book.objects.defer('description')
Accessing a deferred field triggers an additional query per object. Use only() and defer() when you're confident the omitted fields won't be accessed.
values() and values_list() — lightweight results
# Returns list of dictionaries — no model instantiation
Book.objects.values('id', 'title', 'author__name')
# [{'id': 1, 'title': 'The Art of SQL', 'author__name': 'Jane Smith'}, ...]
# Returns list of tuples — even lighter
Book.objects.values_list('id', 'title')
# [(1, 'The Art of SQL'), (2, 'Database Internals'), ...]
# Returns flat list of a single field
Book.objects.values_list('id', flat=True)
# [1, 2, 3, 4, 5, ...]
No model methods, no related object access, no save(). Use them for exports, API responses that don't need model logic, and feeding data into other queries.
Aggregation and annotation
aggregate() computes a summary across an entire queryset:
from django.db.models import Count, Sum, Avg, Max
# Single query: SELECT COUNT(*), SUM(page_count), AVG(page_count) FROM books
stats = Book.objects.aggregate(
total_books=Count('id'),
total_pages=Sum('page_count'),
avg_pages=Avg('page_count'),
longest_book=Max('page_count')
)
annotate() adds per-row computed values, pushing computation to PostgreSQL:
from django.db.models import Count
# Each author gets a book_count computed in SQL
authors = Author.objects.annotate(
book_count=Count('books')
).order_by('-book_count')
for author in authors:
    print(f"{author.name}: {author.book_count} books")
Subquery and OuterRef for correlated subqueries:
from django.db.models import Subquery, OuterRef
# Annotate each author with their most recent book's title
latest_book = (
Book.objects
.filter(author=OuterRef('pk'))
.order_by('-published_date')
.values('title')[:1]
)
authors = Author.objects.annotate(
latest_book_title=Subquery(latest_book)
)
Avoid computing aggregates in Python when PostgreSQL can do it in SQL. Counting 100,000 rows in Python means fetching and iterating over 100,000 objects. COUNT(*) in SQL returns a single integer.
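To pin down what the aggregate() call asks PostgreSQL to compute, here is a pure-Python restatement over a hypothetical page_count column (as an illustration of the semantics, not a recommendation to compute this way):

```python
# Hypothetical page_count values for four books
page_counts = [320, 540, 180, 410]

stats = {
    'total_books': len(page_counts),                   # COUNT(id)
    'total_pages': sum(page_counts),                   # SUM(page_count)
    'avg_pages': sum(page_counts) / len(page_counts),  # AVG(page_count)
    'longest_book': max(page_counts),                  # MAX(page_count)
}
print(stats['total_pages'], stats['avg_pages'])  # 1450 362.5
```

The difference in production is that PostgreSQL performs these loops over its own storage and returns only the four scalar results.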
See: Django annotate, subquery, and CTE
exists() and count() over len() and bool()
# Bad: fetches all rows, builds all objects, then counts them
total = len(Book.objects.filter(status='published'))
# Good: SELECT COUNT(*) — returns an integer, no object instantiation
total = Book.objects.filter(status='published').count()
# Bad: evaluates the entire queryset to check truthiness
if Book.objects.filter(author_id=5):
    print("Has books")
# Good: SELECT 1 FROM books WHERE author_id = 5 LIMIT 1
if Book.objects.filter(author_id=5).exists():
    print("Has books")
exists() generates SELECT 1 ... LIMIT 1. count() generates SELECT COUNT(*). Both are dramatically faster than fetching and counting all results in Python.
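The two query shapes can be demonstrated with the standard library, using SQLite as a stand-in for PostgreSQL (the table and values here are hypothetical):

```python
import sqlite3

# In-memory SQLite stand-in for PostgreSQL
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE books (id INTEGER, author_id INTEGER)')
conn.executemany('INSERT INTO books VALUES (?, ?)',
                 [(i, 5) for i in range(1000)])

# The .count() shape: only a single integer crosses the wire
(total,) = conn.execute(
    'SELECT COUNT(*) FROM books WHERE author_id = ?', (5,)
).fetchone()

# The .exists() shape: at most one row, stopping at the first match
has_books = conn.execute(
    'SELECT 1 FROM books WHERE author_id = ? LIMIT 1', (5,)
).fetchone() is not None

print(total, has_books)  # 1000 True
```

Either way, the database does the scanning and the application receives a scalar, not a thousand rows.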
Bulk operations
bulk_create inserts many objects in a single INSERT:
from datetime import date

books = [
    Book(title=f"Book {i}", author=author, published_date=date(2026, 1, 1), page_count=200)
    for i in range(10000)
]
# 1 INSERT with multiple value sets (or batched)
Book.objects.bulk_create(books, batch_size=1000)
QuerySet update() and delete() operate at the SQL level without loading objects:
# SQL-level UPDATE — no objects loaded
Book.objects.filter(status='draft').update(status='archived')
# SQL-level DELETE — no objects loaded
Book.objects.filter(status='archived', updated_at__lt='2025-01-01').delete()
Database functions and expressions
Pushing computation to PostgreSQL
Let the database do the work it was designed for. F() expressions are particularly important for atomic updates:
from django.db.models import Avg, F
# F() expressions for database-level arithmetic. avg_page_count is not a
# field on Book, so annotate it first, then filter against it with F()
Book.objects.annotate(
    avg_page_count=Avg('author__books__page_count')
).filter(page_count__gt=F('avg_page_count') * 2)
# Update with F() — no race conditions
Book.objects.filter(id=1).update(view_count=F('view_count') + 1)
update(view_count=F('view_count') + 1) generates SET view_count = view_count + 1 in SQL, which is atomic. The Python alternative — book.view_count += 1; book.save() — is a race condition under load.
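The lost update is easiest to see as a deterministic replay of the interleaving, sketched here without any database at all:

```python
# Deterministic replay of the read-modify-write race that
# book.view_count += 1; book.save() allows under concurrency
view_count = 5  # value stored in the database

# Two requests each read the row before either has saved
read_by_request_a = view_count
read_by_request_b = view_count

# Each computes +1 in Python and writes the result back;
# the second save overwrites the first, losing one increment
view_count = read_by_request_a + 1
view_count = read_by_request_b + 1

print(view_count)  # 6, not the 7 that two atomic SQL increments would give
```

With SET view_count = view_count + 1, PostgreSQL performs the read and the write under the row lock, so both increments survive.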
PostgreSQL-specific features in Django
Full-text search:
from django.contrib.postgres.search import (
SearchVector, SearchQuery, SearchRank, SearchHeadline
)
# Basic full-text search
results = Book.objects.annotate(
search=SearchVector('title', 'description')
).filter(search=SearchQuery('database optimization'))
# Ranked search with headline
query = SearchQuery('database optimization')
results = Book.objects.annotate(
rank=SearchRank(SearchVector('title', 'description'), query),
headline=SearchHeadline('description', query)
).filter(rank__gte=0.1).order_by('-rank')
Trigram similarity for fuzzy matching (requires pg_trgm):
from django.contrib.postgres.search import TrigramSimilarity
# Find books with titles similar to the search term
results = Book.objects.annotate(
similarity=TrigramSimilarity('title', 'postgre optimzation') # note the typo
).filter(similarity__gt=0.3).order_by('-similarity')
GIN indexes on ArrayField, JSONField, and SearchVectorField:
from django.contrib.postgres.fields import ArrayField
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVectorField
from django.db import models

class Book(models.Model):
    title = models.CharField(max_length=300)
    tags = ArrayField(models.CharField(max_length=50), default=list)
    metadata = models.JSONField(default=dict)
    search_vector = SearchVectorField(null=True)

    class Meta:
        indexes = [
            GinIndex(fields=['tags']),
            GinIndex(fields=['metadata']),
            GinIndex(fields=['search_vector']),
        ]
See: Django full-text search | GIN index | JSONB
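The scoring behind the trigram example above can be approximated in plain Python. This is a simplified sketch of pg_trgm's approach (the real extension also strips non-alphanumeric characters and handles padding slightly differently):

```python
def trigrams(s):
    # Simplified pg_trgm-style extraction: lowercase each word and
    # pad it with two leading spaces and one trailing space
    grams = set()
    for word in s.lower().split():
        padded = f"  {word} "
        grams |= {padded[i:i + 3] for i in range(len(padded) - 2)}
    return grams

def similarity(a, b):
    # pg_trgm scores by shared trigrams over the union of trigrams
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta or tb else 0.0

print(round(similarity('postgresql', 'postgre'), 2))  # 0.58, above the 0.3 cutoff
```

Because the score depends on shared three-character windows rather than exact equality, typos like the one in the example still land above the similarity threshold.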
Custom SQL with connection.cursor()
When the ORM cannot express the query you need, drop to raw SQL with parameterized queries:
from django.db import connection

def get_monthly_revenue(year):
    with connection.cursor() as cursor:
        cursor.execute("""
            WITH monthly AS (
                SELECT
                    DATE_TRUNC('month', created_at) AS month,
                    SUM(total) AS revenue,
                    COUNT(*) AS order_count
                FROM orders
                WHERE EXTRACT(YEAR FROM created_at) = %s
                  AND status = 'completed'
                GROUP BY DATE_TRUNC('month', created_at)
            )
            SELECT
                month,
                revenue,
                order_count,
                revenue - LAG(revenue) OVER (ORDER BY month) AS revenue_change
            FROM monthly
            ORDER BY month;
        """, [year])
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
Connection pool sizing for Django
Django's default connection behavior
By default, Django opens a new database connection at the start of each request and closes it at the end. Setting CONN_MAX_AGE controls reuse:
# settings.py — default behavior (new connection per request)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'myapp',
        'HOST': 'db.example.com',
        'PORT': '5432',
        'CONN_MAX_AGE': 0, # Close after each request (default)
    }
}
PgBouncer with Django
PgBouncer in transaction mode is the most common setup with Django. Set CONN_MAX_AGE=0 when using PgBouncer: connecting to the pooler is cheap, so persistent Django connections buy you little, and holding them open needlessly ties up PgBouncer client slots:
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'myapp',
        'HOST': '127.0.0.1',
        'PORT': '6432', # PgBouncer port, not PostgreSQL
        'CONN_MAX_AGE': 0, # Let PgBouncer manage persistence
    }
}
Read replica routing
Django's DATABASE_ROUTERS
Django supports multiple database connections and a router system to direct queries:
# routers.py
class PrimaryReplicaRouter:
    def db_for_read(self, model, **hints):
        return 'replica'

    def db_for_write(self, model, **hints):
        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return db == 'default'
With this router, all SELECT queries go to the replica and all writes go to the primary. Register it via the DATABASE_ROUTERS setting; no application code changes are required.
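The router only takes effect once both database aliases are defined and the router is listed in settings. A sketch, where the module path myapp.routers is hypothetical:

```python
# settings.py — routers are consulted in order for every query
DATABASES = {
    'default': {...},  # primary (writes)
    'replica': {...},  # read replica
}
DATABASE_ROUTERS = ['myapp.routers.PrimaryReplicaRouter']
```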
Replication lag considerations
After writing to the primary, reading from the replica immediately may return stale data. Session-sticky routing solves this:
# routers.py
import time
from threading import local
_thread_locals = local()
class LagAwareRouter:
    STICKY_SECONDS = 2 # Route reads to primary for 2s after a write

    def db_for_read(self, model, **hints):
        last_write = getattr(_thread_locals, 'last_write_time', 0)
        if time.time() - last_write < self.STICKY_SECONDS:
            return 'default' # Read from primary during sticky window
        return 'replica'

    def db_for_write(self, model, **hints):
        _thread_locals.last_write_time = time.time()
        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return db == 'default'
Indexing for Django models
Meta.indexes — declarative index definition
Proper indexing is the single most underused optimization in Django applications:
from django.db import models
from django.contrib.postgres.indexes import GinIndex
class Order(models.Model):
    customer = models.ForeignKey('Customer', on_delete=models.CASCADE)
    status = models.CharField(max_length=20, db_index=True)
    total = models.DecimalField(max_digits=10, decimal_places=2)
    created_at = models.DateTimeField(auto_now_add=True)
    metadata = models.JSONField(default=dict)

    class Meta:
        indexes = [
            # Composite index for common query pattern
            models.Index(
                fields=['status', 'created_at'],
                name='idx_order_status_created'
            ),
            # Partial index — only index active orders
            models.Index(
                fields=['created_at'],
                name='idx_order_active_created',
                condition=models.Q(status='active')
            ),
            # GIN index on JSONB field
            GinIndex(fields=['metadata'], name='idx_order_metadata_gin'),
        ]
Zero-downtime index creation
Use AddIndexConcurrently from django.contrib.postgres.operations:
from django.contrib.postgres.operations import AddIndexConcurrently
from django.db import migrations, models

class Migration(migrations.Migration):
    atomic = False # Required for CONCURRENTLY

    operations = [
        AddIndexConcurrently(
            model_name='order',
            index=models.Index(
                fields=['status', 'created_at'],
                name='idx_order_status_created'
            ),
        ),
    ]
See: Composite indexes | Partial indexes | Django zero-downtime migrations
Monitoring Django's database performance
Development tools
django-debug-toolbar is indispensable:
# settings.py (development only)
INSTALLED_APPS = [
    ...
    'debug_toolbar',
]

MIDDLEWARE = [
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    ...
]

INTERNAL_IPS = ['127.0.0.1']
Production monitoring
pg_stat_statements — the most important production monitoring tool:
-- Top 10 queries by total execution time
SELECT
LEFT(query, 100) AS query_preview,
calls,
total_exec_time / 1000 AS total_seconds,
mean_exec_time AS mean_ms,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
A query that runs in 50ms but is called 100,000 times per hour consumes more total time than a 5-second query called once per hour. The quiet, frequent query is often the one that deserves your attention most.
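The arithmetic behind that claim, using the numbers from the text:

```python
# Hourly workload: one chatty fast query, one rare slow query
calls_fast, mean_fast_ms = 100_000, 50  # 50 ms query, 100,000 calls/hour
calls_slow, mean_slow_ms = 1, 5_000     # 5 s query, 1 call/hour

total_fast_s = calls_fast * mean_fast_ms / 1000  # 5000 s of DB time/hour
total_slow_s = calls_slow * mean_slow_ms / 1000  # 5 s of DB time/hour
print(total_fast_s / total_slow_s)  # 1000.0
```

This is why the query above orders by total_exec_time rather than mean_exec_time: totals surface the queries that actually dominate the database's hour.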
See: pg_stat_statements | auto_explain
The ORM is a capable tool, and Django's in particular is one of the finest available. But it works best when you understand what it is doing on your behalf. The techniques in this guide — eager loading, column projection, bulk operations, database functions, proper indexing, connection pooling — are not advanced. They are foundational. And the performance difference between a Django application that applies them and one that does not is, in my experience, measured not in percentages but in orders of magnitude.
The database was not slow. The ORM was speaking on your behalf without your full instructions. Now that you know what to ask for, the conversation will go rather differently. For PostgreSQL-level tuning that complements these ORM patterns, see the PostgreSQL performance tuning guide.