
DRF Nested Serializer N+1 Queries: Fixing Prefetch Invalidation in Django REST Framework

You added prefetch_related. The query count didn't change. Allow me to explain why, and what to do instead.

The Waiter of Gold Lapel · Published Mar 5, 2026 · Updated Mar 20, 2026 · 28 min read
The illustration was prefetched, then re-fetched individually. One pixel at a time.

Good evening. I understand your prefetches have stopped working.

A distressing situation, and more common than the documentation would lead you to believe. You have done everything correctly. You read the Django docs on prefetch_related. You added the right lookups. You confirmed the query count dropped in the shell. And then, at some point between "works in manage.py shell" and "works in production," the prefetch silently stopped being used.

The queries came back. All 851 of them.

I have seen this precise scenario in production DRF applications more times than I should be comfortable admitting. An endpoint that shipped at 40ms gradually drifts to 1,200ms. No one changed the view. No one removed the prefetch_related. Someone added a SerializerMethodField. Or swapped a string lookup for a Prefetch object with a filtered queryset. Or the pagination class was added after the prefetching was written. Each of these changes is individually reasonable. Each of them is capable of destroying your prefetch cache without leaving any trace in your logs.

This is not a Django bug. It is not a caching issue. It is not a race condition. It is a specific, documented interaction between Django REST Framework's ListSerializer and Django's prefetch cache — and it has been an open issue on GitHub (#2704) since 2015. Over a decade, as of this writing.

The short version: DRF calls .all() on your related managers during serialization. For simple string-based prefetches, this is fine — Django matches by attribute name, and the clone produced by .all() still finds the cached data. For Prefetch objects with custom querysets — the kind you write when you need filtered or annotated relations — that .all() call creates a new queryset clone that Django cannot match to the prefetch cache. The cached data is right there in memory. Django ignores it and hits the database again.

DRF's documentation explicitly states that they "do not attempt to automatically optimize querysets." Fair enough. Transparency is a virtue. But they also do not warn you that their serialization code actively undermines the optimizations you add yourself. That omission is less virtuous.

This article addresses the specific, DRF-flavored variant of the N+1 problem where your prefetches exist, are correct, and are silently ignored. If you are looking for the broader treatment of N+1 queries across ORMs and languages, I have written about that elsewhere. What follows here is the DRF-specific pathology, its root cause in Django's prefetch cache, and the four approaches that reliably fix it.

The setup: nested serializers and quiet disaster

Allow me to construct the scenario that produces the problem. Four models, three levels of nesting, one API endpoint that returns authors with their books, publishers, and reviews. A bookstore API — the canonical example, because the real-world version of this (e-commerce orders with items, variants, and shipping records) is identical in structure but harder to fit on a screen.

from django.db import models

class Publisher(models.Model):
    name = models.CharField(max_length=200)
    founded = models.IntegerField()

class Author(models.Model):
    name = models.CharField(max_length=200)
    bio = models.TextField(blank=True)

class Book(models.Model):
    title = models.CharField(max_length=300)
    isbn = models.CharField(max_length=13)
    published = models.DateField()
    author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name="books")
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE, related_name="books")

class Review(models.Model):
    book = models.ForeignKey(Book, on_delete=models.CASCADE, related_name="reviews")
    rating = models.IntegerField()
    text = models.TextField()
    created = models.DateTimeField(auto_now_add=True)

The serializers look reasonable. Nested, yes, but that is precisely what the frontend needs — a single request that returns the full author with embedded books and reviews. The mobile team will not thank you for making them issue four separate requests to assemble a single screen.

from rest_framework import serializers

class ReviewSerializer(serializers.ModelSerializer):
    class Meta:
        model = Review
        fields = ["id", "rating", "text", "created"]

class BookSerializer(serializers.ModelSerializer):
    reviews = ReviewSerializer(many=True, read_only=True)
    author_name = serializers.CharField(source="author.name")
    publisher_name = serializers.CharField(source="publisher.name")

    class Meta:
        model = Book
        fields = ["id", "title", "isbn", "published",
                  "author_name", "publisher_name", "reviews"]

class AuthorSerializer(serializers.ModelSerializer):
    books = BookSerializer(many=True, read_only=True)

    class Meta:
        model = Author
        fields = ["id", "name", "bio", "books"]

And the view — five lines of DRF boilerplate. The sort of code that looks so innocuous it does not occur to anyone to profile it.

from rest_framework import generics

class AuthorListView(generics.ListAPIView):
    queryset = Author.objects.all()
    serializer_class = AuthorSerializer

# What happens when a client hits GET /api/authors/:
#
# Query 1: SELECT * FROM author
#   (returns 50 authors)
#
# For each author:
#   Query 2..51: SELECT * FROM book WHERE author_id = ?
#     (one per author — 50 queries)
#
#   For each book:
#     Query 52..N: SELECT * FROM publisher WHERE id = ?
#       (one per book)
#     Query N+1..M: SELECT * FROM review WHERE book_id = ?
#       (one per book)
#
# 50 authors x 8 books avg = 400 books
# 1 + 50 + 400 + 400 = 851 queries for a single API response
#
# Response time: 1,200ms on a fast database. 4,000ms+ in production.

Fifty authors, eight books each on average, five reviews per book. One API request. 851 queries.

# django-debug-toolbar or django.db.connection.queries shows:
#
# [0.3ms] SELECT "author"."id", "author"."name", "author"."bio" FROM "author"
# [0.2ms] SELECT "book".* FROM "book" WHERE "book"."author_id" = 1
# [0.1ms] SELECT "publisher".* FROM "publisher" WHERE "publisher"."id" = 7
# [0.2ms] SELECT "review".* FROM "review" WHERE "review"."book_id" = 1
# [0.2ms] SELECT "review".* FROM "review" WHERE "review"."book_id" = 2
# [0.1ms] SELECT "publisher".* FROM "publisher" WHERE "publisher"."id" = 3
# [0.2ms] SELECT "book".* FROM "book" WHERE "book"."author_id" = 2
# ... (847 more queries)
#
# Each query is fast. The total is catastrophic.

Each individual query takes a fraction of a millisecond. The per-query cost is negligible. But each query also incurs a network round-trip between your application server and PostgreSQL — connection checkout, wire protocol overhead, result parsing, Python object instantiation. At 1.4ms of overhead per query (a conservative estimate for a co-located database), 851 queries take 1,190ms of pure round-trip overhead before a single row is processed.

This is the fundamental character of the N+1 problem: the bottleneck is not the database's execution time but the application's chattiness with the database. A single query that returns 10,000 rows is faster than 10,000 queries that each return one row, because the round-trip cost dominates. One conversation is cheaper than ten thousand.
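The arithmetic is worth making explicit, since it recurs throughout this article. A quick sketch, using the article's own counts and its 1.4 ms per-query overhead estimate (the prose above rounds the total to 1,190 ms):

```python
authors = 50
books_per_author = 8
books = authors * books_per_author  # 400

# One query for the authors, one per author for books,
# then one per book for publishers and one per book for reviews.
total_queries = 1 + authors + books + books

# Round-trip overhead dominates execution time entirely.
overhead_ms = total_queries * 1.4

print(total_queries, round(overhead_ms))  # 851 1191
```

Notice that the overhead alone is roughly thirty times the 40 ms the endpoint takes once prefetching actually works.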

Detecting the problem before your users do

The most frustrating quality of prefetch invalidation is its silence. No exception. No warning in the Django log. No slow-query alert from PostgreSQL, because each individual query is fast. The only symptom is a response time that creeps upward as data grows — and by the time someone notices, the endpoint has been haemorrhaging queries for weeks.

I would recommend establishing detection at three levels: development, testing, and production. Each catches what the others miss.

Development: query logging

The simplest diagnostic is seeing the queries. Django's built-in logging can print every SQL statement to the console, and a decorator can group them per view for clarity.

# settings.py — enable query logging in development
LOGGING = {
    "version": 1,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
        },
    },
    "loggers": {
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}

# Or, for targeted inspection in a view:
from functools import wraps

from django.conf import settings
from django.db import connection, reset_queries

def debug_query_count(view_func):
    """Decorator that logs the query count for a view."""
    @wraps(view_func)  # preserve the view's name for the printout below
    def wrapper(request, *args, **kwargs):
        reset_queries()
        response = view_func(request, *args, **kwargs)
        if settings.DEBUG:
            queries = connection.queries
            print(f"View {view_func.__name__}: {len(queries)} queries")
            # Group by table for clarity
            tables = {}
            for q in queries:
                sql = q["sql"]
                for table in ["author", "book", "publisher", "review"]:
                    if f'"{table}"' in sql:
                        tables[table] = tables.get(table, 0) + 1
                        break
            for table, count in sorted(tables.items()):
                print(f"  {table}: {count} queries")
        return response
    return wrapper

This is not something you leave on permanently. It is a diagnostic instrument — the database equivalent of a stethoscope. When a view feels slow, attach it, observe the query pattern, and remove it. The output is often illuminating. Seeing 851 nearly-identical SELECT statements scroll past in the console has a way of making the problem visceral in a way that no metric dashboard achieves.

Development: the nplusone package

If you prefer automated detection over manual inspection, the nplusone package can raise exceptions whenever it detects an N+1 access pattern.

# pip install nplusone
# settings.py
INSTALLED_APPS = [
    ...
    "nplusone.ext.django",
]

MIDDLEWARE = [
    "nplusone.ext.django.NPlusOneMiddleware",
    ...
]

NPLUSONE_RAISE = True  # Raise exceptions on N+1 in dev
# NPLUSONE_RAISE = False  # Log warnings instead (for staging)
# NPLUSONE_LOGGER = logging.getLogger("nplusone")

# When you hit GET /api/authors/ without prefetching,
# nplusone raises:
#
# nplusone.core.exceptions.NPlusOneError:
#   Potential N+1 query detected on Author.books
#
# It catches the pattern before you see the slow response.
# In CI, set NPLUSONE_RAISE = True to fail tests on N+1.

The package works by monkeypatching Django's related manager access. When a ForeignKey or reverse relation is accessed on a model instance that was loaded as part of a queryset, and no prefetch is present, nplusone recognizes the pattern and raises. It is the canary in the coal mine — immediate, unmistakable, and far more pleasant to encounter in development than in a 3AM production alert.
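The monkeypatching idea is simple enough to sketch in plain Python. What follows is a toy detector of my own construction, not nplusone's implementation; ToyBook and NPlusOneSuspected are invented names.

```python
class NPlusOneSuspected(Exception):
    pass

class ToyBook:
    """Stand-in for a model whose related access lazily hits the DB."""
    lazy_loads = 0  # class-wide count of per-instance "queries"

    def __init__(self, pk):
        self.pk = pk

    @property
    def reviews(self):
        # Each access simulates one lazy database query. After a
        # handful of identical per-instance loads, assume N+1.
        ToyBook.lazy_loads += 1
        if ToyBook.lazy_loads > 3:
            raise NPlusOneSuspected("Potential N+1 detected on ToyBook.reviews")
        return [f"review for book {self.pk}"]

caught = False
try:
    for book in [ToyBook(i) for i in range(10)]:
        _ = book.reviews  # a fresh "query" per book: the N+1 signature
except NPlusOneSuspected:
    caught = True

print(caught, ToyBook.lazy_loads)  # True 4
```

The real package is considerably cleverer about distinguishing legitimate access from the N+1 pattern, but the principle is the same: intercept related access, count, and complain early.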

I should note a limitation: nplusone detects missing prefetches, not invalidated ones. If you have a Prefetch object that DRF's .all() is silently destroying, the prefetch technically exists — it is just not being used. The package will not catch this. For invalidation specifically, assertNumQueries in your test suite is the only reliable guard, and we will come to that shortly.

The obvious fix (and why it breaks)

Every Django developer learns the same fix: prefetch_related. Add it to the queryset, and Django loads related objects in bulk using WHERE id IN (...) queries instead of one-at-a-time lookups.

class AuthorListView(generics.ListAPIView):
    queryset = Author.objects.prefetch_related(
        "books",
        "books__publisher",
        "books__reviews",
    )
    serializer_class = AuthorSerializer

# Expected: 4 queries (authors, books, publishers, reviews)
# Actual:   4 queries... sometimes.
#
# But add a filter to the Review serializer, or use
# SerializerMethodField, or let DRF paginate, and the
# prefetch cache silently vanishes. Back to 851 queries.
# No warning. No error. Just a slow API.

This works. Four queries instead of 851. The response drops from 1,200ms to 40ms. Ship it.

Then someone adds a SerializerMethodField that filters reviews by rating. Or DRF paginates the results. Or a custom Prefetch object replaces the string lookup. And the query count creeps back up — silently, without errors, without warnings — because the prefetch cache has been invalidated.

The mechanism is subtle enough that it deserves its own explanation.

# The problem lives in DRF's ListSerializer.to_representation().
# Lightly simplified from rest_framework/serializers.py:

class ListSerializer(BaseSerializer):
    def to_representation(self, data):
        iterable = data.all() if isinstance(data, models.Manager) else data  # <--- THIS LINE
        return [
            self.child.to_representation(item)
            for item in iterable
        ]

# That .all() call is the culprit.
#
# When Django evaluates a prefetched queryset, it checks whether
# the queryset object is the EXACT same instance that was
# prefetched. Calling .all() creates a new queryset clone.
# Django sees a new queryset, doesn't find it in the prefetch
# cache, and hits the database again.
#
# For simple Prefetch objects (just a string), .all() is fine —
# Django matches by attribute name.
#
# For Prefetch objects with custom querysets, .all() destroys
# the match. The custom queryset is gone. Django falls back
# to a fresh database query.
#
# GitHub issue #2704 has tracked this since 2015. It remains open.

The critical detail: .all() on a related manager returns a new queryset. For string-based prefetches like "books", Django matches the prefetch cache by attribute name — so .all() still finds the cached data. But for Prefetch objects with a custom queryset parameter, Django matches by queryset identity. A cloned queryset is not the same object. The cache miss is silent. The database query fires again.

This is why your prefetch_related "works in the shell" — where you iterate manually without calling .all() — but fails in DRF, where ListSerializer.to_representation() calls .all() on every related manager it touches.

Inside the prefetch cache: why identity matching fails

If you will permit me a brief detour into Django's internals, the cache mechanism itself is worth understanding. Not because you need to patch it, but because understanding why it breaks prevents you from writing code that triggers the failure.

# Django's prefetch cache lives on each model instance.
# Heavily simplified from django/db/models/query.py —
# _prefetch_related_lookups is the tuple of declared lookups,
# and _prefetch_related_objects processes it:

class QuerySet:
    def _prefetch_related_objects(self):
        for lookup in self._prefetch_related_lookups:
            # For string lookups ("books"):
            #   Matches by attribute name on the model.
            #   .all() returns a clone, but the attribute
            #   name still matches — cache hit.
            #
            # For Prefetch objects with a custom queryset:
            #   Matches by queryset identity (id()).
            #   .all() returns a NEW queryset — different id()
            #   — cache miss — fresh DB query.
            pass

# The cache lookup (an illustrative simplification, not Django's literal code):
def get_prefetched_cache_key(instance, lookup):
    if isinstance(lookup, str):
        return lookup  # "books" — stable across .all()
    else:
        return id(lookup.queryset)  # memory address — unstable

# This is why string prefetches survive .all() but
# Prefetch(queryset=...) does not.
# The fix in Django would be to match on the SQL string
# instead of object identity. That patch exists in the
# issue tracker. It has not been merged.

The design choice is reasonable from Django's perspective. String-based lookups like "books" can be matched by attribute name because the string is the cache key. But a Prefetch object with a custom queryset cannot be matched by name alone — the queryset might filter, annotate, order, or limit the results, and two Prefetch objects with the same attribute name but different querysets should not share a cache entry.

Django's solution is to match by queryset identity — the Python id() of the queryset object. This works perfectly when you iterate the prefetched results directly. It fails the moment anyone calls .all(), because .all() returns a clone with a different id().
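The identity rule can be felt in miniature without touching Django at all. Here is a deliberately tiny toy model (FakeQuerySet is an invention of mine, mirroring the matching rule described above): a cache keyed by id(), and a clone produced by .all() that misses it.

```python
class FakeQuerySet:
    """Toy stand-in for a queryset. Like Django's, .all() returns a clone."""
    def __init__(self, rows):
        self.rows = rows

    def all(self):
        return FakeQuerySet(self.rows)  # new object, therefore new id()

prefetched = FakeQuerySet(["review 1", "review 2"])

# An identity-keyed cache, as in the matching rule above.
cache = {id(prefetched): prefetched.rows}

hit = cache.get(id(prefetched))         # iterating the original: cache hit
miss = cache.get(id(prefetched.all()))  # after .all(): different id(), miss

print(hit, miss)  # ['review 1', 'review 2'] None
```

The cached rows sit in memory throughout; only the key fails to match. That is the entire pathology.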

A patch exists in the Django issue tracker that would match by SQL string instead of object identity. It would solve the problem entirely. It has not been merged, in part because matching by SQL string introduces its own edge cases — querysets that generate identical SQL but have different Python-side behaviour (custom Iterable classes, for instance). The Django maintainers have chosen caution over convenience, and I cannot fault them for it.

What I can fault is the absence of a warning. When Django falls back to a database query because a prefetch cache miss occurred on a queryset that was prefetched (just with a different identity), a logging.warning would save thousands of developers thousands of hours. The data to detect this is available at the point of cache lookup. The warning does not exist.

The SerializerMethodField trap

Before we move to fixes, I must address the single most common way DRF developers accidentally re-introduce N+1 queries into a prefetched endpoint. It is not a missing prefetch_related. It is a SerializerMethodField that accesses a related manager.

class BookSerializer(serializers.ModelSerializer):
    recent_reviews = serializers.SerializerMethodField()
    author_name = serializers.CharField(source="author.name")
    publisher_name = serializers.CharField(source="publisher.name")

    class Meta:
        model = Book
        fields = ["id", "title", "isbn", "published",
                  "author_name", "publisher_name", "recent_reviews"]

    def get_recent_reviews(self, book):
        # THIS FIRES A NEW QUERY FOR EVERY BOOK
        reviews = book.reviews.filter(
            created__gte=timezone.now() - timedelta(days=30)
        ).order_by("-created")[:5]
        return ReviewSerializer(reviews, many=True).data

# Even with prefetch_related("books__reviews"), this
# SerializerMethodField bypasses the prefetch cache entirely.
#
# book.reviews is the related manager. Calling .filter() on it
# creates a new queryset. Django cannot match this new queryset
# to the prefetched data. Fresh database query. Every time.
#
# 50 authors x 8 books = 400 calls to get_recent_reviews()
# = 400 additional queries on top of whatever else the
#   serializer is doing.
#
# The fix: use to_attr to prefetch the filtered set, then
# reference the to_attr list in the serializer.

The pattern is seductive. The frontend needs only recent reviews, not all reviews. A SerializerMethodField lets you filter in the serializer. The code reads clearly. It works in development with five records. It generates 400 additional queries in production with real data.

The root cause is the same: calling .filter() on a related manager creates a new queryset. Even if you prefetched the reviews, the filtered queryset is a different object. Django does not check whether the prefetched data is a superset of your filter — it simply does not match, and fires a new query.

The fix is to move the filtering into the Prefetch object and use to_attr to store the result as a plain list.

# In the view's get_queryset():
from django.db.models import Prefetch
from django.utils import timezone
from datetime import timedelta

def get_queryset(self):
    thirty_days_ago = timezone.now() - timedelta(days=30)
    return Author.objects.prefetch_related(
        Prefetch(
            "books",
            queryset=Book.objects.select_related("publisher"),
        ),
        Prefetch(
            "books__reviews",
            queryset=Review.objects.filter(
                created__gte=thirty_days_ago
            ).order_by("-created")[:5],  # sliced Prefetch querysets require Django 4.2+
            to_attr="recent_reviews",  # plain list, no manager
        ),
    )

# In the serializer:
class BookSerializer(serializers.ModelSerializer):
    recent_reviews = ReviewSerializer(many=True, read_only=True)
    author_name = serializers.CharField(source="author.name")
    publisher_name = serializers.CharField(source="publisher.name")
    # No SerializerMethodField needed. DRF finds
    # book.recent_reviews (the to_attr list) and serializes it.

    class Meta:
        model = Book
        fields = ["id", "title", "isbn", "published",
                  "author_name", "publisher_name", "recent_reviews"]

# Result: 0 additional queries. The filtered, ordered, limited
# review set was prefetched into a plain list. DRF iterates
# the list. No .all(). No .filter(). No cache miss.

The SerializerMethodField disappears entirely. The filtering moves to the queryset layer, where it belongs — where PostgreSQL can execute it once in bulk instead of 400 times individually. The to_attr ensures the result is a plain Python list that DRF can iterate without calling .all() or .filter() or anything else that would trigger a cache miss.

I should offer a general principle: any time you find yourself writing a SerializerMethodField whose body reaches for a related manager on the instance it receives, stop and consider whether the work should be a Prefetch with to_attr instead. The SerializerMethodField runs once per instance. The Prefetch runs once per queryset. That is the difference between O(n) queries and O(1).
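The complexity claim is easy to make concrete with a counter. This is a toy model, not DRF: it merely tallies how many queries each strategy would issue for the 400 books above.

```python
books = [f"book-{i}" for i in range(400)]

# SerializerMethodField style: the method body runs once per
# instance, and each body issues its own query.
method_field_queries = 0
for _book in books:
    method_field_queries += 1  # book.reviews.filter(...): one query each

# Prefetch style: a single bulk query covers the whole queryset,
# e.g. SELECT ... WHERE book_id IN (...).
prefetch_queries = 1

print(method_field_queries, prefetch_queries)  # 400 1
```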

Pagination and the class-level queryset problem

There is a second, subtler way that prefetches break in DRF, and it involves the interaction between pagination and class-level queryset attributes.

# DRF pagination can break prefetch caches too.
# Here's the common pattern:

class AuthorListView(generics.ListAPIView):
    queryset = Author.objects.prefetch_related(
        "books", "books__reviews"
    )
    serializer_class = AuthorSerializer
    pagination_class = PageNumberPagination

# What happens internally:
#
# 1. DRF calls get_queryset() → returns prefetched queryset
# 2. DRF calls paginate_queryset() → slices the queryset
#    queryset[offset:offset+limit]
# 3. Slicing evaluates the queryset and returns a list
# 4. The prefetch_related runs on the SLICED set — this is fine
#
# The subtlety: a class-level queryset (unlike get_queryset())
# is constructed once, at import time. DRF re-runs the query on
# each request, but any values baked into its Prefetch objects
# (a timezone.now() cutoff, for instance) were computed when the
# class was first loaded, and never again.
#
# Rule: ALWAYS use get_queryset() instead of class-level
# queryset when you have prefetch_related. Always.

class AuthorListView(generics.ListAPIView):
    serializer_class = AuthorSerializer
    pagination_class = PageNumberPagination

    def get_queryset(self):
        return Author.objects.prefetch_related(
            "books",
            Prefetch(
                "books__reviews",
                queryset=Review.objects.filter(
                    created__gte=timezone.now() - timedelta(days=30)
                ),
                to_attr="recent_reviews",
            ),
        )

# Fresh prefetch objects on every request.
# Pagination slices before prefetch evaluation.
# Time-dependent filters use the current time, not
# the time the class was first loaded.

The issue is not that pagination breaks prefetching — it does not. DRF paginates by slicing the queryset, and Django's prefetch_related runs on the sliced result, which is correct. The issue is that class-level queryset attributes are evaluated once at class load time. If your Prefetch objects contain time-dependent filters (reviews from the last 30 days, orders from the current quarter), those filters are frozen at the time the class was first imported.

The fix is always the same: use get_queryset() instead of a class-level queryset. Fresh Prefetch objects on every request. Current timestamps. No stale caches. This is not a matter of preference — it is a correctness requirement for any view with time-dependent or user-dependent prefetch filters.

I have encountered production APIs where the "last 30 days" filter was actually filtering from the last 30 days relative to the deployment date, not the request date. The endpoint had been deployed four months earlier. The "recent reviews" included reviews from 120 days ago and excluded reviews from yesterday. No one noticed because the response was fast and the data looked plausible.
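The freezing behaviour is ordinary Python, not Django magic: a class attribute is computed exactly once, when the class body executes at import time. A minimal demonstration, with time.time() standing in for timezone.now():

```python
import time

def thirty_days_ago():
    # Stand-in for timezone.now() - timedelta(days=30)
    return time.time() - 30 * 86400

class ClassLevelView:
    # Evaluated ONCE, when the class body runs at import time.
    cutoff = thirty_days_ago()

class PerRequestView:
    def get_cutoff(self):
        # Evaluated afresh on every call, i.e. every request.
        return thirty_days_ago()

first = ClassLevelView.cutoff
time.sleep(0.05)

frozen = ClassLevelView.cutoff == first        # True: frozen at import
fresh = PerRequestView().get_cutoff() > first  # True: moves with the clock
print(frozen, fresh)  # True True
```

Substitute "months since deployment" for the 50 ms sleep and you have the bug described above.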

Fix 1: The setup_eager_loading pattern

The most widely adopted fix in the DRF ecosystem is the setup_eager_loading pattern. Instead of declaring prefetches on the view's queryset, you let the serializer declare its own loading requirements.

class AuthorListView(generics.ListAPIView):
    serializer_class = AuthorSerializer

    def get_queryset(self):
        return Author.objects.prefetch_related(
            "books",
            "books__publisher",
            "books__reviews",
        )

# This is the first half of the setup_eager_loading pattern.
# Instead of setting queryset as a class attribute, override
# get_queryset().
#
# Why? A class-level queryset is constructed once, at import
# time, so any values baked into it are frozen. get_queryset()
# runs per-request, ensuring fresh prefetches.
#
# For simple cases, both work. For filtered or paginated views,
# get_queryset() is the only reliable option.

Moving from a class-level queryset to get_queryset() ensures the prefetch objects are fresh on every request. This alone does not fix the .all() invalidation problem — but it prevents the related issue where stale prefetch caches from previous requests pollute subsequent ones.

The full pattern extracts prefetch declarations into a mixin that the serializer controls.

class EagerLoadingMixin:
    """View mixin that applies the serializer's eager-loading declarations."""

    def get_queryset(self):
        queryset = super().get_queryset()
        # Let the serializer declare its own loading strategy.
        # getattr guards serializers that declare nothing.
        setup = getattr(self.serializer_class, "setup_eager_loading", None)
        if setup is not None:
            queryset = setup(queryset)
        return queryset


class AuthorSerializer(serializers.ModelSerializer):
    books = BookSerializer(many=True, read_only=True)

    class Meta:
        model = Author
        fields = ["id", "name", "bio", "books"]

    @classmethod
    def setup_eager_loading(cls, queryset):
        return queryset.prefetch_related(
            "books",
            "books__publisher",
            "books__reviews",
        )


class AuthorListView(EagerLoadingMixin, generics.ListAPIView):
    queryset = Author.objects.all()
    serializer_class = AuthorSerializer

# Now the serializer owns its prefetch declarations.
# The view doesn't need to know what the serializer needs.
# Reuse the serializer in another view — prefetches follow.

This pattern has three advantages over view-level prefetching:

  1. Serializer portability. Use AuthorSerializer in a different view and the prefetches follow automatically. No need to remember which prefetches each view needs — the serializer carries that knowledge with it.
  2. Single source of truth. The serializer knows what fields it accesses. It should know what data to prefetch. Putting that knowledge in the view creates a coupling that breaks when someone changes the serializer without updating the view — a failure mode that has no test, no lint rule, and no warning. Only a slow endpoint.
  3. Composability. Nested serializers can each declare their own setup_eager_loading, and the parent serializer chains them together.

The composability point deserves an example, because it is where the pattern truly earns its keep.

class ReviewSerializer(serializers.ModelSerializer):
    class Meta:
        model = Review
        fields = ["id", "rating", "text", "created"]

    @classmethod
    def setup_eager_loading(cls, queryset):
        # Reviews have no nested relations to prefetch
        return queryset


class BookSerializer(serializers.ModelSerializer):
    reviews = ReviewSerializer(many=True, read_only=True)
    author_name = serializers.CharField(source="author.name")
    publisher_name = serializers.CharField(source="publisher.name")

    class Meta:
        model = Book
        fields = ["id", "title", "isbn", "published",
                  "author_name", "publisher_name", "reviews"]

    @classmethod
    def setup_eager_loading(cls, queryset):
        queryset = queryset.select_related("publisher")
        queryset = ReviewSerializer.setup_eager_loading(
            queryset.prefetch_related("reviews")
        )
        return queryset


class AuthorSerializer(serializers.ModelSerializer):
    books = BookSerializer(many=True, read_only=True)

    class Meta:
        model = Author
        fields = ["id", "name", "bio", "books"]

    @classmethod
    def setup_eager_loading(cls, queryset):
        # Chain the child serializer's eager loading
        book_qs = BookSerializer.setup_eager_loading(
            Book.objects.all()
        )
        return queryset.prefetch_related(
            Prefetch("books", queryset=book_qs),
        )

# Each serializer declares what it needs.
# Parent serializers compose child declarations.
# Add a new field to BookSerializer? Update its
# setup_eager_loading. Every view that uses it benefits.

Each serializer declares only its own requirements. The parent serializer composes them. Add a new field to BookSerializer? Update its setup_eager_loading. Every view that uses AuthorSerializer — which composes BookSerializer — automatically gets the updated prefetch. No view changes. No hunting through your codebase for every place the serializer is used.

For string-based prefetches, this pattern is sufficient. For custom Prefetch objects with filtered querysets, you need one more tool.

Fix 2: to_attr — the invalidation-proof prefetch

The to_attr parameter on Prefetch objects changes the storage mechanism entirely. Instead of populating the related manager's cache (which .all() can invalidate), it stores the prefetched results as a plain Python list directly on the model instance.

from django.db.models import Prefetch

class AuthorListView(generics.ListAPIView):
    serializer_class = AuthorSerializer

    def get_queryset(self):
        return Author.objects.prefetch_related(
            Prefetch(
                "books",
                queryset=Book.objects.select_related("publisher"),
            ),
            Prefetch(
                "books__reviews",
                queryset=Review.objects.filter(rating__gte=3),
                to_attr="good_reviews",  # <--- KEY
            ),
        )

# to_attr stores the prefetched results as a plain Python list
# on the model instance, instead of as a queryset manager.
#
# Without to_attr:
#   book.reviews.all()  →  .all() may invalidate the prefetch
#
# With to_attr:
#   book.good_reviews   →  plain list, no .all() needed, no
#                           queryset clone, no cache invalidation
#
# The serializer must access book.good_reviews instead of
# book.reviews. This requires a custom field:

class BookSerializer(serializers.ModelSerializer):
    good_reviews = ReviewSerializer(many=True, read_only=True)
    author_name = serializers.CharField(source="author.name")
    publisher_name = serializers.CharField(source="publisher.name")

    class Meta:
        model = Book
        fields = ["id", "title", "isbn", "published",
                  "author_name", "publisher_name", "good_reviews"]

With to_attr="good_reviews", the prefetched data lives at book.good_reviews as a regular Python list. No manager. No .all(). No queryset cloning. No cache invalidation. DRF's ListSerializer iterates over the list directly, because it checks for list-like objects before attempting queryset operations.

The trade-off is real and should not be minimized: your serializer field name must match the to_attr name, not the model's related name. This means you cannot use the same serializer for both the prefetched view and a non-prefetched view unless you handle both attribute names. In practice, this is rarely a problem — if you have a view without prefetching, you have a bigger issue than attribute names.

There is a second trade-off that is less frequently discussed. A plain Python list does not support further queryset operations. You cannot call .filter(), .exclude(), .order_by(), or .count() on a list. If your serializer or template code expects a queryset manager (because it chains additional filters), to_attr will break it. The fix is to move all filtering into the Prefetch object's queryset, which is where it should be anyway for performance reasons — but be aware of this if you are retrofitting to_attr onto an existing codebase.
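To make that second trade-off concrete, here is the queryset-to-plain-Python translation in miniature. The dicts are illustrative stand-ins for model instances, not Django code:

```python
# to_attr hands the serializer a plain list, so queryset-style chaining
# must become plain Python. Illustrative dicts, no Django required.
good_reviews = [{"rating": 5}, {"rating": 3}, {"rating": 4}]  # to_attr result

top = [r for r in good_reviews if r["rating"] == 5]         # was .filter(rating=5)
count = len(good_reviews)                                   # was .count()
ordered = sorted(good_reviews, key=lambda r: -r["rating"])  # was .order_by("-rating")

print(count)  # 3
```

If any of these operations need to run in SQL rather than Python — because the related set is large — that is the signal to move the filtering into the Prefetch queryset itself.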

When to use to_attr:

  • Any Prefetch object with a custom queryset parameter (filtered, annotated, or ordered)
  • Any prefetch where you have confirmed .all() invalidation in your query logs
  • Prefetches that filter the related set (e.g., only recent reviews, only active subscriptions)
  • Any case where a SerializerMethodField was previously doing the filtering

When you can skip to_attr:

  • Simple string-based prefetches like "books" or "books__reviews" — these use attribute-name matching and are not affected by .all()
  • Prefetches where you want the full, unfiltered related set and are using string lookups

select_related vs prefetch_related: choosing correctly

A clarification, because these two are confused often enough in DRF codebases that it warrants a direct comparison. The choice between them is not a matter of preference. It is dictated by the relationship type, and getting it wrong does not merely reduce performance — it can change the shape of your data.

# select_related: SQL JOIN. One query. For ForeignKey / OneToOne.
Author.objects.select_related("publisher")
# SELECT author.*, publisher.* FROM author
# INNER JOIN publisher ON author.publisher_id = publisher.id

# prefetch_related: Separate query with IN clause. For reverse FK / M2M.
Author.objects.prefetch_related("books")
# Query 1: SELECT * FROM author
# Query 2: SELECT * FROM book WHERE author_id IN (1, 2, 3, ..., 50)

# Combining them:
Author.objects.prefetch_related(
    Prefetch(
        "books",
        queryset=Book.objects.select_related("publisher"),
    ),
    "books__reviews",
)
# Query 1: SELECT * FROM author
# Query 2: SELECT book.*, publisher.*
#           FROM book
#           JOIN publisher ON book.publisher_id = publisher.id
#           WHERE book.author_id IN (1, 2, 3, ..., 50)
# Query 3: SELECT * FROM review WHERE book_id IN (1, 2, ..., 400)
#
# Total: 3 queries. Down from 851.

select_related uses a SQL JOIN. One query, wider result set. It works for ForeignKey and OneToOneField — relationships where each row has exactly one related object. It cannot work for reverse foreign keys or many-to-many relationships because the JOIN would create duplicate parent rows. If an author has eight books, a JOIN produces eight rows per author, and DRF would serialize that author eight times unless additional deduplication is applied.
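The row duplication is easy to see in miniature, with plain dicts standing in for database rows:

```python
# One author with three books: a JOIN on a reverse FK repeats the parent
# row once per child, which is why select_related cannot serve to-many
# relations. Illustrative dicts, not ORM code.
authors = [{"id": 1, "name": "A"}]
books = [{"author_id": 1, "title": t} for t in ("X", "Y", "Z")]

joined = [
    {**a, "book_title": b["title"]}
    for a in authors
    for b in books
    if b["author_id"] == a["id"]
]
print(len(joined))  # 3 -- the single author appears three times
```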

prefetch_related uses a separate query with an IN clause. Two queries, clean result sets. It works for all relationship types, including reverse foreign keys and many-to-many. The results are joined in Python, not in SQL, which means no row duplication and no need for DISTINCT.
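The Python-side join that prefetch_related performs is essentially a group-by on the FK column. A sketch of the idea, again with plain dicts rather than the ORM's internals:

```python
from collections import defaultdict

authors = [{"id": 1}, {"id": 2}]
# Result of query 2: SELECT * FROM book WHERE author_id IN (1, 2)
books = [
    {"id": 10, "author_id": 1},
    {"id": 11, "author_id": 1},
    {"id": 12, "author_id": 2},
]

# Group the related rows by FK, then attach each group to its parent.
# No row duplication, no DISTINCT needed.
by_author = defaultdict(list)
for book in books:
    by_author[book["author_id"]].append(book)

for author in authors:
    author["books"] = by_author[author["id"]]

print(len(authors[0]["books"]))  # 2
```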

The combination is where the real efficiency lives. Use select_related inside a Prefetch object's queryset to handle the forward FKs on the prefetched model, and prefetch_related for the reverse relations. Three queries instead of 851, with no row duplication.

The resulting SQL is clean enough that any "ORM vs raw SQL" argument collapses. This is not ORM overhead. This is the ORM generating exactly the SQL you would write by hand — a SELECT with a JOIN and a WHERE ... IN. The ORM's contribution is managing the cache that maps the results back to Python objects. That cache management is what to_attr helps you control.

"I have observed, in production systems, pages generating over 400 database round trips for what appeared to be a simple list view. The ORM did not fail. It did exactly what was asked. It was simply asked poorly."

— from You Don't Need Redis, Chapter 3: The ORM Tax

An honest word about nesting depth

I should be forthcoming about a boundary that prefetching cannot cross, because pretending it does not exist would be a disservice to you and an embarrassment to me.

# When nesting goes too deep, even proper prefetching
# has diminishing returns.

# Consider: Author → Books → Chapters → Sections → Comments
# 5 levels deep. Even with perfect prefetching:
#
# Query 1: SELECT * FROM author
# Query 2: SELECT * FROM book WHERE author_id IN (...)
# Query 3: SELECT * FROM chapter WHERE book_id IN (...)
# Query 4: SELECT * FROM section WHERE chapter_id IN (...)
# Query 5: SELECT * FROM comment WHERE section_id IN (...)
#
# 5 queries. O(depth). Seems fine.
#
# But look at the data volume:
# 50 authors × 8 books × 20 chapters × 5 sections × 10 comments
# = 400,000 comment rows loaded into Python memory
#
# The query count is constant. The memory is not.
# The serialization time is not.
#
# At some point, the right answer is not "prefetch deeper"
# but "serialize shallower."
#
# Options:
# 1. Flatten: return IDs instead of nested objects
#    { "book_ids": [1, 2, 3] } instead of { "books": [...] }
#
# 2. Paginate the nested relation:
#    Only include the first 5 reviews per book
#
# 3. Separate endpoints:
#    GET /api/authors/ (flat, fast)
#    GET /api/authors/1/books/ (one level of nesting)
#    GET /api/books/1/reviews/ (one level of nesting)
#
# The frontend makes 2-3 requests instead of 1, but each
# is fast, cacheable, and doesn't load hundreds of thousands of rows.

Prefetching solves the query count problem. It does not solve the data volume problem. When your serializer nests five levels deep and each level fans out, the total number of objects loaded into Python memory grows multiplicatively. The query count is O(depth) — constant, predictable, fast. The memory consumption is O(n * m * p * q * r) — the product of the cardinalities at each level.
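The fan-out arithmetic is worth checking explicitly. Using the cardinalities from the example above:

```python
# Cardinalities from the five-level example: Author → Books → Chapters
# → Sections → Comments.
authors, books, chapters, sections, comments = 50, 8, 20, 5, 10

queries = 5  # one query per level: O(depth)
rows_loaded = authors * books * chapters * sections * comments  # O(product)

print(queries, rows_loaded)  # 5 400000
```

Five queries is cheap. Four hundred thousand rows materialized as model instances, then serialized, is not.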

At some point, the right answer is not to prefetch deeper but to serialize shallower. Return IDs instead of nested objects. Paginate nested relations. Split into separate endpoints. The frontend makes two or three requests instead of one, but each is fast, cacheable, and does not require loading four hundred thousand comment rows into a Python process.

This is not a concession. It is good API design. GraphQL understood this instinctively — let the client specify exactly the depth and fields it needs, and the server loads only what is requested. REST does not have this mechanism natively, which is why DRF developers must make the depth decision at design time rather than at request time.

If you find yourself writing Prefetch chains four or five levels deep, consider whether the endpoint is trying to do too much. A single endpoint that returns the entire object graph is convenient for the frontend and punishing for the database. Two endpoints that each return two levels of nesting are often faster in aggregate — and they are certainly easier to cache, paginate, and reason about.

django-auto-prefetch: automated forward relations

If declaring prefetches manually feels like bookkeeping you should not have to do — well, you are not entirely wrong. The django-auto-prefetch library takes a different approach: replace your model base class and ForeignKey fields with its drop-in equivalents, and the first access to an uncached forward ForeignKey is automatically batched across every object loaded by the same queryset.

# django-auto-prefetch: automatic batching of forward FK access
# pip install django-auto-prefetch

import auto_prefetch
from django.db import models

class Author(auto_prefetch.Model):
    name = models.CharField(max_length=200)
    bio = models.TextField(blank=True)

class Book(auto_prefetch.Model):
    title = models.CharField(max_length=300)
    author = auto_prefetch.ForeignKey(
        Author, on_delete=models.CASCADE, related_name="books"
    )
    publisher = auto_prefetch.ForeignKey(
        Publisher, on_delete=models.CASCADE, related_name="books"
    )

# With auto_prefetch.Model, the first access to an uncached
# ForeignKey triggers one batched fetch of that relation for
# every "peer" instance loaded by the same queryset — one query
# instead of one per object. No manual prefetch declarations
# needed for forward ForeignKey relations.
#
# For reverse relations (author.books), you still need
# prefetch_related. Auto-prefetching handles the forward
# direction only.
#
# Reported results: 30-40% reduction in query count on
# real-world DRF APIs, with zero serializer changes.

The library overrides the ForeignKey descriptor. When a relation is accessed and is not already cached, instead of fetching just that one object, it fetches the related objects for every instance that came from the same queryset, in a single query — the same shape as prefetch_related, triggered automatically at first access rather than declared up front. No manual declarations. No forgotten prefetches. No view-serializer coupling for the forward-FK case.
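A toy model of the peer-prefetch idea may help. This is not the library's actual code — the names and the `peers` list are invented for illustration — but it captures the shape of the mechanism:

```python
FETCH_LOG = []

def batch_fetch_authors(author_ids):
    """Stand-in for one SQL query: SELECT * FROM author WHERE id IN (...)."""
    FETCH_LOG.append(sorted(set(author_ids)))
    return {aid: f"author-{aid}" for aid in author_ids}

class Book:
    """Toy model: `peers` is the list of books from the same queryset."""
    def __init__(self, author_id, peers):
        self.author_id = author_id
        self._peers = peers
        self._author_cache = None

    @property
    def author(self):
        if self._author_cache is None:
            # First uncached access anywhere: fetch for every peer at once.
            authors = batch_fetch_authors([b.author_id for b in self._peers])
            for b in self._peers:
                b._author_cache = authors[b.author_id]
        return self._author_cache

peers = []
peers.extend(Book(aid, peers) for aid in (1, 2, 1))

names = [b.author for b in peers]  # touches all three books
print(names, len(FETCH_LOG))  # ['author-1', 'author-2', 'author-1'] 1
```

One query for the whole peer group, however many objects access the relation — which is exactly the behaviour prefetch_related gives you, without the declaration.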

The limitation is clear and worth emphasizing: it handles forward ForeignKey relations only. Reverse relations (author.books) and many-to-many relations still require explicit prefetch_related calls. In practice, this means auto-prefetching eliminates roughly half of the N+1 queries in a typical DRF API — the forward-FK half — and you handle the reverse-FK half manually with the patterns described above.

Teams that have adopted it report 30-40% reductions in total query count across their APIs, with zero changes to serializer code. That is a meaningful improvement for a one-line model change.

I should note a practical consideration: django-auto-prefetch requires you to change your model base class from models.Model to auto_prefetch.Model. For a new project, this is trivial. For an existing project with hundreds of models, it is a significant migration. The library is designed to be backwards-compatible — auto_prefetch.Model inherits from models.Model and adds behaviour without removing any — but changing the base class of every model in a large codebase is a change that warrants its own pull request, its own review, and its own test run. It is not something to slip into a feature branch.

Documenting your loading strategy

The fragility of DRF prefetching is not a code problem alone. It is a knowledge problem. The prefetch declarations work today because the person who wrote them understood the serializer's field access patterns. They will break next quarter when someone who does not have that context adds a new field.

I advocate for making the loading strategy visible at every layer: in the code, in the tests, and in the API documentation.

# drf-spectacular and prefetching: documentation as a signal
# pip install drf-spectacular

from drf_spectacular.utils import extend_schema

class AuthorListView(EagerLoadingMixin, generics.ListAPIView):
    queryset = Author.objects.all()
    serializer_class = AuthorSerializer

    @extend_schema(
        summary="List all authors with books and reviews",
        description=(
            "Returns authors with nested books (including publisher) "
            "and reviews. Uses prefetch_related for constant query "
            "count regardless of result size."
        ),
    )
    def get(self, request, *args, **kwargs):
        return super().get(request, *args, **kwargs)

# Why document the prefetch strategy in your API schema?
#
# Because when the next developer adds a field to the
# serializer, the schema description reminds them that
# this endpoint has explicit prefetch declarations.
#
# It's not a guardrail. It's a signpost. The real guardrail
# is the assertNumQueries test. But signposts prevent the
# need for guardrails to activate.

This is not a substitute for the assertNumQueries test. The test is the guardrail. The documentation is the signpost. The signpost prevents the need for the guardrail to activate. Both are necessary, because developers read documentation when they are planning a change and encounter test failures after they have already made it. The documentation catches the mistake earlier, which is cheaper and less frustrating for everyone involved.

Loading strategy comparison

For reference, here is every approach discussed, ranked by effort and safety.

Approach                         Query complexity   Setup effort   Considerations
No prefetching                   O(n * m)           None           N+1 on every nested relation
prefetch_related (strings)       O(depth)           Low            Safe with .all() — no invalidation
Prefetch with custom queryset    O(depth)           Medium         Invalidated by .all() in ListSerializer
Prefetch + to_attr               O(depth)           Medium         Immune to .all() invalidation
setup_eager_loading mixin        O(depth)           Medium         Serializer owns its prefetch strategy
django-auto-prefetch             O(depth)           Low            Forward FK only; reverse still manual
Gold Lapel auto-indexing         O(depth)           None           Indexes the queries, not the code

The bottom three rows are not mutually exclusive. Use setup_eager_loading with to_attr to handle custom querysets, layer django-auto-prefetch for forward FKs, and let Gold Lapel index the columns those prefetch queries hit. Each addresses a different layer of the problem.

Testing your query count (so it never regresses)

Fixing N+1 queries is satisfying. Watching them return six months later because someone added a new nested serializer field is not. The fix for regression is the same as the fix for any other regression: automated tests. This is non-negotiable. Without a query-count test, prefetch optimizations are temporary by nature.

import pytest

@pytest.mark.django_db
class TestAuthorEndpointQueries:
    def test_author_list_query_count(
        self, api_client, create_authors, django_assert_num_queries
    ):
        """Ensure the author list endpoint uses constant queries."""
        # Create 5 authors with 3 books each, 2 reviews per book
        create_authors(count=5, books_per=3, reviews_per=2)

        with django_assert_num_queries(3):
            # 1: authors
            # 2: books (with publisher via select_related)
            # 3: reviews
            response = api_client.get("/api/authors/")

        assert response.status_code == 200
        assert len(response.data) == 5

    def test_query_count_scales_constantly(
        self, api_client, create_authors, django_assert_num_queries
    ):
        """Queries must not increase with more authors."""
        create_authors(count=50, books_per=10, reviews_per=5)

        with django_assert_num_queries(3):
            # Same 3 queries whether there are 5 or 500 authors
            response = api_client.get("/api/authors/")

        assert response.status_code == 200
        assert len(response.data) == 50

Django's assertNumQueries (available in pytest via pytest-django's django_assert_num_queries fixture) is the only reliable guard against prefetch regressions. It pins the exact query count. If someone adds a new relationship to the serializer without updating the prefetch declarations, the test fails immediately — not three weeks later when a customer reports a slow endpoint.

Three guidelines for query-count tests:

  • Test with enough data to expose N+1 patterns. One author with one book will always be fast. Fifty authors with ten books each will expose the missing prefetch. The N+1 problem is invisible at small scale — that is precisely what makes it dangerous.
  • Assert the exact count, not a range. "Less than 10 queries" is not a useful assertion when the correct answer is 3. The test should fail if the count changes at all — an increase means a regression, and a decrease means you improved something worth documenting.
  • Include the query breakdown in the test comment. # 1: authors, 2: books with publisher, 3: reviews tells the next developer exactly what each query does. When the test fails, they know which prefetch to investigate.

For CI environments, a more detailed fixture that prints the actual queries on failure is worth the investment.

# conftest.py — query count assertion for CI
import pytest
from django.test.utils import CaptureQueriesContext
from django.db import connection

@pytest.fixture
def assert_max_queries():
    """Context manager that fails if query count exceeds threshold."""
    class QueryCounter:
        def __init__(self, max_queries):
            self.max_queries = max_queries
            self.context = CaptureQueriesContext(connection)

        def __enter__(self):
            self.context.__enter__()
            return self

        def __exit__(self, *args):
            self.context.__exit__(*args)
            count = len(self.context.captured_queries)
            if count > self.max_queries:
                queries = "\n".join(
                    f"  [{q['time']}] {q['sql'][:120]}"
                    for q in self.context.captured_queries
                )
                pytest.fail(
                    f"Expected at most {self.max_queries} queries, "
                    f"got {count}:\n{queries}"
                )

    def factory(max_queries):
        return QueryCounter(max_queries)

    return factory


# Usage in tests:
def test_author_detail_queries(api_client, assert_max_queries):
    with assert_max_queries(4):
        api_client.get("/api/authors/1/")

# When it fails, it prints every query with timing.
# No guessing. No debug-toolbar. Just the failing test
# showing exactly which queries appeared.

When this test fails, the output shows every query with its timing. No guessing. No attaching a debug toolbar to a CI server. The failing test tells you precisely which queries appeared, which makes the missing prefetch obvious. I have seen teams reduce their prefetch regression debugging time from hours to minutes by adopting this pattern.

Honest counterpoints: when this advice does not apply

A waiter who overstates his case is no waiter at all. The approaches in this article solve a specific problem — N+1 queries in DRF nested serializers backed by PostgreSQL — and they solve it well. But they are not universally applicable, and I should say where they fall short.

Write-heavy APIs do not have this problem. If your DRF endpoint accepts POST and PUT requests and rarely serves deeply nested GET responses, prefetch optimization is not your bottleneck. Serializer validation, database writes, and signal handlers are where your time goes. Optimizing reads on a write-heavy endpoint is like polishing the silver while the kitchen is on fire.

Small datasets make the N+1 cost negligible. If your author table has 12 rows and each has 3 books, the naive serializer generates 49 queries that complete in 8ms total. Adding prefetch_related drops it to 3 queries and 3ms. The 5ms improvement is real but irrelevant — no user perceives it, no server is strained by it, and the code complexity of proper prefetching is not free. For small, slow-growing datasets, the naive approach is honestly fine. The optimizations in this article earn their keep when the data is large or growing.

GraphQL solves this differently. If you are considering a move to Graphene-Django or Strawberry, the approach to N+1 is fundamentally different. GraphQL uses dataloaders — batching mechanisms that collect individual lookups within a single request and execute them as a batch query. This is a different abstraction than prefetching, and in some ways a more elegant one, because the batching is automatic and does not require the developer to declare prefetch chains. The trade-off is that dataloaders add complexity to the resolver layer and can be difficult to debug when they interact with Django's ORM caching. Neither approach is strictly better. They are different tools for different API designs.
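To make the dataloader contrast concrete, here is a toy synchronous sketch of the batching idea. Real dataloaders in Graphene or Strawberry are asynchronous and tied to the event loop; the names here are invented for illustration:

```python
CALLS = []

def batch_load_books(author_ids):
    """Stand-in for one batched query instead of one query per author."""
    CALLS.append(list(author_ids))
    return {aid: [f"book-{aid}-{n}" for n in range(2)] for aid in author_ids}

class ToyDataLoader:
    """Collect individual keys, then resolve them all in one batch call."""
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.pending = []
        self.cache = {}

    def load(self, key):
        self.pending.append(key)

    def dispatch(self):
        fresh = sorted({k for k in self.pending if k not in self.cache})
        if fresh:
            self.cache.update(self.batch_fn(fresh))
        self.pending = []

    def get(self, key):
        return self.cache[key]

loader = ToyDataLoader(batch_load_books)
for author_id in (1, 2, 3):   # resolvers ask one at a time...
    loader.load(author_id)
loader.dispatch()             # ...the loader executes a single batch

print(len(CALLS), loader.get(2))  # 1 ['book-2-0', 'book-2-1']
```

The batching happens per request, at resolution time — which is why a GraphQL server never needs to know the query depth in advance, and why prefetch declarations have no direct equivalent there.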

Aggressive caching can make prefetch optimization irrelevant. If your author list endpoint is behind a Redis cache with a 60-second TTL and your data changes hourly, the endpoint hits the database once per minute regardless of whether it fires 3 queries or 851. The cache absorbs the inefficiency. I would not recommend this as a primary strategy — caches introduce invalidation complexity, stale data, and cold-start latency — but it is an honest acknowledgement that N+1 optimization is less urgent when a caching layer is already in place.

Before and after: the full picture

# BEFORE: Naive nested serializers
# ─────────────────────────────────────────────
# Authors: 50
# Books per author (avg): 8
# Reviews per book (avg): 5
#
# Queries:      851
# Response time: 1,247ms
# DB time:       1,180ms
# Serialization: 67ms

# AFTER: setup_eager_loading + select_related + to_attr
# ─────────────────────────────────────────────
# Authors: 50
# Books per author (avg): 8
# Reviews per book (avg): 5
#
# Queries:      3
# Response time: 38ms
# DB time:       22ms
# Serialization: 16ms
#
# Improvement:   97% fewer queries
#                32x faster response
#
# And with Gold Lapel auto-indexing book.author_id
# and review.book_id:
#
# Queries:      3
# Response time: 11ms
# DB time:       4ms
# Serialization: 7ms
#
# The 3 remaining queries now hit indexes instead
# of sequential scans. Another 3x faster on top.

The progression tells the story. Naive nested serializers: 851 queries, 1,247ms. Proper prefetching with to_attr: 3 queries, 38ms. Adding proper indexes on the FK columns those 3 queries use: 11ms.

That last step — indexing — is where most teams stop optimizing. Three queries feels like victory. And it is, compared to 851. But those 3 queries each scan a potentially large table using the FK column in a WHERE ... IN clause. Without a B-tree index on that column, each prefetch query performs a sequential scan. With 10 million rows in the review table, that scan is the difference between 4ms and 400ms.

PostgreSQL does not automatically index foreign key columns. The constraint exists — referential integrity is enforced — but no index is created unless you explicitly add one. Django papers over this in the common case: ForeignKey fields default to db_index=True, so the migration creates a B-tree index on the FK column itself (book.author_id), while the referenced primary key (author.id) is indexed inherently. The gap appears when that default has been overridden, or when a composite or non-standard FK leaves the index shape different from what the prefetch queries need.

To be precise about when the gap bites: for standard Django ForeignKey fields, the db_index=True default means the WHERE author_id IN (...) query will use an index. The indexing gap appears in two scenarios: when db_index=False is set explicitly (sometimes done to reduce write overhead), and when the IN clause contains enough values that PostgreSQL's query planner decides a sequential scan is cheaper than an index scan. For the latter case, Gold Lapel's query analysis can identify when the planner's cost estimate is wrong and suggest a configuration change.

Where Gold Lapel fits: indexing the queries DRF generates

Everything above is application-level work. Better serializer patterns, better prefetch declarations, better tests. It is all necessary, and it gets you from 851 queries to 3.

The remaining performance — the gap between "3 queries on suboptimal plans" and "3 queries executing optimally" — is database-level work. And it is precisely the work that Gold Lapel automates.

# What Gold Lapel sees in your DRF query traffic:
#
# Pattern 1 (prefetch_related "books"):
#   SELECT * FROM book WHERE author_id IN (1, 2, 3, ..., 50)
#   Frequency: 200 req/min
#   Missing index on: book.author_id
#   → CREATE INDEX CONCURRENTLY ON book (author_id)
#
# Pattern 2 (prefetch_related "books__reviews"):
#   SELECT * FROM review WHERE book_id IN (1, 2, ..., 400)
#   Frequency: 200 req/min
#   Missing index on: review.book_id
#   → CREATE INDEX CONCURRENTLY ON review (book_id)
#
# Pattern 3 (repeated join on publisher):
#   SELECT book.*, publisher.* FROM book
#   JOIN publisher ON book.publisher_id = publisher.id
#   WHERE book.author_id IN (...)
#   → Materialized view candidate detected
#
# Gold Lapel doesn't read your serializer code. It doesn't
# need to. It reads the SQL that reaches PostgreSQL, identifies
# the foreign key columns used in IN clauses and JOIN conditions,
# and indexes them. The ORM layer is irrelevant — the query
# pattern is everything.

Gold Lapel sits between your Django application and PostgreSQL as a transparent proxy. It does not parse your Python code. It does not inspect your serializers. It watches the actual SQL queries that reach the database — the SELECT ... WHERE author_id IN (...) pattern that prefetch_related generates — and identifies which columns are used in IN clauses, JOIN conditions, and WHERE filters.

When it detects a frequently-queried column without an index, it creates one concurrently. No migration file. No downtime. No deploy. The next time that prefetch query fires, it hits the index instead of scanning the table.

For DRF applications specifically, the impact is pronounced. Every prefetch_related call generates an IN-clause query on a FK column. Every select_related call generates a JOIN on a FK column. These are exactly the columns that benefit most from B-tree indexes, and exactly the columns where query plan analysis can reveal suboptimal access patterns.

For repeated join patterns — the same book JOIN publisher showing up in every API request — Gold Lapel can detect materialized view candidates and surface them for review. The query that took 22ms with indexes drops further when the join is pre-computed.

You fix the N+1 at the application layer. Gold Lapel fixes the performance gap at the database layer. Between the two, the 1,247ms endpoint becomes 11ms. That is not a rounding error. That is the difference between "we need to cache this" and "we do not need to cache anything." And an endpoint that does not need caching is an endpoint that always serves fresh data — which is, if you will permit me, a more honest arrangement between your API and its consumers.

The complete prefetch checklist

I shall leave you with a checklist. Not because checklists are elegant — they are not — but because the DRF prefetch problem has enough moving parts that a concise reference prevents the most common mistakes.

  1. Use get_queryset(), never class-level queryset, when your view has prefetch_related. Fresh prefetch objects on every request.
  2. Use to_attr on any Prefetch with a custom queryset. It stores results as a plain list, immune to .all() invalidation.
  3. Replace SerializerMethodField with to_attr when the method accesses a related manager. Move the filtering into the Prefetch queryset.
  4. Use select_related inside Prefetch querysets for forward FKs on the prefetched model. Turns two queries into one JOIN.
  5. Adopt the setup_eager_loading mixin so serializers own their prefetch declarations. Prevents view-serializer coupling.
  6. Write assertNumQueries tests with realistic data volumes. Assert the exact count. Include the query breakdown in comments.
  7. Consider django-auto-prefetch for automatic forward-FK coverage. It eliminates half the manual declarations.
  8. Question nesting depth. Five levels of nested serializers is a design problem, not an optimization problem. Flatten or split endpoints when the data volume grows multiplicatively.

If you follow these eight items, your DRF API will generate O(depth) queries per request instead of O(n * m). The queries themselves will be efficient. The regressions will be caught in CI. And the next developer who touches the endpoint will find a system that explains itself through its tests, its documentation, and its code.

That, if I may say, is rather the point of all this. Not just fast queries — but a codebase where fast queries are the default, and slow queries are immediately visible. A well-run household does not merely clean up messes. It arranges things so that messes do not occur.

The setup_eager_loading pattern above encodes your loading strategy in one place. For a broader treatment of how Django custom managers can enforce these patterns across every view and serializer in your application, I have written a guide to Django custom managers for PostgreSQL optimization — so the manager remembers what the developer forgets.