Polymorphic Associations Are Undermining Your PostgreSQL Performance. Here Are the Escape Routes.
Your comments table has a column called "commentable_type." Allow me to explain what it is costing you.
Good morning. You have a polymorphic association.
It seemed like such a reasonable idea at the time. Comments can belong to posts, photos, and videos. Rather than creating three separate comment tables — each with identical columns — you created one table with two special columns: commentable_type and commentable_id. ActiveRecord made it easy. One belongs_to :commentable, polymorphic: true and you were done.
That was eighteen months and four million rows ago. Now you are here because something is wrong with your query performance, and the EXPLAIN output is not making you feel better.
I have seen this pattern many times. In Rails applications, in Django projects using GenericForeignKey, in Laravel codebases with morphTo. The syntax differs. The underlying problem does not. Polymorphic associations are one of the most popular ORM features and one of the most quietly destructive PostgreSQL anti-patterns. They trade schema integrity for developer convenience — and the bill comes due at scale.
This is not a pattern that fails immediately. It works beautifully for months, sometimes years. The problems accumulate silently: orphaned rows, degrading query plans, index bloat, cardinality misestimates. By the time you notice, the table has four million rows and the EXPLAIN output reads like a medical diagnosis.
Allow me to walk you through exactly what polymorphic associations cost, why the cost compounds, and — most importantly — the escape routes available to you. We shall examine four distinct migration patterns, each with honest trade-offs, along with indexing strategies for those who must live with the pattern as-is.
# app/models/comment.rb
class Comment < ApplicationRecord
belongs_to :commentable, polymorphic: true
end
class Post < ApplicationRecord
has_many :comments, as: :commentable
end
class Photo < ApplicationRecord
has_many :comments, as: :commentable
end
class Video < ApplicationRecord
has_many :comments, as: :commentable
end
-- The resulting table:
CREATE TABLE comments (
id bigserial PRIMARY KEY,
body text NOT NULL,
commentable_type varchar NOT NULL, -- 'Post', 'Photo', 'Video'
commentable_id bigint NOT NULL, -- id in that table
created_at timestamp NOT NULL
);
-- Rails adds this index by default:
CREATE INDEX idx_comments_commentable
ON comments (commentable_type, commentable_id);
-- Notice what is missing: no foreign key constraint.
-- PostgreSQL cannot enforce a FK that points to three different tables.
This is not a Rails problem. It is a relational modeling problem.
I should be clear about something before we proceed: this article uses Rails and ActiveRecord in its examples because Rails popularized the pattern and made it most accessible. But the problem is not specific to Rails. Any framework that stores a type discriminator and an ID pointing at multiple tables produces the same structural limitations in PostgreSQL.
# Django's equivalent: GenericForeignKey
from django.contrib.contenttypes.fields import (
GenericForeignKey, GenericRelation
)
from django.contrib.contenttypes.models import ContentType
class Comment(models.Model):
content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
object_id = models.PositiveIntegerField()
commentable = GenericForeignKey('content_type', 'object_id')
body = models.TextField()
class Post(models.Model):
comments = GenericRelation(Comment)
# The resulting table is structurally identical:
# content_type_id (FK to django_content_types) + object_id
# Same problem: object_id cannot have a FK constraint
# to the referenced table.
# Django at least uses an integer FK to content_types
# instead of storing class names as strings.
# But the fundamental issue — one ID column pointing
# at multiple tables — remains.
// Laravel's morphTo/morphMany:
class Comment extends Model {
public function commentable() {
return $this->morphTo();
}
}
class Post extends Model {
public function comments() {
return $this->morphMany(Comment::class, 'commentable');
}
}
// Resulting columns: commentable_type, commentable_id
// Same pattern. Same limitations.
// Laravel stores the fully-qualified class name:
// 'App\Models\Post', 'App\Models\Photo'
// These strings are even longer than Rails' class names,
// making the index bloat worse.
Django's GenericForeignKey uses an integer FK to the content_types table rather than storing class names as strings, which avoids some of the index bloat that Rails and Laravel suffer. But the core issue — object_id pointing at rows in multiple tables without a foreign key constraint — is identical.
Laravel's implementation is arguably the worst from a storage perspective. It stores fully-qualified PHP class names like App\Models\Post in the type column — 16 bytes or more per row, repeated millions of times, included in every index. But the class name length is a secondary concern. The real problem is structural, and it affects every framework equally.
The relational model was designed around the principle that a foreign key references exactly one table. Polymorphic associations violate this principle by design. Understanding this as a modeling problem rather than a framework bug is essential, because it explains why no framework can fully solve it — and why the solutions are schema-level, not code-level.
What exactly are you giving up?
Four things, each progressively more expensive as your data grows.
1. Foreign key constraints — gone
A foreign key constraint tells PostgreSQL: "this column must reference a valid row in that table." It prevents orphaned records, enables ON DELETE CASCADE, and gives the query planner useful cardinality information. Foreign keys are not merely a data integrity convenience — they are the mechanism by which PostgreSQL guarantees that your references are valid. Without them, you are trusting your application code to never make a mistake. I have observed application code. It makes mistakes.
Polymorphic associations make foreign key constraints impossible. The commentable_id column might reference posts.id, photos.id, or videos.id — PostgreSQL cannot enforce a constraint that points to three different tables simultaneously.
-- You might try this. It will not do what you need.
ALTER TABLE comments
ADD CONSTRAINT fk_commentable
FOREIGN KEY (commentable_id) REFERENCES posts(id);
-- The constraint applies to EVERY row, not just
-- commentable_type = 'Post'. Photo and Video comments are
-- checked against posts(id) too, so the ALTER fails
-- validation (or, worse, passes on coincidental id overlap).
-- You would need one FK per type, but commentable_id
-- can point to ANY of them.
-- The result: orphaned rows accumulate silently.
-- A deleted Post leaves its comments behind with
-- commentable_type = 'Post', commentable_id = 42,
-- pointing at nothing.
Without FK constraints, referential integrity becomes your application's responsibility. Every DELETE must remember to clean up associated comments. Every bug in that cleanup logic creates orphaned rows that silently accumulate, bloating your table and corrupting your aggregations. Every race condition between a delete and a concurrent insert creates a window where orphaned records can slip through.
The dependent: :destroy callback in Rails handles this in the common case — but callbacks can be skipped (delete_all, bulk operations, raw SQL), and when they are skipped, orphans are born. A foreign key constraint with ON DELETE CASCADE runs inside PostgreSQL itself. It cannot be skipped. It cannot be forgotten. It does not depend on your application being well-behaved.
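The callback-skipping failure mode is easy to reproduce in miniature. Here is a plain-Ruby sketch (no Rails; all names are hypothetical): `destroy` runs the cleanup callback the way dependent: :destroy does, `delete` skips it the way delete_all and raw SQL do, and the orphan survives.

```ruby
# Simulated tables: COMMENTS holds [commentable_type, commentable_id]
# rows; PostStore simulates the posts table.
COMMENTS = []

class PostStore
  def initialize
    @posts = { 1 => "a post", 2 => "another post" }
  end

  # destroy: deletes the row AND runs the cleanup callback,
  # like ActiveRecord's dependent: :destroy.
  def destroy(id)
    @posts.delete(id)
    COMMENTS.reject! { |c| c == ["Post", id] }
  end

  # delete: deletes the row and skips callbacks, like
  # delete_all, bulk operations, or raw SQL.
  def delete(id)
    @posts.delete(id)
  end

  def exists?(id)
    @posts.key?(id)
  end
end

def orphaned_comments(store)
  COMMENTS.select { |type, id| type == "Post" && !store.exists?(id) }
end

store = PostStore.new
COMMENTS.concat([["Post", 1], ["Post", 1], ["Post", 2]])

store.destroy(1)  # callback runs: comments for post 1 are removed
store.delete(2)   # callback skipped: the comment for post 2 survives

# orphaned_comments(store) now holds [["Post", 2]], a row pointing
# at nothing. No error was raised at any point.
```

A database-level ON DELETE CASCADE has no equivalent of the `delete` path: there is no way to reach the table that bypasses it.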
-- How many orphaned comments do you have right now?
-- This query checks for commentable_id values that
-- no longer exist in their referenced table:
SELECT commentable_type, count(*) AS orphaned_count
FROM comments c
WHERE commentable_type = 'Post'
AND NOT EXISTS (
SELECT 1 FROM posts p WHERE p.id = c.commentable_id
)
GROUP BY commentable_type
UNION ALL
SELECT commentable_type, count(*)
FROM comments c
WHERE commentable_type = 'Photo'
AND NOT EXISTS (
SELECT 1 FROM photos p WHERE p.id = c.commentable_id
)
GROUP BY commentable_type
UNION ALL
SELECT commentable_type, count(*)
FROM comments c
WHERE commentable_type = 'Video'
AND NOT EXISTS (
SELECT 1 FROM videos v WHERE v.id = c.commentable_id
)
GROUP BY commentable_type;
-- On a production table I reviewed recently:
-- commentable_type | orphaned_count
-- ------------------+----------------
-- Post | 3,847
-- Photo | 912
-- Video | 156
--
-- 4,915 orphaned rows. No error was ever raised.
If you run that query against your production database and the count is zero, you have either been extraordinarily disciplined or extraordinarily lucky. In my experience, the former is rare and the latter does not persist.
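Hand-writing one UNION ALL branch per type gets tedious as types multiply. A small Ruby sketch that generates the orphan-count query above from a type-to-table mapping (the `TYPE_TABLES` hash is an assumption about your schema; adjust it to yours):

```ruby
# Map each commentable_type value to the table it references.
TYPE_TABLES = { "Post" => "posts", "Photo" => "photos", "Video" => "videos" }

# Build one NOT EXISTS branch per type and join them with UNION ALL.
def orphan_check_sql(type_tables)
  branches = type_tables.map do |type, table|
    <<~SQL.strip
      SELECT commentable_type, count(*) AS orphaned_count
      FROM comments c
      WHERE commentable_type = '#{type}'
        AND NOT EXISTS (
          SELECT 1 FROM #{table} t WHERE t.id = c.commentable_id
        )
      GROUP BY commentable_type
    SQL
  end
  branches.join("\nUNION ALL\n") + ";"
end

puts orphan_check_sql(TYPE_TABLES)
```

Add a new type, add one hash entry, and the audit query stays in sync.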
2. JOINs — blocked
SQL JOINs require a target table. A polymorphic association does not have one — it has several. Rails acknowledges this directly by raising EagerLoadPolymorphicError when you attempt .joins(:commentable).
# Rails eager loading with polymorphic associations:
Comment.includes(:commentable)
# This works — but sends MULTIPLE queries:
# SELECT * FROM comments;
# SELECT * FROM posts WHERE id IN (1, 2, 3, ...);
# SELECT * FROM photos WHERE id IN (4, 5, ...);
# SELECT * FROM videos WHERE id IN (6, ...);
# What you CANNOT do:
Comment.joins(:commentable)
# => ActiveRecord::EagerLoadPolymorphicError:
# Cannot eagerly load the polymorphic association :commentable
# JOINs require a single target table.
# Polymorphic associations point to many tables.
# Rails raises an error rather than guess.
Without JOINs, every eager load becomes multiple queries — one per type. This is the N+1 problem baked into the schema itself. It is not a bug in your application code. It is not a missing .includes() call. It is a structural limitation of the data model that cannot be resolved by better application code.
The impact is most visible in activity feeds and dashboards — any page that displays comments from mixed types. Rails must issue a separate query for each type to resolve the associated records. Three types, three queries. Six types, six queries. Each query is individually fast, but the round trips accumulate. On a page that displays 50 comments belonging to 4 types, Rails issues at minimum 5 queries (one for comments, one per type) regardless of how well you optimize each individual query.
Some developers work around this with raw SQL LEFT JOINs to every type table, using COALESCE to merge results. This works but produces plans that scale linearly with the number of types — each additional type adds another sequential scan and another hash join. I have seen these queries exceed 400ms in production with as few as five types. The Rails-specific optimization guide covers the UNION ALL alternative in detail.
3. Index efficiency — compromised
The default composite index on (commentable_type, commentable_id) works for type-specific lookups. But any query that does not filter by type — recent comments across all types, global full-text search, aggregation by date — cannot use this index efficiently. The type column is the leading column; without it, the index is only partially useful.
This is a consequence of how B-tree indexes work. A composite index on (A, B) is sorted first by A, then by B within each A value. A query that filters only by B must scan across all A groups — PostgreSQL typically declines to do this, choosing a sequential scan instead. Your carefully created index sits unused while PostgreSQL reads every row in the table.
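A toy model makes the ordering argument concrete. This sketch (plain Ruby, invented data) sorts index entries by (type, id), as a composite B-tree does: the composite lookup can binary-search, while an id-only lookup has no usable order and must touch every entry.

```ruby
# A miniature composite "index": entries sorted by type first,
# then by id within each type, like a B-tree on (A, B).
INDEX = [
  ["Photo", 3], ["Photo", 9],
  ["Post", 1], ["Post", 4], ["Post", 9],
  ["Video", 2]
].sort

def lookup_by_type_and_id(index, type, id)
  # Binary search works: the sort order matches the predicate.
  index.bsearch { |entry| [type, id] <=> entry }
end

def lookup_by_id_only(index, id)
  # ids are NOT globally sorted (9 appears under both Photo and
  # Post), so every entry must be examined: a full scan.
  index.select { |_type, entry_id| entry_id == id }
end
```

PostgreSQL reaches the same conclusion and chooses a sequential scan of the table rather than a useless traversal of the index.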
4. Query planner confusion — the hidden cost
This fourth cost is less discussed but frequently significant. The query planner estimates cardinality — how many rows a query will return — to choose between sequential scans, index scans, nested loops, hash joins, and merge joins. Polymorphic tables produce systematically inaccurate cardinality estimates because the planner treats commentable_type and commentable_id as independent columns.
-- The planner's cardinality estimates on polymorphic tables
-- are frequently inaccurate. Here is why:
-- Check the statistics:
SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'comments'
AND attname = 'commentable_type';
-- most_common_vals | most_common_freqs
-- -------------------+-------------------
-- {Post,Photo,Video} | {0.6,0.3,0.1}
-- PostgreSQL knows the distribution. But the composite
-- statistics across (commentable_type, commentable_id)
-- are estimated independently by default.
-- When you query WHERE commentable_type = 'Post'
-- AND commentable_id = 42, the planner multiplies:
-- P(type='Post') * P(id=42) = 0.6 * (1/n_distinct)
-- This assumes independence. In practice, commentable_id
-- values are NOT uniformly distributed across types.
-- Post IDs come from the posts table (200K rows).
-- Photo IDs come from the photos table (150K rows).
-- The planner does not know this.
-- Result: cardinality misestimates that cascade into
-- wrong join strategies, wrong index choices,
-- wrong memory allocations for sorts and hash tables.
The planner assumes that the probability of commentable_type = 'Post' AND commentable_id = 42 equals P(type='Post') * P(id=42). But these columns are not independent — the set of valid IDs depends entirely on the type. Post IDs come from the posts table. Photo IDs come from the photos table. The ID spaces may overlap, may have gaps, may have vastly different cardinalities. The planner does not know any of this.
The result: wrong row count estimates that cascade into wrong join strategies, wrong index choices, and wrong memory allocations. A query the planner estimates will return 1 row actually returns 12. A query estimated at 50,000 rows returns 847. These misestimates are the invisible tax on every query that touches the polymorphic table.
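The arithmetic behind the misestimate, using hypothetical figures consistent with the example table (1M rows, 60% Posts; the n_distinct value is invented for illustration):

```ruby
TOTAL_ROWS  = 1_000_000
P_TYPE_POST = 0.6      # from pg_stats most_common_freqs
N_DISTINCT  = 450_000  # distinct commentable_id values across ALL types

# Without extended statistics, the planner multiplies marginal
# selectivities as if the columns were independent:
naive_estimate = TOTAL_ROWS * P_TYPE_POST * (1.0 / N_DISTINCT)
# roughly 1.3 rows estimated

# In reality, Post ids are drawn only from the posts table, and a
# popular post can hold many comments. Suppose post 42 has:
actual_rows = 12

# The planner plans for ~1 row and receives 12. On a standalone
# lookup this is harmless; inside a larger join it can tip the
# planner into a nested loop that should have been a hash join.
```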
-- PostgreSQL 10+ extended statistics can help:
CREATE STATISTICS comments_type_id_stats
(dependencies, ndistinct, mcv)
ON commentable_type, commentable_id FROM comments;
-- Then refresh:
ANALYZE comments;
-- Now the planner understands the correlation between
-- type and id. It knows that commentable_id values
-- for type='Post' are drawn from a different pool
-- than those for type='Photo'.
-- Before extended statistics:
-- Estimated rows: 1 Actual rows: 12
-- After extended statistics:
-- Estimated rows: 11 Actual rows: 12
-- Close enough. The planner chooses the right index.
PostgreSQL's extended statistics (available since version 10) can help by teaching the planner about correlations between columns. Creating extended statistics on (commentable_type, commentable_id) tells the planner to consider their joint distribution rather than treating them independently. This does not fix the structural problems, but it gives the planner better estimates for the queries you do run.
What does this cost in practice?
Numbers clarify things. Here is a comments table with 1 million rows — 600K for posts, 300K for photos, 100K for videos — on PostgreSQL 16.
-- Query: find all comments for Post id 42
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM comments
WHERE commentable_type = 'Post'
AND commentable_id = 42;
-- With the default composite index on (commentable_type, commentable_id):
Index Scan using idx_comments_commentable on comments
(cost=0.56..8.58 rows=1 width=97)
(actual time=0.031..0.035 rows=12 loops=1)
Index Cond: ((commentable_type = 'Post') AND (commentable_id = 42))
Buffers: shared hit=4
-- Acceptable for a single lookup. Now scale it:
-- 200 posts, each needing their comments:
-- 200 x Index Scan = 200 queries, ~800 buffer hits.
A single type-specific lookup is fine. The composite index handles it. Four buffer hits, sub-millisecond response. If this is your only query pattern — look up comments for a specific post, a specific photo — polymorphic associations perform adequately. The index was designed for exactly this access pattern.
But now consider the eager-loading pattern Rails uses when you call Comment.includes(:commentable) on a collection of 200 comments belonging to mixed types. Rails issues three separate queries — one per type — each with an IN clause containing the relevant IDs. Three round trips, three index scans, three result sets stitched together in Ruby. Each individual query is fast. The combined cost — network round trips, Ruby object allocation, result merging — adds up to tens of milliseconds on a good day and hundreds on a busy one.
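What includes(:commentable) does internally can be sketched in a few lines of plain Ruby: group the loaded comments by type, then emit one IN-clause query per type. The data and the table mapping here are hypothetical.

```ruby
# A batch of loaded comments (type + id pairs), as Rails sees them.
comments = [
  { type: "Post",  id: 1 }, { type: "Post",  id: 4 },
  { type: "Photo", id: 2 }, { type: "Video", id: 7 },
  { type: "Post",  id: 1 }  # duplicates collapse into one IN list
]

TYPE_TABLES = { "Post" => "posts", "Photo" => "photos", "Video" => "videos" }

# One query per type present in the batch: the schema-level N+1.
def eager_load_queries(comments, type_tables)
  comments.group_by { |c| c[:type] }.map do |type, group|
    ids = group.map { |c| c[:id] }.uniq.sort
    "SELECT * FROM #{type_tables.fetch(type)} WHERE id IN (#{ids.join(', ')})"
  end
end

queries = eager_load_queries(comments, TYPE_TABLES)
# Three types in the batch, three queries. Every new commentable
# type adds another round trip to every mixed-type page.
```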
Now consider a query that crosses types entirely:
-- Query: find recent comments across ALL types, sorted
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.*, commentable_type, commentable_id
FROM comments c
WHERE created_at > '2025-01-01'
ORDER BY created_at DESC
LIMIT 50;
-- The composite index on (commentable_type, commentable_id) is useless here.
-- PostgreSQL has no index on created_at:
Sort (cost=28471.32..28596.32 rows=50000 width=97)
Sort Key: created_at DESC
Sort Method: top-N heapsort Memory: 71kB
-> Seq Scan on comments (cost=0.00..21370.00 rows=50000 width=97)
(actual time=0.021..98.432 rows=47823 loops=1)
Filter: (created_at > '2025-01-01')
Rows Removed by Filter: 952177
Buffers: shared hit=8370
-- Sequential scan over 1M rows. 98ms. And this is a warm cache.
98 milliseconds on a warm cache, with 952,177 rows read and discarded to return 47,823. On a cold cache or a table with 10 million rows, this is measured in seconds. The composite index on (commentable_type, commentable_id) cannot help because the query does not filter by type.
This query is not exotic. It is the "recent activity" feed. It is the admin dashboard showing latest comments. It is the moderation queue. Any feature that displays comments regardless of what they are attached to hits this wall.
In fairness: when polymorphic associations are the right choice
I should be forthcoming about the boundaries of this argument, because presenting polymorphic associations as universally wrong would be a disservice to you and an embarrassment to me.
Polymorphic associations are a reasonable choice when:
- You have two or three types and no plans for more. The complexity of the alternative patterns — join tables, separate tables, partitioning — is not free. For a small number of types with modest data volumes, the simplicity of polymorphic: true earns its keep.
- Your queries almost always filter by type. If you never need cross-type queries — no activity feeds, no global search, no aggregate reports — the default composite index serves every query you actually run. The limitations exist but do not affect you.
- The table will remain small. Below 100,000 rows, PostgreSQL handles sequential scans quickly enough that index optimization is academic. The break-even point where polymorphic overhead becomes measurable depends on your hardware and query frequency, but most tables under 100K rows perform well regardless of schema design.
- You are prototyping. The cost of polymorphic associations is paid at scale. If you are validating a product idea and will redesign the data model before launch, the speed of polymorphic: true is a legitimate advantage. Ship it, learn what your actual query patterns are, then design the schema around those patterns.
The advice in this article is for the other case: you have shipped, your table has grown, your queries are slowing down, and the bill is coming due. For that case, the escape routes below are ordered from least disruptive to most thorough.
Do I have to migrate away from polymorphic associations?
Not necessarily. The severity depends on your scale, query patterns, and tolerance for the constraints above. For a small application with two or three types and modest data volumes, polymorphic associations are perfectly adequate. ActiveRecord makes them convenient, and convenience has value — real, measurable value in development velocity.
But if any of the following apply, it is time to consider alternatives:
- Your polymorphic table exceeds 1 million rows
- You need cross-type queries (recent activity feeds, global search)
- You have orphaned rows from inconsistent deletions
- Eager loading performance is degrading as types multiply
- You need to JOIN through the polymorphic association for reporting
- Your EXPLAIN output shows cardinality estimates that are wildly wrong
- You are adding new types and each one adds another eager-load query
GitLab reached this conclusion years ago and documented a formal policy against polymorphic associations in their codebase. Their migration to join tables is one of the patterns below. They did not arrive at this position casually — they arrived at it after years of operating polymorphic tables at scale and measuring the cost.
Four escape routes
Each pattern resolves the core problems — FK constraints, JOINs, index efficiency — with different trade-offs. The right choice depends on whether your types share structure, whether you need global queries across types, and how much migration risk you can absorb.
Pattern 1: Explicit per-type tables
The most straightforward migration. Replace the polymorphic table with separate tables, one per type. Each gets its own foreign key, its own indexes, its own constraints. The data model becomes boring. Boring, in infrastructure, is the highest compliment available.
# Pattern 1: Explicit per-type associations
# Replace polymorphic with dedicated foreign keys.
class PostComment < ApplicationRecord
belongs_to :post
end
class PhotoComment < ApplicationRecord
belongs_to :photo
end
class VideoComment < ApplicationRecord
belongs_to :video
end
# Or, if comments share structure, use STI on the comment side:
class Comment < ApplicationRecord
# No polymorphic columns
end
class PostComment < Comment
belongs_to :post
end
-- Explicit associations: real foreign keys, real JOINs
CREATE TABLE post_comments (
id bigserial PRIMARY KEY,
post_id bigint NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
body text NOT NULL,
created_at timestamp NOT NULL
);
CREATE TABLE photo_comments (
id bigserial PRIMARY KEY,
photo_id bigint NOT NULL REFERENCES photos(id) ON DELETE CASCADE,
body text NOT NULL,
created_at timestamp NOT NULL
);
-- Foreign key constraints: enforced.
-- JOINs: straightforward.
-- Query planner: one table, one index, no type discrimination.
This works well when the types are genuinely different — when post comments and photo comments will eventually diverge in structure, permissions, or lifecycle. It is the pattern the Rails Guides describe as the alternative.
The query planner loves this pattern. One table, one index, one set of statistics. No type discrimination, no cardinality confusion, no composite index with a string leading column. Every query plan is straightforward. Every estimate is accurate.
The cost: if you need "all recent comments regardless of type," you must UNION ALL across tables. With two or three types, this is manageable. With twelve, it is not. This is the fundamental trade-off — perfect per-type performance at the expense of cross-type query convenience.
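Under Pattern 1 that cross-type query is mechanical enough to generate rather than hand-write. A sketch, assuming per-type tables named post_comments, photo_comments, and video_comments with identical columns:

```ruby
# Hypothetical mapping from a display label to its per-type table.
COMMENT_TABLES = {
  "Post"  => "post_comments",
  "Photo" => "photo_comments",
  "Video" => "video_comments"
}

# Build the "recent comments across all types" query: one branch
# per table, merged with UNION ALL, sorted and limited at the top.
def recent_comments_sql(tables, limit: 50)
  branches = tables.map do |type, table|
    "SELECT body, created_at, '#{type}' AS source_type FROM #{table}"
  end
  branches.join("\nUNION ALL\n") +
    "\nORDER BY created_at DESC\nLIMIT #{limit};"
end

sql = recent_comments_sql(COMMENT_TABLES)
# With three types this is tolerable. With twelve, the statement
# and its plan grow linearly: that is the trade-off Pattern 1 makes.
```

If each table has an index on created_at DESC, PostgreSQL can satisfy this with a Merge Append over pre-sorted branches rather than a full sort.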
Pattern 2: Join tables (GitLab's approach)
Keep the comments in one table. Create a small join table per type to link comments to their parent. Each join table carries its own FK constraints.
-- Pattern 2: Separate join tables (GitLab's approach)
-- Keep a single comments table, but use join tables per type.
CREATE TABLE comments (
id bigserial PRIMARY KEY,
body text NOT NULL,
created_at timestamp NOT NULL
);
CREATE TABLE post_comments (
post_id bigint NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
comment_id bigint NOT NULL REFERENCES comments(id) ON DELETE CASCADE,
PRIMARY KEY (post_id, comment_id)
);
CREATE TABLE photo_comments (
photo_id bigint NOT NULL REFERENCES photos(id) ON DELETE CASCADE,
comment_id bigint NOT NULL REFERENCES comments(id) ON DELETE CASCADE,
PRIMARY KEY (photo_id, comment_id)
);
-- One comments table (shared structure, easy to query globally).
-- Join tables provide FK constraints per type.
-- JOINs work naturally through the join table.
-- Fetch all comments for a post, with a proper JOIN:
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.*
FROM comments c
JOIN post_comments pc ON pc.comment_id = c.id
WHERE pc.post_id = 42
ORDER BY c.created_at DESC;
Nested Loop (cost=0.56..12.84 rows=12 width=64)
-> Index Scan using post_comments_pkey on post_comments pc
(cost=0.29..4.41 rows=12 width=8)
Index Cond: (post_id = 42)
-> Index Scan using comments_pkey on comments c
(cost=0.42..0.70 rows=1 width=64)
Index Cond: (id = pc.comment_id)
Buffers: shared hit=5
-- Clean nested loop. FK-enforced. No type column needed.
This is the pattern GitLab adopted across their codebase. It preserves the ability to query all comments globally (they are in one table) while adding FK constraints through the join tables. JOINs work naturally through the bridge table.
-- The join table pattern's advantage: global queries still work.
-- "Show the 50 most recent comments regardless of type"
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.*, 'Post' AS source_type, pc.post_id AS source_id
FROM comments c
JOIN post_comments pc ON pc.comment_id = c.id
UNION ALL
SELECT c.*, 'Photo' AS source_type, phc.photo_id AS source_id
FROM comments c
JOIN photo_comments phc ON phc.comment_id = c.id
ORDER BY created_at DESC
LIMIT 50;
-- Each branch uses the join table's primary key index.
-- Merge Append combines the pre-sorted results.
-- No sequential scan on the comments table.
-- Alternatively, if you do not need the source:
SELECT * FROM comments ORDER BY created_at DESC LIMIT 50;
-- This is a simple index scan on (created_at DESC).
-- No type column in the way. No polymorphic filter.
-- Just a clean, fast query on a single table.
The join table pattern is my recommendation for most applications migrating away from polymorphic associations. It preserves the single-table advantage for global queries — SELECT * FROM comments ORDER BY created_at DESC LIMIT 50 is a simple index scan with no type filtering, no UNION ALL, no complexity. But it adds per-type FK constraints through the bridge tables, so deletions cascade properly and orphaned rows become structurally impossible.
The cost: an extra JOIN per query (through the bridge table), and more tables to manage. In practice, the bridge tables are small — just two integer columns with a composite primary key — and the JOIN cost is negligible. The nested loop plan above shows 5 buffer hits. The polymorphic equivalent shows 4. One extra buffer hit is not a concern worth losing sleep over.
Pattern 3: PostgreSQL table inheritance
-- Pattern 3: PostgreSQL table inheritance
CREATE TABLE comments (
id bigserial PRIMARY KEY,
body text NOT NULL,
created_at timestamp NOT NULL
);
CREATE TABLE post_comments (
post_id bigint NOT NULL REFERENCES posts(id) ON DELETE CASCADE
) INHERITS (comments);
CREATE TABLE photo_comments (
photo_id bigint NOT NULL REFERENCES photos(id) ON DELETE CASCADE
) INHERITS (comments);
-- Querying the parent table scans all children:
SELECT * FROM comments; -- returns post_comments + photo_comments
SELECT * FROM ONLY comments; -- returns only direct comments rows
SELECT * FROM post_comments; -- returns only post comments
-- Caveats (significant):
-- - Unique constraints do NOT span children
-- - Primary keys are NOT inherited
-- - Indexes must be created on EACH child table separately
-- - Most ORMs have poor or no support for this
PostgreSQL's table inheritance lets child tables inherit columns from a parent. Querying the parent table automatically includes rows from all children. Each child can add its own columns and FK constraints.
This is the most PostgreSQL-native solution and the least ORM-friendly. ActiveRecord does not support table inheritance. Sequelize does not. Django does not. Most ORMs treat it as an edge case or ignore it entirely. If you are writing raw SQL or using a thin query builder, inheritance is elegant. If you are committed to an ORM, this pattern adds friction that may not be worth the benefit.
The caveats are real and significant: unique constraints and indexes do not span child tables automatically, primary keys are not inherited, and you must create indexes on each child separately. A SELECT * FROM comments WHERE id = 42 may match rows in multiple children with the same ID — because each child's bigserial sequence is independent unless you share a sequence explicitly.
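The duplicate-ID caveat can be demonstrated with a toy sequence model in plain Ruby (illustrative, not PostgreSQL internals): each child table owns its own counter, so the same id is handed out twice.

```ruby
# A minimal stand-in for a bigserial sequence.
class Sequence
  def initialize
    @n = 0
  end

  def next_val
    @n += 1
  end
end

# Under INHERITS, each child table gets its own sequence by default.
post_comments_seq  = Sequence.new
photo_comments_seq = Sequence.new

post_comment_id  = post_comments_seq.next_val   # first post comment
photo_comment_id = photo_comments_seq.next_val  # first photo comment
# Both children handed out the same id. A query against the parent
# table (SELECT * FROM comments WHERE id = ...) sees both rows.

# Sharing one sequence (or using UUIDs) restores global uniqueness:
shared_seq = Sequence.new
a = shared_seq.next_val
b = shared_seq.next_val
```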
I include this pattern for completeness and because it is genuinely the right choice for PostgreSQL-native applications that do not use an ORM. But for the majority of web applications using ActiveRecord, Django ORM, or Eloquent, Patterns 1, 2, or 4 are more practical.
Pattern 4: Declarative partitioning
PostgreSQL 10 introduced declarative partitioning, which provides the query-routing benefits of table inheritance with better planner integration and stronger constraint enforcement. You keep the original table structure — including the type column — but PostgreSQL physically separates the rows into per-type partitions.
-- Pattern 4: Declarative partitioning (PostgreSQL 10+)
-- Similar to inheritance but with planner-level optimization.
CREATE TABLE comments (
id bigserial,
body text NOT NULL,
commentable_type varchar NOT NULL,
commentable_id bigint NOT NULL,
created_at timestamp NOT NULL,
PRIMARY KEY (id, commentable_type) -- partition key must be in PK
) PARTITION BY LIST (commentable_type);
CREATE TABLE comments_post PARTITION OF comments
FOR VALUES IN ('Post');
CREATE TABLE comments_photo PARTITION OF comments
FOR VALUES IN ('Photo');
CREATE TABLE comments_video PARTITION OF comments
FOR VALUES IN ('Video');
-- Now add per-partition FK constraints:
ALTER TABLE comments_post
ADD CONSTRAINT fk_post
FOREIGN KEY (commentable_id) REFERENCES posts(id) ON DELETE CASCADE;
-- Per-partition indexes:
CREATE INDEX idx_comments_post_id ON comments_post (commentable_id);
CREATE INDEX idx_comments_photo_id ON comments_photo (commentable_id);
-- Partition pruning: when you query WHERE commentable_type = 'Post',
-- PostgreSQL only scans the comments_post partition.
-- No sequential scan of the full table. Automatic.
-- Partition pruning in action:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM comments
WHERE commentable_type = 'Post'
AND commentable_id = 42;
Append (cost=0.29..8.31 rows=1 width=97)
-> Index Scan using idx_comments_post_id on comments_post
(cost=0.29..8.31 rows=1 width=97)
(actual time=0.011..0.016 rows=12 loops=1)
Index Cond: (commentable_id = 42)
Buffers: shared hit=3
Planning Time: 0.284 ms
-- PostgreSQL pruned the photo and video partitions entirely.
-- Only the post partition was scanned. The planner did this
-- at planning time, not execution time.
-- Cross-type queries scan all partitions but in parallel:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM comments
ORDER BY created_at DESC LIMIT 50;
Limit (cost=0.86..3.47 rows=50 width=97)
-> Merge Append (cost=0.86..52184.47 rows=1000000 width=97)
Sort Key: created_at DESC
-> Index Scan Backward using idx_comments_post_created
on comments_post (rows=600000)
-> Index Scan Backward using idx_comments_photo_created
on comments_photo (rows=300000)
-> Index Scan Backward using idx_comments_video_created
on comments_video (rows=100000)
Buffers: shared hit=6
Declarative partitioning is the most interesting escape route because it preserves the original table's interface. Application code continues to INSERT INTO comments and SELECT FROM comments without knowing partitions exist. PostgreSQL routes rows to the correct partition on insert and prunes irrelevant partitions on select. Your ORM does not need to change.
The trade-offs are specific and measurable. The partition key (commentable_type) must be part of the primary key, which changes your PK from (id) to (id, commentable_type). Any foreign keys pointing to the comments table must include the partition key. Unique indexes must include the partition key. These constraints can require changes to referencing tables — a cost that must be evaluated against the benefits.
Per-partition FK constraints are the headline feature for polymorphic tables. You can add FOREIGN KEY (commentable_id) REFERENCES posts(id) on the comments_post partition specifically. PostgreSQL enforces referential integrity for Post comments without affecting Photo or Video partitions. This is the only pattern that provides FK constraints while keeping a single logical table.
"Most application developers interact with PostgreSQL through an ORM that exposes perhaps 10-15% of what PostgreSQL actually offers. The remaining 85% includes features that eliminate the need for much of the infrastructure complexity that modern applications carry."
— from You Don't Need Redis, Chapter 1: Good Evening. We Have a Problem.
How do the patterns compare?
| Strategy | FK constraints | JOINs | Query complexity | Schema cost | Best for |
|---|---|---|---|---|---|
| Polymorphic (status quo) | No | No (EagerLoadPolymorphicError) | Multiple queries per type | Minimal | Prototyping, few types, low scale |
| Explicit per-type tables | Yes | Yes | Single JOIN | One table per type | Divergent schemas, independent lifecycles |
| Join tables (GitLab pattern) | Yes | Yes | Single JOIN through bridge | One join table per type | Shared structure, many types, global queries |
| Table inheritance | Yes (per child) | Yes | Automatic with parent query | Child tables inherit parent | PostgreSQL-native apps, no ORM |
| Declarative partitioning | Yes (per partition) | Yes | Automatic partition pruning | Partition per type value | High-volume tables, per-type FK needed |
For most applications at scale, the join table pattern (Pattern 2) offers the best balance: global queries still work, FK constraints are enforced, and JOINs are available. Explicit per-type tables (Pattern 1) are preferable when the types have genuinely different lifecycles. Declarative partitioning (Pattern 4) is the strongest choice when you need FK constraints but want to preserve the single-table interface for application code.
Table inheritance (Pattern 3) is the right answer for a narrow set of applications — PostgreSQL-native codebases that do not use an ORM. For everyone else, its ORM incompatibility creates more friction than it resolves.
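For reference, the join-table shape summarized in the comparison above reduces to something like this sketch (names are illustrative, following the article's examples):

```sql
-- Join-table (bridge) pattern: comments keeps the shared columns,
-- one narrow bridge table per parent type carries the FK pair.
CREATE TABLE comments (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    body       text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE post_comments (
    comment_id bigint NOT NULL UNIQUE REFERENCES comments(id),
    post_id    bigint NOT NULL REFERENCES posts(id)
);
-- photo_comments and video_comments follow the same shape.
-- Global queries still hit comments directly; per-type queries
-- JOIN through the bridge and get real FK enforcement.
```

The UNIQUE on comment_id ensures a comment belongs to at most one post; dropping it would allow a comment to be shared across parents, which is occasionally what you want.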
How do you actually migrate?
Migrating a live polymorphic association is a five-phase operation. The critical principle: never switch reads and writes simultaneously. Dual-write first, backfill, switch reads, then retire the old structure. Impatience at this stage produces data loss. Patience produces a clean migration with zero downtime.
# Migrating polymorphic to explicit associations in Rails:
# Phase 1: Add the new table alongside the old one
class CreatePostComments < ActiveRecord::Migration[7.1]
def change
create_table :post_comments do |t|
t.references :post, null: false, foreign_key: true
t.text :body, null: false
t.timestamps
end
end
end
# Phase 2: Dual-write (write to both old and new)
class Comment < ApplicationRecord
after_create :sync_to_typed_table
private
def sync_to_typed_table
case commentable_type
when 'Post'
PostComment.create!(post_id: commentable_id, body: body)
end
end
end
# Phase 3: Backfill existing data. Scope it to rows created before
# dual-write went live (DUAL_WRITE_DEPLOYED_AT: the timestamp at
# which you deployed Phase 2) so dual-written rows are not copied twice.
Comment.where(commentable_type: 'Post')
       .where('created_at < ?', DUAL_WRITE_DEPLOYED_AT)
       .find_each do |c|
  PostComment.create!(post_id: c.commentable_id, body: c.body)
end
# Phase 4: Switch reads to the new table
# Phase 5: Stop dual-writing, drop the old columns
Allow me to expand on each phase, because the devil resides in the transitions between them.
Phase 1 — Create the new structure. Add the new table alongside the old one. Do not drop anything. Do not modify the existing table. The new table simply exists, empty, waiting. This migration is zero-risk and can be deployed at any time.
Phase 2 — Dual-write. Every new comment is written to both the old polymorphic table and the new typed table. Use an after_create callback, a database trigger, or application-level logic. The old table remains the source of truth for reads. The new table is accumulating data for the cutover.
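If you prefer the database-trigger route for dual-writing, a sketch looks like this (function and trigger names are illustrative; the table shapes follow the Phase 1 migration):

```sql
-- Dual-write via trigger: mirror Post comments into post_comments
-- on insert, keeping the sync out of application code entirely.
CREATE OR REPLACE FUNCTION sync_post_comment() RETURNS trigger AS $$
BEGIN
    IF NEW.commentable_type = 'Post' THEN
        INSERT INTO post_comments (post_id, body, created_at, updated_at)
        VALUES (NEW.commentable_id, NEW.body, NEW.created_at, NEW.created_at);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER comments_dual_write
    AFTER INSERT ON comments
    FOR EACH ROW EXECUTE FUNCTION sync_post_comment();
```

The trigger fires inside the same transaction as the original insert, so the two tables cannot drift on a partial failure — an advantage the after_create callback does not share.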
Phase 3 — Backfill. Copy existing data from the old table to the new table in batches. Use find_each (which processes records in batches of 1,000 by default) rather than loading everything into memory. For tables with tens of millions of rows, consider a background job that processes batches with a delay between them to avoid saturating the database.
Do not attempt to backfill in a single transaction — it will hold locks for too long and generate excessive WAL. Batched inserts with periodic commits allow autovacuum to work and keep your replication lag manageable.
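The pacing logic reduces to something like this plain-Ruby sketch (no Rails required; batch_size and delay are illustrative knobs, and the real per-batch work would be an insert rather than a count):

```ruby
# Process rows in fixed-size batches with an optional pause between
# batches -- the same shape as find_each plus a throttled job.
def backfill_in_batches(rows, batch_size: 1_000, delay: 0)
  copied = 0
  rows.each_slice(batch_size) do |batch|
    # In the real migration each batch becomes one bulk insert
    # committed on its own; here we only count the rows.
    copied += batch.size
    sleep(delay) if delay > 0 # throttle so autovacuum can keep up
  end
  copied
end
```

Because each batch commits independently, WAL generation stays bounded and replicas never fall far behind.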
Phase 4 — Verify and switch reads. This is where correctness is confirmed. Run both old and new queries in parallel, compare results, and only cut over when they match.
-- Phase 4 verification: compare old and new before cutover.
-- Run both queries and confirm identical results.
-- Old path (polymorphic):
SELECT body, created_at FROM comments
WHERE commentable_type = 'Post' AND commentable_id = 42
ORDER BY created_at DESC;
-- New path (explicit per-type table, as created in Phase 1):
SELECT body, created_at FROM post_comments
WHERE post_id = 42
ORDER BY created_at DESC;
-- Automated verification for the full table:
SELECT count(*) AS old_count FROM comments
WHERE commentable_type = 'Post';
SELECT count(*) AS new_count FROM post_comments;
-- These counts MUST match after backfill completes.
-- If they differ, the backfill missed rows or the
-- dual-write has a race condition. Do not proceed. Phase 5 — Retire the old structure. Stop dual-writing. Drop the old columns or table. This phase can wait weeks — there is no cost to leaving the old structure in place temporarily, and the safety margin is worth it. I have seen teams drop the old table the same day as the cutover and regret it within hours. There is no prize for speed here.
If you are keeping polymorphic associations, index them properly
Not every polymorphic association needs to be migrated. If the scale is modest and the pattern works for your access patterns, the right move is better indexes rather than a schema migration. A well-indexed polymorphic table can perform within 10-20% of a properly normalized alternative for type-specific queries. The gap widens for cross-type queries, but if your application does not run those, the gap does not matter.
The default composite index on (commentable_type, commentable_id) is a start, but it serves only one query shape: lookup by type and ID. For other patterns — recent comments of a specific type, all comments for a user filtered by type — partial indexes are dramatically more effective.
-- Partial index filtered by type:
CREATE INDEX idx_comments_post_created
ON comments (commentable_id, created_at)
WHERE commentable_type = 'Post';
CREATE INDEX idx_comments_photo_created
ON comments (commentable_id, created_at)
WHERE commentable_type = 'Photo';
-- Now the same query, filtered by type:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM comments
WHERE commentable_type = 'Post'
AND commentable_id = 42
ORDER BY created_at DESC;
Index Scan Backward using idx_comments_post_created on comments
(cost=0.42..4.12 rows=12 width=97)
(actual time=0.009..0.014 rows=12 loops=1)
Index Cond: (commentable_id = 42)
Buffers: shared hit=3
-- Smaller index (only Post rows), and created_at is included
-- for sort elimination. 3 buffer hits instead of 4.
Partial indexes filtered by commentable_type are smaller, faster to scan, and cheaper to maintain than the full composite index. Each partial index contains only the rows for its type — a natural fit for the polymorphic pattern where queries almost always filter by type.
Three specific improvements in the partial index plan:
- No type comparison at query time. The WHERE commentable_type = 'Post' predicate matches the index's filter clause. PostgreSQL does not evaluate it row-by-row — the index already contains only Post rows.
- Sort elimination. The partial index includes created_at as the second column. Queries with ORDER BY created_at DESC use an index scan backward instead of a separate sort step.
- Smaller index footprint. Each partial index is a fraction of the size of the full composite. On a table with 1M rows (600K Post, 300K Photo, 100K Video), the Post partial index is 60% the size of the full composite. Fewer pages, fewer buffer hits, faster maintenance.
This is precisely the kind of optimization Gold Lapel creates automatically. The proxy observes your query traffic — sees repeated queries like WHERE commentable_type = 'Post' AND commentable_id = $1 — and creates partial indexes filtered by each type value it encounters. It also detects the multi-query eager-loading pattern that polymorphic associations force: the rapid succession of SELECT * FROM posts WHERE id IN (...) followed by SELECT * FROM photos WHERE id IN (...), each of which benefits from its own targeted index.
For polymorphic tables with expression-based queries — lower(commentable_type), JSONB extractions on metadata columns, date truncations on created_at — the proxy creates expression indexes as well. The type column becomes a partitioning dimension for index strategy, not just a filter value.
Whether you migrate to join tables or keep the polymorphic pattern, the query traffic reveals which indexes actually matter. Schema decisions and index decisions are separate concerns — and the indexes should reflect what your application actually does, not what the schema suggests it might do.
A brief word about the alternatives that are not alternatives
In the course of researching polymorphic association alternatives, you will encounter several suggestions that I feel obligated to address before you invest time in them.
JSONB with a type field. Storing the associated record as a JSONB document with a type key does not solve the foreign key problem — JSONB values cannot participate in FK constraints. It trades the string column for a JSONB key lookup (slower) and loses the composite index entirely. This is a lateral move at best and a regression at worst.
Multiple nullable FK columns. Adding post_id, photo_id, and video_id columns to the comments table, with a CHECK constraint ensuring exactly one is non-null. This works for small numbers of types but creates increasingly sparse rows as types multiply. A table with 12 nullable FK columns — 11 of which are always NULL — is paying a storage and maintenance cost for columns that contribute nothing. It also forces every query to check which column is populated, replacing commentable_type with COALESCE(post_id, photo_id, video_id, ...).
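For completeness, the nullable-FK variant looks roughly like this (column names assumed; num_nonnulls is built into PostgreSQL 9.6 and later):

```sql
-- Exactly one parent column may be non-null per row.
ALTER TABLE comments
    ADD COLUMN post_id  bigint REFERENCES posts(id),
    ADD COLUMN photo_id bigint REFERENCES photos(id),
    ADD COLUMN video_id bigint REFERENCES videos(id),
    ADD CONSTRAINT comments_one_parent
        CHECK (num_nonnulls(post_id, photo_id, video_id) = 1);
-- Every new type means another column, another index, and a wider
-- CHECK constraint: the sparseness cost described above.
```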
Materialized views. Creating a materialized view that pre-joins polymorphic associations is tempting but inherently stale — the view must be refreshed, the refresh locks the table (unless you use CONCURRENTLY, which requires a unique index), and the data is always as old as the last refresh. For real-time comment displays, this is insufficient. For nightly reports, it may be acceptable.
None of these are terrible ideas. They are simply not better than the patterns described above. The join table pattern gives you FK constraints, JOINs, and global queries without creating sparse rows, JSONB lookups, or stale materialized views. Prefer it.
The uncomfortable question: should you have used polymorphic associations at all?
I am asked this frequently, and my answer is more nuanced than you might expect from the preceding 3,000 words of criticism.
Yes. Probably. At the time, it was likely the right decision.
The polymorphic pattern exists because it solves a real problem — shared behavior across multiple parent types — with minimal schema complexity. One table, two columns, done. The alternatives all require more tables, more joins, more code, more migration steps. When you are building a product and validating an idea, that simplicity has genuine value. The team that ships a working comment system in an afternoon using polymorphic: true is further ahead than the team that spends a week designing a join table architecture for a product that may not survive its first month.
The mistake is not using polymorphic associations. The mistake is not recognizing when they have outgrown their welcome. The pattern that served you well at 10,000 rows becomes a burden at 1 million and a crisis at 10 million. The escape routes exist because growth is the normal case. A successful product outgrows its initial schema. That is not a failure of planning — it is a sign that the product worked.
The best database design is the one that matches your current scale and query patterns, with awareness of where the ceiling is. If polymorphic associations are serving you well today and your table is growing slowly, keep them. Add partial indexes. Use extended statistics. Enjoy the simplicity. But know where the escape routes are, and when you see the signs — orphaned rows accumulating, cross-type queries degrading, eager loading adding round trips — do not wait for a crisis to plan the migration.
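The extended statistics mentioned above are a one-statement addition (available since PostgreSQL 10; the statistics object name is illustrative):

```sql
-- Teach the planner that commentable_type and commentable_id are
-- correlated, improving row estimates for the combined filter.
CREATE STATISTICS comments_type_id_stats (dependencies, ndistinct)
    ON commentable_type, commentable_id FROM comments;
ANALYZE comments;  -- the statistics are populated on the next ANALYZE
```

This directly addresses the cardinality misestimates described earlier: without it, PostgreSQL multiplies the selectivities of the two columns as if they were independent.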
A well-run household anticipates its needs. It does not wait for the pipes to burst before calling the plumber.
While the matter is fresh in your mind, I have written at some length on the broader question of ORM-generated SQL versus hand-written queries — the performance characteristics of each, and why polymorphic eager loading is one of the patterns where the gap is widest. The investigation may inform your next schema decision.