← PostgreSQL Concepts

Partial index

An index that catalogues only the rows worth cataloguing. If you'll allow me — the rest do not deserve the shelf space.

Concept · March 21, 2026 · 8 min read

A partial index is an index built on a subset of a table's rows, defined by a WHERE clause in the CREATE INDEX statement. Only rows that satisfy the condition are included in the index. If your queries always filter on the same predicate — WHERE status = 'active', WHERE deleted_at IS NULL, WHERE created_at > some_date — a partial index gives you the same lookup speed as a full index at a fraction of the size, with less write overhead and less bloat over time. It is, if I may say so, indexing with discrimination. The good kind.

What a partial index is

A standard index includes every row in the table — the active, the archived, the forgotten. A partial index is more selective. It includes only the rows that match a predicate you specify at creation time, ignoring everything else entirely. The syntax adds a WHERE clause to CREATE INDEX:

SQL
-- Standard index: indexes every row in the table
CREATE INDEX idx_orders_status ON orders (status);

-- Partial index: only indexes rows where status = 'active'
CREATE INDEX idx_orders_active ON orders (status)
  WHERE status = 'active';

The first index stores an entry for every row in orders, regardless of status. The second stores entries only for rows where status = 'active'. If the orders table has ten million rows but only 500,000 are active, the partial index is roughly 5% the size of the full index. One does not catalogue the entire archive when the guest only ever asks about the active inventory.

The indexed column and the predicate column do not have to be the same. You can create a partial index on customer_id with a predicate on status — the predicate controls which rows enter the index, while the indexed columns control how they are organized within it.

Why partial indexes matter

Every index comes with costs: disk space, buffer cache pressure, and write amplification. A full index on a large table pays these costs for every row, including rows that no query will ever look up. This is the indexing equivalent of polishing silver that never leaves the cabinet. Partial indexes pay only for the rows that matter.

  • Smaller index — fewer rows means fewer index pages. A smaller index fits more easily in the buffer cache, which means more cache hits and fewer disk reads during scans.
  • Faster scans — fewer pages to traverse during an index scan. For range scans across a partial index, PostgreSQL reads fewer leaf pages than it would on a full index covering the same columns.
  • Less write amplification — when a row is inserted or updated, PostgreSQL only writes to the partial index if the row matches the predicate. Rows that do not match are invisible to the index, so no index maintenance is performed.
  • Less bloat — fewer index entries means fewer dead index entries after updates and deletes. The index stays compact longer between REINDEX operations.
SQL
-- Compare the sizes
SELECT
  indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE tablename = 'orders'
  AND indexname IN ('idx_orders_status', 'idx_orders_active');

-- If 5% of rows are active, the partial index is ~5% the size of the full index.

Common use cases

Partial indexes earn their keep when the data has a clear, stable partition between rows that queries care about and rows that they do not. The pattern is always the same: a large, quiet majority and a small, active minority that receives all the attention.

Active or pending status flags

Most tables with a status column have a skewed distribution — a large archive of completed, cancelled, or expired rows and a smaller set of active or pending rows that the application queries constantly. A partial index on the active subset keeps the index small and focused. The archived rows are not consulted, not maintained, and not missed.

Soft deletes

Applications that use soft deletes — setting deleted_at instead of actually deleting — accumulate a growing pile of "deleted" rows that no query ever returns. They linger in the index like guests who have checked out but left their luggage. A partial index on WHERE deleted_at IS NULL excludes all soft-deleted rows entirely.

SQL
-- Index only non-deleted rows
CREATE INDEX idx_users_email_live ON users (email)
  WHERE deleted_at IS NULL;

-- This query uses the partial index
SELECT id, email, name
FROM users
WHERE email = 'alice@example.com'
  AND deleted_at IS NULL;

Recent rows

Time-series workloads where queries focus on recent data can use a partial index to cover only the active window. This is most effective when the window is relatively stable and the table is not partitioned.

SQL
-- Index only recent orders (useful for time-series workloads)
CREATE INDEX idx_orders_recent ON orders (created_at)
  WHERE created_at > '2026-01-01';

-- Note: this predicate is static. As time passes, you'll need to
-- recreate the index with an updated date, or use a different
-- strategy like partitioning for rolling windows.

Nullable columns

When a column is NULL for most rows and queries only look for non-NULL values, a partial index with WHERE col IS NOT NULL avoids indexing the NULL majority entirely. There is no reason to maintain an index entry for the absence of information.

SQL
-- Index only rows where the column has a value
CREATE INDEX idx_contacts_phone ON contacts (phone)
  WHERE phone IS NOT NULL;

-- If 90% of rows have NULL phone numbers, this index is 10% the size
-- of a full index — and the only queries that need it are the ones
-- looking for non-NULL values anyway.

Creating partial indexes

The WHERE clause in a partial index supports any immutable expression: comparisons, IS NULL, IS NOT NULL, boolean columns, and immutable function calls. Subqueries and volatile functions are not allowed.

Partial indexes combine naturally with composite indexes. You can index multiple columns while restricting which rows are included:

SQL
-- Combine a composite index with a partial predicate
CREATE INDEX idx_orders_customer_active ON orders (customer_id, created_at)
  WHERE status = 'active';

-- Supports queries like:
-- SELECT * FROM orders
-- WHERE customer_id = 42
--   AND status = 'active'
-- ORDER BY created_at DESC;

You can also create partial unique indexes — enforcing uniqueness only among rows that match the predicate:

SQL
-- Enforce uniqueness only on active rows
CREATE UNIQUE INDEX idx_unique_active_email ON users (email)
  WHERE status = 'active';

-- Two inactive users can share the same email address.
-- Only one active user can have a given email.

This is an elegant pattern for constraints that apply conditionally. The database enforces the rule without application logic — no triggers, no validation code, no room for error. The constraint is the index, and the index is the constraint. I find that rather satisfying.

Query matching

A partial index is only useful if the query planner recognizes it. This deserves emphasis, because it is where partial indexes most often disappoint. The planner compares the query's conditions against the index's WHERE clause and uses the index only when it can prove that every row the query might return is guaranteed to be in the index. The planner must see the match — it will not infer it, guess at it, or take your word for it.

SQL
-- The planner uses the partial index when the query's WHERE clause
-- matches or implies the index predicate.

-- Uses idx_orders_active (exact match)
SELECT * FROM orders WHERE status = 'active' AND total > 100;

-- Uses idx_orders_active (status = 'active' is implied)
SELECT * FROM orders WHERE status = 'active' AND customer_id = 42;

-- Does NOT use idx_orders_active (different predicate)
SELECT * FROM orders WHERE status = 'pending';

-- Does NOT use idx_orders_active (no status filter at all)
SELECT * FROM orders WHERE total > 100;

The matching logic is, to put it diplomatically, conservative. The planner handles straightforward cases — exact equality, AND combinations, IS NULL / IS NOT NULL — but does not perform complex logical inference. If the predicate is WHERE status = 'active', a query with WHERE status IN ('active') will match, but a query with WHERE status != 'inactive' may not, even if the result set is the same. The planner is not wrong to be cautious. But it does require that you meet it halfway.

When in doubt, verify with EXPLAIN ANALYZE:

SQL
-- Verify the planner is using your partial index
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE status = 'active' AND customer_id = 42;

-- Look for: Index Scan using idx_orders_active on orders
-- If you see Seq Scan instead, the planner didn't match the predicate.

If the planner is not using your partial index, check that the query includes the predicate condition explicitly. Adding the condition to the query — even if it feels redundant — is often enough to enable the match. A small concession to the planner's literalism, but a worthwhile one.

How Gold Lapel relates

Gold Lapel sits between your application and PostgreSQL, observing every query in real time. When we see the same predicate appear consistently — the same WHERE status = 'active' clause on thousands of queries per minute — we recognize the pattern as a candidate for a partial index. Your queries have been telling you what they need; we simply listen on their behalf.

We factor in the data distribution before making a recommendation. If the predicate matches a small fraction of the table, a partial index saves meaningful space and write overhead compared to a full index. If the predicate matches most of the table, we recognize that a partial index offers little advantage and recommend a full index instead. A partial index on 95% of the table is just a full index with pretensions.

The recommendation includes the indexed columns, the predicate, and an estimate of the size reduction compared to a full index on the same columns — giving you what you need to decide whether the partial index is worth creating.

Frequently asked questions