Chapter 9: Autocomplete (pg_trgm)

The Waiter of Gold Lapel · Published Apr 12, 2026 · 13 min

I should like to reintroduce you to an old friend.

In Chapter 5, pg_trgm handled the imprecision of the fingers — the typo, the misspelling, the truncation. It forgave what the user mistyped and found what they meant. It did this well, and I trust you remember it fondly.

pg_trgm has another talent. It handles autocomplete — the search feature the user interacts with before they have even finished typing. The user enters “Post” into a search box, and before the thought is complete, suggestions appear: “PostgreSQL,” “PostGIS,” “Postgres Connection Pooling.” The dropdown is there. The user selects. The search is done. If the autocomplete is good, the user never presses Enter at all.

This is the feature users notice most — not when it is present, but when it is absent. A search box without autocomplete feels unfinished. A search box with slow autocomplete feels broken. Teams reach for Elasticsearch’s completion suggester specifically for this, and it is often the first Elasticsearch feature adopted — the feature that creates the initial dependency. The search bar felt slow, someone added Elasticsearch for autocomplete, and six months later the team is running a JVM cluster for what started as a typeahead dropdown.

PostgreSQL handles it with the extension you already have installed.

The Basic Pattern

How to implement autocomplete in PostgreSQL using pg_trgm:

SQL

SELECT name, similarity(name, 'Post') AS _score
FROM products
WHERE name ILIKE 'Post%'
ORDER BY _score DESC, name
LIMIT 10;

Each part serves a specific purpose, and I find it worth walking through them because the simplicity is the point:

ILIKE 'Post%' — case-insensitive prefix matching. Finds all products whose name starts with “Post,” regardless of capitalization. The trailing % is the wildcard — “Post” followed by anything.

similarity(name, 'Post') — ranks results by how closely the full name matches the typed prefix. “PostgreSQL” ranks higher than “Postgres Connection Pooling Guide” because the prefix “Post” constitutes a larger portion of the shorter string. The ranking captures the intuition that a closer match deserves a higher position.

ORDER BY _score DESC, name — rank by similarity, break ties alphabetically. The alphabetical tiebreaker gives deterministic ordering when multiple results score similarly — a small detail that prevents the dropdown from shuffling between page loads.

LIMIT 10 — autocomplete typically shows 5–10 suggestions. More than 10 overwhelms the dropdown without helping the user find what they want faster. Restraint, in this case, is a feature.

Gold Lapel wraps this pattern: gl.suggest("products", "name", "Post", limit=10). One method call. The proxy creates the index. The user gets suggestions before they finish typing.

The GIN Trigram Index: Making It Fast

What is the fastest index for autocomplete in PostgreSQL? A GIN index with gin_trgm_ops from pg_trgm.

SQL

CREATE INDEX idx_trgm_name ON products USING GIN(name gin_trgm_ops);

Without this index, ILIKE 'Post%' might use a btree index for simple prefix matching, but ILIKE '%search%' — infix matching, where the term appears in the middle of the string — is always a sequential scan. With the GIN trigram index, both prefix and infix patterns use the index.

On a table with millions of rows, this is the difference between an autocomplete that feels sluggish and one that feels instant. GIN trigram indexes handle sub-millisecond autocomplete even on very large tables. At that speed, the suggestions appear before the user has consciously registered that they are waiting. That is the standard autocomplete should meet.

If you created this index in Chapter 5 for fuzzy matching, autocomplete is already fast. Same index, serving a second purpose. I appreciate an index that earns its keep twice.

All PostgreSQL Autocomplete Approaches

Several approaches to autocomplete exist in PostgreSQL. Most tutorials cover only one. I think you deserve to see them all:

Approach	How It Works	Index Type	Handles Infix?	Handles Typos?	Best For
`ILIKE` + pg_trgm	Prefix matching, similarity ranking	GIN trigram	Yes (`%term%`)	No (exact prefix)	Most applications (recommended)
`text_pattern_ops` btree	B-tree index for pattern matching	btree	No (prefix only)	No	Simple prefix-only, no pg_trgm
tsvector `:*` prefix	`to_tsquery('post:*')` matches stemmed lexemes	GIN tsvector	No	No (but stems)	When tsvector column already exists
pg_trgm `similarity()`	Pure similarity ranking, no prefix requirement	GIN trigram	Yes	Yes	Fuzzy autocomplete (see below)

My recommendation: ILIKE + pg_trgm GIN index for most applications. It handles both prefix and infix patterns, it is fast, and it composes with similarity() for ranking. The other approaches have their uses — text_pattern_ops when pg_trgm is not available, tsvector :* when you want stemmed completion — but the pg_trgm approach covers the most ground with the least configuration.

A note on the tsvector :* approach: to_tsquery('english', 'post:*') matches stemmed lexemes, not raw text. This means it finds “postal,” “postfix,” and “posting” — but through stemming, not prefix matching. For user-facing typeahead, where the user expects to see exactly what they are typing reflected in the suggestions, raw text matching with ILIKE is typically more intuitive. The user types “post” and expects to see results starting with “post,” not results containing words that stem to “post.”

Building Responsive Typeahead

The database query is half the problem. The other half is the frontend behavior. Autocomplete is a UX feature as much as a database feature, and the UX details determine whether the user perceives the search as fast or slow — regardless of what the query planner achieved.

Debouncing. Do not fire a query on every keystroke. Wait 150–300ms after the user stops typing. This reduces database load and prevents a problem that is more disorienting than slowness: results appearing in the wrong order because a fast query for “Po” returns after a slower query for “Pos,” and the dropdown flickers between two different suggestion lists. Debouncing eliminates this entirely.

Minimum prefix length. Do not query until the user has typed at least 2–3 characters. Single-character queries return too many results and are expensive to compute. The user has not expressed enough intent after one character to receive useful suggestions, and the database should not be asked to guess from a single letter.

Result limit. 5–10 suggestions. More than 10 overwhelms the dropdown without helping the user find what they want faster. I have observed that 7 is a number many applications settle on, and I find it a reasonable choice.

Caching. Popular prefixes — “app,” “post,” “data” — will be typed by many users. Cache results briefly, 1–5 minutes, to avoid repeated identical queries. Gold Lapel’s L1/L2 cache handles this automatically. The first user to type “post” pays the query cost. The next hundred receive cached results.

These are frontend concerns, not database concerns. But autocomplete that is fast at the database layer and slow at the UX layer still feels slow to the user. The database’s job is to respond in under a millisecond. The frontend’s job is to ask at the right moment and display the results without flicker. Both must be right.

Ranking Strategies

The basic pattern ranks by similarity(). This works well for most cases. But the ranking can be refined, and the refinement is worth considering because autocomplete is the feature where ranking is most visible — the user sees the top 5–10 results and nothing else.

Pure similarity. ORDER BY similarity(name, $1) DESC — rank by how closely the name matches the typed prefix. Simple, effective, the sensible default.

Popularity-weighted. Combine similarity with a popularity signal:

SQL

ORDER BY similarity(name, $1) * LOG(view_count + 1) DESC

“PostgreSQL” (viewed 50,000 times) ranks above “PostgreSQL Conference 2019 Poster Contest” (viewed 12 times) even if similarity scores are comparable. Use when your data has natural popularity signals — view counts, purchase counts, click-through rates. The logarithm prevents extremely popular items from dominating regardless of relevance.

Recency-weighted. Boost recent results. Useful for news sites, content feeds, recently added products — any context where newer is likely more relevant.

Category-aware. Group results by category in the dropdown: “PostgreSQL” (Database), “PostCSS” (CSS Tool), “Postman” (API Tool). This requires a JOIN or a denormalized category column, but it provides the user with context that helps them select the right suggestion faster.

Start with pure similarity. Add popularity weighting when your data has it. The database query stays fast regardless of ranking strategy — the difference is in the ORDER BY clause, not the index scan.

Beyond Simple Prefix: Infix and Fuzzy Autocomplete

Infix matching. WHERE name ILIKE '%search%' — finds “Elasticsearch” when the user types “search.” The GIN trigram index handles this efficiently. Useful when the user might type a word that appears in the middle of a product name, not just at the beginning. The same index, the same extension, a different wildcard position.

Fuzzy autocomplete — and here is where PostgreSQL’s composability earns its keep.

Elasticsearch’s completion suggester requires exact prefix matches on a specially configured completion field. If the user types “Postges” — a typo — the suggester returns nothing. Typo tolerance in autocomplete requires a separate fuzzy query alongside the completion suggester, with separate configuration.

PostgreSQL composes naturally. If prefix matching returns no results, fall back to similarity() from Chapter 5:

SQL

-- Step 1: Try prefix matching
SELECT name, similarity(name, 'Postges') AS _score
FROM products
WHERE name ILIKE 'Postges%'
ORDER BY _score DESC LIMIT 10;
-- Returns 0 rows (no product starts with "Postges")

SQL

-- Step 2: Fall back to fuzzy matching
SELECT name, similarity(name, 'Postges') AS _score
FROM products
WHERE similarity(name, 'Postges') > 0.3
ORDER BY _score DESC LIMIT 10;
-- Returns "PostgreSQL" (high trigram overlap despite the typo)

In application code: try prefix first. If results are empty or below a count threshold, run the fuzzy fallback. The user types “Postges,” sees no prefix matches, and immediately gets “PostgreSQL” as a suggestion. The typo is handled gracefully. The user does not notice the two-step process. They notice only that the search understood what they meant.

This two-step pattern — prefix matching with fuzzy fallback — is more resilient than Elasticsearch’s completion suggester, which requires separate configuration for each behavior. PostgreSQL handles both with one extension and two queries. Composability over configuration. I find that a principle worth remembering.

Elasticsearch Completion Suggester vs. PostgreSQL Autocomplete

Aspect	ES Completion Suggester	PostgreSQL pg_trgm
Prefix matching	Yes (FST-based)	Yes (ILIKE + GIN trigram)
Infix matching	No (prefix only)	Yes (`ILIKE '%term%'`)
Typo tolerance	No (requires separate fuzzy query)	Yes (similarity fallback)
Ranking	Weight field, category contexts	`similarity()`, custom ORDER BY
Index type	Dedicated completion field	GIN trigram (shared with fuzzy search)
Configuration	Dedicated mapping + analyzer	`CREATE EXTENSION pg_trgm` + one index
Infrastructure	Elasticsearch cluster	Same database

Elasticsearch’s completion suggester is purpose-built for speed — it uses a finite state transducer optimized for prefix lookups, and it is genuinely fast. I acknowledge that engineering with respect. PostgreSQL’s approach is more flexible: it handles infix matching, provides typo tolerance through fuzzy fallback, and shares infrastructure with the rest of the search stack. Both produce fast autocomplete. PostgreSQL requires less configuration and handles more of the edge cases that users actually encounter — because users, being human, do not always type prefixes correctly.

Honest Boundary

Autocomplete quality depends on what you are completing against. A table of 10,000 product names works beautifully — finite, structured, distinct values. A table of 10 million free-text descriptions is less useful for autocomplete — too many candidates, too little structure for the dropdown to be helpful. Autocomplete works best against structured, finite lists: product names, category names, user names, tag names.

For very high-throughput autocomplete (hundreds of requests per second), the GIN trigram index plus caching handles it well. For thousands of requests per second sustained, consider read replicas or a dedicated autocomplete cache. Gold Lapel’s L1 cache mitigates this for common prefixes, but you should plan for your traffic patterns.

The five pillars are in place.

Lexical search found the right words (Chapter 4). Fuzzy matching forgave the wrong spelling (Chapter 5). Phonetic search matched the right sounds (Chapter 6). Semantic search understood the right meaning (Chapters 7–8). And autocomplete — the feature the user encounters first, before they have finished typing, before they have committed to a search at all — is handled by a familiar extension applied to a problem that feels, in PostgreSQL, surprisingly natural to solve (Chapter 9).

Five pillars. Built-in capabilities and three extensions. No separate service. No sync pipeline. No cluster to manage at three in the morning. The search stack, if I may say, is sound.

But if you have used Elasticsearch, you know that search is only part of what Elasticsearch provides. Aggregations — the faceted search results that help users navigate large result sets. The percolator — reverse search, where stored queries are matched against incoming documents. Custom analyzers. Relevance debugging. These are the features that keep teams on Elasticsearch even after they have considered leaving. These are the features that make migration feel impossible.

Part III addresses each of them. And you may find, as I hope you have come to expect, that the parity is more complete than you assumed.

If you will follow me, I should like to open a door that most Elasticsearch users do not expect to find in PostgreSQL. The view on the other side is, I think, worth the walk.