pg_trgm

Q: Can pg_trgm replace full-text search?

No, and I would not suggest otherwise. They solve different problems. pg_trgm excels at fuzzy matching, typo tolerance, and substring search (LIKE with leading wildcards). Full-text search (tsvector/tsquery) is better for natural language queries with stemming, ranking, and boolean operators. The wisest approach is to use both: full-text search for primary results and pg_trgm for "did you mean?" suggestions.

Q: How much space does a trigram GIN index use?

It is not small, and this is worth knowing upfront. A trigram GIN index typically runs 2-5x the size of the indexed column data. A text column averaging 50 characters across 1 million rows holds roughly 50 MB of text, so the GIN index might land somewhere between 100 and 250 MB. Whether that trade-off is worth it depends on the alternative: sequential scans on every search query. In most cases, the disk space is a bargain.
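To see what the index costs on your own data, you can ask PostgreSQL directly. This is a minimal sketch, assuming the products table and the idx_products_name_trgm index used in the indexing examples in this article; substitute your own names.

```sql
-- Compare the on-disk size of the table with its trigram index.
-- products and idx_products_name_trgm are assumed names.
SELECT
  pg_size_pretty(pg_table_size('products'))                  AS table_size,
  pg_size_pretty(pg_relation_size('idx_products_name_trgm')) AS trgm_index_size;
```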

Q: What languages does pg_trgm support?

All of them, after a fashion. pg_trgm is language-agnostic — it works on raw character trigrams, not linguistic units. This means it handles English, German, Japanese (when stored in UTF-8), and any other language with equal competence. The limitation is that it does not understand word boundaries or morphology the way full-text search does. It sees characters, not meaning.

Q: Does pg_trgm work with ILIKE?

Yes, and this is perhaps the single best reason to install the extension. A GIN index with gin_trgm_ops accelerates both LIKE and ILIKE queries, including patterns with leading wildcards. It turns what would be an expensive sequential scan into a proper index scan for case-insensitive substring searches.

Q: Should I use GIN or GiST for trigram indexes?

GIN, unless you have a specific reason not to. It is faster for lookups, and both index types accelerate LIKE/ILIKE. Reach for GiST when you need the distance operator (<->) for ORDER BY similarity ranking, or when write performance takes precedence over read performance. If you are unsure, GIN is the correct choice.

Trigram-based similarity matching and fast LIKE/ILIKE searches — because a sequential scan on every keystroke is no way to run a search box.

The Waiter of Gold Lapel · Updated Mar 30, 2026 · Published Mar 21, 2026 · 7 min read

If you have a search box and a growing table, you will meet this extension eventually. pg_trgm breaks text into three-character sequences (trigrams) and uses them for similarity comparison and index-accelerated substring search. It turns expensive LIKE '%term%' queries — which normally force a sequential scan — into fast index lookups.

What pg_trgm does

A trigram is a group of three consecutive characters. Before splitting, pg_trgm lowercases the string and pads it with two spaces in front and one behind, so the string "cat" produces the trigrams "  c", " ca", "cat", "at ". pg_trgm uses these trigrams to measure how similar two strings are: the more trigrams they share, the higher the similarity score (0 to 1).
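You can inspect the trigrams yourself with show_trgm(), which ships with the extension. A short sketch, assuming CREATE EXTENSION pg_trgm has already been run in the current database:

```sql
-- List the trigrams pg_trgm extracts from a string.
-- The input is lowercased and padded with spaces before it is split.
SELECT show_trgm('cat');

-- Case does not change the result: both calls yield the same set.
SELECT show_trgm('Cat') = show_trgm('cat') AS same_trigrams;
```

This is mostly a debugging aid, but it is a quick way to understand why two strings score the way they do.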

This is useful for two things: fuzzy matching (finding strings that are similar to a search term, even with typos) and substring search (finding strings that contain a term). The fuzzy matching works through the similarity() function and the % operator. The substring search works by creating GIN or GiST indexes with trigram operator classes. Between the two, you have a remarkably capable search toolkit that never leaves PostgreSQL.

When to use pg_trgm

The most common use cases:

Search boxes that need %LIKE% — product search, user search, any text search with a leading wildcard. Without pg_trgm, these force a sequential scan on every query.
Typo-tolerant search — "posgres" should match "PostgreSQL". The similarity() function handles this gracefully.
"Did you mean?" suggestions — rank candidates by similarity score to suggest corrections for misspelled input.
Deduplication — find near-duplicate records by comparing text fields with a similarity threshold.
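The deduplication case can be sketched as a self-join. The customers table and its id and full_name columns are hypothetical names chosen for illustration, and the threshold in effect is whatever pg_trgm.similarity_threshold is set to.

```sql
-- Hypothetical table: customers(id, full_name).
-- Pair each row with later rows whose names clear the
-- similarity threshold (pg_trgm.similarity_threshold, default 0.3).
SELECT a.id        AS id_a,
       b.id        AS id_b,
       a.full_name AS name_a,
       b.full_name AS name_b,
       similarity(a.full_name, b.full_name) AS sim
FROM customers a
JOIN customers b
  ON a.id < b.id                 -- report each pair once, skip self-matches
 AND a.full_name % b.full_name   -- trigram match; can use a trigram index
ORDER BY sim DESC;
```

On a large table this is still a heavy query, so it tends to suit a batch job rather than a request path.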

If your search needs are more linguistic — stemming, ranking by relevance, boolean queries — you want PostgreSQL's built-in full-text search (tsvector/tsquery). Many applications use both: full-text search for primary results, pg_trgm for fuzzy fallback. For a thorough comparison of what this combination can handle versus a dedicated search engine, see PostgreSQL vs Elasticsearch.
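One way to wire the two together for "did you mean?" suggestions is to build a word list from the documents you already index for full-text search, then match misspelled input against it with trigrams. The sketch below follows the pattern shown in the PostgreSQL documentation; the documents table, its bodytext column, and the words table are assumed names.

```sql
-- Hypothetical source table: documents(bodytext).
-- Collect the distinct words that appear in the documents.
-- The 'simple' configuration avoids stemming, so suggestions are real words.
CREATE TABLE words AS
SELECT word
FROM ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');

-- Trigram index so the % lookup below does not scan every word.
CREATE INDEX words_trgm_idx ON words USING gin (word gin_trgm_ops);

-- Suggest corrections for a misspelled search term.
SELECT word, similarity(word, 'posgres') AS sim
FROM words
WHERE word % 'posgres'
ORDER BY sim DESC, word
LIMIT 5;
```

The words table is a snapshot, so it needs to be rebuilt periodically if the document collection changes.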

Installation and setup

pg_trgm is a contrib module included with PostgreSQL — no external packages or shared library preloading required. The official PostgreSQL documentation covers every function, operator, and index operator class. One statement and it is at your service.

SQL

-- No shared_preload_libraries needed — just create the extension
CREATE EXTENSION pg_trgm;
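In migration scripts it is often convenient to make the statement idempotent and then confirm the result. A small sketch; keep in mind that extensions are installed per database, so this has to run in each database that needs it.

```sql
-- Safe to re-run: does nothing if the extension already exists.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Confirm it is installed in the current database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_trgm';
```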

Similarity matching

The similarity() function returns a score between 0 (no match) and 1 (identical).

SQL

-- Basic similarity comparison (trigram extraction ignores case)
SELECT similarity('PostgreSQL', 'Postgresql');
-- Returns: 1

-- Find similar product names
SELECT name, similarity(name, 'Posgres') AS sim
FROM products
WHERE similarity(name, 'Posgres') > 0.3
ORDER BY sim DESC;

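The % operator is the index-friendly form of the same test: it returns true when the similarity of its two arguments meets the current threshold (default: 0.3). Unlike a similarity() call in the WHERE clause, it can be served by a trigram index.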
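As an illustration, the product search from the previous example can be rewritten with the operator. This assumes the same products table; the index it benefits from is created later in this article.

```sql
-- Same search as the similarity() version, written with the %
-- operator so a trigram index on products.name (if one exists) can be used.
SELECT name, similarity(name, 'Posgres') AS sim
FROM products
WHERE name % 'Posgres'
ORDER BY sim DESC;
```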

Word similarity

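PostgreSQL 9.6 added word_similarity(), which scores the query against the best-matching continuous run of trigrams inside the target rather than against the whole string. This is more useful for autocomplete-style matching, where the search term is only part of a longer value. PostgreSQL 11 added a stricter variant, strict_word_similarity(), which forces the matched run to align with word boundaries.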

SQL

-- word_similarity: scores the query against the best-matching
-- run of trigrams inside the target
SELECT word_similarity('SQL', 'PostgreSQL');
-- Returns: 0.5 (the query is padded as a whole word, so only
-- "sql" and "ql " of its four trigrams appear in the target)

SELECT word_similarity('post', 'PostgreSQL');
-- Returns: 0.8 (four of the five query trigrams match)

-- The <% and %> operators use word_similarity
-- (<<% and %>> use strict_word_similarity instead)
SELECT name FROM products WHERE name %> 'sql';
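For an autocomplete box, the operator and the function are usually combined: the operator filters, the function ranks. A minimal sketch against the same products table; the value 0.5 is an arbitrary illustration, not a recommendation.

```sql
-- Word-similarity operators have their own threshold
-- (default 0.6), separate from pg_trgm.similarity_threshold.
SET pg_trgm.word_similarity_threshold = 0.5;

-- Rank products by how well some part of the name
-- matches the partial input 'post'.
SELECT name, word_similarity('post', name) AS ws
FROM products
WHERE 'post' <% name
ORDER BY ws DESC
LIMIT 10;
```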

Indexing LIKE and ILIKE

This is the most impactful feature of pg_trgm — and, candidly, the reason most people install it. A standard B-tree index cannot accelerate LIKE '%term%' because the leading wildcard prevents index range scans. A trigram GIN index can.

SQL

-- Create a GIN index for trigram-accelerated LIKE/ILIKE
CREATE INDEX idx_products_name_trgm ON products
  USING gin (name gin_trgm_ops);

-- These queries now use the index:
SELECT * FROM products WHERE name LIKE '%widget%';
SELECT * FROM products WHERE name ILIKE '%PostgreSQL%';

-- Without pg_trgm, LIKE with a leading wildcard forces a sequential scan.
-- With the trigram GIN index, PostgreSQL can use the index for any
-- substring match — leading, trailing, or middle.

One index. Leading wildcards, trailing wildcards, middle-of-string matches — all served from that single GIN index. The sequential scan that was quietly getting worse with every new row simply stops being a concern.
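It is worth confirming that the planner agrees. The sketch below assumes the products table and idx_products_name_trgm index from the example above; on a very small table PostgreSQL may still prefer a sequential scan, which is expected.

```sql
-- Check that the planner actually uses the trigram index.
-- Look for a Bitmap Index Scan on idx_products_name_trgm in the plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM products WHERE name ILIKE '%widget%';

-- A pattern with fewer than three consecutive characters yields no
-- extractable trigrams, so expect little or no benefit from the index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM products WHERE name ILIKE '%wi%';
```

If short patterns are common in your search box, consider requiring a minimum input length before querying.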

GIN vs GiST indexes

pg_trgm provides operator classes for both GIN and GiST indexes. They serve different masters, and knowing which to choose will save you from an awkward re-indexing later.

SQL

-- GIN index: faster reads, slower writes, larger on disk
CREATE INDEX idx_gin ON products USING gin (name gin_trgm_ops);

-- GiST index: slower reads, faster writes, supports distance operator
CREATE INDEX idx_gist ON products USING gist (name gist_trgm_ops);

-- GiST supports the distance operator for ORDER BY similarity:
SELECT name
FROM products
ORDER BY name <-> 'Posgres'
LIMIT 10;

-- GIN does not support <-> ordering.
-- Rule of thumb: use GIN for search, GiST for nearest-neighbor ranking.

Property	GIN	GiST
Read speed	Faster	Slower
Write speed	Slower	Faster
Index size	Larger	Smaller
LIKE/ILIKE acceleration	Yes	Yes
Distance operator (<->)	No	Yes
Similarity operator (%)	Yes	Yes
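To make the distance operator concrete: it returns one minus the similarity() value, so smaller is closer. A short sketch that shows both columns side by side, assuming the products table and the GiST index from the example above.

```sql
-- Index-assisted ordering needs the GiST index (idx_gist);
-- without it the query still runs, just with a full sort.
SELECT name,
       name <-> 'Posgres'          AS distance,   -- 1 - similarity
       similarity(name, 'Posgres') AS sim
FROM products
ORDER BY name <-> 'Posgres'
LIMIT 10;
```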

Tuning the similarity threshold

The % operator uses a configurable threshold to decide what counts as a match. The similarity() function itself always returns the raw score, so queries that filter on it directly supply their own cutoff.

SQL

-- Check the current similarity threshold (default: 0.3)
SHOW pg_trgm.similarity_threshold;

-- Set a stricter threshold for the current session
SET pg_trgm.similarity_threshold = 0.5;

-- The % operator uses this threshold:
SELECT * FROM products WHERE name % 'Posgres';
-- Returns rows where similarity(name, 'Posgres') >= 0.5

Lower thresholds return more results but include weaker matches. Higher thresholds are more precise but may miss legitimate matches. Start with the default (0.3) and adjust based on your data — the right threshold depends on how forgiving your users expect the search to be, and that is something only your data can tell you. Depesz published a thorough walkthrough of trigram index performance that remains a useful reference.
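When experimenting with thresholds on a shared or pooled connection, it can help to scope the change to a single transaction. A minimal sketch using SET LOCAL, again with the products table and an illustrative value of 0.5:

```sql
BEGIN;
-- SET LOCAL lasts only until the end of this transaction,
-- so other queries on the same connection keep their setting.
SET LOCAL pg_trgm.similarity_threshold = 0.5;

SELECT name, similarity(name, 'Posgres') AS sim
FROM products
WHERE name % 'Posgres'
ORDER BY sim DESC;
COMMIT;
```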

Cloud availability

Provider	Status
Amazon RDS / Aurora	Available — contrib module, create with CREATE EXTENSION
Google Cloud SQL	Available
Azure Database for PostgreSQL	Available
Supabase	Available
Neon	Available
Crunchy Bridge	Available

How Gold Lapel relates

I should mention that Gold Lapel watches for exactly the pattern described above — repeated LIKE or ILIKE queries with leading wildcards hitting unindexed text columns. When it sees this, it recommends the trigram GIN index, and can create it automatically if you have given it permission to do so.

This is, in my experience, one of the most frequently surfaced recommendations. The story is always the same: a search feature ships with a simple WHERE name LIKE '%term%' that performs admirably on a small table, then quietly degrades to multi-second sequential scans as the data grows. Gold Lapel catches this trajectory early — before your users notice, and well before anyone opens a performance ticket at 2 a.m.
