Appendix D: Filter Operator Reference
All fifteen operators of Gold Lapel's filter language, plus the containment shortcut for plain equality and the dot-notation convention. Chapter 10 is the narrative treatment. This is the page to keep open in another tab.
For each operator: MongoDB syntax, Gold Lapel syntax (identical), the generated SQL, and whether the clause is served by the default jsonb_path_ops GIN index or bypasses it.
A note on indexing: every plain-equality predicate and the $eq operator compile to data @> '{`{"field": value}`}', which the GIN index serves as an inverted-index lookup. The seven comparison operators, $in, $nin, $regex, $elemMatch, and nested $exists do not use the JSONB GIN index. On large tables they are sequential scans unless you add a targeted expression index — the cast chain in the index must match the generated SQL character for character.
Plain Equality and the Containment Operator
The fastest path in the filter language. Every plain-valued predicate routes to the containment branch, and adjacent plain predicates collapse into a single @> probe.
| Syntax | Generated SQL | Indexed? |
|---|---|---|
{"status": "active"} | data @> '{`{"status":"active"}`}' | Yes (GIN) |
{"status": "active", "tier": "gold"} | data @> '{`{"status":"active","tier":"gold"}`}' | Yes (one GIN probe) |
{"addr": {"city": "NY", "zip": "10001"}} | data @> '{`{"addr":{"city":"NY","zip":"10001"}}`}' | Yes (nested) |
Two plain predicates collapse into one containment object, not two. One GIN probe is cheaper than two intersected.
Comparison Operators
| Operator | Gold Lapel syntax | Generated SQL | Indexed by default? |
|---|---|---|---|
$eq | {"field": {"$eq": v}} | data @> '{`{"field": v}`}' | Yes (GIN). Note the null exception below. |
$ne | {"field": {"$ne": v}} | NOT (data @> '{`{"field": v}`}') | Partial |
$gt | {"field": {"$gt": v}} | (data->>'field')::T > v | No. Add an expression index. |
$gte | {"field": {"$gte": v}} | (data->>'field')::T >= v | No. |
$lt | {"field": {"$lt": v}} | (data->>'field')::T < v | No. |
$lte | {"field": {"$lte": v}} | (data->>'field')::T <= v | No. |
Type-aware casting. Gold Lapel inspects the JSON type of the right-hand value at compile time and emits the correct cast: numbers → numeric, booleans → boolean, ISO-8601 strings recognised as timestamps → timestamptz. Other strings stay as text and sort lexically.
The null exception. $eq and plain equality are equivalent for every scalar value except null. {"deleted_at": null} emits data @> '{`{"deleted_at": null}`}', which matches rows where the key is present with a null value — not rows where the key is absent. MongoDB's {deleted_at: null} matches both. If your application relies on MongoDB's laxer semantics, write {"$or": [{"deleted_at": null}, {"deleted_at": {"$exists": false}}]}.
$nin and the NULL trap. SQL NOT IN with a NULL in the list collapses to unknown. Gold Lapel rejects null in $nin lists at compile time. For missing-field semantics compatibility, Gold Lapel emits data->>'field' IS NULL OR data->>'field' NOT IN (...) — preserving MongoDB's behaviour that a missing field should match $nin.
Expression indexes. For hot comparison paths:
CREATE INDEX users_age_idx ON users (((data->>'age')::numeric)); The double parentheses are required by PostgreSQL's expression-index grammar. The cast chain must match what Gold Lapel emits exactly — (data->>'age')::integer will not accelerate a query that casts to numeric.
Set Membership: $in, $nin
| Operator | Syntax | Generated SQL | Indexed? |
|---|---|---|---|
$in | {"field": {"$in": [v1, v2, ...]}} | data->>'field' IN (v1, v2, ...) | No (add expression index) |
$nin | {"field": {"$nin": [v1, v2, ...]}} | data->>'field' IS NULL OR data->>'field' NOT IN (...) | No |
For non-string values, the cast lives inside the extraction: {"total": {"$in": [49, 99]}} emits (data->>'total')::numeric IN (49, 99). Gold Lapel reads the JSON type of the first list element to choose the cast.
Key Existence: $exists
| Syntax | Generated SQL | Indexed? |
|---|---|---|
{"field": {"$exists": true}} (top level) | data ? 'field' | Yes under jsonb_ops. Not supported by jsonb_path_ops. |
{"field": {"$exists": false}} (top level) | NOT (data ? 'field') | Partial |
{"profile.verified": {"$exists": true}} (nested) | data #> '{profile,verified}' IS NOT NULL | No (add an expression index on the path) |
The default collection GIN index uses jsonb_path_ops, which does not support the ? key-existence operator. If $exists is a frequent pattern, either:
- Change the GIN opclass to
jsonb_ops(larger index, supports?,?|,?&), or - Add a targeted expression index on the specific path.
$exists: false emits NOT (data ? 'field'), which is not the same as data->>'field' IS NULL. The latter is true when the key is present but explicitly null. Gold Lapel's semantics follow MongoDB's: absent means absent.
Regex: $regex
| Syntax | Generated SQL | Indexed? |
|---|---|---|
{"field": {"$regex": "pattern"}} | data->>'field' ~ 'pattern' | No |
{"field": {"$regex": "pattern", "$options": "i"}} | data->>'field' ~* 'pattern' | No |
Flag mapping (MongoDB → PostgreSQL): i → ~* (case-insensitive operator). m (multiline) → (?n) prepended. s (dotall) → (?s) prepended. x (extended whitespace) → (?x) prepended.
GIN on JSONB does not support ~. For prefix-anchored patterns (^alice), add a B-tree with text_pattern_ops:
CREATE INDEX users_name_pattern_idx
ON users ((data->>'name') text_pattern_ops); For substring or fuzzy patterns, pg_trgm (trigram GIN) is the right tool. CREATE EXTENSION is DBA-scope and Gold Lapel does not install it.
Logical Operators: $or, $and, $not
| Operator | Syntax | Generated SQL | Indexed? |
|---|---|---|---|
$or | {"$or": [f1, f2, ...]} | (sql(f1) OR sql(f2) OR ...) | Per-branch |
$and | {"$and": [f1, f2, ...]} | (sql(f1) AND sql(f2) AND ...) | Per-branch |
$not | {"$not": {...}} | NOT (sql(inner)) | Per-branch |
The filter compiler is recursive. A branch that reduces to plain equalities still compiles to @>; a branch with operator values fans out. Every indexed fast path survives nesting.
Gold Lapel does not rewrite OR-of-equalities into $in, because the two forms have subtle differences in null handling. The compiler emits what you wrote.
Array Element Matching: $elemMatch
| Syntax | Generated SQL | Indexed? |
|---|---|---|
{"arr": {"$elemMatch": { | EXISTS (SELECT 1 FROM jsonb_array_elements(data->'arr') AS elem WHERE <compiled inner on elem>) | No (add expression index on array elements if hot) |
$elemMatch scopes a multi-condition filter to a single array element. Without it, two top-level containment predicates on an array field can be satisfied by different elements — not what you asked.
Gold Lapel emits EXISTS + jsonb_array_elements rather than jsonb_path_exists for three reasons: the generated SQL is readable in EXPLAIN; an expression index on the array elements is definable and usable; and the inner predicate composes with the rest of the filter grammar without a second parser.
Full-Text Search: $text
| Syntax | Generated SQL | Indexed? |
|---|---|---|
{"$text": {"$search": "terms"}} (document-wide) | to_tsvector('english', data::text) @@ plainto_tsquery('english', 'terms') | Yes with GIN on to_tsvector(...) |
{"field": {"$text": {"$search": "terms"}}} (field-scoped) | to_tsvector('english', data->>'field') @@ plainto_tsquery('english', 'terms') | Yes with GIN on the field's tsvector |
{"$text": {"$search": "...", "$language": "french"}} | to_tsvector('french', ...) @@ plainto_tsquery('french', ...) | Yes |
to_tsvector on the fly is correct but slow. In production, back it with an expression GIN index:
CREATE INDEX docs_notes_fts_idx
ON docs USING gin (to_tsvector('english', data->>'notes')); Language mapping: english, french, german, spanish, italian, portuguese, dutch, russian map directly. Languages outside PostgreSQL's bundled set (Arabic, Korean, Chinese) require a custom text-search configuration or a third-party extension. The default is english, configurable per collection at connection time.
For the full treatment of PostgreSQL search — tsvector, pg_trgm, fuzzystrmatch, pgvector — see You Don't Need Elasticsearch.
Dot Notation
| Syntax | Expanded to | Generated SQL |
|---|---|---|
{"addr.city": "NY"} | {"addr": {"city": "NY"}} | data @> '{`{"addr":{"city":"NY"}}`}' |
{"addr.city": "NY", "addr.zip": "10001"} | {"addr": {"city": "NY", "zip": "10001"}} | data @> '{`{"addr":{"city":"NY","zip":"10001"}}`}' (sibling merge) |
{"billing.address.country": "FR"} | {"billing": {"address": {"country": "FR"}}} | data @> '{`{"billing":{"address":{"country":"FR"}}}`}' |
Dot notation is syntactic sugar for nested objects and runs before SQL generation. Sibling merge preserves the indexed fast path: every dotted predicate under a shared prefix merges into a single containment probe. The depth does not cost you extra operators.
Mixing a dotted key with an explicit nested object at the same path deep-merges. Conflicting leaf values at the same path are implementation-defined — do not write such a filter.
Mixed Filters
Plain-valued keys fold into one containment object. Operator-valued keys each produce one SQL clause. Everything is ANDed.
users.find({
"status": "active",
"tier": "gold",
"age": { "$gt": 21 },
"signup": { "$gte": "2024-01-01" }
}) SELECT data FROM users
WHERE data @> '{"status":"active","tier":"gold"}'
AND (data->>'age')::numeric > 21
AND (data->>'signup')::timestamptz >= '2024-01-01'; Two plain predicates, one @>. Two operator predicates, two clauses. Three SQL expressions, not four. The @> clause with GIN prunes first; the operator clauses filter what survives.
Index Coverage Summary
| Operator | Default GIN (jsonb_path_ops) | jsonb_ops GIN | Expression index |
|---|---|---|---|
Plain equality / $eq | Yes | Yes | Possible on (data->>'field') |
$ne | Partial | Partial | Possible |
$gt, $gte, $lt, $lte | No | No | Required for speed |
$in, $nin | No | No | Required for speed |
$exists: true | No | Yes | Possible |
$regex (prefix) | No | No | text_pattern_ops B-tree |
$regex (substring) | No | No | pg_trgm GIN |
$or / $and / $not | Per-branch | Per-branch | Per-branch |
$elemMatch | No | No | Expression index on array elements |
$text | No | No | GIN on to_tsvector(...) required |
| Dot notation (plain) | Yes | Yes | — |
Seven operators bypass the JSONB GIN index by default. Add nested $exists and $elemMatch and the count is nine of fifteen. On a large table, those are sequential scans without targeted expression indexes. Gold Lapel reports candidate expressions in EXPLAIN output but does not create them automatically — the write-amplification cost of an index belongs to the person who owns the table.