← PostgreSQL Concepts

JSONB

Binary JSON storage in PostgreSQL. Parsed on write, indexable with GIN, and queryable with a rich set of operators. If you find yourself reaching for a document database, allow me to suggest you check the room you are already standing in.

Concept · March 21, 2026 · 9 min read

JSONB is PostgreSQL's binary JSON type. Unlike the json type, which stores the exact text you give it, JSONB decomposes the document into a binary format on write. This enables indexing (GIN), containment queries (@>, <@), path extraction (->>, #>>), and key-existence checks (?). The trade-off: duplicate keys are removed, key ordering is not preserved, and insignificant whitespace is discarded. I have yet to encounter a production workload where that trade-off was not well worth it.

What JSONB is

JSONB stores JSON documents in a decomposed binary format. When you insert a document, PostgreSQL parses the JSON text, validates it, removes duplicate keys (keeping the last value), and writes the result as a binary structure. A small investment on write that pays for itself on every subsequent read and query. PostgreSQL at its most pragmatic.

The binary format enables direct access to nested values without re-parsing the JSON text. When you extract a key from a JSONB column, PostgreSQL traverses the binary structure to find it — it does not scan the string character by character, looking for matching quotes. This is what makes JSONB genuinely queryable, not merely storable.

SQL
-- Create a table with a JSONB column
CREATE TABLE events (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  payload jsonb NOT NULL,
  created_at timestamptz DEFAULT now()
);

-- Insert a document
INSERT INTO events (payload) VALUES (
  '{"type": "signup", "user": {"name": "Alice", "plan": "pro"}, "tags": ["new", "organic"]}'
);

JSON vs JSONB

PostgreSQL offers two JSON types, and the distinction is not academic. If you will permit me a firm recommendation early: use JSONB. Nearly always.

jsonjsonb
StorageExact text, as insertedDecomposed binary
Duplicate keysPreserved (all duplicates kept)Last value wins
Key orderingPreservedNot guaranteed
WhitespacePreservedDiscarded
Write speedFaster (no parsing beyond validation)Slower (full parse + binary conversion)
Read speedSlower (re-parsed on every access)Faster (direct binary access)
IndexingNot supportedGIN indexes, expression indexes
Containment (@>)Not supportedSupported
Equality (=)Not supportedSupported
SQL
-- JSON: stores exact text, preserves whitespace and key order
SELECT '{"b": 1,  "a": 2}'::json;
-- returns: {"b": 1,  "a": 2}

-- JSONB: parses on write, removes duplicates, normalizes
SELECT '{"b": 1,  "a": 2, "b": 3}'::jsonb;
-- returns: {"a": 2, "b": 3}   (duplicate "b" removed, keys sorted)

Use json only when you need to preserve the exact text as inserted — audit logging, for example, where the original formatting has legal or compliance significance. For everything else, use jsonb. I do not say this to be prescriptive. I say it because the list of genuine reasons to prefer json is vanishingly short.

Querying JSONB

JSONB comes with a well-appointed set of operators for navigating into documents, extracting values, and testing containment. The syntax takes a moment to learn; the capability it unlocks is considerable.

SQL
-- Arrow operators: navigate into the document
SELECT payload -> 'user'              -- returns JSONB object: {"name": "Alice", "plan": "pro"}
FROM events;

SELECT payload -> 'user' ->> 'name'   -- returns text: Alice
FROM events;

-- Path operators: navigate multiple levels at once
SELECT payload #> '{user, name}'      -- returns JSONB: "Alice"
FROM events;

SELECT payload #>> '{user, name}'     -- returns text: Alice
FROM events;

-- Containment: does the document contain this sub-structure?
SELECT * FROM events
WHERE payload @> '{"type": "signup"}';

-- Key existence: does the top-level key exist?
SELECT * FROM events
WHERE payload ? 'tags';
OperatorDescriptionReturns
->Get JSON object field by key or array element by indexjsonb
->>Get JSON object field or array element as texttext
#>Get value at a nested pathjsonb
#>>Get value at a nested path as texttext
@>Left contains right (containment)boolean
<@Left is contained by rightboolean
?Does the top-level key exist?boolean
?|Do any of these keys exist?boolean
?&Do all of these keys exist?boolean

The arrow operators (->, ->>) are the workhorses for extracting data. The containment operator (@>) is the workhorse for filtering — and the one that benefits most from GIN indexing. If you learn one JSONB operator well, make it that one.

Indexing JSONB

JSONB supports two indexing strategies, and knowing which to reach for is one of those quiet decisions that separates a well-maintained schema from one that merely functions. GIN indexes serve containment and key-existence queries. Expression indexes (typically B-tree) serve specific extracted paths.

GIN indexes

A GIN index on a JSONB column indexes every key and value in every document. This makes it possible to answer containment queries (@>) and key-existence checks (?) without scanning the table. Indexed flexibility — a combination I am always pleased to see.

SQL
-- Default operator class (jsonb_ops): supports @>, ?, ?|, ?&
CREATE INDEX idx_events_payload ON events USING gin (payload);

-- Path operator class (jsonb_path_ops): supports only @>, but smaller and faster
CREATE INDEX idx_events_payload_path ON events USING gin (payload jsonb_path_ops);

-- These queries use the GIN index:
SELECT * FROM events WHERE payload @> '{"type": "signup"}';
SELECT * FROM events WHERE payload ? 'email';          -- jsonb_ops only

The jsonb_ops operator class (default) indexes keys and values separately, supporting the full range of JSONB operators. The jsonb_path_ops class hashes the full path to each value, producing a smaller index (roughly 30-40% smaller) that is faster for containment but only supports @>.

Expression indexes

When you always query the same path — WHERE payload ->> 'type' = 'signup' — a GIN index is more generosity than the situation requires. An expression index on that specific path is smaller, faster, and uses a B-tree for equality and range operations.

SQL
-- Expression index on a specific JSONB path
CREATE INDEX idx_events_type ON events ((payload ->> 'type'));

-- This query uses the expression index (B-tree on the extracted text value)
SELECT * FROM events WHERE payload ->> 'type' = 'signup';

-- Combine with a partial index for even more selectivity
CREATE INDEX idx_events_signup ON events ((payload ->> 'plan'))
WHERE payload ->> 'type' = 'signup';

Expression indexes also enable you to cast the extracted value to a specific type — for example, ((payload ->> 'amount')::numeric) — and then use range queries on it. GIN indexes cannot do this.

Modifying JSONB

JSONB supports in-place modification using the concatenation operator (||), jsonb_set(), and key removal (-, #-).

SQL
-- Set a top-level key
UPDATE events SET payload = payload || '{"status": "confirmed"}';

-- Set a nested key
UPDATE events SET payload = jsonb_set(payload, '{user, verified}', 'true');

-- Remove a key
UPDATE events SET payload = payload - 'tags';

-- Remove a nested key
UPDATE events SET payload = payload #- '{user, plan}';

One thing to be aware of: these operations rewrite the entire JSONB value. PostgreSQL does not modify the binary structure in place — it creates a new version of the whole document. For small documents this is negligible. For documents that are hundreds of KB, frequent partial updates can be expensive. If you update one key in a 500 KB document, PostgreSQL writes a new 500 KB document. I mention this not to discourage, but because surprises in write amplification are the sort of thing one prefers to learn from documentation rather than from a production incident.

When to use JSONB vs relational columns

Allow me to be direct: JSONB is not a replacement for relational modeling. It is a complement to it. Structure when you can, JSONB when you must. The decision comes down to how predictable your data is and how you need to query it.

Use JSONB when:

  • The schema varies between rows — user preferences, feature flags, per-tenant configuration
  • You are storing metadata or tags where the set of keys is not known in advance
  • You are storing API responses or external data that you do not control
  • You query the data by containment (@>) or key existence (?) rather than by specific field values
  • The document is a self-contained unit that you read and write as a whole

Use relational columns when:

  • The schema is consistent across rows — every row has the same fields
  • You need constraints (NOT NULL, UNIQUE, FOREIGN KEY, CHECK)
  • You need to JOIN on the data frequently
  • You need to aggregate (SUM, AVG, GROUP BY) on specific fields
  • Individual fields need different data types with type safety

A common and effective pattern is to use relational columns for structured fields you query and join on, and a JSONB column for the rest — the "extras" column pattern. You get the best of both: schema enforcement where it matters, flexibility where it does not. This is PostgreSQL at its most versatile, and I find it entirely admirable that you need not hire a document database to get document storage.

How Gold Lapel relates

Gold Lapel analyzes query patterns at the proxy level and identifies JSONB columns that would benefit from indexing. When it observes containment queries (@>) hitting a JSONB column without a GIN index, it recommends one — including the appropriate operator class based on whether the workload uses containment only (jsonb_path_ops) or also needs key-existence operators (jsonb_ops). The distinction matters, and it is not one I leave to chance.

For workloads that consistently query the same JSONB path with equality or range conditions, Gold Lapel recommends expression indexes on those specific paths instead of a full GIN index. The right index depends on the query pattern, not the data type — and Gold Lapel observes the patterns before making the recommendation. A recommendation without evidence is merely a guess, and I do not guess.

Frequently asked questions