PostgreSQL JSON vs JSONB: Which to Use and When
One stores text. The other stores parsed binary. The difference is everything.
Two types, one name, very different behavior
PostgreSQL has two JSON column types: json and jsonb. They accept the same input, support some of the same operators, and store the same logical data. But they are fundamentally different in how they store, access, and index that data — and the distinction determines which one you should use.
json arrived in PostgreSQL 9.2 (2012). It stores JSON as a validated text string — whitespace, key ordering, and duplicate keys all preserved verbatim.
jsonb arrived in PostgreSQL 9.4 (2014). The "b" stands for binary. Instead of storing the raw text, PostgreSQL parses the document and converts it into a decomposed binary representation. Whitespace is stripped. Keys are deduplicated (last value wins). Key ordering is not preserved — keys are stored in a sorted structure optimized for lookup.
This architectural difference — text storage versus parsed binary storage — drives every performance, indexing, and querying difference discussed in this article.
CREATE TABLE demo_json (data json);
CREATE TABLE demo_jsonb (data jsonb);
INSERT INTO demo_json VALUES ('{"b": 2, "a": 1, "b": 3}');
INSERT INTO demo_jsonb VALUES ('{"b": 2, "a": 1, "b": 3}');
SELECT data FROM demo_json;
-- {"b": 2, "a": 1, "b": 3}
-- Exact input preserved: whitespace, key order, duplicate "b" key
SELECT data FROM demo_jsonb;
-- {"a": 1, "b": 3}
-- Whitespace stripped, keys sorted, duplicate "b" resolved (last value wins)

The json column returned the original text byte-for-byte. The jsonb column returned a normalized representation. Neither is wrong — they serve different purposes.
Storage and performance differences
Write performance
json is faster on INSERT. PostgreSQL validates the syntax and stores the text as-is. jsonb is slower on INSERT because PostgreSQL must parse the entire document, resolve duplicate keys, and build the binary representation. In INSERT-heavy benchmarks, the overhead is typically 5-15%.
For the vast majority of workloads, this write-side cost is negligible relative to the read-side gains.
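A rough way to observe the write-side overhead yourself is to load the same documents into both types and compare timings (for example with psql's \timing). This is a sketch; the table names and document shape are illustrative, and results vary with document size and hardware:

```sql
-- Identical documents, two storage types
CREATE TABLE bench_json  (data json);
CREATE TABLE bench_jsonb (data jsonb);

-- json: validate and store the text as-is
INSERT INTO bench_json
SELECT json_build_object('id', i, 'name', 'user_' || i, 'active', true)
FROM generate_series(1, 100000) AS i;

-- jsonb: parse, deduplicate, and build the binary form on every insert
INSERT INTO bench_jsonb
SELECT jsonb_build_object('id', i, 'name', 'user_' || i, 'active', true)
FROM generate_series(1, 100000) AS i;
```

The jsonb load should come in somewhat slower, in line with the 5-15% figure above for small, flat documents.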
Read and query performance
When you access a key inside a json column, PostgreSQL must parse the entire text string from the beginning. This happens on every row, for every query. When you access a key inside a jsonb column, PostgreSQL reads directly from the binary structure — O(log n) key lookup.
The difference scales with table size:
-- On a json column: parses 1 million JSON texts
SELECT data->>'email' FROM users;
-- On a jsonb column: performs 1 million binary key lookups
SELECT data->>'email' FROM users;

The jsonb query is substantially faster — often by a factor of 3-10x depending on document size. Add a WHERE clause that filters on a JSON key, and the difference widens further because jsonb can use indexes while json cannot.
Size on disk
jsonb documents are typically slightly larger due to metadata, but jsonb strips whitespace and deduplicates keys. In practice, the size difference is small enough that it should not influence the decision.
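You can measure the trade-off directly with pg_column_size, which reports the on-disk size of a value. Exact byte counts vary by PostgreSQL version, so none are claimed here:

```sql
-- json keeps the padding whitespace; jsonb drops it but adds binary metadata
SELECT pg_column_size('{"a": 1,   "b": 2}'::json)  AS json_bytes,
       pg_column_size('{"a": 1,   "b": 2}'::jsonb) AS jsonb_bytes;
```

For documents with heavy whitespace or duplicate keys, jsonb can even come out smaller; for compact documents, json usually does.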
Indexing — where JSONB earns its place
Indexing is the most consequential difference. If you need to search, filter, or join on data inside a JSON column, this section deserves your full attention.
GIN indexes on JSONB
JSONB supports GIN (Generalized Inverted Index) indexes. A GIN index on a JSONB column indexes the entire document:
CREATE INDEX idx_orders_data ON orders USING GIN (data);

Containment queries using the @> operator:
-- Find all orders with status "shipped"
SELECT * FROM orders WHERE data @> '{"status": "shipped"}';
-- Find all orders containing a specific tag in a nested array
SELECT * FROM orders WHERE data @> '{"tags": ["priority"]}';

Existence queries using the ? operator:
-- Find all orders that have a "priority" key
SELECT * FROM orders WHERE data ? 'priority';
-- Find all orders that have any of these keys
SELECT * FROM orders WHERE data ?| array['priority', 'expedited'];

For containment-only workloads, jsonb_path_ops builds a more compact index:
CREATE INDEX idx_orders_data_path ON orders USING GIN (data jsonb_path_ops);

The jsonb_path_ops index is typically 20-40% smaller and faster for @> queries, but does not support key-existence operators.
Expression indexes on JSONB
When you query a specific key repeatedly, an expression index performs identically to an index on a regular column:
-- Index a specific extracted value
CREATE INDEX idx_users_email ON users ((data->>'email'));
-- Uses idx_users_email — same performance as indexing a text column
SELECT * FROM users WHERE data->>'email' = 'guest@example.com';

Expression indexes can be combined with partial indexes:
-- Index only active users' emails
CREATE INDEX idx_active_users_email ON users ((data->>'email'))
WHERE (data->>'active')::boolean = true;

For a deeper overview of all available index types, see the index types guide.
JSON has almost no indexing support
The json type has no GIN operator classes, and the containment and existence operators do not exist for it at all. The only option is a b-tree expression index on an extracted value. The conclusion is unambiguous: if you need to search inside your JSON data, JSONB is the only serious option.
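For completeness, the one thing that does work on a json column is a b-tree expression index on extracted text — the same technique shown above for jsonb. A sketch, using a hypothetical events table:

```sql
-- Works on json as well as jsonb: ->> returns text on both types
CREATE INDEX idx_events_type ON events ((data->>'type'));

-- Equality on the extracted value can use the index
SELECT * FROM events WHERE data->>'type' = 'signup';

-- No json equivalent exists for these: @> and ? are jsonb-only
```

This covers exact-match lookups on known keys and nothing more; any containment or existence query still requires jsonb.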
Operator and function differences
Both types share basic extraction operators (->, ->>, #>, #>>). JSONB supports a substantially richer set:
| Operator | Description | JSON | JSONB |
|---|---|---|---|
| `@>` | Contains (left contains right) | No | Yes |
| `<@` | Contained by | No | Yes |
| `?` | Key exists | No | Yes |
| `?\|` | Any of these keys exist | No | Yes |
| `?&` | All of these keys exist | No | Yes |
| `\|\|` | Concatenate two JSONB values | No | Yes |
| `-` | Remove key or array element | No | Yes |
| `#-` | Remove value at path | No | Yes |
JSONB also has exclusive functions for manipulation:
-- Update a nested value
SELECT jsonb_set('{"user": {"name": "old"}}', '{user,name}', '"new"');
-- {"user": {"name": "new"}}
-- Insert a new key
SELECT jsonb_insert('{"a": 1}', '{b}', '2');
-- {"a": 1, "b": 2}
-- Remove null values
SELECT jsonb_strip_nulls('{"a": 1, "b": null, "c": 3}');
-- {"a": 1, "c": 3}
-- Pretty-print for debugging
SELECT jsonb_pretty('{"a":1,"b":{"c":2}}');
-- {
-- "a": 1,
-- "b": {
-- "c": 2
-- }
-- }

The containment operator @> deserves emphasis — it enables indexed lookups into nested structures:
-- Find users in a specific city, nested inside an address object
SELECT * FROM users
WHERE profile @> '{"address": {"city": "Portland"}}';

When JSON is actually the right choice
JSONB is the better choice in nearly every case, but json has legitimate uses:
- Ingest-only storage. If you receive JSON from an external API and store it for later retrieval — never querying its internal structure — json preserves the exact original text and avoids the conversion overhead.
- Audit logging. When byte-for-byte fidelity matters — preserving exact key ordering, whitespace, and formatting for compliance or forensic purposes.
- Temporary staging. JSON data that lands in a staging table and will be parsed into proper columns by a batch ETL process.
All three scenarios share a characteristic: the JSON data is stored and retrieved as a complete document, never queried by internal structure. The moment you need to filter, extract, or join on a key inside the document, json becomes a liability.
Migrating from JSON to JSONB
The simplest approach:
ALTER TABLE orders ALTER COLUMN data TYPE jsonb USING data::jsonb;

This rewrites the entire table and acquires an ACCESS EXCLUSIVE lock. For small tables, fine. For large tables, a batched migration avoids extended downtime:
-- Step 1: Add a new JSONB column
ALTER TABLE orders ADD COLUMN data_jsonb jsonb;
-- Step 2: Backfill in batches
UPDATE orders SET data_jsonb = data::jsonb
WHERE id BETWEEN 1 AND 100000;
-- Repeat for subsequent ranges, committing between batches
-- Step 3: Swap columns (brief lock for the rename)
ALTER TABLE orders RENAME COLUMN data TO data_json_old;
ALTER TABLE orders RENAME COLUMN data_jsonb TO data;
-- Step 4: Drop the old column after verifying correctness
ALTER TABLE orders DROP COLUMN data_json_old;

What changes after migration: duplicate keys are deduplicated, key ordering may change, and whitespace is stripped. Test your application for code that depends on these properties.
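Between Steps 2 and 3, a sanity check is worth running before committing to the swap. A sketch, using the column names from the batched migration above; it should return 0 once the backfill is complete, since the cast is deterministic:

```sql
-- Rows where the backfilled jsonb disagrees with a fresh conversion
-- (IS DISTINCT FROM treats NULLs sanely, so unbackfilled rows surface too)
SELECT count(*)
FROM orders
WHERE data_jsonb IS DISTINCT FROM data::jsonb;
```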
For broader performance tuning strategies, see the PostgreSQL performance tuning guide.
JSONB in practice — patterns and anti-patterns
Good patterns
Flexible metadata alongside structured columns — the most common and most effective use:
CREATE TABLE products (
id serial PRIMARY KEY,
name text NOT NULL,
price numeric(10, 2) NOT NULL,
category text NOT NULL,
attributes jsonb DEFAULT '{}'
);
-- Structured data is queryable, type-safe, and constrained
-- Variable data (color, size, material, wattage — varies by product) lives in JSONB
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);

Tag-based and faceted filtering:
-- Products tagged "organic" and "local"
SELECT * FROM products
WHERE attributes @> '{"tags": ["organic"]}'
AND attributes @> '{"tags": ["local"]}';

Configuration and feature flags:
CREATE TABLE user_preferences (
user_id integer PRIMARY KEY REFERENCES users(id),
preferences jsonb DEFAULT '{}'
);
CREATE INDEX idx_prefs ON user_preferences USING GIN (preferences);
-- Find all users with dark mode enabled
SELECT user_id FROM user_preferences
WHERE preferences @> '{"theme": "dark"}';

Anti-patterns
Storing everything in a single JSONB column. When a table has one id column and one data JSONB column, you have a document database with less favorable performance than purpose-built document stores. You lose type safety, NOT NULL constraints, foreign keys, and planner statistics on individual columns.
Using JSONB as a relational substitute:
-- Anti-pattern: relational data buried in JSONB
SELECT o.* FROM orders o
JOIN users u ON (o.data->>'user_id')::integer = u.id;
-- Correct: relational data as a column
SELECT o.* FROM orders o
JOIN users u ON o.user_id = u.id;

Deeply nested queries across multiple levels:
-- Fragile and hard to maintain
SELECT data->'order'->'items'->0->'product'->'variants'->0->>'sku'
FROM orders;

The guideline: JSONB is for semi-structured data that varies between rows. Structured data that every row shares should be proper columns with proper types.
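When a key buried in JSONB turns out to be relational after all — like the user_id in the anti-pattern above — a generated column (PostgreSQL 12+) offers a way out without touching the application's writes. A sketch, assuming the orders table from the anti-pattern:

```sql
-- Promote a frequently joined JSONB key to a real, typed, indexable column.
-- The value is computed from data on write; application inserts are unchanged.
ALTER TABLE orders
ADD COLUMN user_id integer
GENERATED ALWAYS AS ((data->>'user_id')::integer) STORED;

CREATE INDEX idx_orders_user_id ON orders (user_id);

-- Joins now use a plain integer column with planner statistics
SELECT o.* FROM orders o JOIN users u ON o.user_id = u.id;
```

Adding a foreign key constraint on the generated column then restores the referential integrity the JSONB-buried value never had.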
Honest counterpoint
JSONB is powerful enough that it can tempt teams into treating PostgreSQL as a document database. It works — up to a point. But "works" and "works well" are different things.
For read-heavy workloads on deeply nested documents with complex queries spanning multiple levels of nesting, better relational schema design will outperform a JSONB-centric approach. PostgreSQL's query planner has limited visibility into JSONB column contents.
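The planner's blind spot is visible in EXPLAIN output. A sketch, assuming a products table with a GIN-indexed attributes column as in the patterns above:

```sql
EXPLAIN
SELECT * FROM products
WHERE attributes @> '{"tags": ["organic"]}';
-- The row estimate for the @> filter is largely guesswork: PostgreSQL
-- keeps statistics for the jsonb value as a whole, not per key, so
-- selectivity misestimates can push the planner toward poor join orders
```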
The best use of JSONB is hybrid: structured columns for the data you query, filter, join, and constrain, with JSONB for the flexible remainder. A products table with name, price, and category as columns and attributes as JSONB is a strong design. A products table with everything in a single data JSONB column is a schema problem that JSONB makes easy to create and hard to undo.
PostgreSQL's JSON support is exceptional — genuinely so. It does not make schema design optional.
For practical guidance on using JSONB effectively, see the JSONB how-to guide.