PostgreSQL intarray: A Practical Guide to Integer Array Operations

Q: What is the difference between intarray operators and PostgreSQL's native array operators?

PostgreSQL's native @>, <@, and && operators work on any array type — integer[], text[], uuid[], and others. They support GIN indexing out of the box. intarray adds operators that exist only for integer arrays: & (intersection), | (union), and - (difference). These operators have no equivalent in standard PostgreSQL. intarray also provides GiST index support for integer arrays.

Q: Should I use a GIN or GiST index for integer array columns?

GIN indexes are generally faster for read-heavy containment and overlap queries. GIN results are exact (no false positives), and lookup performance is typically better. GiST indexes via intarray are better when you need the extended set operators (&, |, -) in indexed contexts, when write performance matters, or when you want fixed-size index entries. For most read-heavy applications, GIN is the default recommendation.

Q: Can intarray handle arrays with thousands of elements?

It can, but performance degrades with very large arrays. intarray is optimized for arrays of up to a few hundred elements. If your arrays regularly exceed 500-1,000 elements, consider a normalized junction table instead. Junction tables give you foreign key constraints, more flexible querying with JOINs, and better performance for very large relationships.

Q: Is intarray available on managed PostgreSQL services like Amazon RDS?

Yes. intarray is part of PostgreSQL's contrib modules and is available on virtually every managed provider, including Amazon RDS, Google Cloud SQL, Supabase, Neon, and Azure Flexible Server. It does not require shared_preload_libraries configuration or a server restart.

Q: When should I use intarray vs a junction table for many-to-many relationships?

Use intarray when the related IDs are read-mostly with occasional mutations, array sizes are modest (under a few hundred elements), you need fast set operations, and you do not need foreign key constraints. Use a junction table when you need referential integrity, arrays would be large, you need to JOIN on attributes of the related entity, or the relationship needs its own attributes. Many applications use both.

One operator where there were five lines of SQL. That is the kind of improvement worth attending to.

March 27, 2026 · 16 min read

The artist delivered three versions of the illustration. We took the intersection, subtracted the unnecessary elements, and produced the union. One operator where there were three meetings.

What intarray adds to PostgreSQL

PostgreSQL includes native array support — integer[], text[], and operators like @>, <@, and && for containment and overlap checks. These work on any array type and can be indexed with GIN indexes. For many applications, native arrays are sufficient.

intarray extends PostgreSQL specifically for integer arrays. It adds operators that the native array system does not provide — intersection (&), union (|), and difference (-) — along with utility functions for sorting, deduplication, and element lookup. It also provides GiST index support for integer array columns, enabling fast containment and overlap queries through a different indexing strategy than GIN.

Why a dedicated extension for integer arrays? Performance and convenience. intarray's operators are implemented in C and optimized for sorted integer arrays. Operations that would require subqueries with unnest() and INTERSECT in standard SQL become single-operator expressions. One operator where there were five lines of SQL. That is the kind of improvement worth attending to.

Installation requires no restart and no shared_preload_libraries configuration:

SQL

CREATE EXTENSION intarray;

intarray is part of PostgreSQL's contrib modules and is available on virtually every managed provider including Amazon RDS, Google Cloud SQL, Supabase, and Neon.

When to use intarray versus native array operators: install intarray when your workload involves integer arrays specifically and you need set operations (intersection, union, difference) or want GiST-based indexing for containment and overlap queries. If your arrays contain non-integer types, or if you only need basic containment checks with GIN indexing, native array operators will serve you well.

The core operators

Allow me to introduce the operators that make intarray worth installing.

Intersection — the & operator

The & operator returns elements present in both arrays:

SQL — intersection

SELECT ARRAY[1,2,3] & ARRAY[2,3,4];
-- Result: {2,3}

SELECT ARRAY[10,20,30,40] & ARRAY[20,40,60];
-- Result: {20,40}

SELECT ARRAY[1,2,3] & ARRAY[4,5,6];
-- Result: {}

The primary use case is finding common elements between two sets — shared tags between two items, overlapping permissions between two roles, or categories that two products have in common.

Standard PostgreSQL has no single-operator intersection for arrays. Without intarray, you would need a subquery approach:

SQL — without intarray

-- Without intarray: verbose and slower
SELECT ARRAY(
  SELECT unnest(ARRAY[1,2,3])
  INTERSECT
  SELECT unnest(ARRAY[2,3,4])
);

intarray's C implementation operates on sorted arrays in linear time, making it substantially faster than the subquery approach for repeated operations.

Union — the | operator

The | operator returns all unique elements from both arrays, deduplicated and sorted:

SQL — union

SELECT ARRAY[1,2,3] | ARRAY[3,4,5];
-- Result: {1,2,3,4,5}

SELECT ARRAY[10,20] | ARRAY[20,30] | ARRAY[30,40];
-- Result: {10,20,30,40}

The result is always sorted and deduplicated — a useful property for consistent storage. If you store arrays normalized with |, you need never worry about duplicate entries or inconsistent ordering.

Standard PostgreSQL has no single-operator array union. Without intarray, you would need:

SQL — without intarray

-- Without intarray
SELECT ARRAY(
  SELECT DISTINCT unnest(ARRAY[1,2,3] || ARRAY[3,4,5])
  ORDER BY 1
);

Difference — the - operator

The - operator returns elements in the first array that are not in the second:

SQL — difference

SELECT ARRAY[1,2,3,4,5] - ARRAY[2,4];
-- Result: {1,3,5}

SELECT ARRAY[10,20,30] - ARRAY[30,40,50];
-- Result: {10,20}

This is an asymmetric operation — A - B is not the same as B - A:

SQL — asymmetric difference

SELECT ARRAY[1,2,3] - ARRAY[3,4,5];
-- Result: {1,2}

SELECT ARRAY[3,4,5] - ARRAY[1,2,3];
-- Result: {4,5}

Use cases include removing revoked permissions, untagging items, and applying exclusion filters.

Containment — @> and <@ (enhanced)

The containment operators check whether one array contains all elements of another:

SQL — containment

SELECT ARRAY[1,2,3,4,5] @> ARRAY[2,4];
-- true (left contains right)

SELECT ARRAY[2,4] <@ ARRAY[1,2,3,4,5];
-- true (left is contained by right)

These operators exist in base PostgreSQL and work the same way. intarray does not change their behavior — it makes them indexable with GiST indexes. Without intarray, containment queries on integer arrays require either a GIN index or a sequential scan. With intarray, you gain an additional indexing option (GiST) that has different performance characteristics.

Overlap — && (enhanced)

The overlap operator checks whether two arrays share at least one element:

SQL — overlap

SELECT ARRAY[1,2,3] && ARRAY[3,4,5];
-- true (element 3 in common)

SELECT ARRAY[1,2,3] && ARRAY[4,5,6];
-- false (no elements in common)

Like containment, the && operator exists in base PostgreSQL. intarray enhances it with GiST index support. The practical value is in queries like "find all products that share at least one tag with this product" — with a GiST index on the tag array column, this query uses an index scan instead of a sequential scan.

Useful functions

sort(), sort_asc(), sort_desc()

The sort() function sorts the elements of an integer array:

SQL — sorting

SELECT sort(ARRAY[3,1,4,1,5,9,2,6]);
-- Result: {1,1,2,3,4,5,6,9}

SELECT sort_asc(ARRAY[3,1,2]);
-- Result: {1,2,3}

SELECT sort_desc(ARRAY[3,1,2]);
-- Result: {3,2,1}

Sorting matters because intarray's operators perform best on sorted input. The &, |, and - operators use sorted-array fast paths. Storing arrays pre-sorted — via a trigger, a generated column, or application code — ensures consistent performance.

uniq()

The uniq() function removes consecutive duplicate elements:

SQL — deduplication

SELECT uniq(ARRAY[1,1,2,2,3,3]);
-- Result: {1,2,3}

SELECT uniq(ARRAY[1,2,1,3,2,3]);
-- Result: {1,2,1,3,2,3}  -- only consecutive duplicates removed

Because uniq() only removes consecutive duplicates, it requires sorted input to fully deduplicate an array. The proper normalization pattern is sort() followed by uniq():

SQL — full normalization

SELECT uniq(sort(ARRAY[3,1,2,1,3,2]));
-- Result: {1,2,3}

This uniq(sort(...)) combination is the standard way to normalize integer arrays before storage.

idx()

The idx() function returns the 1-based position of an element in an array, or 0 if the element is not found:

SQL — element lookup

SELECT idx(ARRAY[10,20,30,40], 30);
-- Result: 3

SELECT idx(ARRAY[10,20,30,40], 50);
-- Result: 0

Use idx() to check whether a value exists in an array and to find its position. A return value of 0 means the element is absent.

subarray()

The subarray() function extracts a slice of the array, using 1-based indexing:

SQL — array slicing

SELECT subarray(ARRAY[10,20,30,40,50], 2, 3);
-- Result: {20,30,40}  (start at position 2, take 3 elements)

SELECT subarray(ARRAY[10,20,30,40,50], 3);
-- Result: {30,40,50}  (start at position 3, take rest)

icount()

The icount() function returns the number of elements in an integer array:

SQL — element count

SELECT icount(ARRAY[10,20,30]);
-- Result: 3

SELECT icount(ARRAY[]::integer[]);
-- Result: 0

Functionally equivalent to array_length(arr, 1), but icount() reads more clearly in complex expressions and handles empty arrays without returning NULL.

GiST indexing for integer arrays

If the operators are the reason to consider intarray, the GiST index support is the reason to commit. Without intarray, array containment and overlap queries can use GIN indexes (native PostgreSQL) or sequential scans. With intarray, GiST becomes an additional option with different trade-offs.

intarray provides two GiST operator classes:

gist__int_ops — exact representation. Stores the actual set of integers in the index entry. Best for arrays where all values are small (under approximately 100,000). Produces no false positives.

SQL — exact GiST index

CREATE INDEX idx_tags_exact ON products USING gist(tag_ids gist__int_ops);

gist__intbig_ops — signature-based (lossy). Uses a fixed-size bitmap signature to represent the array. Works with any integer values. The trade-off is that the bitmap can produce false positives, requiring a recheck against the actual table data. In practice, the false positive rate is low for reasonably-sized arrays.

SQL — signature-based GiST index

CREATE INDEX idx_tags_sig ON products USING gist(tag_ids gist__intbig_ops);

Which to choose: gist__intbig_ops is the safe default — it works with any integer values and produces fixed-size index entries regardless of array content. Use gist__int_ops only when your arrays contain small values and you want to eliminate rechecks.

What the GiST index accelerates

With a GiST index in place, these queries use index scans instead of sequential scans:

Containment (@>) — find all products tagged with tags 5 and 12:

SQL

SELECT * FROM products WHERE tag_ids @> ARRAY[5, 12];

Contained by (<@) — find all products whose tags are a subset of {1, 2, 3, 4, 5}:

SQL

SELECT * FROM products WHERE tag_ids <@ ARRAY[1, 2, 3, 4, 5];

Overlap (&&) — find all products that share at least one tag with {5, 12}:

SQL

SELECT * FROM products WHERE tag_ids && ARRAY[5, 12];

GiST vs GIN for array indexing

Both GiST (via intarray) and GIN indexes support containment (@>, <@) and overlap (&&) queries on integer arrays. The choice between them depends on your workload:

Factor	GIN	GiST (intarray)
Lookup speed	Generally faster	Slightly slower
False positives	None (exact)	Possible with gist__intbig_ops
Insert/update speed	Slower (pending list + merge)	Faster
Supports &, \|, -	No	Yes (in index-aware contexts)
Supports =	Yes	No
Index size	Varies with vocabulary	Fixed with gist__intbig_ops

General guidance: Use GIN for read-heavy workloads where lookup speed is the priority and you do not need intarray's extended operators. Use GiST via intarray when you need the intersection, union, and difference operators, when write performance matters, or when your workload is write-heavy.

For a broader comparison of PostgreSQL index types, see the index types overview.

Real-world patterns

Tagging system

A tag system using integer arrays stores tag IDs directly on the tagged entity:

SQL — schema

CREATE TABLE products (
  id         SERIAL PRIMARY KEY,
  name       TEXT NOT NULL,
  tag_ids    INTEGER[] NOT NULL DEFAULT '{}'
);

CREATE INDEX idx_products_tags ON products USING gist(tag_ids gist__intbig_ops);

SQL — tagging operations

-- Find products with all specified tags (containment):
SELECT * FROM products WHERE tag_ids @> ARRAY[5, 12];

-- Find products with any of the specified tags (overlap):
SELECT * FROM products WHERE tag_ids && ARRAY[5, 12, 20];

-- Find common tags between two products (intersection):
SELECT
  p1.tag_ids & p2.tag_ids AS shared_tags
FROM products p1, products p2
WHERE p1.id = 101 AND p2.id = 202;

-- Add a tag (union -- deduplicates automatically):
UPDATE products SET tag_ids = tag_ids | ARRAY[15] WHERE id = 101;

-- Remove a tag (difference):
UPDATE products SET tag_ids = tag_ids - ARRAY[15] WHERE id = 101;

The GiST index makes the WHERE clause queries fast. The set operators make the mutations clean — no need to unnest, filter, and re-aggregate. Each operation is a single, readable expression.

Permission bitmaps

User permissions as an integer array where each integer represents a permission ID:

SQL — permissions

CREATE TABLE users (
  id          SERIAL PRIMARY KEY,
  name        TEXT NOT NULL,
  permissions INTEGER[] NOT NULL DEFAULT '{}'
);

CREATE INDEX idx_users_perms ON users USING gist(permissions gist__intbig_ops);

-- Check if a user has all required permissions:
SELECT * FROM users WHERE permissions @> ARRAY[1, 4, 7];

-- Grant new permissions:
UPDATE users SET permissions = permissions | ARRAY[9, 10] WHERE id = 42;

-- Revoke a permission:
UPDATE users SET permissions = permissions - ARRAY[4] WHERE id = 42;

-- Find users who share at least one permission with a set:
SELECT * FROM users WHERE permissions && ARRAY[1, 2, 3];

Multi-category filtering

Products assigned to multiple categories using an integer array of category IDs:

SQL — category filtering

CREATE TABLE catalog_items (
  id           SERIAL PRIMARY KEY,
  title        TEXT NOT NULL,
  category_ids INTEGER[] NOT NULL DEFAULT '{}'
);

CREATE INDEX idx_catalog_cats ON catalog_items USING gist(category_ids gist__intbig_ops);

-- OR filter -- show products in categories 5, 12, or 20:
SELECT * FROM catalog_items WHERE category_ids && ARRAY[5, 12, 20];

-- AND filter -- show products in all of categories 5 and 12:
SELECT * FROM catalog_items WHERE category_ids @> ARRAY[5, 12];

-- Category difference -- which categories does item A have that item B does not?
SELECT a.category_ids - b.category_ids AS unique_to_a
FROM catalog_items a, catalog_items b
WHERE a.id = 101 AND b.id = 202;

Performance considerations

I should note a few practical boundaries.

Array size limits. intarray works best with arrays of up to a few hundred elements. As arrays grow beyond that, the cost of set operations and GiST index maintenance increases. intarray is not designed for arrays of thousands of elements.

Pre-sort arrays on write. intarray's operators exploit sorted-array fast paths. Storing arrays already sorted avoids repeated sorting during query execution. Normalize arrays with uniq(sort(arr)) in a trigger or application code:

SQL — normalization trigger

CREATE OR REPLACE FUNCTION normalize_int_array()
RETURNS TRIGGER AS $$
BEGIN
  NEW.tag_ids := uniq(sort(NEW.tag_ids));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_normalize_tags
  BEFORE INSERT OR UPDATE ON products
  FOR EACH ROW EXECUTE FUNCTION normalize_int_array();

VACUUM and GiST indexes. GiST indexes can bloat after heavy update activity. Monitor GiST index size relative to the data size and REINDEX when bloat becomes significant. The pgstatindex function from the pgstattuple extension provides page-level statistics for index analysis.

When to use a junction table instead. I should be honest about the limits of the array approach. Integer arrays with intarray are not a universal replacement for junction tables. Consider a normalized junction table when:

Arrays regularly exceed 500 elements
You need foreign key constraints to enforce referential integrity
You need to JOIN on attributes of the related entity (e.g., tag name, category description)
You need complex aggregations across the relationship

The array approach trades normalization for simplicity and performance on read-heavy set operations. The junction table approach trades performance for flexibility and referential integrity. The right choice depends on your access patterns — and many applications benefit from both, each in its proper place.

For comprehensive guidance on PostgreSQL performance optimization, see the performance tuning guide. For an overview of PostgreSQL extensions that improve performance, see the extensions roundup.