pgvector
Vector similarity search for PostgreSQL — because the manor already has a perfectly good room for your embeddings.
I confess a particular admiration for pgvector. The industry spent several years insisting that embeddings required an entirely new category of database — and then Andrew Kane wrote an extension that added vector similarity search to PostgreSQL with a single CREATE EXTENSION. PostgreSQL is not a database. It is an ecosystem that most teams use as a database. pgvector is among the finest evidence of this.
pgvector is an open-source PostgreSQL extension that adds vector data types and similarity search operators. It lets you store embedding vectors directly in your database and find nearest neighbors using L2 distance, cosine distance, or inner product — with optional approximate indexing via IVFFlat or HNSW for fast search at scale.
What pgvector does
pgvector introduces the vector data type to PostgreSQL, allowing you to store fixed-length arrays of floating-point numbers as column values. These vectors typically represent embeddings generated by machine learning models — dense numerical representations of text, images, or other data that capture semantic meaning.
Once vectors are stored, pgvector provides distance operators that measure how similar two vectors are. You can use standard SQL ORDER BY with these operators to find the nearest neighbors to a query vector. Without an index, this is an exact (brute-force) scan. With an IVFFlat or HNSW index, pgvector performs approximate nearest neighbor (ANN) search that trades a small amount of recall for significantly faster queries.
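One quick way to see which of the two plans you are getting is EXPLAIN. A sketch, assuming a hypothetical documents table with an embedding column:

```sql
-- Illustrative only: table and column names are assumptions.
EXPLAIN
SELECT id FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;
-- With an HNSW or IVFFlat index on the column, the plan shows an
-- Index Scan (approximate search); without one, a Seq Scan (exact,
-- brute-force search over every row).
```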
The key advantage is operational simplicity. Your vectors live in the same database as your relational data. You can join vector search results with regular tables, filter by SQL predicates, wrap searches in transactions, and use your existing backup and replication infrastructure. No separate vector database to deploy, synchronize, or maintain. One household, properly staffed, rather than two households in perpetual need of coordination.
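As an illustration of that point, a vector search can sit inside an ordinary join and an ordinary transaction. A sketch, using a hypothetical schema of documents(id, author_id, embedding) and authors(id, name):

```sql
-- Hypothetical schema; the point is that nearest-neighbor search
-- composes with joins and transactions like any other SQL.
BEGIN;

SELECT a.name, d.id
FROM documents d
JOIN authors a ON a.id = d.author_id
ORDER BY d.embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;

COMMIT;
```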
When to use pgvector
pgvector is a good fit when your vector workload lives alongside relational data and you want to avoid the operational complexity of a separate system. Specific use cases:
- Semantic search — embed documents or product descriptions, then find results by meaning rather than keyword matching
- Retrieval-Augmented Generation (RAG) — store chunked documents as embeddings, retrieve relevant context at query time, and feed it to an LLM
- Recommendation engines — represent users and items as vectors, then find similar items or users via nearest neighbor search
- Image similarity — store image embeddings from models like CLIP and search by visual similarity
- Deduplication and clustering — find near-duplicate records by computing distances between embedding vectors
- Hybrid search — combine vector similarity with traditional SQL filters (e.g., nearest neighbors where category = 'electronics' AND price < 100)
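The hybrid case might look like the following sketch, assuming a hypothetical products table with category, price, and embedding columns:

```sql
-- Nearest neighbors restricted by ordinary SQL predicates.
-- Note: with an ANN index, a post-filter like this can return fewer
-- rows than the LIMIT; pgvector 0.8+ offers iterative index scans
-- to compensate.
SELECT id, name
FROM products
WHERE category = 'electronics' AND price < 100
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;
```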
Installation and setup
pgvector does not require shared_preload_libraries and does not need a server restart. Installation is a single CREATE EXTENSION statement. The extension is third-party (not a core contrib module), so it must be installed on the server first — on most cloud providers it is pre-installed and ready to enable.
-- Install the extension (no shared_preload_libraries needed)
CREATE EXTENSION vector;
-- Verify it's working
SELECT * FROM pg_extension WHERE extname = 'vector';

Two statements. No restart. I mention this because the dedicated vector databases that pgvector replaces tend to involve rather more ceremony on installation day.
On self-managed PostgreSQL, install the package first. For Debian/Ubuntu, use the PostgreSQL APT repository (apt install postgresql-17-pgvector). For Red Hat/CentOS, use the PostgreSQL Yum repository.
Vector types
pgvector provides several data types for different precision and storage needs:
- vector — single-precision (4 bytes per dimension), up to 16,000 dimensions. The default choice for most embedding models.
- halfvec — half-precision (2 bytes per dimension), up to 16,000 dimensions. Cuts storage and memory in half with minimal quality loss for most embeddings.
- sparsevec — sparse vectors that only store non-zero elements. Efficient for high-dimensional vectors that are mostly zeros.
- bit — binary vectors for Hamming and Jaccard distance calculations.
-- Create a table with a vector column (1536 dimensions for OpenAI embeddings)
CREATE TABLE documents (
id bigserial PRIMARY KEY,
content text,
embedding vector(1536)
);
-- Insert a vector
INSERT INTO documents (content, embedding)
VALUES ('PostgreSQL is a relational database', '[0.1, 0.2, 0.3, ...]');

Distance operators
pgvector uses custom operators for distance calculations. Because PostgreSQL index scans only support ascending order, some operators return negative or inverted values so that ORDER BY ... ASC returns the most similar results first.
| Operator | Distance metric | Use case |
|---|---|---|
| <-> | L2 (Euclidean) distance | General-purpose; works on raw vectors |
| <=> | Cosine distance | Preferred for normalized embeddings (most common) |
| <#> | Negative inner product | When you need dot product similarity |
| <+> | L1 (Manhattan) distance | Alternative distance metric |
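The sign conventions are easiest to see by applying the operators to vector literals directly. A small sketch:

```sql
-- Distances between two small literal vectors.
SELECT '[1,2,3]'::vector <-> '[4,5,6]'::vector;  -- L2 distance: sqrt(27), about 5.196
SELECT '[1,2,3]'::vector <=> '[4,5,6]'::vector;  -- cosine distance, about 0.0254
SELECT '[1,2,3]'::vector <#> '[4,5,6]'::vector;  -- negative inner product: -32
```

Note that the inner product of [1,2,3] and [4,5,6] is 32, but <#> returns -32 — that is the negation mentioned above, so that ORDER BY ... ASC puts the highest-similarity rows first.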
-- L2 (Euclidean) distance — find nearest neighbors
SELECT id, content, embedding <-> '[0.1, 0.2, 0.3, ...]' AS distance
FROM documents
ORDER BY embedding <-> '[0.1, 0.2, 0.3, ...]'
LIMIT 5;
-- Cosine distance — preferred for normalized embeddings
SELECT id, content, 1 - (embedding <=> '[0.1, 0.2, 0.3, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'
LIMIT 5;
-- Negative inner product — useful when vectors are not normalized
SELECT id, content, (embedding <#> '[0.1, 0.2, 0.3, ...]') * -1 AS inner_product
FROM documents
ORDER BY embedding <#> '[0.1, 0.2, 0.3, ...]'
LIMIT 5;

Standard SQL. Standard ORDER BY. Standard LIMIT. The only thing new is the operator in the middle — the rest is PostgreSQL as you already know it.
Index types
Without an index, pgvector performs exact nearest neighbor search — it computes distances against every row. This guarantees perfect recall but becomes slow as tables grow. pgvector provides two approximate nearest neighbor index types that trade a small amount of recall for dramatically faster queries.
HNSW (Hierarchical Navigable Small World)
HNSW builds a multi-layered graph structure. It provides the best speed-recall tradeoff for most workloads and is the recommended default. Key properties:
- Can be created on an empty table (no training step required)
- Better recall than IVFFlat at the same query speed
- Slower to build and uses more memory than IVFFlat
- Tunable via m (connections per node) and ef_construction (build-time beam width)
- Query-time recall controlled by hnsw.ef_search
-- Create an HNSW index (can be created on an empty table)
-- m: max connections per layer (default 16)
-- ef_construction: size of the dynamic candidate list (default 64)
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- Set the search beam width at query time (higher = better recall, slower)
SET hnsw.ef_search = 40;

IVFFlat (Inverted File with Flat Compression)
IVFFlat divides vectors into clusters (lists) and searches only the nearest clusters at query time. Key properties:
- Requires data in the table before building (uses k-means clustering as a training step)
- Faster to build and uses less memory than HNSW
- Lower recall than HNSW at the same query speed
- Tunable via lists (number of clusters)
- Query-time recall controlled by ivfflat.probes
-- Create an IVFFlat index (requires data in the table first)
-- lists: number of clusters — a common starting point is rows / 1000
-- for up to 1M rows, or sqrt(rows) for larger tables
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- Set the number of probes at query time (higher = better recall, slower)
SET ivfflat.probes = 10;

Half-precision vectors
For large tables where storage and memory are a concern, halfvec stores each dimension in 2 bytes instead of 4. This halves the index size and improves cache efficiency, with negligible impact on recall for most embedding models. Half the storage for very nearly the same quality — the sort of economy that appeals to any well-run household.
-- Half-precision vectors use 2 bytes per dimension instead of 4
-- Useful for reducing storage and memory when full precision isn't needed
CREATE TABLE documents_half (
id bigserial PRIMARY KEY,
embedding halfvec(1536)
);
-- Index half-precision vectors
CREATE INDEX ON documents_half
USING hnsw (embedding halfvec_cosine_ops);

Cloud availability
| Provider | Status |
|---|---|
| Amazon RDS / Aurora | Available — supported on PostgreSQL 15.2+ instances |
| Google Cloud SQL | Available — enable via database flags |
| Azure Database for PostgreSQL | Available — add to allowlist, then CREATE EXTENSION |
| Supabase | Available — pre-installed, enable with CREATE EXTENSION |
| Neon | Available — pre-installed on PostgreSQL 15+ |
| Crunchy Bridge | Available — pre-installed on all clusters |
How Gold Lapel relates
Allow me a candid observation about vector workloads: they are expensive. A nearest-neighbor search touches the index, computes distances across candidate vectors, and returns results — all of which consume CPU and memory that your PostgreSQL instance would rather spend on other guests. In RAG pipelines and semantic search, these queries arrive on every user request, often with the same or remarkably similar query vectors.
Gold Lapel sits between your application and PostgreSQL as a proxy. It sees every query before it reaches the database — including your vector similarity searches. When the same embedding is searched repeatedly, or when query patterns emerge across your traffic, Gold Lapel identifies them and applies proxy-level optimizations. The database is spared redundant computation. Response times improve. Your application code and pgvector configuration remain untouched.
pgvector handles vector storage and search inside PostgreSQL. Gold Lapel attends to the query traffic in front of it. Different levels of the stack, complementary by nature. I would not suggest one as a substitute for the other — but together, they keep the household running efficiently under load.