pgvector vs Weaviate vs Qdrant
Vector Database Comparison for PostgreSQL Teams
The Decision That Matters More Than the Benchmarks
Good evening. I see you have arrived with a decision to make — and I should like to help you make it with full information rather than marketing materials.
The vector database market has expanded rapidly alongside the adoption of embedding models and retrieval-augmented generation. If you are building on PostgreSQL, you now face a concrete choice: extend your existing database with pgvector, adopt a purpose-built vector engine like Weaviate or Qdrant, or use a managed SaaS service like Pinecone.
This comparison focuses on the self-hosted options that PostgreSQL teams are most likely evaluating: pgvector (the PostgreSQL extension), Weaviate (an open-source hybrid search engine), and Qdrant (an open-source high-performance vector engine). For the broader decision framework on whether you need a dedicated vector database at all, see Do You Need a Vector Database?
I should be candid about the real decision here. It is not "which is fastest on a benchmark" — it is "what operational complexity am I willing to accept for what marginal performance gain?" This guide is intended to help you answer that question honestly.
Three Approaches to Vector Search
The vector search landscape has settled into three models:
- Extend your existing database. pgvector adds vector types and ANN indexes to PostgreSQL. Your vectors live alongside your application data, queried with SQL, backed up with your existing tools, and governed by your existing access controls.
- Adopt a purpose-built open-source engine. Weaviate, Qdrant, Milvus, and others are designed from the ground up for vector search. They offer specialized indexing, distributed scaling, and features like built-in embedding pipelines or advanced filtered search.
- Use a managed SaaS. Pinecone, Zilliz Cloud, and Weaviate Cloud offload operational burden entirely. You get an API endpoint and pay per query or per stored vector.
Each model makes a different tradeoff between operational simplicity, performance ceiling, and feature breadth. The right choice depends on your dataset size, your team's expertise, your performance requirements, and — if I may be forthright — your tolerance for operational complexity.
What Each Tool Is
pgvector — Vector Search as a PostgreSQL Extension
pgvector is not a separate database — and this is the single most important thing to understand about it. It is a PostgreSQL extension that adds a vector data type, distance operators, and approximate nearest-neighbor (ANN) index types to your existing PostgreSQL instance.
```sql
-- pgvector in action
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    title text,
    category text,
    content text,
    embedding vector(1536)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Similarity search with a SQL WHERE filter
SELECT title, embedding <=> query_embedding AS distance
FROM documents
WHERE category = 'engineering'
ORDER BY embedding <=> query_embedding
LIMIT 10;
```

Strengths: Zero additional infrastructure. Full SQL. Transactional consistency between vectors and application data. No data synchronization pipeline. Your existing PostgreSQL monitoring, backup, and operational tooling covers vector workloads.
Limitations: Single-node scaling (bounded by your PostgreSQL instance). Vector workloads share CPU, memory, and I/O with your OLTP workload. Post-filtering with SQL WHERE clauses can degrade recall when filters are selective.
For detailed pgvector tuning guidance, see the pgvector performance tuning guide and the pgvector query optimization guide.
Weaviate — Hybrid Search Engine with Native Vectorization
Weaviate is a purpose-built vector database written in Go. It is designed as a standalone service, accessed via GraphQL and REST APIs.
```graphql
# Weaviate GraphQL query — hybrid search
{
  Get {
    Document(
      hybrid: {
        query: "database performance tuning"
        alpha: 0.75  # 75% vector, 25% keyword
      }
      where: {
        path: ["category"]
        operator: Equal
        valueText: "engineering"
      }
      limit: 10
    ) {
      title
      content
      _additional {
        score
        distance
      }
    }
  }
}
```

Strengths: Hybrid search (vector + BM25) out of the box, which would require combining pgvector with tsvector in PostgreSQL. Built-in embedding pipelines eliminate the need for a separate embedding service. Multi-tenancy is a first-class feature.
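To make the `alpha` parameter concrete: it weights the blend of the vector score and the keyword score for each document. The sketch below is a simplified fusion under the assumption that both score sets are already normalized to [0, 1] and keyed by document id (Weaviate itself offers more than one fusion algorithm; `hybrid_score` is an illustrative helper, not Weaviate's API).

```python
def hybrid_score(vector_scores, keyword_scores, alpha=0.75):
    """Blend normalized vector and keyword (BM25) scores per document.

    alpha=1.0 -> pure vector, alpha=0.0 -> pure keyword, mirroring the
    role of Weaviate's hybrid `alpha` parameter.
    """
    ids = set(vector_scores) | set(keyword_scores)
    fused = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1 - alpha) * keyword_scores.get(doc_id, 0.0)
        for doc_id in ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

vec = {"a": 0.9, "b": 0.4, "c": 0.1}   # vector similarity
kw = {"b": 1.0, "c": 0.8}              # BM25 keyword relevance
ranked = hybrid_score(vec, kw, alpha=0.75)
print(ranked[0][0])  # "a" — strong vector match wins at alpha=0.75
```

Lowering `alpha` toward 0 would promote "b", the strongest keyword match, to the top instead.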
Limitations: Another stateful service to deploy, monitor, and back up. GraphQL adds a learning curve for teams accustomed to SQL. Resource-intensive at small scale. Requires a data synchronization pipeline if PostgreSQL remains your source of truth.
Qdrant — High-Performance Vector Engine with Advanced Filtering
Qdrant is a purpose-built vector database written in Rust. It is designed for high-throughput, low-latency vector search with a focus on filtering performance.
```python
# Qdrant Python client — filtered search
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue

client = QdrantClient("localhost", port=6333)

results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="engineering"),
            )
        ]
    ),
    limit=10,
)
```

Strengths: Filtering performance is Qdrant's distinguishing feature — payload-aware filtering integrates filters into the HNSW graph traversal, maintaining recall even with highly selective filters. Quantization reduces memory footprint by 4–32x. Rust implementation delivers predictable latency under load.
Limitations: Another stateful service to operate. No native hybrid text search. Younger ecosystem compared to Weaviate or Elasticsearch. Requires a data sync pipeline when used alongside PostgreSQL.
Feature Comparison
| Feature | pgvector | Weaviate | Qdrant |
|---|---|---|---|
| Index types | HNSW, IVFFlat | HNSW | HNSW |
| Distance metrics | L2, cosine, inner product | L2, cosine, dot product, hamming, manhattan | cosine, euclid (L2), dot product, manhattan |
| Max dimensions | 16,000 for the vector type (2,000 indexable with HNSW/IVFFlat) | No hard limit | 65,536 |
| Filtering approach | Post-filter (SQL WHERE after ANN) | Pre-filter (inverted index, then vector search) | Integrated (payload-aware graph traversal) |
| Hybrid search (vector + keyword) | Manual (pgvector + tsvector) | Native (vector + BM25 in one query) | No native keyword search |
| Quantization | Half-precision (halfvec), binary via expression indexes | Product quantization, binary quantization | Scalar, product, binary quantization |
| Multi-tenancy | Schema-based or row-level security | Native (per-tenant isolation) | Collection-based |
| Replication | PostgreSQL streaming replication | Built-in | Built-in (Raft consensus) |
| Sharding | Table partitioning | Built-in (automatic) | Built-in (automatic) |
| ACID transactions | Yes (full PostgreSQL ACID) | No | No |
| Backup / PITR | pg_dump, WAL archiving, PITR | Snapshot-based | Snapshot-based |
| Built-in embedding | No (external embedding required) | Yes (OpenAI, Cohere, HuggingFace modules) | No (external embedding required) |
| API | SQL | GraphQL, REST | REST, gRPC |
| License | PostgreSQL License (open source) | BSD-3-Clause | Apache 2.0 |
Performance — What the Benchmarks Actually Tell You
The Benchmark Landscape
ANN-Benchmarks is a widely referenced, vendor-neutral benchmark suite for approximate nearest-neighbor search. It measures recall against queries-per-second across various datasets and algorithms.
I should note, however, that ANN-Benchmarks measures pure ANN search: a single query, no filters, no concurrent writes, no hybrid queries. Real production workloads differ substantially. A benchmark showing that Engine A handles 10,000 QPS and Engine B handles 8,000 QPS tells you very little about how they will perform under your actual workload with filters, concurrent users, and mixed read-write traffic.
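For clarity on the metric itself: "recall" here means recall@k, the fraction of the true k nearest neighbors (from an exhaustive brute-force search) that the approximate index actually returned. A minimal sketch:

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k neighbors the ANN search returned.

    `exact_ids` come from an exhaustive (brute-force) search and serve
    as ground truth — this is the recall figure benchmark suites report.
    """
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

# The ANN index found 9 of the true top-10 -> recall 0.9
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 99]
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(recall_at_k(approx, exact))  # 0.9
```

Note that recall@k says nothing about latency, filters, or write load, which is exactly why it is an incomplete proxy for production behavior.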
Recall vs Latency at Different Scales
100K vectors (1536 dimensions): All three systems perform well at this scale. pgvector delivers sub-5ms latency at 99% recall. Weaviate and Qdrant are similarly fast. At 100K vectors, performance is not a differentiator — any of these tools will be fast enough.
1M vectors (1536 dimensions): Qdrant and Weaviate typically maintain sub-5ms latency at high recall. pgvector ranges from 5–15ms depending on ef_search settings and dimension count. The gap is measurable but rarely user-facing — the difference between 5ms and 15ms is absorbed by network latency in most architectures.
10M+ vectors (1536 dimensions): At this scale, dedicated engines pull ahead. Sharding distributes the index across nodes, and quantization reduces memory requirements. pgvector on a single PostgreSQL instance faces memory pressure (the HNSW index for 10M vectors at 1536 dimensions exceeds 60 GB) and build times measured in hours.
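The 60 GB figure above is a back-of-envelope calculation worth being able to reproduce: float32 vectors cost 4 bytes per dimension, and the HNSW graph adds link storage on top. The sketch below assumes m=16 (pgvector's default) and treats the link overhead as approximate.

```python
# Back-of-envelope memory estimate for an HNSW index over float32 vectors.
n, dims = 10_000_000, 1536
bytes_per_float = 4

vector_bytes = n * dims * bytes_per_float  # raw vectors: 61.44 GB

# HNSW also stores graph links — roughly m * 2 neighbors of 4 bytes each
# per node at the base layer (assumption: m=16, the pgvector default;
# upper layers add a small extra fraction ignored here).
m = 16
link_bytes = n * m * 2 * 4

total_gb = (vector_bytes + link_bytes) / 1e9
print(f"vectors: {vector_bytes / 1e9:.1f} GB, total: ~{total_gb:.1f} GB")
```

The raw vectors alone exceed 61 GB, before PostgreSQL's shared buffers, the heap table, and any other indexes compete for the same memory.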
The honest assessment: Below 1 million vectors, performance differences between these tools are unlikely to be your bottleneck. Network latency, embedding generation time, and application-layer processing dominate end-to-end latency. A butler who overstates the importance of a 10ms difference when the network adds 50ms would be a poor advisor. Above 5–10 million vectors, the architectural advantages of purpose-built engines begin to matter.
Filtering Performance — Where the Real Differences Emerge
If you will permit me one strong opinion: filtering is where the architectural differences between these systems have the most practical impact.
pgvector: Applies standard SQL WHERE clauses after the ANN search (post-filtering). The HNSW index returns its top candidates without awareness of the filter, and PostgreSQL then discards candidates that do not match the WHERE clause. When the filter is selective, many ANN results are discarded, and effective recall drops. Partial indexes and table partitioning can mitigate this, but require explicit setup per filter column.
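The recall loss from post-filtering is easy to demonstrate with a toy corpus. The simulation below (pure Python, brute-force distance standing in for the ANN index) compares taking the top 10 candidates first and filtering afterward against searching only the matching subset; the category names and 10% selectivity are illustrative.

```python
import random

random.seed(0)

# Toy corpus: each item has a 2-D "embedding" and a category payload.
items = [
    {"id": i,
     "vec": (random.random(), random.random()),
     "category": "engineering" if i % 10 == 0 else "other"}  # ~10% selective
    for i in range(1000)
]
query = (0.5, 0.5)

def dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

# Post-filtering (pgvector-style): take the index's top-k FIRST,
# then apply the WHERE clause to those candidates.
top_k = sorted(items, key=lambda it: dist(it["vec"], query))[:10]
post_filtered = [it for it in top_k if it["category"] == "engineering"]

# Pre-/integrated filtering: only matching items are ever candidates.
filtered_search = sorted(
    (it for it in items if it["category"] == "engineering"),
    key=lambda it: dist(it["vec"], query),
)[:10]

print(len(post_filtered), "vs", len(filtered_search), "results returned")
```

With a 10%-selective filter, most of the top-10 ANN candidates fail the filter and are discarded, so the post-filtered query returns far fewer than the 10 rows requested. This is the mechanism behind the recall degradation described above (pgvector can raise ef_search to compensate, at a latency cost).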
Qdrant: Integrates payload filtering into the HNSW graph traversal itself. During the search, the algorithm only considers nodes that match the filter conditions. This maintains recall even under highly selective filters. This is Qdrant's most significant architectural advantage over pgvector.
Weaviate: Uses pre-filtering with inverted indexes. Before the vector search begins, Weaviate identifies the set of documents matching the filter criteria, then runs the vector search only over that filtered set. This maintains recall but can be slower than Qdrant's integrated approach for certain filter patterns.
For workloads where most queries include tight metadata filters — multi-tenant SaaS applications, e-commerce with category filters, content search with access controls — filtering performance is often more important than raw ANN throughput.
Operational Complexity — The Cost That Deserves Your Attention
Deployment
pgvector: CREATE EXTENSION vector; on your existing PostgreSQL instance. No new servers, no new containers, no new networking. If you can run a SQL command, you can deploy pgvector.
Weaviate: Deploy a Docker container or Kubernetes Helm chart. Configure vectorization modules, persistence volume, replication settings, and resource limits.
Qdrant: Deploy a Docker container or Kubernetes pod. Configure collection settings, storage paths, and replication. Simpler to configure than Weaviate, but still a separate stateful service with its own lifecycle.
Data Consistency
This is the cost that benchmark comparisons quietly omit, and I believe it deserves particular attention.
pgvector: Your vectors live in the same database as your application data. Insert a product and its embedding in the same transaction. Update a document and re-embed it atomically. Delete a record and its vector disappears. ACID guarantees apply to vectors exactly as they apply to any other column.
```sql
-- Atomic: product and embedding are always consistent
BEGIN;
INSERT INTO products (name, description, embedding)
VALUES ('Widget', 'A useful widget', embedding_vector);
COMMIT;
```

Weaviate and Qdrant: Your source of truth is PostgreSQL. Your vectors live in a separate system. You need a synchronization pipeline: application writes to PostgreSQL, a worker detects changes, generates embeddings, writes to the vector database. This pipeline must handle initial data loading, incremental updates, deletes, failures, retries, ordering guarantees, and schema changes in either system. It must be monitored, alerted on, and maintained.
The sync pipeline is not a theoretical concern. It is the primary source of bugs and operational incidents in dual-database architectures. Stale vectors, missing vectors, orphaned vectors, and consistency windows during updates are real problems that require engineering investment to solve.
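To make the shape of such a pipeline concrete, here is an in-memory sketch of the outbox pattern a sync worker typically implements. The `outbox` list stands in for a change table or logical-replication feed, `vector_store` stands in for Weaviate or Qdrant, and `embed` is a placeholder; every name here is illustrative, not a real API.

```python
# Change events captured from PostgreSQL writes, in commit order.
outbox = [
    {"op": "upsert", "id": 1, "text": "widget docs"},
    {"op": "upsert", "id": 2, "text": "gadget docs"},
    {"op": "delete", "id": 1},
]
vector_store = {}  # stand-in for the external vector database

def embed(text):
    # Placeholder for a real embedding-model call.
    return [float(len(text))]

def process(events, store):
    """Apply change events in order. Deletes must be handled explicitly,
    or the vector DB silently accumulates orphaned vectors."""
    for ev in events:
        if ev["op"] == "upsert":
            store[ev["id"]] = embed(ev["text"])
        elif ev["op"] == "delete":
            store.pop(ev["id"], None)

process(outbox, vector_store)
print(vector_store)  # only id 2 remains: {2: [11.0]}
```

Even this toy version hints at the failure modes listed above: drop the delete event and an orphaned vector persists; replay events out of order and a stale embedding overwrites a fresh one. A production pipeline needs idempotent upserts, ordered delivery, and dead-letter handling around exactly this loop.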
Backup and Disaster Recovery
pgvector: Your existing PostgreSQL backup strategy — pg_dump, WAL archiving, point-in-time recovery — covers vectors. You restore one system and everything is consistent.
Weaviate/Qdrant: Separate backup procedures, separate backup schedules, separate restore testing. After a disaster, you restore PostgreSQL and then restore the vector database, hoping the two are consistent at the same point in time. If they are not, you re-sync.
Monitoring and Upgrades
pgvector: pg_stat_statements captures vector query performance. Your existing PostgreSQL monitoring reports on vector query latency, index usage, and buffer hit rates. Upgrades: ALTER EXTENSION vector UPDATE;
Weaviate/Qdrant: Separate Prometheus metrics endpoints, separate Grafana dashboards, separate alerting rules. Your on-call rotation now covers two stateful systems. Separate version management, separate upgrade procedures, potential breaking API changes.
Cost Modeling
The following estimates assume 1536-dimensional vectors (OpenAI embedding size) running on cloud infrastructure.
| Scale | pgvector (incremental) | Weaviate (dedicated) | Qdrant (dedicated) |
|---|---|---|---|
| 100K vectors | ~$0/mo additional (fits on existing instance) | $50–150/mo (minimal instance) | $30–100/mo (minimal instance) |
| 1M vectors | $50–200/mo additional RAM/storage | $200–500/mo (memory for vectors) | $100–300/mo (with quantization) |
| 5M vectors | $200–800/mo (larger instance) | $500–1,500/mo | $300–800/mo |
| 10M vectors | $500–2,000/mo (large instance or partition) | $1,500–4,000/mo | $500–1,500/mo |
Key factors: pgvector cost is incremental — you are adding RAM and storage to an instance you already pay for. Weaviate keeps vectors in memory by default, making it the most memory-intensive option. Qdrant's quantization reduces memory requirements by 4–32x, making it more cost-effective at scale.
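The 4–32x quantization range follows directly from per-dimension storage cost, which a few lines of arithmetic make explicit:

```python
# Where the 4-32x quantization figure comes from: bytes per vector.
dims = 1536
float32_bytes = dims * 4        # full precision: 6,144 bytes per vector
scalar_int8_bytes = dims * 1    # scalar quantization: 1 byte per dim
binary_bytes = dims // 8        # binary quantization: 1 bit per dim

print(float32_bytes / scalar_int8_bytes)  # 4.0  (scalar)
print(float32_bytes / binary_bytes)       # 32.0 (binary)
```

The tradeoff, of course, is recall: aggressive quantization typically requires rescoring against full-precision vectors (kept on disk) to recover accuracy.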
Engineering Cost
The infrastructure cost comparison omits the larger expense: engineering time.
pgvector: Near-zero integration cost. Your team writes SQL, uses existing PostgreSQL clients, and deploys through existing pipelines.
Weaviate or Qdrant: Budget 2–4 weeks of engineering time for initial integration — API client integration, data sync pipeline, monitoring setup, backup configuration, operational runbooks. Ongoing maintenance adds continuous operational overhead.
The honest calculation: the engineering cost of integrating and operationalizing a dedicated vector database exceeds the infrastructure cost at any scale below roughly 5M vectors.
When Each Tool Is the Right Choice
Choose pgvector When
- Your dataset is under 1–5M vectors. pgvector handles this scale with single-digit millisecond latency and 99%+ recall when properly tuned.
- You need transactional consistency. Vectors and application data in the same transaction, no sync pipeline, no eventual consistency.
- Your team's expertise is in PostgreSQL, not distributed systems. pgvector requires SQL knowledge you already have.
- You want vector search without expanding your ops surface. No new services to deploy, monitor, back up, or troubleshoot at 3 AM.
- You are prototyping or in early product stages. Start simple, measure real performance, and migrate only if you hit a specific limitation.
Choose Weaviate When
- Native hybrid search is a core requirement. Combining vector similarity with BM25 keyword ranking in a single query is Weaviate's strongest differentiator.
- You want built-in embedding pipelines. Weaviate's vectorization modules let you send raw text and have Weaviate handle embedding generation.
- Your team has Kubernetes expertise. Weaviate runs well on Kubernetes. If your team already operates stateful services there, the marginal operational cost is lower.
- Multi-tenancy is a hard requirement. Weaviate's native tenant isolation is more mature than PostgreSQL's approaches for vector workloads.
Choose Qdrant When
- Filtering performance is critical. If most queries include tight metadata filters and you need high recall under those filters, Qdrant's payload-aware filtering is a genuine architectural advantage.
- Memory efficiency at scale matters. Qdrant's quantization can reduce memory requirements by 4–32x.
- Your dataset exceeds 5–10M vectors. Qdrant's horizontal sharding distributes the index across nodes.
- Latency sensitivity is extreme. For consistent sub-5ms p99 latency at millions of vectors under concurrent load, Qdrant's Rust implementation delivers more predictable performance.
Migration Paths — Changing Your Mind Is Perfectly Reasonable
I should like to offer some reassurance: none of these choices is permanent. Migration between them is straightforward, if not trivial.
pgvector to Weaviate or Qdrant
Your embeddings are already in PostgreSQL. Export them:
```sql
-- Export embeddings as CSV (embedding cast to its text form)
COPY (
    SELECT id, title, category, embedding::text
    FROM documents
) TO '/tmp/documents.csv' WITH (FORMAT csv, HEADER);
```

Then bulk import via the Weaviate or Qdrant API. The harder part is building the sync pipeline to keep the new system current with ongoing PostgreSQL changes.
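On the import side, the exported `embedding::text` column must be parsed back into a float array before it can be sent to the target engine's API. A minimal stdlib sketch, assuming pgvector's default text representation (`[0.1,0.2,0.3]`) and an in-memory stand-in for the CSV file:

```python
import csv
import io

def parse_pgvector(text):
    """Parse pgvector's text representation ('[0.1,0.2,0.3]') into floats."""
    return [float(x) for x in text.strip("[]").split(",")]

# Stand-in for the exported file; a real import would open the CSV on disk.
exported = io.StringIO(
    "id,title,category,embedding\n"
    '1,Intro,engineering,"[0.1,0.2,0.3]"\n'
)
rows = [
    {**row, "embedding": parse_pgvector(row["embedding"])}
    for row in csv.DictReader(exported)
]
print(rows[0]["embedding"])  # [0.1, 0.2, 0.3]
```

Each parsed row then maps naturally onto a bulk-upsert payload: the id becomes the point/object id, the embedding becomes the vector, and the remaining columns become the payload or properties.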
Weaviate or Qdrant to pgvector
Export via the engine's API (Weaviate scroll API, Qdrant scroll endpoint), then insert into PostgreSQL:
```sql
-- Import embeddings into pgvector
INSERT INTO documents (id, title, category, embedding)
VALUES (1, 'Title', 'category', '[0.1, 0.2, ...]'::vector);
```

The Starting-Point Advantage
Starting with pgvector: Your embeddings are already in PostgreSQL. If you later need a dedicated engine, migration is a data export — you copy vectors out. Your PostgreSQL data remains intact and unchanged.
Starting with a dedicated engine: If you later determine that pgvector is sufficient (a common outcome after the initial workload stabilizes), you migrate vectors into PostgreSQL and decommission the separate service, its sync pipeline, and its operational overhead.
Starting with pgvector carries less risk. You can always add complexity later. Removing complexity after adoption is harder — sync pipelines, operational runbooks, and monitoring integrations tend to persist long after the need that motivated them has passed. In infrastructure, simplicity is not the absence of capability. It is the discipline to use existing capability before adding new complexity.
How Gold Lapel Relates
Gold Lapel attends to PostgreSQL query performance, including pgvector workloads. It detects pgvector queries that would benefit from HNSW indexes or ef_search tuning, and it monitors pgvector query latency alongside your OLTP workload to identify when vector queries are competing for shared resources. The comparison above stands on its own merits regardless of any specific tooling.