pgvector vs Weaviate vs Qdrant
Vector Database Comparison for PostgreSQL Teams
The Decision That Matters More Than the Benchmarks
Good evening. I see you have arrived with a decision to make — and I should like to help you make it with full information rather than marketing materials.
The vector database market has expanded rapidly alongside the adoption of embedding models and retrieval-augmented generation. If you are building on PostgreSQL, you now face a concrete choice: extend your existing database with pgvector, adopt a purpose-built vector engine like Weaviate or Qdrant, or use a managed SaaS service like Pinecone.
This comparison focuses on the self-hosted options that PostgreSQL teams are most likely evaluating: pgvector (the PostgreSQL extension), Weaviate (an open-source hybrid search engine), and Qdrant (an open-source high-performance vector engine). For the broader decision framework on whether you need a dedicated vector database at all, see Do You Need a Vector Database?
I should be candid about the real decision here. It is not "which is fastest on a benchmark" — it is "what operational complexity am I willing to accept for what marginal performance gain?" This guide is intended to help you answer that question honestly.
Three Approaches to Vector Search
The vector search landscape has settled into three models:
- Extend your existing database. pgvector adds vector types and ANN indexes to PostgreSQL. Your vectors live alongside your application data, queried with SQL, backed up with your existing tools, and governed by your existing access controls.
- Adopt a purpose-built open-source engine. Weaviate, Qdrant, Milvus, and others are designed from the ground up for vector search. They offer specialized indexing, distributed scaling, and features like built-in embedding pipelines or advanced filtered search.
- Use a managed SaaS. Pinecone, Zilliz Cloud, and Weaviate Cloud offload operational burden entirely. You get an API endpoint and pay per query or per stored vector.
Each model makes a different tradeoff between operational simplicity, performance ceiling, and feature breadth. The right choice depends on your dataset size, your team's expertise, your performance requirements, and — if I may be forthright — your tolerance for operational complexity.
What Each Tool Is
pgvector — Vector Search as a PostgreSQL Extension
pgvector is not a separate database — and this is the single most important thing to understand about it. It is a PostgreSQL extension that adds a vector data type, distance operators, and approximate nearest-neighbor (ANN) index types to your existing PostgreSQL instance.
```sql
-- pgvector in action
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    title text,
    category text,
    content text,
    embedding vector(1536)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Similarity search with a SQL WHERE filter
SELECT title, embedding <=> query_embedding AS distance
FROM documents
WHERE category = 'engineering'
ORDER BY embedding <=> query_embedding
LIMIT 10;
```

Strengths: Zero additional infrastructure. Full SQL. Transactional consistency between vectors and application data. No data synchronization pipeline. Your existing PostgreSQL monitoring, backup, and operational tooling covers vector workloads.
Limitations: Single-node scaling (bounded by your PostgreSQL instance). Vector workloads share CPU, memory, and I/O with your OLTP workload. Post-filtering with SQL WHERE clauses can degrade recall when filters are selective.
For detailed pgvector tuning guidance, see the pgvector performance tuning guide and the pgvector query optimization guide.
Weaviate — Hybrid Search Engine with Native Vectorization
Weaviate is a purpose-built vector database written in Go. It is designed as a standalone service, accessed via GraphQL and REST APIs.
```graphql
# Weaviate GraphQL query — hybrid search
{
  Get {
    Document(
      hybrid: {
        query: "database performance tuning"
        alpha: 0.75  # 75% vector, 25% keyword
      }
      where: {
        path: ["category"]
        operator: Equal
        valueText: "engineering"
      }
      limit: 10
    ) {
      title
      content
      _additional {
        score
        distance
      }
    }
  }
}
```

Strengths: Hybrid search (vector + BM25) out of the box, which would require combining pgvector with tsvector in PostgreSQL. Built-in embedding pipelines eliminate the need for a separate embedding service. Multi-tenancy is a first-class feature.
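To make the `alpha` parameter concrete: it weights the blend of the vector score and the keyword score for each document. The sketch below is a simplified fusion under the assumption that both score sets are already normalized to [0, 1] and keyed by document id (Weaviate itself offers more than one fusion algorithm; `hybrid_score` is an illustrative helper, not Weaviate's API).

```python
def hybrid_score(vector_scores, keyword_scores, alpha=0.75):
    """Blend normalized vector and keyword (BM25) scores per document.

    alpha=1.0 -> pure vector, alpha=0.0 -> pure keyword, mirroring the
    role of Weaviate's hybrid `alpha` parameter.
    """
    ids = set(vector_scores) | set(keyword_scores)
    fused = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1 - alpha) * keyword_scores.get(doc_id, 0.0)
        for doc_id in ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

vec = {"a": 0.9, "b": 0.4, "c": 0.1}   # vector similarity
kw = {"b": 1.0, "c": 0.8}              # BM25 keyword relevance
ranked = hybrid_score(vec, kw, alpha=0.75)
print(ranked[0][0])  # "a" — strong vector match wins at alpha=0.75
```

Lowering `alpha` toward 0 would promote "b", the strongest keyword match, to the top instead.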
Limitations: Another stateful service to deploy, monitor, and back up. GraphQL adds a learning curve for teams accustomed to SQL. Resource-intensive at small scale. Requires a data synchronization pipeline if PostgreSQL remains your source of truth.
Qdrant — High-Performance Vector Engine with Advanced Filtering
Qdrant is a purpose-built vector database written in Rust. It is designed for high-throughput, low-latency vector search with a focus on filtering performance.
```python
# Qdrant Python client — filtered search
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue

client = QdrantClient("localhost", port=6333)

results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="engineering"),
            )
        ]
    ),
    limit=10,
)
```

Strengths: Filtering performance is Qdrant's distinguishing feature — payload-aware filtering integrates filters into the HNSW graph traversal, maintaining recall even with highly selective filters. Quantization reduces memory footprint by 4–32x. Rust implementation delivers predictable latency under load.
Limitations: Another stateful service to operate. No native hybrid text search. Younger ecosystem compared to Weaviate or Elasticsearch. Requires a data sync pipeline when used alongside PostgreSQL.
Feature Comparison
| Feature | pgvector | Weaviate | Qdrant |
|---|---|---|---|
| Index types | HNSW, IVFFlat | HNSW | HNSW |
| Distance metrics | L2, cosine, inner product | L2, cosine, dot product, hamming, manhattan | cosine, euclid (L2), dot product, manhattan |
| Max dimensions | 16,000 for the vector type (2,000 indexable with HNSW/IVFFlat) | No hard limit | 65,536 |
| Filtering approach | Post-filter (SQL WHERE after ANN) | Pre-filter (inverted index, then vector search) | Integrated (payload-aware graph traversal) |
| Hybrid search (vector + keyword) | Manual (pgvector + tsvector) | Native (vector + BM25 in one query) | No native keyword search |
| Quantization | Half-precision (halfvec), binary via expression indexes | Product quantization, binary quantization | Scalar, product, binary quantization |
| Multi-tenancy | Schema-based or row-level security | Native (per-tenant isolation) | Collection-based |
| Replication | PostgreSQL streaming replication | Built-in | Built-in (Raft consensus) |
| Sharding | Table partitioning | Built-in (automatic) | Built-in (automatic) |
| ACID transactions | Yes (full PostgreSQL ACID) | No | No |
| Backup / PITR | pg_dump, WAL archiving, PITR | Snapshot-based | Snapshot-based |
| Built-in embedding | No (external embedding required) | Yes (OpenAI, Cohere, HuggingFace modules) | No (external embedding required) |
| API | SQL | GraphQL, REST | REST, gRPC |
| License | PostgreSQL License (open source) | BSD-3-Clause | Apache 2.0 |
Performance — What the Benchmarks Actually Tell You
The Benchmark Landscape
ANN-Benchmarks is a widely referenced, vendor-neutral benchmark suite for approximate nearest-neighbor search. It measures recall against queries-per-second across various datasets and algorithms.
I should note, however, that ANN-Benchmarks measures pure ANN search: a single query, no filters, no concurrent writes, no hybrid queries. Real production workloads differ substantially. A benchmark showing that Engine A handles 10,000 QPS and Engine B handles 8,000 QPS tells you very little about how they will perform under your actual workload with filters, concurrent users, and mixed read-write traffic.
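For clarity on the metric itself: "recall" here means recall@k, the fraction of the true k nearest neighbors (from an exhaustive brute-force search) that the approximate index actually returned. A minimal sketch:

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k neighbors the ANN search returned.

    `exact_ids` come from an exhaustive (brute-force) search and serve
    as ground truth — this is the recall figure benchmark suites report.
    """
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

# The ANN index found 9 of the true top-10 -> recall 0.9
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 99]
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(recall_at_k(approx, exact))  # 0.9
```

Note that recall@k says nothing about latency, filters, or write load, which is exactly why it is an incomplete proxy for production behavior.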
Recall vs Latency at Different Scales
100K vectors (1536 dimensions): All three systems perform well at this scale. pgvector delivers sub-5ms latency at 99% recall. Weaviate and Qdrant are similarly fast. At 100K vectors, performance is not a differentiator — any of these tools will be fast enough.
1M vectors (1536 dimensions): Qdrant and Weaviate typically maintain sub-5ms latency at high recall. pgvector ranges from 5–15ms depending on ef_search settings and dimension count. The gap is measurable but rarely user-facing — the difference between 5ms and 15ms is absorbed by network latency in most architectures.
10M+ vectors (1536 dimensions): At this scale, dedicated engines pull ahead. Sharding distributes the index across nodes, and quantization reduces memory requirements. pgvector on a single PostgreSQL instance faces memory pressure (the HNSW index for 10M vectors at 1536 dimensions exceeds 60 GB) and build times measured in hours.
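The 60 GB figure above is a back-of-envelope calculation worth being able to reproduce: float32 vectors cost 4 bytes per dimension, and the HNSW graph adds link storage on top. The sketch below assumes m=16 (pgvector's default) and treats the link overhead as approximate.

```python
# Back-of-envelope memory estimate for an HNSW index over float32 vectors.
n, dims = 10_000_000, 1536
bytes_per_float = 4

vector_bytes = n * dims * bytes_per_float  # raw vectors: 61.44 GB

# HNSW also stores graph links — roughly m * 2 neighbors of 4 bytes each
# per node at the base layer (assumption: m=16, the pgvector default;
# upper layers add a small extra fraction ignored here).
m = 16
link_bytes = n * m * 2 * 4

total_gb = (vector_bytes + link_bytes) / 1e9
print(f"vectors: {vector_bytes / 1e9:.1f} GB, total: ~{total_gb:.1f} GB")
```

The raw vectors alone exceed 61 GB, before PostgreSQL's shared buffers, the heap table, and any other indexes compete for the same memory.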
The honest assessment: Below 1 million vectors, performance differences between these tools are unlikely to be your bottleneck. Network latency, embedding generation time, and application-layer processing dominate end-to-end latency. A butler who overstates the importance of a 10ms difference when the network adds 50ms would be a poor advisor. Above 5–10 million vectors, the architectural advantages of purpose-built engines begin to matter.
Filtering Performance — Where the Real Differences Emerge
If you will permit me one strong opinion: filtering is where the architectural differences between these systems have the most practical impact.
pgvector: Applies standard SQL WHERE clauses after the ANN search (post-filtering). The HNSW index returns its top candidates without awareness of the filter, and PostgreSQL then discards candidates that do not match the WHERE clause. When the filter is selective, many ANN results are discarded, and effective recall drops. Partial indexes and table partitioning can mitigate this, but require explicit setup per filter column.
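The recall loss from post-filtering is easy to demonstrate with a toy corpus. The simulation below (pure Python, brute-force distance standing in for the ANN index) compares taking the top 10 candidates first and filtering afterward against searching only the matching subset; the category names and 10% selectivity are illustrative.

```python
import random

random.seed(0)

# Toy corpus: each item has a 2-D "embedding" and a category payload.
items = [
    {"id": i,
     "vec": (random.random(), random.random()),
     "category": "engineering" if i % 10 == 0 else "other"}  # ~10% selective
    for i in range(1000)
]
query = (0.5, 0.5)

def dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

# Post-filtering (pgvector-style): take the index's top-k FIRST,
# then apply the WHERE clause to those candidates.
top_k = sorted(items, key=lambda it: dist(it["vec"], query))[:10]
post_filtered = [it for it in top_k if it["category"] == "engineering"]

# Pre-/integrated filtering: only matching items are ever candidates.
filtered_search = sorted(
    (it for it in items if it["category"] == "engineering"),
    key=lambda it: dist(it["vec"], query),
)[:10]

print(len(post_filtered), "vs", len(filtered_search), "results returned")
```

With a 10%-selective filter, most of the top-10 ANN candidates fail the filter and are discarded, so the post-filtered query returns far fewer than the 10 rows requested. This is the mechanism behind the recall degradation described above (pgvector can raise ef_search to compensate, at a latency cost).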
Qdrant: Integrates payload filtering into the HNSW graph traversal itself. During the search, the algorithm only considers nodes that match the filter conditions. This maintains recall even under highly selective filters. This is Qdrant's most significant architectural advantage over pgvector.
Weaviate: Uses pre-filtering with inverted indexes. Before the vector search begins, Weaviate identifies the set of documents matching the filter criteria, then runs the vector search only over that filtered set. This maintains recall but can be slower than Qdrant's integrated approach for certain filter patterns.
For workloads where most queries include tight metadata filters — multi-tenant SaaS applications, e-commerce with category filters, content search with access controls — filtering performance is often more important than raw ANN throughput.
Operational Complexity — The Cost That Deserves Your Attention
Deployment
pgvector: CREATE EXTENSION vector; on your existing PostgreSQL instance. No new servers, no new containers, no new networking. If you can run a SQL command, you can deploy pgvector.
Weaviate: Deploy a Docker container or Kubernetes Helm chart. Configure vectorization modules, persistence volume, replication settings, and resource limits.
Qdrant: Deploy a Docker container or Kubernetes pod. Configure collection settings, storage paths, and replication. Simpler to configure than Weaviate, but still a separate stateful service with its own lifecycle.
Data Consistency
This is the cost that benchmark comparisons quietly omit, and I believe it deserves particular attention.
pgvector: Your vectors live in the same database as your application data. Insert a product and its embedding in the same transaction. Update a document and re-embed it atomically. Delete a record and its vector disappears. ACID guarantees apply to vectors exactly as they apply to any other column.
```sql
-- Atomic: product and embedding are always consistent
BEGIN;
INSERT INTO products (name, description, embedding)
VALUES ('Widget', 'A useful widget', embedding_vector);
COMMIT;
```

Weaviate and Qdrant: Your source of truth is PostgreSQL. Your vectors live in a separate system. You need a synchronization pipeline: application writes to PostgreSQL, a worker detects changes, generates embeddings, writes to the vector database. This pipeline must handle initial data loading, incremental updates, deletes, failures, retries, ordering guarantees, and schema changes in either system. It must be monitored, alerted on, and maintained.
The sync pipeline is not a theoretical concern. It is the primary source of bugs and operational incidents in dual-database architectures. Stale vectors, missing vectors, orphaned vectors, and consistency windows during updates are real problems that require engineering investment to solve.
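To make the shape of such a pipeline concrete, here is an in-memory sketch of the outbox pattern a sync worker typically implements. The `outbox` list stands in for a change table or logical-replication feed, `vector_store` stands in for Weaviate or Qdrant, and `embed` is a placeholder; every name here is illustrative, not a real API.

```python
# Change events captured from PostgreSQL writes, in commit order.
outbox = [
    {"op": "upsert", "id": 1, "text": "widget docs"},
    {"op": "upsert", "id": 2, "text": "gadget docs"},
    {"op": "delete", "id": 1},
]
vector_store = {}  # stand-in for the external vector database

def embed(text):
    # Placeholder for a real embedding-model call.
    return [float(len(text))]

def process(events, store):
    """Apply change events in order. Deletes must be handled explicitly,
    or the vector DB silently accumulates orphaned vectors."""
    for ev in events:
        if ev["op"] == "upsert":
            store[ev["id"]] = embed(ev["text"])
        elif ev["op"] == "delete":
            store.pop(ev["id"], None)

process(outbox, vector_store)
print(vector_store)  # only id 2 remains: {2: [11.0]}
```

Even this toy version hints at the failure modes listed above: drop the delete event and an orphaned vector persists; replay events out of order and a stale embedding overwrites a fresh one. A production pipeline needs idempotent upserts, ordered delivery, and dead-letter handling around exactly this loop.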
Backup and Disaster Recovery
pgvector: Your existing PostgreSQL backup strategy — pg_dump, WAL archiving, point-in-time recovery — covers vectors. You restore one system and everything is consistent.
Weaviate/Qdrant: Separate backup procedures, separate backup schedules, separate restore testing. After a disaster, you restore PostgreSQL and then restore the vector database, hoping the two are consistent at the same point in time. If they are not, you re-sync.
Monitoring and Upgrades
pgvector: pg_stat_statements captures vector query performance. Your existing PostgreSQL monitoring reports on vector query latency, index usage, and buffer hit rates. Upgrades: ALTER EXTENSION vector UPDATE;
Weaviate/Qdrant: Separate Prometheus metrics endpoints, separate Grafana dashboards, separate alerting rules. Your on-call rotation now covers two stateful systems. Separate version management, separate upgrade procedures, potential breaking API changes.
Cost Modeling
The following estimates assume 1536-dimensional vectors (OpenAI embedding size) running on cloud infrastructure.
| Scale | pgvector (incremental) | Weaviate (dedicated) | Qdrant (dedicated) |
|---|---|---|---|
| 100K vectors | ~$0/mo additional (fits on existing instance) | $50–150/mo (minimal instance) | $30–100/mo (minimal instance) |
| 1M vectors | $50–200/mo additional RAM/storage | $200–500/mo (memory for vectors) | $100–300/mo (with quantization) |
| 5M vectors | $200–800/mo (larger instance) | $500–1,500/mo | $300–800/mo |
| 10M vectors | $500–2,000/mo (large instance or partition) | $1,500–4,000/mo | $500–1,500/mo |
Key factors: pgvector cost is incremental — you are adding RAM and storage to an instance you already pay for. Weaviate keeps vectors in memory by default, making it the most memory-intensive option. Qdrant's quantization reduces memory requirements by 4–32x, making it more cost-effective at scale.
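The 4–32x quantization range follows directly from per-dimension storage cost, which a few lines of arithmetic make explicit:

```python
# Where the 4-32x quantization figure comes from: bytes per vector.
dims = 1536
float32_bytes = dims * 4        # full precision: 6,144 bytes per vector
scalar_int8_bytes = dims * 1    # scalar quantization: 1 byte per dim
binary_bytes = dims // 8        # binary quantization: 1 bit per dim

print(float32_bytes / scalar_int8_bytes)  # 4.0  (scalar)
print(float32_bytes / binary_bytes)       # 32.0 (binary)
```

The tradeoff, of course, is recall: aggressive quantization typically requires rescoring against full-precision vectors (kept on disk) to recover accuracy.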
Engineering Cost
The infrastructure cost comparison omits the larger expense: engineering time.
pgvector: Near-zero integration cost. Your team writes SQL, uses existing PostgreSQL clients, and deploys through existing pipelines.
Weaviate or Qdrant: Budget 2–4 weeks of engineering time for initial integration — API client integration, data sync pipeline, monitoring setup, backup configuration, operational runbooks. Ongoing maintenance adds continuous operational overhead.
The honest calculation: the engineering cost of integrating and operationalizing a dedicated vector database exceeds the infrastructure cost at any scale below roughly 5M vectors.
When Each Tool Is the Right Choice
Choose pgvector When
- Your dataset is under 1–5M vectors. pgvector handles this scale with single-digit millisecond latency and 99%+ recall when properly tuned.
- You need transactional consistency. Vectors and application data in the same transaction, no sync pipeline, no eventual consistency.
- Your team's expertise is in PostgreSQL, not distributed systems. pgvector requires SQL knowledge you already have.
- You want vector search without expanding your ops surface. No new services to deploy, monitor, back up, or troubleshoot at 3 AM.
- You are prototyping or in early product stages. Start simple, measure real performance, and migrate only if you hit a specific limitation.
Choose Weaviate When
- Native hybrid search is a core requirement. Combining vector similarity with BM25 keyword ranking in a single query is Weaviate's strongest differentiator.
- You want built-in embedding pipelines. Weaviate's vectorization modules let you send raw text and have Weaviate handle embedding generation.
- Your team has Kubernetes expertise. Weaviate runs well on Kubernetes. If your team already operates stateful services there, the marginal operational cost is lower.
- Multi-tenancy is a hard requirement. Weaviate's native tenant isolation is more mature than PostgreSQL's approaches for vector workloads.
Choose Qdrant When
- Filtering performance is critical. If most queries include tight metadata filters and you need high recall under those filters, Qdrant's payload-aware filtering is a genuine architectural advantage.
- Memory efficiency at scale matters. Qdrant's quantization can reduce memory requirements by 4–32x.
- Your dataset exceeds 5–10M vectors. Qdrant's horizontal sharding distributes the index across nodes.
- Latency sensitivity is extreme. For consistent sub-5ms p99 latency at millions of vectors under concurrent load, Qdrant's Rust implementation delivers more predictable performance.
Migration Paths — Changing Your Mind Is Perfectly Reasonable
I should like to offer some reassurance: none of these choices is permanent. Migration between them is straightforward, if not trivial.
pgvector to Weaviate or Qdrant
Your embeddings are already in PostgreSQL. Export them:
```sql
-- Export embeddings as CSV (embedding cast to its text form)
COPY (
    SELECT id, title, category, embedding::text
    FROM documents
) TO '/tmp/documents.csv' WITH (FORMAT csv, HEADER);
```

Then bulk import via the Weaviate or Qdrant API. The harder part is building the sync pipeline to keep the new system current with ongoing PostgreSQL changes.
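On the import side, the exported `embedding::text` column must be parsed back into a float array before it can be sent to the target engine's API. A minimal stdlib sketch, assuming pgvector's default text representation (`[0.1,0.2,0.3]`) and an in-memory stand-in for the CSV file:

```python
import csv
import io

def parse_pgvector(text):
    """Parse pgvector's text representation ('[0.1,0.2,0.3]') into floats."""
    return [float(x) for x in text.strip("[]").split(",")]

# Stand-in for the exported file; a real import would open the CSV on disk.
exported = io.StringIO(
    "id,title,category,embedding\n"
    '1,Intro,engineering,"[0.1,0.2,0.3]"\n'
)
rows = [
    {**row, "embedding": parse_pgvector(row["embedding"])}
    for row in csv.DictReader(exported)
]
print(rows[0]["embedding"])  # [0.1, 0.2, 0.3]
```

Each parsed row then maps naturally onto a bulk-upsert payload: the id becomes the point/object id, the embedding becomes the vector, and the remaining columns become the payload or properties.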
Weaviate or Qdrant to pgvector
Export via the engine's API (Weaviate scroll API, Qdrant scroll endpoint), then insert into PostgreSQL:
```sql
-- Import embeddings into pgvector
INSERT INTO documents (id, title, category, embedding)
VALUES (1, 'Title', 'category', '[0.1, 0.2, ...]'::vector);
```

The Starting-Point Advantage
Starting with pgvector: Your embeddings are already in PostgreSQL. If you later need a dedicated engine, migration is a data export — you copy vectors out. Your PostgreSQL data remains intact and unchanged.
Starting with a dedicated engine: If you later determine that pgvector is sufficient (a common outcome after the initial workload stabilizes), you migrate vectors into PostgreSQL and decommission the separate service, its sync pipeline, and its operational overhead.
Starting with pgvector carries less risk. You can always add complexity later. Removing complexity after adoption is harder — sync pipelines, operational runbooks, and monitoring integrations tend to persist long after the need that motivated them has passed. In infrastructure, simplicity is not the absence of capability. It is the discipline to use existing capability before adding new complexity.
How Gold Lapel Relates
Gold Lapel attends to PostgreSQL query performance, including pgvector workloads. It detects pgvector queries that would benefit from HNSW indexes or ef_search tuning, and it monitors pgvector query latency alongside your OLTP workload to identify when vector queries are competing for shared resources. The comparison above stands on its own merits regardless of any specific tooling.