pglogical
Logical replication with row filtering, conflict resolution, and cross-version support. If I may say so, few extensions have earned their keep as thoroughly as this one.
Replication is the art of maintaining identical copies of data across multiple databases — and doing so without letting any of them fall out of step. It is a discipline that rewards precision. pglogical is a PostgreSQL extension that provides logical replication using a publish/subscribe model. Originally developed by 2ndQuadrant (now part of EDB), it predates PostgreSQL's built-in logical replication and still offers features the native system lacks, including conflict resolution, cross-version replication, and fine-grained row and column filtering. It is the most widely deployed third-party replication extension for PostgreSQL.
What pglogical does
pglogical reads changes from the write-ahead log (WAL) on a provider node and streams them to one or more subscriber nodes. Unlike physical replication, which copies the entire database byte-for-byte, logical replication decodes WAL entries into logical changes (INSERT, UPDATE, DELETE, TRUNCATE) and applies them on the subscriber. This means the provider and subscriber can have different schemas, different indexes, and even different PostgreSQL versions.
The core abstraction is the replication set — a named group of tables with rules about which operations and which rows to replicate. You can create multiple replication sets per provider, each with different tables and filters, and subscribers choose which sets to follow. Surgical control over what data moves where, in other words — nothing travels that you have not expressly permitted.
pglogical handles the initial data synchronization automatically. When a subscription is created with synchronize_data := true, it copies the current state of all tables in the subscribed replication sets before switching to streaming live changes. After the initial sync, only incremental changes are transmitted.
When to use pglogical
pglogical is the right tool when you need more control than physical replication provides, or when native logical replication does not cover your requirements.
- Zero-downtime major version upgrades — replicate from PostgreSQL 12 to PostgreSQL 16 (or any supported version pair), let the subscriber catch up, then switch application traffic
- Selective replication — replicate only specific tables, specific columns, or rows matching a filter expression to a downstream database
- Cross-version replication — maintain replication between clusters running different PostgreSQL major versions, which physical replication cannot do
- Multi-directional replication — configure multiple providers and subscribers with conflict resolution for active-active setups (with careful planning)
- Data distribution — feed filtered subsets of data to analytics databases, reporting replicas, or regional instances
- Migration between cloud providers — replicate from one managed PostgreSQL service to another with minimal downtime
Installation and setup
pglogical is a third-party extension — it does not ship with PostgreSQL. On self-managed servers, install it from the 2ndQuadrant repository or your distribution's package manager. On managed services like RDS or Azure, it is available as a supported extension.
pglogical must be loaded via shared_preload_libraries, which requires a PostgreSQL restart. The WAL level must also be set to logical. Allow me to walk through the full setup.
-- 1. Add to postgresql.conf (requires restart)
shared_preload_libraries = 'pglogical'
wal_level = 'logical'
max_worker_processes = 10
max_replication_slots = 10
max_wal_senders = 10
-- 2. Create the extension on both provider and subscriber databases
CREATE EXTENSION pglogical;
-- 3. Verify the extension is installed
SELECT * FROM pg_extension WHERE extname = 'pglogical'; After the restart, create the extension in every database that will participate in replication — both provider and subscriber databases need it.
Setting up replication
Provider node
The provider is the source database. Create a node identity and add tables to a replication set.
-- On the provider (source) database: create a node
SELECT pglogical.create_node(
node_name := 'provider',
dsn := 'host=provider-host port=5432 dbname=mydb'
);
-- Add all tables in the public schema to the default replication set
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']); Subscriber node
The subscriber is the target database. Create a node identity and subscribe to the provider. Setting synchronize_data := true copies the current table contents before streaming live changes.
-- On the subscriber (target) database: create a node
SELECT pglogical.create_node(
node_name := 'subscriber',
dsn := 'host=subscriber-host port=5432 dbname=mydb'
);
-- Create a subscription to the provider
SELECT pglogical.create_subscription(
subscription_name := 'my_subscription',
provider_dsn := 'host=provider-host port=5432 dbname=mydb',
replication_sets := ARRAY['default'],
synchronize_data := true
); Replication sets and filtering
Replication sets are pglogical's mechanism for selective replication. Each set defines which tables are included, which operations (INSERT, UPDATE, DELETE, TRUNCATE) are replicated, and optional row and column filters.
-- Create a custom replication set
SELECT pglogical.create_replication_set(
set_name := 'orders_only',
replicate_insert := true,
replicate_update := true,
replicate_delete := true,
replicate_truncate := true
);
-- Add a table with row filtering — only replicate completed orders
SELECT pglogical.replication_set_add_table(
set_name := 'orders_only',
relation := 'public.orders',
row_filter := $$ status = 'completed' $$
);
-- Add a table with column filtering — replicate only specific columns
SELECT pglogical.replication_set_add_table(
set_name := 'orders_only',
relation := 'public.customers',
columns := '{id, name, email}'
); Row filters are standard PostgreSQL expressions evaluated on the provider. Only rows matching the expression are sent to subscribers. Column filters specify which columns to include — unlisted columns are not transmitted, reducing bandwidth and allowing the subscriber schema to differ from the provider. A well-run replication setup, like a well-run household, sends only what is needed and nothing more.
Conflict resolution
When a subscriber receives a change that conflicts with its local data — for example, an INSERT with a primary key that already exists — pglogical can resolve the conflict automatically based on the configured strategy.
-- Set conflict resolution strategy (in postgresql.conf or per-session)
-- Options: error, apply_remote, keep_local, last_update_wins, first_update_wins
SET pglogical.conflict_resolution = 'last_update_wins';
-- Monitor replication status
SELECT * FROM pglogical.show_subscription_status();
-- Check replication lag
SELECT
subscription_name,
status,
provider_dsn
FROM pglogical.show_subscription_status(); The five resolution strategies are:
- error — stop replication on conflict (safest, requires manual intervention)
- apply_remote — the remote (provider) change wins
- keep_local — the local (subscriber) change wins
- last_update_wins — the most recent commit timestamp wins
- first_update_wins — the earliest commit timestamp wins
For multi-directional setups, last_update_wins is the most common choice. For unidirectional replication where the subscriber should not have local writes, apply_remote (the default) is appropriate. I should note that multi-directional replication with conflict resolution is powerful, but it demands careful planning — two databases disagreeing about the state of a row is not a situation that resolves itself gracefully without rules in place.
pglogical vs. native logical replication
| Feature | pglogical | Native (PG 10+) |
|---|---|---|
| Minimum PostgreSQL version | 9.4 | 10 |
| Row filtering | Yes (all versions) | PostgreSQL 15+ |
| Column filtering | Yes (all versions) | PostgreSQL 16+ |
| Conflict resolution | Yes (5 strategies) | Limited (PG 17+ adds some) |
| Cross-version replication | Yes | Yes |
| Initial sync parallelism | Sequential | Parallel |
| DDL replication | Limited | No |
| Requires extension install | Yes | No (built-in) |
Native logical replication has been steadily closing the feature gap, and credit where it is due — the PostgreSQL core team has been thorough. For new deployments on PostgreSQL 16+, evaluate whether native logical replication meets your needs before adding the pglogical dependency. For cross-version upgrades or environments where conflict resolution is essential, pglogical remains the stronger option.
Cloud availability
| Provider | Status |
|---|---|
| Amazon RDS / Aurora | Available — enable via parameter group (shared_preload_libraries) |
| Google Cloud SQL | Available — enable via database flags |
| Azure Database for PostgreSQL | Available — set shared_preload_libraries and azure.extensions to include pglogical |
| Supabase | Not available — use native logical replication instead |
| Neon | Not available — use native logical replication instead |
How Gold Lapel relates
I should be forthcoming: Gold Lapel does not participate in replication. It is a query-optimizing proxy that sits between your application and PostgreSQL, and it has no opinion about how your data arrives at the databases it serves. The connection to pglogical is indirect, but it is worth understanding.
A common reason teams set up read replicas — often via pglogical or native logical replication — is to offload expensive analytical queries from the primary. Gold Lapel offers a different path to the same relief: by automatically creating and maintaining materialized views for expensive query patterns, it can serve cached results directly from the primary, reducing or in some workloads eliminating the need for a separate read replica entirely. One well-maintained database doing the work of two is, in my experience, preferable to two databases requiring coordination between them.
If you do use pglogical for replication, Gold Lapel sits comfortably in front of either the provider or the subscriber — or both. It optimizes queries independently of how the underlying data is replicated.