Name: EVE AI Core
Author: Jamaurice Holt

Master PostgreSQL logical replication to selectively replicate data between databases, enable zero-downtime upgrades, and build flexible data distribution architectures.

What is Logical Replication?

Logical replication uses a publish/subscribe model to replicate data changes at the row level based on their identity (primary key). Unlike physical replication, it allows:

💡 Key Advantage: Logical replication lets you replicate specific tables, filter rows, and even replicate between different PostgreSQL versions.

Logical vs Physical Replication

Feature	Logical Replication	Physical Replication
Granularity	Table-level	Entire cluster
Version Compatibility	Cross-version	Same version required
Schema Changes	DDL not replicated	All changes replicated
Read/Write on Replica	Yes (writable)	No (read-only)
Selective Replication	Yes (tables/rows)	No
Use Cases	Data integration, upgrades	HA/DR

🔧 Prerequisites and Configuration

Publisher Configuration

Configure the source database (publisher):

# postgresql.conf on publisher
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10

# Restart PostgreSQL after changes
sudo systemctl restart postgresql

Subscriber Configuration

Configure the target database (subscriber):

# postgresql.conf on subscriber
max_replication_slots = 10
max_logical_replication_workers = 10
max_worker_processes = 16

Network Configuration

# pg_hba.conf on publisher - allow replication connections
host    replication     replication_user    192.168.1.0/24    scram-sha-256
host    mydb            replication_user    192.168.1.0/24    scram-sha-256

Setting Up Logical Replication

Step 1: Create Replication User

-- On publisher
CREATE ROLE replication_user WITH REPLICATION LOGIN PASSWORD 'secure_password';

-- Grant permissions to tables
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replication_user;
GRANT USAGE ON SCHEMA public TO replication_user;

Step 2: Create Publication

-- On publisher: Create publication for all tables
CREATE PUBLICATION my_publication FOR ALL TABLES;

-- Or for specific tables
CREATE PUBLICATION my_publication FOR TABLE
    users,
    orders,
    products;

-- With row filter (PostgreSQL 15+)
CREATE PUBLICATION filtered_publication FOR TABLE
    orders WHERE (status = 'completed');

-- View publications
SELECT * FROM pg_publication;
SELECT * FROM pg_publication_tables;

Step 3: Create Table Structure on Subscriber

-- On subscriber: Create matching table structure
-- Option 1: Use pg_dump for schema only
-- pg_dump -h publisher_host -U postgres -s -t users -t orders mydb | psql -h subscriber_host mydb

-- Option 2: Create tables manually
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(100),
    email VARCHAR(255),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    total DECIMAL(10,2),
    status VARCHAR(50),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Step 4: Create Subscription

-- On subscriber
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=publisher_host port=5432 dbname=mydb user=replication_user password=secure_password'
PUBLICATION my_publication;

-- View subscriptions
SELECT * FROM pg_subscription;
SELECT * FROM pg_stat_subscription;

📊 Monitoring Replication

Check Replication Status

-- On publisher: View replication slots
SELECT
    slot_name,
    plugin,
    slot_type,
    database,
    active,
    restart_lsn,
    confirmed_flush_lsn,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS lag
FROM pg_replication_slots;

-- View active replication connections
SELECT
    pid,
    usename,
    application_name,
    client_addr,
    state,
    sent_lsn,
    write_lsn,
    flush_lsn,
    replay_lsn
FROM pg_stat_replication;

Check Subscription Status

-- On subscriber: View subscription status
SELECT
    subname,
    pid,
    relid::regclass AS table_name,
    received_lsn,
    last_msg_send_time,
    last_msg_receipt_time,
    latest_end_lsn,
    latest_end_time
FROM pg_stat_subscription;

-- Check subscription relation state
SELECT
    srsubid,
    srrelid::regclass AS table_name,
    srsubstate AS state,
    srsublsn AS lsn
FROM pg_subscription_rel;

Monitor Replication Lag

-- Calculate replication lag in bytes
SELECT
    slot_name,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS replication_lag
FROM pg_replication_slots
WHERE slot_type = 'logical';

Advanced Configuration

Row Filtering (PostgreSQL 15+)

-- Replicate only specific rows
CREATE PUBLICATION orders_completed FOR TABLE
    orders WHERE (status = 'completed');

CREATE PUBLICATION high_value_orders FOR TABLE
    orders WHERE (total > 1000);

-- Multiple conditions
CREATE PUBLICATION regional_orders FOR TABLE
    orders WHERE (region = 'US' AND status = 'active');

Column Filtering (PostgreSQL 15+)

-- Replicate only specific columns
CREATE PUBLICATION partial_users FOR TABLE
    users (id, username, email);  -- Excludes sensitive columns

-- Combine row and column filtering
CREATE PUBLICATION filtered_orders FOR TABLE
    orders (id, user_id, total, status) WHERE (status != 'draft');

Multiple Publications

-- Create separate publications for different purposes
CREATE PUBLICATION analytics_pub FOR TABLE
    orders,
    products,
    page_views;

CREATE PUBLICATION audit_pub FOR TABLE
    audit_log,
    user_sessions;

-- Subscribe to multiple publications
CREATE SUBSCRIPTION analytics_sub
CONNECTION 'host=publisher dbname=mydb user=rep_user'
PUBLICATION analytics_pub;

CREATE SUBSCRIPTION audit_sub
CONNECTION 'host=publisher dbname=mydb user=rep_user'
PUBLICATION audit_pub;

🚀 Common Use Cases

Use Case 1: Zero-Downtime Major Version Upgrade

-- Step 1: Set up logical replication from old to new version
-- On old server (publisher)
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On new server (subscriber)
CREATE SUBSCRIPTION upgrade_sub
CONNECTION 'host=old_server dbname=mydb user=rep_user'
PUBLICATION upgrade_pub
WITH (copy_data = true);

-- Step 2: Wait for initial sync and replication to catch up
SELECT * FROM pg_stat_subscription;

-- Step 3: Stop application writes to old server
-- Step 4: Verify replication is caught up
-- Step 5: Point application to new server
-- Step 6: Drop subscription on new server
DROP SUBSCRIPTION upgrade_sub;

Use Case 2: Selective Data Distribution

-- Replicate only US data to US regional database
CREATE PUBLICATION us_data FOR TABLE
    customers WHERE (country = 'US'),
    orders WHERE (region = 'US');

-- Replicate only EU data to EU regional database
CREATE PUBLICATION eu_data FOR TABLE
    customers WHERE (country IN ('DE', 'FR', 'GB', 'IT')),
    orders WHERE (region = 'EU');

Use Case 3: Real-time Analytics Replica

-- Create publication for analytics tables
CREATE PUBLICATION analytics_pub FOR TABLE
    sales,
    inventory,
    customer_behavior;

-- On analytics server
CREATE SUBSCRIPTION analytics_sub
CONNECTION 'host=production dbname=app user=rep_user'
PUBLICATION analytics_pub
WITH (synchronous_commit = off);  -- Allow async for better performance

Managing Subscriptions

Refresh Subscription

-- After adding tables to publication
ALTER PUBLICATION my_publication ADD TABLE new_table;

-- Refresh subscription to include new table
ALTER SUBSCRIPTION my_subscription REFRESH PUBLICATION;

-- With copy_data option
ALTER SUBSCRIPTION my_subscription REFRESH PUBLICATION
    WITH (copy_data = true);

Disable/Enable Subscription

-- Temporarily disable replication
ALTER SUBSCRIPTION my_subscription DISABLE;

-- Re-enable
ALTER SUBSCRIPTION my_subscription ENABLE;

Change Connection

-- Update connection string
ALTER SUBSCRIPTION my_subscription
CONNECTION 'host=new_publisher dbname=mydb user=rep_user password=new_pass';

Drop Subscription

-- Stop and remove subscription
DROP SUBSCRIPTION my_subscription;

-- On publisher: Drop replication slot if subscription dropped incorrectly
SELECT pg_drop_replication_slot('my_subscription');

Handling Schema Changes

⚠️ Important: DDL changes are NOT automatically replicated. You must apply schema changes manually on both publisher and subscriber.

Best Practice for Schema Changes

-- Step 1: Apply DDL on subscriber first (if adding columns)
ALTER TABLE orders ADD COLUMN discount DECIMAL(5,2);

-- Step 2: Apply DDL on publisher
ALTER TABLE orders ADD COLUMN discount DECIMAL(5,2);

-- For dropping columns, reverse the order:
-- Step 1: Apply on publisher first
-- Step 2: Apply on subscriber

Handle Table Additions

-- On publisher
CREATE TABLE new_table (...);
ALTER PUBLICATION my_publication ADD TABLE new_table;

-- On subscriber
CREATE TABLE new_table (...);  -- Same structure
ALTER SUBSCRIPTION my_subscription REFRESH PUBLICATION WITH (copy_data = true);

⚠️ Troubleshooting

Issue 1: Subscription Not Replicating

-- Check subscription state
SELECT * FROM pg_stat_subscription WHERE subname = 'my_subscription';

-- Check for errors in PostgreSQL logs
-- Look for: "logical replication apply worker"

-- Verify publication exists
SELECT * FROM pg_publication WHERE pubname = 'my_publication';

-- Verify network connectivity
-- Test from subscriber to publisher

Issue 2: Replication Slot Not Active

-- Check slot status
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots;

-- If slot is inactive and subscription exists, try:
ALTER SUBSCRIPTION my_subscription ENABLE;

-- Force drop inactive slot (use with caution)
SELECT pg_drop_replication_slot('slot_name');

Issue 3: Initial Sync Fails

-- Check relation sync state
SELECT srrelid::regclass, srsubstate
FROM pg_subscription_rel;

-- States: i = init, d = data copy, f = finished, s = sync, r = ready

-- Resync a specific table
ALTER SUBSCRIPTION my_subscription
    REFRESH PUBLICATION WITH (copy_data = true);

Issue 4: Conflict on Subscriber

-- If subscriber has conflicting data
-- Option 1: Skip the transaction
-- Set in postgresql.conf or per-session:
-- ALTER SUBSCRIPTION my_subscription SET (disable_on_error = false);

-- Option 2: Fix data manually and restart
-- Find conflicting row, fix it, then:
ALTER SUBSCRIPTION my_subscription ENABLE;

✅ Best Practices

Design Guidelines

Use primary keys - Required for UPDATE/DELETE replication
Match schemas exactly - Column names, types, and order
Plan for DDL changes - Document and coordinate schema updates
Monitor lag regularly - Set up alerting for replication lag
Test failover procedures - Validate promotion steps

Performance Optimization

-- Increase worker processes for large datasets
ALTER SYSTEM SET max_logical_replication_workers = 20;
ALTER SYSTEM SET max_sync_workers_per_subscription = 4;

-- For initial sync of large tables
ALTER SUBSCRIPTION my_subscription SET (streaming = on);

-- Reduce WAL retention if lag is consistently low
ALTER SYSTEM SET wal_keep_size = '1GB';

Security Considerations

-- Use SSL for replication connections
CREATE SUBSCRIPTION secure_sub
CONNECTION 'host=publisher sslmode=require ...'
PUBLICATION my_publication;

-- Limit replication user permissions
GRANT SELECT ON TABLE users TO replication_user;  -- Only needed tables
REVOKE ALL ON TABLE sensitive_data FROM replication_user;

Comparison with Other Solutions

Solution	Use Case	Complexity
Logical Replication	Selective replication, upgrades	Medium
Streaming Replication	HA/DR, read replicas	Low
pglogical	Advanced logical replication	High
Bucardo	Multi-master replication	High
Slony	Cascading replication	High

Conclusion

PostgreSQL logical replication provides flexible, powerful data distribution capabilities. By understanding its architecture and following best practices, you can implement robust replication solutions for upgrades, analytics, and data distribution.

Key Takeaways

✅ Use publications to define what data to replicate
✅ Use subscriptions to receive replicated data
✅ Monitor lag and slot status regularly
✅ Plan for DDL changes manually
✅ Test failover procedures before you need them

About the Author

Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in PostgreSQL, SQL optimization, and enterprise database solutions.

Need help with PostgreSQL optimization? Contact us for expert database consulting.