Configure Postgres for Replication
Set up Postgres with the correct permissions and settings for ETL logical replication
This guide covers the essential Postgres concepts and configuration needed for logical replication with ETL.
Prerequisites
- Postgres 10 or later
- Superuser access to the Postgres server
- Ability to restart Postgres server (for configuration changes)
Understanding WAL Logical
Postgres's Write-Ahead Log (WAL) is the foundation of logical replication. When wal_level = logical
, Postgres:
- Records detailed information about data changes (not just physical changes)
- Includes enough metadata to reconstruct logical changes
- Allows external tools to decode and stream these changes
1 2 |
|
Restart Postgres after changing this setting:
1 |
|
Replication Slots
Replication slots ensure that Postgres retains WAL data for logical replication consumers, even if they disconnect temporarily.
What are Replication Slots?
- Persistent markers in Postgres that track replication progress
- Prevent WAL cleanup until the consumer catches up
- Guarantee data consistency across disconnections
Creating Replication Slots
1 2 |
|
Viewing Replication Slots
1 2 3 |
|
Deleting Replication Slots
1 2 |
|
Warning: Only delete slots when you're sure they're not in use. Deleting an active slot can cause data loss.
Max Replication Slots
Controls how many replication slots Postgres can maintain simultaneously.
1 2 |
|
ETL uses a single replication slot for its main apply worker. However, additional slots may be created for parallel table
copies when the pipeline is initialized or when a new table is added to the publication. The max_table_sync_workers
parameter
controls the number of these parallel copies, ensuring that the total replication slots used by ETL never exceed max_table_sync_workers + 1
.
When to increase:
- Running multiple ETL pipelines
- Development/testing with frequent slot creation
WAL Keep Size
Determines how much WAL data to retain on disk, providing a safety buffer for replication consumers.
1 2 3 4 5 |
|
Purpose:
- Prevents WAL deletion when replication consumers fall behind
- Provides recovery time if ETL pipelines temporarily disconnect
- Balances disk usage with replication reliability
Publications
Publications define which tables and operations to replicate.
Creating Publications
1 2 3 4 5 6 7 8 |
|
Managing Publications
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
|
Complete Configuration Example
Here's a minimal Postgres.conf
setup:
1 2 3 4 5 6 7 8 9 10 |
|
After editing the configuration:
- Restart Postgres
- Create your publication:
1
CREATE PUBLICATION etl_publication FOR TABLE your_table;
- Verify the setup:
1 2 3
SHOW wal_level; SHOW max_replication_slots; SELECT * FROM pg_publication WHERE pubname = 'etl_publication';
Next Steps
- Build your first pipeline → Build Your First ETL Pipeline
- Build custom implementations → Custom Stores and Destinations
See Also
- Build Your First ETL Pipeline - Hands-on tutorial using these settings
- ETL Architecture - Understanding how ETL uses these settings
- API Reference - All available connection options