Introduction
Database replication ensures data durability and availability by maintaining copies across multiple servers. For production systems, replication isn't optional—it's essential for disaster recovery, read scaling, and maintenance without downtime. This guide covers practical replication strategies for PostgreSQL and MySQL.
Replication Concepts
Replication Types
Physical Replication: Copies the database's exact on-disk bytes (in PostgreSQL, by shipping WAL). Fast and complete, but the primary and replicas must run the same major PostgreSQL version and architecture.
Logical Replication: Copies logical changes (row-level operations decoded from the WAL). More flexible: it allows different versions and selective, per-table replication.
Synchronous vs. Asynchronous:
- Synchronous: The primary waits for replica confirmation before acknowledging a commit. No acknowledged transaction is lost, but commit latency increases.
- Asynchronous: The primary doesn't wait. Better performance, but transactions not yet shipped to a replica can be lost on failover.
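The trade-off above can be put in numbers with a toy latency model. A minimal sketch (the millisecond figures are illustrative assumptions, not benchmarks):

```python
# Toy model of commit latency under each replication mode.
# All numbers are illustrative, not measurements.

def commit_latency_ms(local_ms, replica_rtts_ms, synchronous):
    """Synchronous commit waits for the slowest required replica ack;
    asynchronous commit returns after the local write alone."""
    if synchronous and replica_rtts_ms:
        return local_ms + max(replica_rtts_ms)
    return local_ms

# One fast same-zone replica, one slow cross-region replica.
rtts = [2.0, 45.0]
print(commit_latency_ms(1.0, rtts, synchronous=False))  # 1.0
print(commit_latency_ms(1.0, rtts, synchronous=True))   # 46.0
```

The cross-region replica dominates the synchronous case, which is why synchronous standbys are usually kept network-close to the primary.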
PostgreSQL Streaming Replication
Primary Server Configuration
# postgresql.conf on the primary
# Enable WAL archiving
wal_level = replica
max_wal_senders = 10
wal_keep_size = 1GB
# For synchronous replication (optional)
synchronous_standby_names = 'replica1'
# pg_hba.conf - Allow replication connections
# (prefer scram-sha-256 over md5 on PostgreSQL 14+)
host replication replicator 192.168.1.0/24 md5
Create replication user:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
Replica Server Setup
# Stop PostgreSQL on the replica
sudo systemctl stop postgresql
# Remove existing data (pg_basebackup requires an empty target directory)
rm -rf /var/lib/postgresql/14/main/*
# Copy from the primary using pg_basebackup
pg_basebackup -h primary.example.com -U replicator -D /var/lib/postgresql/14/main -P -R
# The -R flag creates standby.signal and configures the connection
# postgresql.conf on the replica - only needed if you skipped -R above,
# since -R already writes primary_conninfo to postgresql.auto.conf
primary_conninfo = 'host=primary.example.com port=5432 user=replicator password=secure_password'
# Start the replica
sudo systemctl start postgresql
Verify Replication
-- On the primary: Check replication status
SELECT client_addr, state, sent_lsn, write_lsn, replay_lsn
FROM pg_stat_replication;
-- On the replica: Check recovery status
SELECT pg_is_in_recovery();
SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
PostgreSQL Logical Replication
Publisher (Source)
# postgresql.conf
wal_level = logical
max_replication_slots = 4
max_wal_senders = 4
-- Create publication
CREATE PUBLICATION my_publication FOR TABLE users, orders;
-- Or all tables
CREATE PUBLICATION full_publication FOR ALL TABLES;
Subscriber (Target)
-- Create subscription (the target tables must already exist;
-- logical replication does not copy schema/DDL)
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=publisher.example.com dbname=mydb user=replicator password=secure'
PUBLICATION my_publication;
-- Check status
SELECT * FROM pg_stat_subscription;
MySQL Replication
Primary Configuration
# my.cnf on the primary
[mysqld]
server-id = 1
log_bin = mysql-bin
binlog_format = ROW
# Caution: binlog_do_db filters by database and can silently skip
# cross-database statements; omit it unless you truly need filtering
binlog_do_db = mydb
-- Create replication user
CREATE USER 'replicator'@'%' IDENTIFIED BY 'secure_password';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;
-- Get binary log coordinates (capture them together with a consistent
-- data snapshot, e.g. mysqldump --single-transaction --master-data)
SHOW MASTER STATUS;
-- Note the File and Position values
Replica Configuration
# my.cnf on the replica
[mysqld]
server-id = 2
relay_log = mysql-relay-bin
read_only = 1
-- Configure replication
CHANGE MASTER TO
MASTER_HOST='primary.example.com',
MASTER_USER='replicator',
MASTER_PASSWORD='secure_password',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=154;
-- Start replication
START SLAVE;
-- Check status
SHOW SLAVE STATUS\G
-- MySQL 8.0.22+ prefers replica terminology: START REPLICA,
-- SHOW REPLICA STATUS (and CHANGE REPLICATION SOURCE TO from 8.0.23)
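The \G output is one key: value pair per line, which makes it easy to screen-scrape in a health check. A minimal sketch, assuming the legacy field names shown above (the thresholds and sample text are hypothetical):

```python
# Parse the vertical (\G) output of SHOW SLAVE STATUS into a dict
# so a monitoring script can inspect individual fields.

def parse_vertical_status(text):
    status = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            status[key.strip()] = value.strip()
    return status

sample = """
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 3
"""
status = parse_vertical_status(sample)
healthy = (status["Slave_IO_Running"] == "Yes"
           and status["Slave_SQL_Running"] == "Yes"
           and int(status["Seconds_Behind_Master"]) < 10)
print(healthy)  # True
```

In production you would feed this the output of `mysql -e 'SHOW SLAVE STATUS\G'` rather than a hard-coded sample.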
Failover Strategies
Manual Failover (PostgreSQL)
-- On the replica: Promote to primary
SELECT pg_promote();
-- Or via the command line
pg_ctl promote -D /var/lib/postgresql/14/main
Update application connection strings to point to the new primary.
Automatic Failover with Patroni
Patroni manages PostgreSQL high availability with automatic failover.
# patroni.yml
scope: postgres-cluster
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: node1.example.com:8008

etcd:
  hosts: etcd1:2379,etcd2:2379,etcd3:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        max_connections: 100

postgresql:
  listen: 0.0.0.0:5432
  connect_address: node1.example.com:5432
  data_dir: /var/lib/postgresql/14/main
  authentication:
    replication:
      username: replicator
      password: secure_password
    superuser:
      username: postgres
      password: postgres_password
HAProxy for Connection Routing
# haproxy.cfg
frontend postgres_frontend
    bind *:5432
    mode tcp
    default_backend postgres_backend

backend postgres_backend
    mode tcp
    option httpchk GET /master
    http-check expect status 200
    server node1 node1:5432 check port 8008
    server node2 node2:5432 check port 8008
    server node3 node3:5432 check port 8008
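HAProxy routes on the HTTP status Patroni's REST API returns: GET /master answers 200 only on the current leader and 503 on every other node. Application code can reuse the same convention; a minimal sketch with the status handling factored out so it needs no live cluster (node names are hypothetical):

```python
# Interpret Patroni REST API health-check responses.
# GET /master returns 200 on the leader and 503 on any other node,
# which is exactly what the HAProxy http-check above keys on.

def is_writable(status_code):
    """True if the checked node is the current primary."""
    return status_code == 200

def pick_primary(nodes):
    """nodes: list of (name, status_code) pairs from polling /master."""
    for name, code in nodes:
        if is_writable(code):
            return name
    return None  # no leader reachable: fail closed, do not guess

print(pick_primary([("node1", 503), ("node2", 200), ("node3", 503)]))  # node2
```

Failing closed when no node answers 200 matters: during a failover window there may briefly be no leader, and routing writes to a guess risks split brain.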
Read Scaling with Replicas
Application-Level Read/Write Splitting
// Laravel example
// config/database.php
'pgsql' => [
    'read' => [
        'host' => [
            'replica1.example.com',
            'replica2.example.com',
        ],
    ],
    'write' => [
        'host' => 'primary.example.com',
    ],
    // After a write, reads in the same request reuse the write connection
    'sticky' => true,
    'driver' => 'pgsql',
    'database' => 'mydb',
    // ...
],
// Force the write connection for read-after-write consistency:
// the third argument to select() is $useReadPdo, so pass false
$user = DB::connection('pgsql')->select('SELECT * FROM users WHERE id = ?', [1], false);
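Outside Laravel, the same split is easy to implement by hand. A framework-agnostic sketch (host names hypothetical) that pins writes to the primary and round-robins reads across replicas:

```python
import itertools

class ReadWriteRouter:
    """Route writes to the primary and reads round-robin across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def host_for(self, sql):
        # Crude classification: anything that is not a plain SELECT goes
        # to the primary. Real routers also pin reads that follow a write
        # ("sticky" reads) to avoid replication-lag anomalies.
        if self._replicas and sql.lstrip().lower().startswith("select"):
            return next(self._replicas)
        return self.primary

router = ReadWriteRouter("primary.example.com",
                         ["replica1.example.com", "replica2.example.com"])
print(router.host_for("SELECT * FROM users"))          # replica1.example.com
print(router.host_for("UPDATE users SET name = 'x'"))  # primary.example.com
```

With no replicas configured, everything falls through to the primary, so the router degrades gracefully.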
Connection Pooling with Read Replicas
# PgBouncer with multiple backends
[databases]
mydb = host=primary.example.com port=5432
; Comma-separated host lists require PgBouncer 1.20+ (see load_balance_hosts)
mydb_ro = host=replica1.example.com,replica2.example.com port=5432
Monitoring Replication
PostgreSQL Replication Lag
-- On the primary: Check lag per replica
SELECT client_addr,
pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes,
pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) / 1024 / 1024 AS lag_mb
FROM pg_stat_replication;
-- On the replica: Approximate lag in seconds
SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) AS lag_seconds;
-- Caveat: this value keeps growing when the primary is simply idle
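pg_wal_lsn_diff does byte arithmetic on LSNs of the form XX/YYYYYYYY, where the part before the slash is the high 32 bits of a 64-bit byte position. When an external monitor scrapes LSNs instead of running the query above, the same diff can be computed client-side; a sketch (sample LSN values are made up):

```python
# Compute replication lag in bytes from two PostgreSQL LSNs,
# mirroring what pg_wal_lsn_diff() does server-side.

def lsn_to_bytes(lsn):
    """An LSN like '16/B374D848' encodes (high << 32) | low, both hex."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def lag_bytes(current_lsn, replay_lsn):
    """Bytes of WAL the replica still has to replay."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(replay_lsn)

print(lag_bytes("16/B374D848", "16/B374D000"))  # 2120
```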
MySQL Replication Lag
SHOW SLAVE STATUS\G
-- Look for: Seconds_Behind_Master
-- (Seconds_Behind_Source via SHOW REPLICA STATUS on MySQL 8.0.22+)
Alerting Thresholds
- Warning: Lag > 10 seconds
- Critical: Lag > 60 seconds
- Emergency: Replica disconnected
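Encoded as a check, with the thresholds above as defaults (tune them to your workload):

```python
# Map a measured replication lag to the alert levels listed above.
# None models a disconnected replica (no lag reading at all).

def alert_level(lag_seconds, warn=10, crit=60):
    if lag_seconds is None:
        return "emergency"  # replica disconnected
    if lag_seconds > crit:
        return "critical"
    if lag_seconds > warn:
        return "warning"
    return "ok"

print(alert_level(5))     # ok
print(alert_level(30))    # warning
print(alert_level(120))   # critical
print(alert_level(None))  # emergency
```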
Backup with Replicas
Take backups from replicas to avoid impacting the primary:
# PostgreSQL - backup from replica
pg_dump -h replica.example.com -U postgres mydb > backup.sql
# Note: long-running dumps can be cancelled by replay conflicts;
# raising max_standby_streaming_delay on the replica avoids this
# Or physical backup
pg_basebackup -h replica.example.com -U replicator -D /backup -P
Best Practices
- Use synchronous replication for critical data (at least one replica)
- Monitor replication lag continuously
- Test failover regularly in non-production
- Keep replicas in different availability zones
- Document failover procedures
- Automate failover where possible (Patroni, orchestrator)
- Use connection pooling to manage connections efficiently
Conclusion
Database replication provides the foundation for high availability and disaster recovery. Start with asynchronous streaming replication for read scaling, add synchronous replication for critical data, and implement automatic failover for production systems. Regular testing ensures your failover procedures work when needed.