
Problem Statement

You need to ensure your database can handle failures, provide read scaling, and maintain data durability across multiple servers or regions.

When to Use Each Pattern

| Pattern | Use Case | Trade-offs |
| --- | --- | --- |
| Primary-Replica | Read scaling, simple failover | Replication lag, manual failover |
| Multi-Primary | Write scaling, geographic distribution | Conflict-resolution complexity |
| Synchronous | Zero data loss | Slower writes, requires low latency |
| Asynchronous | Higher performance | Potential data loss on failure |
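
As a rough illustration, the table's decision logic can be expressed as a lookup. This is only a restatement of the rows above; the function name and criteria are illustrative, not a library API:

```python
# Illustrative helper: restate the trade-off table above as code.
def pick_pattern(need_write_scaling: bool, zero_data_loss: bool) -> str:
    """Suggest a replication pattern for the given requirements."""
    if need_write_scaling:
        # Multi-primary accepts writes everywhere, at the cost of
        # conflict-resolution complexity.
        return "multi-primary"
    if zero_data_loss:
        # Synchronous replication trades write latency for durability.
        return "primary-replica (synchronous)"
    # Default: read scaling with asynchronous replication.
    return "primary-replica (asynchronous)"

print(pick_pattern(False, True))  # primary-replica (synchronous)
```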

MySQL/Percona Replication

Step 1: Configure Primary Server

Edit /etc/my.cnf:

```ini
[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
binlog_do_db = myapp_production
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1
# GTID-based replication (recommended)
gtid_mode = ON
enforce_gtid_consistency = ON
```

Create replication user:

```sql
CREATE USER 'replicator'@'replica-host' IDENTIFIED BY 'secure_password';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'replica-host';
FLUSH PRIVILEGES;
```

Step 2: Configure Replica Server

Edit /etc/my.cnf:

```ini
[mysqld]
server-id = 2
relay_log = /var/log/mysql/mysql-relay-bin.log
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
read_only = ON
# GTID-based replication
gtid_mode = ON
enforce_gtid_consistency = ON
```

Step 3: Initialize Replication

On the replica:

```sql
CHANGE MASTER TO
  MASTER_HOST = 'primary-host',
  MASTER_USER = 'replicator',
  MASTER_PASSWORD = 'secure_password',
  MASTER_AUTO_POSITION = 1;
START SLAVE;
```

Step 4: Verify Replication Status

```sql
SHOW SLAVE STATUS\G
```

Check for:

  • Slave_IO_Running: Yes
  • Slave_SQL_Running: Yes
  • Seconds_Behind_Master: 0 (or close to 0)
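
A minimal sketch of automating this check, assuming the `SHOW SLAVE STATUS\G` output has been captured as text (the sample and the lag threshold are illustrative):

```python
# Parse `SHOW SLAVE STATUS\G` output and verify the three checks above.
def parse_slave_status(text: str) -> dict:
    """Turn the `Key: Value` lines of \\G output into a dict."""
    status = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            status[key.strip()] = value.strip()
    return status

def replication_healthy(status: dict, max_lag_seconds: int = 5) -> bool:
    """True if both threads run and lag is within the threshold."""
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and int(status.get("Seconds_Behind_Master", "-1")) <= max_lag_seconds
    )

sample = """
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
"""
print(replication_healthy(parse_slave_status(sample)))  # True
```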

Semi-Synchronous Replication

For stronger durability guarantees:

On the primary:

```sql
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 1000; -- 1 second
```

On the replica:

```sql
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;
```
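
To illustrate the semantics of `rpl_semi_sync_master_timeout` (this is a model, not the server's implementation): if no replica acknowledges a transaction within the timeout, the primary falls back to asynchronous replication rather than blocking commits indefinitely:

```python
# Model of the semi-sync timeout behaviour configured above.
def commit_mode(ack_delay_ms: float, timeout_ms: int = 1000) -> str:
    """Which durability mode a commit effectively ran under."""
    if ack_delay_ms <= timeout_ms:
        return "semi-sync"       # replica ACK arrived in time
    return "async-fallback"      # timed out; commit proceeds without ACK

print(commit_mode(30))    # semi-sync
print(commit_mode(2500))  # async-fallback
```

This is why semi-sync alone is not a zero-data-loss guarantee: under sustained replica outage the primary degrades to asynchronous mode.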

PostgreSQL Streaming Replication

Step 1: Configure Primary Server

Edit postgresql.conf:

```ini
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB
synchronous_commit = on
synchronous_standby_names = 'standby1' # For synchronous replication
```

Edit pg_hba.conf:

```
host replication replicator replica-ip/32 scram-sha-256
```

Create replication user:

```sql
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
```

Step 2: Create Base Backup

On the replica:

```bash
pg_basebackup -h primary-host -U replicator -D /var/lib/pgsql/data -Fp -Xs -P -R
```

The `-R` flag creates `standby.signal` and writes the replication connection settings into `postgresql.auto.conf`.

Step 3: Configure Replica

Edit postgresql.conf (if needed):

```ini
hot_standby = on
primary_conninfo = 'host=primary-host user=replicator password=secure_password'
```

Step 4: Start Replica and Verify

```bash
systemctl start postgresql
```

On the primary, check connected replicas:

```sql
SELECT * FROM pg_stat_replication;
```

On the replica, confirm it is in recovery (returns `t` on a standby):

```sql
SELECT pg_is_in_recovery();
```
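
Lag reported by `pg_stat_replication` can also be computed client-side. A sketch that mirrors `pg_wal_lsn_diff(sent_lsn, replay_lsn)` by decoding the `XXX/YYYYYYYY` LSN format (sample values are illustrative):

```python
# Compute replication lag in bytes from two PostgreSQL LSNs.
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN like '0/3000148' (two hex fields) to a byte offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)

def lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Equivalent of pg_wal_lsn_diff(sent_lsn, replay_lsn)."""
    return lsn_to_int(sent_lsn) - lsn_to_int(replay_lsn)

print(lag_bytes("0/3000148", "0/3000000"))  # 328
```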

Automatic Failover with Patroni

For production, use Patroni for automatic failover:

```yaml
# patroni.yml
scope: postgres-cluster
namespace: /service/
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: node1:8008

etcd:
  hosts: etcd1:2379,etcd2:2379,etcd3:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_wal_senders: 5
        max_replication_slots: 5

postgresql:
  listen: 0.0.0.0:5432
  connect_address: node1:5432
  data_dir: /var/lib/postgresql/data
  authentication:
    replication:
      username: replicator
      password: secure_password
    superuser:
      username: postgres
      password: admin_password
```
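
Patroni serves cluster state over the REST API configured above (port 8008). A sketch interpreting a node-status payload such as `GET /` returns; the exact field names and role values vary by Patroni version, so treat them as assumptions and check your deployment:

```python
# Interpret a Patroni-style node status payload (field names assumed).
def node_is_healthy_primary(status: dict) -> bool:
    """True if the node reports itself as a running primary."""
    return status.get("state") == "running" and status.get("role") in (
        "master",   # role name used by older Patroni releases
        "primary",  # role name used by newer releases
    )

sample = {"state": "running", "role": "primary", "timeline": 3}
print(node_is_healthy_primary(sample))  # True
```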

MySQL Replication in Kubernetes

Using Percona Operator

```yaml
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: mysql-cluster
spec:
  crVersion: 1.13.0
  secretsName: mysql-secrets
  pxc:
    size: 3
    image: percona/percona-xtradb-cluster:8.0
    resources:
      requests:
        memory: 1G
        cpu: 600m
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 100Gi
  haproxy:
    enabled: true
    size: 2
    image: percona/percona-xtradb-cluster-operator:1.13.0-haproxy
  proxysql:
    enabled: false
```

Connection String

```
mysql://user:password@mysql-cluster-haproxy:3306/database
```

PostgreSQL Replication in Kubernetes

Using CloudNativePG Operator

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
spec:
  instances: 3
  postgresql:
    parameters:
      max_connections: "300"
      shared_buffers: "256MB"
  storage:
    size: 100Gi
    storageClass: fast-ssd
  backup:
    barmanObjectStore:
      destinationPath: s3://backups/postgres
      s3Credentials:
        accessKeyId:
          name: s3-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: s3-creds
          key: SECRET_ACCESS_KEY
```
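
CloudNativePG creates per-cluster Services for routing: by convention the read-write Service is `<name>-rw` and the read-only one is `<name>-ro` (verify against your operator version). A sketch building DSNs for both, with a placeholder user:

```python
# Build DSNs for the CloudNativePG -rw / -ro Services (names assumed
# to follow the operator's convention; `app` user is a placeholder).
def cnpg_dsns(cluster: str, db: str, user: str = "app") -> dict:
    return {
        "rw": f"postgresql://{user}@{cluster}-rw:5432/{db}",  # primary only
        "ro": f"postgresql://{user}@{cluster}-ro:5432/{db}",  # replicas only
    }

print(cnpg_dsns("postgres-cluster", "myapp_production")["rw"])
# postgresql://app@postgres-cluster-rw:5432/myapp_production
```

Routing reads through the `-ro` Service is how the cluster above provides read scaling without application-side replica tracking.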

Replication Monitoring

Key Metrics to Monitor

  1. Replication Lag

```sql
-- MySQL
SHOW SLAVE STATUS\G
-- Check Seconds_Behind_Master

-- PostgreSQL
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS lag_bytes
FROM pg_stat_replication;
```

  2. Connection Status

```sql
-- MySQL: check Slave_IO_Running and Slave_SQL_Running
-- PostgreSQL: check the state column of pg_stat_replication
```

  3. Binary Log/WAL Position
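
These metrics reduce to simple threshold checks that application-side health probes can mirror. A minimal sketch; the threshold values are illustrative, not recommendations:

```python
# Threshold checks over the replication metrics above.
def lag_severity(seconds_behind: int) -> str:
    """Classify replication lag (thresholds illustrative)."""
    if seconds_behind > 300:
        return "critical"
    if seconds_behind > 60:
        return "warning"
    return "ok"

def replication_running(io_running: bool, sql_running: bool) -> str:
    """Both replication threads must run, otherwise page someone."""
    return "ok" if (io_running and sql_running) else "critical"

print(lag_severity(120))                 # warning
print(replication_running(True, False))  # critical
```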

Alerting Rules

```yaml
# Prometheus alerting rules
groups:
  - name: database-replication
    rules:
      - alert: ReplicationLagHigh
        expr: mysql_slave_seconds_behind_master > 60
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "MySQL replication lag is {{ $value }} seconds"
      - alert: ReplicationStopped
        expr: mysql_slave_sql_running == 0 or mysql_slave_io_running == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "MySQL replication has stopped"
```

Failover Procedures

Manual Failover (MySQL)

  1. Stop writes to the primary

```sql
SET GLOBAL read_only = ON;
FLUSH TABLES WITH READ LOCK;
```

  2. Ensure the replica is caught up

```sql
-- On the replica
SHOW SLAVE STATUS\G
-- Wait for Seconds_Behind_Master = 0
```

  3. Promote the replica

```sql
STOP SLAVE;
RESET SLAVE ALL;
SET GLOBAL read_only = OFF;
```

  4. Update application connection strings
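
The four steps above can be sketched as an ordered runbook. `run_sql` here is a hypothetical executor injected for illustration (in the sketch it just records what would be executed, and in what order):

```python
# Runbook sketch of the manual MySQL failover steps above.
def failover(run_sql) -> None:
    # 1. Stop writes on the old primary.
    run_sql("primary", "SET GLOBAL read_only = ON")
    run_sql("primary", "FLUSH TABLES WITH READ LOCK")
    # 2. (Operator waits for Seconds_Behind_Master = 0 on the replica.)
    # 3. Promote the replica.
    run_sql("replica", "STOP SLAVE")
    run_sql("replica", "RESET SLAVE ALL")
    run_sql("replica", "SET GLOBAL read_only = OFF")
    # 4. Repoint application connection strings (outside SQL).

executed = []
failover(lambda host, sql: executed.append((host, sql)))
print(len(executed))  # 5
```

Ordering matters: the old primary must reject writes before the replica is promoted, or the two can diverge.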

Manual Failover (PostgreSQL)

  1. Promote the standby

```bash
pg_ctl promote -D /var/lib/pgsql/data
# Or, in SQL: SELECT pg_promote();
```

  2. Reconfigure the old primary as a standby (if recovered)

Best Practices Checklist

  • [ ] Use GTID-based replication (MySQL) for easier failover
  • [ ] Configure monitoring for replication lag
  • [ ] Test failover procedures regularly
  • [ ] Use semi-synchronous replication for critical data
  • [ ] Keep binlogs/WAL retained long enough for recovery
  • [ ] Document and automate failover procedures
  • [ ] Use connection pooling (PgBouncer, ProxySQL)
  • [ ] Configure read replicas in application for read scaling
  • [ ] Back up both primary and replicas
  • [ ] Monitor disk space for logs and data
