Database Replication: Ensuring High Availability and Data Durability
Database replication is the process of copying data from one database server to one or more other servers in real time. It is the foundation of high availability, disaster recovery, and read scalability. This guide covers replication architectures, synchronous versus asynchronous replication, conflict resolution, and failover strategies.
Why Replication Matters
Every production database needs replication. A single database server is a single point of failure — if it crashes, your entire application goes down and you risk data loss. Replication provides three critical benefits:
- High Availability: If the primary server fails, a replica can take over (failover)
- Data Durability: Data exists on multiple machines, surviving hardware failures
- Read Scalability: Read traffic can be distributed across replicas (see read replicas)
Master-Slave (Primary-Replica) Replication
In master-slave replication, one server (the primary/master) handles all write operations. Changes are then propagated to one or more replicas (slaves). Replicas serve read-only queries. This is the most common replication topology.
All writes go to the primary, and the primary streams changes to replicas via a replication log (WAL in PostgreSQL, binlog in MySQL). Replicas apply these changes to stay in sync.
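The log-shipping mechanism can be sketched in miniature (plain Python, no real database; class and field names are mine for illustration): the primary appends every write to an ordered log, and a replica replays unseen entries in order, tracking its position much like an LSN or binlog offset.

```python
# Toy sketch of log-based (primary-replica) replication.
# The primary records every write in an ordered change log, analogous
# to PostgreSQL's WAL or MySQL's binlog; replicas replay that log.

class Primary:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered change log (the "replication log")

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))  # every write is logged

class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0  # position in the primary's log (like an LSN)

    def catch_up(self, primary):
        # Apply only the log entries we have not seen yet, in order.
        for key, value in primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.log)

primary, replica = Primary(), Replica()
primary.write("user:1", "alice")
primary.write("user:2", "bob")
replica.catch_up(primary)  # replica replays the log and converges
assert replica.data == primary.data
```

Any entries written between two `catch_up` calls are exactly the replication lag discussed later: the replica is correct but behind.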
-- PostgreSQL: Configure streaming replication
-- On Primary (postgresql.conf):
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1024   # MB of WAL to retain (postgresql.conf uses '#' comments)
-- Create replication user
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
-- On Primary (pg_hba.conf): Allow replica to connect
-- host replication replicator replica_ip/32 md5
-- On Replica: Clone the primary's data directory
-- pg_basebackup -h primary_host -D /var/lib/postgresql/data -U replicator -P -R
-- The -R flag writes primary_conninfo and creates standby.signal,
-- which tells PostgreSQL (12+) to start as a standby.
-- On Replica (postgresql.conf), if not using -R:
primary_conninfo = 'host=primary_host port=5432 user=replicator password=secure_password'
-- Verify replication status on primary
SELECT client_addr, state, sent_lsn, write_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;
MySQL Replication Setup
-- MySQL: Configure binary log replication
-- On Primary (my.cnf):
-- server-id = 1
-- log_bin = /var/log/mysql/mysql-bin.log
-- binlog_format = ROW
-- Create replication user on primary
CREATE USER 'replicator'@'%' IDENTIFIED BY 'secure_password';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;
-- Get binary log position
SHOW MASTER STATUS;
-- +------------------+----------+
-- | File | Position |
-- +------------------+----------+
-- | mysql-bin.000003 | 73485 |
-- +------------------+----------+
-- On Replica (MySQL 8.0.23+ renames this to CHANGE REPLICATION SOURCE TO
-- and START REPLICA; the classic syntax below still works):
CHANGE MASTER TO
MASTER_HOST='primary_host',
MASTER_USER='replicator',
MASTER_PASSWORD='secure_password',
MASTER_LOG_FILE='mysql-bin.000003',
MASTER_LOG_POS=73485;
START SLAVE;
SHOW SLAVE STATUS\G
Master-Master (Multi-Primary) Replication
In master-master replication, two or more servers accept writes and replicate to each other. This provides write availability in multiple regions and eliminates the single primary bottleneck. However, it introduces the challenge of write conflicts.
Master-master replication is used when you need writes in multiple geographic regions (e.g., US and EU data centers) or when you cannot tolerate any write downtime during failover. It requires careful conflict resolution strategies.
| Topology | Writes | Conflicts | Complexity | Use Case |
|---|---|---|---|---|
| Master-Slave | Single primary | None | Low | Most applications |
| Master-Master | Multiple primaries | Possible | High | Multi-region writes |
| Chain Replication | Single primary | None | Medium | Reduced primary load |
Synchronous vs Asynchronous Replication
Asynchronous Replication
In async replication, the primary commits a transaction and immediately acknowledges it to the client without waiting for replicas to confirm. Replicas receive and apply changes later. This is the default in most databases because it offers the lowest write latency.
Risk: If the primary crashes before a replica receives the latest changes, those changes are lost. The window of potential data loss is called the replication lag.
Synchronous Replication
In sync replication, the primary waits for at least one replica to confirm it has written the data before acknowledging the transaction. This guarantees zero data loss but increases write latency (every write must wait for a network round trip to the replica).
-- PostgreSQL: Synchronous replication
-- On Primary (postgresql.conf):
synchronous_standby_names = 'FIRST 1 (replica1, replica2)'
synchronous_commit = on
-- Levels of synchronous_commit:
-- off          : No durability guarantee (fastest writes)
-- local        : Wait for the primary's WAL flush only (no replica wait)
-- remote_write : Replica received the data in its OS buffer
-- on           : Replica flushed the data to its WAL (durable; the default)
-- remote_apply : Replica applied the change (readable on the replica)
Semi-Synchronous Replication
Semi-synchronous is a middle ground: the primary waits for at least one replica to acknowledge receipt (not application) of the data. MySQL supports semi-synchronous replication natively. It provides better durability than async without the full latency cost of sync.
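The trade-off between the three modes comes down to how many replica acknowledgements a commit waits for before returning to the client. A minimal sketch (illustrative Python, not any database's actual protocol; the mode names and `Replica` class are mine):

```python
# Illustrative sketch of async vs semi-sync vs sync commit
# acknowledgement. Returns how many copies are guaranteed durable
# at the moment the client gets its acknowledgement.

class Replica:
    def __init__(self):
        self.log, self.pending = [], []
    def receive(self, write):
        self.log.append(write)  # synchronous confirmation of receipt

def commit(write, replicas, mode="async"):
    durable_on = 1  # the primary itself
    if mode == "async":
        # Acknowledge immediately; replicas catch up later.
        for r in replicas:
            r.pending.append(write)
    elif mode == "semi-sync":
        # Wait for ONE replica to confirm receipt; the rest are async.
        replicas[0].receive(write)
        durable_on += 1
        for r in replicas[1:]:
            r.pending.append(write)
    elif mode == "sync":
        # Wait for every configured synchronous replica to confirm.
        for r in replicas:
            r.receive(write)
        durable_on += len(replicas)
    return durable_on

replicas = [Replica(), Replica()]
assert commit("tx1", replicas, mode="async") == 1      # fast, loss window
assert commit("tx2", replicas, mode="semi-sync") == 2  # one guaranteed copy
assert commit("tx3", replicas, mode="sync") == 3       # all copies durable
```

Each extra guaranteed copy costs one network round trip on the write path, which is exactly why async is the common default.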
Replication Lag
Replication lag is the delay between when data is written to the primary and when it appears on a replica. With async replication, this lag can range from milliseconds to seconds (or even minutes under heavy load).
Lag causes stale reads: a user writes data, then reads it from a replica and does not see their own change. This is called the read-after-write consistency problem.
Solutions for Replication Lag
- Read-after-write consistency: After a write, route the user's subsequent reads to the primary for a short window (e.g., 5 seconds)
- Monotonic reads: Ensure a user always reads from the same replica, preventing seeing older data after newer data
- Lag monitoring: Track replication lag and redirect reads to the primary when lag exceeds a threshold
- Causal consistency: Attach a logical timestamp to writes and only serve reads from replicas that have caught up to that timestamp
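The first two strategies can be sketched as a read router (illustrative Python; the 5-second window, class, and method names are assumptions for the example, not a real library API):

```python
import time

# Sketch of lag-tolerant read routing:
#  - read-after-write: a user who just wrote reads from the primary
#    for a short window, so they always see their own change;
#  - monotonic reads: otherwise a given user always hits the same
#    replica, so they never see data move backwards in time.

STICKY_WINDOW = 5.0  # seconds of primary reads after a write (tunable)

class ReadRouter:
    def __init__(self, primary, replicas):
        self.primary, self.replicas = primary, replicas
        self.last_write = {}  # user_id -> timestamp of last write

    def record_write(self, user_id, now=None):
        self.last_write[user_id] = time.time() if now is None else now

    def route_read(self, user_id, now=None):
        now = time.time() if now is None else now
        # Read-after-write: recent writers go to the primary.
        if now - self.last_write.get(user_id, -float("inf")) < STICKY_WINDOW:
            return self.primary
        # Monotonic reads: stable user -> replica mapping for this process.
        return self.replicas[hash(user_id) % len(self.replicas)]

router = ReadRouter("primary", ["replica-1", "replica-2"])
router.record_write("alice", now=100.0)
assert router.route_read("alice", now=102.0) == "primary"  # inside window
assert router.route_read("alice", now=110.0) != "primary"  # window expired
```

Production routers typically key the stickiness on a session token rather than a server-side timestamp table, but the decision logic is the same.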
-- Monitor replication lag in PostgreSQL
SELECT
client_addr,
state,
pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes,
replay_lag
FROM pg_stat_replication;
-- Monitor replication lag in MySQL
SHOW SLAVE STATUS\G
-- Look for: Seconds_Behind_Master
-- (MySQL 8.0.22+: SHOW REPLICA STATUS, field Seconds_Behind_Source)
Conflict Resolution in Multi-Primary Replication
When two primaries accept conflicting writes (e.g., both update the same row), the system must resolve the conflict. Common strategies include:
- Last-Write-Wins (LWW): The write with the latest timestamp wins. Simple but can lose data. Used by Cassandra.
- Vector Clocks: Track causal relationships between writes. Can detect concurrent writes but complex to implement. Used by DynamoDB (originally Dynamo).
- CRDTs (Conflict-free Replicated Data Types): Data structures that mathematically guarantee convergence without conflicts. Used by Redis Enterprise and Riak.
- Application-Level Resolution: Store both versions and let the application or user decide. Used by CouchDB.
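Last-write-wins, the simplest of these, can be sketched as a merge of timestamped versions (illustrative Python; real systems such as Cassandra track per-cell timestamps and more careful clock handling):

```python
# Sketch of last-write-wins conflict resolution between two primaries.
# Each value carries (timestamp, node_id, value); on conflict the later
# timestamp wins, and ties break on node_id so both sides converge
# to the same answer regardless of merge order.

def lww_merge(a, b):
    """Merge two {key: (timestamp, node_id, value)} replica states."""
    merged = dict(a)
    for key, version in b.items():
        if key not in merged or version[:2] > merged[key][:2]:
            merged[key] = version
    return merged

us = {"cart:9": (1001, "us-east", "2 items")}
eu = {"cart:9": (1005, "eu-west", "3 items")}  # concurrent, later write

# Both sides apply the same deterministic rule and converge...
assert lww_merge(us, eu) == lww_merge(eu, us)
# ...but the earlier write ("2 items") is silently discarded --
# exactly the data-loss hazard LWW is known for.
assert lww_merge(us, eu)["cart:9"][2] == "3 items"
```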
Failover Mechanisms
Automatic Failover
Automatic failover uses a monitoring system to detect primary failure and promote a replica. Tools like Patroni (PostgreSQL), Orchestrator (MySQL), and cloud services (AWS RDS Multi-AZ) handle this automatically.
The failover process involves: detecting the primary is down (typically via health checks with a timeout), selecting the most up-to-date replica, promoting that replica to primary, and reconfiguring other replicas and application connections to point to the new primary.
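The replica-selection step can be sketched by comparing replay positions (illustrative Python; real tools like Patroni compare WAL LSNs and also consult quorum, health history, and replication mode):

```python
# Sketch of the promotion decision in automatic failover: promote the
# healthy replica that has replayed the most of the replication log,
# since it loses the least data when it becomes the new primary.

def choose_new_primary(replicas):
    """replicas: list of (name, replayed_position, healthy) tuples."""
    candidates = [r for r in replicas if r[2]]  # skip unhealthy nodes
    if not candidates:
        raise RuntimeError("no healthy replica to promote")
    return max(candidates, key=lambda r: r[1])[0]

replicas = [
    ("replica-1", 73400, True),
    ("replica-2", 73485, True),   # most caught up and healthy
    ("replica-3", 73490, False),  # furthest ahead, but failed health check
]
assert choose_new_primary(replicas) == "replica-2"
```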
Manual Failover
Manual failover requires an operator to initiate the promotion. While slower, it avoids false positives where the monitoring system incorrectly believes the primary is down (split-brain scenarios).
-- PostgreSQL: Promote replica to primary
-- On the replica server (shell):
-- pg_ctl promote -D /var/lib/postgresql/data
-- Or promote from a SQL session on the replica (PostgreSQL 12+):
SELECT pg_promote();
-- MySQL: Stop replication and detach
STOP SLAVE;        -- STOP REPLICA in MySQL 8.0.22+
RESET SLAVE ALL;   -- RESET REPLICA ALL in MySQL 8.0.22+
-- This replica is now an independent primary
Split-Brain Problem
Split-brain occurs when network partitions cause both the old primary and the promoted replica to accept writes simultaneously. This creates conflicting data that is extremely difficult to reconcile. Solutions include fencing (STONITH — Shoot The Other Node In The Head), quorum-based decisions, and using a consensus protocol like Raft.
Replication in Distributed Systems
Modern distributed databases like CockroachDB and Spanner use consensus-based replication (Raft or Paxos) where a majority of replicas must agree before a write is committed. This provides strong consistency without the split-brain problem but at the cost of higher write latency.
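The majority rule can be sketched in a few lines (illustrative Python; real Raft/Paxos implementations also handle leader election, terms, and log matching):

```python
# Sketch of quorum-based commit: a write succeeds only if a strict
# majority of the cluster acknowledges it. Because two disjoint
# partitions cannot both contain a majority, at most one side of a
# network split can ever commit -- which is what rules out split-brain.

def quorum_commit(acks, cluster_size):
    """Commit iff a strict majority of the cluster acknowledged."""
    return acks >= cluster_size // 2 + 1

# 5-node cluster: majority is 3.
assert quorum_commit(3, 5) is True
assert quorum_commit(2, 5) is False  # minority partition cannot commit
# A 3/2 network split: only the 3-node side can reach quorum.
assert quorum_commit(3, 5) and not quorum_commit(2, 5)
```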
For more on scaling database reads, see our guide on read replicas. For understanding the consistency trade-offs, see ACID vs BASE.
Frequently Asked Questions
How much replication lag is acceptable?
It depends on your application. For e-commerce checkout, even a few seconds of lag can cause problems (user places order, refreshes, does not see it). For analytics dashboards, minutes of lag may be fine. Set up monitoring and alerting for replication lag, and have a strategy for routing critical reads to the primary.
Should I use synchronous or asynchronous replication?
Use asynchronous for most workloads — the latency impact of synchronous replication is significant. Use synchronous only when zero data loss is critical (financial transactions). A common middle ground is semi-synchronous: wait for one replica to acknowledge, but have additional async replicas for read scaling.
How many replicas should I have?
For high availability, you need at least one replica that can be promoted; make it synchronous if you cannot afford to lose recent writes during failover. For read scaling, add replicas based on read traffic. AWS RDS supports up to 15 read replicas. In practice, 2-3 replicas provide a good balance of availability, durability, and operational simplicity.
What is the difference between physical and logical replication?
Physical replication copies the exact bytes (WAL records) from primary to replica — the replica is a byte-for-byte copy. Logical replication copies the logical changes (INSERT, UPDATE, DELETE) and allows the replica to have different indexes, schema variations, or even different database versions. PostgreSQL supports both.
How does replication relate to sharding?
They solve different problems. Sharding distributes data across multiple servers for write scalability. Replication copies data to multiple servers for read scalability and durability. In practice, each shard has its own replicas — combining both techniques for a fully scalable, highly available architecture.
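The combination can be sketched as a two-level lookup: hash the key to a shard for writes, then pick a replica within that shard for reads (illustrative Python; the shard count, node names, and hash scheme are assumptions for the example):

```python
# Sketch of sharding + replication combined: each shard is its own
# small primary-replica cluster. Writes go to the shard's primary;
# reads can go to any replica of that same shard.

SHARDS = {
    0: {"primary": "shard0-primary", "replicas": ["shard0-r1", "shard0-r2"]},
    1: {"primary": "shard1-primary", "replicas": ["shard1-r1", "shard1-r2"]},
}

def shard_for(key):
    # Simple, deterministic hash-based sharding (real systems often
    # use ranges or consistent hashing to ease resharding).
    return sum(key.encode()) % len(SHARDS)

def route_write(key):
    return SHARDS[shard_for(key)]["primary"]

def route_read(key, session=0):
    replicas = SHARDS[shard_for(key)]["replicas"]
    return replicas[session % len(replicas)]

# A key's writes and reads always land in the same shard's cluster.
key = "user:42"
assert route_write(key).startswith(f"shard{shard_for(key)}")
assert route_read(key).startswith(f"shard{shard_for(key)}")
```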