Quorum Systems: Tunable Consistency in Distributed Databases

Quorum systems are the foundation of tunable consistency in distributed databases. A quorum is the minimum number of nodes that must participate in a read or write operation for it to be considered successful. By adjusting the quorum sizes for reads and writes, you can trade off between consistency, availability, and latency. This is the mechanism behind DynamoDB, Cassandra, and Riak's consistency models.

Understanding Read and Write Quorums

In a system with N replicas, a write quorum (W) is the number of replicas that must acknowledge a write, and a read quorum (R) is the number of replicas that must respond to a read. The fundamental rule for strong consistency is:

W + R > N

This guarantees that any read quorum overlaps with any write quorum — at least one node in every read set holds the latest acknowledged write. When this condition holds, the client can compare versions across the replies and return the most recent write.

Configuration (N=3)   W   R   Consistency                  Trade-off
Strong Consistency    2   2   Strong (2+2 > 3)             Balanced reads and writes
Write-heavy           1   3   Strong (1+3 > 3)             Fast writes, slow reads
Read-heavy            3   1   Strong (3+1 > 3)             Fast reads, slow writes
Eventual              1   1   Eventual (1+1 = 2, not > 3)  Fast but potentially stale
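The overlap guarantee behind the table can be checked by brute force: when W + R > N, every possible write quorum shares at least one node with every possible read quorum, and when W + R <= N, disjoint quorums exist. A minimal sketch (the function name is ours, for illustration):

```python
from itertools import combinations

def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Check by enumeration that every write quorum of size w
    intersects every read quorum of size r among n replicas."""
    nodes = range(n)
    return all(
        set(ws) & set(rs)  # non-empty intersection is truthy
        for ws in combinations(nodes, w)
        for rs in combinations(nodes, r)
    )

# W + R > N guarantees overlap; W + R <= N does not.
assert quorums_overlap(3, 2, 2)      # 2 + 2 > 3: quorums always intersect
assert not quorums_overlap(3, 1, 1)  # 1 + 1 <= 3: disjoint quorums exist
```

This is exactly the pigeonhole argument: W + R nodes cannot be chosen from N without reuse when W + R > N.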

Implementation Example

import asyncio
import time
from dataclasses import dataclass
from typing import Any, List, Optional

class ConsistencyError(Exception):
    """Raised when a read or write quorum cannot be reached."""

@dataclass
class VersionedValue:
    value: Any
    version: int
    timestamp: float
    replica: str  # which replica served this value (needed for read repair)

class QuorumClient:
    def __init__(self, replicas: List[str], n: int, w: int, r: int):
        self.replicas = replicas
        self.n = n
        self.w = w
        self.r = r
        assert w + r > n, "W + R must be > N for strong consistency"

    async def quorum_write(self, key: str, value: Any) -> bool:
        version = int(time.time() * 1000)
        tasks = [
            self._write_to_replica(replica, key, value, version)
            for replica in self.replicas[:self.n]
        ]
        results = await asyncio.gather(*tasks, return_exceptions=True)
        ack_count = sum(1 for res in results if res is True)
        return ack_count >= self.w

    async def quorum_read(self, key: str) -> Optional[Any]:
        tasks = [
            self._read_from_replica(replica, key)
            for replica in self.replicas[:self.n]
        ]
        results = await asyncio.gather(*tasks, return_exceptions=True)
        valid_results = [res for res in results
                         if isinstance(res, VersionedValue)]

        if len(valid_results) < self.r:
            raise ConsistencyError("Could not reach read quorum")

        # Return the value with the highest version
        latest = max(valid_results, key=lambda x: x.version)

        # Read repair: update stale replicas in the background
        asyncio.create_task(
            self._read_repair(key, latest, valid_results)
        )
        return latest.value

    async def _read_repair(self, key, latest, results):
        for result in results:
            if result.version < latest.version:
                await self._write_to_replica(
                    result.replica, key, latest.value, latest.version
                )

    # _write_to_replica and _read_from_replica are transport stubs:
    # in a real client they would issue RPCs to the storage nodes.

Sloppy Quorum

A strict quorum requires responses from the designated replicas for a key. A sloppy quorum allows any N nodes to participate, not just the designated ones. This improves availability during network partitions — if a designated node is unreachable, a temporary stand-in accepts the write.

class SloppyQuorumClient:
    def __init__(self, all_nodes, preference_list_size=5, w=2, r=2):
        self.all_nodes = all_nodes
        self.preference_list_size = preference_list_size
        self.w = w
        self.r = r

    def primary_nodes(self, key):
        # Nodes that should own this key (via consistent hashing)
        return self.consistent_hash.get_nodes(key, count=3)

    def get_preference_list(self, key):
        primary = self.primary_nodes(key)
        # Fallback nodes extend the list so writes can still reach
        # W nodes when primaries are unreachable
        fallbacks = [n for n in self.all_nodes if n not in primary]
        return primary + fallbacks[:self.preference_list_size - len(primary)]

    async def sloppy_write(self, key, value):
        preference_list = self.get_preference_list(key)
        primaries = set(self.primary_nodes(key))
        ack_count = 0
        hints = []  # (substitute_node, intended_target) pairs for handoff

        for node in preference_list:
            if ack_count >= self.w:
                break
            try:
                if node in primaries:
                    await write_to_node(node, key, value)
                else:
                    # Store as a hinted handoff on the substitute node
                    target = self.get_target_node(key, node)
                    await write_hint(node, key, value, target)
                    hints.append((node, target))
                ack_count += 1
            except ConnectionError:
                continue  # Try the next node in the preference list
        return ack_count >= self.w

Hinted Handoff

When a sloppy quorum writes to a substitute node, that node stores a "hint" indicating the intended recipient. When the intended node recovers, the hint is delivered and the data is transferred. This ensures data eventually reaches the correct replica.

class HintedHandoff:
    def __init__(self, node_id):
        self.node_id = node_id
        self.hints = {}  # target_node -> [(key, value, version)]

    def store_hint(self, target_node, key, value, version):
        self.hints.setdefault(target_node, []).append((key, value, version))

    async def deliver_hints(self):
        for target_node, items in list(self.hints.items()):
            try:
                for key, value, version in items:
                    await write_to_node(target_node, key, value, version)
                del self.hints[target_node]
            except ConnectionError:
                # Target still down; retry on the next cycle. Items already
                # delivered may be re-sent, but versioned writes make
                # redelivery idempotent.
                pass

DynamoDB Quorum Example

Amazon DynamoDB uses quorum-based replication across three availability zones. For a table with N=3 replicas:

  • Strongly consistent reads: R=2, reads from majority and returns latest
  • Eventually consistent reads: R=1, reads from any one replica (faster, cheaper)
  • Writes: W=2, write acknowledged after 2 of 3 replicas confirm

Sloppy quorum with hinted handoff comes from the original Dynamo design that underlies DynamoDB; it preserves write durability during availability zone failures, and the system automatically transfers hinted data when the target zone recovers.
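The read-side trade-off above can be illustrated with a small simulation (this is our sketch, not DynamoDB's implementation): after a write acknowledged by W=2 of N=3 replicas, any R=2 sample must include an up-to-date replica, while an R=1 sample may land on the stale one.

```python
import random

def quorum_read(replicas, r):
    """Sample r replicas and return the highest-versioned value,
    mimicking how a quorum read resolves divergent replies."""
    sampled = random.sample(replicas, r)
    return max(sampled, key=lambda rep: rep["version"])["value"]

# N=3 replicas; the write (version 2) reached W=2 of them, one is stale.
replicas = [
    {"value": "new", "version": 2},
    {"value": "new", "version": 2},
    {"value": "old", "version": 1},
]

# R=2: every 2-of-3 sample overlaps the write set, so reads see "new".
assert all(quorum_read(replicas, 2) == "new" for _ in range(100))

# R=1: a single sample can hit the stale replica and return "old".
r1_results = {quorum_read(replicas, 1) for _ in range(1000)}
```

Over many R=1 reads, `r1_results` will almost surely contain both "new" and "old" — the staleness that eventually consistent reads accept in exchange for lower latency and cost.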

Anti-Entropy and Read Repair

Quorum systems use two mechanisms to keep replicas synchronized:

Mechanism        When It Runs        How It Works
Read Repair      On every read       When a read detects stale data, update the stale replica
Anti-Entropy     Background process  Merkle tree comparison to find and fix differences
Hinted Handoff   On node recovery    Transfer data stored by substitute nodes
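The Merkle comparison in the table can be sketched with a flat hash over sorted key/value pairs (a simplification we use for illustration; real anti-entropy hashes a tree of key ranges so replicas can descend only into subtrees whose hashes differ):

```python
import hashlib

def merkle_root(items: dict) -> str:
    """Hash all sorted (key, value) pairs into a single digest —
    a one-level stand-in for a Merkle tree root."""
    h = hashlib.sha256()
    for key in sorted(items):
        h.update(f"{key}={items[key]}".encode())
    return h.hexdigest()

replica_a = {"k1": "v1", "k2": "v2"}
replica_b = {"k1": "v1", "k2": "v2-stale"}

# Equal roots mean the replicas agree and no data needs to be exchanged;
# unequal roots trigger a finer-grained comparison and repair.
assert merkle_root(replica_a) == merkle_root(dict(replica_a))
assert merkle_root(replica_a) != merkle_root(replica_b)
```

The point of the tree structure is bandwidth: comparing roots costs one hash exchange, and only mismatched ranges are compared further.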

Quorum systems connect to consistent hashing for determining which nodes are replicas, vector clocks for detecting conflicts, and gossip protocols for membership management.

Frequently Asked Questions

Q: Does W + R > N guarantee linearizability?

No. W + R > N guarantees that every read quorum overlaps the write quorum of the most recent acknowledged write, but this alone does not provide linearizability: concurrent or partially failed writes can still let two reads observe values in an order inconsistent with real time. Achieving linearizability requires additional mechanisms, such as performing read repair synchronously before returning a result, or using consensus or leader-based replication.

Q: What happens when a quorum cannot be reached?

The operation fails. With strict quorum, if fewer than W nodes are available for a write, the write is rejected. This is the availability trade-off for consistency. Sloppy quorum mitigates this by allowing substitute nodes to participate.

Q: How do I choose between strict and sloppy quorum?

Use strict quorum when consistency is critical (financial transactions, inventory). Use sloppy quorum when availability is more important (shopping cart, user preferences). DynamoDB uses sloppy quorum for its core key-value operations, prioritizing availability.

Q: What is the relationship between quorums and the CAP theorem?

Quorums let you tune the CAP trade-off. With W + R > N, you get consistency but sacrifice availability during partitions (CP). With W + R <= N, you get availability but sacrifice consistency (AP). The partition tolerance dimension is not optional in distributed systems.
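This classification can be captured in a tiny helper (a hypothetical function of ours, shown only to make the rule concrete):

```python
def classify(n: int, w: int, r: int) -> str:
    """Classify a quorum configuration per the CAP trade-off:
    overlapping quorums favor consistency (CP); non-overlapping
    quorums favor availability (AP)."""
    return "CP" if w + r > n else "AP"

assert classify(3, 2, 2) == "CP"  # consistent, but blocks during partitions
assert classify(3, 1, 1) == "AP"  # stays available, may serve stale reads
```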
