Three-Phase Commit (3PC): Non-Blocking Distributed Transactions
Three-Phase Commit (3PC) extends the Two-Phase Commit (2PC) protocol to remove its blocking problem. By inserting a pre-commit phase between the voting and commit phases, 3PC ensures that, as long as failures are crashes rather than network partitions, no participant is left waiting in an uncertain state when the coordinator fails. Although 3PC is stronger than 2PC in theory, its own limitations have kept practical adoption low.
The Three Phases
| Phase | Coordinator Action | Participant Action | Purpose |
|---|---|---|---|
| 1. CanCommit | Sends canCommit? to all participants | Responds Yes or No | Lightweight vote — no locks yet |
| 2. PreCommit | Sends preCommit (if all Yes) or abort | Acquires locks, writes WAL, ACKs | Ensures all participants know the decision |
| 3. DoCommit | Sends doCommit | Commits and ACKs | Final commit |
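The phases in the table imply a small state machine on each participant. The following is a minimal sketch of it, assuming the state names used in this article's code (READY, PRE_COMMITTED, COMMITTED, ABORTED) plus a hypothetical INIT starting state. `State`, `ALLOWED`, and `advance` are illustrative names, not part of any library.

```python
from enum import Enum


class State(Enum):
    INIT = "INIT"                    # hypothetical starting state, before any vote
    READY = "READY"                  # voted Yes in CanCommit
    PRE_COMMITTED = "PRE_COMMITTED"  # received preCommit and ACKed it
    COMMITTED = "COMMITTED"
    ABORTED = "ABORTED"


# Legal transitions for one participant, as implied by the phase table.
ALLOWED = {
    State.INIT: {State.READY, State.ABORTED},
    State.READY: {State.PRE_COMMITTED, State.ABORTED},
    State.PRE_COMMITTED: {State.COMMITTED, State.ABORTED},
    State.COMMITTED: set(),  # terminal
    State.ABORTED: set(),    # terminal
}


def advance(current: State, target: State) -> State:
    """Return target if the transition is legal, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


# Happy path: one step per phase.
state = advance(State.INIT, State.READY)
state = advance(state, State.PRE_COMMITTED)
state = advance(state, State.COMMITTED)
```

Keeping the transitions in one table makes it easy to reject out-of-order messages, such as a doCommit that arrives before any preCommit.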
Implementation
class ThreePhaseCoordinator:
    # TransactionLog and generate_transaction_id() are assumed to be defined elsewhere.
    def __init__(self, participants, timeout=30):
        self.participants = participants
        self.timeout = timeout
        self.transaction_log = TransactionLog()
        self.retry_queue = []  # (participant_id, tx_id, action) tuples awaiting redelivery

    def execute(self, transaction):
        tx_id = generate_transaction_id()

        # Phase 1: CanCommit
        can_commit_votes = {}
        for p in self.participants:
            try:
                vote = p.can_commit(tx_id, transaction, timeout=self.timeout)
                can_commit_votes[p.id] = vote
            except TimeoutError:
                # A missing vote counts as a No
                can_commit_votes[p.id] = "NO"
        if not all(v == "YES" for v in can_commit_votes.values()):
            self._abort_all(tx_id)
            return "ABORTED"

        # Phase 2: PreCommit
        self.transaction_log.write("PRE_COMMIT", tx_id)
        pre_commit_acks = {}
        for p in self.participants:
            try:
                ack = p.pre_commit(tx_id, timeout=self.timeout)
                pre_commit_acks[p.id] = ack
            except TimeoutError:
                pre_commit_acks[p.id] = "TIMEOUT"
        if not all(v == "ACK" for v in pre_commit_acks.values()):
            # Caution: participants already in PRE_COMMITTED may commit on their own
            # timeout, so this abort must reach them before that timeout fires.
            self._abort_all(tx_id)
            return "ABORTED"

        # Phase 3: DoCommit
        self.transaction_log.write("DO_COMMIT", tx_id)
        for p in self.participants:
            try:
                p.do_commit(tx_id, timeout=self.timeout)
            except TimeoutError:
                # The decision is already logged; redeliver doCommit later
                self.retry_queue.append((p.id, tx_id, "COMMIT"))
        return "COMMITTED"

    def _abort_all(self, tx_id):
        self.transaction_log.write("ABORT", tx_id)
        for p in self.participants:
            try:
                p.abort(tx_id)
            except Exception:
                # Best effort: unreachable participants abort on their own timeout
                pass
class ThreePhaseParticipant:
    def __init__(self, participant_id, database):
        self.id = participant_id
        self.db = database
        self.state = {}  # tx_id -> state

    def can_commit(self, tx_id, transaction, timeout):
        # Lightweight check: can we do this transaction?
        if self.db.can_execute(transaction):
            self.state[tx_id] = "READY"
            return "YES"
        return "NO"

    def pre_commit(self, tx_id, timeout):
        # Acquire locks, write to WAL
        self.db.begin_transaction()
        self.db.execute_and_prepare(tx_id)
        self.state[tx_id] = "PRE_COMMITTED"
        return "ACK"

    def do_commit(self, tx_id, timeout):
        # Idempotent: a retried doCommit may arrive after a timeout-driven commit
        if self.state.get(tx_id) != "COMMITTED":
            self.db.commit_prepared(tx_id)
            self.state[tx_id] = "COMMITTED"
        return "ACK"

    def abort(self, tx_id):
        current = self.state.get(tx_id)
        if current == "COMMITTED":
            return  # Too late to abort; never overwrite a committed state
        if current == "PRE_COMMITTED":
            self.db.rollback_prepared(tx_id)
        self.state[tx_id] = "ABORTED"
How 3PC Eliminates Blocking
The key insight is the pre-commit phase. In 2PC, a participant that has voted Yes does not know whether the coordinator decided to commit or abort, so it has to wait. In 3PC, the preCommit message tells every participant that everyone voted Yes and the transaction is going to commit. Assuming crash-only failures, this means:
- If a participant is in the PRE_COMMITTED state and the coordinator fails, it can safely commit (because it knows all participants agreed)
- If a participant is in the READY state (voted yes, but no pre-commit received) and the coordinator fails, it can abort (because it never acknowledged the pre-commit, so the coordinator cannot have sent doCommit to anyone)
- Participants can use timeouts to make unilateral decisions instead of blocking indefinitely
Timeout-Based Recovery
class ThreePhaseParticipantWithRecovery(ThreePhaseParticipant):
    # Inherits state, abort() and do_commit() from ThreePhaseParticipant
    def handle_coordinator_timeout(self, tx_id):
        current_state = self.state.get(tx_id)
        if current_state == "READY":
            # Voted yes, but no preCommit received
            # Safe to abort: coordinator may have aborted
            self.abort(tx_id)
        elif current_state == "PRE_COMMITTED":
            # PreCommit received, waiting for doCommit
            # Safe to commit: all participants agreed
            self.do_commit(tx_id, timeout=None)
        elif current_state is None:
            # Never participated, ignore
            pass
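Independent timeouts leave one gap: if the coordinator crashes partway through sending preCommit, some participants are READY while others are PRE_COMMITTED, and the two rules above point in opposite directions. Descriptions of 3PC commonly pair the timeouts with a termination step in which the survivors pick a recovery coordinator that looks at everyone's state before deciding. The following is a hedged sketch of that decision rule only, assuming crash-only failures and the state strings used above. `decide_outcome` is a hypothetical name, and electing the recovery coordinator is out of scope.

```python
def decide_outcome(states):
    """Pick one outcome from the states reported by the surviving participants.

    Assumes crash-only failures, so every survivor is reachable.
    """
    if "COMMITTED" in states:
        # Someone already committed, so everyone else must follow.
        return "COMMIT"
    if "ABORTED" in states:
        # Someone already aborted, so everyone else must follow.
        return "ABORT"
    if "PRE_COMMITTED" in states:
        # At least one survivor saw preCommit, so all votes were Yes.
        return "COMMIT"
    # Nobody saw preCommit, so nobody can have committed.
    return "ABORT"


print(decide_outcome(["READY", "PRE_COMMITTED", "READY"]))  # COMMIT
print(decide_outcome(["READY", "READY"]))                   # ABORT
```

Because every survivor applies the same outcome, a mix of READY and PRE_COMMITTED states no longer splits the group.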
Limitations of 3PC
Despite solving the blocking problem, 3PC has significant limitations:
| Limitation | Description |
|---|---|
| Network Partitions | 3PC assumes fail-stop (crash) failures, not network partitions. During a partition, nodes on different sides may make different decisions. |
| Extra Round Trip | 3 round trips instead of 2, adding latency to every transaction. |
| Complexity | More states to manage, more failure scenarios to handle. |
| Rarely Implemented | Most systems prefer Paxos/Raft-based approaches or Saga patterns instead. |
3PC vs 2PC Detailed Comparison
| Aspect | 2PC | 3PC |
|---|---|---|
| Phases | Prepare + Commit | CanCommit + PreCommit + DoCommit |
| Blocking | Yes (on coordinator failure) | No (timeout-based recovery) |
| Tolerates Network Partitions | No | No |
| Round Trips | 2 | 3 |
| Message Complexity | O(N) | O(N) |
| Practical Adoption | Widespread (XA, databases) | Rare |
| Safety Under Partitions | Blocks (safe but unavailable) | May diverge (unsafe) |
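The round-trip and message-complexity rows can be made concrete with a little arithmetic. This sketch assumes one request and one reply per participant per phase, with no retries or failures. `messages_per_transaction` is an illustrative helper.

```python
def messages_per_transaction(participants: int, phases: int) -> int:
    """Each phase costs one request plus one reply per participant."""
    return 2 * participants * phases


for n in (3, 10):
    two_pc = messages_per_transaction(n, phases=2)
    three_pc = messages_per_transaction(n, phases=3)
    print(f"N={n}: 2PC={two_pc} messages, 3PC={three_pc} messages")
# N=3: 2PC=12 messages, 3PC=18 messages
# N=10: 2PC=40 messages, 3PC=60 messages
```

Both counts grow linearly with N, which is why the table lists O(N) for each protocol. Under these assumptions 3PC costs a constant 1.5 times as many messages.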
Network Partition Problem
# Scenario: network partition during Phase 2
#   Partition A: Coordinator + P1 (received preCommit)
#   Partition B: P2, P3 (did NOT receive preCommit)
#
# In Partition A:
#   Coordinator cannot reach P2, P3, so it never collects every ACK and never sends doCommit
#   P1 hears nothing further before its own timeout
#   (for example, the coordinator crashes or its abort arrives too late)
#   P1 is in PRE_COMMITTED state -> timeout -> COMMIT
#
# In Partition B:
#   P2 and P3 time out waiting for preCommit
#   They are in READY state -> timeout -> ABORT
#
# Result: P1 committed, P2 and P3 aborted -> INCONSISTENCY
# This is why 3PC is unsafe under network partitions
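The divergence can be reproduced with a toy model. This sketch assumes each participant ends up applying the timeout rule from the recovery section on its own; P1 could equally be told to commit directly, and the result would be the same. `on_timeout` is an illustrative name, and nothing here models real networking.

```python
def on_timeout(state: str) -> str:
    """Timeout rule from the recovery section: READY aborts, PRE_COMMITTED commits."""
    if state == "READY":
        return "ABORTED"
    if state == "PRE_COMMITTED":
        return "COMMITTED"
    return state


# State of each participant at the moment the partition cuts it off.
states = {"P1": "PRE_COMMITTED", "P2": "READY", "P3": "READY"}

# Each participant decides alone, without seeing anyone else's state.
outcomes = {name: on_timeout(state) for name, state in states.items()}

print(outcomes)
# {'P1': 'COMMITTED', 'P2': 'ABORTED', 'P3': 'ABORTED'}
print("consistent:", len(set(outcomes.values())) == 1)
# consistent: False
```

Every participant follows its rule correctly, yet the group ends in two different outcomes. The failure comes from deciding without knowledge of the other side, not from a bug in any single node.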
Modern Alternatives
Due to 3PC's partition vulnerability, modern systems use other approaches:
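- Paxos Commit: Replaces the single coordinator's decision with Paxos consensus across several acceptors, so the outcome stays consistent under both crashes and partitions and progress continues whenever a majority is reachable
- Raft-based transactions: CockroachDB and TiDB replicate data with Raft and layer a 2PC-style commit protocol on top of the replicated groups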
- Saga pattern: Avoids distributed transactions entirely using compensating actions (see distributed transactions)
For the foundational protocol, see our 2PC guide. For practical alternatives used in microservices, see distributed transaction patterns.
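The reason consensus-based approaches stay consistent under partitions comes down to majority quorums. The sketch below shows only that counting argument and is not an implementation of Paxos or Raft. `can_decide` and the acceptor counts are assumptions chosen for illustration.

```python
def can_decide(reachable: int, total: int) -> bool:
    """A side of a partition may decide only if it holds a strict majority."""
    return reachable > total // 2


total_acceptors = 3
side_a, side_b = 1, 2  # a partition splits the three acceptors 1 vs 2

print(can_decide(side_a, total_acceptors))  # False: the minority side must wait
print(can_decide(side_b, total_acceptors))  # True: the majority side may proceed
```

Two disjoint sides cannot both hold a strict majority, so at most one side ever decides. The minority side blocks instead of guessing, which trades availability on that side for consistency.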
Frequently Asked Questions
Q: If 3PC solves the blocking problem, why is it not widely used?
Because 3PC is only correct under the fail-stop model, where nodes may crash but the network keeps delivering messages between the survivors. Real networks do partition, and during a partition 3PC can leave participants with different outcomes. Consensus protocols like Paxos and Raft stay consistent under both crashes and partitions, which makes them the preferred choice for modern systems.
Q: Is 3PC faster or slower than 2PC?
3PC is slower due to the extra round trip. In the normal (no failure) case, 2PC requires 2 round trips while 3PC requires 3. The benefit of 3PC only manifests during coordinator failures, which are relatively rare. This performance cost for a rare benefit is another reason 3PC is rarely adopted.
Q: Can 3PC be combined with Paxos?
Atomic commit can be combined with consensus, and Paxos Commit is the best-known example, although it builds on 2PC rather than 3PC. It replaces the single coordinator's decision with Paxos run across a set of acceptors, which gives both non-blocking behavior and consistency under partitions. Google Spanner uses a related approach: 2PC across participant groups, where each group (including the one acting as coordinator) is replicated with Paxos.