Vector Clocks: Tracking Causality in Distributed Systems
Vector clocks are a mechanism for tracking the partial ordering of events in a distributed system. They allow you to determine whether two events are causally related (one happened before the other) or concurrent (independent and potentially conflicting). This is essential for conflict detection and resolution in systems without a single source of truth, such as replicated databases and collaborative applications.
The Problem: Ordering Events Without a Global Clock
In a distributed system, there is no global clock that all nodes can reliably reference. Physical clocks drift, network delays vary, and NTP synchronization has bounded accuracy. Logical clocks solve this by tracking causal ordering rather than wall-clock time.
Lamport Timestamps
Leslie Lamport's logical clock (1978) assigns a monotonically increasing counter to events. The rules are simple:
- Before each local event, increment the counter
- When sending a message, include the counter value
- When receiving a message, set the counter to max(local, received) + 1
class LamportClock:
def __init__(self):
self.time = 0
def local_event(self):
self.time += 1
return self.time
def send_event(self):
self.time += 1
return self.time # attach to message
def receive_event(self, received_time):
self.time = max(self.time, received_time) + 1
return self.time
Limitation: Lamport timestamps provide a total order but cannot distinguish between causal relationships and concurrency. If L(a) < L(b), we know b did not happen before a, but we cannot tell if a happened before b or if they are concurrent.
Vector Clocks
Vector clocks extend Lamport timestamps by maintaining a counter per node. A vector clock for a system with N nodes is an array of N counters, where the i-th counter represents the number of events known from node i.
class VectorClock:
def __init__(self, node_id, nodes):
self.node_id = node_id
self.clock = {node: 0 for node in nodes}
def local_event(self):
self.clock[self.node_id] += 1
return dict(self.clock)
def send_event(self):
self.clock[self.node_id] += 1
return dict(self.clock)
def receive_event(self, received_clock):
for node in self.clock:
self.clock[node] = max(self.clock[node],
received_clock.get(node, 0))
self.clock[self.node_id] += 1
return dict(self.clock)
@staticmethod
def compare(vc1, vc2):
"""Returns: 'before', 'after', 'concurrent', or 'equal'"""
all_nodes = set(vc1.keys()) | set(vc2.keys())
less = False
greater = False
for node in all_nodes:
v1 = vc1.get(node, 0)
v2 = vc2.get(node, 0)
if v1 < v2:
less = True
elif v1 > v2:
greater = True
if less and not greater:
return "before"
elif greater and not less:
return "after"
elif not less and not greater:
return "equal"
else:
return "concurrent"
Detecting Causality and Concurrency
Given two vector clocks VC(a) and VC(b):
| Condition | Meaning | Example |
|---|---|---|
| VC(a) < VC(b) | a happened before b (causally) | [1,0,0] < [1,1,0] |
| VC(a) > VC(b) | b happened before a | [2,1,0] > [1,1,0] |
| VC(a) || VC(b) | Concurrent — potential conflict | [2,0,0] || [0,1,0] |
| VC(a) = VC(b) | Same event | [1,1,1] = [1,1,1] |
Practical Example: Shopping Cart
# User adds items to cart from two different devices simultaneously
# Device A (through Node 1): Add "book"
# VC_A = {node1: 1, node2: 0, node3: 0}
# Cart_A = ["book"]
# Device B (through Node 2): Add "pen"
# VC_B = {node1: 0, node2: 1, node3: 0}
# Cart_B = ["pen"]
# These are concurrent: VC_A || VC_B
# Resolution: merge both versions
# Merged_Cart = ["book", "pen"]
# Merged_VC = {node1: 1, node2: 1, node3: 0}
def merge_carts(cart_a, vc_a, cart_b, vc_b):
relation = VectorClock.compare(vc_a, vc_b)
if relation == "before":
return cart_b, vc_b
elif relation == "after":
return cart_a, vc_a
elif relation == "concurrent":
# Conflict: merge items from both
merged_cart = list(set(cart_a) | set(cart_b))
merged_vc = {
node: max(vc_a.get(node, 0), vc_b.get(node, 0))
for node in set(vc_a) | set(vc_b)
}
return merged_cart, merged_vc
else:
return cart_a, vc_a
Conflict Resolution Strategies
| Strategy | Description | Use Case |
|---|---|---|
| Last-Writer-Wins (LWW) | Use physical timestamp to pick winner | Session data, caches |
| Application-Level Merge | Return all versions; app decides | Shopping carts (Amazon Dynamo) |
| CRDTs | Data structures that auto-merge | Counters, sets (see data sync) |
| Semantic Resolution | Domain-specific merge logic | Collaborative editing |
DynamoDB and Version Vectors
The original Amazon Dynamo paper (2007) used vector clocks for conflict detection. In practice, DynamoDB uses a simplified version called version vectors that tracks versions per node rather than per client. When concurrent writes are detected, DynamoDB can either use last-writer-wins (default) or return all conflicting versions to the application (conditional writes).
Scaling Challenges
Vector clocks grow linearly with the number of nodes. In a system with thousands of nodes, the vector clock size becomes impractical. Solutions include:
- Version vectors with pruning: Remove entries for nodes that have not been updated recently
- Dotted version vectors: An optimization that reduces the size while maintaining correctness
- Interval tree clocks: A dynamic alternative that does not require knowing all participants upfront
class PrunedVectorClock:
def __init__(self, node_id, max_entries=20):
self.node_id = node_id
self.clock = {}
self.max_entries = max_entries
self.timestamps = {} # last update time per entry
def prune(self):
if len(self.clock) <= self.max_entries:
return
# Remove oldest entries
sorted_entries = sorted(self.timestamps.items(),
key=lambda x: x[1])
to_remove = len(self.clock) - self.max_entries
for node, _ in sorted_entries[:to_remove]:
if node != self.node_id:
del self.clock[node]
del self.timestamps[node]
Vector clocks work alongside quorum systems for read/write consistency, gossip protocols for state dissemination, and data sync strategies for conflict resolution.
Frequently Asked Questions
Q: Why not just use physical timestamps instead of vector clocks?
Physical timestamps are unreliable in distributed systems due to clock skew and NTP inaccuracies. Two events might have the same timestamp, or a later event might get an earlier timestamp. Last-writer-wins with physical timestamps can silently lose data. Vector clocks provide a mathematically correct causal ordering.
Q: How are vector clocks different from Lamport timestamps?
Lamport timestamps are a single counter that provides a total order but cannot detect concurrency. If L(a) < L(b), you only know b did not happen before a. Vector clocks maintain per-node counters, allowing you to determine if events are causally related or concurrent — critical for correct conflict detection.
Q: Do modern systems still use vector clocks?
Many modern systems have moved to simpler alternatives. DynamoDB uses last-writer-wins for most cases. Cassandra uses timestamps with conflict-free data types. Riak uses dotted version vectors (an improvement on vector clocks). CRDTs eliminate the need for explicit conflict detection entirely. However, understanding vector clocks remains essential for system design interviews and for building custom distributed systems.
Q: What is the difference between vector clocks and version vectors?
Vector clocks track events (incremented on every operation), while version vectors track versions of data objects (incremented only on writes to a specific key). Version vectors are more space-efficient for key-value stores because they only need entries for nodes that have actually written to a particular key, not all nodes in the system.