Data Synchronization: Conflict Resolution and Offline-First Architectures
Data synchronization is the process of keeping data consistent across multiple replicas, devices, or data centers. In distributed systems, conflicts are inevitable — two users edit the same document, two data centers process the same update, or an offline device reconnects with stale data. This guide covers the core strategies for handling these conflicts, including CRDTs, operational transformation, version vectors, and offline-first architecture patterns.
Conflict Resolution Strategies
| Strategy | How It Works | Data Loss Risk | Use Case |
|---|---|---|---|
| Last-Writer-Wins (LWW) | Highest timestamp wins | High — concurrent writes are silently dropped | Caches, session data |
| Merge Function | Application-defined merge logic | None if merge is correct | Shopping carts, counters |
| CRDTs | Data structures that auto-merge | None | Counters, sets, collaborative editing |
| Operational Transform | Transform operations against concurrent ops | None | Real-time collaborative text editing |
| Multi-Value (Siblings) | Keep all versions, let client resolve | None — deferred to application | Amazon Dynamo shopping cart |
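To make the "merge function" row concrete, here is a minimal sketch of an application-defined merge for a shopping cart (item names and quantities are illustrative assumptions, not from any particular system):

```python
def merge_carts(cart_a, cart_b):
    """Merge two replicas of a shopping cart (item -> quantity).

    Taking the per-item maximum keeps anything added on either replica.
    As in Dynamo's cart, a concurrently deleted item can resurrect,
    which is the safer failure mode for a cart than losing an item.
    """
    merged = {}
    for item in cart_a.keys() | cart_b.keys():
        merged[item] = max(cart_a.get(item, 0), cart_b.get(item, 0))
    return merged

# Two replicas diverge while a user shops on phone and laptop
phone = {"book": 1, "pen": 2}
laptop = {"book": 1, "mug": 1}
print(merge_carts(phone, laptop))  # all three items survive the merge
```

The key design decision is per-item: `max` never loses an addition, at the cost of occasionally resurrecting a removal.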
CRDTs: Conflict-Free Replicated Data Types
CRDTs are data structures designed to be replicated across multiple nodes and merged automatically without conflicts. They guarantee that all replicas converge to the same state, regardless of the order in which operations are applied.
G-Counter (Grow-Only Counter)
```python
class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise max."""

    def __init__(self, node_id, nodes):
        self.node_id = node_id
        self.counts = {node: 0 for node in nodes}

    def increment(self, amount=1):
        self.counts[self.node_id] += amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Iterate the union of node sets so counts for nodes known
        # only to the other replica are not silently dropped
        for node in set(self.counts) | set(other.counts):
            self.counts[node] = max(self.counts.get(node, 0),
                                    other.counts.get(node, 0))

# Example: page view counter across 3 servers
counter_a = GCounter("A", ["A", "B", "C"])
counter_b = GCounter("B", ["A", "B", "C"])
counter_a.increment(5)  # A sees 5 views
counter_b.increment(3)  # B sees 3 views
counter_a.merge(counter_b)
print(counter_a.value())  # 8 — correct total
```
PN-Counter (Positive-Negative Counter)
```python
class PNCounter:
    def __init__(self, node_id, nodes):
        self.positive = GCounter(node_id, nodes)
        self.negative = GCounter(node_id, nodes)

    def increment(self, amount=1):
        self.positive.increment(amount)

    def decrement(self, amount=1):
        self.negative.increment(amount)

    def value(self):
        return self.positive.value() - self.negative.value()

    def merge(self, other):
        self.positive.merge(other.positive)
        self.negative.merge(other.negative)
```
LWW-Register (Last-Writer-Wins Register)
```python
class LWWRegister:
    def __init__(self, node_id):
        self.node_id = node_id  # breaks timestamp ties so replicas converge
        self.value = None
        self.timestamp = 0
        self.writer = ""

    def set(self, value, timestamp):
        if (timestamp, self.node_id) > (self.timestamp, self.writer):
            self.value = value
            self.timestamp = timestamp
            self.writer = self.node_id

    def merge(self, other):
        # Ordering by (timestamp, writer) is total, so two writes with
        # equal timestamps from different nodes cannot leave replicas
        # permanently diverged
        if (other.timestamp, other.writer) > (self.timestamp, self.writer):
            self.value = other.value
            self.timestamp = other.timestamp
            self.writer = other.writer
```
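The table above flags LWW's data-loss risk. A standalone two-line sketch (plain timestamp-value tuples, not the register class) makes it concrete:

```python
# Standalone sketch: each write is a (timestamp, value) tuple,
# and LWW keeps the tuple with the larger timestamp
def lww_merge(a, b):
    return max(a, b)  # tuples compare by timestamp first

replica1 = (1002, "alice@new.example")         # Alice updates her email
replica2 = (1001, "alice@old.example")         # concurrent write, older clock
print(lww_merge(replica1, replica2))           # the 1001 write is silently lost
```

If replica 2's clock had been skewed ahead, its stale value would have won instead: LWW's correctness depends entirely on clock quality.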
OR-Set (Observed-Remove Set)
```python
import uuid

class ORSet:
    def __init__(self):
        self.elements = {}    # element -> set of unique add tags
        self.removed = set()  # tags whose removal has been observed

    def add(self, element):
        # Each add gets a fresh tag, so a concurrent re-add (with a tag
        # this replica has not seen) survives a remove after merging
        self.elements.setdefault(element, set()).add(str(uuid.uuid4()))

    def remove(self, element):
        # Tombstone only the tags observed locally; deleting the entry
        # outright would let a later merge resurrect the element
        self.removed |= self.elements.get(element, set())

    def _live_tags(self, element):
        return self.elements.get(element, set()) - self.removed

    def contains(self, element):
        return bool(self._live_tags(element))

    def value(self):
        return {e for e in self.elements if self._live_tags(e)}

    def merge(self, other):
        for element, tags in other.elements.items():
            self.elements.setdefault(element, set()).update(tags)
        self.removed |= other.removed
```
Operational Transformation (OT)
OT is the algorithm behind Google Docs' real-time collaborative editing. When two users make concurrent edits to the same document, OT transforms one operation against the other to preserve both users' intentions.
```python
def transform(op1, op2):
    """Transform op1 against op2 for text editing.

    Both ops are concurrent; adjust op1 as if op2 had been applied first.
    Inserts carry {"type", "position", "text", "user_id"};
    deletes carry {"type", "position", "count"}.
    """
    if op1["type"] == "insert" and op2["type"] == "insert":
        if op1["position"] < op2["position"]:
            return op1  # op1 is before op2, no change
        elif op1["position"] > op2["position"]:
            return {**op1, "position": op1["position"] + len(op2["text"])}
        else:
            # Same position: break the tie deterministically by user ID
            if op1["user_id"] < op2["user_id"]:
                return op1
            return {**op1, "position": op1["position"] + len(op2["text"])}
    elif op1["type"] == "insert" and op2["type"] == "delete":
        if op1["position"] <= op2["position"]:
            return op1
        # Clamp: an insert that fell inside the deleted range
        # lands at the start of that range, not before it
        return {**op1, "position": max(op2["position"],
                                       op1["position"] - op2["count"])}
    elif op1["type"] == "delete" and op2["type"] == "insert":
        if op1["position"] < op2["position"]:
            return op1
        return {**op1, "position": op1["position"] + len(op2["text"])}
    elif op1["type"] == "delete" and op2["type"] == "delete":
        if op1["position"] >= op2["position"] + op2["count"]:
            return {**op1, "position": op1["position"] - op2["count"]}
        if op1["position"] + op1["count"] <= op2["position"]:
            return op1
        # Overlapping deletes: drop the characters op2 already removed
        start = max(op1["position"], op2["position"])
        end = min(op1["position"] + op1["count"],
                  op2["position"] + op2["count"])
        return {**op1,
                "position": min(op1["position"], op2["position"]),
                "count": op1["count"] - (end - start)}
```
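To see why the transform matters, here is a self-contained convergence demo for the insert/insert case (a condensed version of the rule above, with a minimal `apply_op` helper added for illustration):

```python
def apply_op(text, op):
    # Apply a single insert operation to a string (inserts only, for brevity)
    return text[:op["position"]] + op["text"] + text[op["position"]:]

def transform_insert(op1, op2):
    # Insert-vs-insert case only, matching the tie-break rule above
    if op1["position"] < op2["position"] or (
            op1["position"] == op2["position"]
            and op1["user_id"] < op2["user_id"]):
        return op1
    return {**op1, "position": op1["position"] + len(op2["text"])}

doc = "helo"
a = {"type": "insert", "position": 3, "text": "l", "user_id": "alice"}
b = {"type": "insert", "position": 4, "text": "!", "user_id": "bob"}

# Site 1 applies a first, then b transformed against a
site1 = apply_op(apply_op(doc, a), transform_insert(b, a))
# Site 2 applies b first, then a transformed against b
site2 = apply_op(apply_op(doc, b), transform_insert(a, b))
assert site1 == site2 == "hello!"  # both sites converge
```

Without the transform, site 1 would apply b at its original position 4 and the exclamation mark would land inside the word.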
Version Vectors for Conflict Detection
Version vectors (a practical variant of vector clocks) track the causal history of each data item. When two replicas have diverged, version vectors detect the conflict so the appropriate resolution strategy can be applied.
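A minimal sketch of the comparison, assuming version vectors are plain node-to-counter dicts: one history dominates the other when it is ahead on some node and behind on none; ahead on both sides means the updates were concurrent and a conflict exists.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors (node -> counter dicts).

    Returns "equal", "a_dominates", "b_dominates", or "concurrent";
    "concurrent" signals a conflict that needs resolution.
    """
    nodes = vv_a.keys() | vv_b.keys()
    a_ahead = any(vv_a.get(n, 0) > vv_b.get(n, 0) for n in nodes)
    b_ahead = any(vv_b.get(n, 0) > vv_a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "a_dominates"
    if b_ahead:
        return "b_dominates"
    return "equal"

# Replicas A and B both updated independently after version {A: 1}
print(compare({"A": 2, "B": 0}, {"A": 1, "B": 1}))  # concurrent
print(compare({"A": 2, "B": 1}, {"A": 1, "B": 1}))  # a_dominates
```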
Offline-First Architecture
Offline-first applications work without a network connection and sync when connectivity is restored. Key principles:
- Local-first: All reads and writes go to local storage first
- Background sync: Changes are synced to the server when online
- Conflict resolution: Use CRDTs or merge functions to handle concurrent edits
- Eventual consistency: Accept that replicas may temporarily diverge
```python
import time

class NetworkError(Exception):
    """Raised by the sync transport when connectivity is lost."""

class OfflineFirstClient:
    def __init__(self, local_db, sync_service):
        self.local_db = local_db
        self.sync_service = sync_service
        self.change_log = []
        self.last_sync_timestamp = 0

    def write(self, key, value):
        version = self.local_db.write(key, value)
        self.change_log.append({
            "key": key, "value": value,
            "version": version, "timestamp": time.time(),
        })

    def read(self, key):
        return self.local_db.read(key)

    def resolve_conflict(self, local, remote):
        # Application-defined: a CRDT merge, a merge function,
        # or surfacing both versions to the user
        raise NotImplementedError

    def sync(self):
        # Push local changes in order; stop on network failure and
        # keep unpushed entries in the log for the next sync
        while self.change_log:
            change = self.change_log[0]
            try:
                conflict = self.sync_service.push(change)
            except NetworkError:
                return  # will retry next sync
            if conflict:
                resolved = self.resolve_conflict(
                    change, conflict.server_version)
                self.local_db.write(change["key"], resolved)
            self.change_log.pop(0)  # pushed successfully
        # Pull remote changes made since the last successful sync
        remote_changes = self.sync_service.pull(
            since=self.last_sync_timestamp)
        for change in remote_changes:
            local_version = self.local_db.get_version(change["key"])
            if local_version and local_version != change["base_version"]:
                resolved = self.resolve_conflict(local_version, change)
                self.local_db.write(change["key"], resolved)
            else:
                self.local_db.write(change["key"], change["value"])
        self.last_sync_timestamp = time.time()
```
Data synchronization connects to vector clocks for causality tracking, quorum systems for replica management, and geo-distribution for multi-region replication.
Frequently Asked Questions
Q: When should I use CRDTs vs operational transformation?
Use CRDTs for simpler data types (counters, sets, registers, maps) where automatic merging is sufficient. Use OT for rich text editing where character-level operations need precise transformation. CRDTs are simpler to implement and reason about. OT provides more natural semantics for text editing but is complex to implement correctly.
Q: What databases support CRDTs natively?
Redis Enterprise supports CRDT-based active-active geo-replication. Riak provides CRDT data types (counters, sets, maps). Automerge and Yjs are libraries that implement CRDTs for application-level data sync. Azure Cosmos DB, by contrast, resolves multi-region write conflicts with last-writer-wins or custom policies rather than CRDTs.
Q: Is last-writer-wins ever acceptable?
Yes, for data where losing a concurrent update is acceptable: cache entries, session data, user preferences, analytics counters. For data where loss is unacceptable (financial transactions, inventory, collaborative documents), use CRDTs, merge functions, or multi-value registers.
Q: How do I test conflict resolution logic?
Create deterministic test scenarios with known concurrent operations. Use property-based testing to verify convergence: apply random operations in random orders to multiple replicas and verify they converge to the same state. Tools like Jepsen test distributed systems for correctness under network partitions.
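A self-contained sketch of the property-based approach, here checking the G-Counter convergence property with plain dicts (a library such as Hypothesis would generate the cases for you; the trial counts and node names are arbitrary):

```python
import random

def merge(a, b):
    """G-Counter merge: element-wise max over node counters."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def run_trial(rng):
    nodes = ["A", "B", "C"]
    # Each node applies its own random increments locally
    states = {n: {n: 0} for n in nodes}
    for n in nodes:
        for _ in range(rng.randint(1, 10)):
            states[n][n] += rng.randint(1, 5)
    expected = sum(states[n][n] for n in nodes)
    # Merge the three states in two different random orders
    order1, order2 = nodes[:], nodes[:]
    rng.shuffle(order1)
    rng.shuffle(order2)
    merged1, merged2 = {}, {}
    for n in order1:
        merged1 = merge(merged1, states[n])
    for n in order2:
        merged2 = merge(merged2, states[n])
    # The two properties every CRDT test should assert:
    assert merged1 == merged2, "replicas did not converge"
    assert sum(merged1.values()) == expected, "increments were lost"

rng = random.Random(7)
for _ in range(200):
    run_trial(rng)
print("convergence property holds")
```

The same shape works for sets and registers: generate random concurrent operations, merge in different orders, and assert identical final state.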