How do we ensure data consistency and durability across replicas after node failures or periods of unavailability?
There are essentially two mechanisms:
1. Read Repair
How It Works:
Triggered during reads: When a client performs a read operation, it retrieves the data from multiple replicas in parallel.
Version comparison: The client or the database compares the versions of the data returned by different replicas (using mechanisms like vector clocks or version numbers).
Correction: If a replica is found to have stale or missing data, the client writes the latest value back to that replica.
Key Features:
On-the-fly repair: Only the replicas involved in the read operation are repaired.
Low overhead: No background processing is required; repair is done as part of the read operation.
Effective for frequently accessed data: Ensures that "hot" data (frequently read) remains consistent across replicas.
Limitations:
Rarely accessed data: If data is not read frequently, inconsistencies can persist indefinitely.
Latency impact: Read operations may take longer because of the additional step of writing updates to stale replicas.
Example Scenario:
A user reads a record (key: 2345). One replica returns version 6, while two replicas return version 7. The system detects the stale replica and writes version 7 back to the stale node.
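The scenario above can be sketched in a few lines of Python. This is a minimal, illustrative model, assuming in-memory dict replicas keyed by record key, with simple integer version numbers standing in for vector clocks; the replica contents mirror the version 6 / version 7 example.

```python
# Three replicas of key "2345": replica 1 is stale at version 6,
# the other two hold version 7. Each replica maps key -> (version, value).
replicas = [
    {"2345": (7, "new-value")},
    {"2345": (6, "old-value")},
    {"2345": (7, "new-value")},
]

def read_with_repair(key):
    """Read from all replicas in parallel (modeled sequentially here),
    return the newest value, and write it back to stale replicas."""
    responses = [(i, r.get(key)) for i, r in enumerate(replicas)]
    # The highest version wins; tuples compare by version first.
    latest_version, latest_value = max(v for _, v in responses if v is not None)
    # Read repair: push the latest value to any replica that is behind
    # or missing the key entirely.
    for i, v in responses:
        if v is None or v[0] < latest_version:
            replicas[i][key] = (latest_version, latest_value)
    return latest_value

value = read_with_repair("2345")
# After the read, all three replicas hold version 7.
```

Note that only the replicas contacted by this read are repaired, which is exactly why rarely read keys can stay inconsistent indefinitely.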
2. Anti-Entropy Process
How It Works:
Background synchronization: The anti-entropy process runs independently of client operations, periodically comparing the data across all replicas.
Detection of differences: Mechanisms like Merkle trees or checksums are used to detect differences in data efficiently without transferring the entire dataset.
Copying missing data: If a difference is detected, the missing or inconsistent data is copied from one replica to another to bring them into sync.
Key Features:
Comprehensive: Ensures consistency across the entire dataset, even for rarely accessed data.
Order-agnostic: Writes are copied without preserving the order in which they occurred, unlike leader-based replication, where the replication log preserves write order.
Ensures durability: Prevents data loss in cases where inconsistencies might otherwise go unnoticed due to infrequent reads.
Limitations:
Performance overhead: The process can be resource-intensive, consuming bandwidth, CPU, and disk I/O.
Delay: Depending on the frequency of the anti-entropy process, there may be a significant delay before inconsistencies are resolved.
Example Scenario:
A background process runs periodically and detects that a replica is missing a particular key-value pair. The missing data is then copied to that replica from another.
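To make the Merkle-tree idea concrete, here is a minimal Python sketch. It uses a flat set of per-bucket hashes rather than a full tree, and copies data one way from source to target (real anti-entropy merges in both directions); all names and the bucket count are illustrative.

```python
import hashlib

def digest(s: str) -> str:
    """Stable content hash used for both bucketing and comparison."""
    return hashlib.sha256(s.encode()).hexdigest()

def bucket_hashes(replica: dict, num_buckets: int = 4) -> list:
    """Summarize a replica as one hash per key bucket, so two replicas
    can be compared by exchanging a few hashes instead of the whole
    dataset (a flat, two-level stand-in for a Merkle tree)."""
    buckets = [[] for _ in range(num_buckets)]
    for key in sorted(replica):
        b = int(digest(key), 16) % num_buckets
        buckets[b].append(f"{key}={replica[key]}")
    return [digest("|".join(items)) for items in buckets]

def anti_entropy(source: dict, target: dict, num_buckets: int = 4):
    """Copy over data for any bucket whose hash differs between replicas."""
    src_hashes = bucket_hashes(source, num_buckets)
    dst_hashes = bucket_hashes(target, num_buckets)
    for b in range(num_buckets):
        if src_hashes[b] != dst_hashes[b]:
            # Only keys in the mismatched bucket are transferred.
            for key, value in source.items():
                if int(digest(key), 16) % num_buckets == b:
                    target[key] = value

replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k1": "v1", "k2": "v2"}  # missing k3
anti_entropy(replica_a, replica_b)
# replica_b now contains k3 as well.
```

Matching hashes let whole buckets be skipped, which is where the bandwidth savings come from; a real Merkle tree extends this with multiple levels so a single root comparison can rule out the entire dataset.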
Complementary Nature
These mechanisms are often used together to achieve eventual consistency:
Read repair handles inconsistencies for "hot" data during regular operations.
Anti-entropy ensures comprehensive consistency for the entire dataset, including infrequently accessed data.
Practical Examples
Amazon Dynamo (the design that inspired DynamoDB):
Uses read repair during reads and a Merkle-tree-based anti-entropy process, alongside a gossip protocol for membership, to propagate updates and ensure consistency.
Cassandra:
Implements read repair during quorum reads.
Supports an anti-entropy mechanism via the nodetool repair command for cluster-wide consistency.
Voldemort:
Lacks an anti-entropy process, relying solely on read repair, which makes infrequently accessed data more prone to inconsistencies.
Takeaway
For robust data consistency and durability in distributed databases:
Read repair is effective for keeping frequently accessed data consistent with minimal overhead.
Anti-entropy is essential for ensuring that all data, even rarely accessed records, remains consistent across replicas. Combining the two provides a balanced approach to maintaining consistency in a distributed system.