How do we resolve write conflicts in leaderless or multi-leader systems?

Feb 08, 2025

There are several ways to resolve them:

Last Write Wins (LWW)

How It Works:
- Each write is timestamped (logical or physical).
- When conflicts occur, the system keeps the version with the most recent timestamp and discards older versions.
Advantages:
- Simple to implement and efficient.
- Provides deterministic conflict resolution.
Disadvantages:
- Can lose data: every concurrent write except the winner is silently discarded.
- With physical timestamps, correctness depends on clock synchronization (e.g., NTP); clock skew can let an older write win.
Used In:
- Cassandra and DynamoDB global tables (last writer wins).
- Riak (as one of its configurable conflict resolution options).
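As a rough illustration of Last Write Wins, here is a minimal sketch of an LWW merge. The `Versioned` type, its field names, and the `node_id` tie-breaker are assumptions made for this example, not the API of any particular database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # assumed: logical or physical time of the write
    node_id: str    # assumed: tie-breaker when timestamps are equal


def lww_resolve(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the larger (timestamp, node_id) pair."""
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


older = Versioned("alice@old.example", timestamp=100, node_id="n1")
newer = Versioned("alice@new.example", timestamp=105, node_id="n2")
winner = lww_resolve(older, newer)  # the write stamped 105 survives
```

The tie-breaker matters: without it, two replicas that see equal timestamps could pick different winners and never converge.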

Vector Clocks (Version Vectors)

How It Works:
- Each replica tracks versions using a vector clock or version vector.
- When two versions are concurrent (neither causally precedes the other), both are retained; a version that causally precedes another is safely overwritten.
- Applications or users must resolve the conflict manually or using custom logic.
Advantages:
- Preserves all conflicting versions, ensuring no data is lost.
- Captures causality between operations.
Disadvantages:
- More complex to implement and manage.
- Requires application-level logic for conflict resolution.
Used In:
- Dynamo, Riak, and systems inspired by Dynamo's architecture.
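To make the causal-ordering check concrete, here is a small sketch of comparing two vector clocks. Representing a clock as a dict from node name to counter, and the `compare` function itself, are assumptions for illustration only.

```python
from typing import Dict

VectorClock = Dict[str, int]  # assumed representation: node name -> counter


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'equal', 'before', 'after', or 'concurrent' for a relative to b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a causally precedes b, so b can replace a
    if b_le_a:
        return "after"       # b causally precedes a, so a can replace b
    return "concurrent"      # a true conflict: keep both versions


v1 = {"A": 2, "B": 1}
v2 = {"A": 1, "B": 2}
relation = compare(v1, v2)   # neither dominates the other
```

Only the "concurrent" case needs to be handed to the application; the other outcomes can be resolved by the store on its own.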

Application-Level Resolution

How It Works:
- The database detects conflicting writes and returns all versions to the client or application.
- The application implements custom logic to resolve the conflict based on business rules.
Advantages:
- Highly flexible; the application can resolve conflicts in a domain-specific way.
- No arbitrary data loss.
Disadvantages:
- Shifts complexity to the application layer.
- Increases latency as conflicts require application intervention.
Used In:
- CouchDB, where applications receive conflicting document revisions and decide how to merge them.
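The flow of application-level resolution might look like the sketch below. `FakeStore` is a hypothetical in-memory stand-in for a database that returns every conflicting version (it is not CouchDB's API), and the "a cancelled order stays cancelled" rule is an invented business rule.

```python
from typing import Callable, Dict, List, Optional


class FakeStore:
    """Hypothetical store that keeps all conflicting versions of a key."""

    def __init__(self) -> None:
        self._data: Dict[str, List[dict]] = {}

    def write_conflicting(self, key: str, value: dict) -> None:
        self._data.setdefault(key, []).append(value)

    def get(self, key: str) -> List[dict]:
        return list(self._data.get(key, []))

    def put_resolved(self, key: str, value: dict) -> None:
        self._data[key] = [value]  # replaces every conflicting version


def resolve_order(versions: List[dict]) -> dict:
    # Assumed business rule: a cancellation always wins over other updates.
    cancelled = [v for v in versions if v["status"] == "cancelled"]
    return cancelled[0] if cancelled else versions[-1]


def read_with_resolution(
    store: FakeStore, key: str, resolve: Callable[[List[dict]], dict]
) -> Optional[dict]:
    versions = store.get(key)
    if len(versions) <= 1:
        return versions[0] if versions else None
    merged = resolve(versions)
    store.put_resolved(key, merged)  # write the resolved value back
    return merged


store = FakeStore()
store.write_conflicting("order-1", {"status": "shipped"})
store.write_conflicting("order-1", {"status": "cancelled"})
result = read_with_resolution(store, "order-1", resolve_order)
```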

Sibling Merging

How It Works:
- When conflicts occur, all conflicting versions (siblings) are stored.
- The system merges these siblings using a predefined logic, such as:
  - Union: Combine conflicting values (e.g., shopping cart items).
  - Custom merge functions: Application-defined merge logic.
Advantages:
- Flexible and avoids data loss.
- Useful in scenarios like shopping carts or sets, where merging is intuitive.
Disadvantages:
- Requires careful design of merge functions.
- Becomes inefficient if the number of siblings grows unchecked.
Used In:
- Dynamo, Riak (siblings can be merged via application logic).
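A union-style sibling merge for the shopping cart case could be sketched as follows. Modeling a cart as a dict of item to quantity, and keeping the larger quantity when both siblings contain an item, are assumptions for this example.

```python
from typing import Dict, List

Cart = Dict[str, int]  # assumed representation: item name -> quantity


def merge_carts(siblings: List[Cart]) -> Cart:
    """Union of all items; on overlap, keep the larger quantity."""
    merged: Cart = {}
    for cart in siblings:
        for item, qty in cart.items():
            merged[item] = max(merged.get(item, 0), qty)
    return merged


sibling_a = {"book": 1, "pen": 2}
sibling_b = {"book": 2, "mug": 1}
cart = merge_carts([sibling_a, sibling_b])
```

One known weakness of a plain union: an item removed in one sibling but still present in another comes back after the merge, so deletions need extra handling.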

Operational Transformation (OT)

How It Works:
- Concurrent operations are transformed against each other so that every replica can apply them and still converge on the same result; this is common in collaborative editing.
- For example, two users editing a document can have their changes transformed to maintain intent consistency.
Advantages:
- Ideal for real-time collaboration.
- Ensures consistency without losing intent.
Disadvantages:
- Complex to implement.
- Best suited for specific use cases like text or structured document editing.
Used In:
- Google Docs and other real-time collaborative editors.
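To give a feel for operational transformation, here is a deliberately tiny sketch that handles only concurrent text insertions. The `Insert` type, the `site` tie-breaker, and the `transform` rule are simplifications assumed for illustration; production OT systems also handle deletes and longer operation histories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str
    site: int  # assumed tie-breaker when two sites insert at the same position


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Adjust `op` so it can run after `against` has already been applied."""
    shifts = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if shifts:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


doc = "helo"
a = Insert(3, "l", site=1)  # user 1 fixes the typo
b = Insert(4, "!", site=2)  # user 2 appends punctuation, concurrently

# Each site applies its own operation first, then the transformed remote one.
site1 = apply_op(apply_op(doc, a), transform(b, a))
site2 = apply_op(apply_op(doc, b), transform(a, b))
```

Both sites end with the same text even though they applied the operations in different orders, which is the property OT is designed to provide.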

Conflict-Free Replicated Data Types (CRDTs)

How It Works:
- Data structures are designed to resolve conflicts automatically using mathematical properties.
- For example:
  - G-Counter: Grow-only counters that always converge.
  - LWW-Register: Registers using timestamps to resolve conflicts.
  - Sets and Maps: Merge using union operations in the simplest (grow-only) variants.
Advantages:
- Automatic resolution with guaranteed convergence.
- No need for manual conflict resolution or application logic.
Disadvantages:
- Limited to certain data structures (e.g., counters, sets, maps).
- May require redesign of application logic to use CRDTs.
Used In:
- Systems like Riak (using CRDTs for data types).
- Collaborative tools and eventually consistent databases.
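The G-Counter mentioned above is small enough to sketch in full. The class and method names here are assumptions for the example; the idea is that each node increments only its own slot and merging takes the per-node maximum.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise max."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        # A node only ever bumps its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a = GCounter("A")
b = GCounter("B")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # merging again changes nothing (idempotent)
```

Because the merge is commutative, associative, and idempotent, replicas converge regardless of the order or number of times they exchange state.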

Custom Conflict Resolution Policies

How It Works:
- The database allows users to define custom conflict resolution logic, such as:
  - Prioritized writes: Always prefer writes from specific nodes.
  - Weighted writes: Resolve conflicts based on predefined weights or roles.
  - Application rules: Resolve based on domain-specific requirements (e.g., sum the values, pick the max, etc.).
Advantages:
- Fully customizable to meet application requirements.
- Provides deterministic resolution for specific use cases.
Disadvantages:
- Requires detailed understanding of application behavior.
- Complexity in implementation.
Used In:
- Riak and other configurable distributed databases.
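Custom conflict resolution policies like the ones listed above could be wired up per field, as in this sketch. The `Write` tuple, the node names, the field names, and the `POLICIES` table are all hypothetical and do not correspond to any database's configuration format.

```python
from typing import Callable, Dict, List, Tuple

Write = Tuple[str, int]  # assumed shape: (origin node, value)
Resolver = Callable[[List[Write]], int]


def prefer_node(priority: List[str]) -> Resolver:
    """Prioritized writes: the earliest node in `priority` wins."""
    rank = {node: i for i, node in enumerate(priority)}

    def resolve(writes: List[Write]) -> int:
        return min(writes, key=lambda w: rank.get(w[0], len(rank)))[1]

    return resolve


def pick_max(writes: List[Write]) -> int:
    return max(value for _, value in writes)


def sum_values(writes: List[Write]) -> int:
    return sum(value for _, value in writes)


# Hypothetical per-field policy table.
POLICIES: Dict[str, Resolver] = {
    "price": prefer_node(["primary-dc", "backup-dc"]),
    "high_score": pick_max,
    "pending_credits": sum_values,
}


def resolve_field(field: str, writes: List[Write]) -> int:
    return POLICIES[field](writes)


conflict = [("backup-dc", 10), ("primary-dc", 7)]
price = resolve_field("price", conflict)
```

The same pair of conflicting writes resolves differently depending on the field's policy, which is why these rules need a solid understanding of what each field means.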

Tombstones for Deletes

How It Works:
- When a delete operation conflicts with an update, a tombstone is used to mark the data as deleted.
- The system resolves the conflict based on timestamps or other metadata.
Advantages:
- Handles conflicts between writes and deletes effectively.
- Ensures eventual consistency even in delete scenarios.
Disadvantages:
- Requires periodic cleanup of tombstones to avoid resource overhead.
Used In:
- Cassandra and Riak.
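Here is a minimal sketch of timestamp-based resolution between an update and a tombstone, plus the periodic cleanup step. The `Cell` type, the rule that a tombstone wins on equal timestamps, and the `grace` parameter are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: Optional[str] = None
    deleted: bool = False  # True marks this cell as a tombstone


def resolve(a: Cell, b: Cell) -> Cell:
    # Newer timestamp wins; on a tie the tombstone wins (assumed rule).
    return max(a, b, key=lambda c: (c.timestamp, c.deleted))


def read(cell: Cell) -> Optional[str]:
    return None if cell.deleted else cell.value


def compact(cells: Dict[str, Cell], now: int, grace: int) -> Dict[str, Cell]:
    """Drop tombstones older than the grace period; keep everything else."""
    return {
        key: cell
        for key, cell in cells.items()
        if not (cell.deleted and now - cell.timestamp > grace)
    }


update = Cell(timestamp=10, value="v2")
delete = Cell(timestamp=12, deleted=True)
current = resolve(update, delete)  # the later delete wins
```

The tombstone has to outlive the grace period so that a replica that missed the delete does not bring the old value back during repair.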

The choice depends on the application requirements:

- LWW: Suitable for scenarios where the latest update is always the most important (e.g., caching, metadata updates).
- Vector Clocks or Application Logic: Useful in systems with complex business rules or frequent concurrent updates.
- CRDTs or OT: Best for collaborative applications requiring automatic conflict resolution.
- Sibling Merging: Ideal for aggregating or combining data (e.g., shopping carts).

Each strategy has trade-offs, and many systems combine multiple approaches to handle conflicts effectively.

Shashank’s Substack