What Is 2PC and How Is It Useful in Databases?
Two-Phase Commit (2PC) is a protocol used to ensure atomicity across distributed systems. It is widely used in scenarios where a single transaction spans multiple systems or databases.
Phases in 2PC
Phase 1: Prepare (Voting Phase)
The coordinator sends a PREPARE request to all participating systems.
Each participant:
Checks if it can commit the transaction (e.g., validates data, checks constraints, locks resources).
Replies with either YES (can commit) or NO (cannot commit) to the coordinator.
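The voting phase can be sketched as a loop that stops at the first negative vote. This is a minimal illustration, not a production implementation: the names Vote, Voter, and PreparePhase are hypothetical, and treating a participant that throws as a NO vote is an assumption made here for simplicity.

```java
import java.util.List;

// Hypothetical names for illustration; not taken from any specific library.
enum Vote { YES, NO }

interface Voter {
    // Validates and locks local resources, then votes.
    Vote prepare() throws Exception;
}

final class PreparePhase {
    private PreparePhase() {}

    // Returns true only if every participant votes YES.
    // Assumption: a participant that throws is counted as a NO vote.
    static boolean collectVotes(List<Voter> voters) {
        for (Voter voter : voters) {
            try {
                if (voter.prepare() != Vote.YES) {
                    return false;
                }
            } catch (Exception e) {
                return false;
            }
        }
        return true;
    }
}
```

A real coordinator would usually send the PREPARE requests in parallel and apply a timeout to each reply; the sequential loop only keeps the sketch short.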
Phase 2: Commit (Execution Phase)
If all participants respond with YES:
The coordinator sends a COMMIT request to all participants.
Participants commit their changes and send an acknowledgment to the coordinator.
If any participant responds with NO:
The coordinator sends a ROLLBACK request to all participants.
Participants roll back their changes and send an acknowledgment.
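The decision rule of the second phase is small enough to write down directly. The sketch below assumes votes arrive as booleans (true for YES) and models each participant as a function that returns true when it acknowledges the decision; Decision and CommitPhase are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical names for illustration.
enum Decision { COMMIT, ROLLBACK }

final class CommitPhase {
    private CommitPhase() {}

    // COMMIT only when every vote is YES (true); a single NO forces ROLLBACK.
    static Decision decide(List<Boolean> votes) {
        return votes.stream().allMatch(v -> v) ? Decision.COMMIT : Decision.ROLLBACK;
    }

    // Sends the decision to each participant and counts the acknowledgments.
    // Each participant is modelled as a function returning true when it acks.
    static int broadcast(Decision decision, List<Function<Decision, Boolean>> participants) {
        int acks = 0;
        for (Function<Decision, Boolean> participant : participants) {
            if (participant.apply(decision)) {
                acks++;
            }
        }
        return acks;
    }
}
```

If fewer acknowledgments come back than there are participants, a coordinator would typically keep resending the same decision until the missing participants respond.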
How 2PC is Useful
Ensures Atomicity:
Transactions either complete entirely or not at all across all participants, preventing partial updates.
Consistency in Distributed Systems:
Useful for maintaining consistency in distributed databases, microservices, or systems communicating via message queues.
Decouples Systems:
Participants need not coordinate directly with each other; they only communicate with the coordinator.
Reliable Recovery:
After a failure, the coordinator and participants can return to a consistent state by retrying or aborting the in-flight transaction, typically by consulting what they logged before the failure.
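Recovery depends on the coordinator remembering what it decided. One common approach, shown here only as a sketch, is to log each state change and, on restart, replay logged decisions while aborting any transaction that has no decision yet. CoordinatorLog is a hypothetical class, and the in-memory map merely stands in for durable storage.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical coordinator log; a real one would write to durable storage.
final class CoordinatorLog {
    enum State { PREPARING, COMMITTED, ABORTED }

    private final Map<String, State> entries = new LinkedHashMap<>();

    // Records the latest known state of a transaction.
    void record(String txId, State state) {
        entries.put(txId, state);
    }

    // Called after a restart. A logged decision is returned unchanged so it
    // can be resent. A transaction with no decision is aborted, which is safe
    // because no participant can have been told to commit it.
    State recover(String txId) {
        State logged = entries.getOrDefault(txId, State.ABORTED);
        State outcome = (logged == State.PREPARING) ? State.ABORTED : logged;
        entries.put(txId, outcome);
        return outcome;
    }
}
```

The important ordering, which this sketch does not enforce, is that the decision must reach the log before any COMMIT or ROLLBACK message is sent.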
Example Use Case
Imagine a banking system where a user transfers money between two accounts managed by different databases:
Account A in Database 1.
Account B in Database 2.
2PC ensures:
The debit on Account A is committed only if the credit on Account B is committed as well.
If either operation fails, both operations are rolled back.
import java.util.List;

class Coordinator {
    private final List<Participant> participants;

    Coordinator(List<Participant> participants) {
        this.participants = participants;
    }

    public void twoPhaseCommit() {
        boolean allPrepared = true;

        // Phase 1: Prepare. Stop asking as soon as one participant votes NO.
        for (Participant participant : participants) {
            boolean canCommit = participant.prepare();
            if (!canCommit) {
                allPrepared = false;
                break;
            }
        }

        // Phase 2: Commit or Rollback
        if (allPrepared) {
            for (Participant participant : participants) {
                participant.commit();
            }
        } else {
            for (Participant participant : participants) {
                participant.rollback();
            }
        }
    }
}

class Participant {
    private boolean localTransactionPrepared = false;

    public boolean prepare() {
        // Check if local transaction can be committed
        localTransactionPrepared = validateAndLockResources();
        return localTransactionPrepared;
    }

    public void commit() {
        if (localTransactionPrepared) {
            applyTransaction();
            localTransactionPrepared = false;
        }
    }

    public void rollback() {
        // Only a participant that actually prepared has anything to undo.
        if (localTransactionPrepared) {
            undoTransaction();
            localTransactionPrepared = false;
        }
    }

    private boolean validateAndLockResources() {
        // Validate and lock resources
        return true; // Assume successful validation
    }

    private void applyTransaction() {
        // Commit local transaction
    }

    private void undoTransaction() {
        // Rollback local transaction
    }
}
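To make the banking use case concrete, the transfer between Account A and Account B might look like the sketch below. AccountParticipant and Transfer are hypothetical names, balances are plain in-memory numbers rather than real databases, and refusing to overdraw is an assumed business rule used only to produce a NO vote.

```java
// Hypothetical participant: one account living in its own database.
final class AccountParticipant {
    private long balance;       // committed balance, in cents
    private Long pendingDelta;  // staged change; null when nothing is prepared

    AccountParticipant(long openingBalance) {
        this.balance = openingBalance;
    }

    // Phase 1: stage the change and vote. Votes NO if it would overdraw.
    boolean prepare(long delta) {
        if (balance + delta < 0) {
            return false;
        }
        pendingDelta = delta;
        return true;
    }

    // Phase 2 (commit): apply the staged change.
    void commit() {
        if (pendingDelta != null) {
            balance += pendingDelta;
            pendingDelta = null;
        }
    }

    // Phase 2 (rollback): discard the staged change.
    void rollback() {
        pendingDelta = null;
    }

    long balance() {
        return balance;
    }
}

final class Transfer {
    private Transfer() {}

    // Moves amount from one account to the other; returns true if committed.
    static boolean run(AccountParticipant from, AccountParticipant to, long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        boolean allYes = from.prepare(-amount) && to.prepare(amount);
        if (allYes) {
            from.commit();
            to.commit();
        } else {
            from.rollback();
            to.rollback();
        }
        return allYes;
    }
}
```

With opening balances of 100 and 50, a transfer of 30 commits on both sides, while a transfer of 500 is rejected in the prepare phase and leaves both balances untouched.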
Challenges with 2PC
Blocking Protocol:
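If the coordinator crashes after participants have voted YES but before they receive the decision, those participants remain in doubt: they cannot commit or roll back on their own and must keep their locks until the coordinator recovers.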
Performance Overhead:
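Each transaction needs two rounds of messages, and participants hold locks from the prepare phase until the final decision arrives, which adds latency and reduces throughput.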
Single Point of Failure:
The coordinator is a critical component. Its failure can disrupt the entire process.
Scalability Issues:
As the number of participants increases, the protocol becomes slower due to increased communication.
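The blocking problem is easiest to see from the participant's side. In the sketch below, a participant that has not yet voted may abort on its own when the coordinator goes silent, but one that has voted YES has to wait. InDoubtParticipant and its method names are hypothetical, and the timeout is modelled as a plain method call rather than a real timer.

```java
// Hypothetical participant state machine illustrating the in-doubt window.
final class InDoubtParticipant {
    enum State { INIT, PREPARED, COMMITTED, ABORTED }

    private State state = State.INIT;

    // The participant votes YES and promises to follow the coordinator.
    void voteYes() {
        if (state == State.INIT) {
            state = State.PREPARED;
        }
    }

    // The coordinator's decision ends the in-doubt window.
    void onDecision(boolean commit) {
        if (state == State.PREPARED) {
            state = commit ? State.COMMITTED : State.ABORTED;
        } else if (state == State.INIT && !commit) {
            state = State.ABORTED;
        }
    }

    // Before voting, aborting alone is safe. After voting YES it is not,
    // because the coordinator may already have decided to commit.
    void onCoordinatorTimeout() {
        if (state == State.INIT) {
            state = State.ABORTED;
        }
        // PREPARED: nothing can be done here; locks stay held.
    }

    boolean isBlocked() {
        return state == State.PREPARED;
    }

    State state() {
        return state;
    }
}
```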
Modern Alternatives
Three-Phase Commit (3PC): Adds a pre-commit phase and timeouts so that participants can make progress when the coordinator fails, which reduces blocking.
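Distributed Consensus Algorithms:
Paxos or Raft for high availability and fault tolerance.
These replicate each decision across a majority of nodes, so progress does not depend on a single coordinator staying up. They are often used to make the commit decision itself fault tolerant rather than to replace atomic commit outright.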
When to Use 2PC
Critical transactions requiring strong consistency (e.g., banking, inventory systems).
Systems where the number of participants is relatively small.
Scenarios where blocking or coordinator failure is acceptable or rare.
Source: Wikipedia