What Is TrueTime and How Is It Used in Google Spanner?
TrueTime is a globally distributed clock synchronization service developed by Google. It plays a crucial role in Google Spanner, a globally distributed, horizontally scalable database.
TrueTime is one of the key innovations that allows Spanner to provide external consistency (a stronger guarantee than the consistency most traditional databases provide) across distributed systems.
Key Concepts of TrueTime:
Bounded Uncertainty: TrueTime provides a time interval, [earliest, latest], rather than a single timestamp. This interval represents the possible range of the current time with bounded uncertainty. The uncertainty is caused by clock drift, synchronization delays, and other factors in distributed systems.
TT.now() returns a time interval [earliest, latest], where:
earliest is the earliest possible current time.
latest is the latest possible current time.
The width of the interval (latest - earliest) reflects the current uncertainty. The Spanner paper defines the error bound ε as half of this width.
Clock Synchronization:
TrueTime relies on a combination of GPS and atomic clocks in Google's datacenters. GPS clocks provide global synchronization, while atomic clocks ensure precise timing within datacenters when GPS is unavailable.
These clocks are periodically synchronized, and TrueTime accounts for drift and synchronization uncertainties.
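As a rough illustration of how drift could be accounted for between synchronizations, the sketch below models the error bound as a base uncertainty right after a sync plus a worst-case drift allowance that grows until the next sync. The default figures (about 1 ms base, 200 microseconds per second of assumed drift, 30 seconds between syncs) follow the numbers reported in the Spanner paper. The function and type names are hypothetical, and this is not Google's implementation.

```python
from typing import NamedTuple


class TTInterval(NamedTuple):
    earliest: float
    latest: float


def error_bound(seconds_since_sync: float,
                base_uncertainty: float = 0.001,
                drift_rate: float = 200e-6) -> float:
    """Worst-case clock error (seconds) at a given time after the last sync."""
    if seconds_since_sync < 0:
        raise ValueError("seconds_since_sync must be non-negative")
    # The bound grows linearly until the next sync resets it (a sawtooth).
    return base_uncertainty + drift_rate * seconds_since_sync


def now_interval(local_clock: float, seconds_since_sync: float) -> TTInterval:
    """Build an [earliest, latest] interval around a local clock reading."""
    eps = error_bound(seconds_since_sync)
    return TTInterval(local_clock - eps, local_clock + eps)
```

With these assumed defaults, the bound is about 1 ms right after a sync and about 7 ms just before the next one, 30 seconds later.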
External Consistency:
TrueTime allows Spanner to ensure external consistency, meaning that transactions appear to have executed in a total order consistent with real-world time.
How TrueTime is Used in Google Spanner:
Spanner uses TrueTime to assign commit timestamps to transactions and to achieve external consistency. Here's how it works:
1. Commit Wait:
When a transaction is committed, Spanner assigns it a commit timestamp using the TrueTime API.
To ensure external consistency, Spanner enforces a commit wait phase. This guarantees that the commit timestamp is in the past (relative to the real-world clock) for all replicas before the transaction is considered committed.
Specifically:
Spanner waits until earliest from TT.now() is greater than the assigned commit timestamp. This ensures that the commit timestamp is globally visible and that no transaction starting later in real-world time can be assigned an earlier timestamp.
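The commit wait rule can be sketched in a few lines. The example below is a minimal simulation and not Spanner's code: FakeTrueTime is a hypothetical stand-in that wraps the local monotonic clock with a fixed, assumed error bound, and commit_with_wait is an illustrative name.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeTrueTime:
    """Hypothetical TrueTime stand-in: local clock plus a fixed error bound."""
    epsilon: float = 0.005  # assumed 5 ms error bound, for illustration only

    def now(self) -> tuple:
        t = time.monotonic()
        return (t - self.epsilon, t + self.epsilon)  # (earliest, latest)


def commit_with_wait(tt: FakeTrueTime) -> tuple:
    """Pick a commit timestamp, then block until it is definitely in the past."""
    started = time.monotonic()
    _, latest = tt.now()
    commit_ts = latest                 # no earlier than any possible "now"
    while tt.now()[0] <= commit_ts:    # earliest has not yet passed commit_ts
        time.sleep(0.001)
    waited = time.monotonic() - started
    return commit_ts, waited
```

Because the timestamp is taken from latest and the wait ends once earliest has passed it, the wait in this sketch is roughly twice the error bound. A smaller bound therefore translates directly into lower commit latency.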
2. Consistency in Distributed Transactions:
When a transaction writes data, Spanner assigns a commit timestamp using TrueTime.
All replicas participating in the transaction use the same commit timestamp, ensuring a globally consistent view of the data.
During reads, Spanner chooses a read timestamp and returns only data with commit timestamps ≤ that read timestamp, which makes reads consistent and repeatable. For strong reads, the Spanner paper describes using TT.now().latest as the read timestamp.
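A timestamp-based read can be pictured as a lookup in a multi-version store: given a read timestamp, return the newest version whose commit timestamp is not greater than it. The class below is a toy sketch under that assumption. VersionedStore and its methods are hypothetical names, not part of any Spanner API.

```python
import bisect
from collections import defaultdict


class VersionedStore:
    """Toy multi-version store keyed by commit timestamp."""

    def __init__(self):
        self._timestamps = defaultdict(list)  # key -> sorted commit timestamps
        self._values = defaultdict(list)      # key -> values, same order

    def write(self, key, value, commit_ts):
        # Keep versions sorted by commit timestamp.
        i = bisect.bisect_right(self._timestamps[key], commit_ts)
        self._timestamps[key].insert(i, commit_ts)
        self._values[key].insert(i, value)

    def read(self, key, read_ts):
        """Return the newest value with commit_ts <= read_ts, or None."""
        i = bisect.bisect_right(self._timestamps[key], read_ts)
        return self._values[key][i - 1] if i > 0 else None


store = VersionedStore()
store.write("balance", 100, commit_ts=10)
store.write("balance", 80, commit_ts=20)
```

Reading "balance" at timestamp 15 returns 100 no matter how many later versions exist, which is what makes a read at a fixed timestamp repeatable.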
3. Linearizability and Ordering:
TrueTime allows Spanner to enforce a strict global order of transactions. This is critical for maintaining linearizability (strong consistency) in distributed systems.
4. Applications in Schema Changes and Backups:
Schema changes (e.g., adding a new column) and backups rely on consistent snapshots. TrueTime ensures these snapshots are consistent globally by associating them with specific timestamps.
Example of TrueTime in Action:
Imagine two transactions, T1 and T2, operating on different replicas in different regions:
T1 commits at timestamp t1 and updates some data.
T2 starts after T1 has completed and should see the changes made by T1.
Using TrueTime, Spanner ensures that:
T1's commit timestamp is finalized only after waiting for the uncertainty window to pass.
When T2 reads the data, it waits until TrueTime guarantees that T1's changes are visible (based on their commit timestamps).
This mechanism ensures that T2 observes a consistent and up-to-date view of the data.
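The reasoning behind this example can be written as a short chain of inequalities. The notation is adapted from the Spanner paper: s_i is the timestamp assigned to T_i and t_abs(e) is the absolute (real-world) time of event e. The event labels are introduced here for illustration, and the third step assumes that T2's timestamp is chosen to be at least TT.now().latest at the moment T2 starts.

```latex
\begin{align*}
s_1 &< t_{\mathrm{abs}}(e_1^{\mathrm{commit}})
  && \text{(commit wait: $T_1$ is not acknowledged until $s_1$ has passed)} \\
t_{\mathrm{abs}}(e_1^{\mathrm{commit}}) &< t_{\mathrm{abs}}(e_2^{\mathrm{start}})
  && \text{(assumption: $T_2$ starts after $T_1$ completes)} \\
t_{\mathrm{abs}}(e_2^{\mathrm{start}}) &\le s_2
  && \text{($s_2$ is at least the latest possible current time when chosen)} \\
\Rightarrow \quad s_1 &< s_2
\end{align*}
```

Since s_1 < s_2, any read performed by T2 at its timestamp includes the data written by T1.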
Benefits of TrueTime in Spanner:
Enables global consistency across distributed regions.
Guarantees external consistency and strict transaction ordering.
Provides a foundation for strong consistency in distributed database systems without compromising scalability.
1. TrueTime API Details
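The TrueTime API described in the Spanner paper consists of three operations:
TT.now(): Returns a bounded time interval [earliest, latest].
TT.after(t): Returns true if time t has definitely passed, that is, if TT.now().earliest is greater than t. Commit wait amounts to waiting until TT.after(t) holds for the commit timestamp.
TT.before(t): Returns true if time t has definitely not arrived, that is, if TT.now().latest is less than t.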
These operations are lightweight and are designed to provide confidence about the current time while accounting for uncertainty.
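To make the shape of the API concrete, here is a small sketch. It follows the predicate form used in the Spanner paper, where TT.after(t) returns true once t has definitely passed and TT.before(t) returns true if t has definitely not arrived. A caller that needs to block can simply loop on after(). The class name, the fixed epsilon, and the injectable clock are assumptions made for illustration and do not reflect Google's implementation.

```python
import time
from typing import Callable, NamedTuple


class TTInterval(NamedTuple):
    earliest: float
    latest: float


class TrueTimeSketch:
    """Illustrative TrueTime-like clock with a fixed, assumed error bound."""

    def __init__(self, epsilon: float = 0.004,
                 clock: Callable[[], float] = time.time):
        self.epsilon = epsilon  # assumed error bound in seconds
        self._clock = clock     # injectable so a fake clock can be used in tests

    def now(self) -> TTInterval:
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t: float) -> bool:
        """True if t has definitely not arrived."""
        return self.now().latest < t
```

For a timestamp that falls inside the current interval, both after() and before() return False. That "not sure yet" state is exactly what commit wait is designed to sit out.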
2. Design Principles Behind TrueTime
TrueTime was designed to solve challenges in distributed systems that arise due to:
Clock Drift: Hardware clocks on different machines can drift apart, leading to inconsistency in timestamps.
Network Latency: Synchronizing clocks across different regions introduces variability.
Byzantine Failures: Clocks can fail silently or behave unpredictably.
To address these challenges:
TrueTime uses redundant GPS and atomic clocks across Google’s global datacenters.
Clock synchronization happens frequently to minimize drift.
Spanner tolerates bounded uncertainty rather than attempting to eliminate it entirely.
This design allows Spanner to work reliably even in the presence of clock drift or temporary synchronization failures.
3. Commit Protocol in Spanner
TrueTime is deeply integrated into Spanner’s 2-phase commit protocol for distributed transactions:
Prepare Phase: Spanner ensures all participating replicas prepare the transaction and log the state.
Commit Timestamp Assignment: A commit timestamp is chosen, ensuring it is in the future relative to the latest known timestamp across replicas.
Commit Wait: The system waits until TT.now().earliest > commit_timestamp, ensuring external consistency and visibility to all replicas globally.
Finalize Phase: Once the commit wait completes, the transaction is considered committed, and replicas finalize the state.
This ensures that:
Transactions are serialized in an order consistent with real-world time.
No transaction "sees into the future" (i.e., observes data with timestamps later than its commit time).
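The timestamp assignment and commit wait steps above can be sketched as two small functions. The constraints mirror the description in the Spanner paper: the commit timestamp is no smaller than any prepare timestamp, greater than TT.now().latest when the coordinator handles the commit, and greater than any timestamp the leader has already assigned. The function names, the integer microsecond timestamps, and the one-tick increment are assumptions for illustration only.

```python
TICK = 1  # assumed smallest timestamp increment (integer microseconds)


def choose_commit_timestamp(prepare_timestamps, now_latest, last_assigned):
    """Pick a commit timestamp that satisfies the coordinator's constraints.

    prepare_timestamps: timestamps reported by participants when they prepared
    now_latest:         TT.now().latest when the coordinator handles the commit
    last_assigned:      largest timestamp this leader has handed out so far
    """
    return max(
        max(prepare_timestamps, default=0),  # >= every prepare timestamp
        now_latest + TICK,                   # >  latest possible current time
        last_assigned + TICK,                # >  anything assigned earlier
    )


def commit_wait_done(commit_ts, now_earliest):
    """True once the commit timestamp is guaranteed to be in the past."""
    return now_earliest > commit_ts
```

For example, with prepare timestamps 100, 120, and 110, a now_latest of 115, and a last assigned timestamp of 90, the sketch picks 120. The coordinator would then hold the commit until commit_wait_done(120, TT.now().earliest) becomes true.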
4. TrueTime and Paxos in Spanner
Spanner uses Paxos-based consensus to maintain replication consistency across replicas. TrueTime helps enforce the ordering of Paxos writes, ensuring that:
Writes with earlier timestamps are applied before those with later timestamps.
Paxos leaders use TrueTime to assign timestamps to ensure all replicas observe a globally consistent write order.
This makes Spanner a strongly consistent, globally distributed database, unlike systems such as DynamoDB or Cassandra, which default to eventually consistent reads.
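The ordering rule can be illustrated from the replica's side. The sketch below is a simplified model, not Spanner's replication code: a replica accepts leader-assigned writes only in increasing timestamp order and tracks a "safe time" up to which it can serve reads. ReplicaSketch is a hypothetical name, and the model ignores prepared-but-uncommitted transactions, which the Spanner paper also takes into account when computing safe time.

```python
class ReplicaSketch:
    """Toy replica that applies leader-assigned writes in timestamp order."""

    def __init__(self):
        self.applied = []    # (timestamp, write) pairs in apply order
        self.safe_time = 0   # highest timestamp applied so far

    def apply(self, timestamp, write):
        # Reject anything that would break monotonic timestamp order.
        if timestamp <= self.safe_time:
            raise ValueError("timestamps must increase monotonically")
        self.applied.append((timestamp, write))
        self.safe_time = timestamp

    def can_serve_read(self, read_ts):
        """A read at read_ts is safe once the replica has caught up to it."""
        return read_ts <= self.safe_time


replica = ReplicaSketch()
replica.apply(10, "x=1")
replica.apply(20, "x=2")
```

After these two writes, the replica can answer a read at timestamp 15 but has to wait (or defer to another replica) for a read at timestamp 25.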
5. Challenges of TrueTime
While TrueTime is highly reliable, it comes with challenges:
Infrastructure Dependency: TrueTime requires dedicated GPS and atomic clocks in every datacenter, which may not be feasible for smaller organizations.
Uncertainty Bound: The accuracy of TrueTime is limited by the uncertainty bound. Larger uncertainty windows can increase transaction latency due to longer commit waits.
Clock Failures: If a clock fails or drifts significantly, TrueTime needs fallback mechanisms to ensure safety.
Global Scale: Ensuring consistent clock synchronization across the globe at scale requires a sophisticated and robust infrastructure.
6. Comparison to Other Systems
a. Traditional Distributed Systems:
Traditional distributed databases often rely on Lamport timestamps or vector clocks to order events. These approaches provide logical ordering but cannot ensure external consistency or strong real-time guarantees.
TrueTime, in contrast, allows Spanner to provide strong guarantees aligned with real-world time.
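A short Lamport clock sketch shows why logical timestamps alone cannot capture real-time order. The class below is a standard textbook formulation. The two-node scenario after it is a made-up example.

```python
class LamportClock:
    """Minimal Lamport logical clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp to attach to an outgoing message."""
        return self.tick()

    def receive(self, message_time):
        """Merge the sender's timestamp on message receipt."""
        self.time = max(self.time, message_time) + 1
        return self.time


# Node A performs three events; node B performs one event later in real time,
# but the two nodes never exchange a message.
a = LamportClock()
for _ in range(3):
    a.tick()
b = LamportClock()
later_event = b.tick()
```

Here later_event is 1 while A's clock reads 3, so the event that happened later in real time carries the smaller timestamp. Lamport clocks only order events that are linked by messages, whereas TrueTime timestamps are tied to real time even when the nodes never communicate.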
b. Cassandra / DynamoDB (Eventual Consistency):
Systems like Cassandra and DynamoDB prioritize availability over consistency. They use techniques like quorum reads/writes but do not provide globally consistent timestamps.
Spanner, with TrueTime, achieves strong consistency while remaining highly available.
c. Amazon Aurora (RDS):
Aurora is a strongly consistent relational database, but a standard Aurora cluster accepts writes in a single region. It lacks the globally consistent, multi-region writes and scalability of Spanner.
d. CockroachDB:
CockroachDB is an open-source distributed database inspired by Spanner. Lacking TrueTime's GPS and atomic clock hardware, it relies on software-based clock synchronization (e.g., NTP) together with hybrid logical clocks and a configured maximum clock offset. This results in larger uncertainty bounds and may reduce performance compared to Spanner's TrueTime.
7. Practical Implications for Applications Using Spanner
TrueTime has specific implications for developers building on Spanner:
Global Secondary Indexes: Spanner guarantees that secondary indexes are always consistent with the primary table because both are updated with the same commit timestamp.
Time-Series Applications: TrueTime is particularly suited for time-series applications where precise ordering and consistency are critical.
Cross-Region Transactions: Applications that require global consistency across regions (e.g., financial systems) benefit from TrueTime’s guarantees.
8. Research and Academic Context
TrueTime and Spanner were introduced in Google’s Spanner Research Paper (2012). The paper introduced two groundbreaking ideas:
TrueTime API: The bounded uncertainty model for timestamps.
External Consistency at Scale: Achieving strong consistency across global systems.
Spanner remains one of the most influential systems in the field of distributed databases, inspiring similar designs in CockroachDB, YugabyteDB, and others.
9. Use Cases of Spanner (and TrueTime)
Some real-world use cases include:
Financial Systems: Global ledgers and transactional systems where strong consistency and real-time guarantees are critical.
Retail Applications: Inventory systems that span multiple regions and require consistent stock counts.
IoT Systems: Applications that generate massive streams of time-ordered data.
Global Collaboration Tools: Tools like Google Docs that require real-time collaboration across distributed users.
10. How TrueTime Could Evolve
Although TrueTime is highly effective, there’s potential for improvement:
Reducing Uncertainty Bounds: As hardware clocks become more precise and synchronization protocols improve, the uncertainty bound can be reduced further, minimizing transaction latencies.
Decoupling GPS/Atomic Clocks: Extending TrueTime's reliability to regions without GPS or atomic clocks could make it more widely applicable.
Open-Source TrueTime: While Spanner’s design inspired CockroachDB, an open-source implementation of TrueTime itself could drive innovation in distributed systems.