Summary of the paper TAO: Facebook’s Distributed Data Store for the Social Graph

TAO is Facebook's geographically distributed data store designed to provide efficient and timely access to the social graph, which represents the relationships and interactions among users

Jan 06, 2025

It was developed to handle Facebook's demanding workload, capable of processing a billion reads and millions of writes each second.

Overview of TAO

Purpose: TAO is Facebook’s distributed data store for managing and providing efficient access to the social graph, which is the network of relationships and interactions between Facebook users (nodes and edges).
Scale: It is designed to handle billions of reads and millions of writes per second, meeting the massive workload demands of Facebook.

Facebook Research

Key Features of TAO:

Data Model: TAO employs a simple data model consisting of objects (nodes) and associations (edges), effectively representing the social graph's structure.
Facebook Research
Two-Tier Architecture: TAO utilizes a two-tier caching system with leaders and followers. Leaders handle all write operations and propagate changes to followers, which serve read requests. This design ensures low-latency data access and high availability.
Engineering Blog
Eventual Consistency: TAO prioritizes availability and performance, adopting an eventual consistency model. While it may not provide immediate consistency across all replicas, it ensures that all replicas will eventually converge, balancing performance with data accuracy.
Facebook Research
Scalability: Designed to operate on thousands of machines, TAO manages many petabytes of data, efficiently handling Facebook's extensive and dynamic social workload.

Data Model:
- The system uses a graph data model with:
  - Objects (nodes): Representing entities like users, posts, pages, etc.
  - Associations (edges): Representing relationships between entities, like "User A likes Post B" or "User C is friends with User D."
Two-Tier Architecture:
- Leaders (Primary nodes): Handle all write operations and propagate updates to followers.
- Followers (Secondary nodes): Serve read operations to ensure low-latency access.
- This architecture ensures high availability and scalability for read-heavy workloads.
Caching System:
- TAO uses a two-tier caching mechanism to improve performance:
  - Local cache: For frequent queries.
  - Remote cache: For less common queries.
- This reduces the need for repeated database access.
Consistency Model:
- Implements eventual consistency:
  - Writes are propagated asynchronously to replicas, meaning all copies of data will eventually become consistent.
  - Prioritizes availability and performance over strict consistency.
  - Works well for scenarios where immediate data consistency isn’t critical (e.g., social interactions).
Scalability:
- TAO operates across thousands of machines and manages petabytes of data.
- It efficiently supports the dynamic and growing social workload of Facebook’s users.
Fault Tolerance:
- Designed for high availability, ensuring that even if some nodes fail, the system can still process requests.
- Data replication ensures redundancy and prevents data loss.
API Design:
- TAO provides a simple API for developers to query and modify the social graph:
  - Common operations include fetching friends, likes, and relationships between users and objects.

Advantages

Low Latency: Designed to provide fast responses for read queries.
High Throughput: Handles a large number of reads and writes simultaneously.
Read Optimization: Optimized for read-heavy workloads, which are common in social media applications.
Scalable Design: Easily scales to support billions of users and their interactions.

Use Case at Facebook

TAO powers many features on Facebook, such as:
- Fetching a user’s news feed.
- Loading the list of friends, likes, or comments for a post.
- Providing the backend infrastructure for notifications and interactions.

How TAO is able to handle 1 Billion Queries Per Second

1. Read-Optimized Design

Social Graph Access Patterns:
- Facebook workloads are highly read-intensive, as most user interactions (e.g., fetching posts, likes, and friends) are read operations.
- TAO is optimized to prioritize read operations, which allows for scaling reads horizontally across many machines.
Caching:
- TAO uses a two-tier caching system (local and remote caches) to store frequently accessed data.
  - Local Cache: Data is kept close to the application, minimizing latency for repeated queries.
  - Remote Cache: Stores less frequent data but avoids direct database hits.
- By leveraging caching, TAO reduces the load on the backend database, allowing it to handle billions of queries efficiently.

2. Distributed Architecture

Horizontal Scaling:
- TAO scales by adding more machines (horizontal scaling). Each machine handles a portion of the workload.
- Queries are distributed across thousands of servers, balancing the load effectively.
Geographic Distribution:
- TAO is deployed across geographically distributed data centers. This reduces latency by serving users from the closest data center.
Sharding:
- The social graph is sharded (partitioned) based on user IDs or other keys. This ensures that each server handles only a subset of the data.
- Each shard operates independently, allowing the system to process queries in parallel.

3. Asynchronous Write Propagation

Eventual Consistency:
- TAO uses eventual consistency to prioritize query speed over strict synchronization.
- Writes are asynchronous, meaning they are propagated to replicas in the background. This ensures that the system doesn’t block queries while waiting for all replicas to update.

4. Efficient Query Execution

Pre-Computed Results:
- TAO precomputes and stores results for commonly accessed queries (e.g., "Who are my friends?" or "Who liked this post?").
- This avoids recalculating results repeatedly.
Indexing:
- Relationships in the social graph (e.g., friendships, likes) are efficiently indexed, enabling fast lookups.

5. Fault Tolerance and Redundancy

Replication:
- Data is replicated across multiple machines and regions, ensuring availability even if some nodes fail.
Failover Mechanisms:
- If a server or data center goes down, the system automatically routes queries to replicas without downtime.

6. High Throughput System Design

Batching:
- TAO batches multiple small queries into a single larger request to reduce overhead.
Asynchronous APIs:
- APIs are designed to handle queries asynchronously, enabling better utilization of system resources.
Load Balancing:
- Facebook uses intelligent load balancers to distribute incoming queries evenly across servers, avoiding bottlenecks.

7. Efficient Data Structures

Graph Data Model:
- TAO uses a simple object-association graph model to represent the social graph. This reduces the complexity of queries compared to traditional relational databases.

8. Hardware Optimization

Custom Hardware:
- Facebook optimizes its hardware to process high workloads efficiently.
SSD and RAM Usage:
- Frequently accessed data is stored in RAM or high-speed SSDs for quick retrieval.

Breakdown of Query Handling

User Request: A user initiates a request (e.g., "Show me my friends").
Cache Check: The system first checks the local cache. If the data isn’t found:
Remote Cache: It queries the remote cache. If the data isn’t found:
Sharded Database: It queries the sharded backend database.
Response: The result is sent back to the user, and the cache is updated for future queries.

Why TAO Can Handle Such High Throughput

Massive Parallelism: Distributed and sharded architecture allows millions of queries to run simultaneously.
Efficient Caching: Most reads are served directly from cache, avoiding expensive database lookups.
Optimized for Reads: The system is designed specifically for the read-heavy nature of social media workloads.
Geographic Distribution: By distributing data and requests globally, it reduces bottlenecks at any single location.

Real-Life Example:

If a Facebook user requests their news feed:

The local cache serves the most recent posts from their friends.
If some data isn’t found in the cache, it queries the distributed servers responsible for that user’s data shard.
The results are assembled and sent back, ensuring sub-second response times, even at massive scales.

In summary, TAO achieves 1 billion queries per second by combining caching, distributed architecture, efficient query execution, and fault-tolerant design, all tailored to Facebook's specific social graph workload.

source:-From paper itself

source:-from paper itself

Shashank’s Substack

Discussion about this post

Ready for more?