Summary of the paper, Zanzibar: Google’s Consistent, Global Authorization System
Zanzibar is a high-performance, planet-scale authorization system designed by Google to manage access control for billions of users across hundreds of services. It is described in the research paper "Zanzibar: Google’s Consistent, Global Authorization System" (USENIX ATC 2019).
Problem Statement
Modern applications require flexible, fine-grained access control that supports hierarchical and relational data structures. Building a scalable, low-latency authorization system to handle billions of access requests across multiple services is a challenging problem.
Goals
Consistency: Provide strong guarantees so that permissions are always evaluated correctly.
Scalability: Support millions of queries per second at low latency.
Flexibility: Allow fine-grained access control policies for diverse use cases across different services.
Ease of Use: Simplify policy management for service developers.
Key Concepts and Components
Tuple-Based Data Model
Zanzibar represents access control relationships as "tuples" of the form:
object#relation@user
For example: document:123#viewer@user:alice means Alice is a viewer of document 123.
Namespace Configuration
Defines schema and rules for each object type. It is used to manage policies like inheritance or group memberships. For example, a folder might inherit permissions from its parent.
Consistency Model
Zanzibar uses Per-Request Snapshot Consistency, which ensures that every authorization query observes a consistent snapshot of the data at a specific timestamp.
Caching
Zanzibar employs aggressive caching at multiple levels (edge servers, datacenters) to minimize latency and reduce load on the storage system.
Global Distribution
Zanzibar operates on a globally distributed infrastructure to serve low-latency queries from anywhere in the world.
Change Propagation
Changes in access control policies are propagated through a globally distributed changelog. Services observe these changes in near real-time.
APIs
Zanzibar provides APIs for:
Writing tuples (managing access control policies)
Reading tuples (to understand access relationships)
Performing authorization checks (e.g., "Can user X perform action Y on object Z?")
Architecture
Storage System
Zanzibar uses Spanner, Google’s distributed SQL database, for reliable and consistent storage of access control data.
Caching Layers
Content Cache: Stores frequently accessed tuples.
Lookup Cache: Caches authorization results for specific queries.
Distributed Execution
Requests are routed to the nearest edge server. If the request cannot be resolved via caching, it queries the backend data store.
Strengths
Handles fine-grained access control at a massive scale.
Ensures strong consistency, even across a globally distributed system.
Reduces complexity for developers by abstracting policy management.
Highly performant with low query latencies.
Use Cases
Zanzibar powers many Google products such as YouTube, Google Drive, and Photos, enabling features like sharing and access permissions.
Challenges
Balancing strong consistency with low latency.
Managing access control for complex hierarchical structures.
Propagating changes in near real-time while ensuring global consistency.
Impact
Zanzibar has influenced the design of modern access control systems. Open-source implementations inspired by Zanzibar, like OpenFGA and SpiceDB (from Authzed), are widely adopted.
By abstracting access control into a centralized, scalable service, Zanzibar has set a benchmark for authorization systems in modern distributed applications.
1. Data Model: Tuple-Based Representation
Zanzibar represents access control relationships in a tuple-based data model, where each tuple is structured as:
object#relation@user
Object: Represents the resource (e.g., document:123, folder:abc).
Relation: Specifies the type of access (e.g., viewer, editor, owner).
User: Specifies the principal who has the relation: either an individual user (e.g., user:alice) or a userset such as the members of a group (e.g., group:marketing#member).
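To make the tuple format concrete, here is a minimal Python sketch that splits an `object#relation@user` string into its three parts. The names `RelationTuple` and `parse_tuple` are hypothetical and chosen for illustration; they are not part of Zanzibar's API. The sketch assumes object IDs contain no `@`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationTuple:
    """One relation tuple of the form object#relation@user (hypothetical class)."""
    obj: str       # e.g. "document:123"
    relation: str  # e.g. "viewer"
    user: str      # a user ID ("user:alice") or a userset ("group:marketing#member")


def parse_tuple(text: str) -> RelationTuple:
    """Split 'object#relation@user' into its three parts."""
    # Split on the first '@' only: the user part may itself contain '#'.
    left, sep, user = text.partition("@")
    if not sep or not user:
        raise ValueError(f"missing '@user' part: {text!r}")
    obj, sep, relation = left.partition("#")
    if not sep or not obj or not relation:
        raise ValueError(f"missing 'object#relation' part: {text!r}")
    return RelationTuple(obj, relation, user)


t = parse_tuple("document:123#viewer@user:alice")
print(t.obj, t.relation, t.user)
```

Splitting on the first `@` before splitting on `#` keeps a userset such as `group:marketing#member` intact in the user field.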
Advanced Examples
Direct Relationship:
document:123#viewer@user:alice
→ Alice is a viewer of document 123.
Group Membership:
group:marketing#member@user:bob
→ Bob is a member of the "marketing" group.
Inherited Permission:
A folder may inherit permissions from its parent. For example:
folder:root#viewer@user:charlie
folder:child#parent@folder:root
Here, folder:child names folder:root as its parent. If the namespace configuration states that viewers of a folder's parent are also viewers of the folder, Charlie can view folder:child.
Transitive Relationships:
Group memberships or hierarchies can form transitive relationships:
document:123#viewer@group:marketing#member
group:marketing#member@user:bob
This implies Bob can view document 123 because he is a member of the marketing group, and every member of that group is a viewer of the document.
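The transitive case can be sketched as a small recursive check. This is an illustrative in-memory version, not Zanzibar's algorithm: it assumes group grants are stored as usersets written `object#relation` (for example `group:marketing#member`), and the names `TUPLES` and `check` are hypothetical.

```python
# Tuples stored as (object, relation, user); a user containing '#' is a userset.
TUPLES = {
    ("document:123", "viewer", "group:marketing#member"),
    ("group:marketing", "member", "user:bob"),
    ("document:123", "viewer", "user:alice"),
}


def check(obj, relation, user, tuples=TUPLES, _seen=None):
    """Return True if `user` has `relation` on `obj`, directly or via usersets."""
    seen = _seen if _seen is not None else set()
    if (obj, relation) in seen:      # guard against membership cycles
        return False
    seen.add((obj, relation))
    for t_obj, t_rel, t_user in tuples:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if t_user == user:           # direct relationship
            return True
        if "#" in t_user:            # userset: recurse into its members
            u_obj, u_rel = t_user.split("#", 1)
            if check(u_obj, u_rel, user, tuples, seen):
                return True
    return False


print(check("document:123", "viewer", "user:bob"))
```

The `seen` set stops the recursion if two groups list each other as members.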
2. Namespace Configuration
The namespace configuration in Zanzibar defines the schema, rules, and semantics for objects and relations. It serves as a blueprint for managing access control policies for specific object types.
Key Features of Namespace Configuration
Relations and Permissions
Relations can be mapped to explicit permissions. For example:
editor implies viewer.
owner implies editor.
Computed Permissions
Relationships can be dynamically resolved using set operations like union or intersection. For example:
permission viewer = union {
relation viewer,
permission editor,
}
This means that if a user is an editor, they are automatically considered a viewer.
Inheritance
Define parent-child relationships (e.g., files inherit permissions from folders).
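As a rough illustration of implied relations, the sketch below evaluates a relation as the union of its directly stored tuples and every relation configured to imply it. The dictionary layout and the names `NAMESPACE`, `DIRECT`, and `has_relation` are assumptions made for this example, not Zanzibar's configuration language. It assumes the "implied by" graph has no cycles and leaves out parent-child inheritance.

```python
# Hypothetical config: each relation lists the other relations that imply it.
NAMESPACE = {
    "document": {
        "owner": [],
        "editor": ["owner"],    # owner implies editor
        "viewer": ["editor"],   # editor implies viewer
    }
}

# Directly stored tuples as (object, relation, user).
DIRECT = {
    ("document:123", "owner", "user:alice"),
    ("document:123", "viewer", "user:bob"),
}


def has_relation(obj, relation, user):
    """Union of the stored relation and every relation that implies it."""
    namespace = obj.split(":", 1)[0]
    if (obj, relation, user) in DIRECT:
        return True
    return any(has_relation(obj, implied_by, user)
               for implied_by in NAMESPACE[namespace][relation])


print(has_relation("document:123", "viewer", "user:alice"))
```

Alice is stored only as an owner, yet the viewer check succeeds because owner implies editor and editor implies viewer.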
3. Consistency Model: Per-Request Snapshot Consistency
What is Per-Request Snapshot Consistency?
Zanzibar ensures that every access check observes a consistent snapshot of the system's state. Even in the face of concurrent updates, Zanzibar provides a stable and reliable view of permissions for the duration of a request.
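Key Advantage: Clients are guaranteed not to see partially applied updates, and a check is never evaluated against data older than the timestamp the client supplies (encoded in the paper as an opaque token called a zookie).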
Implementation:
Snapshots are achieved using timestamps from Spanner, Google's distributed database.
Queries for access checks specify a snapshot timestamp, ensuring the evaluation is consistent with the state of the system at that point.
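The idea of evaluating a check at a fixed snapshot timestamp can be illustrated with a tiny multi-version store. This is only a sketch: a logical counter stands in for Spanner's commit timestamps, and `VersionedTupleStore` is a hypothetical in-memory class, not Spanner's or Zanzibar's API.

```python
import itertools


class VersionedTupleStore:
    """In-memory stand-in for a multi-version store (hypothetical)."""

    def __init__(self):
        self._clock = itertools.count(1)   # logical commit timestamps
        self._history = []                 # (timestamp, op, tuple), in commit order

    def write(self, tup):
        ts = next(self._clock)
        self._history.append((ts, "add", tup))
        return ts

    def delete(self, tup):
        ts = next(self._clock)
        self._history.append((ts, "remove", tup))
        return ts

    def snapshot(self, ts):
        """Return the set of tuples visible at timestamp `ts`."""
        visible = set()
        for commit_ts, op, tup in self._history:
            if commit_ts > ts:
                break                      # later commits are invisible at ts
            if op == "add":
                visible.add(tup)
            else:
                visible.discard(tup)
        return visible


store = VersionedTupleStore()
grant = ("document:123", "viewer", "user:alice")
t1 = store.write(grant)
t2 = store.delete(grant)
print(grant in store.snapshot(t1), grant in store.snapshot(t2))
```

A check pinned to `t1` keeps seeing the grant even after the delete commits at `t2`, which is what gives each request a stable view.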
Challenge of Consistency
Propagating changes globally while maintaining snapshot consistency is non-trivial.
Zanzibar achieves this via:
A changelog to propagate updates.
Caching to reduce the load on Spanner while preserving consistency guarantees.
4. Change Propagation and Global Distribution
Zanzibar supports near real-time updates to access control policies while ensuring consistency across the globe.
Changelog System
Every update to a tuple (e.g., adding or removing a permission) is appended to a changelog.
Services subscribed to the changelog are notified to update their caches or re-evaluate permissions as needed.
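A changelog of this kind can be sketched as an append-only list that subscribers read from a saved position. The class `Changelog` and its methods are hypothetical and kept in memory for illustration; the real system is distributed and its interface differs.

```python
class Changelog:
    """Append-only log of tuple changes (hypothetical, in-memory)."""

    def __init__(self):
        self._entries = []

    def append(self, op, tup):
        """Record a change and return the log length after the append."""
        self._entries.append((op, tup))
        return len(self._entries)

    def read_from(self, cursor):
        """Return (entries after `cursor`, new cursor) so a subscriber can resume."""
        return self._entries[cursor:], len(self._entries)


log = Changelog()
log.append("add", ("document:123", "viewer", "user:alice"))
log.append("remove", ("document:123", "viewer", "user:alice"))

cursor = 0                               # subscriber starts at the beginning
events, cursor = log.read_from(cursor)
for op, tup in events:
    print(op, tup)
```

Because each subscriber keeps its own cursor, it can fall behind and catch up later without missing a change.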
Global Distribution
Zanzibar is designed to serve low-latency requests across the globe.
To achieve this, it uses:
Edge Caching: Stores frequently accessed data close to users.
Region Replication: Replicates data across multiple datacenters using Spanner.
5. Caching Layers
Caching is critical for Zanzibar’s low-latency performance. It uses multiple caching layers:
Content Cache
Caches tuples (e.g., document:123#viewer@user:alice).
Reduces load on the backend storage.
Lookup Cache
Caches the results of specific authorization checks (e.g., "Can Alice view document 123?").
Optimized for high-speed query resolution.
Cache Expiry
Caches are invalidated when changes are propagated through the changelog system.
Snapshot consistency ensures that stale cache entries are avoided during query execution.
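One simple way to picture changelog-driven invalidation is a result cache that drops every entry about an object whenever that object changes. The sketch below assumes exactly that policy; `LookupCache` and `invalidate_object` are hypothetical names, and the real system's cache design is more involved.

```python
class LookupCache:
    """Caches check results and drops entries for objects that changed."""

    def __init__(self):
        self._results = {}   # (object, relation, user) -> bool

    def get(self, key):
        return self._results.get(key)   # None means "not cached"

    def put(self, key, allowed):
        self._results[key] = allowed

    def invalidate_object(self, obj):
        """Drop every cached result about `obj`; call once per changelog event."""
        stale = [k for k in self._results if k[0] == obj]
        for k in stale:
            del self._results[k]


cache = LookupCache()
cache.put(("document:123", "viewer", "user:alice"), True)
cache.put(("document:456", "viewer", "user:alice"), False)
cache.invalidate_object("document:123")   # e.g. a tuple on document:123 changed
print(cache.get(("document:123", "viewer", "user:alice")))
```

This per-object policy is deliberately naive: a change to a group's membership would also affect cached results for every document shared with that group, which this sketch does not track.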
6. Query Execution
The core operation in Zanzibar is the authorization check, which evaluates whether a user has a specific relation with an object.
Steps in Query Execution
Parse the Query
For example: "Can Alice view document 123?"
Check Caches
Lookup cache: If the result is cached, return it immediately.
Content cache: If the necessary tuples are cached, evaluate the query locally.
Backend Query
If the result isn’t cached, the query is executed against the Spanner backend, using the namespace configuration to resolve permissions.
Return Result
The result is cached for future queries.
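The steps above can be sketched as a single function that tries the lookup cache, then the content cache, then the backend. This is a simplified illustration under stated assumptions: the backend is a plain set of tuples, only direct relationships are evaluated (namespace rules are left out), and `make_checker` and `stats` are hypothetical names.

```python
def make_checker(backend_tuples):
    """Build a check function over a set of (object, relation, user) tuples."""
    lookup_cache = {}    # (object, relation, user) -> bool
    content_cache = {}   # (object, relation) -> set of users
    stats = {"lookup_hit": 0, "content_hit": 0, "backend": 0}

    def check(obj, relation, user):
        key = (obj, relation, user)
        if key in lookup_cache:                  # check caches: cached result
            stats["lookup_hit"] += 1
            return lookup_cache[key]
        if (obj, relation) in content_cache:     # check caches: cached tuples
            stats["content_hit"] += 1
            users = content_cache[(obj, relation)]
        else:                                    # backend query
            stats["backend"] += 1
            users = {u for (o, r, u) in backend_tuples if (o, r) == (obj, relation)}
            content_cache[(obj, relation)] = users
        allowed = user in users
        lookup_cache[key] = allowed              # return result and cache it
        return allowed

    return check, stats


check, stats = make_checker({("document:123", "viewer", "user:alice")})
check("document:123", "viewer", "user:alice")   # resolved by the backend
check("document:123", "viewer", "user:alice")   # resolved by the lookup cache
check("document:123", "viewer", "user:bob")     # resolved from cached tuples
print(stats)
```

The `stats` counters exist only to show which layer answered each call.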
7. Scalability
Zanzibar is designed to handle planet-scale authorization by:
Sharding Access Control Data
Tuples are sharded across Spanner instances to distribute load.
Aggressive Caching
By caching frequently accessed data at multiple levels, Zanzibar minimizes backend queries.
Efficient Query Planning
Complex queries are optimized using namespace rules and pre-computed relationships.
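Sharding can be illustrated with a stable hash over the object ID. This sketch assumes the shard is derived from the object ID alone, so all tuples for one object land together; the function name `shard_for` and the choice of SHA-256 are illustrative, not taken from the paper.

```python
import hashlib


def shard_for(object_id: str, num_shards: int) -> int:
    """Map an object ID to a shard index with a stable hash."""
    # hashlib gives the same value in every process, unlike built-in hash().
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for object_id in ("document:123", "document:456", "folder:abc"):
    print(object_id, "->", shard_for(object_id, 16))
```

Keeping every tuple for an object on the same shard means a check on that object's direct relationships reads from a single place.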
8. Open Challenges
Although Zanzibar achieves significant scalability and flexibility, it has some inherent challenges:
Latency-Sensitive Scenarios
Real-time updates may still experience propagation delays due to the changelog system.
Complex Policies
Highly nested or deeply hierarchical policies can result in complex queries, which may strain performance.
Debugging and Monitoring
Managing and debugging fine-grained access control rules for large-scale systems can be challenging.
Applications and Open-Source Influence
Zanzibar inspired open-source projects like OpenFGA and SpiceDB (from Authzed), which implement similar principles.
It is the backbone for access control in products like:
Google Drive: Sharing and file permissions.
YouTube: Video access (public/private/unlisted).
Photos: Album sharing and permissions.
By abstracting access control into a centralized system with robust guarantees, Zanzibar has redefined how modern distributed systems handle authorization. It combines flexibility, scalability, and performance in a single service.
Source: the paper itself.