In the ACID properties of database transactions, the "repeatable read" isolation level ensures that read queries within a transaction see the same data throughout its execution, even if other transactions modify the data. However, repeatable read cannot prevent a phenomenon called phantom reads.
Phantom Reads
What happens: Phantom reads occur when a transaction re-queries a range of rows and finds new rows (phantoms) inserted or deleted by other transactions during its execution.
Why it happens in repeatable read: While repeatable read prevents changes to data already read, it does not lock the range of rows. Therefore, new rows can still appear that match the original query's conditions.
How to prevent phantom reads
The serializable isolation level is required to fully prevent phantom reads. It ensures no other transactions can insert or modify rows that would affect the results of the current transaction's queries.
Example of Phantom Reads in MySQL (Repeatable Read Isolation Level)
Imagine you’re managing a car rental service database. Your system has a cars
table to store available cars, and you want to count how many cars are available for rent in a particular city. Here's the table schema:
CREATE TABLE cars (
id INT PRIMARY KEY,
city VARCHAR(100),
status VARCHAR(20) -- 'available' or 'rented'
);
Transaction 1: Start and Query Data
You start a transaction and execute a query to count available cars in New York:
START TRANSACTION;
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Result: 2
Transaction 2: Concurrent Insert
At the same time, another transaction inserts a new available car in New York:
START TRANSACTION;
INSERT INTO cars (id, city, status) VALUES (4, 'New York', 'available');
COMMIT;
Transaction 1: Re-query Data
In repeatable read, if you re-execute the same query, phantom reads may occur because the new row (id = 4
) wasn’t locked by your transaction:
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Result: 3 (a "phantom row" appears)
Despite repeatable read guaranteeing consistency for rows that were already read, it does not lock ranges of rows (e.g., city = 'New York'
). Therefore, new rows matching the query conditions can "appear," causing phantom reads.
Preventing Phantom Reads: Serializable Isolation
If you use the serializable isolation level, the database will place range locks (using gap locking) on all rows matching the query. This prevents other transactions from inserting new rows into the range.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Locks the range of rows in the query condition.
Now, Transaction 2’s attempt to insert a new row in New York would block until Transaction 1 commits or rolls back.
Example of Phantom Reads in MySQL (Repeatable Read Isolation Level)
Imagine you’re managing a car rental service database. Your system has a cars
table to store available cars, and you want to count how many cars are available for rent in a particular city. Here's the table schema:
CREATE TABLE cars (
id INT PRIMARY KEY,
city VARCHAR(100),
status VARCHAR(20) -- 'available' or 'rented'
);
Initial Data in the cars
Table:
idcitystatus1New Yorkavailable2New Yorkavailable3Bostonavailable
Transaction 1: Start and Query Data
You start a transaction and execute a query to count available cars in New York:
START TRANSACTION;
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Result: 2
Transaction 2: Concurrent Insert
At the same time, another transaction inserts a new available car in New York:
START TRANSACTION;
INSERT INTO cars (id, city, status) VALUES (4, 'New York', 'available');
COMMIT;
Transaction 1: Re-query Data
In repeatable read, if you re-execute the same query, phantom reads may occur because the new row (id = 4
) wasn’t locked by your transaction:
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Result: 3 (a "phantom row" appears)
Despite repeatable read guaranteeing consistency for rows that were already read, it does not lock ranges of rows (e.g., city = 'New York'
). Therefore, new rows matching the query conditions can "appear," causing phantom reads.
Preventing Phantom Reads: Serializable Isolation
If you use the serializable isolation level, the database will place range locks (using gap locking) on all rows matching the query. This prevents other transactions from inserting new rows into the range.
Steps:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Locks the range of rows in the query condition.
Now, Transaction 2’s attempt to insert a new row in New York would block until Transaction 1 commits or rolls back.
Key Takeaways:
Repeatable Read: Prevents dirty reads and non-repeatable reads but cannot prevent phantom rows from appearing.
Serializable: Prevents phantom reads by locking the range of rows queried.
Trade-offs: Serializable isolation significantly reduces concurrency and can lead to higher contention in high-write systems.
In Oracle Database, phantom reads do not occur under normal circumstances, even at the default READ COMMITTED isolation level, due to Oracle's use of multiversion concurrency control (MVCC). Here’s a detailed breakdown of how Oracle handles this:
How Oracle Prevents Phantom Reads
Oracle's architecture differs from some other relational databases because:
MVCC Mechanism: When a transaction queries data, Oracle creates a consistent snapshot of the data as it existed at the query's start time. This means the query sees a "frozen view" of the data, unaffected by changes made by other transactions, even if new rows are inserted.
Undo Segments: Oracle uses undo segments to reconstruct older versions of rows if required. These segments ensure that a query always sees consistent data as of the beginning of the query or transaction.
No Range Locks: Unlike databases like MySQL or SQL Server, Oracle doesn’t need to lock a range of rows to prevent phantom reads. Instead, it simply avoids reading rows created after the query began.
In Oracle Database, phantom reads do not occur under normal circumstances, even at the default READ COMMITTED isolation level, due to Oracle's use of multiversion concurrency control (MVCC). Here’s a detailed breakdown of how Oracle handles this:
How Oracle Prevents Phantom Reads
Oracle's architecture differs from some other relational databases because:
MVCC Mechanism: When a transaction queries data, Oracle creates a consistent snapshot of the data as it existed at the query's start time. This means the query sees a "frozen view" of the data, unaffected by changes made by other transactions, even if new rows are inserted.
Undo Segments: Oracle uses undo segments to reconstruct older versions of rows if required. These segments ensure that a query always sees consistent data as of the beginning of the query or transaction.
No Range Locks: Unlike databases like MySQL or SQL Server, Oracle doesn’t need to lock a range of rows to prevent phantom reads. Instead, it simply avoids reading rows created after the query began.
Scenario: Phantom Reads in Oracle
Consider a similar cars
table used in a car rental service:
CREATE TABLE cars (
id NUMBER PRIMARY KEY,
city VARCHAR2(100),
status VARCHAR2(20) -- 'available' or 'rented'
);
Initial Data in the Table:
idcitystatus1New Yorkavailable2New Yorkavailable3Bostonavailable
Transaction 1: Query Count
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Result: 2
Transaction 2: Concurrent Insert
A second transaction inserts a new car for New York:
BEGIN;
INSERT INTO cars (id, city, status) VALUES (4, 'New York', 'available');
COMMIT;
Transaction 1: Re-query Count
In Oracle, even if Transaction 2 committed its changes, Transaction 1 does not see the new row (phantom row) because Oracle ensures a consistent snapshot:
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Result: 2 (no phantom read)
Oracle relies on the undo segments to reconstruct the original view of the data, so the inserted row from Transaction 2 is ignored for the duration of Transaction 1’s query.
Serializable Isolation in Oracle
If you want to enforce even stricter isolation (e.g., disallowing other transactions from inserting new rows during your transaction), you can use the serializable isolation level:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
SELECT COUNT(*) FROM cars WHERE city = 'New York' AND status = 'available';
-- Locks the range of rows based on the query condition.
Now, if another transaction tries to insert a new row matching the condition, it will be blocked until Transaction 1 commits.
Why Oracle Prevents Phantoms by Default
Oracle's MVCC implementation ensures that even at the default READ COMMITTED level, no phantom rows are visible because queries use a consistent snapshot.
Unlike other databases (e.g., MySQL’s InnoDB), Oracle does not use range locking to handle phantoms
image source:-wikipedia