A transaction, or logical unit of work (LUW), is a series of actions taken against the database that occur as an atomic unit, with all actions considered to be one large group occurring together. A transaction typically contains multiple SQL statements. With each atomic unit, there is either success or failure, with all transactions being COMMITTed or none being added (ABORT / ROLLBACK).
ACID refers to a set of properties that guarantee that database transactions are processed reliably. We expect that our databases are always in a consistent state and these properties will help ensure the stability of our data.
It is easier to view transactions via an example. Let's look at a transaction that has three distinct actions.
Consistency ensures that the database moves from one valid state to a new valid state when changes are made via either statement level consistency or transaction level consistency.
With statement level consistency each statement independently processes the rows to ensure consistency, example:
UPDATE CUSTOMER
SET AreaCode = '410'
WHERE ZipCode = '21218';
All qualifying rows are updated and no concurrent updates to CUSTOMER are allowed
during the update.
With transaction level consistency all rows impacted by the SQL statements are protected from changes during the entire transaction. Note that with transaction level consistency a transaction may not see its own changes!
From the MySQL command line client we can start a transaction with START TRANSACTION;, and complete with either COMMIT; or ROLLBACK;, example:
START TRANSACTION;
UPDATE CUSTOMER
SET AreaCode = '425'
WHERE ZipCode = '21666'
# More transaction work...
UPDATE CUSTOMER
SET Discount = 0.25
WHERE AreaCode = '425'
COMMIT;
In your MySQL interactions with Python, the option to autocommit is by default disabled. In previous lectures we enabled it, but if you want to have control of when items are committed or rolled back, the autocommit needs to be off. Example:
import mysql.connector # MySQL Connection Library
# Connect to MySQL
try:
db = mysql.connector.connect(user=dbuser,
password = dbpass,
host = dbhost,
database=dbname)
db.autocommit = False # omit this line to leave default setting (off) or set autocommit to OFF
except mysql.connector.Error as err:
print(err)
exit() # assumes the program should terminate if we cannot connect
Now if we wished to commit, or rollback our work we would use either
db.commit()
to commit, or
db.rollback()
to remove our changes, which can be quite useful if we capture an error within
our Python scripts that would impact our work with the database.
Concurrent transactions are transactions that appear to users as if they are being processed at the same time. In reality, the database will only be executing one transaction at a time, similar to a CPU. However, steps from the two transactions will be interleaved. This could lead to concurrency problems where updates could be lost, or reads could become inconsistent. Imagine you have the scenario where in an online store one user is purchasing ten Snickers bars while a second is buying two Gatorade bottles.
Read nb Snickers (ns=500)
Reduce count Snickers by 10 (ns=490)
Write new nb Snickers back (ns=490)
Read nb Gatorades (ng=200)
Reduce count Gatorades by 2 (ng=198)
Write new nb Gatorades back (ng=198)
Read nb Snickers (ns=500)
Read nb Gatorades (ng=200)
Reduce count Snickers by 10 (ns=490)
Write new nb Snickers back (ns=490)
Reduce count Gatorades by 2 (ng=198)
Write new nb Gatorades back (ng=198)
Read nb Snickers (ns=500)
Reduce count Snickers by 10 (ns=490)
Write new nb Snickers back (ns=490)
Read nb Snickers (ns2=500)
Reduce count Snickers by 2 (ns2=498)
Write new nb Snickers back (ns2=498)
Read nb Snickers (ns=500)
Read nb Snickers (ns2=500)
Reduce count Snickers by 10 (ns=490)
Write new nb Snickers back (ns=490)
Reduce count Snickers by 2 (ns2=498)
Write new nb Snickers back (ns2=498)
U1: Read nb Snickers
(ns=500)
U2: Read nb Snickers
(ns2=500)
U1: Reduce count Snickers by 10
(ns=490)
U1: Write new nb Snickers back
(ns=490)
U2: Reduce count Snickers by 2
(ns2=498)
U2: Write new nb Snickers back
(ns2=498)
T1: R(Snickers)
T2: R(Snickers)
T1: W(Snickers)
T1: COMMIT
T2: W(Snickers)
T2: COMMIT
T1: R(S) W(S) Commit
T2: R(S) W(S) Commit
There are a few issues introduced when we encounter situations like the Snickers example. You must be able to distinguish between these problems. Focusing on what causes the problem is the most helpful way to distinguish between them!
Dirty Reads occur when uncommitted data from a first transaction (T1) is read by a second transaction (T2), and then T1's uncommitted data gets rolled back or aborted after T2 reads it! KEY: caused by uncommitted data being read! Note the dirty read by T2's R(A) step in this example:
T1: R(A), W(A), R(B), W(B), Abort(Undo Changes)
T2: R(A), W(A)
Non-repeatable Reads occur when a transaction reads the same data twice
during its steps and in between the reads another transaction updates or deletes
a portion of that data. KEY: caused by an UPDATE or DELETE statement
modifying data in between reads!
T1: R(A), R(A), W(A), Commit
T2: R(A), W(A), Commit
Phantom Reads occur when during a transaction, two identical queries are executed (say as part of checks at the start and end of a stored procedure), and the rows returned by the second query are inconsistent from the first due to new rows
being INSERTED and committed by another transaction. This is very similar to a non-repeatable read
and can thus be confusing. KEY: Have to have an INSERT statement to have a Phantom Read!
Locking prevents multiple applications from obtaining copies of the same resource when the
resource is about to be changed. Terminology:
Implicit Lock - Placed by the DBMS, automatically.
Explicit Lock - Issued by the application program, declared explicitly by an application programmer.
Lock Granularity - Size of the locked resource (rows, table, database level).
Exclusive Lock (X) - Prohibits other users from reading the locked resource.
Shared Lock (S) - Allows other user to read the locked resource, but they can't update it.
Lock Snickers
Read nb Snickers (ns=490)
Reduce count Snickers by 10 (ns=490)
Write new nb Snickers back (ns=490)
Release Lock
Lock Snickers (failed wait)
Lock Snickers
Read nb Snickers (ns2=490)
Reduce count Snickers by 2 (ns2=488)
Write new nb Snickers back (ns2=488)
Release Lock
There are two primary general approaches to locking:
Optimistic Locking - Assumes that no transaction conflict will occur
(preferred for most internet applications due to higher data throughput).
Pessimistic Locking - Assumes that conflict will occur (slower, but safer).
SELECT PRODUCT.Name, PRODUCT.Quantity
FROM PRODUCT
WHERE PRODUCT.Name = 'Pencil';
Set NewQuantity=PRODUCT.Quantity-5;
{Process Transaction, Assume all is OK}
LOCK TABLES PRODUCT WRITE;
UPDATE PRODUCT
SET PRODUCT.QUANTITY = NewQuantity
WHERE PRODUCT.NAME = 'Pencil';
UNLOCK TABLES;
{Check to see if write was successful,
that no other errors were produced
like an invalid negative PRODUCT.QUANTITY,
ROLLBACK changes and repeat if necessary}
LOCK TABLES PRODUCT WRITE;
SELECT PRODUCT.Name, PRODUCT.Quantity
FROM PRODUCT
WHERE PRODUCT.Name = 'Pencil';
Set NewQuantity=PRODUCT.Quantity-5;
{Process Transaction, Assume all is OK}
UPDATE PRODUCT
SET PRODUCT.QUANTITY = NewQuantity
WHERE PRODUCT.NAME = 'Pencil';
UNLOCK TABLES;
{No need to check to see if successful
as we executed our lock up front}
Most applications do not explicitly declare locks due to their complexity, instead they mark transaction boundaries and declare the locking behavior that they want the DBMS to use. With these markers (BEGIN or START TRANSACTION, COMMIT, and ROLLBACK), if the locking behavior needs to be changed, only the lock declaration passed to the DBMS needs to be changed but not the lock placement in the Persistent Stored Module in the DB. Here is the key table from the Kroenke textbook that you MUST understand to use the appropriate isolation level in MySQL. We want to choose the least restrictive level that ensures an atomic database! For example, if our stored procedure only has a single INSERT statement, NONE of our concurrency problems are possible, so use the least restrictive lock possible:
Let's look over an example in MySQL:
DELIMITER $$
CREATE PROCEDURE insertStudents (Alphavar char(6), LastNamevar varchar(50),
FirstNamevar varchar(50), Emailvar varchar(100), ClassYearvar int,
Majorvar char(4))
BEGIN
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
IF ClassYearvar > YEAR(CURDATE()) THEN
INSERT INTO Students(Alpha, LastName, FirstName, Email, ClassYear, Major)
VALUES (Alphavar, LastNamevar, FirstNamevar, Emailvar, ClassYearvar, Major);
END IF;
COMMIT;
END $$
DELIMITER ;
You can find more information on the isolation levels offered in MySQL in the documentation.
It is possible to put the database into a position where nothing can happen, example:
Database management systems must resolve such deadlocks. First, they must be able to identify that a deadlock exists. Second, they must cancel one of the transactions, ROLLBACK its changes, and allow the other transaction to complete before starting the cancelled transaction over.
Please walk through the following SQL script and execute the commands line by line in order to get more familiar with Transactions in MySQL and selecting the proper isolation level (NOTE: recommend downloading and opening in your favorite text editor for syntax highlighting and readability or reading inline below):
Here is a link to a downloadable SQL file of the code below:Transaction Isolation Levels Example.
/* TRANSACTION ISOLATION LEVELS AND DIRTY READ, NON-REPEATABLE READ,
PHANTOM READ EXERCISE
-- Goal: Show the different database concurrency problems (Dirty Read,
Nonrepeatable read, Phantom Read) and how the Transaction Isolation Levels solve them in MySQL.
Key Point: We have to choose these wisely within our transactions, we want to be
as least restrictive as is safe! (Speed is KING! But an error filled DB is worthless!)
To work through this example you will need to pull two terminals up side by
I will refer to terminal 1 and terminal 2 throughout the notes to correspond
to two separate transactions on the same table.
*/
/* SETUP THE TABLES: */
DROP TABLE IF EXISTS CourseCounts;
DROP TABLE IF EXISTS StudentEnrollment;
DROP TABLE IF EXISTS Courses;
DROP TABLE IF EXISTS Students;
CREATE TABLE Courses(
CourseNum VARCHAR(10),
CourseName VARCHAR(25),
CONSTRAINT PK_Courses PRIMARY KEY (CourseNum)
);
CREATE TABLE Students(
Alpha INT,
LName VARCHAR(45) NOT NULL,
FName VARCHAR(45) NOT NULL,
CONSTRAINT PK_Students PRIMARY KEY (Alpha)
);
CREATE TABLE StudentEnrollment(
Alpha INT,
CourseNum VARCHAR(10) NOT NULL,
Semester VARCHAR (10) NOT NULL,
Year INT NOT NULL,
CONSTRAINT PK_StudentEnrollment PRIMARY KEY (Alpha, CourseNum, Semester, Year),
CONSTRAINT FK_StudentEnrollment_Students FOREIGN KEY (Alpha)
REFERENCES Students (Alpha) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT FK_StudentEnrollment_Courses FOREIGN KEY (CourseNum)
REFERENCES Courses (CourseNum) ON DELETE CASCADE ON UPDATE CASCADE
);
CREATE TABLE CourseCounts(
CourseNum VARCHAR(10),
Semester VARCHAR(10),
Year INT,
CourseCountCur INT NOT NULL DEFAULT 0,
CourseMax INT NOT NULL DEFAULT 20,
CONSTRAINT PK_CourseCounts PRIMARY KEY (CourseNum, Semester, Year),
CONSTRAINT FK_CourseCounts_Courses FOREIGN KEY (CourseNum)
REFERENCES Courses(CourseNum) ON DELETE CASCADE ON UPDATE CASCADE
);
/* INSERT AT LEAST SOME STUDENTS */
INSERT INTO Students VALUES (255106, 'Smith', 'Paul');
INSERT INTO Students VALUES (255107, 'Duncan', 'Gavin');
DELETE FROM StudentEnrollment;
DELETE FROM Courses;
DELETE FROM CourseCounts;
INSERT INTO Courses VALUES ('NL110', 'Intro to Leadership');
/************************* MYSQL DEFAULT ISOLATION ****************************/
/* Step 1: See the MYSQL default transaction level: */
SELECT @@transaction_ISOLATION;
/***************************** READ UNCOMMITTED *******************************/
-- T1 is a stored procedure to remove a student from a course
-- T2 is a stored procedure to add a student to a course
-- SETUP: Establish a full course for NL110
INSERT INTO CourseCounts VALUES
("NL110", "FALL", 2025, 20, 20)
ON DUPLICATE KEY UPDATE CourseCountCur = 20;
-- Terminal 1:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
-- T1: R(CourseCounts) OUR SP TRANSACTION TO REMOVE A STUDENT FROM A COURSE
SELECT * FROM CourseCounts;
-- T1: W(CourseCounts) to remove student
UPDATE CourseCounts
SET CourseCountCur = CourseCountCur - 1
WHERE CourseNum = "NL110";
-- TERMINAL 2:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
-- T2: R(CourseCounts) OUR SP TRANSACTION TO ADD A STUDENT TO A COURSE
-- THIS IS OUR DIRTY READ!
-- T1's W(CourseCounts) Can be ROLLEDBACK in T1 after our logic check.
SELECT * FROM CourseCounts;
-- TERMINAL 1:
ROLLBACK;
-- TERMINAL 2:
ROLLBACK; -- OR COMMIT; -- same effect since no changes were made in T2
-- We have to be able to prevent cross transaction reads of non-committed changes!
/************************** READ COMMITTED ****************************/
-- --------------SHOW DIRTY READ IS NO LONGER POSSIBLE-------------------------
UPDATE CourseCounts
SET CourseCountCur = 20
WHERE CourseNum="NL110";
-- TERMINAL 1:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- T1: R(CourseCounts) OUR SP TRANSACTION TO REMOVE A STUDENT FROM A COURSE
SELECT * FROM CourseCounts;
-- T1: W(CourseCounts) to remove student
UPDATE CourseCounts
SET CourseCountCur = CourseCountCur - 1
WHERE CourseNum = "NL110";
-- T1: R(CourseCounts) that changes visible within transaction
SELECT * FROM CourseCounts;
-- TERMINAL 2:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- T2: R(CourseCounts)
SELECT * FROM CourseCounts;
/*
NOTE: Key Point, DIRTY READ is solved, we don't see the uncommitted changes!
DIRTY READ = when **uncommitted data** is allowed to be read from another running transaction.
*/
-- TERMINAL 1
COMMIT; -- just to finish the transaction
-- TERMINAL 2
COMMIT;
-- --------------SHOW NON-REPEATABLE READ STILL POSSIBLE ---------------------
-- SETUP:
UPDATE CourseCounts
SET CourseCountCur = 19
WHERE CourseNum="NL110";
-- TERMINAL 1:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- T1: R(CourseCounts)
SELECT (CourseMax - CourseCountCur)
FROM CourseCounts
WHERE CourseNum = "NL110";
-- TERMINAL 2:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- T2: R(CourseCounts)
SELECT (CourseMax - CourseCountCur)
FROM CourseCounts
WHERE CourseNum = "NL110";
-- T2: W(CourseCounts)
UPDATE CourseCounts
SET CourseCountCur = CourseCountCur + 1
WHERE CourseNum = "NL110";
COMMIT;
-- TERMINAL 1:
-- T1: R(CourseCounts)
SELECT (CourseMax - CourseCountCur)
FROM CourseCounts
WHERE CourseNum = "NL110";
ROLLBACK;
/* NOTE:
KEY POINT: This change in available course seats in the middle of our
T1 transaction (and after our first check) can cause errors in our DB, even though
all changes where legitimate!
NON-REPEATABLE READ = transaction reads same data twice and in between another
transaction has **updated** or **deleted** values!
*/
-- --------------SHOW PHANTOM READ STILL POSSIBLE -------------------------
-- REITERATE, WE ARE STILL USING READ COMMITTED LEVEL IN BOTH TERMINALS:
-- TERMINAL 1:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- T1: R(Courses) Checking for how many courses we offer, only have enough to offer
-- three courses per semester.
SELECT COUNT(*) FROM Courses;
-- T1: W(Courses)
INSERT INTO Courses VALUES ('IC210', 'Intro to Computing');
-- TERMINAL 2:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- T2: W(Courses)
INSERT INTO Courses VALUES ('SY110', 'Intro to Cyber Security');
INSERT INTO Courses VALUES ('SD211', 'Data Science and Programming I');
COMMIT;
--TERMINAL 1:
-- T1: R(Courses) Checking courses again, new value?!
SELECT COUNT(*) FROM Courses;
-- We better rollback, we don't have enough instructors!
ROLLBACK;
-- SHOW THIS IS DIFFERENT THAN OUR NON-REPEATABLE READ, THIS IS OUR PHANTOM READ!
-- Rather than updated or deleted data (NON-REPEATABLE READ) we have new **INSERTED
-- ROWS**, this is what makes it a PHANTOM READ vice a NON-REPEATABLE READ!
/* NOTE:
IN MOST DBMS (ORACLE, MICROSOFT SQL), PHANTOM READ isn't solved until the SERIALIZABLE setting.
in MYSQL, because of implementation differences, PHANTOM READ gets solved
a step early. See here if you are curious:
https://stackoverflow.com/a/8302879/5960479
*/
/* STUDENT TODO: repeat above steps using REPEATABLE READ settings to see they aren't possible */
the questions are related to possible operations in a MIDS-like system. The
following tables are involved:
Students(Alpha, LName, FName)
Courses(CourseNum, CourseName)
CourseCounts(CourseNum, Semester, Year,CourseCountCur, CourseMax)
StudentEnrollment(Alpha, CourseNum, Semester, Year)
Assume that the following two types of problems can run concurrently
in your application and you could also have multiple calls to the same program
running concurrently:
spReadCourseCounts: read the rows in the CourseCounts table for
courses that will be offered during a particular semester. The procedure uses
only the CourseCounts table to read some courses
spPreregisterStudents: records and changes preregistration information.
The procedure writes to StudentEnrollment table, modifies the CourseCounts
table, and reads from Students, Courses, CourseCounts table, possibly multiple
times.