
CS4432: Database Systems II
Transaction Management
Motivation
1
DBMS Backend Components
[Figure: DBMS backend components, with a callout marking our next focus]
2
Transactions
• A transaction = sequence of operations that either all
succeed, or all fail
• Basic unit of processing in DBMS
• Transactions have the ACID properties:
A = atomicity
C = consistency
I = isolation
D = durability
3
Goal: The ACID properties
• Atomicity: All actions in the transaction happen, or none happen.
• Consistency: If each transaction is consistent, and the DB starts consistent, it ends up consistent.
• Isolation: Execution of one transaction is isolated from that of all others.
• Durability: If a transaction commits, its effects persist.
4
Integrity & Consistency of Data
• Data in the DB should always be correct and consistent
Is this data correct (consistent)?
Name    Age
White   52
Green   3421
Gray    1
How does the DBMS decide whether data is consistent?
5
Integrity & Consistency Constraints
• Define predicates and constraints that the data must satisfy
• Examples:
– x is key of relation R
– x → y holds in R
– Domain(x) = {Red, Blue, Green}
– No employee should make more than twice the average salary
Defining constraints (CS3431)
Schema-level constraints (Add Constraint command)
CREATE TABLE Students (
  sid   CHAR(20),
  name  CHAR(20) NOT NULL,
  login CHAR(10),
  age   INTEGER,
  gpa   REAL DEFAULT 0,
  CONSTRAINT pk PRIMARY KEY (sid),
  CONSTRAINT u1 UNIQUE (login)
);
Business constraints (use of triggers)
-- Oracle PL/SQL: set the bonus to 3% of the salary on every insert or update
CREATE TRIGGER EmpBonus
BEFORE INSERT OR UPDATE ON Employee
FOR EACH ROW
BEGIN
  :new.bonus := :new.salary * 0.03;
END;
/
6
FACT: DBMS is Not Consistent All the Time
Example: a1 + a2 + … + an = TOT (constraint)
Deposit $100 in a2: a2 ← a2 + 100
                    TOT ← TOT + 100
State          a2     TOT
Initial        50     1000
Intermediate   150    1000   (not consistent)
Final          150    1100
A transaction hides intermediate states (even under failure)
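The deposit example can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the table names (account, total), the balances of a1 and a3, and the crash_midway flag are assumptions made for illustration. Only a2 = 50 and TOT = 1000 come from the slide.

```python
import sqlite3

def make_db():
    """In-memory bank: accounts a1..a3 (a1, a3 values are made up) and a stored TOT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("CREATE TABLE total (tot INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("a1", 450), ("a2", 50), ("a3", 500)])
    conn.execute("INSERT INTO total VALUES (1000)")
    conn.commit()
    return conn

def is_consistent(conn):
    """Check the constraint a1 + a2 + ... + an = TOT."""
    (balance_sum,) = conn.execute("SELECT SUM(balance) FROM account").fetchone()
    (tot,) = conn.execute("SELECT tot FROM total").fetchone()
    return balance_sum == tot

def deposit(conn, name, amount, crash_midway=False):
    """Run both updates as one transaction."""
    # "with conn" commits if the block finishes and rolls back on an exception.
    with conn:
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, name))
        if crash_midway:
            # Hypothetical failure in the intermediate (inconsistent) state.
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE total SET tot = tot + ?", (amount,))

conn = make_db()
try:
    deposit(conn, "a2", 100, crash_midway=True)
except RuntimeError:
    pass
print(is_consistent(conn))   # the half-done deposit was rolled back
deposit(conn, "a2", 100)
print(is_consistent(conn))   # both updates were applied together
```

The failed attempt leaves a2 at 50 and TOT at 1000, so the intermediate state is never visible after the rollback.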
7
Concept of Transactions
Transaction: a collection of actions that preserve consistency
Consistent DB → T → Consistent DB’
Main Assumption
If T starts with a consistent state
AND
T executes in isolation
THEN
T leaves a consistent state
8
How Can Constraints Be Violated?
• Transaction Bug
– The semantics of the transaction are wrong
– E.g., update a2 and not TOT
– The DBMS can easily detect and prevent that (if constraints are defined)
• DBMS Bug
– DBMS fails to detect inconsistent states
– Should not use this DBMS
• Hardware Failure
– Disk crash, memory failure, …
• Concurrent Access
– Many transactions accessing the data at the same time
– E.g., T1: give 10% raise to programmers
        T2: change programmers → systems analysts
Hardware failure and concurrent access are our focus, and handling them takes major components in the DBMS
9
How Can We Prevent/Fix Violations?
• Chapter 17: Due to failures only
• Chapter 18: Due to concurrent access only
• Chapter 19: Due to failures and concurrent access
10
Plan of Attack (ACID properties)
• First we will deal with “I”, by focusing on concurrency control.
• Then we will address “A” and “D” by looking at recovery.
• What about “C”?
– Well, if you have the other three working, and you set up your integrity
constraints correctly, then you get “C” for free
11
CS4432: Database Systems II
Transaction Management
Concurrency Control (Ch. 18)
12
Concurrent Transactions
[Figure: transactions T1, T2, T3, …, Tn all accessing one DB (with consistency constraints)]
• Many transactions access the data at the same time
• Some are reading, others are writing
• May conflict
13
Transactions: Example
T1: Read(A)
    A ← A + 100
    Write(A)
    Read(B)
    B ← B + 100
    Write(B)
T2: Read(A)
    A ← A × 2
    Write(A)
    Read(B)
    B ← B × 2
    Write(B)
Constraint: A = B
• How to execute these two transactions?
• How to schedule the read/write operations?
14
A Schedule
An ordering of operations (reads/writes) inside one or
more transactions over time
What is the correct outcome?
This leads to the question: what is a good schedule?
15
Schedule A
T1: Read(A); A ← A+100; Write(A); Read(B); B ← B+100; Write(B)
T2: Read(A); A ← A×2; Write(A); Read(B); B ← B×2; Write(B)
Constraint: A = B
T1                      T2                      A     B
(initial state)                                 25    25
Read(A); A ← A+100
Write(A);                                       125
Read(B); B ← B+100;
Write(B);                                             125
                        Read(A); A ← A×2;
                        Write(A);               250
                        Read(B); B ← B×2;
                        Write(B);                     250
(final state)                                   250   250
Serial Schedule: T1, T2
16
Schedule B
T1: Read(A); A ← A+100; Write(A); Read(B); B ← B+100; Write(B)
T2: Read(A); A ← A×2; Write(A); Read(B); B ← B×2; Write(B)
Constraint: A = B
T1                      T2                      A     B
(initial state)                                 25    25
                        Read(A); A ← A×2;
                        Write(A);               50
                        Read(B); B ← B×2;
                        Write(B);                     50
Read(A); A ← A+100
Write(A);                                       150
Read(B); B ← B+100;
Write(B);                                             150
(final state)                                   150   150
Serial Schedule: T2, T1
17
Serial Schedules !
• Definition: A schedule in which transactions are performed in a
serial order (no interleaving)
• The Good: Consistency is guaranteed
– Any serial schedule is “good”.
• The Bad: Throughput is low; we need to execute transactions in parallel
Solution ⇒ Interleave Transactions in A Schedule…
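As a quick check of Schedules A and B, the two serial orders can be replayed in a few lines. This is only a sketch: the dictionary db stands in for the database, and the function names t1, t2 and run_serial are made up for illustration.

```python
def t1(db):
    """T1 adds 100 to A and then to B."""
    db["A"] = db["A"] + 100
    db["B"] = db["B"] + 100

def t2(db):
    """T2 doubles A and then B."""
    db["A"] = db["A"] * 2
    db["B"] = db["B"] * 2

def run_serial(order, initial):
    """Run whole transactions one after another, with no interleaving."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db

start = {"A": 25, "B": 25}
after_t1_t2 = run_serial([t1, t2], start)   # Schedule A
after_t2_t1 = run_serial([t2, t1], start)   # Schedule B
print(after_t1_t2)   # {'A': 250, 'B': 250}
print(after_t2_t1)   # {'A': 150, 'B': 150}
```

Both orders keep A = B, but they end with different values. "Good" means the constraint holds, not that every serial order gives the same answer.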
18
Schedule C
T1: Read(A); A ← A+100; Write(A); Read(B); B ← B+100; Write(B)
T2: Read(A); A ← A×2; Write(A); Read(B); B ← B×2; Write(B)
Constraint: A = B
T1                      T2                      A     B
(initial state)                                 25    25
Read(A); A ← A+100
Write(A);                                       125
                        Read(A); A ← A×2;
                        Write(A);               250
Read(B); B ← B+100;
Write(B);                                             125
                        Read(B); B ← B×2;
                        Write(B);                     250
(final state)                                   250   250
Schedule C is NOT serial, but it is Good
19
Schedule D
T1: Read(A); A ← A+100; Write(A); Read(B); B ← B+100; Write(B)
T2: Read(A); A ← A×2; Write(A); Read(B); B ← B×2; Write(B)
Constraint: A = B
T1                      T2                      A     B
(initial state)                                 25    25
Read(A); A ← A+100
Write(A);                                       125
                        Read(A); A ← A×2;
                        Write(A);               250
                        Read(B); B ← B×2;
                        Write(B);                     50
Read(B); B ← B+100;
Write(B);                                             150
(final state)                                   250   150
Not Consistent
Schedule D is NOT serial, and it is Bad
20
Schedule E
Same as Schedule D but with new T2’
T1                      T2’                     A     B
(initial state)                                 25    25
Read(A); A ← A+100
Write(A);                                       125
                        Read(A); A ← A×1;
                        Write(A);               125
                        Read(B); B ← B×1;
                        Write(B);                     25
Read(B); B ← B+100;
Write(B);                                             125
(final state)                                   125   125
Consistent
Same schedule as D, but this one is Good
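Schedules D and E can be replayed step by step to confirm the values in the tables. The sketch below is illustrative only: it models Read(X) as copying X into a per-transaction local value and Write(X) as storing the transaction's computation on that value. The names run, order and update are assumptions, not part of the slides.

```python
def run(schedule, update, initial):
    """Execute read/write steps in the given order.

    update[t] is the computation that transaction t applies to a value
    between reading it and writing it back.
    """
    db = dict(initial)
    local = {}                       # (txn, item) -> value read by that txn
    for txn, action, item in schedule:
        if action == "r":
            local[(txn, item)] = db[item]
        else:                        # "w": write back the computed value
            db[item] = update[txn](local[(txn, item)])
    return db

# Order shared by Schedules D and E:
# r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)
order = [(1, "r", "A"), (1, "w", "A"), (2, "r", "A"), (2, "w", "A"),
         (2, "r", "B"), (2, "w", "B"), (1, "r", "B"), (1, "w", "B")]
start = {"A": 25, "B": 25}

result_d = run(order, {1: lambda v: v + 100, 2: lambda v: v * 2}, start)
result_e = run(order, {1: lambda v: v + 100, 2: lambda v: v * 1}, start)
print(result_d)   # {'A': 250, 'B': 150}
print(result_e)   # {'A': 125, 'B': 125}
```

The same order of reads and writes breaks A = B in one case and preserves it in the other. The only difference is the computation that T2 performs.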
21
What Is A ‘Good’ Schedule?
• Does not depend only on the sequence of operations
– Schedules D and E have the same sequence
– D produced inconsistent data
– E produced consistent data
– Transaction semantics played a role
• We want schedules that are guaranteed “good” regardless of:
– The initial state and
– The transaction semantics
• Hence we consider only:
– The order of read/write operations
– Any other computations are ignored (transaction semantics)
Example:
Schedule S = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)
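For experimenting with this notation, a schedule string can be turned into a list of (operation, transaction, item) tuples. This is a small sketch; the function name parse_schedule and the tuple layout are choices made here, not part of the course material.

```python
import re

def parse_schedule(text):
    """Turn 'r1(A) w1(A) ...' into [('r', 1, 'A'), ('w', 1, 'A'), ...]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", text)]

s = parse_schedule("r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)")
print(s[:3])   # [('r', 1, 'A'), ('w', 1, 'A'), ('r', 2, 'A')]
```

Everything except the order of reads and writes is dropped, which matches the idea of ignoring transaction semantics.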
22
Example: Considering Only R/W
Operations
T1                      T2’
Read(A); A ← A+100
Write(A);
                        Read(A); A ← A×1;
                        Write(A);
                        Read(B); B ← B×1;
                        Write(B);
Read(B); B ← B+100;
Write(B);
Schedule S = r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)
23
Concept: Conflicting Actions
Conflicting actions: Two actions from two different transactions on the
same object are conflicting iff at least one of them is a write
r1(A), w2(A) → Transaction 1 reads A, Transaction 2 writes A
w1(A), r2(A) → Transaction 1 writes A, Transaction 2 reads A
w1(A), w2(A) → Transaction 1 writes A, Transaction 2 writes A
No Conflict
r1(A), r2(A) → Transaction 1 reads A, Transaction 2 reads A
Conflicting actions can cause anomalies, which is bad
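The definition translates directly into a predicate. In this sketch an action is an (operation, transaction, item) tuple, a representation assumed here for illustration.

```python
def conflicts(a, b):
    """True iff a and b come from different transactions, touch the same
    item, and at least one of them is a write."""
    (op1, txn1, item1), (op2, txn2, item2) = a, b
    return txn1 != txn2 and item1 == item2 and "w" in (op1, op2)

print(conflicts(("r", 1, "A"), ("w", 2, "A")))   # True  (read-write)
print(conflicts(("w", 1, "A"), ("w", 2, "A")))   # True  (write-write)
print(conflicts(("r", 1, "A"), ("r", 2, "A")))   # False (two reads)
print(conflicts(("w", 1, "A"), ("w", 2, "B")))   # False (different items)
```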
24
Anomalies with Interleaving
Reading Uncommitted Data (WR Conflicts, “dirty reads”):
E.g., T1: A+100, B+100; T2: A*1.06, B*1.06
T1: R(A), W(A),                    R(B), W(B), Abort
T2:              R(A), W(A), C
Unrepeatable Reads (RW Conflicts):
E.g., T1: R(A), …, R(A), decrement; T2: R(A), decrement
T1: R(A),                    R(A), W(A), C
T2:        R(A), W(A), C
Overwriting Uncommitted Data (WW Conflicts):
T1: W(A),                    W(B), C
T2:        W(A), W(B), C
We need a schedule that is anomaly-free
25
Our Goal
• We need a schedule that is equivalent to some serial schedule
– It should allow interleaving
– Any serial order is good
– It produces a consistent result and is anomaly-free
Given schedule S:
If we can shuffle the non-conflicting actions to reach a serial schedule L
⇒ S is equivalent to L
⇒ S is good
26
Example: Schedule C
T1                      T2
Read(A); A ← A+100
Write(A);
                        Read(A); A ← A×2;
                        Write(A);
Read(B); B ← B+100;
Write(B);
                        Read(B); B ← B×2;
                        Write(B);
Sc = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)
r2(A) w2(A) and r1(B) w1(B) can be switched because they are not conflicting
Sc'' = r1(A) w1(A) r1(B) w1(B) r2(A) w2(A) r2(B) w2(B)
       [---------- T1 ----------] [---------- T2 ----------]
⇒ Schedule C is equivalent to a serial schedule ⇒ So it is “Good”
28
Why Schedule C turned out
to be Good ?
(Some Formalization)
Sc = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)
Conflicts on A: T1 → T2 (T1 precedes T2)
Conflicts on B: T1 → T2 (T1 precedes T2)
⇒ No cycles ⇒ Sc is “equivalent” to a
serial schedule where T1 precedes T2.
29
Example: Schedule D
SD= r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)
• Can we shuffle non-conflicting actions to reach the
serial order T1 T2 or T2 T1?
30
Example: Schedule D
SD= r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)
• Can we make T1 first ⇒ [T1 T2]?
– No… Cannot move r1(B) w1(B) forward
– Why: because r1(B) conflicts with w2(B), so it cannot move. Same for w1(B)
31
Example: Schedule D
SD= r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)
• Can we make T2 first ⇒ [T2 T1]?
– No… Cannot move r2(A) w2(A) forward
– Why: because r2(A) conflicts with w1(A), so it cannot move. Same for w2(A)
⇒ Schedule D is NOT equivalent to a serial schedule ⇒ So it is “Bad”
32
Why Schedule D turned out to
be Bad?
(Some Formalization)
SD = r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)
Conflicts on A: T1 → T2 (T1 precedes T2)
Conflicts on B: T2 → T1 (T2 precedes T1)
T1 ⇄ T2
⇒ Cycle exists ⇒ SD is “not equivalent” to any serial schedule.
33
Recap
• Serial schedules are always “Good” (consistency + no anomalies)
– But they limit the throughput
• Goal: Find an interleaved schedule that is “equivalent to” a serial schedule
• Identify “Conflicting Actions”, and try to rearrange the non-conflicting ones
to reach a serial schedule
• When formalized ⇒ Maps to Dependency Graphs and Cycle Testing
Next…
34
CS4432: Database Systems II
Transaction Management
Concurrency Control: Theory
35
Definitions
• Conflict Equivalent
– S1, S2 are conflict equivalent schedules if S1 can be transformed into S2
by a series of swaps of non-conflicting actions.
• Conflict Serializable (Serializable for short)
– A schedule S1 is conflict serializable if it is conflict equivalent to
some serial schedule.
⇒ Schedule C is conflict serializable
⇒ Schedule D is not conflict serializable
36
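The definition of conflict serializability can be applied literally by searching over all schedules reachable through swaps of adjacent non-conflicting actions. The sketch below does this with a breadth-first search. It is meant only to illustrate the definition on small examples such as Schedules C and D, since the search space grows very quickly. All function names and the tuple representation are assumptions.

```python
from collections import deque

def conflicts(a, b):
    """Different transactions, same item, at least one write."""
    return a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])

def is_serial(s):
    """Serial: each transaction's actions form one contiguous run."""
    seen, prev = set(), None
    for _, txn, _ in s:
        if txn != prev:
            if txn in seen:
                return False
            seen.add(txn)
            prev = txn
    return True

def find_serial_by_swaps(schedule):
    """Return a serial schedule reachable by swapping adjacent,
    non-conflicting actions of different transactions, or None."""
    start = tuple(schedule)
    queue, visited = deque([start]), {start}
    while queue:
        s = queue.popleft()
        if is_serial(s):
            return list(s)
        for i in range(len(s) - 1):
            a, b = s[i], s[i + 1]
            # Never reorder two actions of the same transaction.
            if a[1] != b[1] and not conflicts(a, b):
                nxt = s[:i] + (b, a) + s[i + 2:]
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(nxt)
    return None

SC = [("r", 1, "A"), ("w", 1, "A"), ("r", 2, "A"), ("w", 2, "A"),
      ("r", 1, "B"), ("w", 1, "B"), ("r", 2, "B"), ("w", 2, "B")]
SD = [("r", 1, "A"), ("w", 1, "A"), ("r", 2, "A"), ("w", 2, "A"),
      ("r", 2, "B"), ("w", 2, "B"), ("r", 1, "B"), ("w", 1, "B")]

print(find_serial_by_swaps(SC))   # the serial schedule T1, T2
print(find_serial_by_swaps(SD))   # None
```

The precedence graph gives the same answer far more cheaply than this exhaustive search.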
How to Determine This?
Answer: A Precedence Graph!
If no cycles ⇒ Schedule is conflict serializable (Good)
If cycles ⇒ Schedule is NOT conflict serializable (Bad)
37
Precedence Graph P(S) (S is a schedule)
Nodes: Transactions in S
Edges: Ti → Tj whenever the 3 conditions are met
- pi(A), qj(A) are actions in S (two actions, one from Ti and one from Tj)
- pi(A) <S qj(A) (Ti’s action comes before Tj’s action)
- at least one of pi, qj is a write (they are conflicting actions)
38
Precedence Graph
• Precedence graph for schedule S:
– Nodes: Transactions in S
– Edges: Ti → Tj whenever
• S: … ri (X) … wj (X) …
• S: … wi (X) … rj (X) …
• S: … wi(X) … wj (X) …
Note: not necessarily consecutive
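A minimal sketch of building the edge set of P(S), assuming a schedule is given as a time-ordered list of (operation, transaction, item) tuples. The function name precedence_edges is made up for illustration.

```python
def precedence_edges(schedule):
    """Return the set of edges (i, j), meaning Ti -> Tj.

    An edge is added for every pair of conflicting actions where Ti's
    action comes first; the two actions need not be consecutive.
    """
    edges = set()
    for pos, (op1, t1, x1) in enumerate(schedule):
        for op2, t2, x2 in schedule[pos + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

SC = [("r", 1, "A"), ("w", 1, "A"), ("r", 2, "A"), ("w", 2, "A"),
      ("r", 1, "B"), ("w", 1, "B"), ("r", 2, "B"), ("w", 2, "B")]
SD = [("r", 1, "A"), ("w", 1, "A"), ("r", 2, "A"), ("w", 2, "A"),
      ("r", 2, "B"), ("w", 2, "B"), ("r", 1, "B"), ("w", 1, "B")]

print(precedence_edges(SC))   # {(1, 2)}
print(precedence_edges(SD))   # edges in both directions between T1 and T2
```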
39
Graph Theory 101
Directed Graph:
Not Cycle
Directed edges
Nodes
Cycle
40
Theorem
P(S1) acyclic ⟺ S1 conflict serializable
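By the theorem, testing conflict serializability reduces to cycle detection on the precedence graph. One common way to do that is a depth-first search with three node colours, sketched below. The edge sets used in the example are those of Schedules C and D, and the function name has_cycle is an assumption.

```python
def has_cycle(nodes, edges):
    """Depth-first search: reaching a node that is still on the current
    path (GREY) means the graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in nodes}
    succ = {n: [] for n in nodes}
    for i, j in edges:
        succ[i].append(j)

    def visit(n):
        colour[n] = GREY
        for m in succ[n]:
            if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                return True
        colour[n] = BLACK          # fully explored, not part of a cycle
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)

print(has_cycle([1, 2], {(1, 2)}))           # False: Schedule C is conflict serializable
print(has_cycle([1, 2], {(1, 2), (2, 1)}))   # True: Schedule D is not
```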
41
Time dim
r2(x) r1(y) r1(z) r5(v) r5(w) w5(w)….
42
Build P(A)
⇒ No cycles
⇒ Schedule A is Conflict Serializable
43
Exercise 1:
• What is P(S) for
S = w3(A) w2(C) r1(A) w1(B) r1(C) w2(A) r4(A) w4(D)
• Is S conflict-serializable?
44
Exercise 2:
• What is P(S) for
S = w1(A) r2(A) r3(A) w4(A) ?
• Is S conflict-serializable?
45
Exercise 3:
• Build P(F)….Is F Conflict Serializable ?
46
How to Find the Equivalent Serial Order
⇒ No cycles ⇒ Schedule A is Conflict Serializable
So what is the serial order equivalent to A?
47
How to Find the Equivalent Serial Order
• The serializability order can be obtained by a topological
sorting of the graph. This is a linear order consistent with the
partial order of the graph.
– Take a transaction T with no incoming edges and put it next in the serial order
(left to right)
– Delete T and its edges from the graph
– Repeat until all transactions are taken
• There can be many orders; the result is not unique
48
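The topological-sort steps for finding an equivalent serial order can be sketched as follows. The four-transaction graph in the example is hypothetical (it is not one of the graphs from the slides), and breaking ties by the smallest transaction number is an arbitrary choice that makes the output deterministic.

```python
def serial_order(nodes, edges):
    """Repeatedly take a transaction with no incoming edges.

    Returns one equivalent serial order, or None if the graph has a cycle.
    """
    remaining = set(nodes)
    edges = set(edges)
    order = []
    while remaining:
        ready = sorted(n for n in remaining
                       if not any(j == n for _, j in edges))
        if not ready:
            return None            # every remaining node has an incoming edge
        t = ready[0]               # arbitrary tie-break: smallest number
        order.append(t)
        remaining.remove(t)
        # Delete T's outgoing edges (it has no incoming ones).
        edges = {(i, j) for i, j in edges if i != t}
    return order

# Hypothetical precedence graph: T1 -> T2, T1 -> T3, T2 -> T4, T3 -> T4
print(serial_order([1, 2, 3, 4], {(1, 2), (1, 3), (2, 4), (3, 4)}))   # [1, 2, 3, 4]
print(serial_order([1, 2], {(1, 2), (2, 1)}))                         # None
```

For the hypothetical graph, T1 T3 T2 T4 would be an equally valid order, which illustrates that the result is not unique.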
How to Find the Equivalent Serial Order
One order → T5 T1 T2 T3 T4
Another order → T1 T3 T5 T2 T4
….
49
CS4432: Database Systems II
Concurrency Control
Enforcing Serializability: Locking
50
Enforcing Serializable Schedules
• DBMSs use a “Scheduler” that schedules the actions of
transactions
• Transactions send their requests (R or W) to Scheduler
• The scheduler prevents the formation of cycles
– It grants permission to R or W only if no cycle will be formed
51
Locking Protocol
• “Scheduler” uses a locking protocol to enforce serializability
• Two new actions
– Lock (exclusive): li(A) → Transaction Ti locks item A
– Unlock: ui(A) → Transaction Ti unlocks (releases) item A
[Figure: the scheduler with its lock table]
52
Rule #1: Well-Formed Transactions
Ti: … li(A) … pi(A) … ui(A) ...
Any action (R/W) must be after the lock (l) and before the unlock (u)
Rule 1 is at the level of each transaction
independent of the others
53
Rule #2: Legal Scheduler
S = …….. li(A) ………... ui(A) ……...
         |---- no lj(A) ----|
No transaction Tj can lock an item A that is already locked by another transaction Ti
(Transaction Tj must wait until Ti releases its lock)
Rule 2 is at the level of the complete
schedule (set of interleaved transactions)
54
Exercise:
• What schedules are legal?
What transactions are well-formed?
S1 = l1(A)l1(B)r1(A)w1(B)l2(B)u1(A)u1(B)
r2(B)w2(B)u2(B)l3(B)r3(B)u3(B)
S2 = l1(A)r1(A)w1(B)u1(A)u1(B)
l2(B)r2(B)w2(B)l3(B)r3(B)u3(B)
S3 = l1(A)r1(A)u1(A)l1(B)w1(B)u1(B)
l2(B)r2(B)w2(B)u2(B)l3(B)r3(B)u3(B)
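One way to check answers to this exercise mechanically is sketched below. It assumes that "well-formed" also requires every lock to be released eventually, which is how Rule 1 is read here. The function names parse and check are made up for illustration.

```python
import re
from collections import defaultdict

def parse(text):
    """'l1(A)r1(A)u1(A)' -> [('l', 1, 'A'), ('r', 1, 'A'), ('u', 1, 'A')]."""
    return [(op, int(t), x)
            for op, t, x in re.findall(r"([lurw])(\d+)\((\w+)\)", text)]

def check(text):
    """Return (transactions that are not well-formed, schedule is legal)."""
    held = defaultdict(set)        # transaction -> items it currently locks
    not_well_formed = set()
    legal = True
    for op, t, x in parse(text):
        if op == "l":
            if any(x in items for other, items in held.items() if other != t):
                legal = False      # Rule 2: item is locked by another transaction
            held[t].add(x)
        elif op == "u":
            if x in held[t]:
                held[t].remove(x)
            else:
                not_well_formed.add(t)   # unlock without holding the lock
        elif x not in held[t]:
            not_well_formed.add(t)       # Rule 1: read/write outside lock..unlock
    # Assumption: a lock that is never released also breaks Rule 1.
    not_well_formed.update(t for t, items in held.items() if items)
    return not_well_formed, legal

S1 = "l1(A)l1(B)r1(A)w1(B)l2(B)u1(A)u1(B)r2(B)w2(B)u2(B)l3(B)r3(B)u3(B)"
S2 = "l1(A)r1(A)w1(B)u1(A)u1(B)l2(B)r2(B)w2(B)l3(B)r3(B)u3(B)"
S3 = "l1(A)r1(A)u1(A)l1(B)w1(B)u1(B)l2(B)r2(B)w2(B)u2(B)l3(B)r3(B)u3(B)"
for name, s in [("S1", S1), ("S2", S2), ("S3", S3)]:
    print(name, check(s))
```

The checker keeps the two rules separate: Rule 1 is judged per transaction, while Rule 2 is judged on the schedule as a whole.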
55
Schedule F: Let’s Add Some Locking!
T1                              T2
l1(A); Read(A)
A ← A+100; Write(A); u1(A)
                                l2(A); Read(A)
                                A ← A×2; Write(A); u2(A)
                                l2(B); Read(B)
                                B ← B×2; Write(B); u2(B)
l1(B); Read(B)
B ← B+100; Write(B); u1(B)
Is the locking mechanism working? Does it
guarantee a serializable schedule?
56
Still Something is Missing…
Schedule F as on the previous slide: T1 and T2 each lock, update, and unlock A and B separately.
Even with the locks applied, the result is still not consistent!
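Schedule F can be replayed with a toy lock table to see both facts at once: no locking rule is broken, and the constraint A = B still fails. This is a sketch under assumptions: the initial values A = B = 25 are taken from the earlier schedules, each Read/compute/Write group is collapsed into a single "rw" step, and all names are illustrative.

```python
def run_locked(steps, initial):
    """steps are (txn, action, item, fn) tuples.

    'l' and 'u' lock and unlock an item; 'rw' reads the item, applies fn
    and writes the result back. Raises if Rule 1 or Rule 2 is broken.
    """
    db, holder = dict(initial), {}          # holder: item -> locking txn
    for txn, action, item, fn in steps:
        if action == "l":
            if item in holder:
                raise RuntimeError(f"T{txn} must wait for {item}")   # Rule 2
            holder[item] = txn
        elif action == "u":
            if holder.get(item) != txn:
                raise RuntimeError(f"T{txn} does not hold {item}")   # Rule 1
            del holder[item]
        else:
            if holder.get(item) != txn:
                raise RuntimeError(f"T{txn} accesses {item} unlocked")  # Rule 1
            db[item] = fn(db[item])
    return db

def add100(v):
    return v + 100

def double(v):
    return v * 2

schedule_f = [
    (1, "l", "A", None), (1, "rw", "A", add100), (1, "u", "A", None),
    (2, "l", "A", None), (2, "rw", "A", double), (2, "u", "A", None),
    (2, "l", "B", None), (2, "rw", "B", double), (2, "u", "B", None),
    (1, "l", "B", None), (1, "rw", "B", add100), (1, "u", "B", None),
]
result_f = run_locked(schedule_f, {"A": 25, "B": 25})
print(result_f)   # {'A': 250, 'B': 150}
```

The run finishes without raising, so the transactions are well-formed and the schedule is legal. The outcome is the same as Schedule D, because T1 releases its lock on A before it acquires its lock on B.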
Next: Rule #3 (Two-Phase Locking)
57