Transactions - CIS @ Temple University

Temple University – CIS Dept.
CIS616 – Principles of Data Management
V. Megalooikonomou
Transactions
(based on notes by C. Faloutsos at CMU)
General Overview
Relational model - SQL
Functional Dependencies & Normalization
Physical Design & Indexing
Query optimization
Transaction processing
    concurrency control
    recovery
Transactions - definition
= unit of work, e.g., move $10 from savings to checking
Atomicity (all or none) -> recovery
Consistency (preservation)
Isolation (as if alone) -> concurrency control
Durability (changes persist) -> recovery
Operational details
‘read(x)’: fetches ‘x’ from disk to main memory (= buffer)
‘write(x)’: writes ‘x’ to disk (sometime later)
power failure -> troubles!
Also, could lead to inconsistencies ...
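A minimal sketch of why the delayed write is dangerous. The class `BufferedStore` and its methods `flush` and `crash` are hypothetical names made up for this illustration; they model a disk that survives a failure and a main-memory buffer that does not.

```python
class BufferedStore:
    """Toy model: `disk` survives a crash, `buffer` (main memory) does not."""

    def __init__(self, disk):
        self.disk = dict(disk)
        self.buffer = {}

    def read(self, x):
        # Fetch x from disk into the buffer if it is not there yet.
        if x not in self.buffer:
            self.buffer[x] = self.disk[x]
        return self.buffer[x]

    def write(self, x, value):
        # The new value lives only in the buffer until it is flushed.
        self.buffer[x] = value

    def flush(self, x):
        # The 'sometime later' step: copy the buffered value to disk.
        self.disk[x] = self.buffer[x]

    def crash(self):
        # Power failure: everything in main memory is lost.
        self.buffer.clear()


store = BufferedStore({"savings": 100, "checking": 50})
store.write("checking", store.read("checking") + 10)
store.flush("checking")                  # this change reached the disk
store.write("savings", store.read("savings") - 10)
store.crash()                            # ... this one did not
total_on_disk = sum(store.disk.values())
```

Under these assumptions the disk ends up with checking = 60 and savings = 100, so the total is $160 instead of $150: half of the transfer survived the failure.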

Durability
transactions should survive failures
(after a transaction completes successfully, the changes in the DB persist)
Atomicity
straightforward:
Checking = Checking + 10
Savings = Savings - 10
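One way to see 'all or none' in practice is sketched below with Python's built-in `sqlite3` module. The table `accounts`, the function `transfer` and its `fail_midway` flag are assumptions made up for this example, and the failure is a simulated exception rather than a real power failure.

```python
import sqlite3


def transfer(conn, amount, fail_midway=False):
    """Move `amount` from savings to checking as one atomic unit."""
    # `with conn:` commits if the block succeeds and rolls back
    # if an exception escapes from it.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = 'checking'",
            (amount,),
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = 'savings'",
            (amount,),
        )


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("savings", 100), ("checking", 50)]
)
conn.commit()

try:
    transfer(conn, 10, fail_midway=True)
except RuntimeError:
    pass
after_failure = balances(conn)   # neither update is visible

transfer(conn, 10)
after_success = balances(conn)   # both updates are visible
```

After the failed attempt the balances are unchanged (none of the work is kept); after the successful one both accounts have changed (all of it is kept).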
Consistency
e.g., the total sum of $ is the same before and after
(but not necessarily during)
Isolation
Other transactions should not affect us
Counter-example: lost update problem:
T1            T2
read(N)
              read(N)
N=N-1
              N=N-1
write(N)
              write(N)
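The effect of this interleaving can be replayed step by step. The sketch below is only an illustration: the dictionary `db` stands in for the disk, and `buf1` and `buf2` are hypothetical names for the private buffers of the two transactions.

```python
def lost_update(n):
    """Replay the interleaving above: both transactions read before either writes."""
    db = {"N": n}
    buf1 = db["N"]      # T1: read(N)
    buf2 = db["N"]      # T2: read(N), sees the same old value
    buf1 = buf1 - 1     # T1: N=N-1
    buf2 = buf2 - 1     # T2: N=N-1
    db["N"] = buf1      # T1: write(N)
    db["N"] = buf2      # T2: write(N), overwrites T1's update
    return db["N"]


def one_after_the_other(n):
    """The same two transactions, T1 completely before T2."""
    db = {"N": n}
    for _ in ("T1", "T2"):
        buf = db["N"]   # read(N)
        buf = buf - 1   # N=N-1
        db["N"] = buf   # write(N)
    return db["N"]


interleaved_result = lost_update(10)
serial_result = one_after_the_other(10)
```

Starting from N = 10, the interleaved run ends with N = 9 although two decrements were executed, while running the transactions one after the other gives N = 8.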
Transaction states
active
partially committed
committed
failed
aborted
Outline
concurrency control (-> isolation)
- ‘correct’ interleavings
- how to achieve them
recovery (-> durability, atomicity)
Concurrency
Why do we want it?
Increased throughput (# transactions executed in a given amount of time)
Increased utilization (CPU and disk spend less time idle)
Reduced waiting time (avg. response time: avg. time for a transaction to be completed)
Example of interleaving:
T1: moves $10 from savings (X) to checking (Y)
T2: adds 10% interest to everything
Interleaved execution
T1: Read(X); X=X-10; Write(X); Read(Y); Y=Y+10; Write(Y)
T2: Read(X); X = X * 1.1; Write(X); Read(Y); Y=Y*1.1; Write(Y)
(the steps of T1 and T2 interleave over time)
‘correct’?
How to define correctness?
Back to the basics…
… let’s start from something definitely correct:
-> Serial executions
Serial execution
T1            T2
Read(X)
X=X-10
Write(X)
Read(Y)
Y=Y+10
Write(Y)
              Read(X)
              X = X * 1.1
              Write(X)
              Read(Y)
              Y=Y*1.1
              Write(Y)
‘correct’ by definition
How to define correctness?
A: Serializability:
A schedule (=interleaving) is ‘correct’ if it is serializable,
i.e., equivalent to a serial interleaving
(regardless of the exact nature of the updates)
examples and counter-examples:
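As a first illustration, the sketch below replays T1 (move $10 from X to Y) and T2 (add 10% interest) under different schedules and compares the final balances. Everything in it is an assumption made for the example: the starting balances of 100, the integer arithmetic for the interest (`* 11 // 10`), the generator-based stepping, and in particular the two interleavings, which are chosen here and are not taken from the slides. Comparing final values only shows what goes wrong for these particular updates; the definition above does not depend on them.

```python
def t1(db):
    """T1: move $10 from savings (X) to checking (Y); one step per yield."""
    x = db["X"]           # Read(X)
    yield
    x = x - 10            # X=X-10
    yield
    db["X"] = x           # Write(X)
    yield
    y = db["Y"]           # Read(Y)
    yield
    y = y + 10            # Y=Y+10
    yield
    db["Y"] = y           # Write(Y)
    yield


def t2(db):
    """T2: add 10% interest to everything; one step per yield."""
    x = db["X"]           # Read(X)
    yield
    x = x * 11 // 10      # X = X * 1.1 (integer arithmetic, an assumption)
    yield
    db["X"] = x           # Write(X)
    yield
    y = db["Y"]           # Read(Y)
    yield
    y = y * 11 // 10      # Y = Y * 1.1
    yield
    db["Y"] = y           # Write(Y)
    yield


def run(schedule, x=100, y=100):
    """Execute one step of the named transaction per schedule entry."""
    db = {"X": x, "Y": y}
    txns = {"T1": t1(db), "T2": t2(db)}
    for name in schedule:
        next(txns[name])
    return db["X"], db["Y"]


serial_t1_t2 = run(["T1"] * 6 + ["T2"] * 6)
serial_t2_t1 = run(["T2"] * 6 + ["T1"] * 6)
# T1 finishes X, then T2 finishes X, then T1 does Y, then T2 does Y.
good = run(["T1"] * 3 + ["T2"] * 3 + ["T1"] * 3 + ["T2"] * 3)
# T1 finishes X, T2 runs completely, then T1 does Y.
bad = run(["T1"] * 3 + ["T2"] * 6 + ["T1"] * 3)
```

With these assumptions the two serial runs give (99, 121) and (100, 120), both totalling 220. The first interleaving matches the serial order T1, T2. The second one gives (99, 120), which matches neither serial run: $1 of interest has disappeared.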
Example: ‘Lost-update’ problem
T1            T2
Read(N)
              Read(N)
N=N-1
              N=N-1
Write(N)
              Write(N)
not equivalent to any serial execution (why not?) -> incorrect!
More details: ‘conflict serializability’
T1            T2
Read(N)
              Read(N)
N=N-1
              N=N-1
Write(N)
              Write(N)
Conflict serializability
r/w, w/r: e.g., object X read by Ti and written by Tj
w/w: object X written by Ti and written by Tj
- the order matters in both cases …
PRECEDENCE GRAPH:
Nodes: transactions
Arcs: r/w, w/r or w/w conflicts (from the transaction that accesses the object first to the one that accesses it later)
Precedence graph
T1            T2
Read(N)
              Read(N)
N=N-1
              N=N-1
Write(N)
              Write(N)
Arcs: T1 -> T2 (on N), T2 -> T1 (on N)
Cycle -> not serializable
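The test can be sketched in a few lines. The encoding of a schedule as a list of `(transaction, operation, object)` triples and the function names `precedence_graph` and `has_cycle` are assumptions made for this illustration.

```python
from itertools import combinations


def precedence_graph(schedule):
    """Return the set of arcs (Ti, Tj) for a schedule.

    `schedule` is a list of (transaction, op, obj) triples in time order,
    with op either 'r' or 'w'. Two steps conflict if they belong to
    different transactions, touch the same object, and at least one writes.
    """
    arcs = set()
    for (ti, op_i, obj_i), (tj, op_j, obj_j) in combinations(schedule, 2):
        if ti != tj and obj_i == obj_j and "w" in (op_i, op_j):
            arcs.add((ti, tj))      # the earlier step points to the later one
    return arcs


def has_cycle(arcs):
    """Depth-first search for a cycle in the directed graph given by `arcs`."""
    graph = {}
    for u, v in arcs:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}                      # node -> 'visiting' or 'done'

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True         # back arc: cycle found
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in graph)


lost_update = [("T1", "r", "N"), ("T2", "r", "N"),
               ("T1", "w", "N"), ("T2", "w", "N")]
serial = [("T1", "r", "N"), ("T1", "w", "N"),
          ("T2", "r", "N"), ("T2", "w", "N")]

lost_update_arcs = precedence_graph(lost_update)
serial_arcs = precedence_graph(serial)
```

For the lost-update schedule the arcs are T1 -> T2 and T2 -> T1, so `has_cycle` reports a cycle; the serial schedule yields only T1 -> T2.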
Example
T1            T2            T3
Read(A)
…
Write(A)
                            Read(A)
                            …
                            Write(A)
              Read(B)
              …
              Write(B)
Read(B)
…
Write(B)
Example
Precedence graph: T1 -> T3 (on A), T2 -> T1 (on B)
serial execution?
Example
A: T2, T1, T3
(Notice that T3 should go after T2, although it starts before it!)
Q: How to generate serial execution from (acyclic) precedence graph?
Example
A: Topological sorting
A topological sort of a DAG G=(V,E) is a linear ordering of all its vertices such that if G contains an edge (u,v), then u appears before v in the ordering.
…it is the ordering of its vertices along a horizontal line so that all directed edges go from left to right.
…topologically sorted vertices appear in reverse order of their finishing times according to depth first search (DFS)
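A minimal sketch of the DFS-based approach, applied to the precedence graph of the three-transaction example (arcs T1 -> T3 and T2 -> T1). The function name `topological_sort` and the list-of-pairs encoding are assumptions for this illustration, and the input is assumed to be acyclic.

```python
def topological_sort(nodes, arcs):
    """Order the nodes of an acyclic graph by reverse DFS finishing time."""
    successors = {u: [] for u in nodes}
    for u, v in arcs:
        successors[u].append(v)

    visited = set()
    finished = []                   # nodes in order of DFS finishing time

    def dfs(u):
        visited.add(u)
        for v in successors[u]:
            if v not in visited:
                dfs(v)
        finished.append(u)          # u finishes after all its successors

    for u in nodes:
        if u not in visited:
            dfs(u)
    return finished[::-1]


serial_order = topological_sort(["T1", "T2", "T3"],
                                [("T1", "T3"), ("T2", "T1")])
```

For this graph the result is T2, T1, T3, the serial execution given as the answer above.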
Serializability
Ignore ‘view serializability’ (less stringent than ‘conflict serializability’)
We assume ‘no blind writes’, i.e., ‘read before write’
(counter) example:
‘Inconsistent analysis’
T1            T2
Read(A)
A=A-10
Write(A)
              Read(A)
              Sum = A
              Read(B)
              Sum += B
Read(B)
B=B+10
Write(B)
Precedence graph?
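To see what goes wrong in this schedule, it can be replayed on concrete numbers. The starting balances of 100 and the function name `inconsistent_analysis` are assumptions made for this sketch.

```python
def inconsistent_analysis(a=100, b=100):
    """Replay the schedule: T2 sums A and B while T1 is halfway through."""
    db = {"A": a, "B": b}

    # T1: Read(A); A=A-10; Write(A)
    a1 = db["A"]
    a1 = a1 - 10
    db["A"] = a1

    # T2: Read(A); Sum = A; Read(B); Sum += B
    total = db["A"]
    total += db["B"]

    # T1: Read(B); B=B+10; Write(B)
    b1 = db["B"]
    b1 = b1 + 10
    db["B"] = b1

    return total, db["A"] + db["B"]


seen_by_t2, actual_total = inconsistent_analysis()
```

T2 reports a sum of 190, although the total is 200 both before T1 starts and after it finishes: T2 saw the new A together with the old B.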
Conclusions
‘ACID’ properties of transactions
recovery for ‘A’, ‘D’
concurrency control for ‘I’
correct schedule -> serializable
precedence graph: acyclic -> serializable