ADBMS20051004

Transaction Processing:
Concurrency and
Serializability
10/4/05
Interleave transactions to improve
concurrency; increasing
concurrency can increase
throughput (performance).
Some interleaved transactions will
never violate isolation because
they act on different data.
Some interleaved transactions MAY
violate isolation.
Concurrency control: An algorithm
to (hopefully) permit good
interleaving and refuse bad
interleaving.
NB: Executing a concurrency
control algorithm adds overhead to
the transaction manager.
This overhead increases response
time and reduces throughput, offsetting
some of the gain from interleaving.
Concurrency control
The input to the algorithm is the stream of
arriving requests for database reads/writes.
The input is obtained from the various
transactions.
Output is a sequence of database
read/write requests.
The output is provided to the portion of
the data manager actually accessing
the disk.
A serial schedule has no
interleaving between transactions
(a transaction completes before
another begins).
A schedule is correct if it is
equivalent to a serial schedule.
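A minimal sketch of the serial-schedule definition, assuming a schedule is modelled as a list of (transaction, op, item) tuples. The function name `is_serial` and the tuple layout are illustrative choices, not notation from the course.

```python
def is_serial(schedule):
    """Return True if no transaction resumes after another one has started."""
    finished = set()   # transactions whose block of ops has ended
    current = None     # transaction whose ops are currently running
    for txn, _op, _item in schedule:
        if txn != current:
            if txn in finished:
                return False  # txn was interrupted and resumes: interleaving
            if current is not None:
                finished.add(current)
            current = txn
    return True


serial = [(1, "r", "a"), (1, "w", "a"), (2, "r", "a"), (2, "w", "a")]
interleaved = [(1, "r", "a"), (2, "r", "a"), (1, "w", "a"), (2, "w", "a")]
print(is_serial(serial), is_serial(interleaved))
```

Under this model, a schedule is "correct" when it is equivalent to some schedule for which `is_serial` returns True.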
Isolation levels
By relaxing the isolation
requirement, more interleaving is
possible -- at a greater risk to data
integrity.
Isolation levels characterize the
amount of isolation imposed.
Commuting operations
Two operations, p1 and p2, commute if, for all
possible initial database states,
• p1 returns the same value when executed in order
<p1, p2> or <p2, p1>,
• p2 returns the same value when executed in order
<p1, p2> or <p2, p1>, and
• the database state produced by both sequences is
the same.
• Note: commutativity is symmetric.
• NOTE! Two operations on different data items
always commute.
• Note: two operations on the same data item
MAY commute.
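The three conditions can be checked mechanically on a toy database. The sketch below is only an illustration: it assumes each operation is a function from a state (a dict) to a pair (returned value, new state), and it tests a small set of sample states as a stand-in for "all possible initial database states". The helper names `commute`, `read`, `write`, and `increment` are hypothetical.

```python
from itertools import product


def commute(p1, p2, states):
    """Check the three commutativity conditions on every sample state."""
    for s in states:
        r1_first, mid = p1(dict(s))       # order <p1, p2>
        r2_second, end_12 = p2(mid)
        r2_first, mid = p2(dict(s))       # order <p2, p1>
        r1_second, end_21 = p1(mid)
        if r1_first != r1_second or r2_first != r2_second or end_12 != end_21:
            return False
    return True


def read(item):
    return lambda s: (s[item], s)


def write(item, value):
    return lambda s: (None, {**s, item: value})


def increment(item):
    return lambda s: (None, {**s, item: s[item] + 1})


# Sample initial states; an assumption standing in for "all possible states".
states = [{"a": x, "b": y} for x, y in product(range(3), repeat=2)]

print(commute(read("a"), write("b", 7), states))     # different items
print(commute(read("a"), write("a", 7), states))     # same item, read/write
print(commute(increment("a"), increment("a"), states))  # same item, may commute
```

The two increments illustrate the last note: operations on the same item may still commute when neither returns a value and the final state does not depend on the order.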
Conflicting operations
• Two operations that do not commute are conflicting
operations.
• E.g.,
S1 : <s11, s12>
S1’ : <s12, s11>
If they are run on the same starting state and end up
in different states (or return different values), then
s11 and s12 conflict.
• Look at the following from the aspect of two different
transactions,
• A read and a read on the same item always commute.
• A read and a write on the same item conflict because,
though the final state is the same, the value returned by
the read depends on the order of the ops.
• A write and a write on the same item conflict.
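The three read/write rules above reduce to a one-line predicate. This is a sketch under the assumption that an operation is a (transaction, kind, item) tuple with kind "r" or "w"; the name `conflicts` is illustrative.

```python
def conflicts(op1, op2):
    """True if two ops of different transactions do not commute.

    Each op is (transaction, kind, item), with kind "r" or "w".
    """
    t1, k1, x1 = op1
    t2, k2, x2 = op2
    # Different transactions, same item, and at least one of them writes.
    return t1 != t2 and x1 == x2 and "w" in (k1, k2)


print(conflicts((1, "r", "a"), (2, "r", "a")))  # read/read
print(conflicts((1, "r", "a"), (2, "w", "a")))  # read/write
print(conflicts((1, "w", "a"), (2, "w", "a")))  # write/write
```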
If S2 can be obtained from S1 by
“swapping” commuting operations,
then S1 and S2 are equivalent.
Equivalence of schedules is
transitive!
Example schedules
Two interleaved transactions T1 (t11,
t12), T2 (t21, t22):
• S1: s11, s12, s13, s14
• Suppose s12 and s13 commute, then
• S2 : s11, s13, s12, s14
Same start state
Same end state
Schedule equivalence (not the same as
E&N’s ‘complete schedule’ definition):
Two schedules of the same set of ops are
equivalent iff conflicting operations are
ordered in the same way in both schedules.
==> A schedule S2 can be derived from a
schedule S1 by interchanging commuting
operations iff conflicting operations are
ordered in the same way in both schedules.
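The "same order of conflicting operations" test can be sketched directly. The code assumes each op is a distinct (transaction, kind, item) tuple and that both schedules come from the same transactions, so each transaction's own op order is already preserved. The names `conflict_order` and `equivalent` are hypothetical.

```python
from itertools import combinations


def conflicts(op1, op2):
    """Different transactions, same item, at least one write."""
    (t1, k1, x1), (t2, k2, x2) = op1, op2
    return t1 != t2 and x1 == x2 and "w" in (k1, k2)


def conflict_order(schedule):
    """Set of (earlier, later) pairs of conflicting ops in the schedule."""
    return {(p, q) for p, q in combinations(schedule, 2) if conflicts(p, q)}


def equivalent(s1, s2):
    """Same set of ops, and conflicting ops ordered the same way in both."""
    return sorted(s1) == sorted(s2) and conflict_order(s1) == conflict_order(s2)


s1 = [(1, "w", "a"), (2, "r", "a"), (1, "r", "b"), (2, "w", "b")]
# s2 swaps r2(a) and r1(b), which act on different items and so commute.
s2 = [(1, "w", "a"), (1, "r", "b"), (2, "r", "a"), (2, "w", "b")]
# s3 swaps w1(a) and r2(a), which conflict.
s3 = [(2, "r", "a"), (1, "w", "a"), (1, "r", "b"), (2, "w", "b")]
print(equivalent(s1, s2), equivalent(s1, s3))
```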
Restatement of
Serializable Schedule
A schedule is serializable if it is
equivalent to a serial schedule
Equivalent construction:
Commute commuting operators and
use transitivity of equivalence, or
Conflicting operations are in the same
order in both schedules.
Try this: is S1 serializable (what
commutations?), S2? S3?
• T1: <r(a), r(b), b += a, w(b)>
• T2: <r(a), a++, w(a)>
S1: <r1(a), r2(a), w2(a), r1(b), w1(b)>
S2: <r1(a), r1(b), r2(a), w1(b), w2(a)>
S3: <r2(a), r1(a), w2(a), r1(b), w1(b)>
Try this: is S4 serializable?
S4: <r1(a), r2(b), w2(a), w1(b)>
More on schedule
equivalence
The preceding definition of equivalence
(by commuting, AKA by maintaining
order of conflicting ops) is called conflict
equivalence.
A different kind of equivalence is view
equivalence: two schedules of the same
set of ops are view equivalent if both
of the following are true:
Corresponding read ops in each schedule
return the same values,
Both schedules yield the same final state.
View equivalence
• If corresponding read ops in both schedules
return the same values, then the transactions
perform the same calculations and write the
same results!
• I.e., transactions in both schedules have the
same view of the database.
• Conflict equivalence implies view equivalence.
• View equivalence does not imply conflict
equivalence.
• I.e., conflict equivalence is the stronger condition; but it
turns out that conflict equivalence is easier to
use for concurrency control.
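A small sketch can show a pair of schedules that is view equivalent but not conflict equivalent. It swaps in the usual textbook reformulation rather than comparing actual values: two schedules are treated as view equivalent when every read sees the same writer and each item has the same final writer. It assumes (transaction, kind, item) tuples and that a transaction reads a given item at most once; `view` and `view_equivalent` are hypothetical names.

```python
def view(schedule):
    """Return (reads-from relation, final writer of each item)."""
    last_writer = {}    # item -> transaction that wrote it most recently
    reads_from = set()
    for txn, kind, item in schedule:
        if kind == "r":
            # None stands for the initial database state.
            reads_from.add((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    return sorted(s1) == sorted(s2) and view(s1) == view(s2)


# Blind writes: T2 and T3 write a without reading it.
s = [(1, "r", "a"), (2, "w", "a"), (1, "w", "a"), (3, "w", "a")]
serial = [(1, "r", "a"), (1, "w", "a"), (2, "w", "a"), (3, "w", "a")]
print(view_equivalent(s, serial))
```

Here w2(a) and w1(a) conflict and appear in opposite orders in the two schedules, so the schedules are not conflict equivalent, yet T1 reads the initial value and T3 writes last in both.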
Serialization graphs
A schedule, S, is represented as a
directed graph.
Nodes are (committed) transactions.
Edge between Ti and Tj (Ti -> Tj) if:
Some op in Ti, pi, conflicts with some op,
pj, in Tj, and
pi appears before pj in S.
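The edge rule can be sketched as follows, assuming every transaction in the schedule has committed and each op is a (transaction, kind, item) tuple. The function name `serialization_graph` and the sample schedule are illustrative.

```python
from itertools import combinations


def serialization_graph(schedule):
    """Return the set of edges (Ti, Tj) of the serialization graph."""
    edges = set()
    # combinations() keeps schedule order, so the first op of each pair
    # appears before the second one in the schedule.
    for (ti, ki, xi), (tj, kj, xj) in combinations(schedule, 2):
        if ti != tj and xi == xj and "w" in (ki, kj):
            edges.add((ti, tj))
    return edges


# Hypothetical schedule: w1(a), r2(a), w2(b), r3(b)
schedule = [(1, "w", "a"), (2, "r", "a"), (2, "w", "b"), (3, "r", "b")]
print(serialization_graph(schedule))
```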
Example
S1: <r1(a), r2(a), w2(a), r1(b),
w2(b)>
T2 writes a after T1 reads a.
The ops do not commute:
r1(a), w2(a)
Graph of S1:
T1 -> T2
A schedule is conflict serializable iff
its serialization graph is acyclic.
[Figure: example serialization graph on nodes T1 through T7; the edges did not survive extraction.]
Topological sorts give conflict equivalent serial
schedules, e.g.:
T1, T3, T5, T2, T6, T7, T4.
Others?
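One way to test acyclicity and obtain a serial order is the standard-library `graphlib` module (Python 3.9 or later). The node and edge lists below are hypothetical, since the edges of the graph on the slide are not available in the text; `serial_order` is an illustrative name.

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(nodes, edges):
    """Return one conflict-equivalent serial order, or None if cyclic."""
    sorter = TopologicalSorter({n: set() for n in nodes})
    for before, after in edges:
        sorter.add(after, before)   # 'after' must come later than 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None                 # cyclic graph: not conflict serializable


# Hypothetical graph, not the one from the figure.
nodes = ["T1", "T2", "T3", "T4"]
edges = [("T1", "T2"), ("T1", "T3"), ("T2", "T4"), ("T3", "T4")]
print(serial_order(nodes, edges))
print(serial_order(["T1", "T2"], [("T1", "T2"), ("T2", "T1")]))
```

A graph usually has several topological sorts, which is why more than one conflict-equivalent serial schedule can exist.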
In class
• Using concurrent transactions, deposit to a, withdraw
from a, make a (non-serial) schedule:
  • Give the serialization graph.
  • Is it acyclic? If so, give a conflict equivalent serial
  schedule.
  • Identify commuting operations.
  • Identify conflicting operations.
• Using the concurrent deposit, transfer and withdraw
transactions (deposit to a, withdraw from b, transfer
takes from b and puts in a), make a (non-serial)
schedule:
  • Give the serialization graph.
  • Is it acyclic? Is there a serial schedule?
  • How many total pairs of operations are there?
  • Identify at least some commuting operations.
  • Identify at least some conflicting operations.
A strict concurrency
control
A transaction is not allowed to read or
write data that has been written by
another still active transaction.
(Recoverability topic later).
Conflict avoidance:
If operation requests by T1 and T2 do not
conflict, they are granted.
Requests don’t conflict if either:
Requests are to different data items, OR
Requests are both reads.
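The grant rule can be sketched as a predicate over the operations already granted to still-active transactions. The tuple layout (transaction, kind, item) and the name `can_grant` are assumptions made for illustration; a real transaction manager would also queue the refused request.

```python
def can_grant(request, granted):
    """Decide whether a request conflicts with ops of active transactions.

    request and each entry of granted are (transaction, kind, item) tuples,
    with kind "r" or "w".
    """
    txn, kind, item = request
    for other_txn, other_kind, other_item in granted:
        if other_txn == txn:
            continue        # a transaction does not conflict with itself
        if other_item == item and "w" in (kind, other_kind):
            return False    # same item and not both reads: must wait
    return True


granted = [(1, "r", "a"), (2, "w", "b")]
print(can_grant((3, "r", "a"), granted))   # both reads
print(can_grant((3, "w", "a"), granted))   # write on an item being read
print(can_grant((3, "w", "c"), granted))   # different item
```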
In class
Make the conflict table for the
previous algorithm (put X for conflicting
requests):

                   Requested op:
                   read     write
Granted op: read   ____     ____
            write  ____     ____
But if you make a transaction wait …
DEADLOCK
(a cycle of k transactions waiting for
each other)
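A deadlock in this sense is a cycle in the "waits for" relation. The sketch below assumes that relation is kept as a dict mapping each transaction to the set of transactions it is waiting on; the name `find_deadlock` and the dict layout are hypothetical.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a waiting cycle, or None."""
    visiting = []   # current depth-first path
    done = set()    # transactions known not to lead to a cycle

    def visit(t):
        if t in done:
            return None
        if t in visiting:
            return visiting[visiting.index(t):]   # path from t back to t
        visiting.append(t)
        for u in waits_for.get(t, ()):
            cycle = visit(u)
            if cycle:
                return cycle
        visiting.pop()
        done.add(t)
        return None

    for t in waits_for:
        cycle = visit(t)
        if cycle:
            return cycle
    return None


waiting = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}
print(find_deadlock(waiting))
print(find_deadlock({"T1": {"T2"}, "T2": set()}))
```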
Dealing with deadlock
• Prevention: maintain a data structure (e.g., a waits-for
graph) and check whether granting a request may result in
deadlock. If so, some transaction involved in the deadlock
must be aborted.
• Timeout: if a transaction's time to execute exceeds a
threshold, force an abort.
• Timestamp: timestamp the start of each transaction. Use
the timestamps to implement a conflict resolution policy:
  • An older transaction never waits for a younger one (e.g., by
  aborting the younger, even though the younger has been
  waiting a long time).
  • A younger transaction can only wait for an older one (place
  the younger on the wait-list).
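The timestamp policy above matches what is commonly called wound-wait. A minimal sketch, assuming a smaller timestamp means an older transaction and that `requester_ts` and `holder_ts` are the start timestamps of the requesting transaction and of the one it conflicts with; the function name and return strings are illustrative.

```python
def resolve(requester_ts, holder_ts):
    """Decide what happens when a request conflicts with a granted op.

    Smaller timestamp = older transaction.
    """
    if requester_ts < holder_ts:
        return "abort holder"       # an older transaction never waits
    return "requester waits"        # a younger one may wait for an older one


print(resolve(requester_ts=5, holder_ts=9))   # older requester
print(resolve(requester_ts=9, holder_ts=5))   # younger requester
```

Because waiting only ever goes from a younger transaction to an older one, the waiting relation cannot close into a cycle, which is why this policy avoids deadlock.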
Manual locking:
an alternative to AUTOMATIC locking
A transaction explicitly requests
concurrency control to grant a lock
on a data item, then makes the
read/write request.
Concurrency control grants (or
refuses) locks.
UNLOCKING
Can be automatic -- when a
transaction terminates, all locks
held by it are released.
Can be manual -- transaction
explicitly releases a lock.
Two phase locking: 2PL
A transaction follows the 2PL
protocol if it obtains all of its locks
before releasing any of them:
a lock phase, followed by an unlock phase.
Automatic locking is 2PL.
Automatic unlocking is 2PL.
2PL protocol produces serializable
schedules.
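Whether a single transaction follows the two-phase rule can be checked from its sequence of lock and unlock actions. This sketch assumes the actions are given as ("lock" or "unlock", item) pairs; `is_two_phase` is a hypothetical name.

```python
def is_two_phase(actions):
    """True if the transaction requests no lock after its first unlock."""
    unlocking = False
    for action, _item in actions:
        if action == "unlock":
            unlocking = True
        elif action == "lock" and unlocking:
            return False    # a lock after an unlock breaks the two phases
    return True


ok = [("lock", "a"), ("lock", "b"), ("unlock", "a"), ("unlock", "b")]
bad = [("lock", "a"), ("unlock", "a"), ("lock", "b"), ("unlock", "b")]
print(is_two_phase(ok), is_two_phase(bad))
```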
For next time, we’ll discuss the
paper in the RedBook: “Granularity
of Locks …”
How are the different lock modes
used?
What are the degrees of consistency?
How does the locking protocol relate
to degrees of consistency?
What are the overhead costs of the
different locking protocols?