
Transactions
Why Transactions?
• Database systems are normally being
accessed by many users or processes at the
same time.
– Both queries and modifications.
• Unlike operating systems, which support
interaction of processes, a DBMS needs to
keep processes from troublesome
interactions.
Transactions
• Transaction = process involving database
queries and/or modification.
• Normally with some strong properties
regarding concurrency.
• Formed in SQL from single statements or
explicit programmer control.
ACID Transactions
• ACID transactions are:
– Atomic : Whole transaction or none is done.
– Consistent : Database constraints preserved.
– Isolated : It appears to the user as if only one process
executes at a time.
– Durable : Effects of a process survive a crash.
• Optional: weaker forms of transactions are often
supported as well.
Atomicity of Transactions
• A transaction might commit after completing
all its actions, or it could abort (or be aborted
by the DBMS) after executing some actions.
• A very important property guaranteed by the
DBMS for all transactions is that they are
atomic. That is, a user can think of a Xact as
always executing all its actions in one step, or
not executing any actions at all.
– DBMS logs all actions so that it can undo the
actions of aborted transactions.
Concurrency in a DBMS
• Users submit transactions, and can think of each
transaction as executing by itself.
– Concurrency is achieved by the DBMS, which interleaves
actions (reads/writes of DB objects) of various transactions.
– Each transaction must leave the database in a consistent
state if the DB is consistent when the transaction begins.
• DBMS will enforce some ICs, depending on the ICs declared in
CREATE TABLE statements.
• Beyond this, the DBMS does not really understand the semantics of
the data. (e.g., it does not understand how the interest on a bank
account is computed).
• Issues: Effect of interleaving transactions, and
crashes.
Concurrent Execution
Example
• Consider two transactions (Xacts):

T1: BEGIN A=A+100, B=B-100 END
T2: BEGIN A=1.06*A, B=1.06*B END

Intuitively, the first transaction is transferring $100
from B’s account to A’s account. The second is
crediting both accounts with a 6% interest payment.
• There is no guarantee that T1 will execute before T2
or vice-versa, if both are submitted together.
However, the net effect must be equivalent to these
two transactions running serially in some order.

Example (Contd.)
• Consider a possible interleaving (schedule):

T1: A=A+100,           B=B-100
T2:          A=1.06*A,          B=1.06*B

This is OK. But what about:

T1: A=A+100,                    B=B-100
T2:          A=1.06*A, B=1.06*B

• The DBMS’s view of the second schedule:

T1: R(A), W(A),                         R(B), W(B)
T2:              R(A), W(A), R(B), W(B)
Scheduling Transactions
• Serial schedule: Schedule that does not interleave
the actions of different transactions.
• Equivalent schedules: For any database state, the
effect (on the set of objects in the database) of
executing the first schedule is identical to the effect
of executing the second schedule.
• Serializable schedule: A schedule that is equivalent
to some serial execution of the transactions.
(Note: If each transaction preserves consistency,
every serializable schedule preserves consistency.)
Anomalies with Interleaved
Execution
• Reading Uncommitted Data (WR Conflicts,
“dirty reads”):
T1: R(A), W(A),                         R(B), W(B), Abort
T2:              R(A), W(A), C

• Unrepeatable Reads (RW Conflicts):

T1: R(A),                     R(A), W(A), C
T2:        R(A), W(A), C
Anomalies (Continued)
• Overwriting Uncommitted Data (WW
Conflicts) - lost update problem:
T1: W(A),                    W(B), C
T2:        W(A), W(B), C
Recoverability: Schedules with
Aborted Transactions
T1:      r(x) w(y) commit
T2: w(x)                   abort

• T2 has aborted but has had an indirect effect on the
database – schedule is unrecoverable
• Problem: T1 read uncommitted data – dirty read
• Solution: A concurrency control is recoverable if it
does not allow T1 to commit until all other
transactions that wrote values T1 read have committed

T1:      r(x) w(y) req_commit        abort
T2: w(x)                      abort
Cascaded Abort
• Recoverable schedules solve abort problem
but allow cascaded abort: abort of one
transaction forces abort of another
T1:                r(y) w(z)                       abort
T2:      r(x) w(y)                  abort
T3: w(x)                     abort

• Better solution: prohibit dirty reads
Dirty Write
• Dirty write: A transaction writes a data item
written by an active transaction
• Dirty write complicates rollback:

T1: w(x)        abort   (no rollback necessary)
T2:       w(x)               abort   (what value of x should be restored?)
Strict Schedules
• Strict schedule: Dirty writes and dirty reads
are prohibited
• Strict and serializable are two different
properties
– Strict, non-serializable schedule:
r1(x) w2(x) r2(y) w1(y) c1 c2
– Serializable, non-strict schedule:
w2(x) r1(x) w2(y) r1(y) c1 c2
Implementing Isolation
Isolation
• Serial execution:
– Inadequate performance
• Since system has multiple asynchronous resources and
transaction uses only one at a time
• Concurrent execution:
– Improved performance
– We are interested in concurrent schedules that
are equivalent to serial schedules. These are
referred to as serializable schedules.
Serializable Schedules
• S is serializable if it is equivalent to a serial
schedule
• Transactions are totally isolated in a serializable
schedule
• A schedule is correct for any application if it is a
serializable schedule of consistent transactions
• The schedule :
r1(x) r2(y) w2(x) w1(y)
is not serializable
Conflict Equivalence
• Definition- Two schedules, S1 and S2, of the same
set of operations are conflict equivalent if
conflicting operations are ordered in the same way
in both
• Result- A schedule is serializable if it is conflict
equivalent to a serial schedule
r1(x) w2(x) w1(y) r2(y)  ≡  r1(x) w1(y) w2(x) r2(y)
(the conflicting pairs r1(x)/w2(x) and w1(y)/r2(y) are
ordered the same way in both)
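The definition can be checked mechanically: build a precedence graph with an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and test it for cycles. A minimal sketch (function and variable names are my own, not from the slides):

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) triples, e.g. ('T1', 'r', 'x')."""
    edges = set()
    txns = {t for t, _, _ in schedule}
    for i, (ti, oi, xi) in enumerate(schedule):
        for tj, oj, xj in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and xi == xj and 'w' in (oi, oj):
                edges.add((ti, tj))  # Ti must precede Tj in any equivalent serial order

    def cyclic(node, stack, done):
        # Depth-first search for a cycle in the precedence graph.
        if node in stack:
            return True
        if node in done:
            return False
        stack.add(node)
        if any(cyclic(b, stack, done) for a, b in edges if a == node):
            return True
        stack.remove(node)
        done.add(node)
        return False

    done = set()
    return not any(cyclic(t, set(), done) for t in txns)

# The example above, r1(x) w2(x) w1(y) r2(y): only the edge T1 -> T2 arises,
# so the schedule is conflict equivalent to the serial order T1, T2.
print(conflict_serializable(
    [('T1', 'r', 'x'), ('T2', 'w', 'x'), ('T1', 'w', 'y'), ('T2', 'r', 'y')]))  # True
```

The earlier non-serializable schedule r1(x) r2(y) w2(x) w1(y) produces edges in both directions between T1 and T2, so the same function returns False.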
View Equivalence
• Two schedules of the same set of operations are
view equivalent if:
– Corresponding read operations in each return the
same values (hence computations are the same)
– Both schedules yield the same final database state
• Conflict equivalence implies view equivalence.
• View equivalence does not imply conflict
equivalence.
View Equivalence
T1:       w(y) w(x)
T2: r(y)            w(x)
T3:                       w(x)

• Schedule is not conflict equivalent to a serial
schedule
• Schedule has same effect as serial schedule
T2 T1 T3. It is view equivalent to a serial
schedule and hence serializable
Conflict vs. View Equivalence
[Diagram: the set of schedules that are conflict equivalent to
serial schedules is a proper subset of the set of schedules that
are view equivalent to serial schedules.]
• A concurrency control based on view equivalence should provide
better performance than one based on conflict equivalence since less
reordering is done but …
• It is difficult to implement a view equivalence concurrency control
• A concurrency control that guarantees conflict equivalence to serial
schedules ensures correctness and is easily implemented.
Concurrency Control
[Diagram: arriving schedule (from transactions) → concurrency
control → serializable schedule (to the processing engine)]
• Concurrency control cannot see entire schedule:
– It sees one request at a time and must decide whether to
allow it to be serviced
• Strategy: Do not service a request if:
– It violates serializability, or
– There is a possibility that a subsequent arrival might
cause a violation of serializability
Models of Concurrency Controls
• Immediate Update
– A write updates a database item
– A read copies value from a database item
– Commit makes updates durable
– Abort undoes updates
• Deferred Update
– A write stores new value in the transaction’s intentions list
(does not update database)
– A read copies value from database or transaction’s
intentions list
– Commit uses intentions list to durably update database
– Abort discards intentions list
Immediate vs. Deferred Update
[Diagram: under immediate update, transaction T reads and writes
the database directly. Under deferred update, T reads from the
database, reads/writes its intentions list, and commit applies
the intentions list to the database.]
Models of Concurrency Controls
• Pessimistic
– A transaction requests permission for each database
(read/write) operation
– Concurrency control can:
• Grant the operation (submit it for execution)
• Delay it until a subsequent event occurs (commit or abort of another
transaction), or
• Abort the transaction
– Decisions are made conservatively so that a commit request
can always be granted
• Takes precautions even if conflicts do not occur
Models of Concurrency Controls
• Optimistic – Request for database operations (read/write) are
always granted
– Request to commit might be denied
• Transaction is aborted if it performed a non-serializable
operation
• Assumes that conflicts are not likely
Locking: Implementation of an
Immediate-Update Pessimistic Control
• A transaction can read a database item if it
holds a read (shared) lock on the item
• It can read or update the item if it holds a
write (exclusive) lock
• If the transaction does not already hold the
required lock, a lock request is automatically
made as part of the access
Locking
• Request for read lock granted if no transaction currently holds
write lock on item
– Cannot read an item written by an active transaction
• Request for write lock granted if no transaction holds any lock
on item
– Cannot write an item read/written by an active transaction
• All locks held by a transaction are released when the transaction
completes (commits or aborts)
• Resulting schedules are serializable and strict
                       Requested mode
Granted mode      read           write
read              OK             x
write             x              x
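The compatibility matrix reads directly as a grant check: a request is granted only if it is compatible with every lock already held. A sketch (the names are illustrative, not a real DBMS API):

```python
# Compatibility of a requested mode against one granted mode:
# only read/read is compatible.
COMPATIBLE = {
    ('read', 'read'): True,
    ('read', 'write'): False,
    ('write', 'read'): False,
    ('write', 'write'): False,
}

def can_grant(requested, granted_modes):
    """Grant a lock iff the request is compatible with every lock already held."""
    return all(COMPATIBLE[(g, requested)] for g in granted_modes)

print(can_grant('read', ['read', 'read']))  # True: shared locks coexist
print(can_grant('write', ['read']))         # False: cannot write an item read by an active Xact
print(can_grant('read', ['write']))         # False: cannot read an item written by an active Xact
print(can_grant('write', []))               # True: no locks held on the item
```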
Manual Locking
• Better performance possible if transactions
are allowed to release locks before commit
• However, early lock release can lead to non-serializable schedules

T1: l(x) r(x) u(x)                                        l(y) r(y) u(y)
T2:                 l(x) l(y) w(x) w(y) u(x) u(y) commit
Two-Phase Locking
• Transaction does not release a lock until it has all
the locks it will ever require.
• Transaction, T, has a locking phase followed by an
unlocking phase
[Diagram: the number of locks held by T rises during the locking
phase and falls after the first unlock, over time.]
• Guarantees serializability when locking is done
manually
Strict Two-Phase Locking
• A strict two-phase locking control holds all locks
until commit and produces strict serializable
schedules
– This is automatic locking
– Equivalent to a serial schedule in which transactions are
ordered by their commit time
– Guarantees serializability, and recoverable schedule, too!
[Diagram: the number of locks held rises over time t and all are
released at commit.]
Locking Implementation
• Associate a lock set, L(x), and a wait set, W(x), with
each active database item, x
– L(x) contains an entry for each granted lock
– W(x) contains an entry for each pending request
– When an entry is removed from L(x) (due to
transaction termination), promote (non- conflicting)
entries from W(x) using some scheduling policy
(e.g., FCFS)
• Associate a lock list, Li , with each transaction, Ti.
– Li links Ti’s elements in all lock and wait sets
– Used to release locks on termination
Handling a Lock Request
[Flowchart: Lock Request (XID, OID, Mode).
If Mode == S: grant the lock if the item is not currently
X-locked and the wait queue is empty; otherwise put the request
on the queue.
If Mode == X: grant the lock if the item is not currently locked
at all; otherwise put the request on the queue.]
More Lock Manager Logic
• On lock release (OID, XID):
– Update list of Xacts holding lock.
– Examine head of wait queue.
– If Xact there can run, add it to list of Xacts
holding lock (change mode as needed).
– Repeat until head of wait queue cannot be run.
• Note: Lock request handled atomically!
– via latches (i.e. semaphores/mutex; OS stuff).
Lock Upgrades
• Think about this scenario:
– T1 locks A in S mode, T2 requests X lock on A, T3 requests
S lock on A. What should we do?
• In contrast:
– T1 locks A in S mode, T2 requests X lock on A, T1 requests
X lock on A. What should we do?
• Allow such upgrades to supersede lock requests.
– Consider this scenario:
• S1(A), X2(A), X1(A): DEADLOCK!
• BTW: Deadlock can occur even w/o upgrades:
– X1(A), X2(B), S1(B), S2(A)
Deadlocks
• Deadlock: Cycle of transactions waiting for locks
to be released by each other.
• Two ways of dealing with deadlocks:
– Deadlock prevention
– Deadlock detection
Deadlock Prevention
X1(A), X2(B), S1(B), S2(A)
• Assign a timestamp to each Xact as it enters the
system. “Older” Xacts have priority.
• Assume Ti requests a lock, but Tj holds a
conflicting lock.
– Wait-Die: If Ti has higher priority, it waits; else Ti
aborts. (non-preemptive)
– Wound-Wait: If Ti has higher priority, abort Tj; else Ti
waits. (preemptive)
– Note: After abort, restart with original timestamp!
– Both guarantee deadlock-free behavior! Pros and cons
of each?
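The two policies reduce to one-line decision rules. A sketch, where a smaller timestamp means an older (higher-priority) transaction, and Ti requests a lock that Tj holds (return strings are my own labels):

```python
def wait_die(ts_i, ts_j):
    """Non-preemptive: an older requester waits; a younger one dies (aborts)."""
    return 'Ti waits' if ts_i < ts_j else 'Ti aborts'

def wound_wait(ts_i, ts_j):
    """Preemptive: an older requester wounds (aborts) the holder; a younger one waits."""
    return 'Tj aborts' if ts_i < ts_j else 'Ti waits'

# T1 (timestamp 1, older) requests a lock held by T2 (timestamp 2):
print(wait_die(1, 2))    # 'Ti waits'
print(wound_wait(1, 2))  # 'Tj aborts'
# The younger T2 requesting against T1 never preempts it:
print(wait_die(2, 1))    # 'Ti aborts'
print(wound_wait(2, 1))  # 'Ti waits'
```

Both are deadlock-free because every wait edge points in the same timestamp direction, so no cycle of waits can form.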
Deadlock Detection
• Create a waits-for graph:
– Nodes are transactions
– There is an edge from Ti to Tj if Ti is waiting for Tj to
release a lock
• Periodically check for cycles in the waits-for graph.
• “Shoot” some Xact to break the cycle.
• Simpler hack: time-outs.
– T1 made no progress for a while? Shoot it.
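The periodic check is ordinary graph search over the waits-for graph. A sketch (the dict-of-sets representation is my own choice):

```python
def find_deadlock(waits_for):
    """waits_for: {txn: set of txns it waits for}.
    Return the set of transactions on some cycle, or None if deadlock-free."""
    def on_cycle(node, path):
        if node in path:                       # revisited a node: cycle found
            return set(path[path.index(node):])
        for nxt in waits_for.get(node, ()):
            found = on_cycle(nxt, path + [node])
            if found:
                return found
        return None

    for t in waits_for:
        cycle = on_cycle(t, [])
        if cycle:
            return cycle
    return None

# X1(A), X2(B), S1(B), S2(A): T1 waits for T2, and T2 waits for T1.
print(find_deadlock({'T1': {'T2'}, 'T2': {'T1'}}))  # a cycle containing T1 and T2
print(find_deadlock({'T1': {'T2'}}))                # None: waiting, but no deadlock
```

"Shooting" any transaction in the returned set breaks the cycle.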
Deadlock Detection (Continued)
Example:

T1: S(A), R(A),                               S(B)
T2:               X(B), W(B)
T3:                           S(C), R(C)              X(A)
T4:                                        X(C)                X(B)

[Diagram: the waits-for graphs over T1–T4 as these requests
arrive, before and after the cycle forms.]
Prevention vs. Detection
• Prevention might abort too many Xacts.
• Detection might allow deadlocks to tie up
resources for a while.
– Can detect more often, but it’s time-consuming.
• The usual answer:
– Detection is the winner.
– Deadlocks are pretty rare.
– If you get a lot of deadlocks, reconsider your
schema/workload!
Lock Granularity
• Data item: variable, record, row, table, file
• When an item is accessed, the DBMS locks an entity
that contains the item. The size of that entity
determines the granularity of the lock
– Coarse granularity (large entities locked)
• Advantage: If transactions tend to access multiple items
in the same entity, fewer lock requests need to be
processed and less lock storage space required
• Disadvantage: Concurrency is reduced since some
items are unnecessarily locked
– Fine granularity (small entities locked)
• Advantages and disadvantages are reversed
Lock Granularity
• Table locking (coarse)
– Lock entire table when a row is accessed.
• Row (tuple) locking (fine)
– Lock only the row that is accessed.
• Page locking (compromise)
– When a row is accessed, lock the containing
page
Dynamic Databases
• If we relax the assumption that the DB is a fixed collection
of objects, even Strict 2PL will not assure serializability:
– T1 locks all pages containing sailor records with rating
= 1, and finds oldest sailor (say, age = 71).
– Next, T2 inserts a new sailor; rating = 1, age = 96.
– T2 also deletes oldest sailor with rating = 2 (and, say,
age = 80), and commits.
– T1 now locks all pages containing sailor records with
rating = 2, and finds oldest (say, age = 63).
• No consistent DB state where T1 is “correct”!
The Problem
• T1 implicitly assumes that it has locked the set of all
sailor records with rating = 1.
– Assumption only holds if no sailor records are added
while T1 is executing!
– Need some mechanism to enforce this assumption.
(Index locking and predicate locking.)
• Example shows that conflict serializability guarantees
serializability only if the set of objects is fixed!
Index Locking

[Diagram: an index over the data, with the entries for r = 1
highlighted.]
• If there is an unclustered index on the rating field, T1
should lock the index page containing the data entries with
rating = 1.
– If there are no records with rating = 1, T1 must lock the
index page where such a data entry would be, if it
existed!
• If there is no suitable index, T1 must lock all pages, and
lock the file/table to prevent new pages from being added,
to ensure that no new records with rating = 1 are added.
Predicate Locking
• Grant lock on all records that satisfy some logical
predicate, e.g. age > 2*salary.
• Index locking is a special case of predicate locking for
which an index supports efficient implementation of the
predicate lock.
– What is the predicate in the sailor example?
• In general, predicate locking has a lot of locking overhead.
Transaction Support in SQL
• Each transaction has an access mode, a diagnostics size, and
an isolation level.
• SET TRANSACTION ISOLATION LEVEL X
where X is:
Isolation Level       Dirty Read   Unrepeatable Read   Phantom Problem
READ UNCOMMITTED      Maybe        Maybe               Maybe
READ COMMITTED        No           Maybe               Maybe
REPEATABLE READ       No           No                  Maybe
SERIALIZABLE          No           No                  No
Timestamp-Ordered
Concurrency Control
• Each transaction given a (unique) timestamp
(current clock value) when initiated
• Uses the immediate update model
• Guarantees equivalent serial order based on
timestamps (initiation order)
Timestamp-Ordered
Concurrency Control
• Associated with each database item, x, are
two timestamps:
– wt(x), the largest timestamp of any transaction
that has written x,
– rt(x), the largest timestamp of any transaction
that has read x,
– and an indication of whether or not the last write
to that item is from a committed transaction
Read Procedure
• If T requests to read x:
– R1: if TS(T) < wt(x), then T is too old; abort T
– R2: if TS(T) > wt(x), then
• if the value of x is committed, grant T’s read and if
TS(T) > rt(x) assign TS(T) to rt(x)
• if the value of x is not committed, T waits (to avoid
a dirty read)
Write Procedure
• If T requests to write x :
– W1: If TS(T) < rt(x), then T is too old; abort T
– W2: If rt(x) < TS(T) < wt(x), then no transaction that
read x should have read the value T is attempting to write
and no transaction will read that value (R1)
• If x is committed, grant the request but do not do the write
• If x is not committed, T waits to see if newer value will commit.
If it does, discard T’s write, else perform it
– W3: If wt(x), rt(x) < TS(T), then if x is committed, grant
the request and assign TS(T) to wt(x), else T waits
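Ignoring the committed/uncommitted distinction (i.e., assuming every granted write commits immediately, so the wait cases of R2, W2 and W3 never arise), rules R1–R2 and W1–W3 reduce to a few timestamp comparisons. A simplified sketch, not the full procedure:

```python
class Item:
    """A database item with the two timestamps the control maintains."""
    def __init__(self):
        self.rt = 0   # largest timestamp of any transaction that read it
        self.wt = 0   # largest timestamp of any transaction that wrote it

def to_read(ts, x):
    if ts < x.wt:                    # R1: a younger Xact already wrote x
        return 'abort'
    x.rt = max(x.rt, ts)             # R2: grant; advance rt(x) if needed
    return 'granted'

def to_write(ts, x):
    if ts < x.rt:                    # W1: a younger Xact already read x
        return 'abort'
    if ts < x.wt:                    # W2: grant, but the write is obsolete, so skip it
        return 'granted, write skipped'
    x.wt = ts                        # W3: grant; advance wt(x)
    return 'granted'

x = Item()
print(to_write(2, x))   # 'granted' (W3): wt(x) is now 2
print(to_read(1, x))    # 'abort'   (R1): a transaction with timestamp 1 is too old
print(to_read(3, x))    # 'granted' (R2): rt(x) is now 3
```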
Example
• Assume TS(T1) < TS(T2), at t0 x and y are committed,
and x’s and y’s read and write timestamps are less
than TS(T1)
       t0     t1       t2       t3             t4
T1:           r(y)                             w(x) commit
T2:                    w(y)     w(x) commit

t1: (R2) TS(T1) > wt(y); assign TS(T1) to rt(y)
t2: (W3) TS(T2) > rt(y), wt(y); assign TS(T2) to wt(y)
t3: (W3) TS(T2) > rt(x), wt(x); assign TS(T2) to wt(x)
t4: (W2) rt(x) < TS(T1) < wt(x); grant request, but don’t
do the write
Pros and Cons
• Control accepts schedules that are not conflict
equivalent to any serial schedule and would not
be accepted by a two-phase locking control
– Previous example equivalent to T1, T2
• But additional space required in database for
storing timestamps and time for managing
timestamps
– Reading a data item now implies writing back a new
value of its timestamp
Optimistic Algorithms
• Do task under simplifying (optimistic) assumption
– Example: Operations rarely conflict
• Check afterwards if assumption was true.
– Example: Did a conflict occur?
• Redo task if assumption was false
– Example: If a conflict has occurred rollback, else commit
• Performance benefit if assumption is generally true
and check can be done efficiently
Optimistic Concurrency Control
• Under the optimistic assumption that conflicts do not
occur, read and write requests are always granted (no
locking, no overhead!)
• Since conflicts might occur, the database might be
corrupted if writes were immediate. Hence a deferred-update
model is used
• This approach contrasts with the pessimistic algorithm
in which conflicts are assumed likely, preventative
measures (acquire locks) are always taken, and no
validation is required (commit always granted)
Optimistic CC (Kung-Robinson)
• Xacts have three phases:
– READ: Xacts read from the database, but make
changes to private copies of objects.
– VALIDATE: Check for conflicts.
– WRITE: Make local copies of changes public.
[Diagram: a ROOT pointer is switched from the old versions of
modified objects to the new ones when changes are made public.]
Validation
• Test conditions that are sufficient to ensure that no conflict
occurred.
• Each Xact is assigned a numeric id.
– Just use a timestamp.
• Xact ids assigned at end of READ phase, just before
validation begins. (Why then?)
• ReadSet(Ti): Set of objects read by Xact Ti.
• WriteSet(Ti): Set of objects modified by Ti.
Test 1
• For all i and j such that Ti < Tj, check that
Ti completes before Tj begins.
Ti: [ R ][ V ][ W ]
Tj:                  [ R ][ V ][ W ]
Test 2
• For all i and j such that Ti < Tj, check that:
– Ti completes before Tj begins its Write phase, and
– WriteSet(Ti) ∩ ReadSet(Tj) is empty.

Ti: [ R ][ V ][ W ]
Tj:       [ R ]       [ V ][ W ]

Does Tj read dirty data? Does Ti overwrite Tj’s writes?
Test 3
• For all i and j such that Ti < Tj, check that:
– Ti completes its Read phase before Tj does, and
– WriteSet(Ti) ∩ ReadSet(Tj) is empty, and
– WriteSet(Ti) ∩ WriteSet(Tj) is empty.

Ti: [ R ][ V ]     [ W ]
Tj:      [ R ]          [ V ][ W ]

Does Tj read dirty data? Does Ti overwrite Tj’s writes?
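The three tests can be expressed over per-transaction phase boundaries and read/write sets. A sketch with illustrative field names (not from the slides); Tj's write phase is approximated as starting when its read phase ends:

```python
def validate(ti, tj):
    """Can Ti < Tj run concurrently without conflict? ti/tj are dicts with
    'start', 'read_end', 'write_end', 'readset', 'writeset'."""
    # Test 1: Ti finished entirely before Tj began.
    if ti['write_end'] < tj['start']:
        return True
    # Test 2: Ti finished before Tj's write phase, and WriteSet(Ti) ∩ ReadSet(Tj) is empty.
    if ti['write_end'] < tj['read_end'] and not (ti['writeset'] & tj['readset']):
        return True
    # Test 3: Ti's read phase ended before Tj's, and Ti's writes intersect
    # neither Tj's reads nor Tj's writes.
    if (ti['read_end'] < tj['read_end']
            and not (ti['writeset'] & tj['readset'])
            and not (ti['writeset'] & tj['writeset'])):
        return True
    return False

t1 = dict(start=0, read_end=5, write_end=7, readset={'a'}, writeset={'a'})
t2 = dict(start=3, read_end=9, write_end=11, readset={'b'}, writeset={'b'})
print(validate(t1, t2))  # True (Test 2): disjoint sets, T1 finishes writing before T2 does
```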
Overheads in Optimistic CC
• Must record read/write activity in ReadSet and WriteSet per
Xact.
– Must create and destroy these sets as needed.
• Must check for conflicts during validation, and must make
validated writes “global”.
– Critical section can reduce concurrency.
– Scheme for making writes global can reduce clustering of
objects.
• Optimistic CC restarts Xacts that fail validation.
– Work done so far is wasted; requires clean-up.
Distributed Transactions
Introduction
• Data is stored at several sites, each managed by a
DBMS that can run independently.
• Distributed Data Independence: Users should not
have to know where data is located (extends
Physical and Logical Data Independence
principles).
• Distributed Transaction Atomicity: Users should
be able to write Xacts accessing multiple sites just
like local Xacts.
Distributed DBMS Architectures
• Client-Server: Client ships query to a single site. All query
processing at the server.
• Collaborating-Server: Query can span multiple sites.

[Diagram: clients submitting queries to single servers vs.
servers collaborating on one query.]
ACID Properties
• Assuming each DBMS supports ACID
properties locally and eliminates local
deadlocks, the additional issues are:
– Global atomicity: all subordinates abort or all
commit
– Global deadlocks: there must be no deadlocks
involving multiple sites
– Global serialization: distributed transaction must
be globally serializable
Updating Distributed Data
• Synchronous Replication: All copies of a modified relation
(fragment) must be updated before the modifying Xact
commits.
– Data distribution is made transparent to users.
• Asynchronous Replication: Copies of a modified relation
are only periodically updated; different copies may get out
of synch in the meantime.
– Users must be aware of data distribution.
– Current products follow this approach.
Synchronous Replication
• Voting: Xact must write a majority of copies to modify an
object; must read enough copies to be sure of seeing at least one
most recent copy.
– E.g., 10 copies; 7 written for update; 4 copies read.
– Each copy has version number.
– Not attractive usually because reads are common.
• Read-any Write-all: Writes are slower and reads are faster,
relative to Voting.
– Most common approach to synchronous replication.
• Choice of technique determines which locks to set.
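The voting numbers are quorum arithmetic: with n copies, writing w and reading r copies guarantees that every read overlaps the latest write when r + w > n, and that any two writes overlap when w > n/2. A quick check of the example's numbers (function name is my own):

```python
def quorums_ok(n, w, r):
    """Read and write quorums overlap, and any two write quorums overlap."""
    return r + w > n and 2 * w > n

print(quorums_ok(10, 7, 4))  # True: the example above (7 written, 4 read)
print(quorums_ok(10, 7, 3))  # False: a 3-copy read could miss a 7-copy write
print(quorums_ok(10, 5, 6))  # False: two 5-copy writes need not overlap
```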
Cost of Synchronous Replication
• Before an update Xact can commit, it must obtain locks on
all modified copies.
– Sends lock requests to remote sites, and while waiting
for the response, holds on to other locks!
– If sites or links fail, Xact cannot commit until they are
back up.
– Even if there is no failure, committing must follow an
expensive commit protocol with many msgs.
• So the alternative of asynchronous replication is becoming
widely used.
Asynchronous Replication
• Allows modifying Xact to commit before all copies have
been changed (and readers nonetheless look at just one
copy).
– Users must be aware of which copy they are reading,
and that copies may be out-of-sync for short periods of
time.
• Two approaches: Primary Site and Peer-to-Peer
replication.
– Difference lies in how many copies are “updatable” or
“master copies”.
Distributed Locking
• How do we manage locks for objects across many sites?
– Centralized: One site does all locking.
• Vulnerable to single site failure.
– Primary Copy: All locking for an object done at the
primary copy site for this object.
• Reading requires access to locking site as well as site
where the object is stored.
– Fully Distributed: Locking for a copy done at site where
the copy is stored.
• Locks at all sites while writing an object.
Distributed Deadlock Detection
• Each site maintains a local waits-for graph.
• A global deadlock might exist even if the
local graphs contain no cycles:
SITE A: T1 → T2
SITE B: T2 → T1
GLOBAL: T1 → T2 and T2 → T1 (a cycle)
Three solutions: Centralized (send all local graphs to one site);
Hierarchical (organize sites into a hierarchy and send local graphs
to parent in the hierarchy); Timeout (abort Xact if it waits too
long).
Global Atomicity
• Two new issues:
– New kinds of failure, e.g., links and remote
sites.
– If “sub-transactions” of an Xact execute at
different sites, all or none must commit. Need
a commit protocol to achieve this.
• A log is maintained at each site, as in a
centralized DBMS, and commit protocol
actions are additionally logged.
Two-Phase Commit (2PC)
• Site at which Xact originates is coordinator; other sites at which
it executes are subordinates.
• When an Xact wants to commit:
– Coordinator sends prepare msg to each subordinate.
– Subordinate force-writes an abort or prepare log record
and then sends a no or yes msg to coordinator.
– If coordinator gets unanimous yes votes, force-writes a
commit log record and sends commit msg to all subs. Else,
force-writes abort log rec, and sends abort msg.
– Subordinates force-write abort/commit log rec based on msg
they get, then send ack msg to coordinator.
– Coordinator writes end log rec after getting all acks.
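The coordinator's side of the two phases, with messaging and force-writes stubbed out (class and method names are illustrative, not a real DBMS interface):

```python
class Subordinate:
    def __init__(self, vote):
        self.vote = vote
    def prepare(self):
        # Would force-write a prepare (or abort) log record, then vote yes/no.
        return self.vote
    def finish(self, decision):
        # Would force-write the commit/abort record for the decision, then ack.
        return 'ack'

def two_phase_commit(subs, log):
    # Phase 1 (voting): send prepare to every subordinate and collect votes.
    votes = [s.prepare() for s in subs]
    decision = 'commit' if all(v == 'yes' for v in votes) else 'abort'
    log.append(decision)                      # coordinator force-writes its decision
    # Phase 2 (termination): send the decision and collect acks.
    if all(s.finish(decision) == 'ack' for s in subs):
        log.append('end')                     # forget the Xact once all acks arrive
    return decision

log = []
print(two_phase_commit([Subordinate('yes'), Subordinate('yes')], log))  # 'commit'
print(log)                                                              # ['commit', 'end']
print(two_phase_commit([Subordinate('yes'), Subordinate('no')], []))    # 'abort'
```

One no vote suffices to abort, matching the rule that any site can decide to abort an Xact.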
Comments on 2PC
• Two rounds of communication: first, voting; then,
termination. Both initiated by coordinator.
• Any site can decide to abort an Xact.
• Every msg reflects a decision by the sender; to ensure
that this decision survives failures, it is first recorded in
the local log.
• All commit protocol log recs for an Xact contain
Xactid and Coordinatorid. The coordinator’s
abort/commit record also includes ids of all
subordinates.
Restart After a Failure at a Site
• If we have a commit or abort log rec for Xact T, but not an end rec,
must redo/undo T.
– If this site is the coordinator for T, keep sending commit/abort
msgs to subs until acks received.
• If we have a prepare log rec for Xact T, but not commit/abort, this
site is a subordinate for T.
– Repeatedly contact the coordinator to find status of T, then write
commit/abort log rec; redo/undo T; and write end log rec.
• If we don’t have even a prepare log rec for T, unilaterally abort and
undo T.
– This site may be coordinator! If so, subs may send msgs.
Blocking
• If coordinator for Xact T fails, subordinates
who have voted yes cannot decide whether
to commit or abort T until coordinator
recovers.
– T is blocked.
– Even if all subordinates know each other (extra
overhead in prepare msg) they are blocked
unless one of them voted no.
Link and Remote Site Failures
• If a remote site does not respond during the
commit protocol for Xact T, either because
the site failed or the link failed:
– If the current site is the coordinator for T,
should abort T.
– If the current site is a subordinate, and has not
yet voted yes, it should abort T.
– If the current site is a subordinate and has voted
yes, it is blocked until the coordinator responds.
Observations on 2PC
• Ack msgs used to let coordinator know when it can
“forget” an Xact; until it receives all acks, it must keep T
in the Xact Table.
• If coordinator fails after sending prepare msgs but before
writing commit/abort log recs, when it comes back up it
aborts the Xact.
• If a subtransaction does no updates, its commit or abort
status is irrelevant.
2PC with Presumed Abort
• When coordinator aborts T, it undoes T and removes it from the
Xact Table immediately.
– Doesn’t wait for acks; “presumes abort” if Xact not in Xact
Table. Names of subs not recorded in abort log rec.
• Subordinates do not send acks on abort.
• If subxact does not do updates, it responds to prepare msg with
reader instead of yes/no.
• Coordinator subsequently ignores readers.
• If all subxacts are readers, 2nd phase not needed.