Two-phase commit protocol for nested transactions

Mutual Exclusion
 What is mutual exclusion?
– Ensure that no other process uses the shared data structure at
the same time.
 Single processor systems
– use semaphores and monitors
 Three different algorithms
– Centralized Algorithm
– Distributed Algorithm
– Token Ring Algorithm
Mutual Exclusion: Centralized Algorithm (1)
 One process is elected as coordinator
 Other processes send it a message asking for
permission
– the coordinator grants permission if the critical region is free
– otherwise it says no-permission (or doesn’t reply at all)
• and queues the request
 When the critical region becomes free
– the coordinator sends a grant message to the first process in the queue
Mutual Exclusion: A Centralized Algorithm (2)
a) Process 1 asks the coordinator for permission to enter a critical
region (request). Permission is granted.
b) Process 2 then asks permission to enter the same critical region. The
coordinator does not reply.
c) When process 1 exits the critical region, it tells the
coordinator (release), which then replies to process 2.
Mutual Exclusion: A Centralized Algorithm (3)
The coordinator lets only one process at a time enter the critical
region.
Requests are granted in the order received, so no process ever
waits forever (no starvation).
Three messages are used per access to the critical
region/shared resource:
Request
Grant
Release
Drawback: the coordinator is a single point of failure.
If a process blocks after making a request, it cannot
distinguish a dead coordinator from an unavailable resource.
The coordinator is a performance bottleneck in a large system.
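The request/grant/release exchange can be sketched in a few lines. This is a minimal single-machine illustration, not a networked implementation: the class name `Coordinator` and its methods are hypothetical, and message passing is modeled as plain method calls.

```python
from collections import deque


class Coordinator:
    """Hypothetical coordinator guarding one critical region."""

    def __init__(self):
        self.holder = None        # process currently in the critical region
        self.waiting = deque()    # queued requests, in arrival order

    def request(self, pid):
        """Handle a Request message; return 'GRANT' or None (no reply)."""
        if self.holder is None:
            self.holder = pid
            return "GRANT"
        self.waiting.append(pid)  # stay silent; the requester blocks
        return None

    def release(self, pid):
        """Handle a Release message; return the next process granted, if any."""
        assert pid == self.holder, "only the holder may release"
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


coord = Coordinator()
first = coord.request(1)     # process 1 is granted the region
second = coord.request(2)    # process 2 is queued and gets no reply
granted = coord.release(1)   # process 1 leaves; process 2 is granted
```

The FIFO queue is what gives the no-starvation property: requests are served strictly in arrival order.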
Mutual Exclusion: A Distributed Algorithm (1)
 Requires a total ordering of all events in the system
– timestamps are provided by Lamport's algorithm
 Algorithm: a process wanting to enter the Critical Section (CS)
– builds a message of the form <cs-name, its process id, current-time>
– sends it to all processes, including itself
– sending is assumed to be reliable; every message is acknowledged
Mutual Exclusion: A Distributed Algorithm (2)
 Every receiving process
– sends an OK if it is not interested in the CS
– queues the message if it is already in the CS
– compares the timestamps if it has itself sent out a request for the CS
• if the incoming message has the lower timestamp, it sends an OK
• otherwise it queues the message
 Once a process has received an OK from everyone
– it enters the CS
– when it is done, it sends an OK to every process in its queue
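The receive rule can be written as a small decision function. The sketch below is illustrative only: the class `Process` and its method names are hypothetical, messages are modeled as method calls, and ties between equal timestamps are broken by process id, which is an assumption the slides do not state.

```python
RELEASED, WANTED, HELD = "released", "wanted", "held"


class Process:
    """Hypothetical participant in the distributed algorithm."""

    def __init__(self, pid):
        self.pid = pid
        self.state = RELEASED
        self.my_request = None    # (timestamp, pid) of our own pending request
        self.deferred = []        # senders whose requests we queued

    def want(self, timestamp):
        self.state = WANTED
        self.my_request = (timestamp, self.pid)

    def on_request(self, timestamp, sender):
        """Return 'OK' to reply at once, or None to queue the request."""
        incoming = (timestamp, sender)
        if self.state == HELD or (
            self.state == WANTED and self.my_request < incoming
        ):
            self.deferred.append(sender)   # our own request wins: defer reply
            return None
        return "OK"

    def exit_cs(self):
        """Leave the CS and return the processes that now receive an OK."""
        self.state = RELEASED
        self.my_request = None
        released, self.deferred = self.deferred, []
        return released


p0, p1, p2 = Process(0), Process(1), Process(2)
p0.want(8)
p2.want(12)
reply_from_1 = p1.on_request(8, 0)    # process 1 is not interested
reply_from_2 = p2.on_request(8, 0)    # 8 < 12, so process 2 yields
reply_from_0 = p0.on_request(12, 2)   # process 0 queues the later request
p0.state = HELD                       # process 0 has every OK and enters
woken = p0.exit_cs()                  # process 2 now receives its OK
```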
Mutual Exclusion: A Distributed Algorithm (3)
a) Two processes (0 and 2) want to enter the same critical region at the
same moment (timestamps 8 and 12 in the figure).
b) Process 1 is not interested in the CS, so it sends OK to 0 and 2.
Processes 0 and 2 compare the timestamps; process 0 has the lowest
timestamp, so it wins.
c) When process 0 is done, it also sends an OK, so 2 can now enter
the critical region.
A Token Ring Algorithm(1)
 Create a logical ring (in software)
– each process knows who is next
 When a process has the token, it can enter the CS
 When finished, it releases the token and passes it to the next process
 The token circulates at high speed around the ring if no
process wants to enter the CS.
 No starvation
– at worst, a process waits for every other process to complete
 Detecting that a token has been lost is hard
 What if a process crashes?
– recovery depends on the processes being able to skip the crashed
process while passing the token around the ring
A Token Ring Algorithm (2)
a) An unordered group of processes on a network.
b) A logical ring constructed in software. Process k passes the token to
process (k + 1) mod 8, e.g. (6 + 1) mod 8 = 7.
– A process must have the token to enter.
– If it doesn’t want to enter, it passes the token along.
– If the token is lost (detection is hard), regenerate the token.
– If a host is down, recover the ring.
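One trip of the token around the ring can be simulated as follows. This is a sketch under simplifying assumptions: the function name `circulate` is hypothetical, no token is lost, no host crashes, and a process enters and leaves the CS instantly while it holds the token.

```python
def circulate(n, wants, start=0):
    """Pass the token once around a ring of n processes.

    wants is the set of process ids that want the critical section.
    Returns the order in which those processes enter it.
    """
    order = []
    holder = start
    for _ in range(n):
        if holder in wants:
            order.append(holder)      # enter and leave the CS with the token
        holder = (holder + 1) % n     # pass the token to the next process
    return order


entry_order = circulate(8, wants={6, 1, 3}, start=5)
```

Entry order follows ring position from the current token holder, not the time at which each process decided it wanted the CS.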
Comparison
A comparison of three mutual exclusion algorithms.
Algorithm     Messages per entry/exit   Delay before entry (in message times)   Problems
Centralized   3                         2                                       Coordinator crash
Distributed   2(n – 1)                  2(n – 1)                                Crash of any process
Token ring    1 to ∞                    0 to n – 1                              Lost token, process crash
The centralized algorithm is the most efficient.
The token ring is efficient when many processes want to use the
critical region.
The Transaction Model(1)
 A transaction is a unit of program execution that
accesses and possibly updates various data items.
 A transaction must see a consistent database.
 During transaction execution the database may be
inconsistent.
 When the transaction is committed, the database must
be consistent.
 Two main issues to deal with:
– Failures of various kinds, such as hardware failures
and system crashes
– Concurrent execution of multiple transactions
The Transaction Model (3)
Examples of primitives for transactions.
Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise
 The above may be system calls, library procedures, or statements
in a language (e.g. Structured Query Language, SQL)
The Transaction Model (4)
Reserving Flight from White Plains to Malindi
BEGIN_TRANSACTION
reserve WP -> JFK;
reserve JFK -> Nairobi;
reserve Nairobi -> Malindi;
END_TRANSACTION
(a)
BEGIN_TRANSACTION
reserve WP -> JFK;
reserve JFK -> Nairobi;
reserve Nairobi -> Malindi full =>
ABORT_TRANSACTION
(b)
a) Transaction to reserve three flights commits
b) Transaction aborts when the third flight is unavailable
Characteristics of Transactions (5)
 Atomic
– The transaction happens completely or not at all
 Consistent
– The transaction does not violate system invariants; it takes the
system from one consistent state to another
– Ex: no money is lost after the operations
 Isolated
– Operations can happen in parallel, but the result is as if they were done serially
 Durable
– The results become permanent once the transaction commits
 Together these are the ACID properties of a flat transaction
Example: Funds Transfer
 Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
 Consistency requirement – the sum of A and B is
unchanged by the execution of the transaction.
 Atomicity requirement — if the transaction fails after
step 3 and before step 6, the system ensures that its
updates are not reflected in the database.
Example: Funds Transfer continued
 Durability requirement — once the user has been notified that the
transaction has completed (i.e., the transfer of the $50 has taken
place), the updates to the DB must persist despite failures.
 Isolation requirement — if between steps 3 and 6, another transaction
is allowed to access the partially updated database, it will see an
inconsistent database (the sum A + B will be less than it should be).
Can be ensured by running transactions serially.
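The six steps and the atomicity requirement can be illustrated on an in-memory dictionary. This is only a sketch: the function `transfer`, the exception `InsufficientFunds`, and the no-overdraft rule that triggers the failure are assumptions added for illustration, and durability and isolation are not modeled.

```python
class InsufficientFunds(Exception):
    """Hypothetical failure used to force an abort after write(A)."""


def transfer(accounts, src, dst, amount):
    """Move amount from src to dst, restoring old values on any failure."""
    snapshot = dict(accounts)             # old values, kept for rollback
    try:
        a = accounts[src]                 # 1. read(A)
        accounts[src] = a - amount        # 2-3. A := A - amount; write(A)
        if accounts[src] < 0:
            raise InsufficientFunds(src)  # failure between steps 3 and 6
        b = accounts[dst]                 # 4. read(B)
        accounts[dst] = b + amount        # 5-6. B := B + amount; write(B)
    except Exception:
        accounts.clear()
        accounts.update(snapshot)         # abort: updates are not reflected
        raise


accounts = {"A": 100, "B": 20}
transfer(accounts, "A", "B", 50)
total_after_commit = accounts["A"] + accounts["B"]

try:
    transfer(accounts, "A", "B", 500)     # fails after write(A)
except InsufficientFunds:
    pass
total_after_abort = accounts["A"] + accounts["B"]
```

In both cases the sum A + B is unchanged, which is the consistency requirement stated above.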
Flat Transaction
 Simplest type of transaction; all operations are
grouped into a single transaction.
 Limitation
– Does not allow partial results to be committed or aborted
– What if we want to keep the first part of the flight reservation? If we
abort and then restart, those seats might be gone.
 Solved by using nested transactions
Atomic Transactions
Transaction: an operation composed of a
number of discrete steps.
All the steps must be completed for the
transaction to be committed. The results are
made permanent.
Otherwise, the transaction is aborted and the
state of the system reverts to what it was before
the transaction started.
Example
Buying a house:
– Make an offer
– Sign contract
– Deposit money in escrow
– Inspect the house
– Critical problems from inspection?
– Get a mortgage
– Have seller make repairs
– Commit: sign closing papers & transfer deed
– Abort: return escrow and revert to pre-purchase state
All or nothing property
Basic Operations
Transaction primitives:
– Begin transaction: mark the start of a transaction
– End transaction: mark the end of a transaction; try to
commit
– Abort transaction: kill the transaction, restore old
values
– Read/write data from files (or object stores): data will
have to be restored if the transaction is aborted.
Programming in a Transaction System
 Begin_transaction
• Mark the start of a transaction
 End_transaction
• Mark the end of a transaction and try to “commit”
 Abort_transaction
• Terminate the transaction and restore old values
 Read
• Read data from a file, table, etc., on behalf of the transaction
 Write
• Write data to file, table, etc., on behalf of the transaction
Tools for Implementing Atomic Transactions (continued)
Begin_transaction
• Place a begin entry in log
Write
• Write updated data to log
Abort_transaction
• Place abort entry in log
End_transaction (i.e., commit)
• Place commit entry in log
• Copy logged data to files
• Place done entry in log
Programming in a Transaction System (continued)
As a matter of practice, separate transactions
are handled in separate threads or processes
Isolated property means that two concurrent
transactions are serialized
• I.e., they run in some indeterminate order with respect
to each other
Programming in a Transaction System (continued)
Nested Transactions
• One or more transactions inside another transaction
• May individually commit, but may need to be undone
Example
• Planning a trip involving three flights
• Reservation for each flight “commits” individually
• Must be undone if entire trip cannot commit
Another Example
Book a flight from Penang to Waikato.
No non-stop flights are available:
Transaction begin
1. Reserve a seat for Penang to KLIA (PNG→KLIA)
2. Reserve a seat for KLIA to Bangkok (KLIA→BGK)
3. Reserve a seat for Bangkok to Waikato (BGK→WK)
Transaction end
– If there are no seats available on the BGK→WK leg
of the journey, the transaction is aborted and
reservations for (1) and (2) are undone.
Tools for Implementing Atomic Transactions (single
system)
Stable storage
• i.e., write to disk “atomically”
Log file
• i.e., record actions in a log before “committing” them
• Log in stable storage
Locking protocols
• Serialize Read and Write operations of same data by
separate transactions
…
Tools for Implementing Atomic Transactions (continued)
Crash recovery – search log
– If begin entry, look for matching entries
– If done, do nothing (all files have been updated)
– If abort, undo any permanent changes that
transaction may have made
– If commit but not done, copy updated blocks from
log to files, then add done entry
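The recovery rules can be sketched against an in-memory log. The record layout, the function name `recover`, and the use of a dict in place of files are all assumptions made for illustration. In this sketch writes reach the files only after commit, so an aborted or unfinished transaction leaves nothing to undo.

```python
def recover(log, files):
    """Replay a log after a crash.

    log is a list of tuples such as ("begin", tid), ("write", tid, key, value),
    ("commit", tid), ("abort", tid), ("done", tid); files is a dict standing
    in for the permanent files.
    """
    status = {}
    for record in log:
        if record[0] != "write":
            status.setdefault(record[1], set()).add(record[0])
    for tid, seen in status.items():
        if "commit" in seen and "done" not in seen:
            for record in list(log):
                if record[0] == "write" and record[1] == tid:
                    _, _, key, value = record
                    files[key] = value    # copy logged data to the files
            log.append(("done", tid))     # then add the done entry
        # "done": all files were already updated, nothing to do.
        # "abort" or no commit: the logged writes are simply ignored.
    return files


log = [
    ("begin", 1), ("write", 1, "x", 5), ("commit", 1),   # crashed before done
    ("begin", 2), ("write", 2, "y", 9),                  # never committed
]
files = recover(log, {"x": 0, "y": 0})
```

Running `recover` a second time changes nothing, because transaction 1 now has a done entry.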
Distributed Atomic Transactions
Atomic transactions that span multiple sites
and/or systems
Same semantics as atomic transactions on single
system
• ACID
Failure modes
• Crash or other failure of one site or system
• Network failure or partition
• Byzantine failures
Properties of transactions: ACID
Atomic
– The transaction happens as a single indivisible action. Others do not see
intermediate results. All or nothing.
Consistent
– If the system has invariants, they must hold after the transaction. E.g., total
amount of money in all accounts must be the same before and after a “transfer
funds” transaction.
Isolated (Serializable)
– If transactions run at the same time, the final result must be the same as if they
executed in some serial order.
Durable
– Once a transaction commits, the results are made permanent. No failures after a
commit will cause the results to revert.
Nested Transactions
A top-level transaction may create subtransactions
Problem:
– subtransactions may commit (results are durable) but
the parent transaction may abort.
One solution: private workspace
– Each subtransaction is given a private copy of every
object it manipulates. On commit, the private copy
displaces the parent’s copy (which may also be a
private copy of the parent’s parent)
Nested Transaction
 Constructed from a number of subtransactions
 The top-level transaction may fork children that run in parallel
on different machines
 The children may themselves fork further subtransactions
 When a subtransaction commits, its results become visible to
its parent
Nested transactions
[Figure 12.13: T is the top-level transaction. It opens subtransactions
T1 and T2 with openSubTransaction; T1 opens T11 and T12, T2 opens T21,
and T21 opens T211. Each transaction in the figure is marked commit,
prov. commit, or abort.]
 transactions may be composed of other transactions
– several transactions may be started from within a
transaction
– we have a top-level transaction and subtransactions which
may have their own subtransactions
Nested transactions (12.3)
 To a parent, a subtransaction is atomic with respect to
failures and concurrent access
 transactions at the same level (e.g. T1 and T2) can run
concurrently but access to common objects is serialised
 a subtransaction can fail independently of its parent and
other subtransactions
– when it aborts, its parent decides what to do, e.g. start another
subtransaction or give up
Example Nested Transaction
 A nested transaction gives you a hierarchy
 It can be distributed (example: WP -> JFK, JFK -> Nairobi,
Nairobi -> Malindi)
 Each subtransaction can be managed independently
 But this may require multiple databases
Transaction: booking a ticket
– WP -> JFK: commit
– JFK -> Nairobi: commit
– Nairobi -> Malindi: abort
Distributed transaction
1. A distributed transaction is composed of several subtransactions each running on a different site.
2. Separate algorithms are needed to handle the locking of
data and committing the entire transaction.
Differences between nested transaction and distributed
transaction
Transaction:Implementation
 Two methods are used
– Private Workspace
– Writeahead Log
– Consideration on a file system
Private Workspace
Conceptually, when a process starts a transaction, it is
given a private workspace (copies) containing all the
files and data objects to which it has access.
 When it commits, the private workspace replaces the
corresponding data items in the permanent workspace. If the
transaction aborts, the private workspace can simply be
discarded.
 This type of implementation leads to many private workspaces
and thus consumes a lot of space.
Optimization (copying everything is very expensive):
 No private copy is needed when a process only reads a file.
 For writing a file, only the file’s index is copied.
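The index-only optimization can be sketched with a shared block list and a private copy of the index. The class `Workspace` and its methods are hypothetical; blocks are modeled as list entries, and block numbers as list positions.

```python
class Workspace:
    """Hypothetical private workspace for one file."""

    def __init__(self, index, blocks):
        self.parent_index = index
        self.index = list(index)     # private copy of the index only
        self.blocks = blocks         # block store shared with everyone

    def read(self, i):
        return self.blocks[self.index[i]]   # reads need no private copy

    def write(self, i, data):
        new_id = len(self.blocks)    # allocate a fresh (shadow) block
        self.blocks.append(data)
        if i < len(self.index):
            self.index[i] = new_id   # modified block: repoint private index
        else:
            self.index.append(new_id)    # appended block

    def commit(self):
        self.parent_index[:] = self.index   # private index replaces original


index, blocks = [0, 1, 2], ["a", "b", "c"]
ws = Workspace(index, blocks)
ws.write(0, "A")      # modify block 0
ws.write(3, "d")      # append block 3
before = [blocks[b] for b in index]   # others still see the original file
ws.commit()
after = [blocks[b] for b in index]
```

An abort would simply drop `ws`; a real system would also free the unused shadow blocks.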
Private Workspace
a) Original file index and disk blocks for a three-block file
b) The situation after a transaction has modified block 0
and appended block 3
• Copy the file index only. Copy blocks only when they are written.
• Block 0 is modified and block 3 is appended.
c) After committing
More Efficient Implementation/Write ahead log
 Files are actually modified, but before changes are made,
a record <Ti,Oid,OldValue,NewValue> is written to the
writeahead log on the stable storage. Only after the log
has been written successfully is the change made to the
file.
 If the transaction succeeds and is committed, a record is
written to the log, but the data objects do not have to be
changed, as they have already been updated.
 If the transaction aborts, the log can be used to back up to
the original state (rollback).
 The log can also be used for recovering from crash.
Writeahead Log
Don’t make copies. Instead, record the action plus the old and new
values.
x = 0;
y = 0;
BEGIN_TRANSACTION;
x = x + 1;
y = y + 2;
x = y * y;
END_TRANSACTION;
(a)
Each log entry has the form [variable = old value / new value].
Log (b): [x = 0 / 1]
Log (c): [x = 0 / 1] [y = 0 / 2]
Log (d): [x = 0 / 1] [y = 0 / 2] [x = 1 / 4]
a) A transaction
b) – d) The log before each statement is executed
• If the transaction commits, there is nothing to do
• If the transaction is aborted, use the log to roll back
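The example transaction can be replayed in a few lines to show how the log grows and how it drives a rollback. The function `run` and the tuple layout `(name, old, new)` are assumptions for illustration; the log lives in memory here rather than on stable storage.

```python
def run(statements, state, abort=False):
    """Apply statements in place, logging (name, old, new) before each change."""
    log = []
    for name, compute in statements:
        old, new = state[name], compute(state)
        log.append((name, old, new))    # the log record is written first
        state[name] = new               # only then is the value changed
    if abort:
        for name, old, _ in reversed(log):
            state[name] = old           # rollback: undo in reverse order
    return log


statements = [
    ("x", lambda s: s["x"] + 1),        # x = x + 1
    ("y", lambda s: s["y"] + 2),        # y = y + 2
    ("x", lambda s: s["y"] * s["y"]),   # x = y * y
]
committed = {"x": 0, "y": 0}
log = run(statements, committed)
aborted = {"x": 0, "y": 0}
run(statements, aborted, abort=True)
```

The three log records match entries (b) through (d), and the aborted run ends with x and y back at 0.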
Concurrency Control (1)
The goal of concurrency control is to allow several
transactions to execute simultaneously while the
collection of data items remains in a consistent state.
Consistency is achieved by giving transactions access to the
items in a specific order.
 General organization of managers for handling transactions.
Concurrency Control (2)
 General organization of
managers for handling
distributed transactions.
Serializability
BEGIN_TRANSACTION
x = 0;
x = x + 1;
END_TRANSACTION
(a)
BEGIN_TRANSACTION
x = 0;
x = x + 2;
END_TRANSACTION
(b)
BEGIN_TRANSACTION
x = 0;
x = x + 3;
END_TRANSACTION
(c)
Schedule 1   x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 2   x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 3   x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;   Illegal
(d)
 a) – c) Three transactions T1, T2, and T3
 d) Possible schedules
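For this small example, a schedule can be checked by comparing its final value of x with the values that the serial orders produce. This is a simplified check that happens to suffice here, not a general serializability test; the operation encoding and the helper `final_x` are assumptions for illustration.

```python
from itertools import permutations

# Each transaction sets x to 0 and then adds its own constant.
T1 = [("set", 0), ("add", 1)]
T2 = [("set", 0), ("add", 2)]
T3 = [("set", 0), ("add", 3)]


def final_x(ops):
    """Execute the operations in order and return the final value of x."""
    x = None
    for kind, value in ops:
        x = value if kind == "set" else x + value
    return x


# Final values reachable by running the three transactions one at a time.
serial_results = {final_x(a + b + c) for a, b, c in permutations([T1, T2, T3])}

schedule_1 = [("set", 0), ("add", 1), ("set", 0), ("add", 2), ("set", 0), ("add", 3)]
schedule_2 = [("set", 0), ("set", 0), ("add", 1), ("add", 2), ("set", 0), ("add", 3)]
schedule_3 = [("set", 0), ("set", 0), ("add", 1), ("set", 0), ("add", 2), ("add", 3)]

legal = [final_x(s) in serial_results for s in (schedule_1, schedule_2, schedule_3)]
```

Schedule 3 ends with x = 5, a value no serial order can produce, which is why it is illegal.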
One-phase atomic commit protocol
The protocol
– The client requests to end a transaction
– The coordinator communicates the commit or
abort request to all of the participants and
keeps repeating the request until all of them
have acknowledged that they have carried it out
The problem
– Some servers may commit while others abort
• How do we deal with the situation where some servers
decide to abort?
Introduction to two-phase commit protocol
Allows any participant to abort
First phase
– Each participant votes to commit or abort
Second phase
– All participants reach the same decision
• If any one participant votes to abort, then all abort
• If all participants vote to commit, then all commit
– The challenge
• working correctly when failures happen
Failure model
– Servers may crash; messages may be lost
The two-phase commit protocol
When the client requests to abort
– The coordinator informs all participants to abort
When the client requests to commit
– First phase
• The coordinator asks all participants whether they are prepared to
commit
• If a participant is prepared to commit, it saves in
permanent storage all of the objects that it has altered
in the transaction and replies yes. Otherwise, it replies no
– Second phase
• The coordinator tells all participants to commit (or
abort)
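The two phases can be sketched as a single coordinator function. This is a failure-free illustration only: the class `Participant`, the function `two_phase_commit`, and the state names are hypothetical, and crashes, lost messages, and timeouts are not modeled.

```python
class Participant:
    """Hypothetical participant that votes in phase 1 and obeys in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self._can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase 1: a yes vote means the updates are saved durably."""
        self.state = "prepared" if self._can_commit else "aborted"
        return self._can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2: tell everyone the common decision.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return "commit" if decision else "abort"


good = [Participant("a"), Participant("b")]
mixed = [Participant("a"), Participant("b", can_commit=False)]
good_outcome = two_phase_commit(good)
mixed_outcome = two_phase_commit(mixed)
```

A single no vote is enough to abort every participant, including those that voted yes.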
The two-phase commit protocol … continued
Operations for two-phase commit protocol
The two-phase commit protocol
– Updates that are prepared to commit are recorded in
permanent storage
• If the server crashes, the information can be
retrieved by a new process
• If the coordinator decides to commit, all
participants will eventually commit
Timeout actions in the two-phase commit protocol
Communication in two-phase commit protocol
New processes to mask crash failures
– A crashed coordinator or participant process is
replaced by a new process
Timeouts for the participant
– Timeout while waiting for canCommit: abort
– Timeout while waiting for doCommit
• The status is uncertain: keep the updates in permanent storage
• Send a getDecision request to the coordinator
Timeouts for the coordinator
– Timeout while waiting for the votes: abort
– Timeout while waiting for haveCommitted: do nothing
• The protocol works correctly without the confirmation
Two-phase commit protocol for nested transactions
Nested transaction semantics
– Subtransaction
• Commits provisionally
• or aborts
– Parent transaction
• Abort: all of its subtransactions abort
• Commit: the aborted subtransactions are excluded
Distributed nested transaction
– When a subtransaction completes
• its provisionally committed updates are not yet saved in
permanent storage
Distributed nested transactions commit protocol
Each subtransaction
– If it commits provisionally
• It reports its status and that of its descendants to its
parent
– If it aborts
• It reports the abort to its parent
Top-level transaction
– Receives a list of the statuses of all subtransactions
– Starts the two-phase commit protocol on all
subtransactions that have committed
provisionally
Example of a distributed nested transactions
The execution process
The information held by each coordinator
– Top-level coordinator
• The participant list: the coordinators of all the
subtransactions in the tree that have provisionally
committed and have no aborted ancestor
– Two-phase commit protocol
• Conducted among the participants of T, T1 and T12
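The rule for building the participant list can be sketched as a walk up the transaction tree. The tree and the statuses below are hypothetical, chosen so that the result matches the T, T1 and T12 outcome stated above; the function name `participants` and the status strings are also assumptions.

```python
def participants(status, parent):
    """Subtransactions that committed provisionally with no aborted ancestor."""

    def has_aborted_ancestor(t):
        p = parent.get(t)
        while p is not None:
            if status.get(p) == "abort":
                return True
            p = parent.get(p)
        return False

    return sorted(
        t for t, s in status.items()
        if s == "prov_commit" and not has_aborted_ancestor(t)
    )


# Hypothetical tree: T is the top-level transaction (it has no parent entry).
parent = {"T1": "T", "T2": "T", "T11": "T1", "T12": "T1", "T21": "T2", "T22": "T2"}
status = {
    "T1": "prov_commit", "T2": "abort", "T11": "abort",
    "T12": "prov_commit", "T21": "prov_commit", "T22": "prov_commit",
}
commit_list = participants(status, parent)
```

T21 and T22 committed provisionally but are excluded because their parent T2 aborted; the top-level transaction T then runs two-phase commit with the remaining coordinators.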
Different two-phase commit protocol
Hierarchic two-phase commit protocol
– Messages are transferred according to the
hierarchic relationship between successful
participants
– The interface
Flat two-phase commit protocol
– Messages are transferred from top-level
coordinator to all successful participants
directly
– The interface
Locking
 Locking is the oldest, and still most widely used,
form of concurrency control
 When a process needs access to a data item, it
tries to acquire a lock on it - when it no longer
needs the item, it releases the lock
 The scheduler’s job is to grant and release locks
in a way that guarantees valid schedules
 In two-phase locking (2PL), the scheduler grants all the
locks during a growing phase and
releases them during a shrinking phase
 In describing the set of rules that govern
the scheduler, we will refer to an operation on
data item x by transaction T as oper(T,x)
Two-Phase Locking Rules (Part 1)
 When the scheduler receives an operation oper(T,x), it
tests whether that operation conflicts with any operation
on x for which it has already granted a lock
– If it conflicts, the operation is delayed
– If not, the scheduler grants a lock for x and passes the
operation to the data manager
 The scheduler will never release a lock for x until the
data manager acknowledges that it has performed the
operation for which the lock was set
Two-Phase Locking Rules (Part 2)
 Once the scheduler has released any lock on behalf of
transaction T, it will never grant another lock on behalf of
T, regardless of the data item T is requesting the lock for
 An attempt by T to acquire another lock after having
released any lock is considered a programming error,
and causes T to abort
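The rules from both parts can be sketched as a small scheduler. This is a simplified illustration: the class `Scheduler` and the exception `TwoPhaseError` are hypothetical, every lock is exclusive (read and write locks are not distinguished), and the data manager acknowledgement is not modeled.

```python
class TwoPhaseError(Exception):
    """Raised when a transaction asks for a lock after releasing one."""


class Scheduler:
    def __init__(self):
        self.owner = {}          # data item -> transaction holding its lock
        self.shrinking = set()   # transactions that have released a lock

    def acquire(self, t, x):
        """Return True if the lock is granted, False if the operation waits."""
        if t in self.shrinking:
            raise TwoPhaseError(f"{t} requested a lock after releasing one")
        if self.owner.get(x, t) != t:
            return False         # conflict: the operation is delayed
        self.owner[x] = t
        return True

    def release(self, t, x):
        if self.owner.get(x) == t:
            del self.owner[x]
            self.shrinking.add(t)    # T has entered its shrinking phase


s = Scheduler()
granted = s.acquire("T1", "x")   # growing phase of T1
delayed = s.acquire("T2", "x")   # conflicts with T1's lock
s.release("T1", "x")             # T1 starts shrinking
retry = s.acquire("T2", "x")     # now T2 can proceed
try:
    s.acquire("T1", "y")         # programming error under 2PL
    violation = False
except TwoPhaseError:
    violation = True
```

A caller that catches `TwoPhaseError` would abort the offending transaction, as the rule above requires.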
Two-Phase Locking (1)
Two-phase locking.