Ch. 10.
Transaction Manager Concepts
Dr. Hua
COP 6730
1
Transaction Manager Concepts
• The transaction manager (TM) furnishes the A, C,
and D of ACID.
– It provides the all-or-nothing property (atomicity) by
undoing aborted transactions, redoing committed ones,
and coordinating commitment with other TMs if a
transaction happens to be distributed.
– It provides consistency by aborting any transactions
that fail to pass the RM consistency tests at commit.
– It provides durability by forcing all log records of
committed transactions to durable memory as part of
commit processing, redoing any recently committed
work at restart.
• The TM together with the log manager and the
lock manager supplies the mechanism to build
RMs and computations with the ACID properties.
2
Normal Execution
new transaction
Begin_Work ( )
TRID
Normal
Work
Requests
Lock
Requests
Lock
Manager
4.
Functions
Lock
Records
Write
Log
Manager
Commit
log record
and
1. Want to Commit
Commit_Work ( )
Callback
2. Commit Phase 1?
Functions
3. YES to Phase 1
UNDO,
REDO,
COMMIT
5. Commit Phase 2
RMs
?
6. Acknowledge
3
TM
Transaction Abort
Application
Rollback_Work ( )
1. rollback transaction
2. Read transaction’s
log records
Log
Manager
Normal
Functions
Callback
Functions
RMs
5. write abort records
3. UNDO (log record)
4. Aborted (TRID)
Note: Rollback to a savepoint has
similar logic.
4
TM
DO-UNDO-REDO Protocol
• The DO-UNDO-REDO protocol is a
programming style for RMs implementing
transactional objects
DO program:
UNDO program:
Old
State
New State
Log Record
REDO program:
New State
DO
UNDO
DO
Old
State
REDO
New
State
Old State
Log Record
Log Record
• RM have following structure:
RM
Normal Function: DO program
Callback Functions: UNDO & REDO program
5
Restart
1. The TM regularly invokes checkpoints during
normal processing it informs each RM to
checkpoint its state to persistent memory.
2. At restart, the transaction mgr. scans the log table
forward from the most recent checkpoint to the end.
3. For each transaction that has not committed (e.g., T
? ) the TM calls the UNDO( ) callback of the RMs
to undo it to the most recent persistent savepoint.
Checkpoint
Crash
T1
T2
T3
6
Value Logging
Each log record contains the old and the new states of
the object.
UNDO Program: set the object to the old state.
REDO Program: set the object to the new state.
Example:
struct value_log_record_for_page_update
{
int
opcode;
/* opcode will say page update */
filename fname;
/* name of file that was updated */
long
pageno;
/* page that was updated */
char
old_value[PAGESIZE]; /* old value of page */
char
new_value[PAGESIZE]; /* new value of page */
};
7
Logical Logging
• Value logging is often called physical logging
because it records the physical addresses and
values of objects
• Logical (or operation) logging records the name of
an UNDO-REDO function and its parameter
It assume that each action is atomic and that in each
failure situation the system state will be action
consistent: each logical action will have been
completely done or completely undone.
8
Logical Logging (Cont’d)
step
1
step
2
step
3
step
1
logical action 1
step
2
step
3
logical action 2
a transaction
Problem: Partially complete actions can fail, and the
UNDO of these partial actions will not be presented
with an action-consistent state.
9
Physiological Logging
Motivation
• Physiological logging is a compromise between
logical and physical logging. It uses logical logging
where possible.
• There are the ideas that motivate physiological
logging:
Page actions: Complex actions can be structured as a
sequence of page actions.
Mini-transaction: Page actions can be structured as minitransactions that use logical logging.
When the action completes, the object is updated.
An UNDO-REDO log record is created to cover
that action.
These actions are atomic, consistent, and isolated.
10
Physiological Logging
Motivation (Cont’d)
Log-object consistency: It is possible to structure the
system so that at restart, the persistent state is pageaction consistent.
The log can then be used to transform this
action-consistent state into a transactionconsistent state at restart.
Note: Physiological log records are physical to a
page, and logical within a page.
11
Physiological Logging
An Example
File C
File B
Table T
(File A)
index 1
index 2
Key B
Key C
Consider the insert that has the following logical log
record:
<insert op, tablename = A, record value = r>
12
Physiological Logging
An Example (Cont’d)
This insert operation involves three page actions (we
assume that B-tree splits do not happen). The
corresponding physiological record bodies are:
<insert op, base filename = A, page number = 508,
record value = r>
<insert op, base filename = B, page number = 72,
index record value = s>
<insert op, index filename = C, page number = 94,
index record value = t>
Fundamental idea: Log records are generated on a
per-page basis. Log records are designed to make
logical transformation of pages.
13
Physiological Logging
During Online Operation
• We call normal operations without failures online
operations.
• To allow updates, all page changes must be
structured as mini-transactions of this form:
Mini_trans()
lock the object in exclusive mode
transform the object
generate an UNDO-REDO log record
unlock the object.
14
Physiological Logging
During Online Operation (Cont’d)
• The mini-transaction approach ensures online
consistency
Page-action consistency: volatile and
persistent memory are in a page-consistent
state, and each page reflects the most
recent updates to it.
Log consistency: The log contains a history of
all updates to pages.
15
One-Bit Resource Mgr
Requirements
This RM manages an array of bits stored in a single page.
Each bit is either free (TRUE) or busy (FALSE).
One-Bit RM
one
page
lsn
1
2
3
4
F
F
TF
F
5
FT
.
.
.
Client 1
locked
Client 2
unlocked
16
One-Bit Resource Mgr
Requirements (Cont’d)
Requirements:
1. Page Consistency
i. No clean free bit has been given to any transaction.
ii. Every clean busy bit has been given to exactly one
transaction.
iii. Dirty bits are locked in exclusive mode by the
transaction that modified them.
iv. The log sequence number (page lsn) reflects the most
recent log record for this page.
2. Log Consistency
i.
The log contains a log record for every completed
mini-transaction update to the page.
17
One-Bit Resource Mgr
give_bit( ) #1
give_bit (int i)
/* force a bit */
get XLOCK on the bit;
if the XLOCK is granted
then{
get the page semaphore;
free the bit;
generate log record saying bit is free;
write log record and update lsn;
/* page is now consistent */
free page semaphore;
}
else
abort caller’s transaction;
/* caller does not own the bit */
18
One-Bit Resource Mgr
give_bit( ) #1 (Cont’d)
Note: This code has all the elements of a minitransaction.
It is well formed and two-phased with respect
to the page semaphore.
It provides a page action-consistent
transformation of the page.
19
One-Bit Resource Mgr
give_bit( ) #2
get_bit (void)
/* allocate a free bit to and returns bit index */
get the page semaphore;
repeat_until end of bit array{
find the next free bit and XLOCK it;
if lock is granted
/* the bit is free */
then{ mark the bit busy;
generate log record describing update;
write log record and update lsn;
/* page is now consistent */
give up semaphore;
return the bit index to caller;}
if no free bits were found during the repeat loop
then {
abort transaction;
return “-1” to caller; }
20
The FIX Rule
While the semaphore is set, the page is said to be
fixed, and releasing the page is called unfixing it.
Fixed Rule:
1. Get the page semaphore in exclusive mode
prior to altering the page.
2. Get the semaphore in shared or exclusive
mode prior to reading the page.
3. Hold the semaphores until the page and log
are again consistent, and read or update is
complete.
21
The FIX Rule (Cont’d)
Note: This is just two-phase locking at the pagesemaphore level.
Isolation Theorem tells us that all read and
write actions on page will be isolated.
Page updates are actually mintransactions.
When the page is unfixed, the page should
be consistent and the log record should
allow UNDO or REDO of the page
transformation.
22
Multi-Page Actions
•
•
Some actions modify several pages at once.
Examples:
Inserting a multi-page record.
Splitting a B-tree node.
These actions are structured as follows:
1. Fix all the relevant pages
2. Do all the modifications and generate many
log records.
3. Unfix the page.
23
Dealing with Failures
• Page actions provide page consistency even if they fault.
– Copy the page at the beginning of the page action; then
– if anything goes wrong with the page action prior to writing the
log record, the page action just returns the page to its original
values by copying it back.
• Complex operations depend on transaction UNDO to roll
back.
– Each complex action should start by declaring a savepoint.
– If anything goes wrong during a page action, the operation first
makes that page consistent.
– The action can then call Roll_work () to return to the savepoint
Note: The save point wraps the complex action within a
subtransaction so that the complex action can be undone if it
fails.
24
Online Consistency & Restart
lsn
Consistency
time stamp
Volatile
Page Versions
...
Volatile
Log Records
...
Durable
Log Records
Persistent
Page Versions
VVlsn
VLlsn
...
...
DLlsn
PPlsn
• Online log consistency requires that volatile log
contain all log records up to and including vvlsn:
VVlsn VLlsn
25
Online Consistency & Restart
Consistency (Cont’d)
• Restart consistency ensures that if a transaction
has committed with commit_lsn, then that commit
record is in the durable log:
commit_lsn DLlsn
In addition, restart consistency guarantees that if version
X of the volatile copy overwrites the durable copy, then
the log records for version X are already present in the
durable log:
VVlsn DLlsn
Note: At restart, all volatile memory is reset and must be
reconstructed from persistent memory. We must
have:
PVlsn DLlsn
commit_lsn DLlsn
26
Write Ahead Log (WAL) Protocol
Protocol:
1. Each volatile page has a LSN field naming the log
record of the most recent update to the page.
2. Each update must maintain the page LSN field.
3. When a page is about to be copied to persistent
memory, the copier must first use the log manager to
copy all log records up to and including the page’s
LSN to durable memory (force them).
4. Once the force completes, the volatile version of the
page can overwrite the persistent version of the page.
5. The page must be fixed during the writes and during
the copies, to guarantee page action consistency.
Effect: The log record of a page must be moved to
durable memory prior to overwriting the page in
persistent memory.
27
Force-Log-at-Commit
Question: What if no pages were copied to persistent
memory, and the transaction committed?
If the system were to restart immediately, there would
be no record of the transaction’s updates, and the
transaction could not be undone.
Solution: Force-Log-at-Commit rule.
Rule: The transaction’s log records must be moved to
durable memory as part of commit
Implementation: When a transaction commits, the TM
writes a commit log record and requests the log
manager to flush the log.
As a consequence, all the log records prior to
the commit record are flushed as well.
28
Physiological Logging: Summary
The RM must observe the following three rules:
Fix rule: Cover all page reads and page writes
with the page semaphore.
Write-ahead log (WAL): Force the page’s log
records prior to overwriting its persistent copy.
Force-log-at-commit: Force the transaction’s log
records as part of commit.
Note: many systems use the physiological design.
29
UNDO: Compensation Log Records
Question: what should the page LSN become when
an action is undone?
If subsequent updates to the page by other transactions
have advanced the log sequence number, the LSN
should not be set back to its original value.
Strategy: the UNDO looks just like a new action that
generates a new log record called a
compensation log record.
This approach makes page LSNs monotonic, an
essential property for write-ahead log.
A transaction that produced n new log records
during forward processing will produce n new
log records when the transaction is aborted.
30
Idempotence and Testable
• Idempotent operation: If the UNDO or REDO
operation can be repeated an arbitrary number of
times and still result in the correct state, the
separation is idempotent.
Example: The operation “move the reactor rods to
position 35” is idempotent.
The operation “move the reactor rods down
2 cm” is not idempotent.
Note:
Repeated REDOs can arise from repeated
failures.
31
Idempotence and Testable (Cont’d)
• Testable state: If the old and new states can be
discriminated by the system, the state is testable.
Old
State
Unknown
State
Test
New
State
If an operation is not idempotent and the state is
not testable, the operation cannot be made atomic.
32
Idempotence of Physiological REDO
• Repeated REDOs can arise from repeated failures
during restart.
Example: Suppose the following physiological
log record were redone many times:
<insert op, base filename, page number,
record value >
If no special care were taken, this repeated REDO
would result in many inserts of the record into
the page.
33
Idempotence of Physiological REDO
(Cont’d)
• The following logic makes physiological REDOs
idempotent:
idempotent_physiologic_redo (page, logrec)
{
if (page_lsn < logrec_lsn)
redo (page, logrec);
}
Note: The first successful REDO will advance the
page LSN and cause all subsequent REDO of
this log record to be null operations.
34
The Need for the 2-Phase Commit
Protocol
Cancel key: The client may hit the cancel key at any
time during the transaction.
Server Logic: A server may require that a certain set
of steps be performed in order to make a complete
transaction.
Example: At commit, many forms-processing systems
check the completeness of the data.
Integrity check: SQL has the option to defer
referential integrity checks to transaction commit.
If any integrity checks are violated at commit, the
transaction changes cannot be committed, and
SQL wants to abort the transaction.
35
The Need for the 2-Phase Commit
Protocol (Cont’d)
Field calls: It is possible that field calls cannot
acquire the locks or that the predicates become
false at the end of the transaction. In such cases,
the RM waits to abort the transaction.
2-Phase Commit Protocol: When a transaction is
about to commit, each participant in the
transaction is given a chance to vote on whether
the transaction is a consistent state transformation.
If all the RMs vote yes, the transaction can
commit. If any vote no, the transaction is aborted.
36
2-Phase Commit
Commit
Phase I:
• Prepare: Invoke each RM asking for its vote.
• Decide: If all vote yes, durably write the transaction
commit log record.
Note: The commit record write is what makes a
transaction atomic and durable. If the
system fails prior to that instant, the
transaction will be undone at restart;
otherwise, phase 2 will be carried forward
by the restart logic.
37
2-Phase Commit
Commit (Cont’d)
Phase II:
• Commit: Invoke each RM telling it the commit
decision.
Note: The RM can now release locks, deliver real
messages, and perform other clean-up tasks.
• Complete: When all acknowledge the commit message,
write a commit completion record to the log, indicating
that phase 2 ended. When the completion message is
durable, deallocate the live transaction state.
Note: Phase 2 completion record, is used at restart
to indicate that the RM have all been
informed about the transaction
38
Performance Advantage of Logging
Commit copies no objects only log
records to durable memory.
logging converts random write
I/Os to sequential write I/Os.
39
2-Phase Commit
Abort
• If any RM votes no during the prepare step, or if it
does not respond at all, then the transaction cannot
commit.
• The simplest thing to do in this case is to roll back
the transaction by calling Abort_work ( ).
40
2-Phase Commit
Abort (Cont’d)
• The logic for Abort_work ( ) is as follows:
Undo: Read the transaction’s log backwards, issuing
UNDO of each record. The RM that wrote the record is
invoked to undo the operation.
Broadcast: At each savepoint, invoke each RM telling it
the transaction is at the savepoint.
Abort: Write the transaction abort record to the log
(UNDO of begin_work( )).
Complete: Write a complete record to the log indicating
that abort ended. Deallocate the live transaction state.
41
Transaction Trees
• How does a transaction manager first hear about a
distributed transaction?
There are two cases:
– Outgoing case: a local transaction sends a
request to another node.
– Incoming case: a new transaction request
arrives from a remote transaction manager.
42
Transaction Trees (Cont’d)
• The TM involved in a transaction form the
transaction tree.
Root TM (coordinator)
performs the original
Began_work( ).
TM
RM
a session
RM
A
participant
TM
TM
TM
RM
• This TM has one incoming session and
two outgoing sessions.
It is both a participant (on the
incoming session) and a coordinator
(on the outgoing session).
TM
TM
a local
RM
43
Distributed 2-Phase Commit
Commit Coordinator
The root commit coordinator executes the following
logic when a successful commit_work ( ) is
invoked on a distributed transaction.
Local prepare: Invoke each local RM to prepare
for commit.
Distributed prepare: Send a prepare request on
each of the transaction’s outgoing sessions.
Decide: If all RM vote yes and all outgoing
sessions respond yes, then durably write the
transaction commit log record containing a list
of participating RMs and TMs.
44
Distributed 2-Phase Commit
Commit Coordinator (Cont’d)
Commit: Invoke each participating RM, telling it
the commit decision. Send “commit” message
on each of the transaction’s outgoing sessions.
Complete: When all local RMs and all outgoing
sessions acknowledge commit, write a
completion record to the log indicating that
phase 2 completed.
When the completion record is durable,
deallocate the live transaction state.
45
Distributed 2-Phase Commit
Commit Participant
• When the prepare message arrives, the participant executes the
following logic:
Prepare ( )
Local Prepare: Invoke each local RM to prepare for commit
Distributed prepare: Send prepare requests on the outgoing
sessions.
Decide: If all RMs vote yes and all outgoing sessions
respond yes, then the local node is almost prepared.
Prepared: Durably write the transaction prepare log
record containing a list of participating RMs,
participating TMs, and the parent TM.
Respond: Send yes as response (vote) to the prepare
message on the incoming session.
46
Wait: Wait (forever) for a commit message coordinator.
© Copyright 2026 Paperzz