
Concurrency Control in Distributed Databases
By: Rishikesh Mandvikar
rmandvik[at]engr.smu.edu
May 1, 2004
Topics

Serializability Theory
 Centralized Databases
 Distributed Databases

Lock Based Concurrency Control Algorithms
 Centralized (2PL, S2PL)
 Distributed (C2PL, PC2PL, D2PL)

Optimistic Concurrency Control
Serializability Theory [13]
Serializability Theory Extended to Distributed Databases [14]

Fragmentation
 Horizontal
 Vertical
 Hybrid

Replication
 Synchronous Replication
 ROWA Protocol
 Voting
 Asynchronous Replication
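The slide only names the synchronous replication protocols. As a rough illustration, the sketch below uses the usual textbook readings: ROWA (read one, write all) and quorum voting, where read and write quorums must overlap. The function names and the site labels are hypothetical and chosen for this example only.

```python
def rowa_sites(replicas, operation):
    """ROWA: a read may use any single replica, a write must reach all replicas."""
    if operation == "read":
        return replicas[:1]          # any one replica; the first is picked here
    return list(replicas)            # a write goes to every replica


def quorum_ok(n, read_quorum, write_quorum):
    """Voting: quorums must overlap so every read sees the latest write,
    and two writes can never both succeed on disjoint replica sets."""
    return read_quorum + write_quorum > n and 2 * write_quorum > n
```

Under these assumptions ROWA is the special case of voting with a read quorum of 1 and a write quorum of n.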
Classification of CC Algorithms [14]
 Pessimistic
 Locking: Centralized, Primary copy, Distributed
 Timestamp ordering: Centralized, Multiversion, Conservative
 Hybrid
 Optimistic
 Locking
 Timestamp ordering
Locking based CC Algorithms

Centralized
 2PL (Relaxed S2PL)
 S2PL

Distributed
 C2PL
 PC2PL
 D2PL
Two-Phase Locking (2PL) [13]
Rules:
Growing phase:
 “A txn that has to read/write a data object first has to request a read/write lock on it.”
Shrinking phase:
 “A txn can’t request additional locks once it releases a lock.”
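The two rules above can be sketched as a small per-transaction lock tracker. This is a minimal sketch, not a full lock manager: it assumes a single transaction and does not model conflicts between transactions. The names `TwoPhaseTxn` and `TwoPhaseError` are hypothetical.

```python
class TwoPhaseError(Exception):
    """Raised when a transaction breaks one of the two 2PL rules."""


class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.held = {}            # item -> "read" or "write"
        self.shrinking = False    # becomes True after the first release

    def lock(self, item, mode):
        # Shrinking-phase rule: no new locks once any lock was released.
        if self.shrinking:
            raise TwoPhaseError(f"{self.name} cannot lock {item} after releasing a lock")
        # Allow a read -> write upgrade, never a silent downgrade.
        if self.held.get(item) != "write":
            self.held[item] = mode

    def unlock(self, item):
        del self.held[item]
        self.shrinking = True

    def read(self, db, item):
        # Growing-phase rule: a lock must be held before the access.
        if item not in self.held:
            raise TwoPhaseError(f"{self.name} holds no lock on {item}")
        return db[item]

    def write(self, db, item, value):
        if self.held.get(item) != "write":
            raise TwoPhaseError(f"{self.name} holds no write lock on {item}")
        db[item] = value
```

The point where `shrinking` flips to `True` corresponds to the lock-point: the transaction holds its maximum number of locks just before the first release.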
Lock Graph for 2PL
[Figure: number of locks held, plotted over the life of the transaction. Locks are acquired up to the lock-point and released after it.]
Strict Two-Phase Locking (S2PL) [13]
Rules:
Growing phase:
 “A txn that has to read/write a data object first has to request a read/write lock on it.”
Non-shrinking phase:
 “Txn releases all locks only when it completes.”
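A minimal sketch of the strict variant, assuming a shared in-memory lock table with shared (read) and exclusive (write) locks. The only way to release a lock is to complete the transaction. `LockTable` and `StrictTxn` are hypothetical names, and a refused request returns `False` instead of blocking, which a real lock manager would not do.

```python
class LockTable:
    def __init__(self):
        self.readers = {}   # item -> set of txn names holding a read lock
        self.writer = {}    # item -> txn name holding the write lock

    def acquire(self, txn, item, mode):
        writer = self.writer.get(item)
        readers = self.readers.get(item, set())
        if writer not in (None, txn):
            return False                      # another txn holds the write lock
        if mode == "write":
            if readers - {txn}:
                return False                  # other txns are still reading
            self.writer[item] = txn
        else:
            self.readers.setdefault(item, set()).add(txn)
        return True

    def release_all(self, txn):
        for readers in self.readers.values():
            readers.discard(txn)
        for item in [i for i, w in self.writer.items() if w == txn]:
            del self.writer[item]


class StrictTxn:
    def __init__(self, name, table):
        self.name = name
        self.table = table

    def lock(self, item, mode):
        return self.table.acquire(self.name, item, mode)

    def commit(self):
        # All locks are released together, and only at completion.
        self.table.release_all(self.name)

    abort = commit   # an abort releases locks the same way in this sketch
```

Because no other transaction can read an item written by an uncommitted transaction, an abort never forces other transactions to abort.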
Lock Graph for S2PL
[Figure: number of locks held, plotted over the life of the transaction. Locks are acquired up to the lock-point, the locked data items are then used, and all locks are held till transaction completion.]
2PL, S2PL [13]

Differences
 2PL
 Cascading aborts
 Conflict serializable schedules (not all)
 High concurrency
 S2PL
 No cascading aborts
 Serializable schedules
 Low concurrency
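Centralized 2PL
[Figure: C2PL architecture with two sites, Site#1 and Site#2. Each site has a user application, a transaction manager (TM), a data processor (DP) and local data. The diagram shows a single lock manager (LM) for both sites, together with the replica control protocol and C2PL.]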
Centralized 2PL [14]
 Cons
 Failure of the primary site
 Bottleneck situation
 Communication links
Primary Copy 2PL [14]
 Lock on the primary copy is necessary
 Lock management at the primary-copy sites only
 Pros
 Reduces load at the central site
 Cons
 Deadlock handling is partially centralized
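Distributed 2PL [14]
[Figure: D2PL architecture with two sites, Site#1 and Site#2. Each site has a user application, a transaction manager (TM), its own lock manager (LM), a data processor (DP) and local data. The sites are connected through the replica control protocol and D2PL.]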
Distributed 2PL [14]
 Pros
 Independent lock management at each site
 Cons
 Complex deadlock handling required
 Communication cost
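One way to compare the three distributed schemes is to ask which sites' lock managers must grant a lock. The sketch below is an illustration under stated assumptions, not the slides' own algorithm: the D2PL branch assumes a ROWA-style replica control protocol (read lock at one replica, write locks at all), and `lock_sites`, `Scheme` and the site names are hypothetical.

```python
from enum import Enum


class Scheme(Enum):
    C2PL = "centralized"
    PC2PL = "primary copy"
    D2PL = "distributed"


def lock_sites(scheme, item, mode, replicas, primary, central_site="site1"):
    """Return the set of sites whose lock manager must grant the lock.

    replicas: item -> set of sites holding a copy
    primary:  item -> site holding the primary copy
    """
    if scheme is Scheme.C2PL:
        return {central_site}            # one lock manager for everything
    if scheme is Scheme.PC2PL:
        return {primary[item]}           # lock manager at the primary-copy site
    # D2PL, assuming ROWA: read locks one replica, write locks all replicas.
    sites = replicas[item]
    if mode == "read":
        return {min(sites)}              # any one replica; a fixed pick here
    return set(sites)
```

The output makes the trade-offs visible: C2PL always returns the same site (the bottleneck), PC2PL spreads the load by item, and D2PL writes contact every replica (the communication cost).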
Optimistic Concurrency Control [13][14]
 Txns are assumed to have no conflicts
 Each txn works in a private workspace area
 Txns are validated before the write phase
Optimistic Concurrency Control [13][14]

Txn phases:
 Read and Compute
 Read from the database and write into the private workspace
 Validate
 Timestamps are assigned here
 Check for conflicts with concurrent txns
 Write
 Copy into the database if validation is successful
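The three phases can be sketched with an in-memory database and a per-transaction private workspace. This is a minimal single-threaded sketch with a deliberately simple validation rule (fail if any txn that committed after this one started wrote something this one read). `Database` and `OptimisticTxn` are hypothetical names.

```python
class Database:
    def __init__(self, data):
        self.data = dict(data)
        self.commit_counter = 0
        self.commit_log = []          # list of (commit number, write set)


class OptimisticTxn:
    def __init__(self, db):
        self.db = db
        self.start = db.commit_counter   # commits seen when the txn began
        self.workspace = {}              # private copies of written items
        self.read_set = set()

    # Read and compute phase: reads hit the database, writes stay private.
    def read(self, key):
        self.read_set.add(key)
        if key in self.workspace:
            return self.workspace[key]
        return self.db.data[key]

    def write(self, key, value):
        self.workspace[key] = value

    # Validate phase: look for conflicts with txns that committed meanwhile.
    def validate(self):
        for number, write_set in self.db.commit_log:
            if number > self.start and write_set & self.read_set:
                return False
        return True

    # Write phase: copy the workspace into the database if validation passed.
    def commit(self):
        if not self.validate():
            return False
        self.db.data.update(self.workspace)
        self.db.commit_counter += 1
        self.db.commit_log.append((self.db.commit_counter, set(self.workspace)))
        return True
```

A txn that fails validation simply discards its workspace and restarts; nothing it wrote ever reached the database.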
Optimistic Concurrency Control [13][14]
For Ti and Tj where TS(Ti) < TS(Tj)
 Validation Criteria (one of the following must hold)
 All phases of Ti execute before Tj begins
 Ti ends before the write phase of Tj begins, and Ti doesn’t modify data read by Tj
 Ti finishes its read phase before Tj finishes its read phase, and Ti doesn’t modify data read or written by Tj
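The validation criteria can be written as a pairwise check. The sketch below assumes logical integer times for the phase boundaries and reads the third criterion in the usual textbook sense: the write set of Ti is disjoint from both the read set and the write set of Tj. `TxnRecord` and `valid_pair` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class TxnRecord:
    start: int                 # start of the read phase
    read_end: int              # end of the read phase
    write_start: int           # start of the write phase
    finish: int                # end of the write phase
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def valid_pair(ti, tj):
    """Check Tj against Ti, where TS(Ti) < TS(Tj). One criterion is enough."""
    # 1. Ti completed all its phases before Tj started.
    if ti.finish < tj.start:
        return True
    # 2. Ti completed before Tj's write phase, and Ti wrote nothing Tj read.
    if ti.finish < tj.write_start and not (ti.write_set & tj.read_set):
        return True
    # 3. Ti finished reading before Tj did, and Ti wrote nothing Tj read or wrote.
    if ti.read_end < tj.read_end and not (ti.write_set & (tj.read_set | tj.write_set)):
        return True
    return False
```

Tj passes validation only if `valid_pair(ti, tj)` holds for every relevant Ti with a smaller timestamp; otherwise Tj is restarted.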
Optimistic Concurrency Control [13][14]

Validation
 For validating Tj w.r.t. a committed txn Ti where TS(Ti) < TS(Tj)
 Maintain read/write object lists for Tj
 Other txns can’t commit while Tj is being validated
 Once validated, the write phase is allowed to finish
 This can become a bottleneck
Optimistic Concurrency Control [13][14]

Advantages
 Increased concurrency with a good “mix” of txns
 Can perform better than lock-based systems under such a mix

Disadvantages
 Validation can become a bottleneck
 A read/write list must be maintained for every txn
 The private workspace must be copied to the database
 Long txns
Optimistic Concurrency Control [13][14]

Disadvantages
 Long txns
 The read/write list would be very long
 The chance of a restart is proportional to the square of the txn’s size [9]
Research

Optimistic CC algorithm
 IBM’s IMS FASTPATH (Centralized DBMS)
 OCC in Distributed DBMS
Conclusion
 Serializability Theory
 Lock Based Systems
 Optimistic CC algorithms
 Timestamp Ordering
Questions?