CS 245: Database System Principles

CS4432: Database Systems II
Lecture #3
Using the Disk, and Disk Optimizations
Professor Elke A. Rundensteiner
CS 4432
lecture #3
1
Thus far:
• Hardware: Disks
• Architecture: Layers of Access
• Access Times and Abstractions
Optimizations
(in controller or O.S.)
• Disk Scheduling Algorithms
– e.g., elevator algorithm
• Larger Buffer
• Pre-fetch
• Disk Arrays
• Disk Cache
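To make the elevator algorithm concrete, here is a minimal Python sketch. It assumes the variant in which the head reverses at the last pending request rather than at the edge of the disk. The function names and the sample request list are illustrative and do not come from the lecture.

```python
def elevator_order(requests, head, direction=1):
    """Return the order in which an elevator-style scheduler serves requests.

    requests:  cylinder numbers with pending I/O (hypothetical input)
    head:      cylinder where the disk head currently sits
    direction: +1 if the head is moving toward higher cylinders, -1 otherwise
    """
    # Requests the head will pass on its current sweep, and those behind it.
    ahead = sorted(c for c in requests if (c - head) * direction >= 0)
    behind = sorted(c for c in requests if (c - head) * direction < 0)
    if direction > 0:
        # Sweep upward first, then reverse and serve the rest on the way down.
        return ahead + behind[::-1]
    # Sweep downward first, then reverse and serve the rest on the way up.
    return ahead[::-1] + behind


def total_seek_distance(order, head):
    """Total number of cylinders the head travels when serving `order`."""
    distance = 0
    for cylinder in order:
        distance += abs(cylinder - head)
        head = cylinder
    return distance


pending = [98, 183, 37, 122, 14, 124, 65, 67]
order = elevator_order(pending, head=53)
print(order, total_seek_distance(order, 53))
print("arrival order:", total_seek_distance(pending, 53))
```

Serving requests in sweep order avoids the back-and-forth head movement that arrival order can cause. The price is that a request just behind the head waits until the sweep comes back.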
One Simple Idea : Prefetching
Problem: Have a File
» Sequence of Blocks B1, B2, ...
Have a Program
» Process B1
» Process B2
» Process B3
Single Buffer Solution
(1) Read B1 → Buffer
(2) Process Data in Buffer
(3) Read B2 → Buffer
(4) Process Data in Buffer ...
Say P = time to process/block
R = time to read in 1 block
n = # blocks
Single buffer time = n(P+R)
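The n(P+R) figure can be checked by stepping through the single-buffer loop. The sketch below assumes abstract time units and that reading and processing never overlap. The function name is illustrative.

```python
def single_buffer_time(n, P, R):
    """Simulate the single-buffer loop and return the total elapsed time.

    n: number of blocks, P: time to process a block, R: time to read a block.
    """
    clock = 0
    for _ in range(n):
        clock += R  # read the next block into the only buffer (CPU waits)
        clock += P  # process the block (disk sits idle)
    return clock


print(single_buffer_time(n=10, P=3, R=2))
```

Each block costs a full read plus a full processing step because the one buffer cannot be refilled while its contents are still in use.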
Question:
Could the DBMS know something about the
behavior of such future block accesses?
What if:
If we knew more about the sequence of
future block accesses, how could we do
better?
Idea : Double Buffering/Prefetching
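[Figure: the disk holds blocks A B C D E F G; memory holds two buffers. While the block in one buffer is being processed, the next block is read into the other buffer; processed blocks are marked "done".]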
Say P ≥ R
P = Processing time/block
R = IO time/block
n = # blocks
What is processing time now?
• Double buffering time = ?
• Double buffering time = R + nP
• Single buffering time = n(R+P)
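The double-buffering figure can be reproduced with a small timeline simulation. This is a sketch under stated assumptions: exactly two buffers, one disk, one processor, and reads that start as soon as both the disk and a buffer are free. The function name is illustrative.

```python
def double_buffer_time(n, P, R):
    """Simulate double buffering and return when the last block is processed.

    n: number of blocks, P: processing time per block, R: read time per block.
    """
    read_done = [0] * n  # time at which block i is fully in memory
    proc_done = [0] * n  # time at which block i has been processed
    for i in range(n):
        # The read of block i needs the disk (read i-1 finished) and a free
        # buffer (block i-2, which used the same buffer, has been processed).
        disk_free = read_done[i - 1] if i >= 1 else 0
        buffer_free = proc_done[i - 2] if i >= 2 else 0
        read_done[i] = max(disk_free, buffer_free) + R
        # Processing block i needs the block in memory and the CPU free.
        cpu_free = proc_done[i - 1] if i >= 1 else 0
        proc_done[i] = max(read_done[i], cpu_free) + P
    return proc_done[-1] if n else 0


n, P, R = 10, 3, 2
print("double buffering:", double_buffer_time(n, P, R), "formula:", R + n * P)
print("single buffering:", n * (R + P))
```

With P ≥ R, only the first read is exposed. Every later read finishes while the previous block is still being processed, so the processor never waits again.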
Disk Arrays
• RAIDs (various flavors)
• Block Striping
• Mirrored
logically one disk
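As a rough illustration of how an array can look like one logical disk, the sketch below maps a logical block number to physical locations. It assumes round-robin striping with one block per stripe unit, and mirroring that keeps each block at the same offset on every copy. Real RAID layouts vary, and the function names are hypothetical.

```python
def striped_location(logical_block, num_disks):
    """Block striping: return (disk index, block offset on that disk)."""
    # Consecutive logical blocks go to consecutive disks, wrapping around.
    return logical_block % num_disks, logical_block // num_disks


def mirrored_locations(logical_block, num_copies=2):
    """Mirroring: every copy stores the block at the same offset."""
    return [(disk, logical_block) for disk in range(num_copies)]


for block in range(6):
    print(block, striped_location(block, num_disks=4))
print(mirrored_locations(7))
```

Striping lets a sequential scan keep several disks busy at once. Mirroring allows a read to be served by any copy but requires a write to update all of them.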
On Disk Cache
[Figure: components labeled P ... M and C ..., with two boxes labeled "cache"]
Block Size Selection?
• Question:
Do we want Small or Big Block Sizes?
• Pros ?
• Cons ?
Block Size Selection?
• Big Block ⇒ Amortize I/O Cost
– Seek and rotational delays per byte
transferred are reduced …
Unfortunately...
• Big Block ⇒ Read in more useless stuff,
and each block takes longer to read!
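The trade-off can be made concrete with a simple cost model: one block read costs a seek, a rotational delay, and a transfer time proportional to block size. All three constants below are assumed values chosen for illustration, not measurements from the lecture.

```python
SEEK_MS = 10.0             # assumed average seek time
ROTATION_MS = 4.0          # assumed average rotational delay
TRANSFER_MS_PER_KB = 0.01  # assumed transfer time per KB


def block_read_ms(block_kb):
    """Time to read one block of the given size."""
    return SEEK_MS + ROTATION_MS + block_kb * TRANSFER_MS_PER_KB


def ms_per_kb(block_kb):
    """Amortized cost per KB when the whole block is useful."""
    return block_read_ms(block_kb) / block_kb


for block_kb in (4, 64, 1024):
    print(f"{block_kb:5d} KB block: {block_read_ms(block_kb):6.2f} ms per read, "
          f"{ms_per_kb(block_kb):.4f} ms per KB")
```

Under this model the cost per KB falls as blocks grow, because the fixed seek and rotation overhead is spread over more data. The time per read rises, and that extra time is wasted whenever only a small part of the block is needed.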