Secondary Storage
CSCI 444/544 Operating Systems
Fall 2008

Agenda
• Overview of secondary storage (disks)
• Disk structure
• Disk performance
• Disk scheduling
• Disk management
• RAID (Redundant Arrays of Inexpensive Disks)
Secondary Storage
Secondary storage typically:
• is anything that is outside of “primary memory”
• does not permit direct execution of instructions or data
retrieval via machine load/store instructions
Characteristics:
• it’s large: 200-1000GB
• it’s cheap: $0.45/GB
• it’s persistent: data survives power loss
• it’s slow: milliseconds to access
Disk trends
Disk capacity, 1975-1989
• doubled every 3+ years
• 25% improvement each year
• factor of 10 every decade
• Still exponential, but far less rapid than processor
performance
Disk capacity since 1990
• doubling every 12 months
• 100% improvement each year
• factor of 1000 every decade
• 10x as fast as the increase of processor
performance!
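The growth figures above are three views of the same compound rate. A minimal sketch that checks the arithmetic, assuming steady year-over-year compounding (the function names are illustrative, not from the slides):

```python
import math

def doubling_time_years(annual_growth):
    """Years needed to double at a compound annual growth rate (0.25 means 25%)."""
    return math.log(2) / math.log(1 + annual_growth)

def decade_factor(annual_growth):
    """Total growth factor after 10 years of compounding."""
    return (1 + annual_growth) ** 10

for label, rate in [("1975-1989", 0.25), ("since 1990", 1.00)]:
    print(f"{label}: doubles every {doubling_time_years(rate):.1f} years, "
          f"about x{decade_factor(rate):.0f} per decade")
```

At 25% per year the doubling time comes out slightly above three years and the decade factor slightly below 10; at 100% per year the decade factor is 2^10 = 1024, which the slide rounds to 1000.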
Memory Hierarchy (level: capacity, access time)
• CPU registers: <1KB, <1ns
• L1 cache: 64KB, 1 ns
• L2 cache: 4MB, 4 ns
• Primary Memory: 2GB, 10 ns
• Secondary Storage: 1000GB, 10 ms
• Tertiary Storage: 1-1000TB, 1s-1hr
Each level acts as a cache of lower levels
Disks and the OS
Disks are messy, messy devices
• errors, bad blocks, missed seeks, etc.
Job of OS is to hide this mess from higher-level software
• low-level device drivers (initiate a disk read, etc.)
• higher-level abstractions (files, databases, etc.)
OS may provide different levels of disk access to different
clients
• physical disk block (surface, cylinder, sector)
• disk logical block (disk block #)
• file logical (filename, block or record or byte #)
[Figure: Physical Disk Structure]
Disk Controller
Responsible for interface between OS and disk drive
• Common interfaces: ATA/IDE vs. SCSI
– ATA/IDE used for personal storage
– SCSI for enterprise-class storage
Basic operations
• Read block
• Write block
OS does not know of internal complexity of disk
• Disk exports array of Logical Block Numbers (LBNs)
• Disks map internal sectors to LBNs
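To make the LBN abstraction concrete, here is a minimal sketch of how a logical block number could correspond to a (cylinder, surface, sector) address. It assumes an idealized geometry with the same number of sectors on every track; real drives use zoned recording and spare sectors, so the actual mapping is private to the drive. The Geometry and PhysicalAddress types and both function names are hypothetical.

```python
from typing import NamedTuple

class Geometry(NamedTuple):
    cylinders: int
    surfaces: int             # one head per surface
    sectors_per_track: int    # assumed constant (no zoning)

class PhysicalAddress(NamedTuple):
    cylinder: int
    surface: int
    sector: int               # 0-based in this sketch

def lbn_to_physical(lbn, geo):
    """Fill a track, then the next surface of the same cylinder, then the next cylinder."""
    total = geo.cylinders * geo.surfaces * geo.sectors_per_track
    if not 0 <= lbn < total:
        raise ValueError("LBN out of range")
    track, sector = divmod(lbn, geo.sectors_per_track)
    cylinder, surface = divmod(track, geo.surfaces)
    return PhysicalAddress(cylinder, surface, sector)

def physical_to_lbn(addr, geo):
    """Inverse of lbn_to_physical."""
    track = addr.cylinder * geo.surfaces + addr.surface
    return track * geo.sectors_per_track + addr.sector

geo = Geometry(cylinders=100, surfaces=4, sectors_per_track=63)
print(lbn_to_physical(252, geo))   # first sector of the second cylinder
```

Numbering all surfaces of a cylinder before moving to the next cylinder keeps consecutive LBNs reachable without a seek, which is why contiguous LBNs tend to be fast to read.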
Disk Operations
Disk performance depends on a number of operations
• seek: moving the disk arm (head) to the correct cylinder
– depends on how fast disk arm can move
• rotation (latency): waiting for the sector to rotate under head
– depends on rotation rate of disk
• transfer: sequentially moving data from surface into disk controller,
and from there sending it back to host
– depends on density of bytes on disk (and on rotation rate)
When the OS uses the disk, it tries to minimize the cost of all
of these operations
• particularly seeks and rotation
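The three components above can be combined into a rough service-time model. This is a sketch under stated assumptions: the default values (8 ms average seek, 7200 RPM, 512-byte sectors, 60 MB/s sustained transfer) are illustrative and do not come from the slides, and the average rotational delay is taken as half a revolution.

```python
def rotational_delay_ms(rpm):
    """Average rotational delay: half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm

def transfer_ms(n_sectors, sector_bytes, mb_per_s):
    """Time to stream n sectors at the sustained transfer rate, in milliseconds."""
    return n_sectors * sector_bytes / (mb_per_s * 1_000_000) * 1000

def access_ms(n_sectors, seek_ms=8.0, rpm=7200, sector_bytes=512, mb_per_s=60.0):
    """Seek + rotation + transfer for one request of n contiguous sectors."""
    return (seek_ms
            + rotational_delay_ms(rpm)
            + transfer_ms(n_sectors, sector_bytes, mb_per_s))

# One sequential request for 2048 sectors vs. 2048 random one-sector requests.
sequential = access_ms(2048)
random_total = 2048 * access_ms(1)
print(f"sequential: {sequential:.1f} ms, random: {random_total:.0f} ms")
```

With these assumed numbers the random workload pays the positioning cost 2048 times and ends up several hundred times slower, which is why the OS concentrates on reducing seeks and rotation.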
Disk Performance
Positioning (head): Seek + Rotation
• Positioning time: Seek time + Rotational Delay
How long to read or write n sectors?
• Positioning time + Transfer time (n)
Implicit contract:
• Large sequential accesses to contiguous LBNs achieve
much better performance than small transfers or
random accesses
Disk Scheduling
Goal: Minimize positioning time
FCFS: Schedule requests in order received
• Advantage: Fair
• Disadvantage: High seek cost and rotation
[Figure: FCFS]
Shortest seek time first (SSTF):
• Handle nearest cylinder next
• Advantage: Reduces arm movement (seek time)
• Disadvantage: Unfair, can starve some requests
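The difference between the two policies can be seen by counting cylinders traveled. A minimal sketch, assuming a static queue (no new arrivals) and an illustrative request list and starting cylinder that are not taken from the slides:

```python
def fcfs(requests, head):
    """FCFS: service requests in arrival order (head position is ignored)."""
    return list(requests)

def sstf(requests, head):
    """SSTF: repeatedly service the pending request nearest the current head position."""
    pending = list(requests)
    order = []
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

def total_movement(order, head):
    """Total number of cylinders the arm travels when servicing `order`."""
    total = 0
    for cyl in order:
        total += abs(cyl - head)
        head = cyl
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]   # illustrative cylinder numbers
start = 53                                    # illustrative initial head position

print("FCFS:", total_movement(fcfs(queue, start), start))   # 640 cylinders
print("SSTF:", total_movement(sstf(queue, start), start))   # 236 cylinders
```

With a static queue SSTF always finishes. Starvation appears only when new requests near the head keep arriving, so a distant request is never the nearest one.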
[Figure: SSTF]
Disk Scheduling (II)
• SCAN (elevator algorithm)
– move arm from one end toward the other end
– service requests until reaching the other end, then reverse direction
– wait times are non-uniform: middle cylinders are passed more often
than cylinders near the edges
• C-SCAN (Circular-Scan)
– like SCAN, but service requests in one direction only, then return
to the start and begin again (like a typewriter)
– more uniform wait times
• LOOK and C-LOOK
– similar to SCAN and C-SCAN, except the arm stops at the last request
in each direction instead of going to the end of the disk
– look for a pending request before continuing to move in a given direction
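The four variants differ only in where the arm turns around and whether it services requests on the way back. A minimal sketch, assuming a static queue, an arm that initially moves toward higher cylinder numbers, and illustrative values for the request list, starting cylinder, and disk size (none of these come from the slides). Each function returns the list of cylinders the arm stops at, including edge stops that are not requests.

```python
def _split(requests, head):
    """Requests at or above the head (ascending) and those below it (ascending)."""
    up = sorted(c for c in requests if c >= head)
    below = sorted(c for c in requests if c < head)
    return up, below

def scan(requests, head, max_cyl):
    """SCAN: sweep up to the last cylinder, then reverse and sweep down."""
    up, below = _split(requests, head)
    edge = [] if up and up[-1] == max_cyl else [max_cyl]
    return up + edge + below[::-1]

def c_scan(requests, head, max_cyl):
    """C-SCAN: sweep up to the last cylinder, jump to cylinder 0, sweep up again."""
    up, below = _split(requests, head)
    edge = [] if up and up[-1] == max_cyl else [max_cyl]
    restart = [] if below and below[0] == 0 else [0]
    return up + edge + restart + below

def look(requests, head):
    """LOOK: like SCAN, but reverse at the last pending request."""
    up, below = _split(requests, head)
    return up + below[::-1]

def c_look(requests, head):
    """C-LOOK: like C-SCAN, but jump from the last request to the lowest one."""
    up, below = _split(requests, head)
    return up + below

def total_movement(path, head):
    """Total cylinders traveled along `path`, counting the return jump as movement."""
    total = 0
    for cyl in path:
        total += abs(cyl - head)
        head = cyl
    return total

queue, start, last = [98, 183, 37, 122, 14, 124, 65, 67], 53, 199
for name, path in [("SCAN", scan(queue, start, last)),
                   ("C-SCAN", c_scan(queue, start, last)),
                   ("LOOK", look(queue, start)),
                   ("C-LOOK", c_look(queue, start))]:
    print(name, path, total_movement(path, start))
```

Counting the return jump at full cost is a simplification: on a real drive the long seek back takes less time per cylinder than many short seeks, so the circular variants are not as expensive as the raw totals suggest.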
[Figures: SCAN, C-SCAN, C-LOOK]
Disk Management
Low-level formatting, or physical formatting: dividing a disk
into sectors that the disk controller can read and write.
To use a disk to hold files, the operating system still needs to
record its own data structures on the disk.
• Partition the disk into one or more groups of cylinders.
• Logical formatting, or “making a file system”.
The boot block initializes the system.
• A small bootstrap loader program is stored in ROM.
• The full bootstrap program is stored in the “boot block” at a fixed
location on the disk.
Reliability
Disks fail more often:
• When continuously powered-on
• With heavy workloads
• Under high temperatures
How do disks fail?
• Whole disk can stop working (e.g., motor dies)
• Transient problem (e.g., cable disconnected)
• Individual sectors can fail (e.g., head crash or scratch)
– Data can be corrupted or block not readable/writable
Disks can internally fix some sector problems
• ECC (error correction code): Detect/correct bit flips
• Retry sector reads and writes: Try 20-30 different offset and timing
combinations for heads
• Remap sectors: Do not use bad sectors in future
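To illustrate what "detect/correct bit flips" means, here is a toy Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. This is a stand-in chosen for brevity: real drives protect whole sectors with much stronger codes (such as Reed-Solomon or LDPC), and the function names here are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into 7 bits (positions 1..7, parity bits at 1, 2 and 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4      # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4   # syndrome names the flipped position
    if error_position:
        c[error_position - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]], error_position

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                                # simulate a single bit flip at position 5
print(hamming74_decode(word))               # data recovered, error reported at 5
```

A code like this silently miscorrects when two bits flip, which is one reason the drive also falls back on retries and sector remapping.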
RAID
RAID: multiple disk drives provide reliability via redundancy
• Performance: parallel access
• Capacity: store more data
Disk striping uses a group of disks as one storage unit.
RAID schemes improve performance and improve the
reliability of the storage system by storing redundant data.
• Mirroring or shadowing keeps a duplicate of each disk.
• Block-interleaved parity uses much less redundancy than mirroring.
RAID turns multiple disks into one bigger, faster, more
reliable disk
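Striping and parity can both be sketched in a few lines. This is a simplified illustration, assuming round-robin striping at block granularity and a single XOR parity block per stripe; the function names and sample blocks are hypothetical.

```python
from functools import reduce

def stripe_location(block, n_disks):
    """Map a logical block to (disk index, block offset on that disk)."""
    return block % n_disks, block // n_disks

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One block from each of three data disks, plus parity stored on a fourth disk.
data = [b"disk", b"raid", b"scan"]
parity = xor_blocks(data)

# Disk 1 fails: rebuild its block from the surviving blocks and the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)   # b'raid'
```

Parity costs one extra disk for the whole group, whereas mirroring doubles the disk count. The trade-off is that every write must also update the parity block.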