HO Ch 8 - people.vcu.edu

Chapter 8
11/23/09
I/O System
• Storage
– Must be dependable
• Networks
– Must tolerate faults in communications by
including mechanisms to detect and recover
from faults.
• Peripherals
– Extremely Diverse
I/O Systems
• Emphasis is placed on dependability and
cost.
– Processors and memory emphasize
performance and cost.
• I/O System performance must keep pace
with processor performance.
– I/O can become a bottleneck.
I/O performance
• Complex
– Access Latency
– Throughput
• Depends on many aspects of the system.
– Device characteristics
– Connection between device and rest of system
– Memory hierarchy
– Operating system
– Etc.
• I/O benchmarks are primitive compared to
processor benchmarks.
Transferring data between a device
and memory
• Polling
• I/O interrupts
• DMA
– Special DMA controller handles transfers
– Processor sets up DMA
– DMA controller starts operation, arbitrates the
bus, and interrupts processor when DMA is
complete.
Disk Storage
• Nonvolatile – Data is not lost when power is
turned off.
• Consists of platters (1-4), each with two
recordable disk surfaces, and R/W heads.
• Platters are rotated at 5400-15,000 RPM.
• Each disk surface is divided into tracks.
• Tracks are divided into sectors.
Answer: Second. Track Seek time.
Disk manufacturers report minimum, maximum, and average
seek time.
The first two are easy to measure.
The average is open to wide interpretation because it depends on
seek distance. The standard definition is the sum of the times for
all possible seeks divided by the number of possible seeks.
The actual average may be considerably less because of locality of
disk references.
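As a rough illustration of the standard definition, the sketch below averages over every possible pair of distinct start and end tracks. The linear seek-time model, its constants (`settle_ms`, `per_track_ms`), and the track count are hypothetical values chosen for illustration, not figures from any drive data sheet.

```python
from itertools import product


def seek_time_ms(distance, settle_ms=1.0, per_track_ms=0.002):
    """Hypothetical seek-time model: zero for no movement, otherwise a
    fixed settle time plus a cost per track crossed."""
    if distance == 0:
        return 0.0
    return settle_ms + per_track_ms * distance


def average_seek_distance(num_tracks):
    """Mean distance over all ordered pairs of distinct tracks."""
    dists = [abs(a - b)
             for a, b in product(range(num_tracks), repeat=2) if a != b]
    return sum(dists) / len(dists)


def standard_average_seek_ms(num_tracks):
    """Sum of the times for all possible seeks divided by the number of
    possible seeks (every ordered pair of distinct tracks)."""
    times = [seek_time_ms(abs(a - b))
             for a, b in product(range(num_tracks), repeat=2) if a != b]
    return sum(times) / len(times)


if __name__ == "__main__":
    n = 1000  # assumed track count, kept small so the loops run quickly
    print(f"average seek distance: {average_seek_distance(n):.1f} tracks")
    print(f"standard average seek: {standard_average_seek_ms(n):.3f} ms")
```

Under this definition the average seek distance works out to about one third of the number of tracks. A workload with locality makes mostly shorter seeks, which is one way the actual average can end up lower than the reported one.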
Transfer time – time to transfer a block of bits.
Function of sector size, rotation speed and recording density.
30 – 80 MB/sec typical.
However, most disk controllers have a built-in cache that stores
sectors as they are passed over, resulting in higher transfer rates.
Today most disk transfers are multiple sectors long.
Controller time – Overhead imposed by controller in performing
disk I/O.
Disk I/O time consists of the above times plus any wait time
because other processes are using the disk.
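The components of disk I/O time can be added up as in the sketch below. It assumes the usual convention that average rotational latency is half a rotation; every numeric value in the example call (seek time, RPM, request size, transfer rate, controller overhead) is a hypothetical input chosen for illustration.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a rotation."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_ms(num_bytes, rate_mb_per_s):
    """Time to move num_bytes at a sustained rate (1 MB = 10**6 bytes)."""
    return num_bytes / (rate_mb_per_s * 1e6) * 1000.0


def disk_io_time_ms(seek_ms, rpm, num_bytes, rate_mb_per_s,
                    controller_ms, wait_ms=0.0):
    """Seek + rotational latency + transfer + controller + wait time."""
    return (seek_ms
            + rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, rate_mb_per_s)
            + controller_ms
            + wait_ms)


if __name__ == "__main__":
    # Hypothetical request: one 512-byte sector on an idle 15,000 RPM disk.
    t = disk_io_time_ms(seek_ms=4.0, rpm=15_000, num_bytes=512,
                        rate_mb_per_s=50, controller_ms=0.2)
    print(f"disk I/O time: {t:.3f} ms")
```

With these inputs the seek and rotational terms dominate; the transfer of a single sector contributes only about a hundredth of a millisecond.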
Replace large disk with many small disks.
RAID
• RAID 0 – Spreads data over multiple drives.
– Called striping; improves performance but provides
no redundancy.
• RAID 1 – Mirrors or shadows data to a
redundant drive.
• RAID 2-6 – Incorporate error correction
techniques.
RAID 5
• To illustrate concepts and implications,
consider RAID 5.
• RAID 5 uses a striped array with rotating
parity.
• Optimized for short, multithreaded
transfers.
• Capable of recovering from a single drive
failure.
RAID 5 system consisting of three data drives and rotating parity.
Four stripes for sectors A, B, C, and D are shown.
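One possible layout for such a system can be generated as in the sketch below. The rotation direction (parity starts on the last drive and moves one drive to the left on each stripe) and the labels such as `A0` and `Ap` are assumptions made for illustration; the layout in the original figure may differ.

```python
def raid5_layout(num_drives=4, stripe_names="ABCD"):
    """Return one row per stripe; each row lists what every drive holds.

    Assumed rotation: parity starts on the last drive and moves one
    drive to the left on each successive stripe.
    """
    rows = []
    for s, name in enumerate(stripe_names):
        parity_drive = (num_drives - 1 - s) % num_drives
        row, k = [], 0
        for d in range(num_drives):
            if d == parity_drive:
                row.append(f"{name}p")      # parity sector for this stripe
            else:
                row.append(f"{name}{k}")    # k-th data sector of the stripe
                k += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("  ".join(f"drive{d}" for d in range(4)))
    for row in raid5_layout():
        print("  ".join(f"{cell:>6}" for cell in row))
```

Each drive holds the parity sector for exactly one of the four stripes, so parity traffic is spread evenly across the array.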
Rotating Parity
• Why rotating parity?
• The following steps are necessary to update a single data sector
in a stripe.
– The old data sector and the parity sector for the stripe must be read.
– Compute the new parity using the new data sector, old data sector, and
old parity.
– Write new data sector and new parity sector.
• Thus, to write to a data sector both the data sector and parity
sector must be read and written.
• Since there are many data drives, a fixed parity drive would be
accessed much more frequently than any one data drive.
• This excessive access of a single parity drive is avoided by
rotating parity across all drives.
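The update steps above can be sketched as follows. This is a minimal illustration with sectors modeled as short `bytes` objects; the helper names (`xor_bytes`, `full_parity`, `update_parity`) and the sample values are hypothetical. It checks that the parity computed from only the old data, new data, and old parity matches a full recomputation over the stripe.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(sectors):
    """Parity computed from scratch over all data sectors in a stripe."""
    return reduce(xor_bytes, sectors)


def update_parity(old_data, old_parity, new_data):
    """Small-write update: new parity = old parity XOR old data XOR new data.
    Only the target data sector and the parity sector are read."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


if __name__ == "__main__":
    # Hypothetical stripe with three data sectors of four bytes each.
    stripe = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
    parity = full_parity(stripe)

    new_sector = bytes([0xAA, 0xBB, 0xCC, 0xDD])
    new_parity = update_parity(stripe[1], parity, new_sector)  # two reads
    stripe[1] = new_sector                                     # two writes:
    parity = new_parity                                        # data + parity

    print(parity == full_parity(stripe))
```

The other data drives in the stripe are never touched, so the cost of a single-sector write is two reads and two writes regardless of how many data drives the array has.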
Parity
Parity encoding is given by
$$P = D_0 \oplus D_1 \oplus D_2 \oplus D_3 \oplus D_4 \oplus D_5$$
where $D_i$ represents a data byte in a sector on drive $i$.
If both sides of the above equation are exclusive-ORed
with $P$, then
$$P \oplus P = D_0 \oplus D_1 \oplus D_2 \oplus D_3 \oplus D_4 \oplus D_5 \oplus P$$
$$0 = D_0 \oplus D_1 \oplus D_2 \oplus D_3 \oplus D_4 \oplus D_5 \oplus P$$
$D_5$, for example, can be recovered by
$$D_5 = D_0 \oplus D_1 \oplus D_2 \oplus D_3 \oplus D_4 \oplus P$$
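The recovery rule can be checked with a short sketch. Sectors are modeled as `bytes` objects, and the helper names (`xor_all`, `recover`) and the sample data are hypothetical; the same XOR recovers whichever single drive is missing.

```python
from functools import reduce


def xor_all(blocks):
    """Byte-wise XOR across a list of equal-length sectors."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover(surviving_data, parity):
    """Rebuild the one missing sector from the survivors and the parity."""
    return xor_all(list(surviving_data) + [parity])


if __name__ == "__main__":
    # Six hypothetical data sectors D0..D5, three bytes each.
    data = [bytes([17 * i + j for j in range(3)]) for i in range(6)]
    p = xor_all(data)

    # Lose D5: XOR D0..D4 with P.
    print(recover(data[:5], p) == data[5])
    # Lose D2 instead: the same computation works for any single drive.
    print(recover(data[:2] + data[3:], p) == data[2])
```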
RAID 6
• Use two parity drives (P and Q).
• Data can be recovered if two sectors in a
stripe are corrupted.
• P parity is the same as RAID 5 (simple
XOR).
– Easy to encode and easy to recover data.
• Q parity is more complicated.
Q parity encoding
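$$P = D_0 \oplus D_1 \oplus D_2 \oplus D_3 \oplus D_4 \oplus D_5$$
The Q parity is a Reed-Solomon code given by
$$Q = (g_0 \otimes D_0) \oplus (g_1 \otimes D_1) \oplus (g_2 \otimes D_2) \oplus (g_3 \otimes D_3) \oplus (g_4 \otimes D_4) \oplus (g_5 \otimes D_5)$$
where $\otimes$ is Galois Field (GF) multiplication and $g_i$ is a
constant. For $i < 8$ it turns out that $g_i = 2^i$. For larger $i$, it is not
as simple. For example, $g_8 = 29$.
With these constants, Q simplifies to
$$Q = (1 \otimes D_0) \oplus (2 \otimes D_1) \oplus (4 \otimes D_2) \oplus (8 \otimes D_3) \oplus (16 \otimes D_4) \oplus (32 \otimes D_5)$$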
The problem is how to compute the GF multiplication.
GF multiplication
• In ordinary arithmetic, multiplication can be
accomplished by summing the logs and taking the
inverse log.
• GF multiplication is typically accomplished
using lookup tables to find the GF log and
inverse log. The addition is modulo 255.
$$A \otimes B = \log_{GF}^{-1}\left[\left(\log_{GF}(A) + \log_{GF}(B)\right) \bmod 255\right]$$
See Xilinx application note XAPP731,
“Hardware Accelerator for RAID 6 Parity
Generation / Data Recovery Controller”.
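The table-lookup approach can be sketched as follows. This is a minimal illustration, not the implementation from the application note: it assumes the field GF(2^8) with polynomial 0x11D and generator 2, a choice that is consistent with the constant 29 for the eighth power of the generator. The names `build_tables`, `gf_mul`, and `pq_parity` are hypothetical.

```python
POLY = 0x11D  # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1


def build_tables():
    """Build the GF log table and the inverse-log (power-of-2) table."""
    inv_log = [0] * 255   # inv_log[i] = 2**i in the field
    log = [0] * 256       # log[x] for x in 1..255; log[0] is undefined
    x = 1
    for i in range(255):
        inv_log[i] = x
        log[x] = i
        x <<= 1           # multiply by 2
        if x & 0x100:     # reduce modulo the field polynomial
            x ^= POLY
    return log, inv_log


GF_LOG, GF_INV_LOG = build_tables()


def gf_mul(a, b):
    """GF multiplication: add the logs modulo 255, then take the inverse log."""
    if a == 0 or b == 0:
        return 0          # zero has no log, so handle it separately
    return GF_INV_LOG[(GF_LOG[a] + GF_LOG[b]) % 255]


def pq_parity(data_bytes):
    """P and Q for one byte position across the data drives."""
    p = q = 0
    for i, d in enumerate(data_bytes):
        p ^= d                               # P: simple XOR
        q ^= gf_mul(GF_INV_LOG[i % 255], d)  # Q: weight drive i by 2**i
    return p, q


if __name__ == "__main__":
    print(GF_INV_LOG[:9])  # constants for the first nine drives
    print(pq_parity([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC]))
```

Because addition in this field is XOR, both P and Q can be accumulated one drive at a time, and the two tables together occupy only about 512 bytes.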
Buses
• A bus is a shared communication link, which uses
one set of wires to connect multiple subsystems.
• Advantages
– Versatile
– Low cost
• Disadvantage
– Communication bottleneck
• A bus generally consists of control lines and data lines.
– Control lines are used to signal requests and
acknowledgments, and to indicate what type of
information is on the data lines.
– Data lines carry information between the source and
destination. These lines are often separated into address
and data.
Bus Transactions
• A sequence of bus operations that includes a
request and may include a response, either
of which may carry data.
• May require several bus operations to
complete.
• Includes two parts: sending the address and
sending or receiving the data.
Processor-memory bus
• Connects processor and memory.
• Short
• High speed
• Matched to memory system to maximize
memory-processor bandwidth.
I/O Buses
• Can connect many types of I/O devices.
• Can be long.
• Wide range of data bandwidths.
• Provides a way of extending the machine
and adding new peripherals.
Backplane Bus
• Allows processor, memory and I/O to exist
on a single bus.
Synchronous Bus
• Contains a clock as part of the control lines,
and uses a fixed protocol for
communicating that is relative to the clock.
• Every device must run at the clock rate.
• Because of clock skew, synchronous buses
cannot be long.
• Processor-memory buses tend to be
synchronous.
Asynchronous Buses
• Not clocked.
• Uses handshaking.
• Can accommodate a wide variety of devices.
• Can be long.
• Frequently used in I/O buses.
• USB and FireWire are asynchronous buses.