Chapter 5 Division - ECEN 621 Main Page

ECEn 621
Computer Arithmetic
Chapter 5 - Division
Winter 2008
Slide #1
ECEn 621 Computer Arithmetic
Dr. Doran Wilde
Two Types of Division Algorithms
● Integer
▪  Integer operands and result
▪  Requires an exact remainder
▪  Sign(dividend) = Sign(remainder)
● Fractional
▪  To avoid quotient overflow: x < d
▪  Quotient is rounded: 0 < q < 1
▪  Can result in a negative remainder
Definitions for Division
$x = q \cdot d + rem$, with $\mathrm{sign}(x) = \mathrm{sign}(rem)$
$x$: Dividend
$d$: Normalized divisor, $\frac{1}{2} \le |d| < 1$ (shift divisor until $d = \pm 0.1xxxxxx$)
$q$: Quotient
$rem$: Remainder
$\left| \frac{x}{d} - q \right| = \left| \frac{rem}{d} \right| < ulp$, that is, $|rem| < |d| \cdot ulp$ and Error $< ulp$
$ulp = 1$ ⇒ integer quotient, integer division
$ulp = r^{-n}$ ⇒ fractional quotient, fractional division
$q = q_n = q_0 + \sum_{i=1}^{n} D_i r^{-i}$, where the $D_i$ are quotient digits
Choice of digit set for $D_i$ determines the type of algorithm:
$D_i \in \{0, 1, \dots, r-1\}$ ⇒ Restoring algorithm
$r = 2$, $D_i \in \{-1, 1\}$ (no 0) ⇒ Binary non-restoring algorithm
$D_i \in \{-a, \dots, a\}$ (incl 0) ⇒ Non-restoring algorithm (redundant)
Fractional Division Algorithm
Series: $q_k = q_0 + \sum_{i=1}^{k} D_i r^{-i}$, where the $D_i$ are quotient digits in a redundant digit set
Residual: $w_k = r^k (x - d q_k)$
Recurrences: $q_{k+1} = q_k + D_{k+1} r^{-(k+1)}$ and $w_{k+1} = r w_k - d D_{k+1}$
Invariant: $\forall k : x = d q_k + r^{-k} w_k$
Initial Values: $q_0 = 0$, $w_0 = x$
Final Values: $q = q_n$, $rem = r^{-n} w_n$
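The recurrence and its invariant can be checked with a minimal sketch in exact rational arithmetic. This is an illustration only: the function name `fractional_divide` is hypothetical, and it assumes the simplest (restoring) digit set {0, ..., r-1} with 0 ≤ x < d rather than a redundant digit set.

```python
from fractions import Fraction

def fractional_divide(x, d, r, n):
    """Run n steps of the digit recurrence with digits in {0, ..., r-1}.

    Assumes 0 <= x < d so that every digit stays inside the digit set.
    Returns (q_n, rem) with x == d*q_n + rem.
    """
    q, w = Fraction(0), Fraction(x)          # q_0 = 0, w_0 = x
    for k in range(n):
        shifted = r * w                      # r * w_k
        digit = shifted // d                 # largest digit keeping w_{k+1} >= 0
        q += Fraction(digit, r ** (k + 1))   # q_{k+1} = q_k + D_{k+1} r^-(k+1)
        w = shifted - d * digit              # w_{k+1} = r w_k - d D_{k+1}
        # Invariant after this step: x = d q_{k+1} + r^-(k+1) w_{k+1}
        assert x == d * q + w / r ** (k + 1)
    return q, w / r ** n                     # rem = r^-n w_n

x, d = Fraction(1, 3), Fraction(3, 4)
q, rem = fractional_divide(x, d, r=4, n=6)
```

Using `Fraction` keeps every step exact, so the invariant can be asserted with `==` instead of a floating-point tolerance.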
Bound on the Error of Division Recurrence
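Final error: $\left| \frac{x}{d} - q_n \right| < r^{-n}$ (ulp)
$q_n = D_0 . D_1 D_2 D_3 \cdots D_j \, D_{j+1} D_{j+2} \cdots D_n$, where the leading digits $D_0 . D_1 \cdots D_j$ form $q_j$ and $D_{j+1} \cdots D_n$ are the digits left to compute, each $D_i \in \{-a, \dots, a\}$
Geometric Series:
$\left| \frac{x}{d} - q_j \right| \le \left| \frac{x}{d} - q_n \right| + |q_n - q_j| < r^{-n} + \sum_{i=j+1}^{n} a r^{-i}$
$= r^{-n} + a \sum_{i=j+1}^{n} r^{-i}$
$= r^{-n} + a \frac{r^{-j} - r^{-n}}{r - 1}$
$= r^{-n} + \rho r^{-j} - \rho r^{-n}$, where $\rho = \frac{a}{r-1}$
$< \rho r^{-j}$ (for $\rho = 1$; when $\rho < 1$ this neglects the small term $(1 - \rho) r^{-n}$)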
Bound on the Residual
$\left| \frac{x}{d} - q_j \right| < \rho r^{-j}$
$|x - d q_j| < \rho d r^{-j}$
$|r^j (x - d q_j)| < \rho d$, where $w_j = r^j (x - d q_j)$
$|w_j| < \rho d$
Digit Selection for Division
Derivation of Digit Selection
Inductive Assumption:
Inductive Step:
Residual Recurrence:
Digit Selection:
Residual Minimum and Maximum:
Margin:
Overlap between $U_x$ and $L_{x+1}$
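The equations for these steps did not survive in the text. The following is a hedged reconstruction from the residual bound, the residual recurrence, and the $U$ and $L$ lines used elsewhere in these slides. It is a sketch of the standard argument, not necessarily the exact form shown on the slide, and the digit index $j$ is introduced here for illustration.

```latex
\begin{align*}
\text{Inductive assumption:}\quad & |w_k| \le \rho d \\
\text{Residual recurrence:}\quad & w_{k+1} = r w_k - d D_{k+1} \\
\text{Inductive step, choosing } D_{k+1} = j:\quad & |r w_k - j d| \le \rho d \\
\text{Selection interval for digit } j:\quad & L_j = d(j - \rho) \le r w_k \le d(j + \rho) = U_j \\
\text{Coverage of the largest digit:}\quad & U_a = d(a + \rho) = r \rho d \\
\text{Overlap of adjacent intervals:}\quad & U_j - L_{j+1} = d(j + \rho) - d(j + 1 - \rho) = (2\rho - 1) d
\end{align*}
```

The overlap is positive exactly when $\rho > \frac{1}{2}$, which is what lets the digit be chosen from an approximate residual.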
Digit Selection Functions
P-D Diagram for Division, d > 0
Inductive Assumption:
Inductive Step:
Digit Selection
[Figure: P-D diagram plotting $r w_k$ against $d$ for $d \ge \frac{1}{2}$, with one selection region per quotient digit, bounded by the lines below]
Digit    Lower bound                 Upper bound
 2       $L_2 = d(2 - \rho)$         $U_2 = d(2 + \rho)$
 1       $L_1 = d(1 - \rho)$         $U_1 = d(1 + \rho)$
 0       $L_0 = -d\rho$              $U_0 = d\rho$
-1       $L_{-1} = d(-1 - \rho)$     $U_{-1} = d(-1 + \rho)$
-2       $L_{-2} = d(-2 - \rho)$     $U_{-2} = d(-2 + \rho)$
Redundancy Factor ρ and Overlap
                      $a$              $\rho$
Maximal Redundancy    $r - 1$          $1$
Minimal Redundancy    $\frac{r}{2}$    $\frac{1}{2}^{+}$ (just above $\frac{1}{2}$)
Overlap for $d \ge \frac{1}{2}$: $(2\rho - 1) d > 0$
Robertson Diagram for Division
Redundancy in q-Digit Set ⇒ Overlap
Selection Function in P-D Diagram
Digit Selection Using Constants
Radix-2 Division with BSD Quotient Digits (SRT)
Radix-2 Division with BSD Quotient Digits
Relationship Between Robertson and P-D Diagrams
1. Choose a divisor d, 0.5 ≤ d < 1.0
2. This determines the distance between selection bounds.
3. The distance between selection bounds is fixed on a Robertson diagram.
Staircase Selection Functions
Tiles = Uncertainty Rectangles
The larger the tiles, the fewer bits need to be inspected.
Truncation error of rw[j]
If rw[j] is in carry-save form (u+v), then to get j bits of accuracy for rw[j], we need to inspect j+1 bits of u and v.
p-d Plot with Overlap Region
Uncertainty rectangle (because of truncation)
A: 4 bits of p, 3 bits of d: OK
B: 3 bits of p, 4 bits of d: Ambiguous
Choosing the Selection Boundary
1. Tile with the largest admissible rectangles.
2. Verify that no tile intersects both boundaries.
3. Associate a quotient digit with each tile.
The Asymmetry of the Quotient Digit Selection Process
P can also go negative.
The second quadrant is not a simple negation of the first quadrant, due to the asymmetric effect of truncation of two's complement numbers.
Separate table entries for other quadrants must be derived.
Large Uncertainty Rectangles
Only one of the large uncertainty rectangles is not totally in the overlap region.
Break it into smaller rectangles.
One extra bit of both p and d is needed for this case.
Determining Tile Sizes
● Goal: Find the coarsest possible grid such that the staircase boundaries are entirely contained in the overlap areas.
● There is no closed form for the number of bits required, given the parameters r and a.
● However, we can derive lower bounds on the number of bits required.
Finding Lower Bounds on Number of Bits
● Finding an upper bound on the dimensions of the tile box determines a lower bound on the number of bits needed.
● The narrowest overlap area is the area between the two largest digits, $a$ and $a - 1$, at $d_{min}$.
● Find the minimum horizontal and vertical dimensions of the overlap area in that narrowest region.
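A minimal sketch of this bound follows. It assumes the overlap between digits a and a-1 is the wedge between the lines L_a = d(a - ρ) and U_{a-1} = d(a - 1 + ρ), measured at d_min. The formulas for the two dimensions and the names `overlap_bounds` and `min_bits` are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction
import math

def overlap_bounds(r, a, d_min=Fraction(1, 2)):
    """Upper bounds on tile width (delta_d) and height (delta_p).

    Measured in the overlap of digits a and a-1 at d = d_min.
    Assumes a > rho (true for r > 2), so the wedge has finite width.
    """
    rho = Fraction(a, r - 1)                      # redundancy factor
    delta_p = d_min * (2 * rho - 1)               # U_{a-1} - L_a at d_min
    delta_d = d_min * (2 * rho - 1) / (a - rho)   # width at the top of that gap
    return delta_d, delta_p

def min_bits(delta):
    """Fewest fractional bits b such that a tile of size 2^-b fits in delta."""
    return math.ceil(-math.log2(delta))

delta_d, delta_p = overlap_bounds(r=4, a=2)
bits_d, bits_p = min_bits(delta_d), min_bits(delta_p)
```

These are lower bounds only. The staircase may still need an extra bit in one dimension, which is consistent with the 4-bits-of-p, 3-bits-of-d case shown on the p-d plot slide.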
Automating the Process
● Determining the bound on the number of bits required and generating the contents of the digit selection PLA can be easily automated.
● However, the Intel Pentium bug teaches us an important lesson.
Intel’s Pentium Division Bug
● Intel used the Radix-4 SRT division algorithm.
● Quotient selection was implemented as a PLA.
● The p-d plot was numerically generated.
● The script that downloaded entries into the PLA inadvertently removed a few entries from the table.
● When hit, these missing entries resulted in digit 0, instead of the intended digits ±2.
● These entries are consulted very rarely, and thus the bug was very subtle and difficult to detect.
Fractional Division Digit Selection
Digit Set
Redundancy Factor
Residual Bound
Residual Recurrence ($d D_{k+1}$ requires multiple generation)
Digit Selection
Simplified Selection
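The pieces listed on this slide can be put together in a small sketch, using the definitions from the earlier slides: digit set {-a, ..., a}, ρ = a/(r-1), residual bound |w_k| ≤ ρd, and recurrence w_{k+1} = r w_k - d D_{k+1}. The selection rule used here (round r·w_k/d to the nearest integer and clamp it to the digit set) is an assumption chosen for simplicity. It examines the full residual, so it is not the truncated selection a hardware divider would use. The names `select_digit` and `srt_divide` are hypothetical.

```python
from fractions import Fraction
import math

def select_digit(rw, d, a):
    """Round rw/d to the nearest integer, clamped to the digit set {-a..a}."""
    nearest = math.floor(rw / d + Fraction(1, 2))
    return max(-a, min(a, nearest))

def srt_divide(x, d, r, a, n):
    """Digit recurrence with a redundant signed-digit set.

    Requires |x| <= rho*d so the residual bound holds from the start.
    Returns (q_n, w_n, digits).
    """
    rho = Fraction(a, r - 1)                 # redundancy factor
    assert abs(x) <= rho * d
    q, w, digits = Fraction(0), Fraction(x), []
    for k in range(1, n + 1):
        digit = select_digit(r * w, d, a)
        w = r * w - d * digit                # residual recurrence
        q += Fraction(digit, r ** k)
        assert abs(w) <= rho * d             # residual bound is preserved
        digits.append(digit)
    return q, w, digits

x, d = Fraction(1, 3), Fraction(3, 4)
q, w, digits = srt_divide(x, d, r=4, a=2, n=8)
```

Because the digits may be negative, the final residual can be negative as well, which is why a correction step is needed at termination.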
Radix-4 Division, r = 4, a = 2
Radix-4 Digit Selection
(partial P-D diagram)
Carry-Save Remainders
● More important for speed than high-radix
● Lead to large performance increases by replacing carry-propagate adder with carry-save adder.
● Key to keeping remainder in carry-save form is redundancy in the representation of q.
▪  allows less precise guessing of quotient digit based on approximate magnitude of partial remainder
▪  more redundancy → less precision required
Carry-Save Partial Remainders
● Two numbers sum to the actual partial remainder.
● To perform exact comparison, a full CPA would be required.
● Overlaps in the selection regions allow us to perform approximate comparisons without risk of choosing a wrong digit.
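The cost of an approximate comparison can be quantified with a short sketch. It assumes the two carry-save words are two's-complement fixed-point integers and that only their top bits feed a short adder. The name `truncated_estimate` and its parameters are hypothetical.

```python
def truncated_estimate(u, v, frac_bits, keep_bits):
    """Estimate u + v by adding only the leading bits of each word.

    u and v are integers scaled by 2**frac_bits. The result is scaled by
    2**keep_bits. Python's >> floors, which matches two's-complement
    truncation, so the estimate never exceeds the true sum.
    """
    drop = frac_bits - keep_bits
    return (u >> drop) + (v >> drop)

# Example: 6 fraction bits in each word, 2 fraction bits inspected.
estimate = truncated_estimate(0b101101, -0b010011, frac_bits=6, keep_bits=2)
```

Each word loses less than one unit in the last inspected position, so the estimate falls short of the true sum by less than two such units. This is one way to read the earlier rule that j bits of accuracy require inspecting j+1 bits of u and v.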
Carry-Save Partial Remainders
Digit Selection
Summary: Digit Selection in Dividers
● p-d plots can be used to choose quotient digits.
● The number of bits inspected in p and d affects the uncertainty of the quotient digit. Uncertainty rectangles in p-d plots can be used to determine the minimum number of bits to inspect.
● Prescaling can be used to restrict the divisor and simplify the selection of quotient digits.
Division With Prescaling
● The overlap regions are tallest toward the high end of the divisor d range, for example $[1, 1 + 2^{-k})$.
● If we can restrict d to be in a narrow region, then the selection of the quotient digits is simpler (requires fewer bits of p and d, possibly made independent of d altogether).
● Prescaling: instead of computing z/d, multiply both dividend and divisor by a constant m before beginning division and compute zm/dm.
● Table lookup used to determine scaling factor m ≈ 1/d.
● Use existing hardware in divider for multiplying divisor and quotient digits by m.
● Speedup in selection logic must be weighed against extra multiplication steps at the beginning.
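The effect of prescaling on the divisor range can be sketched as follows. The table is modeled by truncating d to `table_bits` fraction bits and storing the reciprocal rounded up to `m_bits` fraction bits. Both parameters, the function name `scaling_factor`, and the specific widths are illustrative assumptions, not the slides' design.

```python
from fractions import Fraction
import math

def scaling_factor(d, table_bits, m_bits):
    """Look up m ~ 1/d from the leading bits of d, rounded up.

    Rounding up guarantees d*m >= 1, so the scaled divisor lands in a
    narrow interval just above 1.
    """
    d_trunc = Fraction(math.floor(d * 2 ** table_bits), 2 ** table_bits)
    return Fraction(math.ceil(2 ** m_bits / d_trunc), 2 ** m_bits)

z, d = Fraction(1, 3), Fraction(5, 8)
m = scaling_factor(d, table_bits=8, m_bits=10)
z_scaled, d_scaled = z * m, d * m    # the quotient z/d is unchanged
```

With these widths the scaled divisor stays in [1, 1 + 2^-6) for every d in [1/2, 1). A larger table tightens the interval at the cost of more lookup bits.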
P-D Diagram for Division with Prescaling
Let $m \approx \frac{1}{d}$ (rounded up)
$q = \frac{z}{d} = \frac{zm}{dm} = \frac{z'}{d'}$, where $1 \le d' < 1 + 2^{-k}$
$d' \in [1, 1 + 2^{-k})$, that is, $d' = 1.000xxx$
The better the approximation for 1/d, the higher the value of k.
If set up properly, digit selection can be done trivially by rounding the residual.
Divider Hardware
Components & Timing of a Division Step
Factors Affecting Recurrence
Affects the number of iterations and multiple generation
Redundancy factor affects digit selection and multiple generation
Slow
Fast computation of residual
And affected by all of the above:
A major contributor to iteration cycle time
Much simplified with prescaling, at the expense of a pre-multiply
Initialization
* Produces a scaled quotient, which must be scaled up (by 2 or 4) during the termination step.
Termination
● Correction step (so final residual is positive)
● Correction for scaling of dividend in initialization by shifting quotient in termination step
● For most floating-point implementations it is required to detect the zero-remainder condition: $w_n = 0 \Rightarrow$ bits of $q_n$ after digit $n$ are 0
Division Implementation
Can be pipelined
Totally Sequential                      1 copy      N steps
Part Sequential / Part Combinational    k copies    N/k steps
Totally Combinational                   N copies    1 step
Comparison of Division and Multiplication
● Same alternatives exist for division as multiplication
▪  Totally sequential
▪  Sequential/Combinational
▪  Totally combinational / pipelined
● Combinational Multiplication
▪  Because of associativity of addition, additions can be performed with
•  Linear array
•  Tree
● Combinational Division
▪  Because of dependences between residual and quotient digit selection, additions must be done with
•  Linear array only
[Figure: One Recursion Step]
Schemes to be Compared
And here are the results!
Radix-2 with Residual in Carry-Save Form
$q_{k+1} = q_k + D_{k+1} 2^{-(k+1)}$
$w_{k+1} = 2 w_k - d D_{k+1}$
Radix-2 with Residual in Carry-Save Form
Truncated estimate of $2 w_k$ (two's complement)    $D_{k+1}$
011.1     1
011.0     1
010.1     1
010.0     1
001.1     1
001.0     1
000.1     1
000.0     1
111.1     0
111.0    -1
110.1    -1
110.0    -1
101.1    -1
101.0    -1
100.1    -1
100.0    -1
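The table can be exercised with a bit-level sketch. It assumes a residual held as two words (sum and carry) of `frac_bits + 3` bits, a 4-bit estimate formed from the top bits of both words, and a final conversion plus correction step. The function names, the word width, and the use of Python integers in place of hardware registers are assumptions made for illustration.

```python
def to_signed(v, width):
    """Interpret v modulo 2**width as a two's-complement integer."""
    v &= (1 << width) - 1
    return v - (1 << width) if v >> (width - 1) else v

def csa(a, b, c, width):
    """3:2 carry-save adder: sum + carry == a + b + c (mod 2**width)."""
    mask = (1 << width) - 1
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return (a ^ b ^ c) & mask, carry & mask

def select_digit(est_halves):
    """Selection table above; the estimate is given in units of 1/2."""
    if est_halves >= 0:      # 000.0 .. 011.1
        return 1
    if est_halves == -1:     # 111.1
        return 0
    return -1                # 100.0 .. 111.0

def srt2_divide(x_int, d_int, frac_bits, n):
    """Radix-2 division, residual kept in carry-save form.

    x_int and d_int are scaled by 2**frac_bits, with 1/2 <= d < 1 and
    0 <= x < d. Returns (q, w) with x_int * 2**n == q * d_int + w.
    """
    width = frac_bits + 3
    mask = (1 << width) - 1
    ws, wc, q = x_int & mask, 0, 0
    for _ in range(n):
        ws, wc = (ws << 1) & mask, (wc << 1) & mask       # 2 * w_k
        tops = (ws >> (frac_bits - 1)) + (wc >> (frac_bits - 1))
        digit = select_digit(to_signed(tops, 4))          # 4-bit estimate
        ws, wc = csa(ws, wc, (-digit * d_int) & mask, width)
        q = 2 * q + digit
    w = to_signed(ws + wc, width)    # one carry-propagate add at the end
    if w < 0:                        # correction step
        q, w = q - 1, w + d_int
    return q, w

q, w = srt2_divide(20000, 49152, frac_bits=16, n=12)
```

Only the short 4-bit addition sits in the selection path. The full-width carry-propagate addition happens once, after the last iteration.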
Implementation of Radix-2 Scheme
Radix-4 with Residual in Carry-Save Form
$q_{k+1} = q_k + D_{k+1} 4^{-(k+1)}$
$w_{k+1} = 4 w_k - d D_{k+1}$
(See next slide)
Radix-4 Digit Selection
[Figure: $q_{k+1} \in \{2, 1, 0, -1, -2\}$ plotted against $16\hat{y}$]
Selection Constant Table
Radix-4 Example
[Figure: $q_{k+1} \in \{2, 1, 0, -1, -2\}$]
Implementation of Radix-4 Scheme
Radix-512 Division with Prescaling