DIGITAL HARDWARE ARITHMETIC CS 2600 The Two's Complement Representation Range of numbers in two’s complement method: 2 -2 n-1 X 2 n-1 - 1 Slightly asymmetric - one more negative number -2 2 n-1 (represented by 10….0) 10 0) does not have a positive equivalent A complement operation for this number will result in an overflow indication There is a un unique qu representation p n n f for 0 0011 + 1011 1110 3 -5 2 -2 1111 1110 1 -1 0000 0001 0010 0 +2 -4 +3 +4 -5 1011 0100 +5 -6 +6 -7 0001 1 + 1111 -1 1 0000 0 0011 -3 3 1100 5 -3 2 +1 -2 1101 0101 + 1101 1 0010 1010 1001 -8 1000 0101 +7 0111 0110 0111 + 1101 1 0100 0100 7 -3 44 • Example ‐ (two’s complement) 01001 9 01001 9 11001 ‐7 1 00010 2 – Carry‐out discarded ‐ does not indicate overflow • In general, if X and Y have opposite signs ‐ no overflow opposite signs ‐ no overflow can occur regardless of whether there is a carry‐out or h th th i t not If X and Y have the same sign and result has different sign - overflow occurs Examples - (two (two’ss complement) 10111 -9 10111 -9 1 01110 14 = -18 mod 32 Carry-out and overflow 01001 00111 9 7 0 10000 -16 = 16 mod 32 No carry-out but overflow 0101 + 0011 1000 5 3 -8 No carry-out but overflow 1011 + 1100 1 0111 -5 -4 4 7 Carry-out and overflow -> 8 mod d 16 -9 9 mod d 16 C Condition for f overflow f (f (for logic implementation): ) S or C ( MSB( A) MSB( B)); Full adder si = xi yi ci ci+1 = xi . yi + ci . (xi + yi) TD = 1 (or 2); TD = 2; A Assume 3 i/ XOR gate 3-i/p t for f Si and d SOP fform for f Ci+1 Ripple-Carry Ripple Carry Adder c4 O Cn Cn 1; O xn 1 yn 1 S n 1 x n 1 y n 1S n1 TD/C = 2N; TD/S = 2N-1; TD/O = 2N+2 Subtraction Subtract operation, X-Y, is performed by adding g the complement p of Y to X In the two's complement p system y X-Y = X + (Y+1) This still requires only a single adder operation, since 1 is added through the forced carry input to the binary adder Use EX_OR to produce complement (use one I/P bit as flag): - Show how ?? - draw Ckt. May use XOR Instead of MUX si = xi yi ci ci+1 = xi . yi + ci . (xi + yi) YKN..(KN-1) X KN..(KN-1) CKN N-bit RCA S KN..(KN-1) C(K-1)N YN..(2N-1) X N..(2N-1) C2N N-bit RCA S N..(2N-1) CN Y0..(N-1) X 0..(N-1) N-bit RCA S 0..(N-1) Cascade of K N‐bit RCAs also possible – but delay is large C0 Look‐Ahead Adder Let Pi xi yi , Gi xi yi ; Si Pi Ci ci 1 Gi Pi Ci xi yi Pi + Ci Gi and Pi are termed: y Generate and Carry y Propagate p g Carry Si + Ci 1 Gi + Examples: 74283 - a 4-bit binary full adder with fast carry xi + yi hsi + si xi 1 ci yi 1 Carry Look ahead Logic x1 y1 x0 generate : g i xi yi propagate : pi xi yi carry : ci 1 g i pi ci Two Conditions: y0 c0 1) Ci+1=1 1 if gi=1 1 2) Ci+1=1 if Ci=1 and pi=1 C1 G0 P0C0 C2 G1 P1G0 P1P0C0 74283 – contd. hsi xi yi ( xi yi )( xi . yi ) pi g i ci 1 g i ( pi ) pi ci i.e. when gi=1, pi =1 pi ( g i ci ) c1 p0 ( g 0 c0 ) c2 p1 ( g1 c1 ) p1 ( g1 p0 ( g 0 c0 )) p1 ( g1 p0 )( g1 g 0 c0 ) c3 p2 ( g 2 c2 ) p2 ( g 2 p1 )( g 2 g1 p0 )( g 2 g1 g 0 p0 ) c4 p3 ( g 3 p2 )( g 3 g 2 p1 ) ( g 3 g 2 g1 p0 )( g 3 g 2 g1 g 0 c0 ) Carry-Look-Ahead - FAST Adders Gi = xi yi - generated carry ; Pi = xi + yi - ppropagated p g carry y ci+1= xi yi + ci (xi + yi) = Gi + ci Pi Substituting ci Gi 1 ci 1Pi 1; ci 1 Gi Gi 1Pi ci 1Pi 1Pi Further substitutions - All carries can be calculated in parallel, using: xn-1,xn-2,...,x0 , yn-1,yn-2,…,y0 , and forced carry c0 Method called: “Carry Look Ahead or Propagation” for Fast Adder design C2 G1 P1C1 Compare overall delay, w.r.t. t previous i circuits i it C3 G2 P2C2 G2 P2G1 P2 P1C1 C4 G3 P3G2 P3 P2G1 P3 P2 P1C1 A4 B4 + P4 G4 A3 B3 + P4 G4 A2 B2 + P4 C5 C5 P4 C4 B1 + P4 G4 C1 C1 S4 + S3 + S2 P3 C3 G4 A1 + P2 C2 P1 + S1 Example - 4-bit Adder - Draw Ckt. Ckt - How many gates ? - Delay: For Ci’s: TD = 3; Si = 3 + 2 = 5. F RCA: For RCA TD/RCA = 8; 8 SN-1 YN-1 S1 X N-1 FA CN Y1 X1 Y0 PL CN-1 GN-1 PN-1 S0 X0 C0 PL C2 C1 G1 P1 Carry lookahead logic Carry lookahead Fast Adder G0 P0 Delay of Carry-Look-Ahead Adders G - delay of a single gate At each stage g Delay of G for generating all Pi and Gi Delayy of 2G for ggeneratingg all ci ((two-level ggate implementation) Delay of 2G for generating sum digits si in parallel (twolevel gate implementation) Total delay of 5G regardless of n - number of bits in each p operand Large n (=32) - large number of gates with large fan-in Fan-in - number of gate inputs, n+1 here Span of look-ahead must be reduced at expense of speed Reducing Span n stages divided into groups - separate carry-lookcarry look ahead in each group Groups interconnected by ripple ripple-carry carry method Equal-sized groups - modularity - one circuit designed Commonly - group size 4 selected - n/4 groups 4 is factor of most word sizes Technology-dependent constraints (number of input/output pins) ICs IC adding ddi ttwo 4 digits di it sequences with ith carry-look-ahead l k h d exist i t » G needed to generate all Pi and Gi » 2G needed to propagate a carry through a group once the Pi,Gi,c0 are available » (n/4)2G needed to propagate carry through all groups » 2G needed to generate sum outputs Total - (2(n/4)+3)G = ((n/2)+3)G - a reduction of almost 75% compared to 2nG in a ripple-carry ripple carry adder YKN..(KN-1) X KN..(KN-1) CKN N-bit LAC-FA C YN..(2N-1) X N..(2N-1) C(K-1)N C2N S KN..(KN-1) KN (KN-1) N-bit LAC-FA CN S N..(2N-1) Y0..(N-1) X 0..(N-1) N-bit LAC-FA C0 S 0..(N-1) Cascade of K N‐bit LAC‐FAs also possible Comparison of the Delay of different systems Adder C4 C8 C12 C16 S15 C28 C32 S31 RCA 8 16 24 32 31 56 64 63 LAC-FA 3 5 7 9 10 15 17 18 ?? Speed-up for higher level carry bits Let: * Po P3 P2 P1P0 ; Then, C 4 G 3 P3G 2 P3 P2G1 P3 P2 P1G0 ; * Go * Go * Po C0 If: * P1 P7 P6 P5 P4 ; Then, C8 * G1 * G1 Similarly , C12 G 7 P7G 6 P7 P6G5 P7 P6 P5G4 ; * P1 C4 * G2 * G1 * P2 C8 ; * * P1 G0 C16 * * P1 P0 C0 * G3 * P3 C12 16-bit 2-level Carry-look-ahead Adder n=16 - 4 groups Outputs: Go* , G1* , G2* , G3* , P0* , P1* , P2* , P3* ; Inputs to a carry-look-ahead carry look ahead generator with outputs c4,c8,c12 * C 4 Go C8 * Po C0 ; Similarly , C12 * G2 C16 * G3 * * P2 G1 * G3 Expression/Output (4-stage block) Si * * * P2 P1 G0 * * * P3 P2 G1 * G1 * * P1 G0 * * P1 P0 C4 * P2 C8 * * * P2 P1 P0 C0 * * * * P3 P2 P1 Go Implementation Delay Total Delay 1 1 Gi ; Pi Ci = 4, 8, 12, 16 * G2 * P1 C4 * P3 C12 * * P3 G2 Gi* ; Pi* * G1 2; 1 3; C12 C15 : 5 2 7; 2 2 5 2 + 1 = 3 8 For LAC-L1, delays are: C16 -> 9, S15 -> 10 * * * * P3 P2 P1 P0 C0 S15 : 7 1 8 YKN..(KN-1) X KN..(KN-1) CKN N-bit LAC-FA YN..(2N-1) X N..(2N-1) C(K-1)N C2N S KN..(KN-1) N-bit LAC-FA CN S N..(2N-1) Y0..(N-1) X 0..(N-1) N-bit LAC-FA C0 S 0..(N-1) Cascade of K N(=16)‐bit HLG&P‐CAs also possible K = 2;; K = 4;; Bit Delay Bit Delay (K = 1) C (K 1) C16 5 (K = 2) C (K 2) C32 7 (K = 2) C28, C32 5 + 2 = 7 (K = 3) C44, C48 7 + 2 = 9 ((K = 2) C ) 28 ‐‐> C31 7 + 2 = 9 (K = 4) C60, C64 9 + 2 = 11 (K = 2) S31 9 + 1 = 10 (K = 4) C60 ‐‐> C63 11 + 2 =13 (K = 4) S63 13+ 1 = 14 Comparison of the Delay of different systems Adder C4 C8 C16 S15 C32 S31 C64 S63 RCA 8 16 32 31 64 63 128 127 LAC-FA 3 5 9 10 17 18 33 34 HLG&P‐ HLG&P CA - - 5 8 7 10 11 14 ?? C16 * G3 ** Go * G3 * P3 C12 * * P3 G2 L2 – HLG&P CA * * * P3 P2 G1 * * * * P3 P2 P1 Go * * * * P3 P2 P1 P0 C0 ** P0 C0 ; where, ** Go * G3 ** P0 * * * * P3 P2 P1 P0 * * P3 G2 * * * P3 P2 G1 * * * * P3 P2 P1 Go ; C 16 G 3* P3* C 12 G 3* P3* G 2* P3* P2* G 1* P3* P2* P1* G o* P3* P2* P1* P0* C 0 G o* * P0* * C 0 ; where , Delay for C16: 5 Delay for S63: 12 Delay for C64: 7 G o* * G 3* P3* G 2* P3* P2* G 1* P3* P2* P1* G o* ; P0* * P3* P2* P1* P0* Comparison of the Delay of different FAST ADDER systems Adder C4 C8 C16 S15 C32 S31 C64 S63 RCA 8 16 32 31 64 63 128 127 LAC-FA 3 5 9 10 17 18 33 34 HLG&P‐ CA - - 5 8 7 10 11 14 7 12 2nd LEVEL HLG&PCA 5 7 Other Advanced Adders: - Pipelined - Manchester Adder - Carry-Skip/Carry Select - Parallel Prefix - Carry-save C adders dd – Wallace W ll ttree - MULTIPLIER UNIT 1101 x1011 --------------------1101 1101 0000 1101 _____________ 11 00 00 0 11 11 11 11 1 Three types of high-speed multipliers: Sequential multiplier - generates partial products sequentially and adds each newly generated product to previously accumulated partial product Parallel multiplier - generates partial products in parallel, accumulates using a fast multi-operand adder Array multiplier - array of identical cells generating new partial products; accumulating them simultaneously PPi-1,j+1 mj qi qi Cout Cin PPi,j Typical Cell of an Array Multiplier For Sequential circuit binary Multiplier: Need - ADDER and Shift-right Registrar (SR) modules. The control circuit requires a clock; Multiplixer to decide: - Add zero (only SR) Or - Add and shift (SR). (SR) Result to be held in 2 2-K K bit SR. Unsigned Binary Multiplication Flowchart for Unsigned Binary p Multiplication Execution of Example p Booth’s Algorithm for unsigned p multiplication Two basic principles: - Strings of 0’s require no addition – only shift - String St i off 1’s 1’ may be b given i special i l trreatment: t t t 001110 (+14) --> 010000 - 000010 (16 - 2); 1’s from K-bit posn. to M-bit posn: Treat as: 2K+1 – 2M ; In the above example K = 3, M = 1; Thus h M( (Multiplicand) l i li d) X 14 = M X 24 = M X 21 ; y Obtain result by: M << 4 - M << 1 // view as C-code. Take a example: multiplier = 30; Multiplicand = 45 (30) --> > 32 – 2 => Change Multiplier to: 0011110 0 (+1) 0 0 0 0 0 0 0 0 0 0 (-1) 0 0100000 (32) 1111110 (-2) ( 2) ------------------------- - 1 0 1 1 0 1 0 +1 0 0 0 -1 0 - - - - - - - 0 0 0 0 0 0 0 1 1 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 - - - - - - - - - - - - - 0 0 1 0 1 0 1 0 0 0 1 1 0 1 - - 0 1 1 1 Good multiplier for coding: 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 ….. Worst case multiplier: 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 ……… Mult 0 0 1 0 1 1 0 0 1 1 1 0 1 0 1 1 0 0 Coded Mult. 0 +1 -1 +1 0 -1 0 +1 0 0 -1 +1 -1 +1 0 -1 0 0 Three rules for Booth’s algm. implementation with operation on the multiplicand: - For the LSB = 1 in a string of 1’s in a multiplier: Subtract the multiplicand from the partial product; - For the first bit-0 (prior to a 1) in a string of 0’s, the multiplicand is added to the partial product. - For any pair of identical bit-pair in the mutliplier, the partial product is unchanged. 0 0 1 1 1 1 0 0 +1 0 0 0 -1 1 0 Multiplier Bits Bit i Bit (i-1) (i 1) O Opn. / Bit Pattern P tt 0 0 0xM Shift only; String of Zeros 0 1 +1 x M Add and d Shift; Shift End E d off a String of Ones 1 0 -1 1xM Subtract and Shift; Beginning 1 1 0xM Shift only; String of Ones of a String of Ones Booth’s Algorithm Example p of Booth’s Algorithm g Execution of Example p – Booth Multiplier p +13 01101 -6 6 11010 0 -1 1 +1 -1 1 0 1 1 - 0 1 1 0 1 0 -1 +1 -1 0 - - - - - - - 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 0 0 0 1 1 0 1 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 - - - - - - - - - - - - 1 1 1 1 1 0 1 1 0 0 1 0 Fast Multiplication done by: - Bit-pair recoding - Carry-save Addition of Summands - + F Fastt L Look k ahead h d Carry C (with ( ith both b th above) b ) - Pipelined and Booth Array Tree - etc. BIT-PAIR RECODING OF MULTIPLIERS Observe this: -6 11010 0 -1 +1 -1 0 Consider the pair of (+1, -1): ==> (+1, -1)*M = 2xM – M = M ==> (0, +1) *M; Thus: (+1, -1) == (0, +1); which is also independent of the bit position “i”. Booth Pair Equiv. to Recoded pair ( (+1, 0) (0 +2) (0, 2) (-1, +1) (0, -1) (0 0) (0, (0 0) (0, ( 0, 1) (0, 1) (+1, 1) -- (-1, 0) (0, -2) BIT-PAIR RECODING OF MULTIPLIERS Multiplier Bits Booth Pair Equiv. to Recoded pair ( (+1, 0) ) ( +2) (0, ) Bit i Bit (i-1) (i 1) 0 0 0xM (-1, +1) (0, -1) 0 1 +1 x M (0 0) (0, (0 0) (0, ( 0, 1) (0, 1) 1 0 -1 x M (+1, 1) -- 1 1 0xM (+1, -1) (0, +1) (-1, 0) (0, -2) -6 1 1 1 0 1 0 0 0 -1 +1 -1 0 0 0 -1 0 -2 Execution of Example – Booth’s Recoded Multiplier +13 01101 -6 11010 0 -1 +1 -1 0 0 0 -1 0 -2 +13 01101; +26 011010; -26 100110; -13 10011; 0 1 1 0 1 0 0 -1 0 -2 - - - - - - - - 0 1 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - - - - - - - - - - - - 1 1 1 1 1 0 1 1 0 0 1 0 Unsigned Binary Division 63 03 243 +13 1101 274 189 54 13 21 274 26 14 - 13 01 1 1 0 DIVISOR; 100010010 DIVIDEND 1 1 0 1 0 1 0 0 1 0 1 0 0 0 1 - 1 1 0 1 _ _ _ _ _ 1 0 0 0 0 - 1 1 0 1 _ _ _ _ _ 1 1 1 0 - 1 1 0 1 _ _ _ _ _ 1 Unsigned g Binary y Division Divisor M Subtract or Add Shift Left Dividend Q Flowchart for Unsigned Binary Division ALGO. for RESTORING DIVISION Repeat n times: Load Divisor in Reg. M; Load Dividend in Reg. Q; Set Reg. A = 0; • Shift (left) A & Q (one bit posn.) • A = A – M; • If sgn(A) == 1 ( (-ve ve A) o Set Q0 = 0; o A = A + M; Else o End Set Q0 = 1; ANS: - Quotient is in Reg. Q; - Rem. R is i in i Reg. R A. A REG. A OPN. Dividend, Dividend Q = 1000; INIT. I Divisor, Divisor M = 11. II -M= 11101 III Result, in decimal ?? Quotient = 02 ; Rem = 02. IV REG. Q 0 0 0 0 0 1 0 0 S. L. 0 0 0 0 1 0 0 0 SUB 1 1 1 1 0 Rstr 0 0 0 0 1 0 0 0 S. L. 0 0 0 1 0 0 0 0 SUB 1 1 1 1 1 Rstr 0 0 0 1 0 0 0 0 S. L. 0 0 1 0 0 0 0 0 SUB 0 0 0 0 1 Set-Q0 0 0 0 0 1 0 0 0 S. L. 0 0 0 1 0 0 0 1 SUB 1 1 1 1 1 Rstr 0 0 0 1 0 0 0 0 1 0 Final Result REM 0 0 0 1 0 0 0 1 Q 0 How to improve the Algo. ?? Repeat n times: If A is +ve, Opns : Opns. - S.L. (A); S bt tM M; - Subtract • Shift (left) A & Q (one bit posn.) • A = A – M; • If sgn(A) == 1 ( (-ve ve A) o Set Q0 = 0; o A = A + M; (2A – M) Else o End Set Q0 = 1; If A is -ve, Opns. : - A+M - S.L. (A+M); - Subtract M; (2A + 2M) )–M ( = 2A + M How to improve the Algo. ?? If A is +ve, Opns. : - S.L. (A); - Subtract M; (2A – M) If A is -ve, ve Opns. : - A+M - S.L. (A+M); - Subtract M; (2A + 2M) – M = 2A + M Step 1: Repeat n times: 1. If sgn(A) == 0 - Shift (left) A & Q (1 (1-bit bit posn.) posn ) - A = A – M; else - Shift (left) A & Q (1-bit (1 bit posn.) posn ) - A = A + M; 2 2. End If sgn(A) (A) == 0 o Set Q0 = 1; o else Q0 = 0; Step p 2: If sgn(A) g ( ) == 1 A = A + M; RESTORING DIVISION Repeat n times: i • Shift (left) A & Q (1 bit posn.) • A = A – M; • If sgn(A) == 1 (-ve A) o Set Q0 = 0; o A = A + M; NON_RESTORING DIVISION Else o End Set Q0 = 1; REG. A OPN. INIT. Dividend, Dividend Q = 1000; I Divisor, M = 11. -M= 11101 II III IV REG. Q 0 0 0 0 0 1 0 0 S L S. L. 0 0 0 0 1 0 0 0 SUB 1 1 1 1 0 Set Q0 1 1 1 1 0 0 0 0 S. L. 1 1 1 0 0 0 0 0 ADD 1 1 1 1 1 Set Q0 1 1 1 1 1 0 0 0 S L. S. 1 1 1 1 0 0 0 0 ADD 0 0 0 0 1 Set-Q Set Q0 0 0 0 0 1 0 0 0 S. L. 0 0 0 1 0 0 0 1 SUB 1 1 1 1 1 Set-Q0 1 1 1 1 1 0 0 1 0 ADD 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 Final Result REM Q 0 0 0 1 Floating Point Numbers IEEE 32 32-bit bit single precision S 0/1 E’ (excess-127 Rep.) 8 M 23 IEEE 64-bit single precision |E| = 11 |M| = 52 Density of Floating Point Numbers SEM Field Common C case - signed-magnitude i d it d f fraction ti Floating-point format - sign bit S, e bits of exponent E m bits of unsigned fraction M E, (m+e+1=n) Value of (S,E,M) : ((-1)0 =1 ; (-1)1 =-1) Maximal value - Mmax = 1-ulp ulp -Unit in the last position - weight of the leastsignificant f b bit of f the h fractional f l significand f d Usually (not always) ulp = 2-m In IEEE 32-bit single precision; E is a “signed signed exponent”, exponent , E’ E is in excess excess-127 127 Representation E ' E 127; 0 E ' 255 2 The end values are reserved for special use: 127 E 128; Range, for Exponent: p 126 127 [2 ....2 1 E ' 254; E E '127; 126 E 127; For Mantissa: ] 10 38 [2 23 ] 10 7 In IEEE 64-bit single precision; Range, for Exponent: 1022 1023 [2 ....2 E ' E 1023 ; 1 E ' 2046 1022 E 1023 ; For Mantissa: ] 10 308 [2 53 ] 10 16 Floating-Point g Formats of Three Machines Binary Normalized Mantissa (IEEE): 1.<xxxx ..23 bits….xxxx> 0 10001000 .0010110 ……. Numerical Value (unnormalized): = +0.0010110…. x 29 0 10000101 .0110 ……. Numerical Value (Normalized): = +1.0110…. x 26 1 00101000 .001010 ……. Numerical Value (Normalized): = -1.001010…. x 2-87 Special Values : E’ = 0 AND Both M = 0 ZERO VALUE; E’ = 255 AND M = 0 INFINITY; 0 and are possible E’= 0 and M <> 0 denormal numbers; E’= 255 and M <> 0 NaN; 0.M 2 0 0; 1 Exceptions : Underflow, Overflow, Divide by zero and q rounding g in order to be Inexact Result that requires represented in one of the normal formats; Invalid 0/0 and sqrt( sqrt(-1) 1) are attempted. Special values are set to the results. 126 Algms. For ADD/SUB, MULT/DIV, in normalized floating point opens. ADD/SUB MULT. • Make • Add the E’s and smaller exp = large exp., subtract 127 by shifting opn. • Mult. the • Perform ADD/SUB on Mantissas and get mantissas i (get ( result l h sign i the with sign) DIV. • Subtract the E’s and add 127 • Div. The Mantissas and get the sign Normalize the result value, if necessary, i all in ll th the th three cases above. b Floating point addition Floating point addition FP Addition & Subtraction Flowchart Circuitry for Addition/Subtraction Floating Point Multiplication Floating Point Division
© Copyright 2026 Paperzz