1 BIT-PARALLEL ARITHMETIC Notice that even though the adder is said to be bit-parallel, computation of the output bits is done sequentially. Addition and Subtraction Two binary numbers, In fact, at each time instant only one of the full-adders performs useful work. The others have already completed their computations or may not yet have received a correct carry input, hence they may perform erroneous computations. x = (x0∑x1x2… xWd–1)2 and y = (y0∑y1y2… yWd–1)2, can be added using bit-parallel operation in O(log(Wd)) steps Because of these wasted switch operations, power consumption is higher than necessary. ■ Ripple-Carry Adder/Subtractor x0 c0 FA x1 y0 c1 FA x0 xWd–1 y1 y0 yWd–1 c2 cWd–1 FA =1 cWd s1 Adder x + y Lars Wanhammar x3 y1 =1 y2 y3 =1 Add/Sub =1 FA FA FA c4 sWd–1 s0 DSP Integrated Circuits x2 x1 c0 FA s0 2 Addition time is determined by the carry path in the full-adders. s1 s2 s3 Subtractor x – y Department of Electrical Engineering Linköping University DSP Integrated Circuits [email protected] http://www.es.isy.liu.se/ Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/ 3 4 ■ Carry–Look–Ahead Adder Bit-Parallel Multiplication ■ Brent–Kung’s Adder In order to multiply two two’s-complement numbers, a and x, we form the partial bit-products ai · xk, as shown below. ■ Carry–Save Adder –a0◊xW d–1 ■ Carry–Select Adder a0◊xWd–2 – a1◊xWd–2 ■ Carry–Skip Adder –a0◊x1 ■ Conditional–Sum Adder a0◊x0 y–1 –a1◊x0 y0 ◊x c–2 Wd–1 aW ◊x c–1 Wd–1 aW aWc–1◊xWd–2 –a0◊x2 a1◊x1 –a2◊x0 y1 –aWc–1◊x0 yW d+Wc–3 yW d+Wc–2 High-speed, bit-parallel multipliers can be divided in three classes ■ Add–and–shift multipliers The partial bit-products are generated sequentially and accumulate them successively as they are generated. This type are therefore the slowest multiplier, but the required chip area is low. DSP Integrated Circuits Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/ DSP Integrated Circuits Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/ 5 6 ■ Parallel multipliers Add–And–Shift Multiplication All bit-products are generated in parallel and uses a multi-operand adder (i.e., an adder tree) for their accumulation. Consider the multiplication of two two’s-complement numbers, y = a · x. A parallel multiplier structure can be partitioned into three parts: partial product generation, carry-free addition, and carry-propagation addition. Wd – 1 Wd – 1 Ê ˆ – i Á ˜ y = a – x0 + Â xi 2 = – ax 0 + Â ax i 2 – i Á ˜ Ë ¯ i=1 i=1 These three parts can be implemented using different schemes. 0 (–a0 The carry-free addition part is often implemented by using a Wallace or redundant binary addition tree. (–a0 ■ Array multipliers An array of almost identical cells for generation of the bit-products and accumulation. (–a0 a1 (–a0 a1 a2 – (–a0 a1 a2 y–1 y0 y1 a1 a1 a2 a aWc–1)◊xWd–1 a2 x aW )◊x c–1 Wd–2 a2 LSB MUX aWc–1)◊x2 DSP Integrated Circuits Department of Electrical Engineering Linköping University Reg Add/Sub aWc–1)◊x1 aW )◊x c–1 0 yW d+Wc–3 yW MSB d+Wc–2 LSB y This type of multipliers requires the least amount of area, but it is also the slower Lars Wanhammar MSB The multiplication can be performed by generating the partial products, for example row wise. For example, the partial product generation can be implemented by AND gates or by using the Booth’s algorithm. Horners’ method DSP Integrated Circuits [email protected] http://www.es.isy.liu.se/ Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/ 7 8 Modified Booth’s Algorithm By considering three bits at a time from the input operand, x, is used to recode the other input operand, y, into partial products (–2y, –y, 0, y, 2y) that are added (subtracted) to form the result. Tree–Based Multipliers Short-hand version of the bit-products x y Hence, the number of recoded partial products is reduced. Booth’s algorithm reduces the number of additions/subtractions cycles to Wd/2 = 8, which is only half that required in the ordinary shift–and–add algorithm. Notice that the reduction of the number of cycles directly corresponds to a reduction in the power consumption. Booth’s algorithm is also suitable for bit-serial or digit-serial implementation. Bit-products Product In a tree-based multiplier the bit-products are often added using fulladders. A full-adder can be considered as a counter, denoted (3, 2)-counter, that adds three inputs and forms a two bit result. The output equals the number of 1s in the inputs – counter. A half-adder is denoted (2, 2)-counter. DSP Integrated Circuits Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/ DSP Integrated Circuits Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/ 9 In 1964 Wallace showed that a tree structure (Wallace tree) of such counters is an efficient method, O(log(Wd)), to add the bit-products. For example, Dadda has derived a scheme where all bit-products with the same weight are collected and added using a Wallace tree with a minimum number of counters and minimal critical paths. Input However, the multiplier is irregular and difficult to implement efficiently since a large wiring area is required. Inputs to the second stage x y (3, 2)-counter Result of the second stage Wallace tree multipliers should therefore only be used for large word length and where the performance is critical. Outputs (2, 2)-counter Bit-products 10 Several alternative addition schemes are possible. Another more regular structure is the Overturn-Stair Multiplier Inputs to the third stage Result of the third stage Result of the first stage In the fourth stage, a RCA or a CLA can be used to obtain the product. DSP Integrated Circuits Department of Electrical Engineering Linköping University Lars Wanhammar DSP Integrated Circuits [email protected] http://www.es.isy.liu.se/ Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/ 11 Array Multipliers 12 Notice that some slight variations in the way the partial products are formed occur in the literature. Many array multiplier schemes with varying degrees of pipelining have been proposed. All bit-products are generated in parallel and collected through an array of full-adders and a final RCA. A typical array multiplier—the Baugh–Wooley’s multiplier with a multiplication time proportional to 2Wd. x0 1 p–1 x1 x2 x3 y0 y1 y2 y3 1 x0◊y3 x1◊y3 x2◊y3 x3◊y3 x3◊y2 x0◊y2 x1◊y2 x2◊y2 x0◊y1 x1◊y1 x2◊y1 x3◊y1 x0◊y0 x1◊y0 x2◊y0 x3◊y0 p0 p1 p2 p3 FA x2◊y3 x2◊y2 FA 0 x3◊y3 x3◊y2 FA x0◊y2 x1◊y1 FA x2◊y1 FA x3◊y1 x0◊y1 p4 p5 p6 don’t care Department of Electrical Engineering Linköping University FA x1◊y0 FA x2◊y0 FA x3◊y0 x0◊y0 FA DSP Integrated Circuits 0 x1◊y3 x1◊y2 FA 0 1 Lars Wanhammar 0 x0◊y3 FA p–1 p0 FA FA p1 1 p2 [email protected] http://www.es.isy.liu.se/ p3 p4 p5 p6 DSP Integrated Circuits Lars Wanhammar Department of Electrical Engineering Linköping University [email protected] http://www.es.isy.liu.se/
© Copyright 2026 Paperzz