International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 9 (2016) pp 6200-6203
© Research India Publications. http://www.ripublication.com

A Power Saving Multiplication Algorithm

S. Subha*
SITE, VIT, Vellore, Tamil Nadu, India.

R. Sakthivel
SENSE, VIT, Vellore, Tamil Nadu, India.

Abstract
Multiplication is widely used in many applications. A number of multiplication algorithms, such as the Booth algorithm and the radix-n algorithm, have been proposed in the literature. This paper proposes a multiplication algorithm for integers in sign magnitude notation. The magnitudes are the operands for the multiplication. The method divides the multiplier and multiplicand into slices of two bits. The partial products for all slices are calculated and added appropriately to give the result. The proposed algorithm is synthesized using the Xilinx tool. For the ASIC configuration, an improvement in area utilization and in power consumption is observed for the simulated algorithm, with a degradation in execution time. For the FPGA configuration, a degradation in area utilization and in execution time is observed, with no change in power consumption. The proposed model can be extended to two's complement multiplication and floating point multiplication.

Keywords: Area Improvement, Bit Slicing, Multiplication, Power Improvement, Sign Magnitude Notation

Introduction
Multiplication is a widely used operation in computers. A number of multiplication algorithms have been proposed in the literature. These include repeated addition, the radix-n algorithm with Booth's algorithm as a special case, and array multipliers. Some algorithms have also been proposed for multiplication in parallel processing. In all these algorithms, forming the partial products requires addition, shifting or multiplication of bits. Fast multiplication algorithms and array multipliers are discussed in [2]. The authors in [5] suggest a rule to modify the Booth algorithm for any radix (> 2).
The modification is suggested for multipliers of any size with the radix being any power of two. The author in [6] proposes a method to correct the Booth algorithm so that it can be implemented for any operand length and any radix that is a power of two. The authors in [7] propose a method to reduce power consumption by choosing the operand with the smaller dynamic range for Booth encoding. In [3], low power two's complement multiplication is developed by minimizing the switching activities of partial products using the radix-4 Booth algorithm. Encoding is used in the proposed method. The authors in [4] propose a multiplication algorithm that divides the operands into four parts. Each multiplication is computed independently and the results are added. The proposed algorithm divided each operand into two parts. Techniques for low power design are proposed in [1].

This paper proposes an algorithm based on multiplying two-bit slices, each product being at most four bits wide. The inputs are in sign magnitude form. The magnitudes of the numbers are considered for computation. The multiplicand and the multiplier are divided into slices of two bits each. The partial products for all combinations of input slices are calculated. Based on the input size, the partial products are added to give the result. The proposed algorithm is simulated with the Xilinx tool. Three configurations are used for analysis. The trad configuration is the in-built multiplication operation. The p_mult configuration uses the in-built multiplication to generate the partial products. The p_mux configuration generates the partial products by bit inspection. The following is observed.
1. For ASIC 90nm, area utilization improves by 17.5% for the proposed model with multiplication of the slices (p_mult) compared with the existing multiplication algorithm (trad), and is unchanged for the proposed model using multiplexers for multiplication (p_mux). For ASIC 45nm, the improvement is 20% for p_mult and 1.2% for p_mux.
The number of slices used in FPGA is increased by 20% for p_mult and 100% for p_mux.
2. For ASIC 90nm, a 31% improvement in power for p_mult over trad and a 9% improvement for p_mux over trad are observed. For ASIC 45nm, the power improvement is 31% for p_mult over trad and 10% for p_mux over trad. The power consumed in the FPGA configuration is unchanged for all configurations.
3. For ASIC 90nm, the p_mult configuration increased the time by 47% over trad and p_mux by 60% over trad. For ASIC 45nm, a timing degradation of 48% for p_mult and 64% for p_mux over the trad configuration is observed. The FPGA configuration has a time degradation of 57% for both p_mult and p_mux over the trad configuration.

The strategies suggested in [1] can be examined for further power improvement. The rest of the paper is organized as follows. Section 2 gives the mathematical background for the proposed model, section 3 the proposed algorithm, section 4 the simulations and section 5 the conclusion, followed by references.

Mathematical Background
Consider the product of two two-bit slices. Each slice takes one of four values. The products are given below in Table 1.

Table 1: Bit Multiplication Table
Product   00   01   10    11
00        00   00   00    00
01        00   01   10    11
10        00   10   100   110
11        00   11   110   1001

Consider the product of two 2n-bit numbers M and R. Divide each operand into slices of two bits. Each partial product can then be read from the Bit Multiplication Table. The number of slices in each of M and R is n, so there are n × n partial products to compute, one for each pair of slices. The partial products are added suitably to give the final result. On average there are (n+1)(n-1) additions to be performed.
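The slicing scheme of the Mathematical Background can be sketched in Python for operands of any even width. This is a minimal illustration, not the authors' Verilog: the names SLICE_PRODUCT, to_slices and slice_multiply are hypothetical, and weighting the product of slice i of M and slice j of R by a left shift of 2(i+j) bits is an assumed reading of adding the partial products suitably. Only magnitudes are handled, as in the sign magnitude setting of the paper.

```python
# Products of two 2-bit slices, copied from the Bit Multiplication Table.
# Indexed as SLICE_PRODUCT[a][b] for slice values a, b in 0..3.
SLICE_PRODUCT = (
    (0b00, 0b00, 0b000, 0b0000),
    (0b00, 0b01, 0b010, 0b0011),
    (0b00, 0b10, 0b100, 0b0110),
    (0b00, 0b11, 0b110, 0b1001),
)


def to_slices(value, width):
    """Split a magnitude into 2-bit slices, least significant slice first."""
    return [(value >> shift) & 0b11 for shift in range(0, width, 2)]


def slice_multiply(m, r, width):
    """Multiply two magnitudes of `width` bits using table lookups and additions."""
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive even number of bits")
    if not (0 <= m < (1 << width) and 0 <= r < (1 << width)):
        raise ValueError("operands must be magnitudes that fit in width bits")
    result = 0
    for i, m_slice in enumerate(to_slices(m, width)):
        for j, r_slice in enumerate(to_slices(r, width)):
            # Assumption: partial product (i, j) carries weight 4**(i + j).
            result += SLICE_PRODUCT[m_slice][r_slice] << (2 * (i + j))
    return result


print(format(slice_multiply(0b1010, 0b1100, 4), "08b"))  # prints 01111000
```

For M = 1010 and R = 1100 the sketch gives 01111000, which is the product of 10 and 12. No multiplication operator is used on the slices; each partial product is a table lookup.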
Proposed Algorithm
Consider the multiplication of M and R. Let both be n bits wide, n being even. The algorithm for multiplication of two four bit numbers is given below.
1. Start
2. Divide M and R into slices of two bits each. As M and R are four bits wide, M is divided into m1 and m0 and R is sliced into r1 and r0.
3. Calculate the following partial products:
   pp0 = r0*m0
   pp1 = r0*m1
   pp2 = r1*m0
   pp3 = r1*m1
4. Let the result be eight bits, res[7:0]. Calculate the result as in steps 5 to 9.
5. res[1:0] = pp0[1:0];
6. temp1 = $signed(pp1) + $signed(pp2) + pp0[3:2];
7. res[3:2] = temp1[1:0];
8. temp2 = $signed(pp3) + temp1[4:2];
9. res[7:4] = temp2;
10. Stop

The algorithm can be extended to inputs of arbitrary length. It is assumed that the length is even. For multiplying two four bit numbers, there are four multiplications and three additions. The multiplication results can be determined directly from the inputs, so the actual multiplication steps are avoided. Only the addition steps are involved in computation.

Simulations
The proposed model was simulated on the Xilinx ISE tool. Code in Verilog was written to simulate the product of two four bit numbers. The code was compiled and synthesized. The area, power consumption and execution time were calculated and are reported in Tables 2 to 4. The improvement in area utilization is shown in Fig. 1.

Table 2: Area Utilisation Table
Technology                                            Traditional     Proposed_multiply   Proposed_mux
ASIC TSMC 90nm                                        222             183                 222
ASIC TSMC 45nm                                        83              66                  82
FPGA (Virtex 5 XC5CLX20T-2 FF 323), occupied slices   5 out of 3120   4 out of 3120       10 out of 3120

Table 3: Power Report
Technology                           Traditional    Proposed_multiply   Proposed_mux
ASIC TSMC 90nm                       11792.647 nW   8056.113 nW         10681.233 nW
ASIC TSMC 45nm                       4464.628 nW    3051.049 nW         4015.667 nW
FPGA (Virtex 5 XC5CLX20T-2 FF 323)   321 mW         321 mW              321 mW

Table 4: Timing Analysis Table
Technology                           Traditional   Proposed_multiply   Proposed_mux
ASIC TSMC 90nm                       704 ps        1039 ps             1129 ps
ASIC TSMC 45nm                       1186 ps       1765 ps             1946 ps
FPGA (Virtex 5 XC5CLX20T-2 FF 323)   325.975 ns    513.395 ns          513.395 ns

The multiplication algorithm supported by the software is called trad.
The configuration that generates the partial products using the in-built multiplication is called p_mult. Code in Verilog was also written to generate the partial products by bit inspection. This configuration is called p_mux in the discussion that follows. The results obtained are shown below.

Figure 1: Area Utilization (vertical axis: area; horizontal axis: type; series: trad, p_mux, p_mult)

As seen from Fig. 1, for ASIC 90nm area utilization improves by 17.5% for the proposed model with multiplication of the slices (p_mult) compared with the existing multiplication algorithm (trad), and is unchanged for the proposed model using multiplexers for multiplication (p_mux). For ASIC 45nm, the improvement is 20% for p_mult and 1.2% for p_mux. The number of slices used in FPGA is increased by 20% for p_mult and 100% for p_mux.

The power consumption is shown in Fig. 2. As seen from Fig. 2, for ASIC 90nm a 31% improvement in power for p_mult over trad and a 9% improvement for p_mux over trad are observed. For ASIC 45nm, the power improvement is 31% for p_mult over trad and 10% for p_mux over trad. The power consumed in the FPGA configuration is unchanged for all configurations.

Figure 2: Power Consumption Comparison (chart title: Power Consumption; vertical axis: milliWatts; series: trad, p_mux, p_mult)

The timing analysis is shown in Fig. 3. As seen from Fig. 3, for ASIC 90nm the p_mult configuration increased the time by 47% over trad and p_mux by 60% over trad. For ASIC 45nm, a timing degradation of 48% for p_mult and 64% for p_mux over the trad configuration is observed. The FPGA configuration has a time degradation of 57% for both p_mult and p_mux over the trad configuration.
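The percentages quoted above follow from the values in Tables 2 to 4. The arithmetic can be sketched in Python as below. The dictionary layout and the names RESULTS, percent_change and summarize are illustrative assumptions; the numbers are copied from the ASIC rows of the tables, and the FPGA rows are left out of this sketch.

```python
# (trad, p_mult, p_mux) values copied from the ASIC rows of Tables 2 to 4.
RESULTS = {
    "area": {"90nm": (222, 183, 222), "45nm": (83, 66, 82)},
    "power_nW": {
        "90nm": (11792.647, 8056.113, 10681.233),
        "45nm": (4464.628, 3051.049, 4015.667),
    },
    "time_ps": {"90nm": (704, 1039, 1129), "45nm": (1186, 1765, 1946)},
}


def percent_change(baseline, value):
    """Signed change of value relative to baseline, in percent.

    A negative result is a reduction, which is an improvement for
    area, power and time alike.
    """
    return 100.0 * (value - baseline) / baseline


def summarize(results):
    """Return (metric, technology, p_mult change, p_mux change) rows."""
    rows = []
    for metric, by_tech in results.items():
        for tech, (trad, p_mult, p_mux) in by_tech.items():
            rows.append(
                (metric, tech, percent_change(trad, p_mult), percent_change(trad, p_mux))
            )
    return rows


for metric, tech, d_mult, d_mux in summarize(RESULTS):
    print(f"{metric:9s} {tech}: p_mult {d_mult:+.1f}%  p_mux {d_mux:+.1f}%")
```

Computed this way, the ASIC figures agree with the quoted percentages to within about one percentage point.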
[Figure 3 chart, titled Timing Chart: vertical axis in nanoseconds; horizontal axis: type; series: trad, p_mux, p_mult]

The following example shows the algorithm simulation.
1. M = 1010, R = 1100
2. Let m0 = 10, m1 = 10, r0 = 00, r1 = 11
3. pp0 = m0*r0 = 10*00 = 00
   pp1 = m1*r0 = 10*00 = 00
   pp2 = m0*r1 = 10*11 = 110
   pp3 = m1*r1 = 10*11 = 110
4. Then res[1:0] = 00
5. temp1 = 00 + 110 + 0 = 110
6. res[3:2] = 10
7. temp2 = 110 + 1 = 111
8. res[7:4] = 0111
Thus res = 01111000 = 1010 * 1100.
#add operations = 3 = (2+1)(2-1)

Conclusion
A multiplication algorithm for integers is proposed in this paper. The method slices the inputs into chunks of two bits. The partial products are determined from the slices without multiplication and are added suitably to give the result. The proposed model is simulated in the Xilinx tool. The configuration using the in-built multiplication is called trad. The proposed method using the in-built multiplication to calculate the partial products is the configuration p_mult. A second configuration, called p_mux, generates the partial products by bit inspection. From the simulations, the following is observed.
1. For ASIC 90nm, area utilization improves by 17.5% for the proposed model with multiplication of the slices (p_mult) compared with the existing multiplication algorithm (trad), and is unchanged for the proposed model using multiplexers for multiplication (p_mux). For ASIC 45nm, the improvement is 20% for p_mult and 1.2% for p_mux. The number of slices used in FPGA is increased by 20% for p_mult and 100% for p_mux.
2. For ASIC 90nm, a 31% improvement in power for p_mult over trad and a 9% improvement for p_mux over trad are observed. For ASIC 45nm, the power improvement is 31% for p_mult over trad and 10% for p_mux over trad. The power consumed in the FPGA configuration is unchanged for all configurations.
3. For ASIC 90nm, the p_mult configuration increased the time by 47% over trad and p_mux by 60% over trad.
A timing degradation of 48% for p_mult and 64% for p_mux over the trad configuration is observed for ASIC 45nm. The FPGA configuration has a time degradation of 57% for both p_mult and p_mux over the trad configuration.

Figure 3: Timing Comparison

References
[1] Gaurav Verma, Manish Kumar, Vijay Khare, Low Power Techniques for Digital System Design, Indian Journal of Science and Technology, 8(17), pp. 1-6
[2] Israel Koren, Computer Arithmetic Algorithms, Prentice Hall, NJ, 1993
[3] Jongsu Park, San Kim, Yong-Surk Lee, A Low-Power Booth Multiplier Using Novel Data Partition Method, Proceedings of AP-ASIC 2004, pp. 54-57
[4] Nan-Ying Shen, Oscal T.C. Chen, Low Power Multipliers by Minimizing Switching Activities of Partial Products, Proceedings of ISCAS, 2002, pp. IV-93-IV-96
[5] Philip E. Madrid, Brian Millar, Earl E. Swartzlander Jr., Modified Booth Algorithm for High Radix Floating point Multiplication, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 1, No. 2, June 1993, pp. 164-167
[6] Rajendra Katti, A Modified Booth Algorithm for High Radix Fixed Point Multiplication, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 2, No. 4, Dec. 1994, pp. 522-524
[7] Sandy Wang, Yi-Wen Wu, Oscal T.C. Chen, Ruey-Lian Ma, Low Power Multipliers by Minimizing Inter-Data Switching Activities, Proceedings of IEEE Midwest Symposium on Circuits and Systems, 2000, Vol. I, pp. 89-92