International conference on Signal Processing, Communication, Power and Embedded System (SCOPES)-2016
Implementation of Multiplier Architecture Using
Efficient Carry Select adders for synthesizing FIR
filters
Vaka Saranya
Electronics and Communication Engineering
JNTUK
Kakinada, India

P. Pushpalatha
Electronics and Communication Engineering
JNTUK
Kakinada, India
Abstract: This paper proposes the design of an efficient constant
multiplier architecture using carry select adders. The algorithms
proposed earlier to implement multiple constant multiplication (MCM)
for efficient FIR filter design fall into two main groups: graph-based
algorithms and common sub-expression elimination (CSE) algorithms.
The binary CSE (BCSE) algorithm uses the binary representation of the
coefficients to implement higher-order FIR filters with fewer adders
than Canonic Signed Digit (CSD)-based CSE methods. In the
vertical-horizontal BCSE (VHBCSE) algorithm, a 2-bit BCSE is first
applied vertically across adjacent coefficients in the 2-D space of
the coefficient matrix, followed by 4-bit and 8-bit BCSEs applied
horizontally within each coefficient. This minimizes switching
activity, and hence power consumption, and also improves area and
delay. The partial products generated by the VHBCSE method are summed
by controlled additions that use an efficient carry select adder
(CSLA) instead of the ripple carry adder used earlier, which reduces
area and delay.
Keywords: CSLA; VHBCSE; CSE algorithm; FIR filter; MCM.
I. INTRODUCTION
Like FIR filters, microprocessors and digital signal processors,
multipliers are key components of many high-performance systems. A
system's performance is generally assessed by the performance of its
multiplier, since it is the slowest element in the system. FIR filters
are widely used in digital signal processing, wireless communication,
image and video processing, and biomedical signal processing. Systems
such as Software Defined Radio (SDR) and multi-standard video codecs
need a reconfigurable FIR filter with dynamically programmable filter
coefficients, interpolation factors and lengths, which vary according
to the specifications of the various standards on a portable computing
platform. The wide applicability of an efficient reconfigurable FIR
filter motivates designers to develop chips with low cost, power and
area that can also operate at high speed. One technique for this is
the binary common sub-expression elimination (BCSE) algorithm, which
eliminates common sub-expressions in binary form to design an
efficient constant multiplier and is thus applicable to low-complexity
reconfigurable FIR filters. However, the choice of the length of the
binary common sub-expressions (BCSs) can make the design inefficient
by increasing the number of adder steps and the hardware cost. The
VHBCSE algorithm improves the efficiency of the constant multiplier in
terms of delay, area and power.
II. EXISTING SYSTEM
Although area-, delay- and power-efficient multiplier architectures
such as the Wallace and modified Booth multipliers have been proposed,
the full flexibility of a multiplier is not necessary for constant
multiplications, since the filter coefficients are fixed and
determined beforehand by the DSP algorithm. Hence, the multiplication
of the filter coefficients with the input data is generally
implemented in a shift-adds architecture, where each constant
multiplication is realized using addition/subtraction and shift
operations in an MCM operation.
Fig.1. Filter design using Transposed direct form
Fig.2. Transposed form MCM block.
Implementing constant multiplications in a shift-adds architecture
enables common partial products to be shared among the constant
multiplications, which significantly reduces the area and power
dissipation of the MCM design. A filter design in transposed direct
form and the transposed-form MCM block are shown in Fig. 1 and Fig. 2.
Hence, the MCM problem is defined as finding the minimum number of
addition/subtraction operations that implement the constant
multiplications. Approaches to the MCM problem fall into two classes:
common sub-expression elimination (CSE) methods and graph-based (GB)
techniques. CSE algorithms first express the constants in a particular
number representation, namely binary, Canonical Signed Digit (CSD) or
Minimal Signed Digit (MSD), and then find the best sub-expression,
generally the most common one, among the constant multiplications.
Binary common sub-expression elimination (BCSE) is an efficient method
but has some problems. FBCSE architectures consider the signed
magnitude number format for the inputs as well as the coefficients.
These two architectures apply the BCSE algorithm only in the first
layer. If the filter coefficients are small decimal values with a
negative sign, the hardware and power consumption increase.
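As a rough illustration of the shift-adds idea and of the number representations mentioned above, the following Python sketch multiplies an input by a constant using only shifts and additions/subtractions. The function names (`binary_digits`, `csd_digits`, `shift_add_multiply`, `nonzero_count`) are illustrative and do not come from the paper, and the coefficients are assumed to be non-negative integers.

```python
def binary_digits(coeff):
    """Digits of a non-negative integer, least significant first, each 0 or 1."""
    return [(coeff >> i) & 1 for i in range(coeff.bit_length())]


def csd_digits(coeff):
    """Canonical signed digit form, least significant first, digits in {-1, 0, 1}."""
    digits = []
    while coeff != 0:
        if coeff & 1:
            # +1 when coeff % 4 == 1, -1 when coeff % 4 == 3
            digit = 2 - (coeff & 3)
            coeff -= digit
        else:
            digit = 0
        digits.append(digit)
        coeff >>= 1
    return digits


def shift_add_multiply(x, digits):
    """Multiply x by the constant encoded in `digits` using shifts and adds/subtracts."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc


def nonzero_count(digits):
    """Number of nonzero digits, i.e. the number of shifted terms to combine."""
    return sum(1 for d in digits if d != 0)
```

For example, the constant 7 is 111 in binary (three shifted terms) but has the CSD digits +1, 0, 0, -1 (two shifted terms), so `shift_add_multiply(x, csd_digits(7))` evaluates `(x << 3) - x`.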
The performance comparison in Table I shows that the proposed CSLA
has fewer gates, and hence less area and delay.
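The carry select principle can be illustrated with a small behavioural model. The sketch below is not the gate-level adder proposed in the paper: it only shows the generic idea of computing each block for both possible carry-in values and then selecting one result. Each bit is formed through half-sum, half-carry, full-sum and full-carry steps. The block size of 4 bits and all function names are assumptions made for illustration.

```python
def ripple_block(a, b, carry_in, n_bits):
    """Add two n_bits-wide values bit by bit, returning (sum, carry_out)."""
    total, carry = 0, carry_in
    for i in range(n_bits):
        a_i, b_i = (a >> i) & 1, (b >> i) & 1
        half_sum = a_i ^ b_i                       # half-sum generation
        half_carry = a_i & b_i                     # half-carry generation
        total |= (half_sum ^ carry) << i           # full-sum generation
        carry = half_carry | (half_sum & carry)    # full-carry generation
    return total, carry


def carry_select_add(a, b, width=8, block=4):
    """Behavioural carry select adder: returns (sum mod 2**width, carry_out)."""
    result, carry = 0, 0
    for low in range(0, width, block):
        n_bits = min(block, width - low)
        mask = (1 << n_bits) - 1
        a_blk, b_blk = (a >> low) & mask, (b >> low) & mask
        # Generator step: both candidate results are prepared for each block.
        sum0, carry0 = ripple_block(a_blk, b_blk, 0, n_bits)
        sum1, carry1 = ripple_block(a_blk, b_blk, 1, n_bits)
        # Selection step: the carry from the previous block picks one candidate.
        block_sum, carry = (sum1, carry1) if carry else (sum0, carry0)
        result |= block_sum << low
    return result, carry
```

In hardware the two candidate results of every block are computed in parallel, so only the selection has to wait for the incoming carry; the loop above evaluates them sequentially purely for readability.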
IV. ARCHITECTURE OF THE VHBCSE ALGORITHM BASED CONSTANT MULTIPLIER
USING EFFICIENT CSLA
The data flow diagram of the proposed vertical-horizontal BCSE
algorithm based constant multiplier (CM) design is shown in Fig. 4.
The blocks in Fig. 4 are explained below.
A) Sign Conversion Block: This block is needed to support the signed
decimal data representation for both the input and the coefficient. A
1's complement circuit generates the inverted version of the 16-bit
coefficient (excluding the MSB), and a 16-bit 2:1 multiplexer selects
between the original and the inverted coefficient depending on the
value of the most significant bit (MSB) of the coefficient. Therefore,
for a negative coefficient the multiplexed coefficient is the inverted
form; otherwise the coefficient is passed through unchanged.
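A minimal Python sketch of this block follows, assuming a 17-bit coefficient word made of one sign bit (the MSB) followed by 16 data bits. The name `sign_convert` and the word layout are assumptions for illustration; the sketch models only the inverter and the 2:1 multiplexer described above.

```python
DATA_BITS = 16
DATA_MASK = (1 << DATA_BITS) - 1


def sign_convert(coeff_word):
    """Return (msb, multiplexed 16-bit coefficient) for a 17-bit coefficient word."""
    msb = (coeff_word >> DATA_BITS) & 1   # select input of the 2:1 multiplexer
    original = coeff_word & DATA_MASK     # the 16 bits below the MSB
    inverted = ~original & DATA_MASK      # 1's complement of those 16 bits
    return msb, (inverted if msb else original)
```

With the MSB clear the data bits pass through unchanged, and with the MSB set every data bit is inverted, for instance `sign_convert(0x1FFFE)` yields `(1, 0x0001)`.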
III. PROPOSED METHOD
The VHBCSE algorithm based constant multiplier is implemented here
using efficient carry select adders. VHBCSE first applies a 2-bit BCSE
vertically across adjacent coefficients, followed by 4-bit and 8-bit
horizontal BCSEs, to detect and eliminate as many of the BCSs present
within each coefficient as possible. The modified algorithm works with
signed decimal numbers for both the input and the coefficients. By
extending the BCSE to the lower level, it reduces the probability that
the adders (A0-A7) are used to sum the outputs of the partial product
generator, and the ripple carry adders are replaced by carry select
adders.
Fig. 4. Data flow diagram of CM using VHBCSE algorithm
Fig.3. Proposed Carry Select Adder
The carry select adder has two units: 1) the sum and carry generator
(SCG) unit and 2) the sum and carry selection unit. The logic
operation of the n-bit RCA is performed in four stages: 1) half-sum
generation (HSG); 2) half-carry generation (HCG); 3) full-sum
generation (FSG); and 4) full-carry generation (FCG). Table I compares
the performance of the 8-bit adders.
B) Multiplexer Unit: The multiplexer unit selects the appropriate data
generated by the PPG unit depending on the coefficient's binary value.
As shown in Fig. 8, eight 4:1 multiplexers are required in layer-1 to
produce the partial products according to the 2-bit BCSE algorithm
applied vertically on the Multiplier Adder Tree (MAT). The widths of
these eight multiplexers are 17, 15, 13, 11, 9, 7, 5 and 3 bits
respectively, instead of 16 bits for all, which reduces the hardware
and power consumption.
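The selection performed by the multiplexer unit can be sketched in Python as follows. The sketch assumes that each 2-bit group of a 16-bit coefficient chooses among the four candidates 0, x, 2x and 3x, and it uses unbounded integers, so the decreasing multiplexer widths quoted above are not modelled. The names `mux4`, `layer1_partial_products` and `combine` are hypothetical.

```python
def mux4(select, d0, d1, d2, d3):
    """4:1 multiplexer: forward one of four data inputs according to a 2-bit select."""
    return (d0, d1, d2, d3)[select]


def layer1_partial_products(x, coeff, coeff_bits=16):
    """One partial product per 2-bit group of the coefficient, lowest group first."""
    # Candidates assumed to be prepared once by the partial product generator.
    candidates = (0, x, x << 1, x + (x << 1))
    return [
        mux4((coeff >> pos) & 0b11, *candidates)
        for pos in range(0, coeff_bits, 2)
    ]


def combine(partial_products):
    """Weight each partial product by its group position and add them all."""
    return sum(pp << (2 * i) for i, pp in enumerate(partial_products))
```

A 16-bit coefficient gives eight groups and therefore eight multiplexer outputs, and `combine(layer1_partial_products(x, c))` equals `x * c` for any 16-bit `c`.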
C) Control Logic (CL) Generator: The CL generator block produces 7
control signals depending on equality checks for 7 different cases.
The architecture of the control signal generator block is shown in
Fig. 5. The control signal for the 8-bit equality check is produced
from the control signals generated by the 4-bit equality checks. XOR
gates are used for the comparison.
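The text does not list the 7 cases, so the following Python sketch rests on one plausible reading, stated here as an assumption: the six pairwise equality checks among the four 4-bit groups of a 16-bit coefficient, plus one 8-bit check (upper byte equal to lower byte) derived from two of the 4-bit results. The names `equal4` and `control_signals` are illustrative.

```python
from itertools import combinations


def equal4(a, b):
    """4-bit equality via XOR: the groups match when every XOR output bit is 0."""
    return int(((a ^ b) & 0xF) == 0)


def control_signals(coeff):
    """Return 7 equality flags for a 16-bit coefficient (assumed set of cases)."""
    # nibbles[0] is the least significant 4-bit group.
    nibbles = [(coeff >> (4 * i)) & 0xF for i in range(4)]
    signals = {
        (i, j): equal4(nibbles[i], nibbles[j])
        for i, j in combinations(range(4), 2)
    }
    # Upper byte equals lower byte exactly when both nibble pairs match,
    # so the 8-bit flag is built from two 4-bit flags with an AND.
    signals["eq8"] = signals[(0, 2)] & signals[(1, 3)]
    return signals
```

For the coefficient `0xABAB`, for instance, only the flags for the pairs `(0, 2)` and `(1, 3)` and the derived `"eq8"` flag are set.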
The final addition after layer-3 produces the multiplication result
between the input and the coefficient.
Fig. 5. Control logic generator unit
D) Partial Product Generator (PPG): In the BCSE method, a
shift-and-add based technique is used to generate the partial
products, which are summed in the following layers to produce the
final result. The choice of the size of the BCS defines the number of
partial products, as shown in Fig. 6.
Fig.6. Partial product generator
E) Controlled Addition at Layer-2: The partial products (PPs)
generated from the eight groups of 2-bit BCSs are added to obtain the
final multiplication result; this is done in three layers. Layer-2
requires four efficient carry select adders (A1-A4) to sum the eight
PPs. The adders (A1-A4) are controlled by the control signals (C1-C6).
The architecture of this block is shown in Fig. 7.
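How the control signals gate the layer-2 adders is not spelled out in the text. The Python sketch below therefore makes an explicit assumption: when a 4-bit group of the coefficient equals an earlier group, the earlier sum is forwarded instead of being recomputed, so fewer adders are active. All names are hypothetical and integers stand in for the fixed-width signals.

```python
def select_multiple(x, two_bits):
    """Partial product for one 2-bit group: 0, x, 2x or 3x."""
    return (0, x, x << 1, x + (x << 1))[two_bits]


def nibble_sum(x, nibble):
    """One layer-2 addition: combine the two partial products of a 4-bit group."""
    low = select_multiple(x, nibble & 0b11)
    high = select_multiple(x, (nibble >> 2) & 0b11)
    return low + (high << 2)


def controlled_layer2(x, coeff):
    """Return (four layer-2 sums, number of additions actually performed)."""
    nibbles = [(coeff >> (4 * i)) & 0xF for i in range(4)]
    sums, adder_ops = [], 0
    for i, nibble in enumerate(nibbles):
        # XOR-based equality check against every earlier 4-bit group.
        match = next((j for j in range(i) if (nibbles[j] ^ nibble) == 0), None)
        if match is None:
            sums.append(nibble_sum(x, nibble))   # adder is used
            adder_ops += 1
        else:
            sums.append(sums[match])             # earlier sum is forwarded
    return sums, adder_ops


def combine(sums):
    """Stand-in for the later layers: weight each sum by its position and add."""
    return sum(s << (4 * i) for i, s in enumerate(sums))
```

Under this assumption a coefficient such as `0xABAB` needs only two of the four layer-2 additions, while a coefficient with four distinct groups needs all four; in both cases `combine` recovers the full product.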
Fig. 7. Controlled addition at layer-2
F) Controlled Addition at Layer-3: The four multiplexed sums (AS1,
AS2, AS3 and AS4) generated by layer-2 are summed in layer-3. In our
algorithm, controlled addition is performed instead of direct addition
of these four sums.
G) Final Addition on Layer-4: This block adds the two sums (AS5 and
AS6) produced by layer-3.
Fig. 8. Proposed constant multiplier architecture
V. RESULTS AND DISCUSSIONS
The VHBCSE algorithm based constant multiplier architecture, shown in
Fig. 9, has been coded in the Verilog hardware description language
and synthesized for the targeted FPGA device. The designs are
simulated using Xilinx ISE 8.1 (Integrated Synthesis Environment) with
the Spartan-3E family, XC3S250E device, PQ208 package, speed grade -4,
the XST (VHDL/Verilog) synthesis tool and the ISim (VHDL/Verilog)
simulator. The performance comparison between the two methods is given
in Table II.
TABLE I
PERFORMANCE COMPARISONS FOR DIFFERENT ADDERS
ADDER                               TOTAL GATE COUNT    DELAY
Using ripple carry adders (RCA)     102                 16.50 ns
Conventional carry select adders    108                 14.751 ns
Proposed carry select adders        84                  13.799 ns
TABLE II
PERFORMANCE COMPARISONS BETWEEN TWO DIFFERENT METHODS
Parameter                 Using ripple carry adders (RCA)    Using carry select adders
No. of input LUTs         310                                92
No. of occupied slices    170                                53
Total delay (ns)          41.926                             38.397
The simulation results of the proposed structure for a filter of
16-bit length are given in Fig. 9, using both ripple carry adders and
efficient carry select adders with the proposed vertical-horizontal
binary common sub-expression elimination algorithm for designing a
reconfigurable FIR filter.
VI. CONCLUSION
With a view to implementing an efficient fixed-point reconfigurable
FIR filter, this paper presents a vertical-horizontal BCSE algorithm
that removes the initial common sub-expressions (CSs) by applying a
2-bit BCSE vertically. Further CSs are eliminated by finding the CSs
present within the coefficients, applying BCSEs of different lengths
horizontally at different layers of the shift-and-add based constant
multiplier architecture, which uses efficient carry select adders. The
proposed design reduces area and delay: compared with the ripple carry
adder version, the occupied slices fall from 170 to 53 and the total
delay from 41.926 ns to 38.397 ns (Table II).
Fig. 9. Synthesis results of the multiplier using the VHBCSE algorithm:
(a) ripple carry adders; (b) efficient CSLA