This item is protected by original copyright

AN EFFICIENT MODIFIED BOOTH MULTIPLIER
ec
te
d
by
or
ig
in
al
co
p
yr
ig
ht
ARCHITECTURE
©
Th
i
si
te
m
is
pr
ot
RAZAIDI BIN HUSSIN
UNIVERSITI MALAYSIA PERLIS
2008
ht
ig
yr
al
co
p
by
RAZAIDI BIN HUSSIN
(0430110017)
©
Th
i
si
te
m
is
pr
ot
ec
te
d
by
or
ig
in
An Efficient Modified Booth Multiplier
Architecture
A thesis submitted
in fulfilment of the requirements for the degree of
Master of Science (Microelectronic Engineering)
School of Microelectronic Engineering
UNIVERSITI MALAYSIA PERLIS
2008
UNIVERSITI MALAYSIA PERLIS
DECLARATION OF THESIS
Author’s full name
:
RAZAIDI BIN HUSSIN ………………..…………………………..........................
Date of birth
:
8 FEBRUARY 1980 ……………………………..……………………………………
Title
:
AN EFFECIENT MODIFIED BOOTH MULTIPLIER ARCHITECTURE
ig
ht
…….……………………………………..........................………………………….
:
2008…………………………
al
co
p
Academic Session
yr
…………………………………………………………………….............................
or
ig
in
I hereby declare that the thesis becomes the property of Universiti Malaysia Perlis (UniMAP) and to be placed
at the library of UniMAP. This thesis is classified as :
(Contains confidential information under the Official Secret Act 1972)*
RESTICTED
(Contains restricted information as specified by the organization where
research was done)*
OPEN ACCESS
I agree that my thesis is to be made immediately available as hard
copy or on-line open access (full text)
pr
ot
ec
te
d
by
CONFIDENTIAL
Th
i
si
te
m
is
I, the author, give permission to the UniMAP to reproduce this thesis in whole or in part for the purpose of
research or academic exchange only (except during a period of _____ years, if so requested above).
©
_________________________
SIGNATURE
800208-09-5033
(NEW IC NO. / PASSPORT NO.)
Date :4/12/2008
Certified by:
_________________________________
SIGNATURE OF SUPERVISOR
Prof. Dr. Ali Yeon Md Shakaff
NAME OF SUPERVISOR
Date : 4/12/2008
NOTES : * If the thesis is CONFIDENTIAL or RESTRICTED, please attach with the letter from the organization with period and reasons for
confidentially or restriction.
AN EFFICIENT MODIFIED BOOTH MULTIPLIER ARCHITECTURE
yr
ig
ht
ABSTRACT
©
Th
i
si
te
m
is
pr
ot
ec
te
d
by
or
ig
in
al
co
p
Multiplier plays an important role in today’s compute intensive applications
such as computer graphics and digital signal processing. This thesis described the
design of an Efficient Modified Booth Multiplier Architecture. With the tradeoff between
speed and area, the design of the Modified Booth Multiplier focused on high speed with
a moderate increase in area. This was achieved by reducing the critical path delay in
the basic element of the multiplier circuit. Multiplication is performed by generating the
partial product of Modified Booth Encoding (MBE) and accumulating the entire partial
product by an adder or compressor. The research began by examining the available
encoding schemes used to generate the partial product and 4:2 compressor that are
used to accumulate the partial product. The fastest MBE and the most efficient 4:2
compressor has been used to develop the multiplier. The multiplier performance was
improved by adapting various methods such as Simplified Sign Extension (SSE) and a
proper tree topology. The SSE method eliminated some counter or adders in a partial
product row while the tree topology arrangement of the compressors and their
interconnection accumulate the partial product. A Gajski’s rule had been used to
evaluate the performance of the multiplier and the result shows that the new multiplier
has reduced delays in producing the output. The new multiplier architecture has
reduced delays to almost 2% to 7% compared to other multipliers. The high speed
multiplier was then extended to develop a Floating Point (FP) multiplier. The FP
multiplier had been successfully designed using Altera Quartus II software and
implemented on MAX EPM7182SLC84-7 device. The result showed that the FP
multiplier is 38% faster compared to conventional FP multiplier. In term of size, the FP
multiplier is 26% bigger than conventional circuit. However the increase in area of the
circuit can be tolerated since the aim was to enhance the speed of the FP Multiplier
iii
PENAMBAHBAIKAN SENIBINA PENDARAB BOOTH DIUBAHSUAI
yr
ig
ht
ABSTRAK
©
Th
i
si
te
m
is
pr
ot
ec
te
d
by
or
ig
in
al
co
p
Pendarab memainkan peranan penting dalam aplikasi intensif komputer seperti
grafik komputer dan pemprosessan isyarat digit. Tesis ini adalah mengenai rekabentuk
sebuah Pendarab Booth Diubahsuai yang diubahsuai senibinanya. Dengan faktor
penghad seperti kelajuan dan keluasan, rekabentuk Pendarab Booth Diubahsuai ini
hanya memberi penekanan kepada kelajuan yang tinggi walaupun peningkatan saiz
yang tidak begitu ketara. Pendarab Booth Diubahsuai ini dapat dicapai dengan
mengurangkan masa laluan genting dalam litar asas pendarab. Proses pendaraban dapat
dilaksanakan dengan menghasilkan Produk Separa Pengekod Booth Diubahsuai (MBE)
dan mencampurkan Produk Separa dengan litar penambah atau litar pemampat.
Penyelidikan ini telah dimulakan dengan menganalisa pengekodan yang sedia ada bagi
menghasilkan Produk Separa dan Pemampat 4:2 yang digunakan untuk mencampurkan
Produk Separa. Litar MBE yang berkelajuan paling tinggi dan Pemampat 4:2 yang
paling effisien akan digunakan bagi membina litar pendarab. Prestasi pendarab telah
diperbaiki dengan menambah beberapa kaedah seperti kaedah Simplified Sign
Extension (SSE) dan menggunakan topologi pokok yang sesuai. Kaedah SSE digunakan
untuk menghapuskan sebahagian litar penambah atau litar pemampat dalam barisan
produk separa manakala topologi pokok adalah susunan penyambungan litar pemampat
bagi mencampurkan kesemua produk separa. Peraturan Gajski telah digunakan untuk
menilai prestasi litar pendarab dan hasil keputusan menunjukkan pendarab baru
mengurangkan masa dalam menghasilkan keluaran. Pendarab baru mengurangkan masa
pendaraban diantara 2% hingga 7% berbanding pendarab yang lain. Pendarab
berkelajuan tinggi kemudiannya dilanjutkan untuk membangunkan satu pendarab titik
terapung (FP). Pendarab FP ini telah berjaya dibina dengan jayanya menggunakan
perisian Altera Quartus II dan diimplemntasikan diatas peranti MAX EPM7182SLC847. Hasil menunjukkan litar pendarab FP adalah 38% lebih laju berbanding litar pendarab
yang biasa. Saiz litar pendarab FP adalah 26% lebih besar berbanding litar biasa.
Namun, saiz litar boleh ditoleransi memandangkan matlamat adalah menambahbaik
kelajuan pendarab FP.
iv
ACKNOWLEDGMENT
The work presented in this thesis would not have been possible without the
assistance and cooperation of many people and organizations. Firstly I would like to
thank my great supervisors, Prof Ali Yeon Md. Shakaff and Madam Norina Idris for
their assistance and support during my project. I also would like to extend thanks to Mr.
ht
Rizalafande Che Ismail and Mr. Sean Lee for their time and many helpful suggestions
yr
ig
for this thesis.
al
co
p
Last but certainly not least, I want to express my heartfelt thanks to my wife
Afzan Kamarudin during the years of my research. With no words available to show
ig
in
enough gratitude I say “may Allah reward you within the best”
or
I also want to thank my parents who always provide me the freedom to do what I
by
feel is the right thing. Finally, to all named and unnamed, for their support and
ec
te
d
understanding towards the completion of this project, thank you very much. May Allah
©
Th
i
si
te
m
is
pr
ot
bless and reward all those who had helped me!
v
TABLE OF CONTENTS
ii
ABSTRACT
iii
ABSTRAK
iv
ACKNOWLEDGMENT
v
TABLE OF CONTENTS
vi
ig
ht
DECLARATION OF THESIS
yr
LIST OF TABLES
al
co
p
LIST OF FIGURES
x
xiii
in
ABBREVIATION
ix
ig
1. Background
1
1.2 Research background.............................................................................
1
1.3 Research statement................................................................................
3
ec
te
d
by
or
1.1 Introduction…………………………………………………………...
4
1.5 Research methodologies........................................................................
4
ot
1.4 Research objectives...............................................................................
6
is
pr
1.6 Thesis structure......................................................................................
m
2. Literature Review
si
te
2.1 Introduction…………………………………………………………..
©
Th
i
2.2Multiplication Algorithm.......................................................................
8
8
2.3 Modified Booth Multiplier Architecture...............................................
9
2.4 Modified Booth Encoding.....................................................................
10
2.5 Reviews on Modified Booth Multiplier.................................................
13
2.6 Adder tree circuit...................................................................................
14
2.6.1 4:2 Compressor.......................................................................
15
2.6.2 Review on compressor............................................................
15
2.7 Multiplier topology................................................................................
17
2.7.1 Regular topologies..................................................................
18
2.7.1.1 Arrays topologies.....................................................
18
2.7.1.2 Tree topologies........................................................
20
2.7.2 Irregular topologies.................................................................
20
vi
2.8 Simplified Sign Extension....................................................................
21
2.9 Floating Point Multiplier.......................................................................
21
2.10 Floating Point Representation..............................................................
22
2.11 Summary..............................................................................................
23
3. Analysis of Existing Modified Booth Encoding
25
3.2 Analysis of Modified Booth Encoding Schemes...................................
25
ig
ht
3.1 Introduction............................................................................................
yr
3.2.1 Type A....................................................................................
26
27
3.2.3 Type C....................................................................................
28
3.2.4 Type D....................................................................................
29
30
3.2.6 Type F.....................................................................................
31
3.3 Overall Analysis of the Booth Encoding Schemes................................
32
by
ig
3.2.5 Type E.....................................................................................
or
in
al
co
p
3.2.2 Type B....................................................................................
39
3.5 Analysis of 4:2 Compressor...................................................................
41
3.5.1 Conventional 4:2 Compressor................................................
41
3.5.2 Soulas 4:2 Compressor...........................................................
42
3.5.3 Hsin Lei 4:2 Compressor........................................................
43
3.6 Result of 4:2 Compressor……………………………………………..
44
©
Th
i
si
te
m
pr
ot
y2i+1 only..........................................................................................
is
ec
te
d
3.4 Problem analysis for the Booth Encoder in which the S signal uses
3.7 Summary....................................................................................
48
4. Developed a New Modified Booth Encoding and Redesign 4:2 Compressor
4.1 Introduction............................................................................................
49
4.2 Proposed New Recoding Scheme……………………………………..
49
4.3 Redesigned 4:2 Compressor………………………………………….
51
4.4 Analysis of MBE……………………………………………………...
52
5.5 Analysis of 4:2 Compressor…………………………………………..
55
5.6 Summary………………………………………………………………
57
vii
5. Design of an Efficient Modified Booth Multiplier Using the Proposed Design
of New Modified Booth Encoding and Redesigned 4:2 Compressor
58
5.2 Multiplier Architecture……………………………………………......
59
5.3 Simplified Sign Extension Method (SSE)…………………………….
60
5.4 Compressor Row……………………………………………………...
62
5.5 Binary tree topology………………………………………………….
62
5.6 Simulation and Analysis of Results…………………………………...
64
ig
ht
5.1 Introduction……………………………………………………………
yr
5.6.1 Simulation Results…………………………………………..
64
69
5.6.3 Implementation on FPGA Board………………..………….
71
5.6.4 Analysis delay for multiplication…..……………………….
72
in
al
co
p
5.6.2 Synthesis Results……………………………………………
74
5.7 Summary………………………………………………………………
76
by
or
ig
5.6.5 Hardware costs analysis…………………………………….
ec
te
d
6. Implementation of a Floating Point Multiplier using Efficient Modified Booth
77
6.2 Floating Point Hardware………………………………………………
78
6.2.1 Multiplier Module…………………………………………...
78
6.2.2 Adder Module………………………………………………
79
6.2.3 Normalized Module…………………………………………
79
6.2.4 Sign Module…………………………………………………
80
6.3 Wallace tree topology…………………………………………………
80
6.4 Simulation Result……………………………………………………...
85
6.5 Result Analysis………………………………………………………..
87
6.6 Summary………………………………………………………………
88
©
Th
i
si
te
m
pr
ot
6.1 Introduction……………………………………………………………
is
Multiplier
7. Discussion, Conclusion and Future Work
7.1 Discussion……………………………………………………………..
89
7.1 Conclusion …………………………………………………………...
93
7.2 Future Work…………………………………………………………...
94
References………………………………………………………………..………
viii
95
LIST OF TABLES
11
Table 2.2: Array topologies for the summation of the partial products………...
19
Table 2.3: Description of tree topologies……………………………………….
20
Table 2.4: Description of irregular tree topologies……………………………..
21
Table 2.5: Real Number Notation………………………………………………
23
ht
Table 2.1: Partial Products associated with multiplier bits grouping…………...
ig
Table 3.1: Types of Booth Encoding in generate the Partial Product…………..
33
yr
Table 3.2: Normalized gate delays and hardware cost (Gajski 1997)…………..
26
al
co
p
Table 3.3: Shows detail of total delay for Booth Encoder and Booth Selector
38
Table 3.4: The number of transistors used to develop each scheme……………
39
Table 3.5: Analysis of 4:2 Compressor circuits………………………….…..…
46
or
ig
in
for Type A to Type F……………………………………………………
Table 3.6: Number of transistors used to develop the compressor…………..…
47
53
Table 4.2: Total transistors used in Booth Encoding………………………...…
54
Table 4.3: Analysis of 4:2 compressor circuits…………………………………
56
Table 4.4: Number of transistors used to develop the Compressor…………..…
57
pr
ot
ec
te
d
by
Table 4.1: Overall analysis of all types………………………………………...
is
Table 5.1: Description of simulation results based on Figure 5.6 in page 67….
m
Table 5.2: Description of simulation results based on Figure 5.7 in page 68 …
66
73
te
Table 5.3: Detail comparison of various multipliers……………………………
65
si
Table 5.4: Hardware costs (in number of transistors) for generating the
75
Table 6.1: Input A for multiplication process………………………………….
85
©
Th
i
multiplier circuit………………………………………………………..
Table 6.2: Input B for multiplication process……………………………....…..
86
Table 6.3: Output for multiplication process………………………………..….
86
Table 6.4: Analysis result of FP Multiplication……………………………...…
87
ix
LIST OF FIGURES
5
Figure 2.1: Modified Booth Multiplier block diagram……………………………
10
Figure 2.2: Multiplier bits grouping according to Booth recoding for 8 bit input..
11
Figure 2.3: Example for 01112 ……………………………………………………
11
Figure 2.4: Extended multiplier for 001112……………………………………….
12
ht
Figure 1.1: The development process…………………………………………......
ig
Figure 2.5: An example of a Booth recoded multiplication.………………………
12
15
Figure 2.7: High level view of the 4:2 compressor……………………………….
16
Figure 2.8: Equivalent circuit……………………………………………………..
16
Figure 2.9: Critical Paths in a row a 4:2 compressor..…………….………………
17
in
al
co
p
yr
Figure 2.6: Equivalent circuit (Flynn and Oberman 2001)………………………..
ig
Figure 2.10: Binary Floating Point Format…………………………….…………
23
or
Figure 3.1: Booth Encoder and Booth Selector for Type A (Weste and Harris
by
2005)………………………………………………………………………
27
ec
te
d
Figure 3.2: Booth Encoder and Booth Selector for Type B (Weste and Harris
2005)………………………………………………………………………
28
ot
Figure 3.3: Booth Encoder and Booth Selector for Type C (Brown and Vranesic
pr
2005)………………………………………………………………………
29
m
is
Figure 3.4: Corresponding Booth Encoder and Booth Selector. (Brown and
30
te
Vranesic 2005)…….………………………………………………………
si
Figure 3.5: The corresponding Booth Encoder and Booth Selector for Type E
Th
i
(Wen-Chang and Chein-Wei 2000)………………………………...……
31
©
Figure 3.6: The corresponding Booth Encoder and Booth Selector for Type F
(Hsin-Lei, Chang et al. 2004)……………………………………………
32
Figure 3.7: Critical path delay Booth Encoder for Type A (Weste and Harris
2005)...........................................................................................................
33
Figure 3.8: Critical path delay Booth Selector for Type A (Weste and Harris
2005)………………………………………………………………………
34
Figure 3.9: Critical path delay Booth Encoder for Type B (Weste and Harris
2005) ……………………………………………………………………...
34
Figure 3.10: Critical path delay Booth Selector for Type B (Weste and Harris
2005)...........................................................................................................
x
35
Figure 3.11: Critical path delay Booth Encoder for Type C (Brown and Vranesic
2005)………………………………………………………………………
35
Figure 3.12: Critical path delay Booth Selector for Type C (Brown and Vranesic
2005)………………………………………………………………………
36
Figure 3.13: Critical path delay Booth Encoder for Type D (Brown and Vranesic
2005)………………………………………………………………………
36
ig
ht
Figure 3.14: Critical path delay Booth Selector for Type D (Brown and Vranesic
37
yr
2005)………………………………………………………………………
al
co
p
Figure 3.15: Critical path delay for Type E (Wen-Chang and Chein-Wei
2000)….…...................................................................................................
37
in
Figure 3.16: Critical path delay for Type F (Hsin-Lei, Chang et al.
38
Figure 3.17: Multiplier that uses SSE method……………………………………
40
Figure 3.18: Conventional multiplier when Booth Encoder is in condition ‘111’.
40
Figure 3.19: Type E and F schemes when Booth Encoder is in condition ‘111’…
41
Figure 3.20: Structure of the 4:2 compressor built with Full Adders…………….
42
Figure 3.21: Structure of a Soulas 4:2 Compressor……………………………….
43
Figure 3.22: Structure of a Hsin Lei 4:2 Compressor……………………………..
44
Figure 3.23: Normalized gate delay for conventional 4:2 Compressor…………...
45
Figure 3.24: Normalized gate delay for Soulas 4:2 Compressor………………….
45
te
m
is
pr
ot
ec
te
d
by
or
ig
2004)………...……………………………………………………………
si
Figure 3.25: Normalized gate delay for Hsin Lei 4:2 Compressor……………….
Figure 3.26: Delay Analyses and Total Transistor for All Compressors…………
47
Figure 4.1: The new Booth Selector circuit………………………………………
51
Figure 4.2: Redesign of 4:2 compressor………………………………….………
52
Figure 4.3: Critical path delay for the Proposed Circuit………………………….
53
Figure 4.4: Critical path delay in generating partial product…………………...…
54
Figure 4.5: Normalized gate delay for the Redesign 4:2 Compressor…………….
55
Figure 4.6: Delay Analyses for All Compressors…………………………………
56
Figure 5.1: Block diagram of the 8 x 8 bit parallel multiplier…………………….
60
Th
i
©
46
Figure 5.2: The array of partial products for signed multiplication using
conventional technique (Kang and Gaudiot, 2004) ………………………
61
Figure 5.3: Sign extension less method (Ercegovac, 2003)……………………….
61
xi
Figure 5.4: Critical paths in a row of 4:2 Compressors…………………………...
62
Figure 5.5: The critical path for 8x8 bit multiplier…………………………
63
Figure 5.6: Simulation result based on detail in Table 5.1…………………..……
67
Figure 5.7: Simulation result based on detail in Table 5.2……………………..…
68
Figure 5.8: Report of analysis and synthesis usage summary for 8x8 bit
multiplier…………………………………………………………………..
69
Figure 5.9: Report of analysis and synthesis usage summary for16x16 bit
70
ig
ht
multiplier…………………………………………………………………..
yr
Figure 5.10: Report of analysis and synthesis usage summary for 32x32 bit
70
Figure 5.11: UP2 education development kit …………………………………….
71
Figure 5.12: Shows the multiplication result display in LCD………………….....
72
in
al
co
p
multiplier…………......................................................................................
78
Figure 6.2: The addition architecture of 25x25-bit signed multiplication………...
81
Figure 6.3: The partial product bits after being repositioned……………………..
81
Figure 6.4: Simulation result of multiplication……………………………………
86
Figure 7.1: Breakdown for critical path of 8bit, 16bit and 32 bit multiplier……...
91
Figure 7.2: Total transistors used in developing the multiplier………………..….
92
ot
ec
te
d
by
or
ig
Figure 6.1: FP Multiplier interconnection of block module………………………
pr
Figure 7.3: Critical paths for efficient FP Multiplier and conventional FP
is
Multiplier………………………………………………………………….
92
te
m
Figure 7.4: Total logic elements for efficient FP multiplier and conventional FP
©
Th
i
si
Multiplier………………………………………………………………….
xii
93
ABBREVIATION
CAD – Computer Aided Design
FPGA – Field Programmable Gate Array
FP – Floating Point
MBE – Modified Booth Encoding
MBM – Modified Booth Multiplier
ht
PP – Partial Product
al
co
p
Verilog HDL – Verilog Hardware Description Language
yr
ig
SSE – Simplified Sign Extension
©
Th
i
si
te
m
is
pr
ot
ec
te
d
by
or
ig
in
VLSI – Very Large Scale Integrated
xiii
CHAPTER 1
ig
Introduction
by
or
1.1
in
al
co
p
yr
ig
ht
Background
ec
te
d
This chapter started with explaining the background of multiplier. The history of
Booth algorithm, compressor and simplified sign extension were described. The
ot
objectives and scopes of research were also described in detail. Generally, this chapter
Research background
©
Th
i
si
1.2
te
m
is
pr
reviews the reason to improve the performance of Modified Booth Multiplier.
With the constant growth of computer applications such as computer graphic
and signal processing, fast arithmetic unit especially multipliers are becoming
increasingly important. Advanced VLSI technology has provided the designer the
freedom to integrate many complex components, which was not possible in the past.
Over the years various high speed multipliers have been proposed and realized (Booth
1951; Sorley 1961; Wallace 1964; Lim 1978; Villeger and Oklobdzija 1993; Hsin-Lei,
Chang et al. 2004). Multipliers play an important part in today’s digital signal
processing (DSP) systems. Examples of their use include implementations of digital
filters, discrete Fourier transform, correlations and many others.
1
In any multiplication algorithm, the operation is decomposed into a partial
product generation and partial product summation. Each partial product represents a
multiple of the multiplicand to be added to the final result. Previously reported
multiplication algorithms mainly focus on rapidly reducing the partial products rows
down to the final sums and carries used for the final accumulation. These techniques
mostly rely on the circuit optimization and minimization of critical paths (Kang and
Gaudiot 2004). Various algorithms for reducing the number of partial product have been
ig
ht
proposed. One of the earliest algorithms is Booth algorithm (Booth 1951). Ever since its
yr
first introduction, it has been a popular encoding technique used to reduce the number of
al
co
p
steps in the multiplication process. The Modified Booth Algorithm performs the
in
encoding in parallel and is widely implemented in fast multiplier design.
ig
In the accumulation part, all partial products must be accumulated to obtain the
or
final result. The accumulation of partial product can be accomplished by using adder or
by
compressor. The arrangement of adder or compressor to accumulate the partial product
ec
te
d
also affects the delays to produce the final result. A fast tree topology structure such as
ot
the Wallace tree can be employed for high speed accumulation.
pr
In 2003, Ercegovac (Ercegovac 2003) proposed simplified sign Extension (SSE)
is
which can speed up addition process. Ever since the SSE method got introduced, it has
te
m
always been used and developed because it renders the advantage of reducing the
Th
i
si
adder’s block. As a result, the total delay to produce the final result also reduces.
©
The combination of MBE and adder or compressor will produce a Modified
Booth Multiplier (MBM). On top of that, the implementation of proper tree topology
and SSE method will enhance the performance of the multiplier. This efficient
multiplier has also been extended to the Floating Point (FP) multiplier as a proof of the
efficiency of this multiplier.
2
1.3
Research statement
The limitation in designing a high performance IC or microprocessor was
limited by the multiplier latency. In increasing the performance of this device, many
researchers modified or improved the performance of their multiplier module. In recent
years, various type of multiplier can be implemented in devices, but the most popular
ig
ht
multiplier module was Modified Booth Multiplier. Hence, this research is focused on
al
co
p
yr
improving the MBM module to speedup the performance of multipliers.
The multiplier consists of two operations namely generating the partial product
in
and accumulating the partial product. In generating the partial product, several types of
ig
MBE have been identified and evaluated for the best performance. As a result, the MBE
by
or
which has the fastest speed up time will then be used to develop the efficient multiplier.
ec
te
d
The result of multiplication or final product can be obtained by accumulating the
partial product by adder or compressor circuit. In most situations, the compressor is
ot
used to generate the final product. This is due to the advantages of compressor when
pr
accumulating compared to the adder. Thus, in this study the compressor is used. The
si
te
multiplier.
m
is
compressor has been evaluated and optimized before being implemented in the
Th
i
Generally, in partial product’s accumulation process, counter and adder will
©
affect the multiplier performance. In contrast, there are two other criterion that
contribute to the same problem. The first criterion is the type of tree topology used
during the partial product summation. The next criterion is SSE method. Flynn (Flynn
and Oberman 2001) wrote details about the tree topology in their research. This study
has verified the tree topology as having better performance. Hence, this work will adapt
this tree topology in implementing a high speed multiplier.
The SSE Method currently will improve the total propagation delay in
generating the results by eliminating a few adder or counter modules. Indirectly, it will
reduce the size of the multiplier. However, there is a problem when implementing the
SSE method with the Booth Encoding. The issue is, when the S signal that is generated
3
by Booth Encoder depends only on yi+1. This study has investigated the issue of
designing a high speed multiplier in the context of an entire system.
1.4
Research Objectives
yr
ig
ht
The research work consists of four main objectives:
To design an efficient MBE. This MBE can produce a correct partial
al
co
p
i.
in
product when implementing with the SSE method.
To design an efficient compressor to enhance the accumulation process
iii.
To design a high performance Modified Booth Multiplier using new
by
or
ig
ii.
ec
te
d
MBE and compressor. Further, the SSE method and faster tree structure
ot
are to be implemented to further enhance the performance.
To improve the performance of Floating Point Multiplier by utilizing the
pr
iv.
Research Methodologies
©
Th
i
1.5
si
te
m
is
high performance Modified Booth Multiplier.
The availability of computer-based tools has greatly influenced the design
process in a wide variety of design environments. For this project, a set of design
methodologies are constructed which is similar to semi custom design technique to meet
the research requirement. The flowchart shown in Figure 1 describes a development
process. The most obvious requirement for this process is that the system must function
properly at each level and must meet an expected level of performance.
4
Start
Definition of
design
ht
Initial Design
Redesign
al
co
p
yr
ig
Simulation
Redesign
by
or
ig
in
Design
correct?
ot
ec
te
d
Prototype
implementation
Testing
Meets
specification?
©
Th
i
si
te
m
is
pr
Minor
errors?
Finish
Figure 1.1: The development process.
5
The process begins with the definition of design specifications. The essential
features of the system are identified and an acceptable method of evaluating the
implemented features in the final design is established. The specifications must be
followed to ensure that the developed system meets the general expectations.
From a complete set of specifications, the system is designed efficiently using
schematic entry and Verilog Hardware Description Language (Verilog HDL) codes.
ig
ht
After the general structure is established, CAD tool known as Quartus II 6.1 Web
yr
Edition is used to simulate the behavior of the system and such simulations are used to
al
co
p
determine whether the system meets the required specifications. If errors are found, then
appropriate changes are made and the verification of the new design is repeated through
ig
in
simulation.
or
When the simulation indicates that the design is correct, a complete physical
by
prototype of the product is constructed. For this project, Altera UP2 FPGA board is used
ec
te
d
as a physical prototype to implement the final design of the system. The prototype is
thoroughly tested for conformance according to the specifications. Any minor or major
ot
errors occurred in the testing mode must be fixed. In case of major errors, it is necessary
pr
to redesign the system and repeat the initial step as explained above. When the
si
te
m
is
prototype passes all the tests, then the system is deemed to be successfully designed.
©
Th
i
1.6 Thesis Structure
This thesis consists of seven (7) chapters. Each of the chapter is described as follows:
Chapter 1 describes the problem background, research statement, research scope,
research methodologies and thesis structure.
Chapter 2 focuses on outlines the background knowledge required for
understanding the previous work related to this project. This chapter also discusses the
Modified Booth algorithm, tree topologies and compressor that have been used recently
by other researchers. This chapter also describes the Floating Point architecture.
6
Chapter 3 demonstrates the analysis and simulation of existing MBE and
existing compressor. Some of the existing MBE will be analyzed. The result from the
analysis will be useful in Chapter 4 when designing a new MBE. Finally, the results will
also be compared with several existing design techniques in terms of propagation delay,
speed and hardware costs. Also in this chapter, the existing compressor will be
analyzed. The design which offers the lowest delay performance and hardware costs
ig
ht
will be determined.
yr
Chapter 4 shows how to design the new MBE and new compressor based on the
al
co
p
problem discussed in Chapter 3.
in
Chapter 5 explains the steps in building or constructing a high speed Multiplier
ig
by using MBE and compressor which are designed in Chapter 4. The idea of using
or
binary tree adder is to look for the advantages of MBE and compressor compared to
ec
te
d
by
existing designs.
Chapter 6 demonstrates how to improve the Floating Point Multiplier. Building
ot
FP Multipliers requires development of an efficient MBM and implementing it into FP
pr
Multiplier module. The efficient MBM consists of the new MBE and compressor with
is
the SSE method implemented in the accumulation section. Finally, the Wallace tree has
si
te
m
also been adopted in this efficient MBM.
Th
i
Finally yet importantly, Chapter 7 clarifies the results based on objectives. The
©
promising recommendations were given to improve this project in future. The problems
occurred during completing this project were also presented.
7
CHAPTER 2
ig
Introduction
by
or
2.1
in
al
co
p
yr
ig
ht
Literature Review
ec
te
d
This chapter presents the multiplication algorithm which focuses on parallel
multiplier namely Modified Booth Multiplier. The architecture involved in this
ot
multiplier also revealed in order to understand their characteristics. The element of
pr
multiplier namely Modified Booth Encoding, compressor, Simplified Sign Extension,
is
reduction tree organizations and Floating Point multiplier in previous study were
Th
i
si
te
m
demonstrated.
Multiplication Algorithm
©
2.2
Multiplication is a less common operation than addition, but still essential for
microprocessors, digital signal processors and graphic engines. The most basic form of
multiplication consists of forming the product of two binary numbers. This can be
accomplished through the traditional technique taught in primary school, simplified to
base 2.
M x N bit multiplication can be viewed as forming N partial product of M bits
each, and then summing the appropriately shifted partial product to produce an M + N –
8
bit result P. Binary multiplication is equivalent to a logical AND operation. Therefore,
generating partial products consists of the logical ANDing of the appropriate bits of the
multiplier and multiplicand. Each column of partial products must then be added to
produce a final result.
The number of partial product also can determine the performance of the
multiplication. This number of partial product can be reduced using Modified Booth
al
co
p
yr
circuits. Performance of this circuit will be explored and used
ig
ht
Encoding (MBE). Therefore Modified Booth Algorithm is better in reducing the
There are many factors that will contribute to the multiplier performance
in
including Modified Booth Encoding design, adder design, Simplified Sign Extension
ig
method and reduction tree organizations. This chapter explains more about MBE’s
or
architecture, Compressor design adder tree and Simplified Sign Extension. At the end of
ot
Modified Booth Multiplier Architecture
is
pr
2.3
ec
te
d
by
this chapter, the Floating Point Multiplier will be explained briefly.
te
m
A Modified Booth Multiplier consists of three distinct components. They are the
si
Booth Encoder, Booth Selector and adder tree. The combination of Booth Encoder and
Th
i
Booth Selector are called Modified Booth Encoding (MBE). The function of MBE
©
circuit is to produce the partial product. The Modified Booth Multiplier will generate
only 4 partial products if the multiplication input is 8 bit. Unlike the Shift and ADD or
basic multiplier, the total partial product produced is 8 bit if the multiplication input is 8
bit.
The accumulation of all partial products is done in the adder tree circuit. The
adder tree circuit contributes the most amount of delay in the multiplier performance.
There are a lot of methods that has been implemented in the adder tree circuit to
enhance the multiplier performance. In this research, three components were discussed
namely the adder or compressor circuit, the adder tree topology and the Simplified Sign
Extension (SSE) method. Figure 2.1 shows the high level block diagram of the
9
Modified Booth Multiplier.
Multiplier [7:0]
Multiplicand [7:0]
Y-1
M
BE0 (M, M2, S)
Multiplier {[1:0],0}
Multiplicand [7:0]
Y0MBE0M2
Y1
S
Y1
M
Booth Selector 0
Multiplier [3:1]
BE1 (M, M2, S)
M
ht
Y3
ig
S
Multiplier [5:3]
yr
Y3
BE2 (M, M2, S)
M
in
Y5
Multiplier [7:5]
BE3 (M, M2, S)
Y7
Multiplicand [7:0]
Booth Selector 3
ig
Y0MBE3M2
S
by
ec
te
d
ot
PP0 [10:0]
PP1 {[9:0],2b0}
PP2 {[9:0],4b0}
PP3 {[9:0],6b0}
BE3 (S)
BE2 (S)
BE1 (S)
BE0 (S)
or
S
Multiplicand [7:0]
Booth Selector 2
al
co
p
Y0MBE2M2
Y5
Multiplicand [7:0]
Booth Selector 1
Y0MBE1M2
is
pr
Adder tree
Figure 2.1: Modified Booth Multiplier block diagram
©
Th
i
si
te
m
Final Result [15:0]
2.4
Modified Booth Encoding
The Modified Booth algorithm was proposed by Macsorley (Macsorley 1961).
The basic idea is that, instead of shifting and adding every column of the multiplier term
and multiplying by ‘1’ or ‘0’, the multiplier bits are grouped in blocks of three.
Grouping starts from the LSB, and the first block only uses two bits of the multiplier
(since there is no previous block to overlap).
10

Download Report

This item is protected by original copyright

Paperzz.com

Your Paperzz