IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 54, NO. 9, SEPTEMBER 2007
RNS-To-Binary Converter for a New Three-Moduli Set $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$
Pemmaraju V. Ananda Mohan, Fellow, IEEE
Abstract—In this brief, the design of residue number system (RNS) to binary converters for a new powers-of-two related three-moduli set $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$ is considered. This moduli set uses moduli of uniform word length ($n$ to $n+1$ bits). It is derived from a previously investigated four-moduli set $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n+1}-1\}$. Three RNS-to-binary converters are proposed for this moduli set: one using mixed radix conversion and the other two using the Chinese Remainder Theorem. Detailed architectures of the three converters as well as a comparison with some earlier proposed converters for three-moduli sets with uniform word length and for the four-moduli set $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n+1}-1\}$ are presented.
Index Terms—Digital signal processor (DSP), residue number
system (RNS), reverse converters, three-moduli sets.
Manuscript received January 19, 2007; revised March 27, 2007. This paper was recommended by Associate Editor S. Tsukiyama.
The author is with R&D, Electronics Corporation of India Limited, Bangalore 560 052, India (e-mail: [email protected]).
Digital Object Identifier 10.1109/TCSII.2007.900844

I. INTRODUCTION
THE advantages of using the residue number system (RNS) over the conventional binary number system are well documented [1], [2]. RNS using powers-of-two related moduli sets has attracted a lot of attention due to the possibility of efficient realization of the various building blocks needed, such as adders, multipliers, binary-to-RNS converters, and RNS-to-binary converters. In particular, the moduli set M1 $\{2^n-1,\ 2^n,\ 2^n+1\}$ has been extensively investigated, especially regarding the realization of RNS-to-binary converters [3]–[6], [11]. Recently, another powers-of-two related three-moduli set M2, viz., $\{2^n,\ 2^n-1,\ 2^{n-1}-1\}$, was also proposed [7], [8], [11], [17].
In this paper, we propose a new three-moduli set M4 $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$. This moduli set can be considered as derived from the four-moduli set M3 due to Vinod and Premkumar [9], viz., $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n+1}-1\}$, by removing the modulus $2^n+1$. Three converters have been proposed in this paper for this new moduli set based on the Chinese Remainder Theorem (CRT) and mixed radix conversion (MRC). We estimate the hardware and conversion time requirements of these three proposed designs and compare them with the converters available in the literature for RNS of similar dynamic range.

II. BACKGROUND MATERIAL
There is a need for conversion from residue form to binary form after performing the desired signal processing using moduli processors in an RNS-based signal processor. This can be achieved by using one of two approaches: 1) based on CRT; 2) based on MRC. The binary number $B$ corresponding to given residues $(r_1, r_2, r_3)$ in the RNS $\{m_1, m_2, m_3\}$ can be derived using CRT as

$$B = \left| \sum_{i=1}^{3} M_i \left| \frac{1}{M_i} \right|_{m_i} r_i \right|_{M} \qquad (1)$$

where $M = m_1 m_2 m_3$ and $M_i = M/m_i$ for $i = 1, 2, 3$, and $\left| 1/M_i \right|_{m_i}$ are known as the multiplicative inverses of $M_i$ modulo $m_i$.
In each step of MRC, one mixed radix digit as well as several intermediate results are determined and next, the MRC digits are weighted using (2) to obtain the final decoded number. The decoded number is in the case of MRC expressed in terms of the mixed radix digits $a_1$, $a_2$, $a_3$ as

$$B = a_1 + a_2 m_1 + a_3 m_1 m_2. \qquad (2)$$

III. RNS-TO-BINARY CONVERTERS FOR THE MODULI SET $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$
The new moduli set M4 $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$ is applicable for $n$ odd or even since all the moduli are mutually prime. Evidently, the residues are of word lengths $(n+1)$, $n$, and $n$ bits, respectively. The dynamic range $M$ is $2^n(2^n-1)(2^{n+1}-1)$, which is less than that of the four-moduli set $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n+1}-1\}$ for the same $n$. We denote the residues corresponding to this moduli set $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$ as $(r_1, r_2, r_3)$ and the binary decoded number corresponding to this residue set as $B$.
The RNS-to-binary conversion for this moduli set can be carried out by using CRT or MRC.
Converter I: The well-known MRC technique suggested for this reverse conversion is illustrated in Fig. 1. The various multiplicative inverses $\left|1/2^n\right|_{2^{n+1}-1}$, $\left|1/2^n\right|_{2^n-1}$, and $\left|1/(2^n-1)\right|_{2^{n+1}-1}$ in this converter, denoted as Converter I, can be computed as follows:

$$\left| \frac{1}{2^n} \right|_{2^{n+1}-1} = 2 \qquad (3a)$$

$$\left| \frac{1}{2^n} \right|_{2^n-1} = 1 \qquad (3b)$$

$$\left| \frac{1}{2^n-1} \right|_{2^{n+1}-1} = -2. \qquad (3c)$$
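As a quick numerical check, the following sketch tests the inverse values 2, 1, and -2 (of $2^n$ modulo $2^{n+1}-1$, of $2^n$ modulo $2^n-1$, and of $2^n-1$ modulo $2^{n+1}-1$) for a given $n$. The function name `check_inverses` is hypothetical, and which value carries which label among (3a)–(3c) is an assumption; only the products themselves are checked.

```python
def check_inverses(n: int) -> bool:
    """Return True if 2, 1 and -2 are the claimed inverses for word length n (n >= 2)."""
    m_big = 2 ** (n + 1) - 1    # modulus 2^(n+1) - 1
    m_pow = 2 ** n              # modulus 2^n
    m_small = 2 ** n - 1        # modulus 2^n - 1

    # 2 * 2^n = 2^(n+1), which is 1 modulo 2^(n+1) - 1
    ok_a = (2 * m_pow) % m_big == 1
    # 2^n is 1 modulo 2^n - 1, so its inverse is 1
    ok_b = (1 * m_pow) % m_small == 1
    # -2 * (2^n - 1) = -(2^(n+1) - 1) + 1, which is 1 modulo 2^(n+1) - 1
    ok_c = (-2 * m_small) % m_big == 1
    return ok_a and ok_b and ok_c


if __name__ == "__main__":
    print(all(check_inverses(n) for n in range(2, 33)))
```

The case $n = 1$ is excluded because the modulus $2^n - 1$ then equals 1 and the set degenerates.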
It can be verified easily that (3a)–(3c) are correct. The implementation of the MRC algorithm of Fig. 1 using the various multiplicative inverses given in (3a)–(3c) follows the architecture given in Fig. 2.

Fig. 1. MRC for the three-moduli set $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$.
Fig. 2. Architecture of the new converter I.

In what follows, $r_1$, $r_2$, and $r_3$ are the residues modulo $2^{n+1}-1$, $2^n$, and $2^n-1$, respectively. The various modulo subtractors can make use of the well-known property that, modulo $2^k-1$, the negative of a $k$-bit number is its one's complement. The subtraction $|r_3 - r_2|_{2^n-1}$ can be realized by one's-complementing $r_2$ and adding it to $r_3$ using a modulo $(2^n-1)$ adder in the block MODSUB1. The mixed radix digit $a_2$ is thus the already available $|r_3 - r_2|_{2^n-1}$ since $\left|1/2^n\right|_{2^n-1} = 1$. The subtraction $|r_1 - r_2|_{2^{n+1}-1}$ involves two numbers of different word length: $r_1$ of $(n+1)$ bits and $r_2$ of $n$ bits. By appending a 1-bit most significant bit (MSB) of zero, $r_2$ can be considered as an $(n+1)$-bit word. Thus, the one's complement of this word can be added to $r_1$ using an $(n+1)$-bit modulo adder in the block MODSUB2.
Next, $|2(r_1 - r_2)|_{2^{n+1}-1}$ can be obtained by circularly left shifting the already obtained $|r_1 - r_2|_{2^{n+1}-1}$ by 1 bit. The computation of $|2(r_1 - r_2) - a_2|_{2^{n+1}-1}$ can be carried out as explained before in the case of $|r_1 - r_2|_{2^{n+1}-1}$, since the first operand is $(n+1)$-bit wide and $a_2$ is $n$-bit wide, using the block MODSUB3. Next, the multiplication of this difference with $\left|1/(2^n-1)\right|_{2^{n+1}-1} = -2$ to obtain the mixed radix digit $a_3$ [see (3c)] is carried out by first left circular shifting the difference by 1 bit and one's complementing the bits in the result.
The last stage in the converter shall compute $B$ using (2), with the moduli taken in the order $2^n$, $2^n-1$, $2^{n+1}-1$:

$$B = r_2 + 2^n a_2 + 2^n (2^n-1)\, a_3 \qquad (4)$$

where $r_2$, $a_2$, and $a_3$ are the mixed radix digits. Note, however, that since the $n$ least significant bits of $B$ are given by $r_2$, we need to compute only $B'$ (the $(2n+1)$ MSBs of $B$)

$$B' = \frac{B - r_2}{2^n}. \qquad (5)$$

From (4) and (5), we have

$$B' = a_2 + (2^n-1)\, a_3 = 2^n a_3 + a_2 - a_3. \qquad (6)$$

With $a_3$ an $(n+1)$-bit word and $a_2$ an $n$-bit word, the three operands to be added to obtain $B'$ are the word $2^n a_3$, the word $a_2$, and the one's complement of $a_3$ preceded by $n$ ones, together with a least significant bit (LSB) of "1." These three words can be simplified as two $(2n+1)$-bit words since the first and second words never have a nonzero bit in the same position. These two words can be added using a $(2n+1)$-bit CPA (CPA1 in Fig. 2). Since $n$ bits in one operand being added are "one," the corresponding full adders can be replaced by pairs of exclusive NOR (EXNOR) and OR gates.
The modulo adders can be realized using one's-complement adders (CPA with end-around carry) or by using special designs due to Efstathiou et al. [10]. In this brief, we consider the use of one's-complement adders.
The hardware requirements for this converter are thus an $n$-bit modulo adder for MODSUB1, an $(n+1)$-bit modulo adder each for MODSUB2 and MODSUB3, and the $(2n+1)$-bit CPA1. The total hardware requirement and conversion time are presented in Table I.
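The MRC steps of Converter I can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the circuit itself: the function name `mrc_convert`, the argument order, and the digit names `a2` and `a3` are hypothetical, and ordinary modular arithmetic stands in for the one's-complement adders and circular shifts of the hardware.

```python
def mrc_convert(r_big: int, r_pow: int, r_small: int, n: int) -> int:
    """Decode residues modulo (2^(n+1)-1, 2^n, 2^n-1) by mixed radix conversion.

    r_big, r_pow, r_small are the residues modulo 2^(n+1)-1, 2^n and 2^n-1.
    """
    m_big = (1 << (n + 1)) - 1
    m_small = (1 << n) - 1

    # MODSUB1: the inverse of 2^n modulo 2^n - 1 is 1, so the difference
    # is already the second mixed radix digit.
    a2 = (r_small - r_pow) % m_small

    # MODSUB2, then multiply by 2 (a 1-bit circular left shift in hardware).
    t = (2 * ((r_big - r_pow) % m_big)) % m_big

    # MODSUB3, then multiply by -2 (circular left shift plus one's complement).
    a3 = (-2 * ((t - a2) % m_big)) % m_big

    # Upper word B' = a2 + (2^n - 1) * a3; the n LSBs of B are r_pow itself.
    b_high = a2 + m_small * a3
    return (b_high << n) | r_pow


if __name__ == "__main__":
    print(mrc_convert(73, 27, 22, 6))
```

Called with the residues 73 (mod 127), 27 (mod 64), and 22 (mod 63) for $n = 6$, the intermediate digits are `a2 = 58` and `a3 = 59`.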
Converter II: We next consider the use of CRT for deriving the converter II. As before, $r_1$, $r_2$, and $r_3$ are the residues modulo $2^{n+1}-1$, $2^n$, and $2^n-1$, respectively. Note that the various multiplicative inverses needed in the computation of (1) are as follows:

$$\left| \frac{1}{2^n(2^n-1)} \right|_{2^{n+1}-1} = -4, \qquad \left| \frac{1}{(2^{n+1}-1)(2^n-1)} \right|_{2^n} = 1, \qquad \left| \frac{1}{2^n(2^{n+1}-1)} \right|_{2^n-1} = 1.$$

Substituting these in (1), the desired binary word $B$ can be obtained as

$$B = \left| -4 \cdot 2^n (2^n-1)\, r_1 + (2^{n+1}-1)(2^n-1)\, r_2 + 2^n (2^{n+1}-1)\, r_3 \right|_{2^n (2^n-1)(2^{n+1}-1)}. \qquad (7)$$

TABLE I
COMPARISON OF VARIOUS CONVERTERS FOR THREE-MODULI SETS

Subtracting $r_2$ from both sides of (7) and dividing by $2^n$, we have

$$B' = \frac{B - r_2}{2^n} = \left| (2^n-1)\, r_1' + (2^{n+1}-3)\, r_2 + (2^{n+1}-1)\, r_3 \right|_{(2^{n+1}-1)(2^n-1)} \qquad (8)$$

where $r_1' = |-4 r_1|_{2^{n+1}-1}$ can be obtained by circularly rotating $r_1$ by two bits to the left (to compute $|4 r_1|_{2^{n+1}-1}$) and taking the one's complement (to take into account the negative sign). We can compute (8) by adding the various terms using a CSA chain followed by a modulo $(2^{n+1}-1)(2^n-1)$ adder. Writing $r_1'$, $r_2$, and $r_3$ in terms of their individual bits, we manipulate (8) by rewriting the negative terms in terms of one's complements of similar word length and adding a correction term, as in (9).
The first seven terms of (9) can be rewritten so as to reduce the number of words to be added to four, as shown in (10). The sign bit has not been included in the last term in (10); this term is negative when an additional MSB of "1" is taken into account. These terms can be rewritten as the five words shown in (11). The MSBs shown in italics in (10) have been combined with the fourth word, whose MSBs are zero. Further, the LSB "1" of one word has been absorbed in the LSB position of another word, which was zero. Note that the sum of these five words in (11) is bounded, so that subtracting $(2^{n+1}-1)(2^n-1)$ at most twice brings it into the required range. The five words in (11) can be added using a carry save adder (CSA) comprising three stages CSA1, CSA2, and CSA3, followed by a modulo $(2^{n+1}-1)(2^n-1)$ adder as shown in Fig. 3(a). This modulo adder can be of parallel type. First, using CPA2, the CARRY and SUM output vectors of CSA3 can be added.
The subtraction of $(2^{n+1}-1)(2^n-1)$ and of $2(2^{n+1}-1)(2^n-1)$ can be carried out by two three-input adders, by adding the CARRY and SUM vectors of CSA3 with the two's complement of $(2^{n+1}-1)(2^n-1)$ and of $2(2^{n+1}-1)(2^n-1)$, and selecting the correct result using a 3:1 multiplexer based on the sign bits of the results of the adders. Full adders are needed for each of CSA4 and CSA5 and for each of CPA3 and CPA4, in addition to those of the CSA computing the SUM and CARRY vectors (CSA1, CSA2, CSA3) and of the carry propagate adder CPA2. The total hardware requirement and conversion time are presented in Table I.

Fig. 3. Implementations of converters II and III based on CRT.
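The CRT route can be sketched at the arithmetic level as below. This is a sketch under stated assumptions: it evaluates the upper word as $|(2^n-1)\,r_1' + (2^{n+1}-3)\,r_2 + (2^{n+1}-1)\,r_3|$ modulo $(2^{n+1}-1)(2^n-1)$, a form derived here from the inverse values -4, 1, and 1, and it uses a plain `%` reduction in place of the CSA chain, subtractors, and multiplexer. The names `rotl` and `crt_convert` are hypothetical.

```python
def rotl(x: int, k: int, width: int) -> int:
    """Rotate the width-bit word x left by k bits (multiplies by 2^k modulo 2^width - 1)."""
    mask = (1 << width) - 1
    k %= width
    return ((x << k) | (x >> (width - k))) & mask


def crt_convert(r_big: int, r_pow: int, r_small: int, n: int) -> int:
    """Decode residues modulo (2^(n+1)-1, 2^n, 2^n-1) using the CRT-derived sum."""
    width = n + 1
    m_big = (1 << width) - 1
    m_small = (1 << n) - 1
    m_high = m_big * m_small            # modulus of the upper word B'

    # |-4 * r_big| modulo 2^(n+1)-1: rotate left by two bits, then take the
    # one's complement (XOR with the all-ones mask).
    r_neg4 = rotl(r_big, 2, width) ^ m_big

    total = m_small * r_neg4 + (2 * (1 << n) - 3) * r_pow + m_big * r_small
    b_high = total % m_high             # hardware: subtract m_high up to twice
    return (b_high << n) | r_pow


if __name__ == "__main__":
    print(crt_convert(73, 27, 22, 6))
```

For the residues 73, 27, and 22 with $n = 6$, the rotated and complemented word `r_neg4` is 89 and the unreduced sum is 11776.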
Converter III: Note that the hardware requirement of the converter II can be reduced slightly by using ROM, as will be shown next. It may be noted that the sum of the five words in (11), excluding the two MSBs of three of the words and the four MSBs of a fourth word (shown in bold face), is bounded. The three 2-bit MSB words highlighted in bold can be summed to form an address, and a ROM containing ten locations (since the sum of three 2-bit words lies between 0 and 9) supplies the term that needs to be added to the remaining bits, as given by (12).
The value of this term can be read from the ROM using a four-bit address and added to the five words in the CSA as shown in Fig. 3(b), and then only one subtraction of $(2^{n+1}-1)(2^n-1)$ is needed. Note that in (12), the term 15 is added to take into account the four MSBs of the fourth word and the 16 accounts for its negative value, considering the sign bit also. The ROM size can be reduced by noting that only a few entries are distinct: all the other values can be obtained from these by adding a fixed multiple or by appending two bits, either 00, 01, 10, or 11, as MSBs. Thus, combinational logic together with three ROM locations can achieve the conversion.
The five words in (11) and the ROM term thus obtained can be added using a CSA comprising four stages CSA6, CSA7, CSA8, and CSA9, followed by a modulo $(2^{n+1}-1)(2^n-1)$ adder as shown in Fig. 3(b). This modulo adder can be of parallel type. CPA5 adds the CARRY and SUM output vectors of CSA9. The subtraction of $(2^{n+1}-1)(2^n-1)$ can be carried out by adding the two's complement of $(2^{n+1}-1)(2^n-1)$ in a three-input adder with the CARRY and SUM output vectors of CSA9, and selecting the correct result using a 2:1 multiplexer based on the sign bit of the result of the modulo adder. Full adders are needed for CSA10 and CPA6, in addition to those of the CSA computing the SUM and CARRY vectors (CSA6, CSA7, CSA8, and CSA9). The resulting hardware requirement and conversion time are presented in Table I.
Design Example: Consider using the moduli set {127, 63, 64}, i.e., $n = 6$, and the residue set {73, 22, 27}. The decoded binary number shall be 241627. The $B'$ value shall be 3775. First, $|-4 \cdot 73|_{127}$ can be obtained by a left circular shift of 73 by 2 bits to get 38 and taking its one's complement to get 89. Thus, the five words to be added in (11) can be written as in (13). The multiplexer selects the output of the subtractor subtracting 8001 (i.e., $(2^{n+1}-1)(2^n-1)$) to yield $B' = 3775$. In converter III, the terms to the right of the dotted line in (13) need to be added to the word read from the ROM, which is the last word to be added to the five operands to the right of the dotted line. The various words are as given in (14). One subtraction of 8001 from this value gives the desired $B' = 3775$.
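The figures quoted in this example can be traced by hand. The lines below are a sketch, assuming the CRT sum $(2^n-1)\,|-4r|_{2^{n+1}-1} + (2^{n+1}-3)\,r' + (2^{n+1}-1)\,r''$ for the residues modulo 127, 64, and 63, and MRC digits formed with the inverse values 1, 2, and -2; the digit names $a_2$ and $a_3$ are assumptions.

```latex
\begin{align*}
\text{CRT:}\quad
|-4 \cdot 73|_{127} &= |-292|_{127} = 89, \\
B' &= |63 \cdot 89 + 125 \cdot 27 + 127 \cdot 22|_{8001}
    = |11776|_{8001} = 3775, \\[4pt]
\text{MRC:}\quad
a_2 &= |22 - 27|_{63} = 58, \\
|2\,(73 - 27)|_{127} &= 92, \\
a_3 &= |-2\,(92 - 58)|_{127} = |-68|_{127} = 59, \\
B' &= 58 + 63 \cdot 59 = 3775, \\[4pt]
B &= 64 \cdot 3775 + 27 = 241627.
\end{align*}
```

Both routes give the same upper word, and 241627 leaves the residues 73, 22, and 27 modulo 127, 63, and 64.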
TABLE II
COMPARISON OF BINARY-TO-RNS CONVERTERS AND MULTIPLIERS FOR TWO MODULI
IV. EVALUATION OF PROPOSED CONVERTERS
The three new converters proposed in Section III for the moduli set M4 are compared next, regarding hardware and conversion time, with converters available for the two three-moduli sets M1 and M2 and the four-moduli set M3. In the case of the moduli set M1, the converters in [3] and [4] use $2n$-bit CPAs with EAC and the converter in [5] uses $n$-bit CPAs. The converters CI, CII, and CIII in [6] used $n$-bit or $2n$-bit CLAs. The converters for the moduli set M2 [7], [8], [17] used modulo adders due to Efstathiou et al. [10] and CPAs. For the sake of fair comparison, we consider the use of one's-complement adders for all the modulo adders in these as well as in the proposed designs. The resulting hardware and delay requirements are summarized in Table I for the various designs. Evidently, the proposed converter I needs much less hardware than the converters [5], [8], [17], and CII [6]. Further, it has similar hardware requirements to the converters [3], [4], and CI and CIII [6]. It, however, needs more conversion time than the converters [3]–[6], [8], [17]. All the proposed converters are superior to the four-moduli converters [9], [12], [13] in hardware as well as conversion time requirements. The new converters II and III have less conversion time than the converters [3], [4], CI [6], [8], [17] and the proposed converter I, at the cost, however, of much larger hardware. It can be seen that the proposed converters are in some cases superior to other converters either in hardware requirement or in conversion time.
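The one's-complement (end-around-carry) adders assumed for the modulo adders in this comparison can be sketched as follows. This is a behavioural sketch only, with hypothetical function names; it treats the all-ones word as the second representation of zero and maps it to 0 on output, and it subtracts by adding the one's complement of the subtrahend.

```python
def add_mod_eac(a: int, b: int, k: int) -> int:
    """Add two k-bit words modulo 2^k - 1 using an end-around carry."""
    mask = (1 << k) - 1
    s = a + b
    # Fold the carry-out back into the LSB; no second carry can occur.
    s = (s & mask) + (s >> k)
    # The all-ones word also represents zero.
    return 0 if s == mask else s


def sub_mod_eac(a: int, b: int, k: int) -> int:
    """Compute (a - b) modulo 2^k - 1 by adding the one's complement of b."""
    mask = (1 << k) - 1
    return add_mod_eac(a, b ^ mask, k)


if __name__ == "__main__":
    # (22 - 27) modulo 63, as in a 6-bit modulo subtractor
    print(sub_mod_eac(22, 27, 6))
```

Operands of different word length are handled by zero-extending the shorter one to $k$ bits before complementing, as described for the modulo subtractors of Converter I.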
However, the proposed moduli set as well as the moduli set M2 use moduli of the form $2^k-1$ (besides $2^n$), whereas the moduli set M1 has one modulus of the type $2^n+1$. Modulo multiplication [frequently needed in finite-impulse response (FIR) filter implementations] as well as binary-to-RNS conversion need less hardware and less computation time for moduli of the type $2^k-1$ than for moduli of the type $2^n+1$, as shown in Table II. In the evaluation of multipliers presented in Table II, it is assumed that CPAs are used in the modulo adders and that no Booth recoding is employed.
V. CONCLUSION
In this brief, we have considered residue-to-binary conversion for a new three-moduli set $\{2^{n+1}-1,\ 2^n,\ 2^n-1\}$, derived from the four-moduli set of Vinod and Premkumar by deleting one modulus, $2^n+1$. Converter I is derived using the well-known MRC technique. Converters II and III are derived using the well-known CRT, and converter III uses ROM. All three proposed converters have been evaluated regarding hardware and conversion time requirements and compared with other recently described converters for three-moduli sets as well as for the four-moduli set of Vinod and Premkumar, all using uniform moduli length, to bring out the tradeoffs in hardware and conversion times.
REFERENCES
[1] N. S. Szabo and R. I. Tanaka, Residue Arithmetic and Its Applications to Computer Technology. New York: McGraw-Hill, 1967.
[2] M. A. Soderstrand, W. K. Jenkins, G. A. Jullien, and F. J. Taylor, Residue Number System Arithmetic: Modern Applications in Digital Signal Processing. New York: IEEE Press, 1986.
[3] S. J. Piestrak, "A high-speed realization of residue to binary system converter," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 42, no. 10, pp. 661–663, Oct. 1995.
[4] A. Dhurkadas, "Comments on 'A high-speed realization of a residue to binary number system converter'," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 45, no. 3, pp. 446–447, Mar. 1998.
[5] M. Bharadwaj, A. B. Premkumar, and T. Srikanthan, "Breaking the 2n-bit carry propagation barrier in residue to binary conversion for the $\{2^n-1,\ 2^n,\ 2^n+1\}$-moduli set," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 45, no. 9, pp. 998–1002, Sep. 1998.
[6] Y. Wang, X. Song, M. Aboulhamid, and H. Shen, "Adder-based residue to binary number converters for $\{2^n-1,\ 2^n,\ 2^n+1\}$," IEEE Trans. Signal Processing, vol. 50, no. 7, pp. 1772–1779, Jul. 2002.
[7] A. A. Hiasat and H. S. Abdel-Aty-Zohdy, "Residue to binary arithmetic converter for the moduli set $\{2^n,\ 2^n-1,\ 2^{n-1}-1\}$," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 45, no. 2, pp. 204–209, Feb. 1998.
[8] W. Wang, M. N. S. Swamy, M. O. Ahmad, and Y. Wang, "A high-speed residue-to-binary converter for three-moduli $\{2^n,\ 2^n-1,\ 2^{n-1}-1\}$ RNS and a scheme for its VLSI implementation," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 47, no. 12, pp. 1576–1581, Dec. 2000.
[9] A. P. Vinod and A. B. Premkumar, "A residue to binary converter for the 4-moduli superset $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n+1}-1\}$," J. Circuits Syst. Comput., vol. 10, pp. 85–99, 2000.
[10] C. Efstathiou, D. Nikolos, and J. Kalamatinos, "Area-time efficient modulo $2^n-1$ adder design," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 41, pp. 463–466, 1994.
[11] W. Wang, M. N. S. Swamy, M. O. Ahmad, and Y. Wang, "A study of the residue-to-binary converters for the three-moduli sets," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 50, no. 2, p. , Feb. 2003.
[12] B. Cao, T. Srikanthan, and C. H. Chang, "Efficient reverse converters for the four-moduli sets $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n+1}-1\}$ and $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n-1}-1\}$," Proc. IEE Comput. Digit. Tech., vol. 152, no. 5, pp. 687–696, Sep. 2005.
[13] B. Cao, C. H. Chang, and T. Srikanthan, "New efficient residue to binary converters for 4-moduli set $\{2^n-1,\ 2^n,\ 2^n+1,\ 2^{n+1}-1\}$," in Proc. ISCAS, 2003, vol. 4, pp. 536–539.
[14] G. Bi and E. V. Jones, "Fast conversion between binary and residue numbers," Electron. Lett., vol. 24, pp. 1195–1197, Sep. 1988.
[15] Z. Wang, G. A. Jullien, and W. C. Miller, "An algorithm for multiplication modulo $(2^N-1)$," in Proc. 39th Midwest Symp. Circuits Syst., 1997, pp. 1301–1304.
[16] R. Zimmermann, "Efficient VLSI implementation of modulo $(2^n \pm 1)$ addition and multiplication," in Proc. IEEE Symp. Comput. Arithm., Apr. 1999, pp. 158–167.
[17] W. Wang, M. N. S. Swamy, M. O. Ahmad, and Y. Wang, "A note on 'A high-speed residue to binary converter for three-moduli $\{2^n,\ 2^n-1,\ 2^{n-1}-1\}$ RNS and a scheme for its VLSI implementation'," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 49, no. 3, p. 230, Mar. 2002.