Fractional and Integer Arithmetic using the DSP56000

APR3/D
Rev. 1
Fractional and
Integer Arithmetic
using the
DSP56000 Family of
General-Purpose
Digital Signal
Processors
M o t o r o l a ’ s
H i g h - P e r f o r m a n c e
D S P
T e c h n o l o g y
Table
of Contents
SECTION 1
Introduction
1-1
SECTION 2
2.1 Twos-Complement Fraction
2.1.1 Twos-Complement Integer
2.2 Double-Precision Numbers
2.3 Real Numbers
2.4 Mixed Numbers
2.5 Data Shifting
2.5.1 1-Bit Shifts/Rotates
2.5.2 Multi-Bit Shifts/Rotates
2.5.3 Fast Multi-Bit Shifts
2.5.4 No-Overhead, Dynamic Scaling
(1-Bit Shifts)
2-1
2-4
2-6
2-6
2-10
2-14
2-14
2-15
2-16
3.1 Mixed Numbers
3.1.1 Addition
3.1.2 Subtraction
3.2 Real Numbers
3.2.1 Addition
3.2.2 Subtraction
3-1
3-1
3-3
3-4
3-4
3-6
Data
Representations
SECTION 3
Mixed- and
Real-Number
Addition and
Subtraction
MOTOROLA
2-20
iii
Table
of Contents
SECTION 4
Signed
Multiplication
SECTION 5
Signed Division
iv
4.1 Multiplication of a Signed Fraction with a
Signed Fraction
4.2 Multiplication of a Signed Integer with a
Signed Integer
4.3 Multiplication of a Signed Integer with a
Signed Fraction
4.4 Double-Precision Multiplication
4.5 Double-Precision Multiplication
of Fractions
4.6 Double-Precision Multiplication of
Integers
4.7 Multiplication of a Real Number with a
Real Number
4.8 Multiplication of a Mixed Number with
a Mixed Number
5.1 Division of a Signed Fraction by a
Signed Fraction
5.2 Division of a Signed Integer with a
Signed Integer
5.3 Double-Precision Division
5.4 Real-Number Division
5.5 Mixed-Number Division
5.6 Divide Routines with N ≤ 24 Bits
5.6.1 Positive Operands with
Remainder Where N Is Variable
5.6.2 Positive Operands without
Remainder Where N Is Fixed
4-3
4-4
4-5
4-7
4-7
4-10
4-12
4-16
5-3
5-8
5-10
5-14
5-16
5-18
5-18
5-18
MOTOROLA
Table
of Contents
5.6.3 Signed Operands with Remainder
Where N Is Variable
5-20
5.6.4 Signed Operands without
Remainder Where N Is Fixed
5-21
SECTION 6
REFERENCES
MOTOROLA
Conclusion
6-1
References-1
v
Illustrations
Figure 2-1
Twos-Complement Fraction
2-2
Figure 2-2
DSP56000 Operands
2-3
Figure 2-3
Twos-Complement Integers
2-5
Figure 2-4
Real-Number Format
2-7
Figure 2-5
CONVR Macro Definition
2-8
Figure 2-6
CONVRG Macro Definition
2-8
Figure 2-7
Mixed Number Format
2-11
Figure 2-8
CONVMN Macro
2-11
Figure 2-9
CONVMNG Macro Definition
2-13
Figure 2-10
Multi-Bit Shifts Using REPeat, DO
2-15
Figure 2-11
Multi-Bit Shift Macros
2-16
Figure 2-12
Constant Generation and Multi-Bit Shifts
2-19
Figure 4-1
Signed-Integer Multiplication
4-1
Figure 4-2
Signed-Fraction Multiplication
4-2
Figure 4-3
CONVSISF Routine
4-6
Figure 4-4
Double-Precision Multiplication
4-7
Figure 4-5
MULT48FG Flowchart
4-9
Figure 4-6
MULT48FG Routine
4-10
Figure 4-7
MULT48IG Routine
4-13
MOTOROLA
vii
Illustrations
Figure 4-8
REALMULT Routine
4-14
Figure 4-9
REALMULT Data ALU Programmer’s Model
4-14
Multiplication of Two Mixed Numbers
4-17
Figure 5-1
SIG24DIV Routine
5-4
Figure 5-2
INTDIV Routine
5-10
Figure 5-3
Double-Precision Divide Flowchart
5-11
Figure 5-4
DIV48 Routine
5-13
Figure 5-5
Positive Divide: 48-Bit Operand and Remainder
5-19
Figure 5-6
Positive Divide: N-Bit Quotient without Remainder
5-19
Figure 5-7
Signed Divide: 48-Bit Operand and Remainder
5-20
Figure 5-8
Signed Divide: N-bit Quotient with Remainder
5-21
Figure 4-10
viii
MOTOROLA
APR3 LOT Page ix Wednesday, December 13, 1995 2:44 PM
List of Tables
Table 2-1
Fast 4-Bit Right Shift
2-17
Table 2-2
Fast 4-Bit Left Shift
2-17
Table 3-1
Positive Mixed Numbers with Sum Less than 128 3-2
Table 3-2
Positive Mixed Numbers with Sum Greater
than 128
3-3
Table 3-3
Mixed-Number Subtraction
3-3
Table 3-4
Mixed-Number Subraction with Negative Result
3-4
Table 3-5
Real-Number Addition
3-5
Table 3-6
Real-Number Addition Using the Exension Bit
3-6
Table 3-7
Real-Number Subtraction with a Positive Result
3-7
Table 3-8
Real-Number Subtraction with a Negative Result 3-7
Table 4-1
Signed-Fraction Multiplication
4-3
Table 4-2
Signed-Integer Multiplication
4-4
Table 4-3
Signed-Integer and Signed-Fraction Multiplication 4-5
Table 4-4
Double-Precision Fractional Multiplication
4-8
Table 4-5
Double-Precision Integer Multiplication
4-11
Table 4-6
Real-Number Multiplication
4-12
Table 4-7
Multiplication of Two Negative Real Numbers
4-15
Table 4-8
Multiplication of Two Mixed Numbers
4-17
MOTOROLA
ix
APR3 LOT Page x Wednesday, December 13, 1995 2:44 PM
List of Tables
Table 5-1
Signed-Fraction Division
5-6
Table 5-2
Contents of Accumulator After Signed-Fraction
Division Iterations
5-7
Table 5-3
Signed-Integer Division
5-8
Table 5-4
Contents of Accumulator After Each
Signed-Integer Division Iteration
5-9
Table 5-5
Double-Precision Division
5-12
Table 5-6
Result of Real-Number Division Using
DIV24 Routine
5-15
Result of Real-Number Division Using
DIV48 Routine
5-15
Result of Mixed-Number Division Using
SIG24DIV Routine
5-16
Contents of Accumulator After Each
Mixed-Number Division Iteration
5-17
Table 5-7
Table 5-8
Table 5-9
x
MOTOROLA
SECTION 1
Introduction
“Data
representations
are not as
difficult as they
may first
appear. The
basic difference
between data
representations
is a shifting or
scaling
operation.”
The
DSP56000 Family of general-purpose digital
signal processors (DSPs) is distinctive in that the onchip multiplier directly supports fractional data formats
and indirectly supports integer data formats. This application note discusses using the DSP56000/
DSP56001 processors to perform arithmetic operations on data represented as integers, fractions, and
combinations thereof, namely mixed numbers, real
numbers, or floating-point numbers. A fractional data
representation was chosen for the DSP56000/
DSP56001 for the following reasons:
• The most significant product (MSP) of a
multiplication has the same format as the input and
can be immediately used as an input to the multiplier
without a shifting operation.
• The least significant product (LSP) of a multiplication
can be rounded into the MSP naturally: i.e., without
having to handle a changing exponent.
• All floating-point formats use fractional mantissas
• Coefficients in digital filters are output as fractions
by high-level filter-design software packages.
MOTOROLA
1-1
From a hardware point of view, this decision had its
primary impact on the design of the multiplier (see
SECTION 4 Signed Multiplication). Since the format of the resultant operands from addition or
subtraction operations is unchanged from the format of the input operands, the choice of integer or
fractional formats does not impact the design of the
arithmetic logic unit (ALU).
Many data representations will be defined, including twos-complement fractional and integer
numbers, real and mixed numbers, and double-precision (48-bit) numbers (see SECTION 2 Data
Representations). Signed Multiplication and
Signed Division (SECTIONs 4 and 5, respectively) discuss multiplication and division using these
data representations.
Data representations are not as difficult as they
may first appear. The basic difference between
data representations is a shifting or scaling operation. Performing shifting operations with the
DSP56000 Family of processors is discussed in
SECTION 2.5 Data Shifting. Mixed-and Realnumber Addition And Subtraction (Section 3)
discusses operations involving more than simple
ADD and SUB instructions. Division-yielding quotients and remainders that are not word (24-bit)
multiples are described in SECTION 5.6 Divide
Routines With N ≤ 24 Bits.
■
1-2
MOTOROLA
SECTION 2
Data Representations
“The DSP56000
Family
processors
provide four
distinct ways to
perform data
shifts . . .”
Different data representations are introduced and
discussed in the following paragraphs. The following
sections also present routines, where required, that
convert numbers from one data representation to
another.
2.1 Twos-Complement
Fraction
A fraction, F, is any number whose magnitude satisfies the inequality:
0.0 ≤ mag(F) < 1.0
Examples of fractions are 0.25 and -0.87. The twoscomplement fractional data representation is shown
in Figure 2-1. The binary word is interpreted as having
a binary point after the most significant bit (MSB). The
range of numbers that can be represented using N-bit
twos-complement fractional data is:
-1.0 ≤ F ≤ 1-2-(N-1)
MOTOROLA
2-1
s.f f f f f f f f f f f f f f f f f f f f f f f f f
N bits
where:
s = the sign bit
f = a fractional bit
. = the implied binary point
N = the number bits
Figure 2-1
Twos-Complement Fraction
This is represented in a register by a sign bit followed by the
fractional number which is less than one.
DSP56000/DSP56001 processors use a fractional
data representation for all arithmetic operations.
Figure 2-2 shows the bit weighting and register
names for words, long words, and accumulator operands in the DSP56000/DSP56001 processors.
For words and long words, the most negative number that can be represented is -1.0 whose internal
representation is $800000 and $800000000000,
respectively. The “$” sign denotes a hexadecimal
value. The most positive word is $7FFFFF or 1-2-23
= 0.9999998, and the most positive long word is
$7FFFFFFFFFFF or 1-2-47 =0.999999999999993.
These limits apply to data stored in memory and to
data stored in the data ALU input pipeline registers. The accumulators, A and B, have 8-bit extension registers, A2 and B2, respectively. This
extension allows word growth so that the most posi-
2-2
MOTOROLA
tive and negative numbers that can be represented
in the accumulators are +255.999999999999993
and -250.0, respectively.
Word Operand (24 bits)
Weighting = -20
.
2-23
Registers: X1 X0 Y1 Y0 A1 A0 B1 B0
Long Word Operand (48 bits)
Weighting =
-20
2-24
2-47
.
Registers: X1:X0=X
Y1:Y0=Y
A1:A0=A1
B1:B0=B10
Accumulator (56 bits)
Weighting = -28
20
2-24
2-47
.
Sign
Extension
Registers:
Figure 2-2
Operand
A = A2:A1:A0 or
Zero
B = B2:B1:B0
DSP56000 Operands
These are fractional number representations that can be one word
long, two words long or two words with eight bits of sign extension
and overflow extension.
An immediate fractional number can be stored in a
general-purpose register (for example, X0) by simply using the MOVE immediate instruction. For
example, execution of:
MOVE #.5,X0
with result in $400000 (0.5) being stored in X0.
MOTOROLA
2-3
2.1.1 Twos-Complement Integer
An integer, I, is a number that contains no decimal
point and its magnitude satisfies the inequality:
1< mag(I)
Examples of integer numbers are 1, 256, and -67.
The twos complement integer data representation
is shown in Figure 2-3. The binary word is interpreted as having a binary point after the least significant
bit (LSB) — that is, the data is right-hand justified.
The range of number that can be represented using
N-bit twos-complement integers is:
-2N-1≤1≤2N-1-1
The MSBs are sign-extension bits. Caution must be
exercised when moving integer data on the
DSP56000/DP56001 because the DSP56000/
DSP56001 will naturally tend to left-hand justify
data (i.e., assuming the data is a fraction).
siiiiiiiiiiiiiiiiiiiiiiiii.
N bits
where:
Figure 2-3
2-4
s = the sign bit
i = an integer bit or sign extension bit
. = the implied binary point
N = the number bits
Twos-Complement Integers
These can be represented in DSP56000 registers but caution must
be exercised when moving data.
MOTOROLA
The DSP56000 cross assembler provides forcing
functions that can be used to facilitate handling integer data, which is best described by way of
example. To store the integer value 56 ($38) in X1,
the programmer would be tempted to use:
MOVE #56,X1
The value stored in X1 would be $380000
(3670016), which is clearly incorrect. The error occurred because the assembler will always pick the
shortest form of instruction encoding. Data less
than eight bits can be encoded in the MOVE instruction without the use of an extension word. The
DSP56000/DSP56001 interprets this data as fractional and therefore stores it left-hand justified as
demonstrated in this example. If the immediate long
force operator, >, is used, an extension word will be
used, and the data will be right-hand justified as the
following example shows:
MOVE # > 56,X1
The content of X1 will be $000038. For integers of magnitude greater than 128, the short addressing is not
applicable since the number will occupy more than
eight bits. The value will therefore be treated as a 24-bit
number by the assembler and encoded into the LSBs
of the extension word. As an example, execution of:
MOVE #1234,X1
will result in X1=$0004D2 (1234).
MOTOROLA
2-5
2.2 Double-Precision Numbers
A double-precision number is a 48-bit twos-complement number, fraction or integer, that is stored as a
long-word operand. The range of a double-precision twos-complement fractional number is:
-1 ≤ double-precision fraction ≤ 1-2-47
The range for a double-precision integer is:
-140737488355328 ≤ double-precision integer ≤140737488355327
2.3 Real Numbers
A real number, R, consists of an integer part and a
fractional part. The decimal point separates the two
parts. Only the integer part has a sign bit, and it may
assume the value zero. The real-number representation discussed in this document consists of a 24bit integer portion and a 24-bit fractional portion
(see Figure 2-4).
47
24
siiiiiiiiiiiiiiiiiiiiiii
where:
Figure 2-4
2-6
s
i
f
.
23
.
0
sffffffffffffffffffffffff
= the sign bit
= an integer bit or sign extension bit
= fractional bit
= binary point
Real-Number Format
This is a concatenation of the twos-complement integer
representation and twos-complement fraction representation.
MOTOROLA
For long-word operands, like X (X1:X0), the integer
part of the real number will occupy the upper 24 bits
of X, X1, and the fractional part with occupy the lower
24 bits, X0. The binary point is assumed to occupy
an imaginary place between bit 23 and bit 24; whereas, the sign bit occupies the leftmost bit of the integer
portion. The range of a real number, R, is:
-8388608.0 ≤ R < 8388607.9999999
Examples of real numbers are:
56.789, 0.345, and -789.123.
The convert to real macro, CONVR, presented in
Figure 2-5, performs the conversion of a real, positive decimal number, xr, to the real-number format
for storage. This macro uses the convert to integer
function, CVI, built into the assembler to convert
real numbers to integers by simply truncating the
fractional part of the number. When the fractional
portion is moved into A0, it will be signed. A left shift
is subsequently performed to eliminate this sign bit.
;CONVR.ASM
;This macro converts a real positive decimal number,
;xr (0.0 < xr < 8388607.9999999), to the real number format.
;The signed integer part is stored in the upper part of the
;A accumulator (A1), and the unsigned fractional part is stored in A0.
CONVR macro xr
;macro definition
clr a
;clear the accumulator
move #(xr-@cvi(xr)),a0
;store fractional part in A0
asl a
;eliminate sign bit in fract. part
move #@cvi(xr),a1
;store the integer part in A1
endm
;end macro definition
Figure 2-5
MOTOROLA
CONVR Macro Definition
This converts a real, positive decimal number to the real number
format.
2-7
;CONVRG.ASM
;This macro converts a real decimal number,
;xr (8388607.9999999 > xr >-8388608.0), to the real number format.
;The signed integer part is stored in the upper part
;of the A accumulator (A1), and the unsigned fractional part is stored in A0.
CONVRG macro xr
;macro definition
clr a
;clear accumulator a
move #(xr-@cvi(xr)),a0
;move the fraction into A0
asl a
#>@cvi (xr),x1
;shift the fraction’s sign bit into A1
move a1,x0
;move the int. into X1, move A1 to X0
move x1,a1
;move the int. to A1
sub x0,a
;convert the integer to one’s complement
endm
endm
Figure 2-6
CONVRG Macro Definition
This converts a real decimal number to the real-number format.
Converting a negative real number to the real-number format involves two steps. First, the absolute
value of the number has to be stored into an accumulator in the real-number format; then the stored
value is negated.
The convert real general macro, CONVRG, depicted in Figure 2-6, handles both positive and negative
operands. The ASL instruction will eliminate the
sign bit in the fractional part.
If the number is positive, the number in register A has
been correctly converted. If xr is negative, the sign bit
of the fractional part will propagate from A0 to A1 due
to the ASL instruction. This sign bit is used to subtract out the one that was added to the integer
portion when it was converted to a twos-complement
number. The single case in which one should not
be subtracted from the integer is when the fraction
2-8
MOTOROLA
portion is zero. In this case, since CVI always returns a zero as positive, zero is subtracted from the
integer, and the result is correct.
A good example to demonstrate the CONVRG
macro is the case where xr=-1.5. After the SUB
X0,A
instruction
has
been
executed,
A=$00:FFFFFE:800000 (-1.5 in the real-number
format). The extension register contains all zeros
even though the real number is negative. That is, bit
47 is the true sign bit for the 48-bit real number. It is
not immediately apparent that $00:FFFFFE:800000
represents -1.5. It becomes apparent after the absolute value of A1:A0 is taken, which is
accomplished by moving the long word, A1:A0, into
accumulator B (so that bit 47 is properly sign extended), and then taking the absolute value of B.
This yields B=$00:000001:800000 (+1.5 in the realnumber format).
In summary, real numbers must be treated as 48-bit
entities — for example, do not take the absolute value of the integer portion only.
2.4 Mixed Numbers
A mixed number (MN) is a special case of a real
number in that the number occupies 24 bits instead
of 48 bits and satisfies the inequality:
-128.0 ≤ MN < 128.0
Examples of mixed numbers are -67.875 and
89.567. As seen in the mixed-number format given
MOTOROLA
2-9
in Figure 2-7, the 24-bit word is divided into two
parts. The first part (which is an 8-bit 1’s complement number) contains the integer portion of the
mixed number including the sign bit related to the
whole 24-bit number. The second part (which is a
16-bit 2’s complement number with the sign bit
shifted into the first part) contains the unsigned fractional portion of the number.
siiiiiii.ffffffffffffffff
24 bits
where:
Figure 2-7
s
i
.
f
= the sign bit
= an integer bit or sign extension bit
= the binary point
= a fractional bit
Mixed Number Format
This is a variation of the real-number format that allows a real
number to reside in a single register.
The virtue of the MN format is that MN-formatted
data can be treated as 24-bit signed data by the machine — that is, the integer and fractional portions
do not have to be treated separately. Two macros
have been prepared to show how real numbers
having a magnitude less than 128 can be stored in
24 bits and set up in the MN format.
2-10
MOTOROLA
;CONVMN.ASM
;This macro converts
;The mixed number is
CONVMN
macro
move
move
mpyr
add
endm
Figure 2-8
a positive decimal mixed number to the MN format.
stored in A1. 0.0 ≤ xmn < 128.0
xmn
;macro definition
#(xmn-@cvi (xmn)),x0
;fractional part to X0
#$010000,y1
;shift constant in y1
y1,x0,a #@cvi (xmn),x1
;shift X0;integer part in X1
x1,a
;concatenate int. and fract.
;end macro definition
CONVMN Macro
This converts a positive decimal mixed number to the MN format.
The first macro, convert to MN, CONVMN, converts
any positive mixed number to the MN format (see
Figure 2-8). The macro does not check the sign or
the magnitude of the input; it is the user's responsibility to make the necessary checks. The fractional
part of the number is shifted to the right by seven
bits, using a shift constant, $010000, and the MPYR
instruction discussed in SECTION 2.5 Data Shifting; therefore, the first eight bits of the 24-bit word
are zero (see Figure 2-8). This instruction ensures
that the fraction is stored in the upper part of the accumulator (A1 in this case) and that it is 16 bits
(rounded). In parallel with the MPYR instruction, the
integer is moved to X1 by using the move immediate short instruction that places the 8-bit signed
integer in the upper eight bits of the 24-bit register,
X1. The ADD X1,A instruction concatenates the integer and fractional parts to form the mixed number
and sign extends the mixed number so that the
MOTOROLA
2-11
number can be immediately used in the data ALU.
A second macro, convert to MN general, CONVMNG, handles signed real numbers having a
magnitude less than 128 (see Figure 2-9). When a
negative number is detected, it is made positive,
then transformed into the MN format, and finally
negated again.
;CONVMNG.ASM
;This macro will convert a decimal number xmn,
;where -128.0 < xmn < 128.0, into the mixed number, MN, format.
;
CONVMNG macro
xmn
;macro definition
move
#(xmn - @cvi(xmn)), x0 ;obtain signed fract. part
clr b
#$080000,y1
;clear b, shift constant in Y1
mpyr
y1,x0,a#@cvi(xmn),x1
;shift fract.;integer in X1
eor
x1,b
;set sign
jeq
_endf
;finished if integer = 0
jpl
_endit
;jump if positive
neg b
#$008000,y1
;int. positive;left shift const.
neg a
b1,x1
;fract. positive; integer in X1
mpy
y1, x1,b
;shift integer and store in B
move
b0,x1
;obtain the shifted integer
add
x1,a
;add the int. to the fract.
neg
a
;negate mixed number entity
jmp
_endf
;jump to the end
_enditadd
b,a
;concatenate positive int.and fr.
_endf
;finished
endm
;end macro definition
Figure 2-9
CONVMNG Macro Definition
This converts a decimal number to the mixed-number format.
Figure 2-9 shows how the detection of the sign of
the number is done using the EOR instruction. The
EOR is performed with accumulator B, which is already zero. The integer will reside in the most
significant byte of B1, and the N bit in the condition
code register (CCR) will be set if the integer was
2-12
MOTOROLA
negative. The negation is now performed by using
the NEG instruction. When the negative B1 is negated to turn its contents into a positive number, the
integer occupies the lowest eight bits of B1. To
move the 8-bit number to the upper eight bits of a
24-bit register, a left shift by 16 bits must be performed by utilizing the ideas presented in the
following section on data shifting. A 16-bit left shift
by the use of a shift constant will force the 8-bit
number to reside in the lower 24 bits, B0, of the destination accumulator B. To concatenate the 8-bit
signed integer with the 16-bit fraction, the number is
moved to X1 and then added to accumulator A containing the unsigned fraction.
2.5 Data Shifting
Data shifting is used in converting one data representation into another data representation. The
DSP56000 Family processors provide four distinct
ways to perform data shifts:
1. 1-bit shifts/rotates
2. multi-bit shifts/rotates
3. fast multi-bit shifts
4. dynamic scaling.
These approaches to shifting are described in the
following sub-sections.
MOTOROLA
2-13
2.5.1 1-Bit Shifts/Rotates
For 1-bit shifts/rotates of either 56-bit accumulator A
(A=A2:A1:A0=8:24:24 bits) or 56-bit accumulator B
(B=B2:B1:B0=8:24:24 bits), use the arithmetic shift
right (ASR) and arithmetic shift left (ASL) instructions. If 1-bit shifts on only the most significant 24-bit
word of accumulator A, A1, or accumulator B, B1 are
required, use the rotate right (ROR), rotate left
(ROL), logical shift right (LSR), or logical shift left
(LSL) instructions.
2.5.2 Multi-Bit Shifts/Rotates
The most straightforward approach for multi-bit
shifts/rotates of the accumulator is to use ASR,
ASL, LSR, LSL, ROR, or ROL instructions with the
repeat instruction, REP, or the hardware DO loop
instruction, with the loop consisting of a single instruction as the examples in Figure 2-10 show. The
repeat instruction is not interruptible; whereas, the
DO instruction is interruptible.
REP #n
ASL A
or
DO #n, END1
ASR A
END1
(where n is the number of
positions to be shifted/rotated)
Figure 2-10 Multi-Bit Shifts Using REPeat, DO
2-14
MOTOROLA
The DSP56000 macro cross assembler supports
macros with the MACRO DEFINITION and MACLIB
directives (see Reference 1). Two accumulator shift
macros, one for left shifts, SHLAC, and the other for
right shifts, SHRAC, can be defined as shown in
Figure 2-11.
;Macros for performing multi-bit shifts right and left.
;For the two given macros
;
Let
acc = accumulator A or B
;
n
= the number of bits to be shifted
;
SHRAC
macro acc,n
;macro definition for shifting the
rep
#n
;accumulator right n bits.
asr
acc
;shift right
endm
;end macro definition
;
SHLAC
macro acc,n
;macro definition for shifting the
rep
#n
;accumulator left n bits
asl
acc
;shift left
endm
;end macro definition
Figure 2-11 Multi-Bit Shift Macros use the ASR or ASL instruction shifts one
bit per instruction cycle.
2.5.3 Fast Multi-Bit Shifts
The fastest way to do multi-bit shifting is to multiply
the operand by a shift constant. In the case of a
right shift, the constant KR is a fraction given by
KR=2-n. The example in Table 2-1 shows how to
shift the content of X0 right by four bits. The shifted
result resides in the upper part of accumulator A,
A1. The code executed to implement the 4-bit right
shift shown in Table 2-1 is:
MOVE #KR,X1
MOTOROLA
MPY X0,X1,A
2-15
Table 2-1 Fast 4-Bit Right Shift
Register
Hexadecimal
Value
Comments
X0
060000
Value To Be Shifted
X1
080000
Shift Constant, KR
A1
006000
Shifted Result
Similarly, the example in Table 2-2 shows how to
shift the content of X0 left by four bits. In the case of
a left shift, the constant KL is an integer given by
KL=2n-1. KL is 2n-1, not 2n, because the DSP56000
multiplier is fractional, thereby automatically implementing a 1-bit left shift. The result for left shifts
resides in A0, the least significant word of A. The
code executed to implement the 4-bit left shift is:
MOVE #>KL,X1
MPY
X0,X1,A
Table 2-2 Fast 4-Bit Left Shift
2-16
Register
Hexadecimal
Value
Comments
X0
060000
Value To Be Shifted
X1
000008
Shift Constant, KL
A0
600000
Shifted Result
MOTOROLA
Generating the constants for the shifts is made
easy by using the POW and CVI functions built into
the DPS56000 macro cross assembler. The raise to
the power function, POW, returns a real number for
any base raised to a real number. For example,
K=@pow(2,-4) returns 0.0625, and
K=@pow(2,+4) returns 16.0
However, because DPS56000 is a fractional machine, the assembler will limit real numbers unless
precautions are taken. In the previous example, the
object code for 16.0 will be limited to $7FFFFF or
+0.999998 decimal by the assembler, which is incorrect. To obtain the integer form of the real
number, the assembler provides a convert to integer function, CVI. The CVI function converts real
numbers to integers by truncating the fractional portions. For example,
@CVI (@pow(2,+4))
returns 16 (not 16.0), which will be assembled as
$000010 in object code.
The previous instruction sequences can be put in
two distinct macros for programming ease. The two
macros, MSHR for multi-bit shifts right and MSHL
for multi-bit shifts left, are listed in Figure 2-12.
MOTOROLA
2-17
;Macro definitions for generating right and left shift constants,
;KR and KL, and performing the right and left shifts.
;
;
Let
s = the source register
;
m = the multiplier register
;
n = the number of bits to be shifted
;
acc= the destination accumulator
;
;
where
s,m can be one of X0,X1,Y0,Y1
;
and
acc can be A or B
;
;
MSHR
macro
s,m,n,acc
;four input variables.
move
#@pow(2,-n),m
;load the multiplier register
mpy
s,m,acc
;shift right n bits
endm
;end macro definition
;
MSHL
macro
s,m,n,acc
;four input variables
move
#>@cvi(@pow(2,n-1)),m ;load the multiplier register
mpy
s,m,acc
;shift left n bits
endm
;end macro definition
Figure 2-12 Constant Generation and Multi-Bit Shifts
This uses the ultiply instruction to perfor multi-bit sht in one
instruction cycle.
The immediate long-data move (note the greater
than sign, >, in the move instruction) must be used
in the MSHL example to prevent the data from being treated as a fraction and shifted accordingly. In
the following example, for:
MOVE #2,X1
X1 will be $020000 because the immediate short data
(i.e., data which can be represented as eight bits) is
treated as an 8-bit fraction occupying the two most significant bytes of the destination register; whereas, for:
MOVE #>2,X1
2-18
MOTOROLA
X1 will be $000002 because the data is treated as
an integer occupying the least significant bytes of
the destination register with sign extension.
The immediate long-data move is a two-word instruction that executes in two instruction cycles;
whereas, the immediate short-data move is a oneword instruction that executes in one instruction cycle. The immediate short-data move can be used
for multi-bit right shifts of less than or equal to eight
bits. For right shifts of more than eight bits, the immediate long-data move must be used with the
appropriate 24-bit fraction, 2**(-n), utilizing the
POW directive.
2.5.4 No-Overhead, Dynamic Scaling
(1-Bit Shifts)
For no-overhead 1-bit shifts of either accumulator A
or B, the scaling mode is used. In this mode the shift
occurs automatically when transferring data from
either of the 56-bit data ALU accumulators, A or B
(not A2, A1, A0, A10 or B2, B1, B0, B10), to the XD
or YD buses. This shift function is activated by appropriately setting the scaling-mode bits in the
status register. This mode is primarily intended for
adding a scaling operation to existing code without
modifying the code (simply setting the scalingmode bits). For more details on the scaling mode,
consult the DSP56000 Digital Signal Processor
User's Manual (see Reference 2) and the DATA
ALU subsection of ADI1290 (see Reference 3). ■
MOTOROLA
2-19
SECTION 3
Mixed- and RealNumber Addition and
Subtraction
“The magnitude
of the negative
real number can
be easily found
by executing the
ABS A
instruction.”
The arithmetic operations of addition and subtraction
performed on mixed and real numbers are discussed
in the following paragraphs.
3.1 Mixed Numbers
Mixed numbers can be represented in a 24-bit word
using the MN format discussed in SECTION 3.2 Real
Numbers. To better understand addition and subtraction of mixed numbers, consider the examples in the
following paragraphs.
3.1.1 Addition
Two examples
considered.
of
mixed-number
addition
are
Example One—The simplest case is the addition of
two positive numbers as shown in Table 3-1. The instruction executed is:
ADD X1,A
MOTOROLA
3-1
Table 3-1 Positive Mixed Numbers with Sum
Less than 128
Register
X1
Hexadecimal Value
43C000
Mixed-Number
Value
67.75
A (before)
00 : 178000 : 000000
23.50
A (after)
00 : 5B4000 : 000000
91.25
Example Two—In this example, the result of the addition will be greater than 128, which is the limit for
24-bit MN-formatted mixed numbers. However, the
status register will signify the use of the extension
part of the accumulator; thus, the exact representation of the mixed number having a magnitude greater
than 128 can be contained in the accumulator. It cannot be stored as a 24-bit word, however, since it
requires more than 24 bits to represent it.
Consider the example shown in Table 3-2. The value of the status register, SR, is $0320, signifying the
E bit, bit 5, has been set. The hexadecimal value of
B1 represents the decimal number 131.0 because
the status register indicates the extension bits of accumulator B are in use; thus, bit 47 of B is not a sign
bit but part of the mixed number.
3-2
MOTOROLA
Table 3-2 Positive Mixed Numbers with Sum
Greater than 128
Hexadecimal Value
Mixed-Number
Value
464000
70.25
B (before)
00 : 3CC000 : 00
60.75
B (after)
00 : 830000 : 00
131.00
Register
Y1
3.1.2 Subtraction
Mixed-number subtraction is as straightforward as
the mixed-number addition.
Example Three— Consider the case shown in Table 3-3. The instruction executed is:
SUB X0,B
The status register value remains the same.
Table 3-3 Mixed-Number Subtraction
Register
X0
Hexadecimal Value
178000
Mixed-Number
Value
23.50
B (before)
00 : 43C000 : 000000
67.75
B (after)
00 : 2C4000 : 000000
44.25
MOTOROLA
3-3
Example Four— Consider the case depicted in Table 3-4 where the result is negative. The N bit, bit 3,
and the borrow (carry) bit, bit 0, in the status register
are set, which indicates that the result is negative
and that a borrow has occurred. The magnitude of
the negative mixed number can be easily found by
executing the ABS B instruction.
Table 3-4 Mixed-Number Subtraction with
Negative Result
Register
X1
Hexadecimal Value
464000
Mixed-Number
Value
70.25
B (before)
00 : 3CC000 : 000000
60.75
B (after)
FF : F68000 : 000000
-9.50
3.2 Real Numbers
Consider real numbers having the format discussed
in SECTION 2.3 Real Numbers in which the signed
integer occupies the most significant 24-bit word,
and the unsigned fraction occupies the least significant 24-bit word of a 48-bit-long word.
3.2.1 Addition
The numbers to be added should be moved into any
of the acceptable source registers (X, Y, A, B) and
destination registers (A, B) for the 48-bit addition
(see Reference 2). If the sum of the fractional parts
3-4
MOTOROLA
is greater than unity, a carry is propagated into the
integer part. If, after adding the two real numbers,
the integer result cannot be represented in 24 bits,
then the extension part of the accumulator will be
used. Bit 5 in the status register will indicate whether the extension bits are in use.
Example One—Consider the case shown in Table
3-5. The instruction executed is:
ADD X,A
Although the real-number source may have been
saved in A10, bit 47 and A2 must represent proper
sign extension if the C bit in the status register is to
be set correctly, which is necessary when doing
multiple-precision arithmetic. The fraction parts in
X0 and A0 are unsigned.
Table 3-5 Real-Number Addition
Register
X
Hexadecimal Value
Real-Number
Value
000237 : C00000
567.750
A (before)
00 : 0003DB : A00000
987.625
A (after)
00 : 000613 : 600000
1555.375
Example Two— Consider the case shown in Table
3-6. If the first bit of the result is interpreted as a sign
bit, the decimal value of A1 is not 8389160 but is 8388056. The reason that the hexadecimal value in
A represents the correct result (i.e., +8389160) is
MOTOROLA
3-5
because the extension bit, bit 5 in the status register,
is set. This fact indicates that the extension bits of
accumulator A are in use; therefore, the sign bit is
not the left-most bit (bit 47) of A1 but is the left-most
bit (bit 55) of the extension register, A2.
Table 3-6 Real -Number Addition Using
the Extension Bit
Register
X
Hexadecimal Value
000237 : C00000
Real-Number
Value
567.750
A (before)
00 : 7FFFF0 : A00000
8388592.625
A (after)
00 : 800228 : 600000
8389160.375
3.2.2 Subtraction
The subtraction of real numbers is similar to the addition of real numbers.
Example Three— The case shown in Table 3-7 generates a positive result. The instruction executed is:
SUB X,A
3-6
MOTOROLA
Table 3-7 Real-Number Subtraction with a Positive Result
Register
X
Hexadecimal Value
Real-Number
Value
000138 : C00000
312.750
A (before)
00 : 00037A : 400000
890.250
A (after)
00 : 000241 : 800000
577.500
Example Four— The case depicted in Table 3-8
generates a negative result. The N bit, bit 3, and the
C bit, bit 0, in the status register are set, indicating
the result is negative and a borrow has occurred.
The magnitude of the negative real number can be
easily found by executing the ABS A instruction.
Table 3-8 Real-Number Subtraction with a
Negative Result
Register
X
Hexadecimal Value
Real-Number
Value
00037A : 400000
890.250
A (before)
00 : 000138 : C00000
312.750
A (after)
FF : FFFDBE : 800000
-577.500
■
MOTOROLA
3-7
APR3Section4 Page 1 Wednesday, December 13, 1995 2:52 PM
SECTION 4
Signed Multiplication
“Integer or
fractional
multiplication
can be
accomplished
on any signed
hardware
multiplier by
appropriate
shifting.”
Consider
the multiplication of two signed-integer
numbers (see Figure 4-1). The product of the signed
multiplier is 2N-1 bits long. To keep the product
properly normalized and to further process the
product, it is advantageous to format the product as
multiple operand words. Therefore, there is an extra
bit because two sign bits exist before multiplication
and only one exists after the multiplication. Integer
multipliers use the extra bit as a sign-extension bit.
Multiplication of signed fractions is shown in Figure 4-2.
As is the case for signed-integer multiplication, the
result of the multiplication is a 2N-1 bit word, including
the sign bit. In this case, the extra bit is appended to the
LSP as a zero in the LSB position. This bit is called the
zero-fill bit.
N Bits
N Bits
.
S
S
.
Signed Multiplier
S
S
2N-1 Bit Product
Sign Extension
2N Bits
Figure 4-1
MOTOROLA
Signed-Integer Multiplication generates a duplicated sign bit
4-1
APR3Section4 Page 2 Wednesday, December 13, 1995 2:52 PM
N Bits
N Bits
S.
S.
Signed Multiplier
S . Most Significant Product
Least Significant Product
0
2N-1 Bit Product
Zero Fill
2N Bits
Figure 4-2
Signed-Fraction Multiplication generates a single sign bit and
one bit of zero filling
In summary, signed-integer and signed-fractional
multipliers differ only in the way in which they treat
the extra bit. In the integer case, the bit is used for
sign extension; whereas, in the fractional case, it is
used as zero fill. Integer or fractional multiplication
can be accomplished on any signed hardware
multiplier by appropriate shifting. The following
paragraphs discuss performing fractional, integer,
mixed-number, and real-number multiplications
using the DSP56000 Family of processor.
4-2
MOTOROLA
APR3Section4 Page 3 Wednesday, December 13, 1995 2:52 PM
4.1 Multiplication of a
Signed Fraction with a
Signed Fraction
Let the values of the 24-bit general-purpose
registers X1 and X0 be as shown in Table 4-1. After
executing MPY X0,X1,A on the DSP56000/
DSP56001, the content of accumulator A is as
shown in Table 4-1. The last bit of the accumulator
is zero, and the first bit carries the sign of the
product. When accumulator A is rounded to 24 bits
using the instructions RND or MPYR X0,X1,A, the
value in A is $00:009D99:000000 (see Table 4-1).
The lower 24 bits, A0, are zeros, and the eight signextension bits, A2, of the 56-bit accumulator are
zeros, indicating a positive number.
Table 4-1 Signed-Fraction Multiplication
Register
Hexadecimal Value
Real-Number Value
X0
0647D9
+0.049067616462708
X1
0C8BD3
+0.098017096519470
A
00 : 009D98 : B815B6
+0.004809465298806
A (RND)
00 : 009D99 : 000000
+0.004809498786926
MOTOROLA
4-3
APR3Section4 Page 4 Wednesday, December 13, 1995 2:52 PM
4.2 Multiplication of a
Signed Integer with a
Signed Integer
Consider the case represented in Table 4-2 in
which two signed integers in X0 and X1 are
multiplied, and the result is stored in accumulator A.
It can be seen from Table 4-2 that if the contents of
X0, X1, and A are interpreted as fractions, the result
is correct. However, if the contents of X0, X1, and A
are interpreted as integers, then a shift is required
immediately after the multiplication to obtain the
correct results. This shift moves the LSB out of the
accumulator and adds a sign-extension bit in the
MSB position. Therefore, the instruction sequence
to perform integer multiplication on DSP56000/
DSP56001 processors is a multiplication followed
by a right shift, namely,
MPY X0,X1,A
ASR A
Table 4-2 Signed-Integer Multiplication
Register
Hexadecimal Value
X0
000002
2
2.3841858E-07
X1
000138
312
3.7193298E-05
1248
8.8675733E-11
A
4-4
00 : 000000 : 0004E0
Integer Value
Fractional Value
MOTOROLA
APR3Section4 Page 5 Wednesday, December 13, 1995 2:52 PM
Table 4-3 Signed-Integer and Signed-Fraction Multiplication
Register
Hexadecimal Value
X0
400000
X1
00007F
A10
00003F : 800000
A1
00003F
A0
800000
Integer Value
Fractional Value
0.5000000
127
63
0.5 (unsigned)
4.3 Multiplication of a
Signed Integer with a
Signed Fraction
Multiplication of an integer with a fractional number
is a unique case since the result will be a real
number — i.e., it will consist of an integer and a
fractional part. When the contents of X0 and X1 are
as shown in Table 4-3, execution of the instruction
MPY X0,X1,A will result in A1=$00003F and
A0=$800000.
The integer part will be stored in the upper 24 bits,
A1, of the 48-bit result, and the fractional part will
reside in the lower 24 bits, A0, of the result. A0 is
being interpreted as an unsigned fraction. When
performing multiple-precision arithmetic on real
numbers, it is necessary to convert real numbers
into a signed-integer operand and a signed-fraction
MOTOROLA
4-5
APR3Section4 Page 6 Wednesday, December 13, 1995 2:52 PM
operating (see SECTION 4.4 Double-Precision
Multiplication). To format A0 as a twoscomplement positive fraction, two shift operations
must be performed, LSLA followed by ASRA. The
execution of LSLA shifts the MSP, A1, one bit left
and inserts a zero in the LSB position of A1. The
execution of ASRA shifts the full 56-bit accumulator
A one bit right, thereby restoring A1 and forming a
positive twos-complement fraction in A0. If the
product of the multiplication is negative, then
introducing the sign bit in the fractional part involves
three steps. First, the absolute value of the number
must be obtained. Second, the shift LSLA followed
by ASRA should be performed to generate a signed
twos-complement fraction. Finally, the negative
values of both parts, integer and fractional, must be
obtained separately. The convert to signed integer
and signed fraction routine, CONVSISF, given in
Figure 4-3, implements these three steps.
;CONVSISF.ASM
;This routine will convert a negative 56-bit number in the real number format
;(with a signed integer in A2:A1 and an unsigned fraction in A0)
;to a signed integer in A1 and a signed fraction in B1
;
abs a
;obtain the absolute value of the result
lsl a
;shift left to introduce sign bit
asr a
;introduce positive sign in fractional part
move a0,b
;move positive fraction to B1
neg b #0,a0 ;negate fraction, clear A0
neg a
;negate integer
Figure 4-3
4-6
CONVSISF Routine converts a negative 56-bit number to a
signed integer and signed fraction
MOTOROLA
APR3Section4 Page 7 Wednesday, December 13, 1995 2:52 PM
4.4 Double-Precision
Multiplication
In double-precision multiplication, two 48-bit
numbers are multiplied together to generate a 96bit signed product. The concept of double-precision
multiplication is depicted in Figure 4-4. When two
48-bit numbers P and Q (where P1 and P0 are the
most significant and least significant 24-bit words,
respectively, of P and, similarly, Q1 and Q0 for Q)
are multiplied, four single-precision products,
P0Q0, P1Q0, P0Q1, and P1Q1, are generated.
These products must be added with the proper
weighting to yield the correct result, R3:R2:R1:R0.
P1 : P0 x
Q1 : Q0
Q0P0
P1Q0
Q1P0
P1Q1
+
R3:R2:R1:R0
Figure 4-4
Double-Precision Multiplication can be performed by a set of
partial multiplications and additions
4.5 Double-Precision
Multiplication of Fractions
The flowchart for the 48-bit general fraction
multiplication routine, MULT48FG, is given in
Figure 4-5. To compensate for the fact that signed
multiplications are performed, a trick is used. The trick
is to force bits 23 of P0 (P0(#23)) and Q0 (Q0(#23)) to
MOTOROLA
4-7
APR3Section4 Page 8 Wednesday, December 13, 1995 2:52 PM
zero before performing the Q0P0, P1Q0, and Q1P0
multiplications and then to adjust the result if P0(#23)
and/or Q0(#23) were set. The least significant word of
the adjusted intermediate product, LS(IP), is
concatenated with the least significant 48 bits of the
result, R1:R0. IP is then shifted right by 24 bits to
weight the MSB of IP correctly before performing the
P1Q1 multiplication and accumulation. When A2 is
moved in A1, A1 is sign extended. After the P1Q1
multiplication, A contains the sign-extended result,
R3:R2. This routine executes in 27 cycles if both
P0(#23) and Q0(#23) are set and in 26 cycles if both
are zero. Listed in Figure 4-6, MULT48FG performs
double-precision signed multiplication of fractions.
Consider the multiplication of two 48-bit fractions
stored in X and Y as shown in Table 4-4. The result is
stored in accumulators A and B. The upper 48 bits of
the 96-bit result are stored in A10, and the lower 48
bits are stored in B10 (see Table 4-4). A2 contains the
sign extension of A10. The fractional result in decimal
form is obtained after concatenating the two results,
A0:B10, as indicated.
Table 4-4 Double-Precision Fractional Multiplication
Register
Hexadecimal Value
Fractional Value
X
345678 : FFFFFF
0.408888936042779
Y
006789 : 7FFFFF
0.003159701824181
A10
002A55 : CE41FA
B10
9683FB : 000002
A10:B10
4-8
0.001291967117102
MOTOROLA
APR3Section4 Page 9 Wednesday, December 13, 1995 2:52 PM
Start
Start
P1
P1
Q1
Q1
X1
X1 P0’
P0’
Y1
Y1 Q0’
Q0’
==
==
P0(#23)=0
P0(#23)=0
Q0(#23)=0
Q0(#23)=0
P0’Q0’
P0’Q0’
MS(P0’Q0’)
MS(P0’Q0’)
BB
AA
P1Q0’+MS(P0’Q0’)
P1Q0’+MS(P0’Q0’)
P0’Q1’
P0’Q1’++AA
X0
X0
Y0
Y0
AA
AA
AA++Q1:Q0’
Q1:Q0’ififP0(#23)
P0(#23)==11
AA++P1:P0’
P1:P0’ififQ0(#23)
Q0(#23)==11
AA++$000000:800000
AA==IP
$000000:800000
IP
ififboth
P0(#23)
and
Q0(#23)
both P0(#23) and Q0(#23)==11
LS
LS(IP)
(IP)
B1
B1 ; ; B10=B1:B0=R1:R0
B10=B1:B0=R1:R0
Shift
ShiftIP
IPright
right24
24bits
bits
P1Q1
AA
P1Q1++shifted
shiftedIP
IP
AA==sign
extended
R3:R2
sign extended R3:R2
Figure 4-5
MOTOROLA
MULT48FG Flowchart produces a double precision multiply of
two fractions on the DSP56000
4-9
APR3Section4 Page 10 Wednesday, December 13, 1995 2:52 PM
;MULT48FG.ASM
;This routine will execute the multiplication of two 48-bit FRACTIONAL numbers
;that are already stored in memory as follows.
;
x:$Paddr P1
y:$Paddr P0
;
x:$Qaddr Q1
y:$Qaddr Q0
;The initial 48-bit numbers are:
;
P = P1:P0
(24:24 bits)
;
Q = Q1:Q0
(24:24 bits)
;P0 with bit #23 forced to zero is P0’
;Q0 with bit #23 forced to zero is Q0’
;The result, R, is a 96 bit number that is stored in the two
;accumulators A and B as follows:
;
R = R3:R2:R1:R0
;
= A10:B10 (48:48 bits)
;
= A:B10 (sign extended)
move
#paddr,r4
;initialize pointer for P
move
#qaddr,r5
;initialize pointer for Q
move
#$7fffff,x1
;load x1 with mask value
move
y:(r4),a
;load A with P0
and
x1,a
y:(r5),b
;create P0’;Q0 into B
and
x1,b
a1,x0
;create Q0’;P0’ into x0
clr
a
b1,y0
;clear A, Q0’ into y0
mpy
x0,y0,b
x:(r4),x1
;mpy P0’ with Q0’, P1 into x1
move
b1,a0
;most significant word MS (P0’Q0’)to a0
mac
x1,y0,a
x:(r5),y1
;P1 * Q0’ + a into a, Q1 into Y1
mac
x0,y1,a
;P0’* Q1 + a into a
jset
#23,y:(r4),one
;P0(#23)= 1?
jset
#23,y:(r5),two
;Q0(#23)= 1?
jmp
thr
;both P0(#23) and Q0(#23) = 0
one add
y,a
#$000800,y0
;adjust for P0(#23) = 1; load y0
jclr
#23,y:(r5),thr
;both P0(#23) and Q0(#23) = 1?
mac
y0,y0,a
;generate cross term ($400000) and adj.
two add
x,a
;adjust for Q0(#23) = 1 product
thr move
a0,b1
;concatenate R1:R0 in B10
move
a1,x0
;shift accumulator A 24 bits right
move
a2,a
;and sign extend
move
x0,a0
;interm. product (IP) weighted properly
mac
x1,y1,a
;R3:R2 sign extended in A
end
;end of routine
Figure 4-6
MULT48FG Routine implements a double precision multiply of
two fractions on the DSP56000
4.6 Double-Precision
Multiplication of Integers
Double-precision integer multiplication is the same
as double-precision fractional multiplication except
that an ASR instruction needs to be introduced in
4-10
MOTOROLA
APR3Section4 Page 11 Wednesday, December 13, 1995 2:52 PM
the routine. The ASR eliminates the zero-fill bit and
adds a sign-extension bit, thereby converting the
fractional multiplier into an integer multiplier as
discussed in SECTION 4 Signed Multiplication.
The shift right is done in two stages since the result
is 96 bits. The lower 48 bits are shifted first, which
results in a zero in bit 47. The upper 48 bits are
subsequently shifted right with bit 0 going to the carry
bit. If the carry is set, a one is loaded into bit 47 of the
lower 48 bits of the result. The double-precision
multiplication is performed by the 48-bit general
integer multiplication routine, MULT48IG, listed in
Figure 4-7. An example is given in Table 4-5. The
result of the multiplication is stored in the two
accumulators. The 96-bit result can be obtained by
concatenating A10 with B10 (see Table 4-5).
Table 4-5 Double-Precision Integer Multiplication
Register
X
Hexadecimal Value
000006 : 123456
Integer Value
101856342
NOTE: The A10:B10 concatenated result is 4.52943438572962E + 19.
MOTOROLA
4-11
APR3Section4 Page 12 Wednesday, December 13, 1995 2:52 PM
4.7 Multiplication of a Real
Number with a Real
Number
When two real numbers are multiplied together, four
24-bit multiplications must be performed: one integer
with an integer, one fraction with a fraction, and two
fraction with an integer. Both the integer and the
fractional parts must be in the signed twoscomplement format. The result will be 96 bits long; the
most significant 48 bits will be the integer part, and the
least significant 48 bits will be the fractional portion.
To perform a real-number multiplication using the real
multiply routine, REALMULT, the multiplicand, P, is
stored in register X and the multiplier, Q, is stored in
register Y (see Figure 4-8). The signed-integer portion
of the real-number result, Ri, is stored in A10, and the
unsigned fractional part, Rf, is stored in B10. The data
ALU programmer's model for REALMULT is shown in
Figure 4-9 An example is given in Table 4-6.
Table 4-6 Real-Number Multiplication
Register
Hexadecimal Value
X1
00007B
X0
600000
Y1
FFFFB1
Y0
B00000
Integer Value
123
-0.75
-79
-0.625
A10
FF : FFFFFF : FFD982
B10
68 : 000000 : 000000
NOTE: The A10:B10 concatenated result
4-12
Fractional Value
is -9853.59375.
MOTOROLA
APR3Section4 Page 13 Wednesday, December 13, 1995 2:52 PM
;MULT48IG.ASM
;This routine will execute the multiplication of two 48-bit INTEGER numbers
;that are already stored in memory as follows.
;
x:$Paddr P1
y:$Paddr P0
;
x:$Qaddr Q1
y:$Qaddr Q0
;The initial 48-bit numbers are:
;
P = P1:P0
(24:24 bits)
;
Q = Q1:Q0
(24:24 bits)
;P0 with bit #23 forced to zero is P0’
;Q0 with bit #23 forced to zero is Q0’
;The result, R, is a 96 bit number that is stored in the two
;accumulators A and B as follows:
;
R = R3:R2:R1:R0
;
= A10:B10 (48:48 bits)
;
= A:B10 (sign extended)
move
#paddr,r4
;initialize pointer for P
move
#qaddr,r5
;initialize pointer for Q
move
#$7fffff,x1
;load x1 with mask value
move
y:(r4),a
;load A with P0
and
x1,a
y:(r5),b
;create P0’;Q0 into B
and
x1,b
a1,x0
;create Q0’; P0’ into x0
clr
a
b1,y0
;clear A, Q0’ into y0
mpy
x0,y0,b
x:(r4),x1
;mpy P0’ with Q0’, P1 into x1
move
b1,a0
;most significant word MS (P0’Q0’)to a0
mac
x1,y0,a
x:(r5),y1
;P1 * Q0’ + a into a, Q1 into Y1
mac
x0,y1,a
;P0’* Q1 + a into a
jset
#23,y:(r4),one
;P0(#23)= 1?
jset
#23,y:(r5),two
;Q0(#23)= 1?
jmp
thr
;both P0(#23) and Q0(#23) = 0
one add
y,a
#$000800,y0
;adjust for P0(#23) = 1; load y0
jclr
#23,y:(r5),thr
;both P0(#23) and Q0(#23) = 1?
mac
y0,y0,a
;generate cross term ($400000) and adj.
two add
x,a
;adjust for Q0(#23) = 1 product
thr move
a0,b1
;concatenate R1:R0 in B10
move
#0,b2
;clear the extension register B2
asr
b
;start adjusting the product to integer
move
a1,x0
;shift accumulator A 24 bits right
move
a2,a
;and sign extend
move
x0,a0
;interm. product (IP) weighted properly
mac
x1,y1,a
;R3:R2 sign extended in A
asr
a
;finish adjusting the product to integ.
jcc
end
;finished if A(#0) is 0
move
#$800000,x0
;if A(#0) is 1
or
x0,b
;set B10 (#47)
end
;end of routine
Figure 4-7
MOTOROLA
MULT48IG Routine which multiplies two integer numbers on
the DSP56000
4-13
APR3Section4 Page 14 Wednesday, December 13, 1995 2:52 PM
;REALMULT.ASM
;
;This routine multiplies two signed real numbers P and Q. It assumes the signed
;integer part of P(Pi), is in X1 and the signed fractional part (Pf), is in X0,
;the signed integer part of Q(Qi), is in Y1 and the signed fractional part of Q(Qf)
;is in Y0. The signed integer part of the result, Ri, is stored in A10
;and the unsigned fractional part, Rf, is stored in B10.
;
;
mpy x0,y0,b
;Pf * Qf
asl b
;remove the sign bit from the product
move b2,a
;shift PfQf product 24 bits right
move b1,a0
;and preload accumulator A
mac x1,y0,a
;mult. Pi with Qf and accumulate in A
mac x0,y1,a
;mult. Pf with Qi and accumulate in A
move a0,b1
;concatenate MS (RF) with LS (RF)
move a1,a0
;adjust weighting by shifting interm.
move a2,a1
;product right 24 bits in prep. for
mac x1,y1,a
;final product Pi * Qi + A into A
asr a
;eliminate the zero fill bit
Figure 4-8
REALMULT Routine which multiplies two signed real numbers on
the DSP56000
Signed P Integer = Pi
Signed P Fraction = Pf
X1
X0
Signed Q Integer = Qi
Signed Q Fraction = Qf
Y1
Y0
Unsigned Fractional Part of R, Rf
A10
Signed Integer Part of R, Ri
B10
Figure 4-9
4-14
REALMULT Data ALU Programmer’s Model
Accumulators A10 and B10 hold a single 96 bit mixed-number
result.
MOTOROLA
APR3Section4 Page 15 Wednesday, December 13, 1995 2:52 PM
This routine is similar to MULT48FG or MULT48IG
in that the interim products must be properly
weighted to yield the correct result. Unlike the
MULT48FG and MULT48IG routines, however,
there are no adjustment terms to consider because
the fractions, Pf and Qf, are assumed to be signed.
The CONVSISF macro in Figure 4-3 will perform
the conversion of a real number into a signed
integer and signed fraction.
The 96-bit result from the REALMULT routine
should be treated as an entity. If the positive value
of a negative result is required, then the absolute
value of the whole 96 bits should be obtained before
the integer part and fractional part can be
separated.
Table 4-7 shows a second example using two
negative numbers that produce a positive result.
Table 4-7 Multiplication of Two Negative Real Numbers
Register
Hexadecimal Value
X1
FFFFBF
X0
933334
Y1
FFFFE9
Y0
ECCCCD
A10
000000 : 0005F4
B10
6D7064 : 75C290
MOTOROLA
Integer Value
Fractional Value
-65
-0.85
-23
-0.15
4-15
APR3Section4 Page 16 Wednesday, December 13, 1995 2:52 PM
4.8 Multiplication of a Mixed
Number with a Mixed
Number
Assume that the mixed numbers are stored in the
MN format. Multiplying two mixed numbers is
simply a multiplication using the MPY instruction or
the MAC instruction for a multiply and accumulate.
The multiplication will be a 48-bit result, which will
be in the format shown in Figure 4-10 only after a
one-bit right shift to compensate for the zero-fill bit
introduced by the fractional multiplication. After this
shift, the most significant 16 bits will be the signed
integer part, and the least significant 32 bits will be
the unsigned fractional part.
Consider the example given in Table 4-8. If the
result is desired in the real-number format, where
the integer part is separated from the unsigned
fractional part, then an additional right shift by eight
bits must be performed on the product. This shift
can be performed using the REP instruction or the
appropriate shift multiplier as shown in SECTION
2.5 Data Shifting. By performing the shift, the
integer part of the product is stored in the most
significant word of the accumulator, and the
unsigned fractional part is stored in the least
significant word of the accumulator.
If the magnitude of the result is less than 128, it can
be stored back in a 24-bit register in the MN format,
which is performed by a left shift by eight bits on the
result in accumulator A (already shifted one bit to
the right, implying a net shift of seven bits left). The
result will reside in A1 as shown in Table 4-8.
4-16
MOTOROLA
APR3Section4 Page 17 Wednesday, December 13, 1995 2:52 PM
Table 4-8 Multiplication of Two Mixed Numbers
Register
Mixed-Number
Value
Hexadecimal Value
X1
068000
6.5
X0
044000
4.25
A
00 : 003740
000000
A101
1BA000 : 000000
27.625
A102
0001BA : 000000
27.625
:
1. In the MN format after the left shift by seven bits net.
2. In the real-number format after the right shift by nine bits net.
siiiiiii.ffffffffffffffff
siiiiiii.ffffffffffffffff
Two 24-bit Mixed Numbers
47
32
Signed Integer Part
31
.
0
Unsigned Fractional Part
Figure 4-10 Multiplication of Two Mixed Numbers
■
MOTOROLA
4-17
APR3Section5 Page 1 Wednesday, December 13, 1995 2:54 PM
SECTION 5
Signed Division
“Division is
inherently
iterative and
data dependent.
. . . Division is
not a
deterministic
process, but
rather a trialand-error
process.”
Even though division is the inverse mathematical
process of multiplication, it differs from multiplication
in many aspects. Division is a shift and subtract divisor operation in contrast to multiplication, which is a
shift and add multiplicand operation. In division, the
results of one subtraction determine the next operation in the sequence; thus, division is inherently
iterative and data dependent. The answer consists of
a quotient and a remainder, both of which can have
variable word lengths. In multiplication, the number of
bits in the product is known a priori, which means that
division is not a deterministic process, but rather a trial-and-error process. This fact makes implementing
divide routines a challenge. There are, however, additional data- and hardware-related factors that must
be considered, such as:
• Input Data
Signed Unsigned
Integer
Fractional Normalized
Fractional Unnormalized
• Output Requirements
Quotient
Remainder
Quotient with Remainder
Magnitude Only
Signed
Number of Bits of Accuracy
MOTOROLA
5-1
APR3Section5 Page 2 Wednesday, December 13, 1995 2:54 PM
• Machine Architecture
Fractional
Integer
Register Structure
The instruction set of the DSP56000/DSP56001
processors includes a divide iteration instruction,
DIV. Execution of a DIV generates one quotient bit
using a nonrestoring algorithm on signed fractional
operands. The original dividend must occupy the loworder 48 bits of the destination accumulator and must
be a positive number. Also, the divisor must be larger
than the dividend so that a fractional quotient is generated. After the first DIV execution, the destination
accumulator holds both the partial remainder and the
formed quotient.
The partial remainder, which occupies the high-order portion of the destination operand, is a signed
fraction. The partial remainder is not a true remainder and must be corrected before it may be used
because of the nonrestoring nature of the division
algorithm. Therefore, once the divide is complete, it
is necessary to reverse the last DIV operation to restore the remainder, if the true remainder is desired.
The formed quotient, which occupies the low-order
portion of the destination accumulator, is a signed
fraction. One bit of the formed quotient is shifted
into the LSB of the destination accumulator for each
DIV execution. Thus, portions of the destination accumulator allocated to the remainder and to formed
quotients depend on the number of DIV executions.
5-2
MOTOROLA
APR3Section5 Page 3 Wednesday, December 13, 1995 2:54 PM
In summary, for the division to produce the correct
results on the DSP56000/DSP56001, two conditions must be satisfied:
1. the dividend must be positive and sign extended
2. the magnitude of the divisor must be greater
than the magnitude of the dividend so that a
fractional quotient is generated except for
integer division.
5.1 Division of a Signed
Fraction by a Signed
Fraction
The signed 24-bit divide routine for a four-quadrant
divide (i.e., a signed divisor and a signed dividend)
that generates a 24-bit signed quotient and a 48-bit
signed remainder, SIG24DIV, is given in Figure 5-1.
The dividend is assumed to be in X0, but could have
been in X1, Y1, or Y0.
The first three instructions save the appropriate
sign bits and ensure that the dividend is positive.
The first instruction copies A1 to B1 so that the sign
bit of the dividend, bit 47 of A, is saved in B1 prior
to taking the absolute value of the dividend. The exclusive OR in the second instruction will result in the
N bit in the status register being set if the signs of
the divisor and the dividend are different. Since the
DIV instruction does not affect the N bit, the N bit
represents the sign of the final quotient.
MOTOROLA
5-3
APR3Section5 Page 4 Wednesday, December 13, 1995 2:54 PM
;SIG24DIV.ASM
;This is a routine for a 4 quadrant divide (i.e., a signed divisor and a signed
;dividend) which generates a 24-bit signed quotient and a 48-bit signed
;remainder. The quotient is stored in the lower 24 bits of accumulator A,A0,
;and the remainder in the upper 24 bits, A1. The true (restored) remainder is
;stored in B1. The original dividend must occupy the low order 48 bits of the
;destination accumulator, A, and must be a POSITIVE number. The divisor (x0)
;must be larger than the dividend so that a fractional quotient is generated.
;The quotient will be in x1 and the remainder will be in B1.
;
abs a
a,b
;make dividend positive, copy A1 to B1
eor x0,b
b,x:$0
;save rem. sign in x:$0, quo, sign in N
and #$fe,ccr
;clear carry bit C (quotient sign bit)
rep #$18
;form a 24-bit quotient
div x0,a
;form quotient in A0, remainder in A1
tfr a,b
;save remainder and quotient in B1,B0
jpl savequo
;go to SAVEQUO if quotient is positive
neg b
;complement quotient if N bit is set
savequo tfr x0,b
b0,x1
;save quo. in X1, get signed divisor
abs b
;get absolute value of signed divisor
add a,b
;restore remainder in B1
jclr #23,x:$0,done
;go to DONE if remainder is positive
move #$0,b0
;prevent unwanted carry
neg b
;complement remainder
done
;end of routine
Figure 5-1
SIG24DIV Routine
This routine is a four quadrant divide that produces a 24-bit signed
quotient and a 48-bit signed remainder.
The sign of the remainder is the sign of the dividend; therefore, bit 23 of B1 contains the sign of the
remainder. B1 is stored in data memory as well as
in the second instruction so that the sign bit can be
tested using the bit manipulation instructions later in
the routine. The third instruction clears the carry bit,
C, in the condition code register. This fact ensures
the quotient will be positive because the C bit is always the next quotient bit and because the C bit is
shifted into the accumulator at the beginning of the
execution of the DIV instruction.
5-4
MOTOROLA
APR3Section5 Page 5 Wednesday, December 13, 1995 2:54 PM
Execution of the next two instructions, REP and
DIV, generates the 24-bit quotient and 48-bit remainder (the first 24 bits of which will be zero). The
transfer instruction, TFR A,B, copies the quotient
A0 into B0 (also A1 into B1 and A2 into B2) so that
the sign of the quotient (i.e., bit 23 of B0) can be corrected. If the N bit in the status register was set, B
is complemented. The only purpose is to complement B0 at this point; bit 23 of B0 is zero prior to the
negation. Therefore, B0 is a valid, signed quotient
that is saved in X1. The divisor in X0 is copied into
B1 so that its absolute value can be generated and
used to restore the remainder. If the remainder
needs negating, B0 must be cleared first to prevent
an unwanted carry from propagating into B1, the
true remainder.
In Table 5-2 the contents of the A accumulator are
shown after each iteration of the DIV instruction for
the case shown in Table 5-1; the repeat instruction
is not interruptible. SECTION 5.6 Divide Routines
With N ≤ 24 Bits contains four divide routines that
generate quotients having less than 24 bits of precision. In Table 5-1 the true remainder in B1 is shifted
right by 24 bits because the quotient is 24 bits, and
the remainder is always smaller than the quotient.
The 24 bits are implied and are not in a register.
If the dividend is greater than the divisor, the dividend must be scaled down to be smaller than the
divisor. The quotient must then be scaled up before being output, or it must be output directly and
MOTOROLA
5-5
APR3Section5 Page 6 Wednesday, December 13, 1995 2:54 PM
interpreted correctly. This interpretation involves assuming the binary point has moved to compensate
for the original downscaling — that is, the quotient
will now be a real number. If it can be guaranteed that
the divisor and dividend are normalized, then faster
quadratic convergence and reciprocal methods can
be used to calculate the quotient.
Table 5-1 Signed-Fraction Division
Register
X0 (Divisor)
A (Dividend)
5-6
Hexadecimal Value
600000
00 : 300000 : 000000
A0 (Quotient)
400000
A1 (Remainder)
A00000
$000000:B1
(True Remainder)
000000 : 000000
Fractional Value
0.75
0.375
0.500
0.0
MOTOROLA
APR3Section5 Page 7 Wednesday, December 13, 1995 2:54 PM
Table 5-2 Contents of Accumulator After Signed-Fraction
Division Iterations
DIV Iteration
Contents of Accumulator A (in HEX)
A2
A1
A0
1
00
000000
000000
2
FF
A00000
000001
3
FF
A00000
000002
4
FF
A00000
000004
5
FF
A00000
000008
6
FF
A00000
000010
7
FF
A00000
000020
8
FF
A00000
000040
9
FF
A00000
000080
10
FF
A00000
000100
11
FF
A00000
000200
12
FF
A00000
000400
13
FF
A00000
000800
14
FF
A00000
001000
15
FF
A00000
002000
16
FF
A00000
004000
17
FF
A00000
008000
18
FF
A00000
010000
19
FF
A00000
020000
20
FF
A00000
040000
21
FF
A00000
080000
22
FF
A00000
100000
23
FF
A00000
200000
24
FF
A00000
400000
MOTOROLA
5-7
APR3Section5 Page 8 Wednesday, December 13, 1995 2:54 PM
5.2 Division of a Signed
Integer with a Signed
Integer
Integer division can be treated in the same manner
as fractional division. That is, if the dividend is positive and smaller in magnitude than the divisor,
executing the SIG24DIV routine will generate the
correct results. However, since the remainder is not
used in integer division (the remainder is truncated),
SIG24DIV can be shortened for use with integer division. Consider the example using SIG24DIV shown
in Table 5-3. The contents of the A accumulator after
each DIV iteration are shown in Table 5-4.
The quotient is stored in the lower 24 bits, A0, of accumulator A. The value $1BD178 (0.21732998
decimal) is the quotient. A1 will contain the lower 24
bits of the 48-bit true remainder after the addition of
the absolute value of the divisor. In this example,
B1=$0018E0 after the remainder has been restored.
Therefore, the true remainder is $000000:0018E0 or
0.000000000045247 decimal.
Table 5-3 Signed-Integer Division
Register
X0 (Divisor)
A0 (Dividend)
5-8
Hexadecimal Value
600162E
00 : 0004D2 : 000000
Fractional Value
5678
1234
MOTOROLA
APR3Section5 Page 9 Wednesday, December 13, 1995 2:54 PM
MOTOROLA
5-9
APR3Section5 Page 10 Wednesday, December 13, 1995 2:54 PM
Table 5-4 Contents of Accumulator After Each Signed-Integer
Division Iteration
DIV Iteration
5-10
Contents of Accumulator A (in HEX)
A2
A1
A0
1
FF
FFF376
000000
2
FF
FFFD1A
000000
3
00
001062
000000
4
00
000A96
000001
5
FF
FFFEFE
000003
6
00
00142A
000006
7
00
001226
00000D
8
00
000E1E
00001B
9
00
00060E
000037
10
FF
FFF5EE
00006F
11
00
00020A
0000DE
12
FF
FFEDE6
0001DB
13
FF
FFF1FA
00037A
14
FF
FFFA22
0006F4
15
00
000A72
000DE8
16
FF
FFFEB6
001BD1
17
00
00139A
0037A2
18
00
001106
006F45
19
00
000BDE
00DE8B
20
00
00018E
01BD17
21
FF
FFECE
037A2F
22
FF
FFF00A
06F45E
23
FF
FFF642
0DE8BC
24
FF
0002B2
1BD178
MOTOROLA
APR3Section5 Page 11 Wednesday, December 13, 1995 2:54 PM
Shown in Figure 5-2, the INTDIV macro performs a
signed-integer divide without the extra instructions
that SIG24DIV uses to generate the remainder.
This routine reduces the number of operative instructions by about one-half.
;Signed integer divide macro.
;
Registers used: a,b,x0
;
Input: macro pass parameters “dividend, divisor”
;
Output: Quotient --> a0
INTDIV
done
macro dividend, divisor
move #>dividend, a
move a2,a1
move #>dividend,a0
asl a
#>divisor,x0
abs a
a,b
and #$fe, ccr
rep #$18
div x0,a
eor x0,b
jpl done
neg a
nop
endm
Figure 5-2
;sign extend a2
;and A1
;move the dividend into A
;prepare for divide, and
;move divisor into x0 (24 bit)
;make dividend positive, save in B
;clear the carry flag
;form a 24-bit quotient
;for quotient in a0, remainder a1
;save quotient sign in N
;go to done if quotient is positive
;complement quotient if N bit is set
;finished, the quotient is in a0
INTDIV Routine
This routine is a four quadrant divide that produces a 24-bit signed
quotient with no remainder.
5.3 Double-Precision Division
Division of a 48-bit number by another 48-bit number is not possible using the REP,DIV instruction
sequence because the divisor is restricted to be a
24-bit fraction. Therefore, to perform double-precision division producing a 48-bit quotient and 96-bit
remainder, the DIV48 routine is introduced. This
MOTOROLA
5-11
APR3Section5 Page 12 Wednesday, December 13, 1995 2:54 PM
routine implements the nonrestoring divide algorithm that the DIV instruction implements in
hardware. The flowchart for this algorithm is shown
in Figure 5-3.
5-12
MOTOROLA
APR3Section5 Page 13 Wednesday, December 13, 1995 2:54 PM
Start
Clear C bit (C=0);
Set count j =1
Load dividend (D) and divisor (S) in 48-bit registers
Clear 48-bit quotient register Q
Determine relative sign of D and S
1-bit shift left of D and Q
Value of
EOR
operation
?
Different
D+S
Same
D-S
D
D
Invert C bit
Add the C bit in least significant bit of Q
j=j+1
No
j > 48
?
Yes
Make Q positive by 1-bit shift to the right
D + |S|
D = True Remainder
Correct signs of quotient and remainder
Figure 5-3
Double-Precision Divide Flowchart
This chart describes the division of two signed 48-bit numbers
that is performed by the DIV48 routine.
The DIV48 dividend is stored in B10; the divisor is
stored in X; and the quotient is developed in A10. After
iterating the loop 48 times, B10 will contain the least sig-
MOTOROLA
5-13
APR3Section5 Page 14 Wednesday, December 13, 1995 2:54 PM
nificant 48 bits of the 96-bit remainder. The sign of the
remainder must be made the same as that of the dividend. The quotient will be negative if the signs of the
dividend and the divisor are different. The borrow bit
generated as the result of a subtraction (i.e., N1-N2) is
the complement of the carry bit generated as the result
of the addition of the negative (i.e., N1+(-N2)) of an operand. Therefore, the carry bit must be inverted if the
divisor is subtracted from the dividend. The carry is introduced into the quotient using the add long with carry
(ADC) instruction with one of the addends, Y, equal to
zero. Consider the example in Table 5-5. The result,
which is stored in accumulator A, is obtained using the
DIV48 routine shown in Figure 5-4.
Table 5-5 Double-Precision Division
Register
Decimal Hexadecimal
Value
Integer Value
Decimal Fractional
Value
X (Divisor)
000078 : 123450
+2014458960
0.000014313591805
B10 (Dividend)
00000F : 02468A
+251807370
0.000001789198976
A10 (Quotient)
100000 : 000000
B10 (Remainder
before Correction)
B10 (True Remainder
after Correction)
5-14
0.125
FFFF87 : EDCBB0
000000 : 000000
0
0.0
000000 : 000000
MOTOROLA
APR3Section5 Page 15 Wednesday, December 13, 1995 2:54 PM
;DIV48.ASM
;This routine performs double precision division on two 48-bit operands.
;The operands can be either fractions or integers. The dividend must be positive
;the magnitude of the divisor must be greater than the magnitude of the
;dividend. The dividend and divisor are assumed to be in memory as
;
L:$divaddr
dividend
;
L:$divaddr+1 divisor
;
L:$divaddr+2 quotient
;The dividend is loaded in the long word operand B10 and the divisor is loaded
;in the long operand word x. The 48-bit true remainder is stored in B.
;
move
#divaddr,r0
;initialize r0
move
#0,y0
;clear y0
move
r0,r1
;initialize r1
and
#$fe,ccr
;clear carry bit C
move
1:(r0)+,b
;load dividend into B
abs
b
1:(r0)+,x
;make the dividend positive, divisor in X
clr
a
#0, y1
;clear a and y1
do
#48,endloop
;execute the loop 48 times
eor
x1,b
;do operands have the same sign?
jmi
opp
;if opposite sign jump to location one
eor
x1,b
;restore the value of the operand in B
asl
a
;prepare A to receive a quotient bit
asl
b
;multiply the dividend by 2
sub
x,b
;subtract the divisor from the dividend
move
sr,x:$0
;process to invert the carry bit
bchg
#0,x:$0
;invert the carry bit
move
x:$0,sr
;restore the SR with inverted carry bit
jmp
quo
;jump to location “quo”
opp
eor
x1,b
;operands have opposite sign
asl
a
;prepare A to receive a quotient bit
asl
b
;multiply the dividend by 2
add
x,b
;add the divisor to the dividend
quo
adc
y,a
;add the carry bit to develop quotient
endloop
;end of the 48 iteration loop
asr
a
;introd. The positive sign bit in quotient
tfr
x1,a
a,1:(r0) ;divisor in A, save quotient
move
x0,a0
;lower part of divisor in A
abs
a
;get the absolute value of divisor
add
a,b
;generate lower 48 bits of true remainder
jclr
#23,x:(r1),done
;if dividend is positive, finished
neg
b
;if dividend is negative, negate remainder
done nop
;end of routine
Figure 5-4
MOTOROLA
DIV48 Routine
This routine divides a 48-bit signed number by a second 48-bit
signed number, and produces a 48-bit quotient and 48-bit remainder.
5-15
APR3Section5 Page 16 Wednesday, December 13, 1995 2:54 PM
5.4 Real-Number Division
This type division is not possible by simply using the
DIV instruction repeatedly because a real number is
a 48-bit number, and the DIV operation only operates
on 24-bit operands. A real-number division cannot be
broken down into four divisions and subtractions like
the real-number multiplication because division is
nondeterministic. One way to divide two real numbers
is to scale the numbers to form integers, fractions, or
mixed numbers by multiplying them by the appropriate constant. Consider the following example:
123.750
--------------------- = 0.14769506191257
837.875
To perform the previous division using the
DSP56000/DSP56001, multiply both numbers by
1000 to change them to integers. The new values
with the results after executing the SIG24DIV routine are shown in Table 5-6.
The second and most accurate method of realnumber division is by using the DIV48 routine
shown in Figure 5-4. The real number is kept in the
real-number format and stored according to the requirements of the routine. The values used are
shown in Table 5-7. Although the result is more accurate than the result obtained by the first method,
it is slower than the first method.
5-16
MOTOROLA
APR3Section5 Page 17 Wednesday, December 13, 1995 2:54 PM
Table 5-6 Result of Real-Number Division Using DIV24 Routine
Register
X1 (Divisor)
Hexadecimal Value
0CC8F3
Integer Value
Fractional Value
837875
Table 5-7 Result of Real-Number Division Using DIV48 Routine
Register
Hexadecimal Value
Real
Number
X (Divisor)
000345 : E00000
+837.875
B10 (Dividend)
00007B : C00000
+123.750
A10 (Quotient)
12E7AB : FA58FC
B10 (Remainder)
000129 : 200000
True Remainder
000000 : 000000
00046F : 000000
MOTOROLA
Fractional Value
0.147695061912572
4.8069E-19
5-17
APR3Section5 Page 18 Wednesday, December 13, 1995 2:54 PM
5.5 Mixed-Number Division
The mixed number is stored in the appropriate register or accumulator in the MN format. Since the
number is represented in 24 bits, the SIG24DIV routine can be used. The mixed-number division must
satisfy the same two conditions as the fractional division. Since the magnitude of the divisor must be
greater than the magnitude of the dividend, the quotient will be a fraction represented in the signedfraction format.
Consider the example shown in Table 5-8. After executing the SIG24DIV routine, the quotient will be
in A0, and the lower part of the true remainder will
be in B1. The results obtained after the execution of
each DIV instruction are given in Table 5-9.
Table 5-8 Result of Mixed-Number Division Using SIG24DIV Routine
Register
X0 (Divisor)
A (Dividend)
5-18
Hexadecimal Value
3FC000
00 : 188000 : 000000
A0 (Quotient)
313131
A1 (Remainder)
D8C000
$000000:B1
(True Remainder)
000000 : 188000
Decimal Value
63.75
24.50
00.3843137
00.000000011408702
MOTOROLA
APR3Section5 Page 19 Wednesday, December 13, 1995 2:54 PM
MOTOROLA
5-19
APR3Section5 Page 20 Wednesday, December 13, 1995 2:54 PM
Table 5-9 Contents of Accumulator After Each Mixed-Number
Division Iteration
Contents of Accumulator A (in HEX)
DIV Iteration
5-20
A2
A1
A0
1
FF
F14000
000000
2
00
224000
000000
3
00
04C000
000001
4
FF
C9C000
000003
5
FF
D34000
000006
6
FF
E64000
00000C
7
00
0C4000
000018
8
FF
D8C000
000031
9
FF
F14000
000062
10
00
224000
0000C4
11
00
04C000
000189
12
FF
C9C000
000313
13
FF
D34000
000626
14
FF
E64000
000C4C
15
00
0C4000
001898
16
FF
D8C000
003131
17
FF
F14000
006262
18
00
224000
00C4C4
19
00
04C000
018989
20
FF
C9C000
031313
21
FF
D34000
062626
22
FF
E64000
0C4C4C
23
00
0C4000
189898
24
FF
D8C000
313131
MOTOROLA
APR3Section5 Page 21 Wednesday, December 13, 1995 2:54 PM
5.6 Divide Routines with
N ≤ 24 Bits
Four distinct routines for the division of fractional
numbers where an N-bit (N<24) quotient is required
are given in the following paragraphs.
5.6.1 Positive Operands with Remainder
Where N Is Variable
For positive fractional operands, the code in Figure
5-5 may be used to perform a divide operation,
which generates an N-bit quotient and a 48-bit remainder having 48 N bits of precision for N<24.
In this routine, the quotient is built up by rotating the
C bit into B. The correct C bit is generated by executing the DIV instruction. The remainder is built up
in A. The REP Y1 and ASL B instruction sequence
sets the signed-fraction format for the N-bit quotient
into a signed fraction. Similarly, the REP Y0 and
ASR B instruction sequence formats the 48-N-bit
true remainder into a signed fraction.
5.6.2 Positive Operands without
Remainder Where N Is Fixed
For positive fractional operands, the code in Figure
5-6 (or similar code) may be used to perform a divide operation yielding only an N-bit quotient without
a remainder for N<24.
MOTOROLA
5-21
APR3Section5 Page 22 Wednesday, December 13, 1995 2:54 PM
The quotient bits must be extracted out of the accumulator that contains both remainder and quotient
bits after the execution of the DIV instructions. These
bits must then be formatted as a positive fraction.
;This routine assumes that the 48-bit positive fractional dividend
;is stored in the A accumulator, the 24-bit positive fractional divisor is
;stored in the X0 register, the value N is stored in the Y0 register and the
;value 24-N is stored in the Y1 register. This routine stores the N-bit positive
;fractional quotient in the X1 register and the 48-bit positive fractional
;remainder with 48-N bits of precision in the B accumulator. Note that in this
;routine the value of N and 24-N may be changed at run time without reassembling
;since they are stored in registers.
;
clr b
;initialize B1 for quotient
and #$fe, ccr
;clear carry, C, (quotient sign bit)
do y0,loop1
;compute N-bit quotient (Y0=N)
rol b
;build up N-bit quotient in B1
div x0,a
;build up 48-N bit remainder in A
loop1
rep y1
;repeat 24-N times (Y1=24-N)
asl b
;format quotient as positive fraction
tfr x0,b
b1,x1
;save N-bit quotient, move divisor
add a,b
;recover 48-N bit remainder in B
rep y0
;repeat N times (Y0=N)
asr b
;format remainder as positive fraction
Figure 5-5
Positive Divide: 48-Bit Operand and Remainder
;This routine assumes that the 48-bit positive fractional dividend is stored
;in the A accumulator and that the 24-bit positive fractional divisor is stored
;in the X0 register. This routine stores the N-bit positive fractional quotient
;in the A accumulator, A1. The value of N is not stored in a register and is
;specified at the time of compilation using the CVI and POW functions built into
;the DSP56000 Cross Assembler. The quotient is stored in A1 and the remainder
;is destroyed.
;
and #$fe,ccr
;clear carry, C, (quotient sign bit)
rep #n
;form an N-bit quotient
div x0,a
;perform divide iteration N times
move a0,1
;move quotient to A1, destroy remainder
move #>(@cvi(@pow(2,n))-1,x1 ;store N-bit quotient bit mask in X1
and x1,a
;extract N-bit quotient in A1
rep #(24-n)
;repeat (24-N) times
lsl a
;format quotient as positive fraction
Figure 5-6
5-22
Positive Divide: N-Bit Quotient without Remainder
MOTOROLA
APR3Section5 Page 23 Wednesday, December 13, 1995 2:54 PM
5.6.3 Signed Operands with Remainder
Where N Is Variable
For signed fractional operands, the code shown in
Figure 5-7 may be used to perform a divide operation
yielding an N-bit quotient and a 48-bit remainder having 48 N bits of precision for N<24. Bits 0 and 1 in
location X:$0 are used to save the quotient and remainder sign flags, respectively.
;This routine assumes that the sign extended 48-bit fractional dividend
;is stored in the A accumulator, the 24-bit signed divisor is stored in the
;X0 register, the value N is stored in the Y0 register and that the value
;24-N is stored in the Y1 register. This routine stores the N-bit signed
;fractional quotient in the X1 register and the 48-bit positive fractional
;remainder with 48-N bits of precision in the B accumulator. In this routine the
;values of N and 24-N may be changed at run time.
;
bclr #0,x:$0
;clear quotient sign flag (bit 0,x:$0)
bclr #1,x:$0
;clear remainder sign flag (bit 1,x:$0)
tst a
;determine the sign of the dividend
jpl signquo
;go to SIGNQUO if the dividend is positive
bset #1,x:$0
;set remainder sign flag if negative
signquo abs a
a,b
;make dividend positive, copy a1 to b1
eor x0,b
;get sign of quotient (N bit)
jpl start
;go to START if the sign is positive
bset #o,x:$0
;set quotient sign flag if negative
start
clr b
;initialize B1 for quotient
and #$fe,ccr
;clear carry, C, (quotient sign bit=0)
do y0, loop1
;compute N-bit quotient (Y0=N)
rol b
;build up N-bit quotient in B1
div x0,a
;build up 48-N bit remainder in A
loop1
rep y1
;repeat 24-N times (Y1=24-N)
asl b
;format quotient as positive fraction
jclr #0,x:$0,savequo
;go to SAVEQUO if the quot. is positive
neg b
;complement quot. if sign flag is set
savequo tfr x0,b b1,x1
;save N-bit quotient, divisor into B
abs b
;get the absolute value of divisor
add a,b
;recover 48-N bit remainder in B
rep y0
;repeat N times (Y0=N)
asr b
;format remainder as signed fraction
jclr #1,x:$0,done
;go to DONE if the remainder is positive
neg b
;complement remainder if negative
done
Figure 5-7
5-23
Signed Divide: 48-Bit Operand and Remainder
MOTOROLA
APR3Section5 Page 24 Wednesday, December 13, 1995 2:54 PM
The first function of this routine is to set up these
flags. The quotient is built up by rotating the C bit
into B. The correct C bit is generated by executing the
DIV instruction. The remainder is built up in A. The
REP Y1 and ASL B instruction sequence formats the
N-bit quotient as a signed fraction. Similarly, the
REP Y0 and ASR B instruction sequence formats
the 48 N-bit true remainder into a signed fraction.
5.6.4 Signed Operands without
Remainder Where N Is Fixed
For signed fractional operands, the code given in
Figure 5-8 may be used to perform a divide operation, yielding only an N-bit quotient without a
remainder for N<24.
;This routine assumes that the sign extended 48-bit fractional dividend
;is stored in the A accumulator and that the 24-bit signed fractional divisor
;is stored in the X0 register. This routine stores the N-bit signed fractional
;quotient in the A accumulator. The value of N is not stored in a register
;and is specified at the time of compilation using the CVI and POW functions
;built into the DSP56000 Cross Assembler. The quotient is stored in A1
;and the remainder has been destroyed.
;
abs a
a,b
;make dividend positive, copy signed div. to B1
eor x0,b
;get sign of quotient and save in N bit
and #$fe,ccr
;clear carry (quotient sign bit = 0)
rep #n
;form an N-bit quotient by executing
div x0,a
;the divide iteration N times
jpl mask
;go to MASK if quotient is positive
neg a
;negate quotient
mask move a0,a1
;destroy remainder, move quotient in A1
move #>(@cvi(@pow(2,n))-1,x1;N-bit quot. mask in X1
and x1,a
;recover signed N-bit quotient in A1
rep #24-n
;repeat 24-N times
lsl a
;format quotient as signed fraction
Figure 5-8
5-24
Signed Divide: N-bit Quotient with Remainder
MOTOROLA
APR3Section5 Page 25 Wednesday, December 13, 1995 2:54 PM
The NEG A instruction is guaranteed to negate the
quotient because the first bit of the quotient has
been set to zero, making it a positive fraction. Quotient bits must be extracted out of the accumulator,
which contains both remainder and quotient bits after the execution of the DIV instructions. The
quotient bits must then be formatted as a positive
fraction.
■
5-25
MOTOROLA
SECTION 6
Conclusion
“. . . unless an
application
specifically
demands using
less than 24
bits of
precision, the
use of an
unsigned fullprecision
division routine
will probably
result in the
fastest division
execution time.”
MOTOROLA
F
or the case in which only positive fractional operands are used to compute both a full-precision (i.e.,
24-bit) unsigned quotient and its 48-bit remainder,
both the quotient and the remainder are correctly
aligned with their respective register boundaries after
the 24 divide iterations. Neither the quotient nor the remainder must be shifted to produce the correct result
(see SECTION 5.1 Division of a Signed Fraction by
a Signed Fraction). In general, unless an application
specifically requires the number of bits of the precision
(N) in the quotient to be variable, using a fixed value
N, declared at the time of assembling, results in a significantly faster division execution time. Similarly,
unless an application specifically requires that the remainder has to be computed, using a routine that
computes only the quotient results in a significantly
faster division execution time. Finally, unless an application specifically demands using less than 24 bits of
precision, the use of an unsigned full-precision division routine will probably result in the fastest division
execution time.
■
6-1
REFERENCES
1. DSP56000 Macro Assembler Reference Manual
(available from DSP, Oak Hill, Texas, (512) 4402030), Motorola Inc., 1986.
2. DSP56000 Digital Signal Processor User's
Manual (DSP56000UM/AD), Motorola Inc., 1986.
3. DSP56001 56-Bit General-Purpose Digital Signal
Processor (ADI1290), Motorola Inc., 1988.
■
MOTOROLA
Reference-1
APR3 Title Page Page i Wednesday, December 13, 1995 2:40 PM
© Motorola Inc. 1993
Motorola reserves the right to make changes without further notice to any products herein. Motorola makes no warranty, representation or guarantee regarding the suitability of
its products for any particular purpose, nor does Motorola assume any liability arising
out of the application or use of any product or circuit, and specifically disclaims any and
all liability, including without limitation consequential or incidental damages. “Typical” parameters can and do vary in different applications. All operating parameters, including
“Typicals” must be validated for each customer application by customer’s technical experts. Motorola does not convey any license under its patent rights nor the rights of others. Motorola products are not designed, intended, or authorized for use as components
in systems intended for surgical implant into the body, or other applications intended to
support or sustain life, or for any other application in which the failure of the Motorola
product could create a situation where personal injury or death may occur. Should Buyer
purchase or use Motorola products for any such unintended or unauthorized application,
Buyer shall indemnify and hold Motorola and its officers, employees, subsidiaries, affiliates, and distributors harmless against all claims, costs, damages, and expenses, and
reasonable attorney fees arising out of, directly or indirectly, any claim of personal injury
or death associated with such unintended or unauthorized use, even if such claim alleges that Motorola was negligent regarding the design or manufacture of the part. Motorola and B are registered trademarks of Motorola, Inc. Motorola, Inc. is an Equal
Opportunity/Affirmative Action Employer.