direct computation - The University of Texas at Austin

Computation of the
Multidimensional Discrete Fourier Transform (DFT)
Lecture by Prof. Brian L. Evans (UT Austin)
Scribe: Wade C. Schwartzkopf (UT Austin)
 DIRECT COMPUTATION OF TWO-DIMENSIONAL DFTS

If the twiddle factors (WN terms) are pre-computed and stored in a
table, then we require

N12 N22 complex multiplies and adds

1012 complex multiplies and adds (CMADs) if N1=N2=1024:
@1 s/CMAD 11.5 days (low-end programmable processor)
@1 ns/CMAD  16.7 minutes (high-end dedicated processor)
Since bus speeds are significantly slower than processor speeds, we can
only process the data in real time if we can feed data to the processor
quickly enough. Note that a 200 MHz Texas Instruments TMS320C62x is
a fixed-point programmable DSP processor that can compute one CMAD
in 10 ns. A CMAD consists of four real multiplications plus four real
additions, and the 200 MHz C62x can perform 400 million real
multiplications per second. For more information about the C62x, see
http://www.ece.utexas.edu/~bevans/courses/ee345s/lectures/01_Architecture
 ROW-COLUMN DECOMPOSITION

We can regroup the computations
X (k1 , k 2 ) 
G (n1 , k 2 ) 
X (k1 , k 2 ) 

N1 1 N 2 1
   x(n , n ) W
1

n1  0 n2  0
N 2 1
2
n2 k 2
N2
 n1k1
 WN1

0  k1  N1  1
0  k2  N 2  1
2
n2 k 2
N2
0  n1  N1  1
0  k2  N 2  1
 G(n ,k
)WNn11k1
0  k1  N1  1
0  k2  N 2  1
 xn ,n  W
N1 1
n2  0
n1  0
1
1
2

Interpretation


To compute G(n, k2), compute a 1-D DFT of the nth column of x. Do
this for all values of n.
Computation can be done using N1 N2-point DFTs +N2 N1-point DFT’s

With direct computation of the 1-D DFTs, the requires
N1 N 22  N 2 N12  N1 N 2 N1  N 2  CMADS

2109 CMADs if N1 = N2 = 1024 for direct computation
@1 s/CMAD 33.3 minutes
@1 ns/CMAD  2 seconds

If N1 = 2u; N2 = 2v and a 2-D Fast Fourier Transform (FFT) is used
N1 N 2
N N
N N
log 2 N 2  1 2 log 2 N1  1 2 log 2 N 2 N1
2
2
2

2x107 CMADs if N1 = N2 = 1024
@1 s/CMAD 10 seconds
@1 ns/CMAD  0.01 seconds
This execution time is now fast enough to perform a frame-based DFT for
a video sequence at 30 frames/s.

Note: Must be able to access data rowwise and columnwise. This
needs memory!
 MULTIDIMENSIONAL DFT’S

Things become more complicated as the dimensionality goes up.
X (k1, k2 ,..., kM )  
n1
 x(n , n ,..., n
1
n2
2
M
)WNnMM k M ...WNn11k1
nM
1. Compute a 1-D DFT with regard to n1 for each value of n2, n3, …, nM.
2. Compute a 1-D DFT with regard to n2 for each value of k1, n3, …, nM.
…
M. Compute a 1-D DFT with regard to nM for each value of k1, k2, …
k(M-1)

Let NP = N1N2…NM, then


 M
 N
# CMADS     N j   i log 2 N i
i 1  j 1
 2
j

i




M  M
 1
1
    N j   log 2 N i  N P  log 2 N P
2
i 1  j 1
 2



M
Np
 DEALING WITH STORAGE

If the machine on which an FFT is to be computed has limited
memory, then it may be difficult to access the data both rowwise and
columnwise.

This is particularly true for 3-D and higher.

In such cases the data is typically stored rowwise in secondary
storage and matrix transposition is performed.
1. Take row DFTs
2. Transpose
3. Take row DFTs
4. Transpose
etc.
 EKLUNDH’S TRANSPOSITION ALGORITHM

If N1 = N2 = 2u, then transposition of a 2-D sequence (note that the
origin is in the lower left-hand corner) can be performed recursively
in u stages using only 2 rows of memory. The algorithm requires u
passes through the matrix.
X 2
X 

X 0
 X 1t
Xt 
 t
 X 0
X3


X1 
t
X3 

t
X 2 
 MORE GENERAL DFT ALGORITHMS



The rectangular DFT relates a finite extent rectangular array to
rectangular samples of its Fourier transform.

Is there a DFT for hexagonal arrays?

Is there a DFT connecting a rectangular array to hexagonal samples of
its Fourier transform?
Definitions

~
x (n) : periodic sequence with periodicity matrix N.

IN: set of |det(N)| samples that makes up exactly one period of ~x (n) .
Hypothesis

Let’s hypothesize that ~x (n) can be expressed as a finite linear
combination of harmonically related complex sinusoids.
~
x (n) 
 a(k )e
jk T R T n
(*)
kJ N
where k is an integer vector and JN is a set of multidimensional indices.

If this representation is valid, it must be consistent with the periodicity
of ~
x.
~
x (n)  ~
x (n  Nr) 
 a(k )e
jk T R T n  Nr 
 a(k )e
jk T R T Nr
kJ N

e jk
T
RT n
(**)
kJ N

Equating the right sides of (*) and (**) and making use of the
orthogonality property of the complex exponentials, for any harmonic
that is not zero, we must have
e jk
T
R T Nr
1
R T N  2I
R T  2N 1

~
If we define X k   det N a(k ) , then
1
det N
~
x (n) 


kJ N
N
contains det N T  det N members
~
X (k ) 
 ~x (n)e
 jk T ( 2N 1 ) n
nI N
~
If we replace the periodic arrays ~
x and X with finite support arrays ~
x
~
and X with support on IN and JN, respectively, we obtain the
generalized DFT.
x(n) 
X (k ) 
1
det N
 X (k )e
kJ N
 x(n)e



 j k T 2N 1 n
 SAMPLING THE FOURIER TRANSFORM
Define
X (ω)   x(n) e jω
T
n
n

Then
X (k )  X (ω) ω( 2N T )k
 , nI
N
j k T 2N 1 n
nI N


j k T 2N 1 n
The complex exponentials in this sum are periodic in n with periodicity
matrix N and in k with periodicity matrix Nt.
J

~
 X (k )e
, k  JN

Recall

Note: If V is given, we have limited control over the locations of the
Ω  V T ω
ω  VT Ω
Ωk  V T (2N T )k  2 (VN ) T k
frequency samples because of the constraint that N must be an
integer matrix.
 QUINCUNX DFT
(Figure 3.10 in notes)



T

V 2
T

2
T 
2 
T
 
2
N
N
 0
0

N 
If we use
Then
2 
1   2

 2
1
0 k  

T

1
N
NT   k1   2 1 1   k1 
T
k  
  NT





1
1
2 k 2 
2
2  k 2  NT 1  1 k 2 




  0

T 
N
NT 
T
 NT
This matrix [1 1; 1 –1] is called a quincunx sampling matrix (a nonrectangular sampling matrix of determinant +2 or –2). This is
quincunx sampling.
 EVALUATING GENERAL DFT’S

Fact: If N is diagonal, the general DFT can be evaluated using a rowcolumn algorithm
X (k1 , k2 )   x(n1 , n2 )e
n1
n2
  x(n1 , n2 )e
n1
 2

0   n1 
N
1

 
 j  k1 k 2 
 0 2   n 
2

N 2   
n2
j
2n1 k1
N1
j
e
2n 2 k 2
N2

Smith Form Decomposition:

Any non-singular integer matrix can be expressed as
N=UDV
where
U, V
: unimodular integer matrices (determinant of +1 or –1)
D
: diagonal integer matrix
Substituting a Smith form of a periodicity matrix N,
X (k ) 

 jk T 2N 1 n
 x(n)e
 jk T 2V 1D 1U 1 n
 x(n)e
 j 2 V T k
nI N


nI N


 x(n)e



T

D 1 U 1n

nI N
We would like to express the DFT of a signal with non-rectangular
periodicity in terms of a separable DFT. We would like the term in the
 j 2 kˆ T D1 nˆ
exponential to be of the form
We work towards this goal by making the following change of variables:

Algorithm
Let kˆ  V T k nˆ  U 1n
k  VT kˆ n  Unˆ
X (V T kˆ ) 
 x(U nˆ ) e
 j 2  kˆ T D 1 nˆ
Unˆ I N
1. Compute x(nˆ ) by permuting x(n) according to nˆ  U 1 n . The
permuted values should fall (or be made to fall using circular shifting)
into the fundamental parallelopiped formed by D, i.e. a rectangular
volume of samples.
2. Calculate a separable DFT according to entries on the diagonals of D
3. Permute DFT X (kˆ ) using k  VT kˆ .