An Introduction to the Mathematics of Digital Signal Processing: Part II: Sampling, Transforms,
and Digital Filtering
Author(s): F. R. Moore
Source: Computer Music Journal, Vol. 2, No. 2 (Sep., 1978), pp. 38-60
Published by: MIT Press
Stable URL: http://www.jstor.org/stable/3680221
F. R. Moore
Bell Laboratories
Murray Hill, New Jersey 07974
© 1978 by F. R. Moore
INTRODUCTION
In Part I of this tutorial (Computer Music Journal,
Vol. 2, No. 1), we discussed some of the basic mathematical
ideas relevant to the processing of digital signals. Now we turn
to the application of these and other concepts, operating on
the assumption that the reader understands everything in
Part I thoroughly (although some of this second part can be
understood without following the mathematical arguments).
Again, it will be impossible to give a detailed account of all
the techniques of digital signal processing, because there is
simply too much to cover in an article. Thus, the ideas chosen
for inclusion here are only the most fundamental, which is to
say (hopefully) the most important. Armed with the knowledge
presented here, the reader should be able to understand
much of the literature in the field, even though we will
continue to omit calculus from our mathematical concerns.
As in many subjects, notations often are used only as a kind of
shorthand for concepts which can be adequately explained
without resorting to "higher" mathematics. As we saw in
Part I, however, the better our mathematical facility, the easier
it is to solve certain problems which are otherwise difficult, or
at the very least, tedious. Thus we will continue to use
extensively the most powerful mathematics at our disposal,
that of complex exponentials, in our treatment of sampling,
transforms, and an introduction to the concepts of digital
filtering.
Before we delve into these concepts, however, some
words are in order about the general nature of the subject we
are studying. Digital signal processing, computer programming,
and acoustics all relate to computer music in a similar way:
Unlike typical subfields such as harmony or counterpoint,
which exist as subdivisions of a global realm of study, digital
signal processing, computer programming and acoustics are all
complete fields in themselves, each one replete with its own
motivations, jargon, and subfields. Perhaps the most fascinating
aspect of computer music is its attempt to synthesize from
such a vast array of knowledge the keys to a rich and expressive
sonic art. Since classical times, when mathematics and
physics were considered to be subfields of music (recall
Pythagoras' investigations of the pitch of vibrating strings of
different lengths), music and science have travelled increasingly
divergent paths in pursuit of ever-elusive truths and beauty,
which indicates, at the very least, that one great difficulty for
computer musicians will be to bridge the terminology gaps
among several fields at once. This means that we must be
patient and willing to let each field describe itself in its own
terms first before we can progress to an understanding of how
to rephrase these statements in musical terms. Therefore we
cannot initially ask questions such as "How can we make an
oboe sound like a clarinet?" directly of digital signal processing,
since the information is not couched in these terms for
historical reasons. We can, however, keep such questions in
mind as we study, on the assumption that an accurate
understanding of the terminology of the field will allow such
questions to be rephrased into tractable problems. Often, the
answer will be that the question asks for unknown or
ill-understood techniques to be applied, but just as often the
search will lead us to other revelations: answers to questions
which are begging to be asked! It will often be a useful exercise,
therefore, to try to imagine what could happen to some
sound if a particular process were applied to it. Certainly
much is already known about the musical effects of digital
signal processing, but at this point much more is to be learned.
Computer Music Journal, Box E, Menlo Park, CA 94025
Also, we must keep in mind that digital signal processing
is not programming, not acoustics, and hence even a perfect
understanding of it would not necessarily show us how to
create satisfying music with a computer. It is, however, a
powerful way to think about the manipulation and control of
sounds, and as such it will most likely represent an important
prerequisite to our understanding of how to create music in
new and expressive ways.
SAMPLING AND QUANTIZATION
What is a sampled signal? When we watch a movie, we
are looking at a stream of separate, discrete photographs flowing
by at a rate of 24 frames (photographs) per second, but we
are seeing something quite different. The apparent continuous
motion on the screen is really the result of sampling the position
of the various people and objects on the screen at a
sufficient rate to ensure that no important detail of the
motion is lost. Clearly, if the motion were sampled more
slowly, say, at 5 times per second, the motion would appear
jerky and discontinuous, as it does under now-familiar strobe
lighting at discotheques. We can imagine an experiment that
must have been executed several times in the history of the
motion picture industry: We start out filming various moving
scenes at a slow, "flickery" rate, and then gradually (or in
steps) increase the frame rate until the motion appears smooth
and continuous. Of course, we will eventually run into practical
difficulties, such as the sensitivity of the film to light, since
an increasing frame rate implies a decreasing exposure time for
each frame. But hopefully, before such limits are reached, a
smooth rendering of motion will be achieved, and indeed, the
movie industry has settled on 24 frames per second as a
standard which works well enough. But does it work perfectly?
The answer, as usual, is: of course not, as anyone who has ever
watched a wagon wheel in a western movie can verify. As the
wagon starts out from a standing position, the wheels appear
to turn slowly forward; then as the speed increases, the wheels
first appear to go faster, then begin to slow down and go
backwards, then to stop, then to go forwards again, etc. Of course
they never appear to stop completely when the wagon is
moving since their speed blurs their picture, but neither is their
motion rendered accurately by the film process. If we were to
increase the frame rate of filming, what would be the effect on
the image of the turning wheels? Clearly, the wagon could go
faster before its wheels started to appear slowing down or
going backwards, but the point is that for any filming speed
(sampling rate), there is some upper limit on the rapidity with
which motion can take place and still be rendered accurately
on the screen. (Obviously, since most motions are slow
enough, the movie industry deems it unnecessary to increase
the frame rate for the sake of rendering chase scenes more
believable.)
In order to understand an important aspect of the
sampling process, let us imagine that we are filming a documentary
on the motion of a one-spoke wagon wheel (see
Figure 1). Let us arbitrarily say that when the spoke points to
the right it is in position zero, and that any other position
is defined as the angle, measured counterclockwise, from
position zero. Hence at an angle of $\pi/2$ radians the spoke
points straight up, at $\pi$ radians it points to the left, etc. If the
wheel completes exactly one full counterclockwise revolution
per second we can describe its rotational velocity by saying
that it is turning at a rate of $+2\pi$ radians per second (clockwise
motion would indicate a negative velocity); two counterclockwise
revolutions per second would correspond to $4\pi$
radians per second, and in general, $F$ revolutions per second
will correspond to $2\pi F$ radians per second. The motion of the
wheel is clearly periodic with a period of $T = 1/F$ seconds.
$F$ is the frequency (repetition rate) with which the wheel
turns, so it can be measured in cycles (or repetitions) per
second (Hertz, abbreviated Hz.), and the quantity $2\pi F$ is called
the radian frequency at which the wheel rotates (measured in
radians per second).

Figure 1. A turning, one-spoke wagon wheel (arrow: direction of motion).
Let us begin filming the rotating wheel at the standard
rate of 24 frames per second. Assuming that the wheel
smoothly turns from the starting position zero at a rate of
$F = 1$ Hz., successive frames of the movie will be taken at
$\theta = 0$, $\theta = (1/24) \cdot 2\pi F = \pi/12$, $\theta = \pi/6$, etc. Since the wheel
turns only 1/24th of a revolution at each frame, we can expect
our documentary to represent the facts as they are, a good
quality for any documentary. (Our camera is of course ideal,
so it never blurs the picture of the wheel no matter how fast
the wheel moves.) At $F = 6$ Hz. we would get a series of frames
such as those shown in Figure 2; the wheel turns 6/24 = 1/4 of
a turn at each frame.
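The frame-by-frame spoke positions described above can be tabulated with a few lines of Python. This is only an illustrative sketch: the function name `spoke_angles` and its parameters are hypothetical, and the ideal (blur-free) camera of the text is assumed.

```python
import math

def spoke_angles(freq_hz, frame_rate, n_frames):
    """Spoke angle (radians, in [0, 2*pi)) seen on each of n_frames frames.

    Assumes the wheel starts at position zero and turns smoothly at
    freq_hz revolutions per second, filmed at frame_rate frames per second.
    """
    angles = []
    for n in range(n_frames):
        # Fraction of a full turn completed by frame n, with whole turns discarded.
        turn_fraction = (freq_hz * n / frame_rate) % 1.0
        angles.append(2.0 * math.pi * turn_fraction)
    return angles

# F = 6 Hz filmed at 24 frames per second: a quarter turn per frame.
angles = spoke_angles(6, 24, 5)
```

With these values the list holds the angles 0, π/2, π, 3π/2 and 0 again, which is the sequence of frames the text attributes to Figure 2.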
At $F = 12$ Hz., the wheel turns halfway around at each
frame, and the filmed image of it appears to oscillate back and
forth with maximum rapidity (under these conditions). Why
maximum? What larger motion could the spoke make on
successive frames? If we set $F$ even higher, say, to 18, then the
wheel makes 3/4 of a complete revolution at each frame, but
this is also indistinguishable from the wheel turning backwards
at a rate of $-6$ Hz., equivalent to running the film in Fig. 2 in
reverse. To make this clearer, consider the case of $F = 23$ Hz.
At each frame, the wheel turns 23/24 of a revolution, almost all
the way from where it starts. To verify that this will appear as a
slow backwards motion ($-1$ Hertz), one only has to go to a
western movie and watch carefully, taking into account the
greater number of spokes on most wagon wheels. Finally, at
$F = 24$ Hz., the wheel does not appear to move at all! Thus if
we start at $F = 0$ Hz. and gradually increase the speed to
$F = 24$ Hz., we see the wheel move slowly forward
(counterclockwise, i.e., with small positive frequency), go faster to a
maximum at $F = 12$ Hz., then slow down while turning in the
opposite direction (clockwise motion, negative frequency) and
stop completely at $F = 24$ Hz. If we further increase $F$ from 24
to 48 Hz., we see again the same sequence of events. This is
because all frequencies outside the range 0 to 12 Hz. are
indistinguishable, except for being positive or negative, from
frequencies between 0 and 12 Hz. according to the
relationship

$$F_a = F - \frac{(k+1)R}{2}, \qquad \frac{kR}{2} \le F \le \frac{(k+2)R}{2} \tag{1}$$

where
$F_a$ is the "apparent" frequency in Hz.,
$F$ is the actual frequency in Hz.,
$R$ is the sampling rate in Hz. (samples/second), and
$k$ is any odd integer which satisfies the inequality.

Figure 2. Movie of a one-spoke wagon wheel turning (apparently) at $f = 6$ Hz., shot at 24 frames per second. Successive frames: $t = 0$, $\theta = 0$; $t = 1/24$, $\theta = \pi/2$; $t = 2/24$, $\theta = \pi$; $t = 3/24$, $\theta = 3\pi/2$; $t = 4/24$, $\theta = 2\pi = 0$.
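Thus if the wagon wheel turns at 28 Hz. ($= F$) and we
film it at 24 frames per second ($= R$), we choose $k$ to be an odd
integer which satisfies

$$\frac{kR}{2} \le F \le \frac{(k+2)R}{2}, \qquad \text{i.e.,} \qquad 12k \le 28 \le 12(k+2).$$

The only odd integer which satisfies this inequality is $k = +1$,
since

$$\frac{1 \cdot 24}{2} \le 28 \le \frac{3 \cdot 24}{2}.$$

Therefore the apparent frequency is

$$F_a = 28 - \frac{(1+1) \cdot 24}{2} = 28 - \frac{2 \cdot 24}{2} = +4 \text{ Hz.}$$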
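The rule for finding the apparent frequency can be sketched as a small Python function. The name `apparent_frequency` is hypothetical, and one assumption goes beyond the text: when $F$ falls exactly on an odd multiple of $R/2$, where two values of $k$ are possible, this sketch arbitrarily picks the larger $k$.

```python
import math

def apparent_frequency(F, R):
    """Apparent (aliased) frequency F_a for actual frequency F at sampling rate R.

    Finds the odd integer k with k*R/2 <= F < (k+2)*R/2 and returns
    F - (k+1)*R/2.  The half-open interval is an assumption made here
    to resolve the boundary case F = k*R/2 to a single k.
    """
    k = 2 * math.floor((F + R / 2) / R) - 1   # always an odd integer
    return F - (k + 1) * R / 2

# Wagon-wheel cases from the text, all filmed at R = 24 frames per second.
examples = {F: apparent_frequency(F, 24) for F in (6, 18, 23, 24, 28)}
```

For the frequencies listed, the dictionary maps 6 to +6, 18 to -6, 23 to -1, 24 to 0 and 28 to +4 Hz, matching the cases worked through in the text.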
The point is that while $F$ may be any frequency whatsoever,
$F_a$ is restricted to a definite range of frequencies which
depends on the sampling rate. We can see that $F_a$ and $F$ are the
same only if $(k + 1)R/2$ is equal to zero, which is true only if
$k = -1$. Then

$$F_a = F - 0, \qquad -\frac{R}{2} \le F \le +\frac{R}{2}$$

or simply, $F_a = F$ only if $|F| < R/2$ ($|F|$ is the "absolute
value" or "magnitude" of $F$ without regard to its plus or
minus sign). If $|F| > R/2$, then $F_a \ne F$, and we say that $F_a$ is
an alias of $F$. This phenomenon of aliasing or foldover is found
in all sampled systems, whether they are filmed wagon wheels
or digitized waveforms. Equation (1) is a statement of the
sampling theorem, which states that any simple harmonic
variation (i.e., a sinusoidal variation of a one-dimensional
quantity, a circular motion in a two-dimensional quantity,
etc.) which occurs at a rate of $F$ Hz. must be sampled at least
$2F$ times per second in order to avoid aliasing.
The reader may have already noticed that if the sampling
rate is exactly twice the frequency being sampled ($R = 2F$),
then Equation (1) is ambiguous, since there are two different
values of $k$ which will satisfy the inequality. We will come
back to this fine point later on.
In order to process sounds with a computer, we represent
their waveforms as sequences of discrete, finite-precision
numbers. These are the samples of the instantaneous
amplitude of these waveforms taken at brief, regular intervals in
time. Any musical waveform can be modelled as a sum of
sinusoidal vibrations, each with a particular (though possibly
time-varying) amplitude, frequency, and phase. Thus, in order to
represent a continuous (analog) waveform accurately with
discrete (digital) samples, we must ensure that the sampling
frequency is at least two times greater than that of the highest
frequency component of the original waveform. The sampling
theorem (Equation (1)) then assures us of the accuracy of our
rendition of the waveform if it is bandlimited to the frequency
region below one-half the sampling rate.
The sampling process is achieved by using an analog-to-digital
converter (ADC) which generates a numerical value in
computer-readable form (typically a binary number of 12 to
16 binary digits, or bits). The converter's numerical output
is proportional to the electrical level (either voltage or current)
at its input, which is sampled at a rate ranging from a few
Hertz (for signals such as seismic waves) to 50 kiloHertz (for
high-quality audio signals). The analog waveform is typically
passed through a low-pass filter to attenuate any components
at frequencies greater than half the sampling rate (see
Figure 3), since these are generally impossible to remove from
the digital signal due to the aliasing effect described above.
Whether the distortion due to aliasing produces noticeable
effects in musical sounds depends on the relative strengths of
the aliased components, but severe aliasing is generally much
more noticeable and irritating in sounds than it is in movies
of wagon wheels.
The analog-to-digital converter produces a B-bit binary
value to represent the instantaneous amplitude of the analog
signal at each sample. Since $B$ binary digits may represent at
most $2^B$ different values, this means that the ADC must choose
the closest B-bit value available for each sample. Thus, if the
bandlimited analog signal varies between, say, +10 and -10
volts, and $B = 10$ bits, the entire 20 volt (peak-to-peak) range
may be represented to an accuracy of $20/2^{10} \approx .02$ volts at
each sample. In other words, the true voltage amplitude differs
from its binary representation by at most ±10 millivolts,
for an accuracy of about ±.05%. Such inaccuracies are often
significant, since we can view them as equivalent to a small,
constant amount of random noise being added to an otherwise
perfectly represented signal. This quantization noise, as it is
called, is the digital equivalent to tape or amplifier hiss, and it
is usually characterized by a signal-to-quantization noise ratio
(SQNR), expressed in dB (decibels):

$$\text{SQNR in dB} = 20 \log_{10} \frac{\text{signal amplitude}}{\text{noise amplitude}} \tag{2}$$

Thus, if a maximum amplitude of 10 volts corresponds to the
maximum binary value for a 10-bit ADC, the noise amplitude
will be $2^{-10}$ as great as that of the strongest signal, yielding an
SQNR of

$$20 \log_{10} \frac{10}{10 \cdot 2^{-10}} = 20 \log_{10} 2^{10} \approx 20 \log_{10} 1000 = 60 \text{ dB}$$

Figure 3. Steps by which a continuous signal is converted into
a digital signal for subsequent computer processing: a signal source
(e.g., sound, an air pressure variation over time); a transducer
(e.g., a microphone), which produces an electrical analog to the
pressure variations; a low-pass filter, which removes frequency
components above $R/2$ Hz. and yields a band-limited analog waveform;
an analog-to-digital converter, which samples at $R$ Hz. and quantizes
to $B$ bits; and computer memory, which stores the complete
representation as a sequence of binary numbers, a discrete
representation of the band-limited analog waveform (the digital signal).
The reader may wish to verify that under these conditions and
assumptions, the SQNR of a B-bit ADC is approximately
$6B$ dB. However, two caveats must be kept in mind: First, we
are assuming that the quantization error may be treated as a
random noise independent of the signal, which is certainly
questionable, especially at low sampling rates or for small
numbers of bits. Second, if the analog signal amplitude is not
maximal, it must be remembered that the noise level remains
the same, rendering the quantizing noise more audible and
bothersome for very soft sounds than for loud ones. ADC's
are available with 8 to 16 bits of resolution, and while the
issue has not quite been resolved, it seems that 12-bit ADC's
give minimally acceptable sound quality, and that improvement
beyond 16 bits is probably unnecessary, since at that
accuracy the noise levels of transducers and amplifiers become
predominant. Special bit-coding techniques may eventually
reduce the amount of data in a digital signal, but most computer
music programs so far have not dealt with this possibility.
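The "approximately $6B$ dB" rule can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the noise amplitude is taken to be $2^{-B}$ times the maximum signal amplitude as in the text, and the rounding quantizer assumes a converter range of -10 to +10 volts with levels spaced $20/2^B$ volts apart.

```python
import math

def sqnr_db(bits):
    """SQNR in dB when the noise amplitude is 2**-bits of the signal amplitude."""
    return 20.0 * math.log10(2 ** bits)

def quantize(volts, bits, v_min=-10.0, v_max=10.0):
    """Round a voltage to the nearest of 2**bits equally spaced levels.

    The level layout (starting at v_min, spaced (v_max - v_min)/2**bits apart)
    is an assumption of this sketch, not a description of any particular ADC.
    """
    step = (v_max - v_min) / 2 ** bits
    index = round((volts - v_min) / step)
    index = max(0, min(2 ** bits - 1, index))   # clamp to the available codes
    return v_min + index * step

ten_bit_sqnr = sqnr_db(10)              # about 60.2 dB
db_per_bit = sqnr_db(16) / 16           # about 6.02 dB for every bit
error = quantize(3.3, 10) - 3.3         # no larger than half a step (about 10 mV)
```

Since $20 \log_{10} 2 \approx 6.02$, each added bit contributes about 6 dB, which is where the $6B$ figure comes from.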
Producing sounds with a computer is just the reverse of
the process diagrammed in Figure 3: a digital-to-analog
converter (DAC) is used to convert binary numbers to voltage
levels, a low-pass filter "smooths" the waveform by passing
only those frequencies less than half the sampling rate, and the
resulting analog signal is then amplified and transduced by a
loudspeaker or earphones.
The numerical version of the signal may be stored in
computer memory and processed in a variety of ways. Two of
the most important of these processes are transforming the
digital signal in order to analyze its frequency spectrum, and
filtering, which alters its frequency spectrum. The information
gained by analyzing the digital signal may be used to understand
how such signals might be synthesized (a common
objective in computer music), and filtering is a major technique
for controlling the quality, or timbre, of sounds
produced.
DIGITAL SIGNALS
A digitized signal is not represented as a function of a
continuous time variable ($t$), but rather as a function of discrete
values of time ($n$). In other words, we can think of a
digital signal as a sequence of numbers, each representing the
instantaneous value of a (presumably) continuous time function.
Furthermore, we will always assume that the samples
are uniformly spaced in time. Two basic notations for such
discrete-valued functions are commonly used in the digital
signal processing literature, either

$$x(n), \qquad N_1 \le n \le N_2, \quad n \in I$$

or

$$x(nT), \qquad N_1 \le n \le N_2, \quad n \in I$$

Both of these notations have the same meaning except that in
the second one the sampling period, $T$, is shown explicitly.
Since it is easy enough to remember that the relation between
successive integer values of $n$ and time depends on the sampling
rate, we will use the first notation here. For example, a
one-second sine wave at a frequency of 100 Hz. is represented
as

$$f(t) = \sin(\omega t), \qquad 0 \le t \le 1$$

where $\omega = 2\pi F = 2\pi \times 100$. If we sample this waveform at
500 Hz., the discrete form of this equation will be written

$$x(n) = \sin(\omega n), \qquad 0 \le n \le 499$$

again with $\omega = 2\pi F$, but $n$ and $t$ are not equal to each other.
Properly speaking, the quantity $nT$, where $T = 1/500$ second,
is equal to discrete values of $t$ for integer values of $n$. Note also
that we will generally number $N$ samples from 0 to $N - 1$.
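To make the relation between $n$ and $t$ concrete, the sketch below generates the 500 samples of the example. It writes the sampling period explicitly, evaluating $\sin(2\pi F \cdot nT)$; that is this sketch's reading of the remark that $nT$, not $n$, corresponds to time. The function name `sample_sine` is hypothetical.

```python
import math

def sample_sine(freq_hz, rate_hz, duration_s):
    """Samples of sin(2*pi*freq_hz*t) taken at t = n*T, with T = 1/rate_hz."""
    n_samples = int(round(rate_hz * duration_s))
    T = 1.0 / rate_hz                      # sampling period in seconds
    return [math.sin(2.0 * math.pi * freq_hz * n * T) for n in range(n_samples)]

# One second of a 100 Hz sine wave sampled at 500 Hz: samples n = 0 ... 499.
x = sample_sine(100.0, 500.0, 1.0)
```

At these rates each period of the sine wave spans 500/100 = 5 samples, so `x[5]` returns (to within rounding error) to the starting value of zero.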
Two important special functions which we will need in
our discussion are the impulse, or unit sample, function and
the complex exponential function. The digital impulse function
is defined to be equal to one only if its argument is zero,
and it has a zero value otherwise, i.e.

$$u(n) = \begin{cases} 1, & n = 0 \\ 0, & n \ne 0 \end{cases} \tag{3}$$

If we want the impulse to occur on some sample $n_0 \ne 0$, it
follows from the definition that the following equation is
true:

$$u(n - n_0) = \begin{cases} 1, & n = n_0 \\ 0, & n \ne n_0 \end{cases} \tag{4}$$

Figure 4 shows a specific example for $n_0 = 4$.

Figure 4. The unit sample (digital impulse) function $u(n)$ and
the delayed unit sample function $u(n - n_0)$ for the case $n_0 = 4$.

The complex exponential function cannot be graphed
quite as easily as the unit sample function because it has values
consisting of both real and imaginary parts:

$$e^{j\omega n} = \cos(\omega n) + j \sin(\omega n) \tag{5}$$

where $j^2 = -1$.
Figure 5 shows two graphs of this function, one of its
real part, and the other of its imaginary part. From looking at
Figure 5, we cannot tell the frequency of the sinusoidal
waveforms depicted. But we can see that there appear to be 8
samples in each period of the sinusoidal waves, and deduce
that the frequency of these waveforms must therefore be
one-eighth the sampling rate. Waveforms obtained by sampling
sounds do not have real and imaginary parts, of course;
we say that such waveforms are pure real, or equivalently,
that they are complex with a zero imaginary part.
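Both special functions are easy to tabulate. The following sketch uses Python's built-in complex numbers; the function names are hypothetical, and the values $n_0 = 4$ and $N = 8$ are taken from the two figures.

```python
import cmath
import math

def unit_sample(n, n0=0):
    """Delayed unit sample u(n - n0): one when n equals n0, zero otherwise."""
    return 1 if n == n0 else 0

def complex_exponential(n, N):
    """e^{j*w*n} with w = 2*pi/N, i.e. N samples per period."""
    w = 2.0 * math.pi / N
    return cmath.exp(1j * w * n)

# u(n - 4) evaluated for n = -1 ... 9.
delayed = [unit_sample(n, 4) for n in range(-1, 10)]

# One period of the complex exponential with 8 samples per period.
one_period = [complex_exponential(n, 8) for n in range(8)]
real_part = [z.real for z in one_period]    # samples of cos(w*n)
imag_part = [z.imag for z in one_period]    # samples of sin(w*n)
```

The lists `real_part` and `imag_part` are the cosine and sine sequences that the two graphs of the complex exponential depict.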
Figure 5. The complex exponential function $e^{j\omega n}$, shown as
graphs of its real part, $\mathrm{Re}[e^{j\omega n}]$, and its imaginary part,
$\mathrm{Im}[e^{j\omega n}]$, for $\omega = 2\pi/N$, $N = 8$.

SPECTRA
As almost everyone knows, two different musical instruments
playing the same pitch at the same loudness for the
same duration from the same direction still sound different
due to what is called their tone color, or timbre. Unfortunately,
this subtractive definition of timbre only says what it is not:
timbre is that aspect of a sound which is not its pitch (if it
has one), loudness, duration, or directionality. What is left is
just the microstructure of the sound, and in order to examine
it, we need a way to literally dissect sounds, i.e., to analyze
them into their constituent parts. Obviously, a complete
description of all of the constituent parts of a particular sound
will include information about its pitch, loudness, and so on,
and in a broad sense we might even include these qualities in
our definition of timbre. Except on a few electronic instruments
such as Theremins, or electronic organs, the tone color
does not remain the same when different notes are played, due
to varying string or tube lengths, lip tension, and so on. Since
all of these variations in tone quality are quite relevant in
accounting for the characteristic sounds of musical instruments,
we can see that analysis of the microstructure of a
sound is likely to yield information only about that particular
sound. Even two successive notes played on the same instrument
by the same performer in the same manner are likely to
have strikingly different microstructures. It is the study of this
tonal microstructure, and its relationship to what we hear, that
is one of the deepest problems in computer music research, for
it is here that the complexities of the physics of the instrument,
of room acoustics, and of the psychology of the listener,
enter in.
Our model for describing this microstructure is called
the spectrum of a sound, by analogy to the spectrum of a
beam of light that may be obtained by passing the light
through a prism. The prism has the property that light made
up of different frequencies, or colors, is refracted by varying
amounts, the index of refraction depending on the component,
or primary, color in question. By observing the intensities
of the light at these different frequencies, we are able to
determine the make-up of the original light beam. If the beam is
"pure white" light, we obtain a "full spectrum," proverbially
the colors of the rainbow.
The prism for sounds is Fourier analysis. By applying the
Fourier transform to the waveform of a sound, we can
mathematically determine just which amounts of which frequencies
are responsible for that particular waveshape, and we can use
our analysis as a guide in synthesizing that sound. If the sound
consists of all audible frequencies in roughly the same amounts,
we call the result "white sound," by analogy to white
light. Unfortunately, since a rainbow is considerably more
appealing than the steady, steam-like hiss of its audible
counterpart, we usually refer to this sound as white noise. If,
however, some frequencies are considerably more predominant
than others, the sound becomes "colored," and if the
relationships among these predominant components become roughly
harmonic (i.e., the frequencies are integer multiples of a single
frequency, called the fundamental frequency) the tone will
acquire a more definite pitch. When the waveform consists
entirely of harmonically related frequencies, it will be
periodic, with a period equal to the reciprocal of the
fundamental frequency (which need not be present).
The measurement of sound spectra is complicated by the
fact that the spectra of almost all sounds change both rapidly
and drastically as time goes by. This situation is worsened by
the fact that the accuracy with which we can measure a
spectrum inherently decreases as we attempt to measure it over
smaller and smaller intervals of time. The spectrum of any
instant during the temporal evolution of a waveform does not
even exist; for example, we could scarcely tell anything at all
about the frequency components of a digital signal by examining
a single sample! We can measure what happens to the
spectrum only on the average over a short interval of a sound,
perhaps a millisecond or so. The longer the interval, the more
accurate our measurement of the average spectral content
during that interval, but the less we know of the variations
that occurred during that interval. Thus the problem of
spectral measurement can be seen to be one of finding the best
compromise between these opposing goals. Just how much
accuracy is needed is still an open question in the realm of
musical psychoacoustics: in some cases our ears seem to be
much more tolerant of approximations than in others. The
historical model of spectra as measured by Hermann Helmholtz
(see References) is clearly inadequate for believable resynthesis
(Helmholtz was able to determine the average value of spectral
components over the duration of entire notes played on
individual instruments). A more recent model characterizes a note
by attack, steady-state, and decay segments. Such a model is
certainly an improvement; but it has limitations when applied
to the problem of producing "connected notes" (i.e., to
problems of musical phrasing). Besides, the "steady-state" of
any real tone isn't really "steady" at all.
This discussion is not intended to imply that the situation
is hopeless, but only that it is subtle and complex, and
that it is as important to appreciate the limitations of the
spectral measurement techniques presented here as it is to
realize their power. There can be little doubt that these
techniques, and their relatives and extensions, will be the ones
which will eventually yield the basis for a richly expressive
computer music.
THE DISCRETE FOURIER TRANSFORM
The two most commonly used transforms in digital
signal processing are the discrete Fourier transform (DFT) and
the so-called z-transform. The DFT is used to calculate the
spectrum of a waveform in terms of a set of harmonically
related sinusoids, each with a particular amplitude and phase.
It is usually implemented by means of a particularly efficient
algorithm known as the FFT (for fast Fourier transform), the
discovery of which has made spectral computation a much
more practical reality than it would be otherwise. Since the
DFT is less restrictive (albeit less efficient), and since the FFT
is well documented for those with a basic understanding of the
DFT, we will consider only the DFT here. The z-transform,
unlike the FFT, is not something that is typically calculated
with a computer, but is rather a mathematical tool used
primarily in the theory of digital filters. It is in a sense more
general than the DFT, since it includes the DFT as a special
case, and it is of considerable interest in the general theory of
digital signal processing.
The fundamental operation of the DFT is to decompose
an arbitrary waveform into its spectrum. The spectrum of a
waveform is a description of that waveform in terms of a
number of "basic building blocks" for waveforms, which in
the case of the DFT are sinusoids with harmonically related
frequencies. By analogy, if we factor an integer into its prime
factors, we have in a sense "decomposed" the original number
into basic numerical "building blocks;" for example,
340 = 1 × 2 × 2 × 5 × 17. The building blocks themselves
(the prime numbers) are just those numbers which cannot
be further decomposed: they can be expressed only as one
times themselves, hence they are the components of which
other numbers are formed, and not vice versa. Finally, factoring
any integer into primes yields an answer that is unique:
there is no other set of prime numbers which, when multiplied
together, will yield 340 except the set stated above. Perhaps
we picked the number 340 in the first place, not on the
basis of its prime factors, but because it was the sum of the
ages of everyone in a small room: 340 = 10 + 28 + 32 + 40 +
50 + 50 + 60 + 70. This is clearly another way to decompose
340, but it is not unique, since an infinite number of
sequences sum to 340.
Similarly, the basic building block used by the DFT for
waveforms is the sinusoid. The DFT works by treating $N$
samples of a waveform as if it were one N-sample period of
an infinitely long waveform composed of a sum of sinusoids
which are all harmonics of a fundamental frequency
corresponding to the N-sample period. And, like the prime factors
discussed above, this set of harmonically related sinusoids,
each with a particular amplitude and phase, is unique: no
other set of sinusoids could be summed together to obtain the
original waveform. Of course there may be other, possibly
non-unique ways of decomposing a waveform, just as 340
could be non-uniquely decomposed into non-unique sums of
ages rather than a unique product of primes. In fact, other
unique decompositions for waveforms exist besides the sum of
sinusoids yielded by the DFT, which we won't consider here.
We should also discuss the concept of energy at a particular
frequency. A waveform may, in signal processing parlance,
have energy at, say, 100 Hz., which means that at least one
sinusoidal component with a frequency of 100 Hz. and a
nonzero amplitude is present in the vibration pattern. Energy in
this case designates "that which exists" at 100 Hz. which does
not exist at, say, 110 Hz. The DFT functions by measuring the
amplitudes of sinusoidal components at particular frequencies
in a waveform, and since energy can be shown to be proportional
to the square of amplitude, we can see that this process
measures the energy at such frequencies. We could imagine
accomplishing this process in a laboratory with a set of
electrical audio filters, each of which will pass energy only at one
frequency and block energy at all others. This bank of filters
could be used to detect energies at a set of frequencies for an
arbitrary input signal. The DFT accomplishes this mathematically
in the following way.
Suppose $x(n)$ is a sequence of numbers representing
N samples of one period of a waveform with a period of N
samples. For example, let $x(n) = A \sin(\omega n)$, with $\omega = 2\pi/N$
and $0 \le n \le N-1$. For N = 8 we would have the sequence

$$x(n) = 0,\ .707A,\ A,\ .707A,\ 0,\ -.707A,\ -A,\ -.707A$$

We can measure the energy at frequency $\omega$ by extracting the
amplitude, A, of the sinusoid at this frequency. This is
accomplished in this case by forming the product of $x(n)$ with
$\sin(\omega n)$, and adding up the numbers in the resulting sequence,
since

$$\sum_{n=0}^{N-1} x(n)\sin(\omega n) = 0 + \frac{A}{2} + A + \frac{A}{2} + 0 + \frac{A}{2} + A + \frac{A}{2} = 4A = N\,\frac{A}{2}$$

The result is A/2, one-half the amplitude of the sinusoid at
frequency $\omega$, scaled by N, the number of samples under consideration. We could not simply sum together the numbers in
the $x(n)$ sequence to obtain the same result, since summing
over any integral number of periods of a sinusoid yields a zero
result. This is due to the symmetry of the sine and cosine functions above and below the horizontal axis. However, by multiplying $x(n)$ by $\sin(\omega n)$, we form the sequence $x(n)\sin(\omega n)
= A \sin^2(\omega n)$, and all values of $\sin^2$ are non-negative.
Thus we have "extracted the amplitude" of the sinusoid
at frequency $\omega$ in $x(n)$ by purely mathematical means. What
would happen if we were to "extract the amplitude" of the
component of $x(n)$ at frequency $2\omega$? With $x(n)$ defined as
before, we expect that no such component will be detected,
i.e., that its amplitude will be zero. In order to verify that this
is so, we form the product sequence $x(n)\sin(2\omega n)$ and add
up the resulting numbers:

$$\sum_{n=0}^{N-1} A\sin(\omega n)\sin(2\omega n) = \frac{A}{2}\sum_{n=0}^{N-1}\left[\cos(-\omega n) - \cos(3\omega n)\right] = \frac{A}{2}\sum_{n=0}^{N-1}\cos(-\omega n) - \frac{A}{2}\sum_{n=0}^{N-1}\cos(3\omega n) = 0$$

We have used a trigonometric identity to show that this
sequence is composed of the sum of two cosine waves, one at
frequency $-\omega$ and the other at frequency $3\omega$. Since the
cosine wave has the same symmetry above and below the horizontal axis as the sine, both of these components sum to zero
as well, indicating no energy at frequency $2\omega$. As long as we
are summing up the values of a sinusoidal waveform over any
integral number of periods, we get zero. The sinusoids which
have an integral number of periods in a duration of N samples
are just those corresponding to the harmonics of the frequency
with a period of N samples.
So far we haven't considered the phase of the sinusoid at
frequency $\omega$. We recall from Part I of this tutorial that a
sinusoid with arbitrary phase and amplitude can be represented as

$$A\sin(\omega n + \phi) = a\cos(\omega n) + b\sin(\omega n) \tag{7}$$

where
$A$ is the amplitude,
$\phi$ is the phase angle,
$a$ is equal to $A \sin\phi$, and
$b$ is equal to $A \cos\phi$.
Both the amplitude and phase of a sinusoidal component
at frequency $\omega$ can then be determined by using our multiply-and-sum procedure, first with a $\cos(\omega n)$ multiplier to calculate the "a" coefficient, then with $\sin(\omega n)$ to calculate the
"b" coefficient. The amplitude and phase of the component
are then given by the relations

$$A = \sqrt{a^2 + b^2} \quad\text{and}\quad \phi = \tan^{-1}\frac{a}{b} \tag{8}$$

For example, let the sequence $x(n)$ be defined as follows

$$x(n) = A\sin(\omega n + \phi_1) + B\sin(2\omega n + \phi_2)$$
$$= a_1\cos(\omega n) + b_1\sin(\omega n) + a_2\cos(2\omega n) + b_2\sin(2\omega n)$$

where
$a_1 = A\sin\phi_1$, $b_1 = A\cos\phi_1$, $a_2 = B\sin\phi_2$, and $b_2 = B\cos\phi_2$.
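As a numerical aside, the single-sinusoid results above can be checked with a few lines of Python. This is only a sketch: the helper name `multiply_and_sum`, the amplitude 3.0 and the phase 0.7 are illustrative assumptions, not values from the article. The factor 2/N used to recover a and b assumes a harmonic other than D.C. or half the sampling rate.

```python
import math

def multiply_and_sum(x, multiplier, harmonic):
    """Sum x(n) * multiplier(harmonic * w * n) over one N-sample period (hypothetical helper)."""
    N = len(x)
    w = 2 * math.pi / N
    return sum(x[n] * multiplier(harmonic * w * n) for n in range(N))

N, A = 8, 3.0                      # A is an arbitrary illustrative amplitude
w = 2 * math.pi / N
x = [A * math.sin(w * n) for n in range(N)]

at_w = multiply_and_sum(x, math.sin, 1)     # should equal N * A / 2
at_2w = multiply_and_sum(x, math.sin, 2)    # should vanish: no energy at 2w

# A sinusoid with arbitrary phase: A sin(wn + phi) = a cos(wn) + b sin(wn)
phi = 0.7                          # illustrative phase angle in radians
y = [A * math.sin(w * n + phi) for n in range(N)]
a = 2 / N * multiply_and_sum(y, math.cos, 1)   # a = A sin(phi)
b = 2 / N * multiply_and_sum(y, math.sin, 1)   # b = A cos(phi)
amplitude = math.hypot(a, b)       # Equation (8): sqrt(a^2 + b^2)
phase = math.atan2(a, b)           # Equation (8): tan(phi) = a / b
```

`math.atan2` is used instead of a bare arctangent so that the quadrant of the phase angle is kept when b is negative.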
We "extract" $a_1$ via our multiply-and-sum procedure,
using $\cos(\omega n)$ as a multiplier:

$$\sum_{n=0}^{N-1} x(n)\cos(\omega n) = \sum_{n=0}^{N-1}\left[a_1\cos^2(\omega n) + b_1\sin(\omega n)\cos(\omega n) + a_2\cos(2\omega n)\cos(\omega n) + b_2\sin(2\omega n)\cos(\omega n)\right]$$
$$= a_1\sum_{n=0}^{N-1}\tfrac{1}{2}\left[1 + \cos(2\omega n)\right] = \frac{N}{2}\,a_1$$

Similarly, if we were to use $\sin(\omega n)$ as a multiplier we
could extract $b_1$; $\cos(2\omega n)$ as a multiplier would extract $a_2$,
and so on. Given both $a_1$ and $b_1$, we can then apply
Equations (8) to obtain $A$ and $\phi_1$, if desired. This is the principle of the DFT: the multiply-and-sum procedure is applied to
determine the amplitudes and phases of each of the harmonics
of the waveform.
How many harmonics might be present? According to
the sampling theorem, we need at least 2 samples in each
period in order to avoid aliasing; so if N = 8, and the sampling
rate R is 8000 Hz., the only possible frequencies of which
our periodic function $x(n)$ could be composed are the harmonics of 8000/8 = 1000 Hz. which "fit", i.e., which have
frequencies less than or equal to one half the sampling rate.
However, the sampling theorem is perfectly admissive of
negative frequencies, so the complete list of integral multiples of 1000 which have magnitudes $\le$ 4000 Hz. is:

  - 4000 Hz.  (harmonic "-4")
  - 3000
  - 2000
  - 1000
       0      (harmonic "0")
  + 1000      (harmonic +1, or the fundamental frequency)
  + 2000
  + 3000
  + 4000

$x(n)$ is modelled as being composed only of sinusoids at these
frequencies, i.e.,

$$x(n) = \frac{1}{N}\sum_{k=-N/2}^{+N/2}\left[a_k\cos(k\omega n) + b_k\sin(k\omega n)\right] \tag{9}$$

where

$$\omega = 2\pi/N,\qquad a_k = \sum_{n=0}^{N-1} x(n)\cos(k\omega n),\quad\text{and}\quad b_k = \sum_{n=0}^{N-1} x(n)\sin(k\omega n)$$

Here the 1/N factor is included to compensate for the fact that
the latter two sums are scaled by N, as derived earlier.
Notice that if $x(n) = A\cos(\omega n)$, and we "extract the
amplitude" of $x(n)$ at frequency $\omega$ with our multiply-and-sum procedure, we find that the answer is A/2:

$$\frac{1}{N}\sum_{n=0}^{N-1} A\cos(\omega n)\cos(\omega n) = \frac{A}{N}\sum_{n=0}^{N-1}\tfrac{1}{2}\left[1 + \cos(2\omega n)\right] = \frac{A}{2}$$

But notice also that if we "extract the amplitude" of $x(n)$ at
frequency $-\omega$ the answer is the same:

$$\frac{1}{N}\sum_{n=0}^{N-1} A\cos(\omega n)\cos(-\omega n) = \frac{A}{N}\sum_{n=0}^{N-1}\tfrac{1}{2}\left\{\cos\left[\omega n - (-\omega n)\right] + \cos\left[\omega n + (-\omega n)\right]\right\} = \frac{A}{2}$$

Before we proceed, let us consider just what is meant by
a "negative frequency." When we considered wagon wheels,
we measured angles in a counterclockwise direction from the
right horizontal axis as positive angles, and clockwise as negative angles. Clearly the angle +270° describes the same spoke
position as -90°. Similarly, the radian frequency of rotation
was positive for counterclockwise motion and clockwise
motion was described as a negative radian frequency. Does it
matter whether we use a positive or negative description of a
frequency? Not too surprisingly, the answer is yes and no.
Certain mathematical functions have the property
known as evenness, which means that they are left-right symmetrical around zero:

$$f(x) = f(-x) \;\Rightarrow\; f(x)\ \text{"even"} \tag{10}$$

(the symbol $\Rightarrow$ means "implies that"). Other functions have
the property of oddness, which means that they are left-right
antisymmetrical around zero:

$$f(-x) = -f(x) \;\Rightarrow\; f(x)\ \text{"odd."} \tag{11}$$

Some functions are even, some are odd, many are neither, and
only one is both (it is left as an exercise for the reader to discover the only function that is both even and odd). Any function may be thought of, however, as composed of the sum of
an even part and an odd part, either (or both) of which may be
zero. In other words, any arbitrary function $f$ may be "broken
apart." Thus,

$$f(x) = f_e(x) + f_o(x) \tag{12}$$

where $f_e(-x) = f_e(x)$ and $f_o(-x) = -f_o(x)$.
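Equation (12) can be illustrated numerically ahead of the proof. The sketch below assumes the half-sum and half-difference construction that the proof which follows derives; the helper names `even_part` and `odd_part` and the choice of the exponential as a test function are illustrative assumptions only.

```python
import math

def even_part(f):
    """Return fe with fe(x) = 1/2 [f(x) + f(-x)] (hypothetical helper)."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """Return fo with fo(x) = 1/2 [f(x) - f(-x)] (hypothetical helper)."""
    return lambda x: 0.5 * (f(x) - f(-x))

f = math.exp                      # a function that is neither even nor odd
fe, fo = even_part(f), odd_part(f)

samples = [-2.0, -0.5, 0.0, 0.3, 1.7]
for x in samples:
    # fe is left-right symmetrical, fo is antisymmetrical, and they sum to f
    print(x, fe(x) - fe(-x), fo(x) + fo(-x), fe(x) + fo(x) - f(x))
```

For this particular choice of f the even and odd parts happen to be the hyperbolic cosine and sine, which gives an independent check on the construction.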
Here's the proof that this is so:

$$f(x) = f_e(x) + f_o(x)$$
$$f(-x) = f_e(-x) + f_o(-x)$$

But, by the definitions of $f_e$ and $f_o$,

$$f(-x) = f_e(x) - f_o(x)$$

Therefore we can solve for either $f_e$ or $f_o$ by adding (or subtracting) $f(x)$ and $f(-x)$:

$$f(x) + f(-x) = \left[f_e(x) + f_o(x)\right] + \left[f_e(x) - f_o(x)\right]$$
$$f_e(x) = \tfrac{1}{2}\left[f(x) + f(-x)\right]$$

Similarly,

$$f_o(x) = \tfrac{1}{2}\left[f(x) - f(-x)\right]$$

Clearly,

$$f_e(-x) = \tfrac{1}{2}\left[f(-x) + f(x)\right] = f_e(x)$$

and,

$$f_o(-x) = \tfrac{1}{2}\left[f(-x) - f(x)\right] = -f_o(x).$$

Finally,

$$f_e(x) + f_o(x) = \tfrac{1}{2}\left[f(x) + f(-x)\right] + \tfrac{1}{2}\left[f(x) - f(-x)\right] = f(x).$$

We have proved that $f(x)$ can be decomposed into a sum of
even and odd parts without saying anything else at all about
$f(x)$, so it is true for all functions!
Getting back to the question of negative frequencies, it
is clear that if a function is purely even, such as the cosine
function, the sign of the frequency doesn't matter at all since

$$\cos(\omega n) = \cos(-\omega n) \tag{13}$$

But for a purely odd function, such as sine, it represents a
negation of amplitude, or equivalently, a 180° phase shift:

$$\sin(-\omega n) = -\sin(\omega n) = \sin(\omega n + \pi) \tag{14}$$

Because $A\cos(\omega n)$ is an even function, it is generally
meaningless to distinguish between positive and negative frequency cosine waveforms. But the spectrum of a cosine wave
may be considered to contain both positive and negative frequency components, both of which have a positive amplitude
equal to A/2. For a sine waveform $A\sin(\omega n)$, we obtain also
positive and negative frequency components, but of opposite
amplitude due to the oddness of the sine function: If the
positive frequency component has a positive amplitude, then
the corresponding negative frequency component will have a
negative amplitude, and vice versa. The complete DFT yields
the amplitudes and phases of both the positive and negative
harmonics. The amplitude is split in half at corresponding
positive and negative frequencies, with the signs of the amplitude of the odd parts being opposite: this explains the fact
that only one-half of the amplitude is measured if we
consider only positive frequencies.
One more aspect of cosine and sine waveforms should
be mentioned before we proceed to define the DFT. A
component at exactly one-half the sampling rate, i.e., with
only 2 samples per period, can only be purely even, since the
samples occur at angles 0 and $\pi$, and both $\sin(0)$ and $\sin(\pi)$
are equal to 0. Thus the "$b_k$" coefficients of Equation (9) will
always be zero whenever $k = \pm N/2$. Similarly, at zero frequency (also called "D.C." for direct current by some engineers and
others who talk to these engineers), the "$b_0$" coefficient is
always zero, again since $\sin(0) = 0$.
We are now ready to define the DFT of a sequence
$x(n)$. As mentioned before, we model $x(n)$ as one N-sample
period of a periodic waveform. The DFT will then yield the
unique spectrum of $x(n)$ in terms of the amplitudes and
phases of sinusoidal components, each with periods harmonically related to N, the number of samples in the transformed
signal. While the FFT algorithm generally requires that N be
a power of 2, the DFT (which yields exactly the same result,
albeit with less computational efficiency) places no restriction
on N except, of course, that it be greater than 2 samples.
Following the usual practice in the literature, we will
define the DFT in terms of the complex exponential which
allows us to represent both sine and cosine functions at once.
A typical definition for the DFT is then

$$\mathrm{DFT}[x(n)] = X(k) = \sum_{n=0}^{N-1} x(n)\,e^{-j\omega nk},\qquad 0 \le k \le N-1 \tag{15}$$

where $\omega = 2\pi/N$ and $e^{-j\omega nk} = \cos(\omega nk) - j\sin(\omega nk)$.
The inverse DFT is then defined as

$$\mathrm{DFT}^{-1}[X(k)] = x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\,e^{+j\omega kn},\qquad 0 \le n \le N-1 \tag{16}$$

Thus if $X(k)$ is the DFT of $x(n)$, then $x(n)$ is the inverse DFT
of $X(k)$. The mathematical oddness of the imaginary part of
the complex exponential necessitates the use of the minus sign
in the exponent of the DFT, while the inverse DFT has a positive exponent. Also, since the multiply-and-sum procedure
produces values which are scaled by a factor of N, the 1/N
factor appears in the inverse DFT in order to make the statement $\mathrm{DFT}^{-1}[\mathrm{DFT}[x(n)]] = x(n)$ exactly true. The values of
$X(k)$ (the spectrum) are complex, with the real parts corresponding to the "a" (cosine, even part) coefficients and the
imaginary parts corresponding to the "b" (sine, odd part)
coefficients of the spectral components. If we denote $X(k)$ as
$a_k + jb_k$, then

$$|X(k)| = \sqrt{a_k^2 + b_k^2} \tag{17}$$

is called the magnitude, modulus, or amplitude of $X(k)$, which
is the same as the amplitude of the corresponding spectral
component (except for being scaled by N).

$$\arg[X(k)] = \mathrm{pha}[X(k)] = \angle X(k) = \tan^{-1}\frac{b_k}{a_k} \tag{18}$$

is called the argument, phase, or angle of $X(k)$, which is equal
to the phase angle of the corresponding spectral component.
The next question is: How do the values of the index k
correspond to frequency? In order to understand this correspondence, it is instructive to take the DFT of a specific
sequence of numbers and see exactly what we get.
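Equations (15) through (18) translate almost literally into code. The following is a minimal sketch of a direct (non-FFT) transform pair in Python; the function names `dft` and `idft` and the six-sample test sequence are assumptions made for illustration, and the length 6 is chosen only to show that N need not be a power of 2.

```python
import cmath
import math

def dft(x):
    """Direct evaluation of Equation (15): X(k) = sum_n x(n) e^{-j w n k}."""
    N = len(x)
    w = 2 * math.pi / N
    return [sum(x[n] * cmath.exp(-1j * w * n * k) for n in range(N))
            for k in range(N)]

def idft(X):
    """Direct evaluation of Equation (16), including the 1/N factor."""
    N = len(X)
    w = 2 * math.pi / N
    return [sum(X[k] * cmath.exp(1j * w * k * n) for k in range(N)) / N
            for n in range(N)]

x = [0.5, -1.0, 2.0, 0.25, 0.0, 1.5]      # illustrative values, N = 6
X = dft(x)
magnitude = [abs(Xk) for Xk in X]          # Equation (17)
phase = [cmath.phase(Xk) for Xk in X]      # Equation (18), quadrant-aware
roundtrip = idft(X)                        # should reproduce x
```

Both functions cost on the order of N squared operations, which is the inefficiency relative to the FFT that the text mentions.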
Let us define $x(n)$ as 8 samples of a periodic waveform
with a period of 8 samples, sampled at R = 8000 Hz.
$x(n)$ must be composed of harmonics of 8000/8 = 1000 Hz.,
so

$$x(n) = \sum_{m=0}^{4} A_m\cos(m\omega n + \phi_m),\qquad 0 \le n \le 7$$

with $\omega = 2\pi/8 = \pi/4$.
Values for both the amplitudes and the phase angles for
each of the five components are given in Table 1. Table 2
shows actual numerical values for the five components of $x(n)$.
The sum of these five components, i.e., the numerical value of
the samples of $x(n)$ itself, is shown at the end of Table 2. The
so-called "analytic (cosine and sine) form" of the components
of $x(n)$ is shown in Table 3. By applying the trigonometric
identity

$$A_m\cos(m\omega n + \phi_m) = A_m\left[\cos(m\omega n)\cos\phi_m - \sin(m\omega n)\sin\phi_m\right] = a_m\cos(m\omega n) - b_m\sin(m\omega n)$$

with $a_m$ and $b_m$ defined as in Table 1, we can see that the form
given in Table 3 yields the same numbers for $x(n)$ as Table 2.

Table 1. Coefficients representing one period of a sampled
waveform, $x(n) = \sum_{m=0}^{4} A_m\cos(m\omega n + \phi_m)$.

 m   F_m (corresponding   A_m           phi_m (phase   a_m               b_m
     frequency in Hz.)    (amplitude)   in radians)    (A_m cos phi_m)   (A_m sin phi_m)
 0      0                 1/10          0              .100              0
 1   1000                 1             0              1.000             0
 2   2000                 1/2           pi/3           .250              .433
 3   3000                 1/3           pi/4           .236              .236
 4   4000                 1/4           pi/5           .202              .147

Table 2. Numerical Values of the Components of x(n) as described in Table 1, and their sum, x(n) itself.
Components: $A_m\cos(m\omega n + \phi_m)$, with $\omega = 2\pi/8 = \pi/4$ and $0 \le n \le 7$.

 m     n=0      1       2       3       4       5       6       7
 0     .100    .100    .100    .100    .100    .100    .100    .100    = .100 cos(0)
 1    1.000    .707    0      -.707  -1.000   -.707    0       .707    = 1.000 cos(n pi/4)
 2     .250   -.433   -.250    .433    .250   -.433   -.250    .433    = .500 cos(n pi/2 + pi/3)
 3     .236   -.333    .236    0      -.236    .333   -.236    0       = .333 cos(3 n pi/4 + pi/4)
 4     .202   -.202    .202   -.202    .202   -.202    .202   -.202    = .250 cos(n pi + pi/5)
 x(n) 1.788   -.161    .288   -.376   -.684   -.909   -.184   1.038

Figure 6 is a graph of $x(n)$. Only those values from n = 0
to n = 7 are considered (one period); these are shown as solid
lines on the graph. $x(n)$ is presumably an 8-sample sequence
extracted from a longer sequence with a period of 8 samples
(other values of this longer sequence are shown on dotted
lines). A glance at Figure 6 confirms that the spectral structure
of $x(n)$ is not very apparent from observation of its waveform,
yet applying the DFT to $x(n)$ will indeed tell us exactly the
components of which $x(n)$ is composed.
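The example waveform can be rebuilt from the Table 1 amplitudes and phases, which gives a quick check on Tables 2 and 3. The sketch below is an illustration under stated assumptions: the names `components`, `component` and `analytic` are hypothetical, and the printed tables were evidently summed from values already rounded to three places, so agreement is expected only to about the third decimal.

```python
import math

# (A_m, phi_m) for m = 0..4, copied from Table 1
components = [(1 / 10, 0.0), (1.0, 0.0), (1 / 2, math.pi / 3),
              (1 / 3, math.pi / 4), (1 / 4, math.pi / 5)]
N = 8
w = 2 * math.pi / N

def component(m, n):
    """One entry of Table 2: A_m cos(m w n + phi_m)."""
    A, phi = components[m]
    return A * math.cos(m * w * n + phi)

def analytic(n):
    """The Table 3 form: sum of a_m cos(m w n) - b_m sin(m w n)."""
    total = 0.0
    for m, (A, phi) in enumerate(components):
        a_m, b_m = A * math.cos(phi), A * math.sin(phi)
        total += a_m * math.cos(m * w * n) - b_m * math.sin(m * w * n)
    return total

x = [sum(component(m, n) for m in range(5)) for n in range(N)]
x_analytic = [analytic(n) for n in range(N)]

table2_sum = [1.788, -.161, .288, -.376, -.684, -.909, -.184, 1.038]
for n in range(N):
    print(n, round(x[n], 3), table2_sum[n])
```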
Table 3. Alternative Form of the Description of the Components of x(n) as Described in Table 1.
$x(n) = \sum_{m=0}^{4}\left[a_m\cos(m\omega n) - b_m\sin(m\omega n)\right]$, with $\omega = 2\pi/8 = \pi/4$ and $0 \le n \le 7$.

Components: a_m cos(m omega n)
 m     n=0      1       2       3       4       5       6       7
 0     .100    .100    .100    .100    .100    .100    .100    .100    = .100 cos(0)
 1    1.000    .707    0      -.707  -1.000   -.707    0       .707    = 1.000 cos(n pi/4)
 2     .250    0      -.250    0       .250    0      -.250    0       = .250 cos(n pi/2)
 3     .236   -.167    0       .167   -.236    .167    0      -.167    = .236 cos(3 n pi/4)
 4     .202   -.202    .202   -.202    .202   -.202    .202   -.202    = .202 cos(n pi)

Components: b_m sin(m omega n)
 m     n=0      1       2       3       4       5       6       7
 0     0       0       0       0       0       0       0       0       = 0 sin(0)
 1     0       0       0       0       0       0       0       0       = 0 sin(n pi/4)
 2     0       .433    0      -.433    0       .433    0      -.433    = .433 sin(n pi/2)
 3     0       .167   -.236    .167    0      -.167    .236   -.167    = .236 sin(3 n pi/4)
 4     0       0       0       0       0       0       0       0       = .147 sin(n pi)

[Figure 6 graphic not reproduced; the vertical axis shows x(n) over the range -2 to 2.]
Figure 6. A graph of x(n) as described in Table 1. x(n) is
periodic with a period of N = 8 samples. Only the period from
n = 0 to n = 7 is considered (solid lines), but presumably the
function repeats itself before and after this period (dotted
lines).

Table 4. Values of $e^{-j\omega nk}$ for $\omega = 2\pi/N$, N = 8.
$e^{-j\omega nk} = \cos(\omega nk) - j\sin(\omega nk)$; component values for N = 8, $\omega = 2\pi/N$.

cos(2 pi n k/N)
 k     n=0      1       2       3       4       5       6       7
 0    1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
 1    1.000    .707    0      -.707  -1.000   -.707    0       .707
 2    1.000    0     -1.000    0      1.000    0     -1.000    0
 3    1.000   -.707    0       .707  -1.000    .707    0      -.707
 4    1.000  -1.000   1.000  -1.000   1.000  -1.000   1.000  -1.000
 5    1.000   -.707    0       .707  -1.000    .707    0      -.707
 6    1.000    0     -1.000    0      1.000    0     -1.000    0
 7    1.000    .707    0      -.707  -1.000   -.707    0       .707

-j sin(2 pi n k/N) (all values below are multiplied by j)
 k     n=0      1       2       3       4       5       6       7
 0     0       0       0       0       0       0       0       0
 1     0      -.707  -1.000   -.707    0       .707   1.000    .707
 2     0     -1.000    0      1.000    0     -1.000    0      1.000
 3     0      -.707   1.000   -.707    0       .707  -1.000    .707
 4     0       0       0       0       0       0       0       0
 5     0       .707  -1.000    .707    0      -.707   1.000   -.707
 6     0      1.000    0     -1.000    0      1.000    0     -1.000
 7     0       .707   1.000    .707    0      -.707  -1.000   -.707

In order to calculate the DFT it is useful to make a table
of values for $e^{-j\omega nk}$ such as the one shown in Table 4. We start
with k = 0:

$$X(0) = \sum_{n=0}^{7} x(n)\,e^{-j0} = \sum_{n=0}^{7} x(n)\cos(0) - j\sum_{n=0}^{7} x(n)\sin(0) = \sum_{n=0}^{7} x(n)$$
$$= 1.788 + (-.161) + .288 + (-.376) + (-.684) + (-.909) + (-.184) + 1.038 = .8$$

The real part of the answer (.8) is supposed to be N times one
of the "a" coefficients in Table 1, and indeed, it is $N \cdot a_0$. The
imaginary part of the answer (0) corresponds to $b_0$. So k = 0
apparently refers to the zero frequency, D.C., 0th harmonic
component of $x(n)$. For k = 1:

$$X(1) = \sum_{n=0}^{7} x(n)\,e^{-j\omega n} = \sum_{n=0}^{7} x(n)\cos(2\pi n/N) - j\sum_{n=0}^{7} x(n)\sin(2\pi n/N)$$
$$= \left[(1.788)(1) + (-.161)(.707) + \ldots\ \text{etc.}\right] + j\left[(1.788)(0) + (-.161)(-.707) + \ldots\ \text{etc.}\right] = 4 + j0$$

which is just N/2 times $(a_1 + jb_1)$, the coefficients for the 1st
harmonic, or fundamental frequency. Table 5 shows a
complete table of $X(k)$ for all values of k from 0 to 7. Several
remarks are in order about $X(k)$. We see that k corresponds to
the harmonic number for k = 0, 1, 2, and 3. But the D.C. and
half sampling rate components are scaled by N, while the rest
are scaled by N/2. For k = 5, 6, and 7, we see that the
coefficients are the same as k = 3, 2, and 1 respectively, except
that the sign of the imaginary part is reversed. Since the
imaginary part corresponds to the sine function component,
and since sine is an odd function, it is clear that these are the
spectral values of the negative frequencies. k = 5 corresponds
to the -3rd harmonic, k = 6 corresponds to the -2nd harmonic and k = 7 to the -1st harmonic. What about k = 4?
As mentioned above, the sampling process cannot represent an
amplitude for a sine component at half the sampling rate,
so the imaginary part is 0, even though we used a non-zero
value for $b_4$ ($b_4$, in fact, has been ignored by the sampling
process). Also, the scale factor is N for k = 4 instead of
N/2. This is because both the D.C. and one-half sampling
rate components perforce have zero imaginary parts, and
hence it is impossible to split them into distinguishable positive and negative frequencies as can be done with the other
components. This says that, when sampling at 8000 Hz., we
cannot distinguish components at +0 Hz. from -0 Hz., nor
can we distinguish components at +4000 Hz. from
components at -4000 Hz. This accounts for the ambiguity in
Equation (1) when $F = \pm R/2$. It also accounts for the different
scale factors for $X(0)$ and $X(N/2)$.

Table 5: The discrete Fourier transform (DFT) of x(n) as
described in Table 1 (N = 8, R = 8000 Hz.).

 k    X(k)             Corresponding             Corresponding
                       Spectral Coefficients     Frequency (Hz.)
 0    .8 + j0          N (a_0 + j b_0)           +/- 0
 1    4 + j0           N/2 (a_1 + j b_1)         + 1000
 2    1 + j1.732       N/2 (a_2 + j b_2)         + 2000
 3    .944 + j.944     N/2 (a_3 + j b_3)         + 3000
 4    1.616 + j0       N (a_4 + j0)              +/- 4000
 5    .944 - j.944     N/2 (a_3 - j b_3)         - 3000
 6    1 - j1.732       N/2 (a_2 - j b_2)         - 2000
 7    4 - j0           N/2 (a_1 - j b_1)         - 1000

In order to completely check our transform definition
we should apply the inverse DFT to $X(k)$ and see if $x(n)$ pops
out again. This procedure will be left to the reader as a
valuable exercise. Let us proceed by examining some of the
properties of the DFT and derive some important transforms
that will be useful later.
First of all it is important to note that we have defined
the DFT in such a way that only the principal values of $X(k)$,
those for $0 \le k \le N-1$, are calculated. A more general definition is

$$X(k) = \mathrm{DFT}[x(n)] = \sum_{n=0}^{N-1} x(n)\,e^{-j\omega nk},\qquad -\infty < k < \infty \tag{19}$$
This definition shows explicitly that the spectrum obtained
from the DFT is a periodic function of frequency. In other
words, the principal values of the N-sample DFT of $\cos(\omega n)$
are just what is shown at the top of Figure 7, but this is really
only a part of the full picture, shown at the bottom of
Figure 7. This spectral periodicity is due to the periodicity of
the complex exponential function itself, and theoretically
extends over all frequencies for all digital signals. It is
easy to see from the graph of the full periodic spectrum of
a sampled signal how no frequencies greater in magnitude
than R/2 can exist. If $F_0$ in Figure 7 were instead the frequency $R - F_0$, which is greater than R/2, the plots would look
exactly the same.
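Both claims in this paragraph, the periodicity of X(k) and the indistinguishability of F0 from R - F0, can be checked numerically. The sketch below assumes R = 8000 Hz., N = 8 and F0 = 1000 Hz. purely for illustration; `dft_at` is a hypothetical helper that evaluates Equation (19) for any integer k.

```python
import cmath
import math

def dft_at(x, k):
    """Evaluate Equation (19) at a single integer k, which may lie outside 0..N-1."""
    N = len(x)
    w = 2 * math.pi / N
    return sum(x[n] * cmath.exp(-1j * w * n * k) for n in range(N))

R, N = 8000, 8          # illustrative sampling rate (Hz.) and transform length
F0 = 1000               # illustrative cosine frequency, below R/2

low = [math.cos(2 * math.pi * F0 * n / R) for n in range(N)]
high = [math.cos(2 * math.pi * (R - F0) * n / R) for n in range(N)]   # above R/2

# The spectrum repeats every N values of k ...
period_error = max(abs(dft_at(low, k + N) - dft_at(low, k)) for k in range(-N, N))
# ... and the two cosines produce identical samples, hence identical spectra.
sample_error = max(abs(u - v) for u, v in zip(low, high))

plus = dft_at(low, 1)    # component at +F0, amplitude N/2
minus = dft_at(low, -1)  # component at -F0, amplitude N/2
```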
[Figure 7 graphic not reproduced: amplitude plotted against frequency (Hz.); the top panel runs from -R/2 to +R/2 and the bottom panel from -3R/2 to +3R/2.]
Figure 7. Spectrum of cosine function at frequency $|F_0| < R/2$.
At the top we see just the 2 principal components, one at
$+F_0$ Hz. and the other at $-F_0$ Hz., both with an amplitude of
N/2. At the bottom we see three periods of the periodic spectrum of the same function centered around 0 Hz. Normally
only the top figure is used, but the spectrum of all digital signals is actually periodic as shown at the bottom.

[Figure 8 graphic not reproduced; panel labels recovered from the page include: cosine (real, even); sine (imaginary, odd); unit sample (impulse); constant; impulse train; rectangular pulse; bell-shaped pulse.]
Figure 8. Some transform pairs. Each member of a pair transforms into the other (see text). The tick marks on the horizontal
axes represent unit values of time or frequency; on the vertical axes, they represent unit values of amplitude. After The Fourier
Transform and Its Applications by Ron Bracewell. Copyright © 1965 by McGraw-Hill, Inc. Used with permission of
McGraw-Hill Book Company.

If we (properly) interpret the k index of $X(k) =
\mathrm{DFT}[x(n)]$ as negative frequencies for k > N/2, and normalize amplitudes that may be scaled by N, we can begin to construct a useful convention for spectral plots. Only the smooth
curve need be given (rather than a sequence of dots on the
heads of sticks) for functions like cosine or sine; it is understood that this curve is sampled at the sampling rate. When
convenient or appropriate, some functions such as the impulse
will still be shown with the dot-stick notation.
Transform pairs for some common functions are shown
in Figure 8. It should be remembered that the DFT is really a
two-way process: each member of a transform pair transforms
into the other member. We see, for example, that the unit
sample function transforms into a constant spectrum. Or we
can read this pair in the opposite direction: a constant-valued
sampled function has energy only at zero Hz. The scale markings in Figure 8 correspond to unit values of time or frequency
on the horizontal axes, and unit values of amplitude on the
vertical axes. Remember that if one of the members of a pair
is interpreted as a time function, its amplitude is scaled by
one, and its corresponding spectral amplitudes will be scaled
by N.
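The unit sample and constant pair described above is easy to confirm with a direct DFT. This is a small sketch under the assumption N = 8; the function name `dft` and the variable names are illustrative.

```python
import cmath
import math

def dft(x):
    """Direct DFT: X(k) = sum_n x(n) e^{-j w n k}, with w = 2 pi / N."""
    N = len(x)
    w = 2 * math.pi / N
    return [sum(x[n] * cmath.exp(-1j * w * n * k) for n in range(N))
            for k in range(N)]

N = 8
unit_sample = [1.0] + [0.0] * (N - 1)   # 1 at n = 0, zero elsewhere
constant = [1.0] * N                    # the same value at every sample

U = dft(unit_sample)   # a constant spectrum: every X(k) equals 1
C = dft(constant)      # energy only at k = 0 (zero Hz.), scaled by N
```

The second result also shows the scaling rule stated in the text: a time function of amplitude one yields a spectral amplitude of N.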
CONVOLUTION
If $X_1(k) = \mathrm{DFT}[x_1(n)]$ and $X_2(k) = \mathrm{DFT}[x_2(n)]$,
then an important property of the DFT known as linearity
assures us that the following is always true:

$$\mathrm{DFT}[c_1 x_1(n) + c_2 x_2(n)] = c_1 X_1(k) + c_2 X_2(k) \tag{20}$$

where $c_1$ and $c_2$ are arbitrary constants. Does the same hold
true if we multiply $x_1(n)$ and $x_2(n)$? Unfortunately not, since
the spectrum of the product function $x_1(n)\,x_2(n)$ is not
$X_1(k)\,X_2(k)$. There is a method for obtaining the spectrum of
the product of two different functions known as convolution,
which we will treat first in a qualitative way. If we convolve a
function $x(n)$ with $u(n)$, the impulse function (third from the
top in Fig. 8), then the result is just the same as $x(n)$. In other
words,

$$x(n) * u(n) = x(n)$$

where the asterisk denotes the convolution operation. Note
that this is certainly different from multiplying $x(n)$ by $u(n)$,
which would result in setting all values of $x(n)$ to zero, except
for $x(0)$, which would remain unchanged. The impulse is said
to be an identity function with respect to convolution, since
convolving any function with $u(n)$ leaves that function
unchanged. If we scale $u(n)$ by a constant $c_1$, the result is

$$x(n) * c_1 u(n) = c_1 x(n)$$

which is just $x(n)$ again, but scaled by the same constant. If,
however, we convolve $x(n)$ with $c_1 u(n - n_0)$, a shifted, scaled
impulse function, the result is

$$x(n) * c_1 u(n - n_0) = c_1 x(n - n_0)$$

a shifted, scaled version of $x(n)$ (see Figure 9).

[Figure 9 graphic not reproduced; its panels show x(n) convolved with a scaled unit sample and with a delayed unit sample.]
Figure 9. Examples of convolution of an arbitrary sequence
x(n) with the unit sample function.

In order to convolve $x(n)$ with a scaled, delayed impulse
function, we just scale and delay $x(n)$ by the same amount.
But we can view any sampled function as a collection, or
sequence, of scaled, delayed impulse functions! For example,
suppose $x(n)$ is defined by

$$x(n) = \begin{cases} 1 & n = -1 \\ 2 & n = +1 \\ 0 & \text{otherwise} \end{cases}$$

and $y(n)$ is defined as

$$y(n) = \begin{cases} .5 & |n| \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{21}$$

In order to convolve $x(n)$ with $y(n)$ (see Figure 10), we can
think of $x(n)$ as being composed of two impulses: one scaled
by +1 and delayed by -1 samples, the other scaled by +2 and
delayed by +1 samples. Thus we place a copy of $y(n)$ at
n = -1 (i.e., we form the function $y(n + 1)$), and another
copy at n = +1, this one scaled by +2. The convolution of
$x(n)$ and $y(n)$ is just the sum of these two shifted, scaled versions of $y(n)$. We could also have treated $y(n)$ as a set of five
impulses located at n = -2, -1, 0, 1, 2, and scaled by .5 each.
Placing scaled copies of $x(n)$ at each of these 5 locations and
adding up the 5 resulting functions (i.e., $.5\,[x(n + 2) + x(n + 1)
+ x(n) + x(n - 1) + x(n - 2)]$) would have yielded exactly the
same result. This indicates that the convolution operation is
commutative, which is to say $x(n) * y(n) = y(n) * x(n)$.
The mathematical definition of convolution is as
follows. Suppose $x(n)$ is a sampled function of duration $N_x$
samples, i.e., $x(n)$ is non-zero only in the range $0 \le n \le N_x - 1$.
Let $y(n)$ be a similar function of duration $N_y$ samples. The
convolution of $x(n)$ with $y(n)$ defines a new function

$$z(n) = x(n) * y(n) = \sum_{m=0}^{n} x(m)\,y(n - m) \tag{22}$$
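Equation (22) can be tried out directly on the x(n) and y(n) of the example. In the sketch below both sequences are shifted so that they start at n = 0, as the definition requires; that shift, the function name `convolve`, and the output length of Nx + Ny - 1 samples (the duration the article states for z(n)) are the assumptions made here.

```python
def convolve(x, y):
    """z(n) = sum over m = 0..n of x(m) y(n - m), for finite one-sided sequences."""
    Nx, Ny = len(x), len(y)
    z = [0.0] * (Nx + Ny - 1)
    for n in range(len(z)):
        for m in range(n + 1):
            # terms outside either sequence's duration are zero and are skipped
            if m < Nx and n - m < Ny:
                z[n] += x[m] * y[n - m]
    return z

x = [1.0, 0.0, 2.0]              # the example x(n), shifted to begin at n = 0
y = [0.5, 0.5, 0.5, 0.5, 0.5]    # the example y(n), shifted to begin at n = 0

z = convolve(x, y)               # one copy of y plus a doubled copy delayed by 2
z_swapped = convolve(y, x)       # commutativity: the same result
identity = convolve(x, [1.0])    # convolving with the unit sample returns x
shifted = convolve(x, [0.0, 0.0, 3.0])   # scaled, delayed impulse: 3 x(n - 2)
```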
[Figure 10 graphic not reproduced; its panels are labelled y(n), x(n), 1 · y(n + 1), and 2 · y(n - 1).]
Figure 10. An example of the convolution of two functions, x(n) with y(n).

As is evident in Figure 10, the convolved sequence is
generally of a longer duration than either function; the duration of $z(n)$ in Equation (22) is in fact
$N_x + N_y - 1$ samples.
When we multiply two waveforms together, their spectra
are convolved. Similarly, if two spectra are multiplied, their
corresponding waveforms are convolved, as we shall see later
when we discuss digital filtering.
Using convolution we can see what happens to the spectrum when we multiply two sampled waveforms together
(amplitude modulation). Suppose we have two sampled cosine
waves, one at frequency $F_1$ and the other at frequency $F_2$ (let
both amplitudes be equal to one for simplicity). The sampling
rate is R Hz., and is much greater than either $F_1$ or $F_2$, so
foldover is not of concern. If we were to add the waves
together, their spectra would also simply add, and the result is
shown at the top of Figure 11. If we multiply the waveforms
together, however, we must convolve the spectra of the two
separate cosine waveforms. The result is shown at the bottom
of Figure 11. We treat the spectrum of frequency $F_2$ as if it
were two impulses with amplitude = 1/2. We then make two
copies of the spectrum of the cosine waveform at frequency
$F_1$, center a copy at each of the two locations indicated by the
$F_2$ impulses, and scale the $F_1$ copies according to the $F_2$
impulse strengths. We have just demonstrated yet another
trigonometric identity,

$$\cos(A)\cos(B) = \tfrac{1}{2}\left[\cos(A - B) + \cos(A + B)\right]$$

since the spectrum of the product of our two cosine waves was
just two more cosine waves, one at the frequency $F_1 + F_2$ and
the other at the frequency $F_1 - F_2$, both amplitudes scaled by
1/2. Notice that it is quite possible to get foldover when multiplying waveforms together, since, in this case for example,
$F_1 + F_2$ Hz. might well be greater than R/2 Hz. even though
neither $F_1$ nor $F_2$ is. The aliased components would have to
obey Equation (1), just like all the well-behaved components.

[Figure 11 graphic not reproduced; the frequency axes run from -R/2 to +R/2, with components marked at ±F1 and ±F2 (top) and at ±(F2 - F1) and ±(F2 + F1) (bottom).]
Figure 11. The spectrum of the sum (top) and product
(bottom) of two cosine waveforms, one at frequency $F_1$ and
the other at $F_2$.

THE Z-TRANSFORM
The z-transform is widely used in digital signal processing theory, and has a very simple definition. However, like the
oriental game of Go, it is much easier to learn what the
z-transform is than to master it. We call $x(n)$ a "one-sided"
sequence if all of its values are zero for n < 0. It is called a
finite, one-sided sequence if all values of $x(n)$ are also zero for
n > N, for some finite value of N. The z-transform of a one-sided sequence $x(n)$ is then defined to be

$$X(z) = \sum_{n=0}^{\infty} x(n)\,z^{-n} \tag{23}$$

and for a finite, one-sided sequence of length N, it is

$$X(z) = \sum_{n=0}^{N-1} x(n)\,z^{-n} \tag{24}$$

where z is a complex variable.
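The finite sum in Equation (24) can be evaluated for any non-zero complex z. The sketch below is illustrative only: the name `z_transform` and the four sample values are assumptions. It also evaluates the sum at N equally spaced points on the circle of radius one, which reproduces the DFT values, a connection the text takes up next.

```python
import cmath
import math

def z_transform(x, z):
    """Finite one-sided z-transform: X(z) = sum over n of x(n) z^(-n), z non-zero."""
    return sum(xn * z ** (-n) for n, xn in enumerate(x))

x = [1.0, 0.5, 0.25, 0.125]        # illustrative finite one-sided sequence

X_at_2 = z_transform(x, 2.0)       # z may be any complex (here real) value

N = len(x)
w = 2 * math.pi / N
# Evaluate on the unit circle at z = e^{j w k}, k = 0..N-1
on_circle = [z_transform(x, cmath.exp(1j * w * k)) for k in range(N)]
# Direct DFT of the same sequence, for comparison
dft = [sum(x[n] * cmath.exp(-1j * w * n * k) for n in range(N))
       for k in range(N)]
```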
The major difference between the DFT and the
z-transform is that while in the DFT we multiply and sum the
samples of a waveform with a particular complex value,
namely $e^{-j\omega}$, in the z-transform, the value of z may be set to
any complex value whatever. Clearly, if we set z to $e^{j\omega}$, then
the z-transform is equivalent to the DFT for finite sequences.
But the z-transform can also be used to gain an additional
kind of insight into the nature of digital signals. For example,
if $x(n)$ is the unit-step function

$$x(n) = \begin{cases} 1 & n \ge 0 \\ 0 & n < 0 \end{cases}$$

then its one-sided z-transform is just

$$X(z) = \sum_{n=0}^{\infty} 1 \cdot z^{-n} = \sum_{n=0}^{\infty} z^{-n} = \frac{1}{1 - z^{-1}} \tag{25}$$

The closed form of Equation (25) is just a statement of
the result of summing an infinitely long geometric series, such
as those discussed in Part I of this tutorial. Similarly, we see
that the z-transform of the complex exponential

$$e^{j\omega n} = \cos(\omega n) + j\sin(\omega n)$$

is just

$$X(z) = \sum_{n=0}^{\infty} e^{j\omega n} z^{-n} = \sum_{n=0}^{\infty} \left(z^{-1} e^{j\omega}\right)^n = \frac{1}{1 - z^{-1} e^{j\omega}} \tag{26}$$

We recall that such geometric sums converge (i.e., sum to
a finite value) only for certain restricted values of z. For example, Equation (25) converges only if $|z^{-1}|$ is less than one since
otherwise we would be adding together an infinitely long sequence of numbers which do not get smaller as we go along. In
order for $|z^{-1}|$ to be less than one, $|z|$ must be greater than
one. Since $z = a + jb$ is complex, we can see that $|z|$
will be greater than one only when $\sqrt{a^2 + b^2} > 1$. If we make
a graph of the equation $|z| = \sqrt{a^2 + b^2} = 1$ on the complex
plane (i.e., graphing the real part along the horizontal axis and
the imaginary part along the vertical axis), we find that the
graph is simply a circle of radius one centered at the origin,
called the unit circle. Since each point on the complex plane
represents a possible value for z, we see that $|z| > 1$ specifies
the set of all points on the complex plane that lie outside of
the unit circle. Similarly, $|z| = 1$ specifies all points on the unit
circle, while $|z| < 1$ is the set of all points inside it.
The z-transform of any function has a general form:

$$X(z) = A\,\frac{\prod_{i=1}^{M}\left(1 - z_i z^{-1}\right)}{\prod_{i=1}^{N}\left(1 - p_i z^{-1}\right)} \tag{27}$$

where
$A$ is an arbitrary constant,
$\prod$ is the "product operator," analogous to the "sum operator" $\sum$ except that multiplication rather than addition is specified; for example, $\prod_{c=1}^{4} c = 1 \cdot 2 \cdot 3 \cdot 4 = 24$,
$z_i$ are M "zeros" of $X(z)$, and
$p_i$ are N "poles" of $X(z)$.
Both the numerator and denominator of Equation (27) are
polynomials in the complex variable z, and they are shown
here in factored form in order to demonstrate that the $z_i$ and
$p_i$ values are really just the roots of the numerator and denominator polynomials, respectively. These roots may be complex,
and if they are, we know that as long as the coefficients of the
numerator and denominator polynomials are real (as they
must be for a realizable system), then the complex roots will
always appear in conjugate pairs. We can then represent the
z-transform of a sequence by writing the transform in the
form given by Equation (27) and plotting the locations of the
poles ($p_i$) and zeros ($z_i$) as "X's" and "O's" on the complex
plane. Such a "pole-zero" plot is just a unique way to
represent the sequence $x(n)$ on the complex plane.
For example, Equation (25) is a z-transform with M = 1
zero equal to 0 and N = 1 pole equal to 1, since

$$\frac{1}{1 - z^{-1}} = \frac{\left(1 - 0 \cdot z^{-1}\right)}{\left(1 - 1 \cdot z^{-1}\right)}$$
Similarly, Equation (26) has one (complex) zero with a value
of zero, and one (complex) pole with a value of $e^{j\omega}$. We can
graph these two equations on the complex plane as shown in
Figure 12. Poles are graphed as X's, and zeros are graphed as
O's; the unit circle is shown for reference.
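To make the factored form of Equation (27) concrete, here is a small sketch that evaluates X(z) directly from lists of zeros and poles. The helper name `evaluate_pole_zero`, the test point, and the pole radius 0.9 are hypothetical choices for illustration, not part of the article.

```python
import cmath

def evaluate_pole_zero(z, zeros, poles, gain=1.0):
    """Evaluate A * prod(1 - z_i / z) / prod(1 - p_i / z), the factored form of Equation (27)."""
    numerator = complex(gain)
    for zero in zeros:
        numerator *= 1 - zero / z
    denominator = 1 + 0j
    for pole in poles:
        denominator *= 1 - pole / z
    return numerator / denominator

w = 0.3                              # frequency of the complex exponential, radians per sample
z = 1.5 * cmath.exp(1j * 1.1)        # arbitrary test point outside the unit circle

# Equation (25): one zero at 0, one pole at 1.
unit_step = evaluate_pole_zero(z, zeros=[0], poles=[1])
# Equation (26): one zero at 0, one pole at e^{jw}.
exponential = evaluate_pole_zero(z, zeros=[0], poles=[cmath.exp(1j * w)])

# A conjugate pole pair multiplies out to a denominator with purely real coefficients.
p = 0.9 * cmath.exp(1j * w)
pair = evaluate_pole_zero(z, zeros=[], poles=[p, p.conjugate()])
real_coefficients = 1 / (1 - 2 * p.real / z + abs(p) ** 2 / z ** 2)
```

The last two lines illustrate why complex roots must come in conjugate pairs when the polynomial coefficients are real: the product $(1 - p z^{-1})(1 - p^* z^{-1})$ equals $1 - 2\,\mathrm{Re}(p)\, z^{-1} + |p|^2 z^{-2}$.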
The poles of the bottom graph are "mirror images" of
each other in relation to the horizontal (real) axis. In other
words, the two poles differ only in the sign of their imaginary
part, and thus the poles represent a "conjugate pair." The
exact location of the $e^{j\omega}$ pole pair depends on the value of
$\omega$, the frequency of the sinusoid. If $\omega = 0$, the pole pair will
be coincident, i.e., both X's are placed on top of each other at
z = 1. As the frequency increases to $\pm R/2$, $\omega$ increases to $\pm\pi$,
and at $\omega = \pm\pi$, the two poles would be on top of each other at
z = -1. Intermediate frequencies are represented at intermediate
positions along the circumference of the unit circle.

[Figure 12. Pole-zero plots of the z-transform of Equation 27 on the
complex (z-) plane (horizontal axis: real; vertical axis: imaginary).
Top: M = 1, N = 1. Bottom: M = 2, N = 2.]

The z-transform, like the DFT, also has an inverse, albeit
a more complicated one than the inverse DFT. The only
method of performing the inverse z-transform which we will
consider here is called partial fraction expansion of X(z).
If, by algebraic manipulation, we change the form of X(z)
from that of Equation (27) to

$$X(z) = \sum_{i=1}^{N} \frac{c_i}{(1 - p_i z^{-1})} \tag{28}$$

where the $p_i$ are the poles (roots of the denominator
polynomial) and the $c_i$ are constants derived in the algebraic
manipulation, then the inverse z-transform is given by:

$$x(n) = \begin{cases} \sum_{i=1}^{N} c_i (p_i)^n, & n \ge 0 \\ 0, & n < 0 \end{cases} \tag{29}$$

Other methods exist for the inverse z-transform, but
unfortunately, none of them is simpler to perform than this
one.

The significance of the z-transform is its use in digital
filter theory. We will not delve very deeply into this topic
due to its vast complexity (no doubt it has imaginary parts in
the minds of many theorists), but we can discuss basic
elements of some aspects of digital filters.

DIGITAL FILTERING

The basic purpose of a digital filter is to modify the
spectrum of digital signals in a desirable way. A digital filter is
usually pictured as a "black box" with a signal input and a
signal output. The operation of the box is described by what
engineers call its transfer function, a mathematical formulation
of its operation in the transform domain. For example, we may
wish to create a filter which will pass all frequency components
lower in frequency than a certain value, $F_c$, and
remove all others. Such a filter would be a low-pass filter
(LPF), and $F_c$ is its cutoff frequency. We may desire to create
high-pass filters (HPF) or band-pass filters (BPF) to remove or
attenuate certain frequencies while passing others. An all-pass
filter may be used to modify only the phase of certain components.
In general we would like to be able to specify an
arbitrary transfer function in terms of a frequency response
curve, to be multiplied by the frequency spectrum of any
signal which is treated by it, and possibly also a phase response
curve, which would modify the phase of an arbitrary signal.
The transfer function for an ideal LPF is shown in Figure 13.
From now on, we will use "normalized frequency" in our
plots, where $\omega = \pm\pi$ corresponds to a frequency of $\pm R/2$.
This allows us to speak of frequencies in a way that is
independent of any particular sampling rate.

[Figure 13. Transfer function for an ideal low pass filter with
(normalized) cutoff frequency $\omega_c$ (vertical axis: amplitude;
horizontal axis: frequency, with the passband running from
$-\omega_c$ to $+\omega_c$).]
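The normalized-frequency convention can be sketched as a pair of conversions between Hz and radians per sample. The function names and the sample values below are illustrative assumptions; only the mapping of R/2 onto $\pi$ comes from the text.

```python
import math

def normalized_frequency(frequency_hz, sampling_rate_hz):
    """Map a frequency in Hz to radians per sample, so that R/2 maps to pi."""
    return 2 * math.pi * frequency_hz / sampling_rate_hz

def frequency_in_hz(omega, sampling_rate_hz):
    """Inverse mapping: radians per sample back to Hz for a given sampling rate R."""
    return omega * sampling_rate_hz / (2 * math.pi)

half_rate = normalized_frequency(5000, 10000)          # R/2 for R = 10 kHz
same_pitch_low = normalized_frequency(440, 10000)      # 440 Hz at R = 10 kHz
same_pitch_high = normalized_frequency(440, 32000)     # 440 Hz at R = 32 kHz
print(half_rate, same_pitch_low, same_pitch_high)
```

The same 440 Hz tone lands at different normalized frequencies for different sampling rates, which is why a plot drawn against $\omega$ applies to any R.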
The filter described in Figure 13 operates by multiplying
the spectrum shown by the spectrum of the input signal, thereby
setting to zero any frequency components in the signal that
lie beyond $\pm\omega_c$.
If we think of Figure 13 as the spectrum of some signal,
call it h(n), then we would be able to implement our ideal
LPF by convolving the input signal with h(n), since multiplying
spectra corresponds to convolution of waveforms, just as
multiplying waveforms corresponds to convolving their
spectra. Thus, if x(n) is the input signal and X(z) is its transform,
y(n) is the output signal and Y(z) is its transform, and
h(n) is the inverse transform of H(z), the transfer function
of our filter, then

$$y(n) = h(n) * x(n) \tag{30}$$

corresponds to

$$Y(z) = H(z) X(z) \tag{31}$$

The function h(n) is called the impulse response (or unit
sample response) of the filter, since Y(z) = H(z) only if
X(z) = 1, which is the transform of the digital impulse
function.
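The correspondence between Equations (30) and (31) can be checked numerically for short sequences. The sketch below uses direct convolution; the sequences `h` and `x`, the helper names, and the test point `z` are all made up for the example.

```python
import cmath

def convolve(h, x):
    """Direct convolution y(n) = sum over k of h(k) x(n - k) for two finite sequences."""
    y = [0.0] * (len(h) + len(x) - 1)
    for k, hk in enumerate(h):
        for m, xm in enumerate(x):
            y[k + m] += hk * xm
    return y

def z_transform(sequence, z):
    """One-sided z-transform of a finite sequence, evaluated at a single point z."""
    return sum(value * z ** (-n) for n, value in enumerate(sequence))

h = [0.5, 0.3, 0.2]                 # hypothetical impulse response
x = [1.0, -2.0, 0.5, 4.0]           # hypothetical input signal
impulse = [1.0, 0.0, 0.0, 0.0]      # digital impulse, whose transform is X(z) = 1

y = convolve(h, x)
z = 1.3 * cmath.exp(1j * 0.4)       # any point will do for finite sequences
product_error = abs(z_transform(y, z) - z_transform(h, z) * z_transform(x, z))
impulse_output = convolve(h, impulse)
```

Feeding the impulse through the convolution returns h(n) itself (followed by zeros), which is the sense in which h(n) is "the response to an impulse."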
We can uniquely describe any filter by its impulse
response h(n), which is just the (inverse) transform of its
transfer function H(z). And once we know H(z), which we
can obtain from any filter by transforming its response to an
impulse function, we know what the filter will do to any
input signal, since H(z) defines the frequency and phase
response of the filter completely.
We calculate the frequency and phase response of a filter
with the following steps:
1. Find the impulse response h(n) of the filter. This can be
done empirically by exciting the filter with u(n) as an
input signal; h(n) is then "what comes out."
2. Calculate the z-transform of the impulse response:
$H(z) = \sum_n h(n) z^{-n}$. This sum may be either finite or
infinite, depending on whether h(n) is of finite duration.
3. Set z equal to $e^{j\omega}$. This makes the transfer function
H(z) yield the spectrum of the impulse response,
$H(e^{j\omega})$.
4. The frequency response of the filter is then defined to
be the magnitude of the spectrum of its impulse
response: $|H(e^{j\omega})|$.
5. The phase response is similarly pha[$H(e^{j\omega})$].
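The five steps above can be sketched in a few lines. The "black box" here is a hypothetical two-point moving average, chosen only because its impulse response is finite, so the sum in step 2 needs no truncation; for a filter with an infinitely long response the sum would have to be cut off somewhere.

```python
import cmath
import math

def black_box(x):
    """Hypothetical filter under test: y(n) = 0.5 x(n) + 0.5 x(n - 1)."""
    return [0.5 * x[n] + (0.5 * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def spectrum(h, omega):
    """Steps 2 and 3: H(z) = sum of h(n) z**(-n), evaluated at z = e^{j omega}."""
    z = cmath.exp(1j * omega)
    return sum(hn * z ** (-n) for n, hn in enumerate(h))

impulse = [1.0] + [0.0] * 7
h = black_box(impulse)                     # step 1: "what comes out"
H = spectrum(h, math.pi / 2)               # steps 2 and 3 at one frequency
frequency_response = abs(H)                # step 4
phase_response = cmath.phase(H)            # step 5, in radians
print(frequency_response, phase_response)
```

Repeating the last three lines over a grid of $\omega$ values between $-\pi$ and $+\pi$ would give the full frequency and phase response curves.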
Many subtle problems are encountered in digital filter
design. For example, the ideal LPF described above is an
example of a filter which is said to be unrealizable. If we
examine the transform of the "rectangular box" function in
Figure 8, we see that it looks something like a cosine wave that
dies down in amplitude on both sides of the origin (this
function is called the "sinc" function). If the rectangular box
were a digital waveform (a sort of single half period of a
square wave), then this says that the spectrum of this signal
gradually dies away as the magnitude of the frequency
increases, both in positive and negative directions. But if the
rectangular box is the spectrum, as for the ideal LPF, this says
that the impulse response of the filter begins before n = 0,
the time at which the impulse occurs. In other words, the ideal
LPF begins to respond to its input before it has occurred. No
wonder this filter is called unrealizable! It would need some
sort of "crystal ball" with which it could look into the future,
see an impulse coming from the distant future, and begin its
response before its input arrives. The moral is: if anyone offers
to sell you an ideal LPF, don't buy it (unless you check it very
carefully)!
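The "crystal ball" problem can be seen numerically. The sketch below assumes the standard closed form for the inverse transform of a rectangular spectrum of unit height between $-\omega_c$ and $+\omega_c$, namely $h(n) = \sin(\omega_c n) / (\pi n)$ with $h(0) = \omega_c / \pi$; that formula is not derived in the text, and the cutoff value is an arbitrary choice.

```python
import math

def ideal_lowpass_impulse_response(n, cutoff):
    """h(n) = sin(cutoff * n) / (pi * n), the inverse transform of a rectangular spectrum."""
    if n == 0:
        return cutoff / math.pi
    return math.sin(cutoff * n) / (math.pi * n)

cutoff = math.pi / 4
# Samples of the response at n = -5 .. -1, i.e. BEFORE the impulse arrives at n = 0.
before_impulse = [ideal_lowpass_impulse_response(n, cutoff) for n in range(-5, 0)]
after_impulse = [ideal_lowpass_impulse_response(n, cutoff) for n in range(1, 6)]
print(before_impulse)
```

The values before n = 0 are not zero, and the response is symmetric about n = 0, so no amount of waiting at the input lets a causal device reproduce it exactly.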
We will end our discussion of digital filters with the
general formula for all digital filters and two simple examples:
a first-order realizable (but hardly ideal) LPF, and a
second-order BPF.

Any digital filter may be described by an equation of the
form:

$$y(n) = \sum_{i=0}^{M} b_i x(n - i) - \sum_{i=1}^{N} a_i y(n - i) \tag{32}$$
where
x(n) is an input signal or sequence,
y(n) is the output signal,
$b_i$ are a set of M + 1 coefficients describing how y(n)
depends on the current input sample and the
previous M input samples, and
$a_i$ are a set of N coefficients describing how y(n)
depends on the previous N output samples.
The $b_i$ coefficients of Equation (32) determine the zeros
of the transfer function, and the $a_i$ coefficients determine the
poles. Some filters have only zeros and no poles (i.e., N = 0),
and hence their output depends only on the current input
sample and the past M samples. Such filters are called
transversal, or finite impulse response (FIR) filters, since their
response to an impulse cannot outlast the M samples that follow
the impulse, as defined in Equation (32). If a filter contains
poles, it is called a recursive, or infinite impulse response (IIR)
filter.
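Equation (32) translates almost line for line into a program. The following is a minimal sketch assuming zero initial conditions (all samples before n = 0 are taken to be zero); the function name and the coefficient values are illustrative only.

```python
def difference_equation(b, a, x):
    """y(n) = sum_{i=0..M} b_i x(n-i) - sum_{i=1..N} a_i y(n-i), zero initial conditions.

    b holds b_0 .. b_M; a holds a_1 .. a_N (so a[0] is a_1).
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, bi in enumerate(b):             # feedforward part: sets the zeros
            if n - i >= 0:
                acc += bi * x[n - i]
        for i, ai in enumerate(a, start=1):    # feedback part: sets the poles
            if n - i >= 0:
                acc -= ai * y[n - i]
        y.append(acc)
    return y

impulse = [1.0] + [0.0] * 9
fir_response = difference_equation([0.25, 0.5, 0.25], [], impulse)   # N = 0: FIR
iir_response = difference_equation([1.0], [-0.5], impulse)           # one pole: IIR
print(fir_response)
print(iir_response)
```

With M = 2 the FIR response is over after three samples, while the single feedback coefficient keeps the IIR response going (halving at every step) for as long as we care to compute it.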
Many filters contain both poles and zeroes in their
transfer function, the simplest of which is a first-order LPF,
defined by the difference equation:

$$y(n) = x(n) + K y(n - 1) \tag{33}$$

where K is an arbitrary constant less than one in magnitude. In
terms of Equation (32), our first-order filter has $b_0 = 1$ and
$a_1 = -K$ (the sign follows from the minus sign in front of the
feedback sum in Equation (32)). We can find its transfer function
by transforming the impulse response, h(n); the frequency
response will then be $|H(e^{j\omega})|$ and the phase response will be
given by pha[$H(e^{j\omega})$]. If we start with the initial condition
that y(-1) = 0, then we can list the values of h(n) as follows:

x(n) = u(n), y(-1) = 0
y(n) = u(n) + K y(n - 1)
n      x(n) = u(n)      y(n)
0      1                1 + K y(-1) = 1 = $K^0$
1      0                0 + K y(0) = K = $K^1$
2      0                0 + K y(1) = $K \cdot K = K^2$
3      0                0 + K y(2) = $K \cdot K^2 = K^3$
etc.
Since h(n) is infinitely long, our filter is of type IIR
with impulse response

$$h(n) = \begin{cases} K^n, & n \ge 0 \\ 0, & n < 0 \end{cases}$$
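The table can be reproduced by simply running the recursion of Equation (33) on an impulse. This is a small sketch; the function name, the value K = 0.95, and the 20-sample length are choices made for the example.

```python
def first_order_lpf(x, K):
    """y(n) = x(n) + K y(n - 1), starting from y(-1) = 0 (Equation (33))."""
    y = []
    previous = 0.0                      # y(-1) = 0
    for sample in x:
        previous = sample + K * previous
        y.append(previous)
    return y

K = 0.95
impulse = [1.0] + [0.0] * 19            # u(n): 1 at n = 0, 0 elsewhere
h = first_order_lpf(impulse, K)
max_error = max(abs(hn - K ** n) for n, hn in enumerate(h))
print(h[:4], max_error)
```

The computed samples agree with $K^n$ to within rounding error, and they never become exactly zero, which is what "infinite impulse response" means in practice.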
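The one-sided z-transform of h(n) is then

$$H(z) = \sum_{n=0}^{\infty} h(n) z^{-n} = \sum_{n=0}^{\infty} K^n z^{-n} = \sum_{n=0}^{\infty} (K z^{-1})^n = \frac{1}{1 - K z^{-1}}$$

Setting z equal to $e^{j\omega}$ now yields the spectrum for our filter

$$H(e^{j\omega}) = \frac{1}{1 - K e^{-j\omega}}$$

We now have only to find the magnitude and phase of the
function $H(e^{j\omega})$ in order to make the frequency and phase
responses explicit. A useful property of complex numbers is
that the product of a complex number with its conjugate is
equal to the squared magnitude of the complex number; i.e., if
z = a + jb, then

$$z z^* = (a + jb)(a - jb) = a^2 + b^2 = |z|^2$$

(where $z^*$ = the conjugate of z = a - jb). It is also easy to
verify the following relations:

$$\mathrm{Re}\, z = \text{the real part of } z = \tfrac{1}{2}(z + z^*) = a$$

$$\mathrm{Im}\, z = \text{the imaginary part of } z = \tfrac{1}{2j}(z - z^*) = b$$

$$\mathrm{pha}\, z = \tan^{-1} \frac{\mathrm{Im}\, z}{\mathrm{Re}\, z} = \tan^{-1} \left[ \frac{z - z^*}{j(z + z^*)} \right]$$

Thus, in order to find the frequency response we calculate the
magnitude of $H(e^{j\omega})$:

$$|H(e^{j\omega})| = \left[ H(e^{j\omega}) H(e^{-j\omega}) \right]^{1/2} = \left[ \frac{1}{1 - K e^{-j\omega}} \cdot \frac{1}{1 - K e^{j\omega}} \right]^{1/2} = \left[ \frac{1}{1 - K(e^{j\omega} + e^{-j\omega}) + K^2} \right]^{1/2}$$

Noting from Euler's relation that $e^{j\omega} + e^{-j\omega} = 2 \cos \omega$, we
obtain

$$|H(e^{j\omega})| = \left[ \frac{1}{1 - 2K \cos \omega + K^2} \right]^{1/2}$$

The phase response is equal to

$$\mathrm{pha}\, H(e^{j\omega}) = \tan^{-1} \left[ \frac{\dfrac{1}{1 - K e^{-j\omega}} - \dfrac{1}{1 - K e^{j\omega}}}{j \left( \dfrac{1}{1 - K e^{-j\omega}} + \dfrac{1}{1 - K e^{j\omega}} \right)} \right] = \tan^{-1} \left[ \frac{1 - K e^{j\omega} - (1 - K e^{-j\omega})}{j (1 - K e^{j\omega} + 1 - K e^{-j\omega})} \right] = \tan^{-1} \left[ \frac{K (e^{-j\omega} - e^{j\omega})}{j [2 - K (e^{j\omega} + e^{-j\omega})]} \right]$$

Again we make use of Euler's relation in two forms:
$e^{j\omega} + e^{-j\omega} = 2 \cos \omega$ and $e^{j\omega} - e^{-j\omega} = 2j \sin \omega$. Also, note
that the inverse tangent is an odd function, i.e.,
$\tan^{-1}(-\theta) = -\tan^{-1}(\theta)$.

$$\mathrm{pha}\, H(e^{j\omega}) = -\tan^{-1} \left[ \frac{K (e^{j\omega} - e^{-j\omega})}{j (2 - 2K \cos \omega)} \right] = -\tan^{-1} \left[ \frac{K \sin \omega}{1 - K \cos \omega} \right]$$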
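The two closed forms just derived can be checked against a direct evaluation of $H(e^{j\omega}) = 1 / (1 - K e^{-j\omega})$. This is a sketch with an arbitrary K and an arbitrary grid of 33 frequencies; the function names are invented for the example.

```python
import cmath
import math

def H(omega, K):
    """Spectrum of the first-order filter: 1 / (1 - K e^{-j omega})."""
    return 1 / (1 - K * cmath.exp(-1j * omega))

def magnitude_closed_form(omega, K):
    """[1 / (1 - 2K cos(omega) + K^2)]^(1/2)."""
    return 1 / math.sqrt(1 - 2 * K * math.cos(omega) + K * K)

def phase_closed_form(omega, K):
    """-arctan[K sin(omega) / (1 - K cos(omega))]."""
    return -math.atan(K * math.sin(omega) / (1 - K * math.cos(omega)))

K = 0.95
omegas = [-math.pi + k * math.pi / 16 for k in range(33)]    # -pi .. +pi
max_magnitude_error = max(abs(abs(H(w, K)) - magnitude_closed_form(w, K)) for w in omegas)
max_phase_error = max(abs(cmath.phase(H(w, K)) - phase_closed_form(w, K)) for w in omegas)
print(max_magnitude_error, max_phase_error)
```

The plain arctangent is adequate here because, for K less than one in magnitude, the real part of $H(e^{j\omega})$ is always positive, so the phase stays between $-\pi/2$ and $+\pi/2$.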
We can now make graphs of $|H(e^{j\omega})|$ and pha $H(e^{j\omega})$
from $-\pi \le \omega \le +\pi$ to get a complete picture of the transfer
function as a frequency response and a phase response. Since
the frequency response of most filters varies over such a great
numerical range, we typically plot it on a logarithmic vertical
scale, such as is done in Figure 14. If we plot $20 \log_{10} |H(e^{j\omega})|$,
we can even read the vertical scale directly in dB. A value of
0 dB corresponds to $|H(e^{j\omega})| = 1$, +6 dB corresponds
(approximately) to $|H(e^{j\omega})| = 2$, -6 dB corresponds to
$|H(e^{j\omega})| = .5$, and so on.
The frequency response curves in Figure 14 indicate
clearly that our first-order system is a low pass filter, but
hardly an ideal one.
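The dB scale is easy to compute directly. The sketch below converts magnitudes to dB and evaluates the first-order response at the two ends of the frequency axis for the two values of K used in the text; the function names are assumptions for the example.

```python
import math

def to_db(magnitude):
    """Convert a linear magnitude to decibels: 20 log10 |H|."""
    return 20 * math.log10(magnitude)

def first_order_gain_db(omega, K):
    """Frequency response of y(n) = x(n) + K y(n - 1), in dB."""
    return to_db(1 / math.sqrt(1 - 2 * K * math.cos(omega) + K * K))

for K in (0.95, 0.5):
    at_zero = first_order_gain_db(0.0, K)        # lowest frequency
    at_pi = first_order_gain_db(math.pi, K)      # highest frequency, R/2
    print(K, round(at_zero, 2), round(at_pi, 2))

doubling_db = to_db(2.0)
halving_db = to_db(0.5)
```

A doubling of magnitude comes out as about 6.02 dB rather than exactly 6, which is why the correspondence quoted in the text is a convenient approximation. For either K the gain at $\omega = 0$ exceeds the gain at $\omega = \pi$, the low-pass behavior visible in the figure.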
[Figure 14. Frequency and phase response of a first-order
system (see text) for two values of K (.95 and .5). Top panel:
frequency response $20 \log_{10} |H(e^{j\omega})|$ in dB; bottom panel: phase
response $-\tan^{-1} [K \sin \omega / (1 - K \cos \omega)]$; horizontal axes run
from $-\pi$ to $+\pi$.]

A simple second-order filter might have the equation

$$y(n) = x(n) + K_1 y(n - 1) + K_2 y(n - 2) \tag{34}$$

where $K_1$ and $K_2$ are constants. In attempting to derive the
impulse response of this filter, h(n), we find that the
mathematical tools which we have developed so far are unfortunately
not adequate to the task. In order to "solve" Equation (34)
for h(n) we would need to develop the theory of difference
equations, which are the digital correspondents of differential
equations in the analog world. If we are given h(n),
however, we could then carry out the necessary calculations to
obtain the frequency and phase response. The impulse
response corresponding to Equation (34) will be given here
without derivation, since it is possible to make use of this
filter without being able to solve difference equations. With
the following gift from difference equation theory, then, we
can proceed.

The impulse response of Equation (34) is either the sum
of two exponentially decaying functions, or a sinusoid whose
amplitude decays exponentially, depending on the relationship
between the $K_1$ and $K_2$ coefficients. If $K_2 > -K_1^2/4$, the
impulse response is of the form

$$h(n) = a_1 (p_1)^n + a_2 (p_2)^n \tag{35a}$$

(sum of exponentials). If $K_2 < -K_1^2/4$, then

$$h(n) = a_1 r^n \sin(bn + \theta) \tag{35b}$$

where

$$r = \sqrt{-K_2}, \qquad b = \cos^{-1} \left( \frac{K_1}{2 \sqrt{-K_2}} \right), \qquad \theta = b, \qquad \text{and} \qquad a_1 = \frac{1}{\sin b}.$$

The frequency and phase response of this filter can be
calculated from the z-transform of Equation (35b), evaluated
at $z = e^{j\omega}$:

$$H(e^{j\omega}) = \frac{1}{1 - 2r (\cos b) e^{-j\omega} + r^2 e^{-2j\omega}} \tag{36}$$
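Although the derivation is omitted, Equation (35b) can at least be checked against the recursion of Equation (34), and against the partial fraction route of Equations (28) and (29). The sketch below assumes the complex-pole case with illustrative values r = 0.95 and b = $\pi$/4; the partial-fraction constants $c_1 = p_1 / (p_1 - p_2)$ and $c_2 = -p_2 / (p_1 - p_2)$ are worked out here for this two-pole denominator and are not quoted from the text.

```python
import cmath
import math

def second_order_impulse_response(K1, K2, length):
    """Run y(n) = x(n) + K1 y(n-1) + K2 y(n-2) on a unit impulse, zero initial conditions."""
    h = []
    for n in range(length):
        y1 = h[n - 1] if n >= 1 else 0.0
        y2 = h[n - 2] if n >= 2 else 0.0
        h.append((1.0 if n == 0 else 0.0) + K1 * y1 + K2 * y2)
    return h

r, b = 0.95, math.pi / 4                      # illustrative pole radius and angle
K1, K2 = 2 * r * math.cos(b), -r * r          # so that K2 < -K1**2 / 4
h = second_order_impulse_response(K1, K2, 40)

# Equation (35b) with theta = b and a_1 = 1 / sin b.
sinusoid = [r ** n * math.sin(b * n + b) / math.sin(b) for n in range(40)]

# Equations (28)-(29): conjugate poles p1, p2 and their partial-fraction constants.
p1, p2 = r * cmath.exp(1j * b), r * cmath.exp(-1j * b)
c1, c2 = p1 / (p1 - p2), -p2 / (p1 - p2)
expansion = [(c1 * p1 ** n + c2 * p2 ** n).real for n in range(40)]

max_sinusoid_error = max(abs(u - v) for u, v in zip(h, sinusoid))
max_expansion_error = max(abs(u - v) for u, v in zip(h, expansion))
print(max_sinusoid_error, max_expansion_error)
```

All three computations give the same sequence to within rounding error, which shows how the decaying sinusoid of Equation (35b) is just the sum of two complex exponentials, one for each pole of the conjugate pair.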
The interested reader will find it an exhilarating exercise to plot
curves for $|H(e^{j\omega})|$ and pha $H(e^{j\omega})$ based on Equation (36).

WHERE TO GO FROM HERE

We have only scratched the surface of mathematics and
digital signal processing, but we have scratched it pretty well.
Many of the concepts we have discussed are subtle and not
easily understood, and it should come as no surprise if the
uninitiated reader is a bit confused at this point. Confusion is
the natural companion of learning. When Jean Baptiste Joseph
Fourier first established the fact that an arbitrary mathematical
function could be represented as a sum of sinusoids
(in 1807), leading mathematicians of the day refused to
believe it, so new and startling was the idea. No doubt the idea
confused them, too. People have been confused by this idea
ever since, of course, but we now know it to be one of the
most useful basic facts about mathematics, science, engineering,
and sound, including music. The truth and beauty of the
idea make all the initial confusion worthwhile.

If the confusion persists, however, after a few careful
readings and problem solutions, the time has come to attack it
before it turns into frustration. Some recommended references
for further study are given after the problems, organized by
topic. An hour or two with a good teacher can be worth
months of self study in certain cases, but skill always comes
from practice, and no number of hours watching a teacher
solve problems will make a student skilled at this.

Finally, a word about scope. Computer music is a highly
interdisciplinary field, containing portions of computer
science, music, science, mathematics, and engineering. Just
which parts of all of these fields are part of computer music
and which are not is unclear at this time, though it is clear that
no individual is likely to have the utopian qualification of
expertise in all of these fields at once. Consequently, it is
likely that computer music will continue to be fruitfully
practiced by small groups of cooperating individuals rather than by
single persons working alone. Also, it means that individuals
must be willing to learn as much as possible, and to rely on
good sources for that which is unknown. The trick is to be
able to discern a good source from a not-so-good source. It
would help a great deal if computer musicians, with their
various specialities, were able to speak a common language
whereby their individual specialities could be described. One
such language exists in the form of computer programming
principles. It is possible for a composer to explain the rules
of harmony in computer programming terms to a mathematician,
and for the mathematician to explain integrations in
computer programming terms to a composer, if both can
program. Thus the core of computer music must be in programming
and music composition or performance. Beyond
a good working knowledge of these two fields, mathematics
seems to be a unifying element for the remaining areas of
acoustics, psychology and signal processing. An overview of
each of these areas would be useful for work in computer
music. Tutorials in the fields of acoustics and psychological
acoustics similar to this one in mathematics and signal
processing would provide an even more solid ground for discussion.
We have covered about as much signal processing as the
nonspecialist practitioner of computer music is likely to need, at
least in the way of basics. We are now in a position to build
on this foundation.
Some More Problems

1. (Sampling). Non-bandlimited waveforms are characterized
by large jumps (discontinuities) in their waveforms; such a
waveform is depicted in analog form in Figure P1. The
Fourier representation of the sawtooth as it is shown is

$$F(t) = 2 \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} \sin kt$$

a) Derive an expression for the equivalent waveform
sampled at R samples per second with a frequency of
F Hz.
b) Suppose R = 10 kHz and F = 440 Hz. What is the lowest
frequency component of this waveform which will be
folded (aliased) to an incorrect frequency? What is the
amplitude of this component?
c) Suppose we cannot hear components which are 40 dB
weaker in amplitude than the fundamental of the sawtooth.
What is the highest frequency sawtooth which
will have aliased components 40 dB less strong than the
fundamental (again, assume R = 10000 Hz)?

2. (Sampling). Suppose we program a computer to produce a
sine wave of constantly increasing frequency according to
the relation

Frequency = 1000 (time in seconds).

The sampling rate of the digital-to-analog converter is set
to 32 kHz. Make a graph showing the frequency we will
actually hear coming from the DAC as a function of time
for t = 0 to t = 60 seconds.
[Figure P1: A non-bandlimited sawtooth waveform (horizontal axis: t,
marked from $-3\pi$ to $3\pi$).]

3. (DFT). Make a spectral plot for the DFT worked out in
Tables 1 through 5. Label the horizontal axis with the
frequencies -4000, -3000, -2000, -1000, 0, 1000, 2000,
3000, and 4000 Hz and plot the magnitude of X(k) at
each of these frequencies on a dB scale.

4. (Convolution). A square wave is characterized by a
spectrum containing only odd-numbered multiples of the
fundamental frequency, with amplitudes equal to
one-over-the-component-number:

$$x(n) = \sum_{k=1}^{M} \frac{1}{k} \sin(k \omega n), \quad k \text{ odd}.$$

Foldover can be avoided by choosing a suitably small value
for M. Suppose that we produce this waveform at
F Hz << R/2 Hz, and that we multiply it by

$$y(n) = (1 + \cos 2\omega n)$$

i.e., a cosine waveform at twice the fundamental frequency
of the square wave.
a) What is the spectrum of the resulting waveform?
b) Is it still a square wave?

5. (DFT). Suppose x(n) represents 16 samples of a digitized
sine waveform with a period of 16 samples, i.e.

$$x(n) = \sin(2\pi n / N), \quad 0 \le n \le N - 1$$

with N = 16.
a) What is the DFT of x(n)?
b) Suppose we took the DFT of only the first 8 samples of
x(n) as defined above. Is the DFT the same? Why or
why not?

6. (z-transform). Make a pole-zero plot of the z-transform of
the first-order filter discussed in the text. What happens to
the location of the pole as K goes from 0 to 1?

7. (Digital Filters). Make a plot of the frequency response of
the second-order filter (Equation (36)) for r = .95, $b = \pi/4$.
How does it differ from the frequency response of a
first-order filter?

ACKNOWLEDGEMENTS

I wish to thank John Snell, John Strawn, John Gordon,
and Marc LeBrun for their many helpful suggestions; without
their dedication such articles as this, and indeed the Computer
Music Journal itself, could not exist.

The editors of Computer Music Journal would also like
to thank John Gordon for providing drawings for Fig. 3-6 and
9-10.

REFERENCES

Acoustics and Psychoacoustics
Benade, Arthur H. Fundamentals of Musical Acoustics. New
York: Oxford University Press, 1976.
Helmholtz, H. von. On the Sensations of Tone as a Physiological
Basis for the Theory of Music. Trans. by A. Ellis.
New York: Dover, 1954.
* Roederer, Juan G. Introduction to the Physics and
Psychophysics of Music. New York: Springer, 1975.

Computer Music and Electronic Music
Appleton, Jon H., and Ronald C. Perera, ed. The Development
and Practice of Electronic Music. Englewood
Cliffs, N.J.: Prentice-Hall, 1975.
* Mathews, Max V. The Technology of Computer Music.
Cambridge: MIT Press, 1969.

Computer Programming
* Knuth, Donald E. The Art of Computer Programming.
Vol. 1: Fundamental Algorithms. Reading, Mass.:
Addison-Wesley, 1969.

Digital Signal Processing
Oppenheim, Alan V., and Ronald W. Schafer. Digital Signal
Processing. Englewood Cliffs, N.J.: Prentice-Hall,
1975.
* Rabiner, Lawrence R., and Bernard Gold. Theory and
Application of Digital Signal Processing. Englewood Cliffs,
N.J.: Prentice-Hall, 1975.
Robinson, Enders A., and Manuel T. Silvia. Digital Signal
Processing and Time Series Analysis. San Francisco:
Holden-Day, 1978.

Mathematics
Bracewell, Ron. The Fourier Transform and its Applications.
New York: McGraw-Hill, 1965.
Polya, George, and Gordon Latta. Complex Variables. New
York: Wiley, 1974.
Spiegel, Murray R. Mathematical Handbook of Formulas and
Tables. Schaum's Outline Series in Mathematics.
New York: McGraw-Hill, 1965.
* Stewart, Ian, and David Tall. The Foundations of
Mathematics. New York: Oxford University Press, 1977.
(* = first choice)