An Introduction to the Mathematics of Digital Signal Processing: Part II: Sampling, Transforms, and Digital Filtering Author(s): F. R. Moore Source: Computer Music Journal, Vol. 2, No. 2 (Sep., 1978), pp. 38-60 Published by: MIT Press Stable URL: http://www.jstor.org/stable/3680221 Accessed: 03-02-2016 12:02 UTC Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/ info/about/policies/terms.jsp JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. MIT Press is collaborating with JSTOR to digitize, preserve and extend access to Computer Music Journal. http://www.jstor.org This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions An Introduction to of Digital Signal the Mathematics Processing PartII:Sampling,Transforms,and Digital Filtering F. R. Moore Bell Laboratories MurrayHill, New Jersey 07974 ? 1978 by F. R. Moore INTRODUCTION In Part I of this tutorial (ComputerMusic Journal, Vol. 2, No. 1), we discussedsome of the basic mathematical ideas relevantto the processingof digitalsignals.Now we turn to the applicationof these and other concepts, operatingon the assumption that the reader understandseverything in PartI thoroughly(althoughsome of this second part can be understood without following the mathematicalarguments). Again,it will be impossibleto give a detailed account of all the techniques of digital signal processing,because there is simply too much to cover in an article.Thus, the ideas chosen for inclusionhere are only the most fundamental,which is to say (hopefully) the most important.Armed with the knowledge presentedhere, the readershould be able to understand much of the literaturein the field, even though we will continue to omit calculus from our mathematicalconcerns. As in many subjects,notations often are used only as a kind of shorthandfor concepts which can be adequately explained without resortingto "higher"mathematics.As we saw in Part I, however,the better our mathematicalfacility, the easier it is to solve certainproblemswhich are otherwisedifficult, or at the very least, tedious. Thus we will continue to use extensively the most powerful mathematicsat our disposal, that of complex exponentials,in our treatmentof sampling, transforms,and an introductionto the concepts of digital filtering. Before we delve into these concepts, however, some words are in orderabout the generalnatureof the subjectwe are studying.Digital signalprocessing,computerprogramming, and acoustics all relate to computermusic in a similarway: Unlike typical subfields such as harmony or counterpoint, Page 38 which exist as subdivisionsof a globalrealmof study, digital signalprocessing,computerprogrammingand acousticsare all complete fields in themselves,each one repletewith its own motivations,jargon,and subfields.Perhapsthe most fascinating aspect of computermusic is its attempt to synthesizefrom such a vast arrayof knowledgethe keys to a rich and expressive sonic art. Since classicaltimes, when mathematicsand physics were considered to be subfields of music (recall Pythagoras'investigationsof the pitch of vibratingstringsof different lengths) music and science have travelledincreasingly divergentpaths in pursuitof ever-elusivetruths and beauty, which indicates,at the very least, that one greatdifficulty for computer musicianswill be to bridge the terminologygaps among severalfields at once. This means that we must be patient and willing to let each field describeitself in its own terms first before we can progressto an understandingof how to rephrasethese statementsin musicalterms. Thereforewe cannot initially ask questionssuch as "How can we make an oboe sound like a clarinet?"directly of digitalsignalprocessing, since the informationis not couched in these terms for historicalreasons. We can, however,keep such questionsin mind as we study, on the assumptionthat an accurateunderstandingof the terminologyof the field will allow such questions to be rephrasedinto tractableproblems. Often, the answerwill be that the question asks for unknown or illunderstood techniquesto be applied, but just as often the searchwill lead us to other revelations: answersto questions which arebeggingto be asked! It will often be a useful exercise, therefore,to try to imaginewhat could happento some sound if a particularprocess were applied to it. Certainly much is alreadyknown about the musical effects of digital signalprocessing,but at this point much more is to be learned. Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 Also, we must keep in mind that digitalsignalprocessing is not programming,not acoustics, and hence even a perfect understandingof it would not necessarilyshow us how to create satisfying music with a computer. It is, however, a powerfulway to think about the manipulationand control of sounds, and as such it will most likely representan important prerequisiteto our understandingof how to createmusic in new and expressiveways. Wheel Spoke of Spoke S-Angle SAMPLINGAND QUANTIZATION Whatis a sampledsignal?Whenwe watch a movie, we are looking at a streamof separate,discretephotographsflowing by at a rate of 24 frames(photographs)per second, but we areseeing somethingquite different. The apparentcontinuous motion on the screenis reallythe result of samplingthe position of the variouspeople and objects on the screen at a sufficient rate to ensure that no important detail of the motion is lost. Clearly,if the motion were sampled more slowly, say, at 5 times per seond, the motion would appear jerky and discontinuous,as it does undernow-familiarstrobe lightingat discotheques.We can imaginean experimentthat must have been executed severaltimes in the history of the motion pictureindustry:We startout filmingvariousmoving scenes at a slow, "flickery" rate, and then gradually(or in steps) increasethe framerate until the motion appearssmooth and continuous. Of course,we will eventuallyrun into practical difficulties, such as the sensitivityof the film to light, since an increasingframerate implies a decreasingexposure time for each frame. But hopefully, before such limits are reached,a smooth renderingof motion will be achieved,and indeed, the movie industry has settled on 24 frames per second as a standardwhichworkswell enough. But does it work perfectly? The answer,as usual,is: of course not, as anyone who has ever watched a wagon wheel in a westernmovie can verify. As the wagon startsout from a standingposition, the wheels appear to turn slowly forward;then as the speed increases,the wheels first appearto go faster,then begin to slow down and go backwards,then to stop, then to go forwardsagain,etc. Of course they neverappearto stop completely when the wagon is movingsince their speed blurstheir picture,but neitheris their motion renderedaccuratelyby the film process.If we were to increasethe framerate of filming, what would be the effect on the image of the turningwheels? Clearly,the wagon could go faster before its wheels started to appearslowing down or going backwards,but the point is that for any filming speed (samplingrate), there is some upperlimit on the rapiditywith which motion can take place and still be renderedaccurately on the screen. (Obviously, since most motions are slow enough, the movie industrydeems it unnecessaryto increase the frame rate for the sake of renderingchase scenes more believable.) In order to understandan important aspect of the samplingprocess,let us imaginethat we are filming a documentary on the motion of a one-spoke wagon wheel (see Figure 1). Let us arbitrarilysay that when the spoke points to the rightthat it is in position zero, and that any other position is defined as the angle, measured counterclockwise, from position zero. Hence at an angle of nr/2radiansthe spoke points straightup, at a radiansit points to the left, etc. If the wheel completes exactly one full counterclockwiserevolution per second we can describe its rotational velocity by saying Directionof Motion Figure1. A turning,one-spoke wagonwheel. that it is turningat a rate of +27rradiansper second (clockwise motion would indicate a negative velocity); two counterclockwise revolutionsper second would correspondto 47r radiansper second, and in general,F revolutionsper second will correspondto 2nrFradiansper second. The motion of the wheel is clearly periodic with a period of T = 1/F seconds. F is the frequency (repetition rate) with which the wheel turns, so it can be measuredin cycles (or repetitions)per second (Hertz, abbreviatedHz.), and the quantity 27rFis called the radianfrequency at which the wheel rotates (measuredin radiansper second). Let us begin filmingthe rotatingwheel at the standard rate of 24 frames per second. Assuming that the wheel smoothly turns from the startingposition zero at a rate of F = 1 Hz., successiveframes of the movie will be taken at 0 = 0, 0 = (1/24) * 27rF= rr/12,0 = rn/6,etc. Since the wheel turns only 1/24th of a revolutionat each frame,we can expect our documentaryto representthe facts as they are, a good quality for any documentary.(Our camerais of courseideal, so it neverblursthe picture of the wheel no matterhow fast the wheel moves. At F = 6 Hz. we would get a seriesof frames such as those shown in Figure2; the wheel turns 6/24 = 1/4 of a turn at each frame. At F = 12 Hz., the wheel turns halfway aroundat each frame, and the filmed imageof it appearsto oscillate back and forth with maximumrapidity(underthese conditions). Why maximum? Whatlargermotion could the spoke make on successiveframes? If we set F even higher,say, to 18, then the wheel makes 3/4 of a complete revolutionat each frame,but this is also indistinguishablefrom the wheel turningbackwards at a rate of -6 Hz., equivalentto runningthe film in Fig. 2 in reverse. To make this clearer,considerthe case of F = 23 Hz. At each frame,the wheel turns23/24 of a revolution,almost all the way fromwhereit starts. To verifythat this will appearas a slow backwardsmotion (-1 Hertz), one only has to go to a western movie and watch carefully, taking into account the greaternumberof spokes on most wagon wheels. Finally, at F = 24 Hz., the wheel does not appearto move at all! Thus if we start at F = 0 Hz. and graduallyincreasethe speed to F = 24 Hz., we see the wheel move slowly forward(counterclockwise,iLe.,with smallpositive frequency),go fasterto a maximumat F = 12 Hz., then slow down while turningin the opposite direction(clockwise motion, negativefrequency)and stop completely at F = 24 Hz. If we furtherincreaseF from 24 to 48 Hz., we see againthe same sequenceof events. This is F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 39 0000000 0000000000000000000 t =0 0=0 t- t = 1/24 0 =sr/2 t = 2/24 0 =n t = 3/24 0 = 31r/2 t = 4/24 0 = 2r = 0 Figure2. Movieof a one-spoke wagonwheel turning(apparently)at f= 6 Hz., shot at 24 framesper second. becauseall frequenciesoutsidethe range0 to 12 Hz. are indistinguishable, exceptforbeingpositiveor negative,from 0 and 12 Hz. accordingto the between frequencies relationship Fa +)R =F- F < 2 < (k+2(1)R 2 2 where Fa F R k is the "apparent" frequencyin Hz., is the actualfrequencyin Hz., is the samplingratein Hz.(samples/second), and is anyodd integerwhichsatisfiesthe inequality. Thusif the wagon-wheelturnsat 28 Hz.(=F) andwe filmit at 24 framespersecond(=R),we choosek to be an odd integerwhichsatisfies )F < 2 k_(< 12k < 28 < 12(k+2) 22)R Theonly oddintegerwhichsatisfiesthisinequalityis k = + 1, since _ 1 .24 <-< 28 1 2 8 - 2 <-,< -- 3 24 2 Thereforethe apparentfrequencyis F=28 =28 -2.242 = +4 Hz. Thepointis thatwhileF maybe anyfrequencywhatsoever, is restrictedto a definiterangeof frequencieswhich Fa dependson the samplingrate.Wecansee thatFaandF arethe sameonlyif (k + 1)R/2 is equalto zero,whichis trueonlyif k =-l1. Then Fa = F-O R' R 2 <F-<+ R 2 or simply,Fa = F only if IFI < R/2 (IF I is the "absolute value"or "magnitude" of F withoutregardto its plusor minussign).If IFI> R/2, thenFa#=F, andwe saythatFais analiasofF. Thisphenomenon of aliasingor foldoveris found in all sampledsystems,whetherthey arefilmedwagonwheels or digitizedwaveforms.Equation(1) is a statementof the samplingtheorem,whichstatesthat any simpleharmonic Page 40 variation(i.e., a sinusoidalvariationof a one-dimensional quantity,a circularmotionin a two-dimensional quantity, etc.) whichoccursat a rateof F Hz.mustbe sampledat least 2F timespersecondin orderto avoidaliasing. Thereadermayhavealreadynoticedthatif the sampling rateis exactlytwicethe frequencybeingsampled(R = 2F), thenEquation(1) is ambiguous, sincetherearetwo different valuesof k whichwill satisfythe inequality.Wewill come backto thisfinepointlateron. In orderto processsoundswitha computer,we representtheirwaveforms as sequencesof discrete,finite-precision numbers.Theseare the samplesof the instantaneous amplitude of thesewaveformstakenat brief,regularintervalsin time. Anymusicalwaveform canbemodelledasa sumof sinusoidalvibrations, eachwitha particular (thoughpossiblytimevarying)amplitude,frequency,andphase.Thus,in orderto representa continuous(analog)waveformaccuratelywith discrete(digital)samples,we mustensurethat the sampling frequencyis at leasttwo timesgreaterthanthatof the highest frequencycomponentof the originalwaveform.Thesampling theorem(Equation(1)) thenassuresus of the accuracyof our renditionof the waveformif it is bandlimited to the frequency regionbelowone-halfthe samplingrate. Thesamplingprocessis achievedby usingananalog-tovaluein digitalconverter (ADC)whichgeneratesa numerical form(typicallya binarynumberof 12 to computer-readable 16 binarydigits,or bits). The converter's numericaloutput is proportional to theelectricallevel(eithervoltageor current) at its input,whichis sampledat a raterangingfroma few Hertz(forsignalssuchas seismicwaves)to 50 kiloHertz(for high-qualityaudiosignals).Theanalogwaveformis typically passedthrougha low-passfilterto attenuateanycomponents at frequenciesgreaterthanhalf the samplingrate (see Figure3), sincethesearegenerallyimpossibleto removefrom the digitalsignaldue to the aliasingeffect describedabove. Whetherthe distortiondue to aliasingproducesnoticeable effectsin musicalsoundsdependson the relativestrengthsof the aliasedcomponents,butseverealiasingis generallymuch morenoticeableandirritating in soundsthanit is in movies of wagonwheels. Theanalog-to-digital converterproducesa B-bit binary valueto representthe instantaneous amplitudeof the analog signalat eachsample.SinceB binarydigitsmayrepresentat most2Bdifferentvalues,thismeansthatthe ADCmustchoose the closestB-bit valueavailable for eachsample.Thus,if the bandlimited varies analogsignal between,say, +10 and -10 volts, and B = 10 bits, the entire20 volt (peak - to - peak) range Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 SIGNAL SOURCE e.g., sound + air pressure V variation - time -4 TR ANSDUCER e.g., microphone electrical analog to pressure variations LOW- PASSI FILTER "> time -, removes frequency components >R/2 Hz. band-limited analog waveform ANALOG - TO - samples at R Hz. DIGITAL CONVERTER and quantizes to B bits + 0 . time + stores complete-P E representation as sequence of binary numbers COMPUTER MEMORY discrete representation of band-limited analog waveform (digital signal) C time -+ Figure3. Steps by which a continuous signalis convertedinto a digitalsignalfor subsequentcomputerprocessing. may be representedto an accuracyof 20/21o .02 volts at each sample.In other words, the true voltageamplitudediffers from its binaryrepresentationby at most ?10 millivolts, for an accuracyof about ?.05%. Such inaccuraciesare often significant,since we can view them as equivalentto a small, constantamount of randomnoise being added to an otherwise perfectly representedsignal. This quantizationnoise, as it is called, is the digitalequivalentto tape or amplifierhiss, and it is usuallycharacterizedby a signal-to-quantizationnoise ratio (SQNR), expressedin dB (decibels): signalamplitude (2) loglo noise amplitude Thus, if a maximumamplitudeof 10 volts correspondsto the maximumbinaryvalue for a 10-bit ADC, the noise amplitude will be 2-10 as greatas that of the strongestsignal,yielding an SQNRof SQNR in dB = 20 20 log1o 10 21go10.-2-i?- The readermay wish to verify that underthese conditionsand assumptions,the SQNR of a B-bit ADC is approximately 6B dB. However,two caveatsmust be kept in mind: First,we are assumingthat the quantizationerrormay be treatedas a randomnoise independentof the signal, which is certainly questionable,especially at low samplingrates or for small numbersof bits. Second, if the analogsignalamplitudeis not maximal,it must be rememberedthat the noise level remains the same, renderingthe quantizingnoise more audible and bothersome for very soft sounds than for loud ones. ADC's are availablewith 8 to 16 bits of resolution, and while the issue has not quite been resolved,it seems that 12-bit ADC's give minimallyacceptablesound quality, and that improvement beyond 16 bits is probablyunnecessary,since at that accuracythe noise levels of transducersand amplifiersbecome predominant.Specialbit-coding techniquesmay eventually reducethe amountof data in a digitalsignal,but most computer music programsso far have not dealt with this possibility. Producingsoundswith a computeris just the reverseof the process diagrammedin Figure 3: a digital-to-analog converter(DAC) is used to convertbinarynumbersto voltage levels, a low-pass filter "smooths" the waveformby passing only those frequenciesless than half the samplingrate, and the resultinganalogsignalis then amplifiedand transducedby a loudspeakeror earphones. The numericalversion of the signal may be stored in computermemoryand processedin a varietyof ways. Two of the most importantof these processesare transformingthe digitalsignalin orderto analyze its frequencyspectrum,and filtering,which altersits frequencyspectrum.The information gainedby analyzingthe digitalsignalmay be used to understand how such signals might be synthesized-a common objective in computer music-and filteringis a majortechnique for controlling the quality, or timbre, of sounds produced. DIGITALSIGNALS A digitizedsignalis not representedas a function of a continuoustime variable(t), but ratheras a function of discrete valuesof time (n). In other words,we can think of a digitalsignalas a sequence of numbers,each representingthe instantaneousvalueof a (presumably)continuoustime function. Furthermore,we will always assumethat the samples are uniformly spaced in time. Two basic notations for such discrete-valued functions are commonly used in the digital signalprocessingliterature,either x(n), n < N2,nEl N or x(n T), N < n < Nz , nEI Both of these notationshave the samemeaningexcept that in the second one the samplingperiod, T, is shown explicitly. Since it is easy enoughto rememberthat the relationbetween successiveinteger valuesof n and time dependson the samplingrate, we will use the first notation here. For example,a one-second sine wave at a frequencyof 100 Hz. is represented as 20 logo 210 -20 loglo 1000 = 60 dB I ! f(t) = sin(wt), F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions 0 < t < 1 Page 41 Re[eijon] where Co= 2rrF= 21rX 100. If we samplethis waveformat 500 Hz., the discrete form of this equation will be written x (n) = sin(cn) , 1 0 < n < 499 againwith o = 2rrF,but n and t arenot equal to each other. Properlyspeaking,the quantity n T, where T = 1/500 second, is equal to discretevaluesof t for integervaluesof n. Note also that we will generallynumberN samplesfrom 0 to N - 1. Two importantspecial functionswhich we will need in our discussionare the impulse, or unit sample, function and the complex exponential function. The digitalimpulsefunction is defined to be equal to one only if its argumentis zero, and it has a zero value otherwise,Le. I n (3) it If we want the impulseto occur on some sampleno :- O0, follows from the definition that the following equation is true: (4) n = 4.no example 0Figure for 4shows aspecific Figure4 shows a specific example for no = 4. -2 1 -1 0 1 2 3 4 5 6 7 8 u(n-4) 1 -1 0 1 2 4 3 5 6 7 8 9 n- Figure4. The unit sample (digitalimpulse)function u(n) and the delayed unit samplefunction u(n -no) for the case no = 4. The complex exponential function cannot be graphed quite as easily as the unit samplefunction becauseit has values consistingof both real and imaginaryparts: eiwn = cos (cn) + j sin (con) wherei2 = - 1. Page 42 ( . (, Figure 5 shows two graphsof this function, one of its real part, and the other of its imaginarypart. From looking at Figure5, we cannot tell the frequencyof the sinusoidalwaveforms depicted. But we can see that there appearto be 8 samplesin each period of the sinusoidalwaves, and deduce that the frequency of these waveformsmust therefore be one -eighth the samplingrate. Waveformsobtainedby sampling sounds do not have real and imaginaryparts,of course; we say that such waveformsare pure real, or equivalently, that they are complex with a zero imaginarypart. SPECTRA 9 n-- -2 1 Figure5. The complex exponentialfunction ejwn, shown as graphsof its real and imaginaryparts,co = 2r/N, N= 8. n =no u(n - no) = u(n) Im[eljn] 0 u (n) = 1 T-1 (5) As almost everyoneknows, two differentmusicalinstruments playing the same pitch at the same loudness for the same durationfrom the same directionstill sound different due to what is called their tone color, or timbre.Unfortunately, this subtractivedefinition of timbre only says what it is not: timbreis that aspect of a sound which is not its pitch (if it has one), loudness, duration,or directionality.Whatis left is just the microstructureof the sound, and in orderto examine it, we need a way to literallydissect sounds,i.e., to analyze them into their constituent parts. Obviously, a complete descriptionof all of the constituent partsof a particularsound will include informationabout its pitch, loudness,and so on, and in a broadsense we might even include these qualitiesin our definition of timbre. Except on a few electronic instruments such as Theramins,or electronicorgans,the tone color does not remainthe samewhen differentnotes are played, due to varyingstringor tube lengths, lip tension, and so on. Since all of these variationsin tone quality are quite relevantin accounting for the characteristicsounds of musicalinstruments, we can see that analysisof the microstructureof a sound is likely to yield informationonly about that particular sound. Even two successivenotes played on the same instrument by the same performerin the same mannerare likely to have strikinglydifferentmicrostructures.It is the study of this Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 tonal microstructure,and its relationshipto what we hear,that is one of the deepest problemsin computermusic research,for it is here that the complexitiesof the physics of the instrument, of room acoustics,and of the psychology of the listener, enter in. Our model for describingthis microstructureis called the spectrum of a sound, by analogy to the spectrumof a beam of light that may be obtained by passingthe light througha prism.The prismhas the propertythat light made up of different frequencies,or colors, is refractedby varying amounts,the index of refractiondependingon the component, or primary,color in question. By observingthe intensities of the light at these differentfrequencies,we are able to determine the make-up of the originallight beam. If the beam is "purewhite" light, we obtain a "full spectrum,"proverbially the colors of the rainbow. The prismfor soundsis Fourieranalysis.By applyingthe Fouriertransformto the waveformof a sound, we can mathematicallydeterminejust which amountsof which frequencies are responsiblefor that particularwaveshape,and we can use our analysisas a guide in synthesizingthat sound. If the sound consists of all audiblefrequenciesin roughlythe same amounts, we call the result "white sound," by analogy to white light. Unfortunately,since a rainbowis considerablymore appealingthan the steady, steam-like hiss of its audible counterpart,we usuallyreferto this sound as white noise. If, however,some frequenciesare considerablymore predominant than others, the sound becomes "colored,"and if the relationships amongthese predominantcomponentsbecome roughly harmonic(i.e., the frequenciesare integermultiplesof a single frequency, called the fundamental frequency) the tone will acquirea more definite pitch. When the waveformconsists entirely of harmonically related frequencies, it will be periodic,with a period equal to the reciprocalof the fundamental frequency(which need not be present). The measurementof sound spectrais complicatedby the fact that the spectraof almost all sounds changeboth rapidly and drasticallyas time goes by. This situationis worsenedby the fact that the accuracywith which we can measurea spectrum inherently decreasesas we attempt to measureit over smallerand smallerintervalsof time. The spectrumof any instant duringthe temporalevolution of a waveformdoes not even exist; for example, we could scarcelytell anythingat all about the frequencycomponentsof a digitalsignalby examining a single sample!We can measurewhat happensto the spectrum only on the averageover a short intervalof a soundperhapsa millisecondor so. The longerthe interval,the more accurateour measurementof the averagespectral content duringthat interval,but the less we know of the variations that occurredduringthat interval.Thus the problemof spectral measurementcan be seen to be one of findingthe best compromisebetween these opposing goals. Just how much accuracyis needed is still an open question in the realmof musical psychoacoustics:in some cases our ears seem to be much more tolerant of approximationsthan in others. The historicalmodel ofspectra as measuredby Hermnnann Helmholtz (see References)is clearlyinadequatefor believableresynthesis (Helmholtzwas able to determinethe averagevalueof spectral componentsover the durationof entire notes played on individualinstruments).A more recent model characterizesa note by attack steady-state, and decay segments.Such a model is certainlyan improvement;but it has limitationswhen applied to the problem of producing "connected notes" (Lie.,to problemsof musicalphrasing).Besides,the "steady-state"of any real tone isn't really"steady"at all. This discussionis not intendedto imply that the situation is hopeless, but only that it is subtle and complex, and that it is as importantto appreciatethe limitations of the spectralmeasurementtechniques presentedhere as it is to realizetheir power. Therecan be little doubt that these techniques, and their relativesand extensions, will be the ones which will eventuallyyield the basis for a richly expressive computermusic. THE DISCRETEFOURIERTRANSFORM The two most commonly used transformsin digital signalprocessingare the discreteFouriertransform(DFT) and the so-called z-transform.The DFT is used to calculatethe spectrumof a waveformin terms of a set of harmonically relatedsinusoids,each with a particularamplitudeand phase. It is usuallyimplementedby meansof a particularlyefficient algorithmknown as the FFT (for fast Fourier transform),the discoveryof which has made spectralcomputationa much more practicalrealitythan it would be otherwise.Since the DFT is less restrictive(albeit less efficient), and since the FFT is well documentedfor those with a basic understandingof the DFT, we will consideronly the DFT here. The z- transform, unlike the FFT, is not somethingthat is typically calculated with a computer, but is rathera mathematicaltool used primarilyin the theory of digitalfilters. It is in a sense more generalthan the DFT, since it includesthe DFT as a special case, and it is of considerableinterestin the generaltheory of digitalsignalprocessing. The fundamentaloperationof the DFT is to decompose an arbitrarywaveforminto its spectrum.The spectrumof a waveformis a descriptionof that waveformin terms of a number of "basic buildingblocks" for waveforms,which in the case of the DFT are sinusoidswith harmonicallyrelated frequencies.By analogy,if we factor an integerinto its prime factors,we have in a sense "decomposed"the originalnumber into basic numerical "building blocks;" for example, 340 = 1 X 2 X 2 X 5 X 17. The buildingblocks themselves (the prime numbers)are just those numberswhich cannot be furtherdecomposed: they can be expressedonly as one times themselves,hence they arethe componentsof which other numbersare formed,and not vice versa.Finally, factoring any integerinto primesyields an answerthat is unique: there is no other set of primenumberswhich, when multiplied together,will yield 340 except the set stated above.Perhaps we picked the number340 in the first place, not on the basis of its primefactors,but becauseit was the sum of the agesof everyonein a small room: 340 = 10 + 28 + 32 + 40 + 50 + 50 + 60 + 70. This is clearlyanotherway to decompose 340, but it is not unique,since an infinite numberof sequencessum to 340. Similarly,the basicbuildingblock used by the DFT for waveformsis the sinusoid.The DFT worksby treatingN samplesof a waveformas if it were one N-sample period of an infinitely long waveformcomposedof a sum of sinusoids which are all harmonicsof a fundamentalfrequency correspondingto the N-sample period. And, like the primefactors F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 43 discussed above, this set of harmonicallyrelated sinusoids, each with a particularamplitudeand phase, is unique: no other set of sinusoidscould be summedtogether to obtain the originalwaveform.Of course there may be other, possibly non-unique ways of decomposinga waveform,just as 340 could be non-uniquely decomposedinto non-unique sums of ages ratherthan a unique product of primes.In fact, other unique decompositionsfor waveformsexist besidesthe sum of sinusoidsyielded by the DFT, which we won't considerhere. We should also discussthe concept of energyat a particularfrequency. A waveformmay, in signalprocessingparlance, have energyat, say, 100 Hz., which meansthat at least one sinusoidalcomponent with a frequencyof 100 Hz. and a nonzero amplitudeis presentin the vibrationpattern.Energyin this case designates"that which exists" at 100 Hz. which does not exist at, say, 110 Hz. The DFT functions by measuringthe amplitudesof sinusoidalcomponentsat particularfrequencies in a waveform,and since energy can be shown to be proportional to the squareof amplitude,we can see that this process measuresthe energy at such frequencies.We could imagine accomplishingthis processin a laboratorywith a set of electrical audio filters,each of which will pass energyonly at one frequencyand block energyat all others. This bank of filters could be used to detect energiesat a set of frequenciesfor an arbitraryinput signal.The DFT accomplishesthis mathematically in the following way. Suppose x (n) is a sequence of numbersrepresenting N samplesof one periodof a waveformwith a period of N samples.For example, let x (n) = A sin (con), with co = 2nr/N and 0 n ?<N- 1. For N = 8 we would have the sequence x(n) = 0, - A A .707A, A , A 0, A x (n)sin(con)= 0 n=O + 4A =N + A + A A+ + A + A + A 0 N-1l A Z A sin(cn) sin(2con)= L n=0 n=O0 AN-1 = --- cos(-on) - [cos(-con) - cos(3on)] AN-l n=0 cos(3cn) = 0 n=0 We have used a trigonometricidentity to show that this sequenceis composedof the sum of two cosine waves,one at frequency -co and the other at frequency 3o. Since the cosine wave has the same symmetryabove and below the horizontal axis as the sine, both of these componentssum to zero as well, indicatingno energyat frequency2co. As long as we are summingup the valuesof a sinusoidalwaveformover any integralnumbersof periods, we get zero. The sinusoidswhich have an integralnumberof periodsin a durationof N samples arejust those correspondingto the harmonicsof the frequency with a period of N samples. So far we haven'tconsideredthe phase of the sinusoidat frequency w. We recall from Part I of this tutorial that a sinusoidwith arbitraryphaseand amplitudecan be represented as A sin (cn + ) =a cos (cn) + b sin (cn) (7) where A 4 a b is the amplitude, is the phaseangle, is equal to A sin j, and is equal to A cos 4. Both the amplitudeand phaseof a sinusoidalcomponent at frequencyo can then be determinedby usingour multiplyand-sum procedure,first with a cos (cn) multiplierto calculate the "a" coefficient, then with sin (con) to calculate the "b" coefficient. The amplitudeand phase of the component are then givenby the relations A = and (8) =tan-I__b For example,let the sequencex(n) be defined as follows A The resultis A/2, one-half the amplitudeof the sinusoidat frequencyco, scaledby N, the numberof samplesunderconsideration. We could not simply sum together the numbersin the x(n) sequenceto obtain the same result, since summing over any intergralnumberof periodsof a sinusoidyields a zero result. This is due to the symmetryof the sine and cosine functions above and below the horizontalaxis. However,by multiplyingx (n) by sin (con),we form the sequencex (n) sin (con) = A sin2 (con), and all valuesof sin2 arenon-negative. Thus we have "extractedthe amplitude"of the sinusoid at frequencyco in x(n) by purelymathematicalmeans. What would happenif we were to "extractthe amplitude"of the Page 44 N-l -, -A, Wecan measurethe energyat frequency o by extractingthe amplitude, A, of the sinusoid at this frequency. This is accomplishedin this case by formingthe productof x (n) with sin (cwn),and addingup the numbersin the resultingsequence, since N- IA component of x(n) at frequency 2c? Withx(n) defined as before, we expect that no such componentwill be detected, i.e., that its amplitudewill be zero. In orderto verify that this is so, we form the product sequencex(n) sin (2cn) and add up the resultingnumbers: a2+b2 x(n) = A sin (on + q1) + B sin(2on + q2) = al cos (on) + bl sin (con) + a2 cos (2con) + bz sin (2con) where al = A sin 1, bl = A cos 'P, a2 = B sin 52 and b2 = B cos 2 ? Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 We "extract" al via our multiply-and-sum procedure, using cos (con) as a multiplier: N-1 E x(n)cosc(wn) n=0 N-1 = Here the 1/N factor is includedto compensatefor the fact that the latter two sums are scaledby N, as derivedearlier. Notice that if x (n) = A cos (con), and we "extractthe amplitude"ofx (n) at frequencyc with our multiply-andsum procedure,we find that the answeris A/2: i 1NE 1 A cos(con) cos (cn) [a cos2 (cn) + b1 sin(cn) cos (cn) n=0 + a2 cos (2con) cos (cn) + b2 sin (2cn) cos (con)] n=O n=0 N-1 a n=0 = +? cos(2con)] =N [1, 2 But notice also that if we "extractthe amplitude"of x (n) at frequency-co the answeris the same: Similarly,if we were to use sin (con) as a multiplierwe could extract b,; cos (2cn) as a multiplierwould extracta2, and so on. Given both a,, and bi, we can then apply Equations(8) to obtain A and 0b1, if desired.This is the principle of the DFT: the multiply-and-sumprocedureis appliedto determinethe amplitudesand phasesof each of the harmonics of the waveform. How many harmonicsmight be present?Accordingto the samplingtheorem, we need at least 2 samplesin each periodin orderto avoid aliasing;so if N = 8, and the sampling rate R is = 8000 Hz., the only possible frequenciesof which our periodic function x (n) could be composed are the harmonics of 8000/ 8 = 1000 Hz., which "fit", i.e., which have frequenciesless than or equal to one half the samplingrate. However, the samplingtheorem is perfectly admissiveof negative frequencies,so the complete list of integralmultiples of 1000 which have magnitudes< 4000 Hz. is: - 4000 Hz. (harmonic"-4") 3000 2000 1000 0 (harmonic"0") + 1000 (harmonic+ 1, or the fundamental frequency) + 2000 + 3000 + 4000 x (n) is modelled as being composed only of sinusoidsat these frequencies,ie., 1 1 INn=O A N-1 A cos(wn) cos(-cn) ? cos [won-(-con)] +cos [wn+(-con)]j n=0 =A 2 Before we proceed,let us considerjust what is meant by a "negativefrequency."Whenwe consideredwagon wheels, we measuredanglesin a counterclockwisedirectionfrom the righthorizontalaxis as positive angles,and clockwiseas negative angles.Clearlythe angle+ 2700 describesthe samespoke position as -900. Similarly,the radianfrequencyof rotation was positive for counterclockwisemotion and clockwise motion was describedas a negativeradianfrequency.Does it matterwhetherwe use a positive or negativedescriptionof a frequency?Not too surprisingly,the answeris yes and no. Certain mathematical functions have the property known as evenness,which meansthat they areleft-right symmetricalaroundzero: f(x) = f(-x) = f(x) "even" (10) (the symbol =- means "impliesthat"). Other functions have the propertyof oddness, which meansthat they areleft-right antisymmetricalaroundzero: +N/2 x (n) j = k = -N/2 where ak cos(kcn)+ bk sin (kcon) co= 2ir/N, N- I ak = x (n) cos(kon) , and -f(x) (9) b II=0 f(x) x (n)sin(kon) j f(x) "odd." (11) Some functionsare even, some are odd, many areneither,and only one is both (it is left as an exercisefor the readerto discover the only function that is both even and odd). Any function may be thought of, however,as composedof the sum of an even partand an odd part,either (or both) of which may be zero. In other words,any arbitraryfunctionfmay be "broken apart."Thus, N-l = 1 = f(-x) = fe(X) + fo(X) (12) wherefe (-x) = fe (x) andfo(-x) = -fo(x). F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 45 Here'sthe proof that this is so: f(x) = fe(X) + fo(X) = f(-X) = fe(-X) + fo(-X) But, by the definitions of fe andfo, f(-x) =fe(x)- fo(x) Thereforewe can solve for eitherfe or fo by adding(or subtracting)f(x) andf(-x): f(x) +f(-x) = [fe(x)+fo(x) + [ (x)- fo(x)] fe(x) = 1/2 [f(x) +f(- x)] Similarly, fo(x)= h [f(x) - f(-x)] Clearly, fe (-x) = ? [f(-x) + f(x)] = fe(x) and, =1/2 fo(-&x) [f(- X)-f(x)] =-fo(). be mentioned before we proceed to define the DFT. A component at exactly one-half the samplingrate, i.e., with only 2 samplesper period, can only be purely even, since the samplesoccur at angles0 and rr,and both sin (0) and sin (7r) are equal to 0. Thus the "bk"coefficients of Equation(9) will alwaysbe zero wheneverk = ?N/2. Similarly,at zero frequency (also called "D.C."for direct currentby some engineersand others who talk to these engineers),the "b," coefficient is alwayszero, againsince sin (0) = 0. We are now ready to define the DFT of a sequence x (n). As mentioned before, we model x (n) as one N-sample period of a periodicwaveform.The DFT will then yield the unique spectrum of x (n) in terms of the amplitudesand phasesof sinusoidalcomponents,each with periodsharmonically relatedto N, the numberof samplesin the transformed signal.Whilethe FFT algorithmgenerallyrequiresthat N be a power of 2, the DFT (which yields exactly the same result, albeit with less computationalefficiency) places no restriction on N except, of course, that it be greaterthan 2 samples. Following the usual practice in the literature,we will define the DFT in terms of the complex exponential which allows us to representboth sine and cosine functions at once. A typical definition for the DFT is then DFT [x (n)] = X(k) Finally, fe(x)+fo(X)=' [f(x)+f(-x)] +V [f(x)- f(-x)] =f(x). N-I N- 0 n=0 x(n)e-iwnk Wehave provedthat f(x) can be decomposedinto a sum of even and odd partswithout sayinganythingelse at all about f(x), so it is true for all functions! Getting back to the question of negativefrequencies,it is clearthat if a function is purely even, such as the cosine function, the sign of the frequencydoesn't matter at all since cos(con) = cos (-con) (13) But for a purely odd function, such as sine, it representsa negation of amplitude,or equivalently,a 1800phase shift: sin (-con) = -sin (won)= sin (con+n) (14) BecauseA cos (cwn)is an even function, it is generally meaninglessto distinguishbetween positive and negativefrequency cosine waveforms.But the spectrumof a cosine wave may be consideredto contain both positive and negativefrequency components,both of which have a positive amplitude equal to A/2. For a sine waveformA sin (con), we obtain also positive and negativefrequencycomponents,but of opposite amplitude due to the oddness of the sine function: If the positive frequencycomponenthas a positive amplitude,then the correspondingnegativefrequencycomponentwill have a negativeamplitude,and vice versa.The complete DFT yields the amplitudesand phasesof both the positive and negative harmonics.The amplitude is split in half at corresponding positive and negativefrequencies,with the signsof the amplitude of the odd parts being opposite: this explains the fact that only one-half of the amplitude is measuredif we consideronly positive frequencies. One more aspect of cosine and sine waveformsshould Page 46 < k <N- 1 (15) where co = 2rr/Nand e -jwnk= cos (cnk) - j sin (cnk). The inverseDFT is then defined as DFT-' [X(k)] =x(n) IN-1 X (k)e+jwkn .I-N k=0 0 n N-l (16) Thus if X(k) is the DFT of x(n), then x(n) is the inverseDFT of X(k). The mathematicaloddnessof the imaginarypartof the complex exponentialnecessitatesthe use of the minussign in the exponent of the DFT, while the inverseDFT has a positive exponent. Also, since the multiply-and-sumprocedure producesvalues which are scaled by a factor of N, the 1/N factor appearsin the inverseDFT in orderto make the statement DFT-' [DFT [x(n)] ] = x (n) exactly true. The valuesof X (k) (the spectrum)are complex, with the real partscorrespondingto the "a" (cosine, even part) coefficients and the imaginarypartscorrespondingto the "b" (sine, odd part) coefficients of the spectralcomponents.If we denote X(k) as ak+jbk, then (17) IX(k)1=/a+b• is called the magnitude,modulus, or amplitudeof X(k), which is the same as the amplitudeof the correspondingspectral component (except for being scaledby N). Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 bk arg[X(k)] = pha [X(k)] = 4 X(k) = so (18) 4 ak tan-is called the argument,phase, or angle of X(k), which is equal to the phase angleof the correspondingspectralcomponent. The next question is: How do the valuesof the index k correspondto frequency?In orderto understandthis correspondence,it is instructiveto take the DFT of a specific sequence of numbersand see exactly what we get. Let us define x (n) as 8 samplesof a periodicwaveform with a period of 8 samples, sampled at R = 8000 Hz. x (n) must be composed of harmonicsof 8000/8 = 1000 Hz., m=0 Fm 0 1 0 Om Am 1000 1/10 1 0 0 2 2000 1/2 7r/3 3 3000 1/3 4 4000 1/4 7r/4 r/5 Am cos (mcon + Cm) = Am [cos (mcn) cos Om- bm am .100 .433 .236 .236 .202 .147 with am and bm defined as in Table 1, we can see that the form givenin Table 3 yields the same numbersfor x(n) as Table 2. Figure6 is a graphof x(n). Only those valuesfrom n = 0 to n = 7 are considered(one period);these are shown as solid lines on the graph. x(n) is presumablyan 8-sample sequence extracted from a longersequencewith a period of 8 samples (other values of this longer sequence are shown on dotted lines). A glanceat Figure6 confirmsthat the spectralstructure of x(n) is not very apparentfrom observationof its waveform, Table 1. Coefficientsrepresentingone period of a sampled m=O waveform, x(n)= Amcos(mon +'). 4 sin(mcn) sin Om = Am [amcos(mcon) - bmsin(mcn)] 0 0 1.000 .250 Am cos (mn + Om) with o = 2-n/8= ir/4. Values for both the amplitudesand the phaseanglesfor each of the five componentsare givenin Table 1. Table2 shows actualnumericalvaluesfor the five componentsof x (n). The sum of these five components,Le., the numericalvalueof the samplesof x (n) itself are shown at the end of Table2. The so-called "analytic(cosine and sine) form" of the components of x(n) is shown in Table 3. By applyingthe trigonometric identity (Correspon(phasein ding frequency in Hz.) (amplitude) radians)(Amcos m) (Amsin m) m > x(n)= 4 x(n) = > Am cos(m(on+Om) m=0 0On<,7 nr 27r 4 CO8 Components: Am cos (mcon+ Om) m 1 2 3 4 5 6 7 .100 .100 .100 .100 .100 = .100 cos (0) -.707 0 .707 = cos (- -.433 -.250 .433 -.236 0 = -.202 = 0 .100 .100 .100 1 1.000 .707 0 2 .250 -.433 3 .236 -.333 .236 0 4 .202 -.202 .202 -.202 = 1.788 -.161 .288 -.376 -.250 -.707 .433 -1.000 .250 -.236 .202 .333 -.202 .202 -.909 -.184 .500 cos + 0 (-23j+ .333 cos (4 + .250cos rn +I-- -4 4 -.684 m=0 1.038 = x(n) Table 2. NumericalValuesof the Componentsof x(n) as describedin Table 1, and their sum, x(n) itself. F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 47 yet applyingthe DFT to x(n) will indeed tell us exactly the components of which x(n) is composed. In orderto calculatethe DFT it is useful to make a table of values for e-Jwnksuch as the one shown in Table4. We start with k = 0: 7 X(O)= x(n)e-jo X(1) = ~. x(n)e-iwn n=O n=O 7 7 7 = The real part of the answer(.8) is supposedto be N times one of the "a" coefficients in Table 1, and indeed, it is N- ao. The imaginarypart of the answer(0) correspondsto bo. So k = 0 apparentlyrefersto the zero frequency,D.C., Oth harmonic component of x(n). For k = 1: 7 n=0 x(n) cos(0)- n=O x(n) cos(2nn/N) - / n=0 x(n) sin(27rn/N) [(1.788)(1) + (-.161) (.707) +... etc.] 7 = 7 = i] n=0 x(n) sin(O) n=0 x(n) = 1.788 + (-.161) +.288 + (-.376) +j [(1.788)(0) +(-.161)(-.707)+... etc.] = 4+j0 + (-.684) + (-.909) + (-.184) + 1.038 which is just N/2 times (al + jbI), the coefficients for the 1st harmonic, or fundamental frequency. Table 5 shows a .8 4 x(n) = m=0 [am cos sin (mon)] (mon)-bm 2vr ir 4 8 7 0n Components: am cos (mwn) m n 0 1 2 3 4 5 6 7 0 .100 .100 .100 .100 .100 .100 .100 .100 = 1 1.000 .707 0 -.707 0 .707 = 2 .250 0 -.250 0 3 .236 -.167 4 .202 -.202 - 1.000 -.707 .250 0 -.236 0 .167 .202 -.202 .167 .202 -.202 .202 cos = n .250 cos(n-) -.167 = .236 -.202 = .202 cos (nir) 0 -.250 0 .100 cos (0) cos- bmsin (mon) m 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 0 = 0 sin (0) 1 0 0 0 0 0 0 0 0 = 0sin 2 0 .43 3 0 .167 4 0 0 0 -.236 0 -.433 .43 0 0 -.167 .236 0 0 0 .167 0 0 -) -.433 = .43 -.167 = .236 sin 0 = .147 sin (ni) sin (_r) Table 3. AlternativeForm of the Descriptionof the Componentsof x(n) as Describedin Table 1. Page 48 Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 2 x(n) 9 9 S91 0 0I 0 0O Figure6. A graphof x (n) as describedin Table 1. x (n) is periodicwith a period of N = 8 samples. Only the period from Q oo n = 0 to n = 7 is considered -1 (solid lines), but presumablythe function repeatsitself before and after this period (dotted lines). f -2 n=O e-jwnk = cos (wnk) -j sin (wnk) component values for N = 8, C = 2nr/N cos (2rnk/N) k n 0 1 0 1.000 1.000 1 1.000 .707 2 1.000 3 2 1.000 3 1.000 0 -.707 0 -1.000 0 1.000 -.707 0 4 1.000 -1.000 5 1.000 -.707 0 6 1.000 0 -1.000 0 7 1.000 .707 0 -.707 1.000 .707 -1.000 .707 4 1.000 -1.000 5 1.000 1.000 -1.000 1.000 0 0 -1.000 0 0 -.707 .707 -1.000 .707 1.000 .707 -1.000 0 -.707 0 -1.000 0 -.707 0 .707 1.000 -1.000 1.000 7 -.707 1.000 -1.000 6 -j sin (27rnk/N) (all values below are multipliedby j) 0 11 2 3 44 5 6 7 0 0 0 0 0 0 0 0 0 1 0 -.707 -1.000 -.707 0 .707 2 0 -1.000 0 3 0 -.707 4 0 0 5 0 .707 -1.000 6 0 1.000 0 7 0 .707 1.000 0 1.000 1.000 0 -1.000 -.707 0 .707 0 0 0 .707 0 -.707 -1.000 .707 0 0 1.000 -.707 1.000 .707 0 1.000 -1.000 .707 0 1.000 0 - 1.000 0 -.707 -1.000 -.707 Table4. Valuesof e-iwnk for c = 2nl/N, N = 8 F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 49 complete table of X(k) for all valuesof k from 0 to 7. Several remarksare in orderabout X(k). Wesee that k correspondsto the harmonicnumberfor k = 0, 1, 2, and 3. But the D.C. and half samplingrate componentsare scaledby N, while the rest are scaled by N/2. For k = 5, 6, and 7, we see that the coefficients are the same as k = 3, 2, and 1 respectively,except that the sign of the imaginarypart is reversed.Since the imaginarypart correspondsto the sine function component, and since sine is an odd function, it is clearthat these are the spectralvaluesof the negativefrequencies. k = 5 corresponds k X(k) Corresponding SpectralCoefficients 0 .8 +jO N(ao +jbo) 1 4+j0 2 1 +j 1.732 3 .944 +.944 4 1.616 +j0 5 .944-j. 944 6 1 -j 1.732 7 4 -j20 Corresponding Frequency(Hz.) ?0 N 7 (a +1jb1) N 2 (a2 +b2) + 2000 2 (a3+jb3) + 3000 N (a4+jO) ?14000 N (a3- jb) -3000 2 (a2 jb2) N bN(a - 2000 + 1000 out again. This procedurewill be left to the readeras a valuableexercise. Let us proceed by examiningsome of the propertiesof the DFT and derivesome importanttransforms that will be useful later. First of all it is importantto note that we have defined the DFT in such a way that only the principalvaluesof X(k) are calculated.A more generaldefinition is N-1 X(k) = E x(n) e-jwnk n=0 t Amplitude 1000 N/2 I O<k<N-1 n=0 +Fo +R 2 Frequency-* (Hz.) to the -3rd harmonic, k = 6 corresponds to the -2nd har- monic and k = 7 to the -1st harmonic. Whatabout k = 4? As mentioned above, the samplingprocesscannot representan amplitude for a sine component at half the samplingrate, so the imaginarypartis 0, even though we used a non-zero value for b4 (b4, in fact, has been ignoredby the sampling process). Also, the scale factor is N for k = 4 instead of N/2. This is becauseboth the D.C. and one-half sampling rate components perforce have zero imaginaryparts, and hence it is impossibleto split them into distinguishablepositive and negativefrequenciesas can be lone with the other components.This says that, when samplingat 8000 Hz., we cannot distinguishcomponentsat + 0 Hz. from - 0 Hz., nor can we distinguish components at +4000 Hz. from componentsat - 4000 Hz. This accounts for the ambiguityin Equation(1) when F = ?R/2. It also accounts for the different scale factors for X(0) andX(N/2). In orderto completely check our transformdefinition we should apply the inverseDFT to X(k) and see if x(n) pops 0 -R -Fo 2 Table 5: The discrete Fouriertransform(DFT) of x(n) as describedin Table 1 (N = 8, R = 8000 Hz.). Page 50 --N IN/ x(n)e-jwnk (19) This definition shows explicitly that the spectrumobtained from the DFT is a periodicfunction of frequency. In other words, the principalvaluesof the N-sample DFT of cos (con) arejust what is shown at the top of Figure7, but this is really only a part of the full picture, shown at the bottom of Figure7. This spectralperiodicityis due to the periodicityof the complex exponential function itself, and theoretically extends over all frequencies for all digital signals. It is easy to see from the graphof the full periodicspectrumof a sampled signal how no frequenciesgreaterin magnitude thanR/2 can exist. If Fo in Figure7 were insteadthe frequency R - Fo, which is greaterthanR/2, the plots would look exactly the same. If we (properly) interpret the k index of X(k) = DFT [x(n)] as negativefrequenciesfor k > N/2, and normalize amplitudesthat may be scaledby N, we can begin to construct a useful conventionfor spectralplots. Only the smooth N-1 X(k) = DFT[x(n)] = -oo < k <oo Amplitude I! - R- Fo -3R 2 -R+Fo -R -R -Fo 0 2 -Fo R - Fo R 2 R R+Fo 3R 2 Frequency-+ (Hz.) Figure7. Spectrumof cosinefunction at frequencyIFoI<R/2. At the top we see just the 2 principalcomponents, one at +Fo Hz. and the other at -Fo Hz., both with an amplitudeof N/2. At the bottom we see three periodsof the periodicspectrum of the same functionscentered around0 Hz. Normally only the top figure is used, but the spectrumof all digitalsignals is actually periodicas shown at the bottom. Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 cosine (real,even) sine (imaginary,odd) I unit sample (impulse) (constant) impulsetrain impulsetrain sin irx rectangularpulse bell-shapedpulse sine bell-shapedpulse Figure8. Some transformpairs. Eachmemberof a pairtransformsinto the other (see text). The tick markson the horizontal axes representunit valuesof time or frequency;on the verticalaxes, they representunit valuesof amplitude. After The Fourier Transformand Its Applications by Ron Bracewell.Copyright@ 1965 by McGraw-Hill,Inc. Used with permissionof McGraw-HillBook Company. F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 51 curve need be given (ratherthan a sequence of dots on the heads of sticks) for functions like cosine or sine-it is understood that this curve is sampled at the samplingrate. When convenient or appropriate,some functions such as the impulse will still be shown with the dot-stick notation. Transformpairsfor some common functions are shown in Figure8. It shouldbe rememberedthat the DFT is reallya two-way process:each memberof a transformpairtransforms into the other member.We see, for example, that the unit samplefunction transformsinto a constant spectrum.Orwe can read this pairin the opposite direction:a constant-valued sampledfunction has energy only at zero Hz. The scale markings in Figure8 correspondto unit valuesof time or frequency on the horizontalaxes, and unit valuesof amplitudeon the verticalaxes. Rememberthat if one of the membersof a pair is interpretedas a time function, its amplitudeis scaled by one, and its correspondingspectralamplitudeswill be scaled by N. function x(n) with u(n), the impulsefunction (third from the top in Fig. 8), then the result is just the same as x(n). In other words, x(n) * u(n) = x(n) where the asteriskdenotes the convolution operation.Note that this is certainlydifferentfrom multiplyingx(n) by u(n), which would resultin setting all valuesof x(n) to zero, except for x(0), which would remainunchanged.The impulseis said to be an identity function with respect to convolution,since convolving any function with u(n) leaves that function unchanged.If we scale u(n) by a constantci, the resultis x(n) * clu(n) = clx(n) which is just x(n) again,but scaledby the sameconstant. If, however,we convolvex(n) with cl u(n - no), a shifted, scaled impulse function, the resultis x(n) * clu(n - no) = clx(n - no) CONVOLUTION If X, (k) = DFT [xl (n)] and X2 (k) = DFT [x2 (n)], then an importantproperty of the DFT known as linearity assuresus that the following is alwaystrue: DFT[clxl(n)+c2x2(n)] = c1X1 (k)+c X2(k) (20) where cl and c2 are arbitraryconstants.Does the samehold true if we multiplyxl (n) and x2 (n)? Unfortunatelynot, since the spectrum of the product function xl (n) x2 (n) is not X1 (k) X2 (k). Thereis a method for obtainingthe spectrumof the product of two different functions known as convolution, which we will treat first in a qualitativeway. If we convolve a a shifted, scaledversionof x(n) (see Figure9). In orderto convolvex(n) with a scaled,delayed impulse function, we just scale and delay x(n) by the same amount. But we can view any sampledfunction as a collection, or sequence,of scaled, delayedimpulsefunctions! For example, supposex(n) is defined by , u(n) = n=-l n = +1 otherwise S.5 Inl< 2 andy (n) is defined as = 0 x(n) x(n) * .5u(n) = .5x(n) x(n) * u(n-z) = x(n-z) otherwise In orderto convolvex(n) with y(n) (see Figure 10), we can think of x(n) as being composed of two impulses:one scaled by + 1 and delayedby - 1 samples,the other scaledby +2 and delayed by +1 samples.Thus we place a copy of y(n) at n = - 1 (i.e., we form the functiony(n + 1)), and another copy at n = +1, this one scaled by +2. The convolution of x(n) andy(n) is just the sum of these two shifted, scaledversions of y (n). Wecould also have treatedy (n) as a set of five impulseslocated at n = -2,- 1, 0, 1,2, and scaledby .5 each. Placingscaledcopies of x(n) at each of these 5 locations and addingup the 5 resultingfunctions (Lie.,.5 [x(n + 2) + x(n + 1) +x(n)+x(n - 1)+x (n - 2)] ) would have yielded exactly the same result.This indicatesthat the convolution operationis commutative,which is to say x(n) ? y(n) =y(n) * x(n). The mathematical definition of convolution is as follows. Supposex(n) is a sampledfunction of durationNx samples,ie., x(n) is non-zero only in the range0 < n < Nx- 1. Let y (n) be a similarfunction of durationNy samples.The convolution of x(n) withy(n) defines a new function _D . Figure9. Examplesof convolution of an arbitrarysequence x(n) with the unit samplefunction. Page 52 1 2 0 x(n)= y(n) x(n) (21) n z(n) = x(n) *y(n) = > m=0 x(m)y(n - m) Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions (22) Volume II Number 2 together,however,we must convolvethe spectraof the two separatecosine waveforms.The resultis shown at the bottom of Figure 11. We treat the spectrumof frequencyF2 as if it were two impulseswith amplitude= /2. We then make two copies of the spectrumof the cosine waveformat frequency F1, center a copy at each of the two locations indicatedby the F2 impulses, and scale the F1 copies accordingto the F2 impulse strengths.We have just demonstratedyet another trigonometricidentity, y(n) 2- no 0 x(n) 2 - 0 cos (A) cos (B) = ? [ cos (A - B) + cos (A +B)] n-* 1 -y(n + 1) 2- since the spectrumof the product of our two cosine waveswas just two more cosine waves,one at the frequencyF1 + F2 and the other at the frequencyF1 - F2, both amplitudesscaledby Y1. Notice that it is quite possible to get foldoverwhen multiplying waveformstogether, since, in this case for example, F, + F2 Hz. might well be greaterthan R/2 Hz. even though neitherF1 nor F2 is. The aliasedcomponentswould have to obey Equation(1), just like all the well-behavedcomponents. THE Z-TRANSFORM + The z -transformis widely used in digitalsignalprocessing theory, and has a very simpledefinition. However,like the oriental game of Go, it is much easier to learn what the z-transformis than to masterit. Wecall x(n) a "one-sided" sequenceif all of its valuesarezero for n < 0. It is called a finite, one-sided sequenceif all valuesof x(n) are also zero for - 2 y(n- 1) 0 2 0 n-+ n- Figure 10. An example of the convolution of two functions, x (n) with y (n). -R -F2 2 As is evident in Figure 10, the convolved sequence is generallyof a longer durationthan either function; the dura+ tion of z(n) in Equation(22) is in fact Nx Ny 1 samples. Whenwe multiply two waveformstogether,their spectra are convolved. Similarly,if two spectra are multiplied, their correspondingwaveformsare convolved,as we shall see later when we discussdigital filtering. Using convolution we can see what happensto the spectrum when we multiply two sampledwaveformstogether (amplitudemodulation). Suppose we have two sampledcosine waves, one at frequencyF, and the other at frequencyF2 (let both amplitudesbe equal to one for simplicity).The sampling rate is R Hz., and is much greaterthan either F, or F2, so foldover is not of concern. If we were to add the waves together, their spectrawould also simply add, and the resultis shown at the top of Figure 11. If we multiply the waveforms -F1 0 +F1 +F2 +R 2 -1 -R 2 -(F2+F1) -(F2+F1) 0 F2-F1 F2+F1 +R 2 Figure11. The spectrumof the sum (top) and product (bottom) of two cosine waveforms,one at frequencyF, and the other at F2. F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Paqe 53 n > N, for some finite value of N. The z-transformof a onesided sequencex(n) is then defined to be X(z) = oo j x(n) z-n n =-0 (23) and for a finite, one-sided sequenceof lengthN, it is N- 1 X(z) = x(n) z-n n=0 (23) wherez is a complex variable. The major difference between the DFT and the z-transformis that while in the DFT we multiply and sum the samples of a waveform with a particularcomplex value, namely e-W , in the z-transform,the valueof z may be set to any complex value whatever. Clearly,if we set z to eiw, then the z-transformis equivalentto the DFT for finite sequences. But the z-transform can also be used to gain an additional kind of insightinto the natureof digitalsignals.For example, if x(n) is the unit-step function 1 n >O0 0 n<0 = x(n) then its one-sided z-transformis just X(z)= = 0 l'z-n n=O n1 n=O z-n 1 - z-1 (25) The closed form of Equation(25) is just a statementof the result of summingan infinitely long geometricseries,such as those discussedin Part I of this tutorial. Similarly,we see that the z-transformof the complex exponential eiwn = cos (on) + / sin (on) is just X(z)= n=0 ejwnz-n = n=0 plane (i.e., graphingthe realpart along the horizontalaxis and the imaginarypart along the verticalaxis), we find that the graphis simply a circle of radiusone centeredat the origin, called the unit circle. Since each point on the complex plane representsa possiblevalue for z, we see that Iz I> 1 specifies the set of all points on the complex plane that lie outside of the unit circle. Similarly, Iz l = 1 specifies all points on the unit circle, while Iz I< 1 is the set of all points inside it. The z -transformof any function has a generalform: (z-1 eJi)n M R (l-ziz-1) X(z) = A i= 1 (27) - (l-piz-') i= 1 where A is an arbitraryconstant, II is the "productoperator,"analogousto the "sum operator"Z except that multiplicationratherthanaddi4 tion is specified;for example, II c = 1 2 3 - 4 = 24, c=1 are M of "zeroes" and X(z) zi Pi areN "poles"of X(z). Both the numeratorand denominatorof Equation(27) are polynomials in the complex variablez, and they are shown here in factored form in orderto demonstratethat the zi and Pi values are reallyjust the roots of the numeratorand denominator polynomials,respectively.Theseroots may be complex, and if they are, we know that as long as the coefficients of the numeratorand denominatorpolynomials are real (as they must be for a realizablesystem), then the complex roots will alwaysappearin conjugatepairs. We can then representthe z - transformof a sequence by writing the transformin the form givenby Equation(27) and plotting the locations of the poles (pi) and zeros (zi) as "X's" and "O's"on the complex plane. Such a "pole-zero" plot is just a unique way to representthe sequencex(n) on the complex plane. For example, Equation(25) is a z-transformwith M = 1 zero equal to 0 andN = 1 pole equal to 1, since 1 _ (1-0-z-1) 1-7z-1 ( -1) 1 (26) S- z-1 eiw We recallthat such geometricsumsconverge(i.e., sum to a finite value)only for certainrestrictedvaluesof z. For example, Equation(25) convergesonly if Iz- I is less than one since otherwisewe would be addingtogether an infinitely long sequence of numberswhich do not get smalleras we go along. In I orderfor z" I to be less than one, Izl must be greaterthan = + jb is complex, we can see that Iz z a one. Since 1. If1=J+we make will be greaterthan one only when /a2+bi2> = 1 on the complex a graphof the equation Iz l=-a +b2i Page 54 Similarly,Equation(26) has one (complex) zero with a value of zero, and one (complex) pole with a value of eiw. We can graphthese two equationson the complex plane as shown in Figure 12. Poles are graphedas X's, and zeros are graphedas O's;the unit circleis shown for reference. The poles of the bottom graphare "mirrorimages"of each other in relationto the horizontal(real) axis. In other words, the two poles differ only in the sign of their imaginary part, and thus the poles representa "conjugatepair." The exact location of the eJw pole pair dependson the value of o, the frequencyof the sinusoid.If co = 0, the pole pairwill be coincident, i.e., both X's are placed on top of each other at z = 1. As the frequencyincreasesto o increasesto +rn, _R/2, Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 imaginary complex(z-) plane The significanceof the z -transformis its use in digital filter theory. Wewill not delve very deeply into this topic due to its vast complexity (no doubt it has imaginarypartsin the minds of many theorists),but we can discussbasic elements of some aspectsof digitalfilters. DIGITAL FILTERING real -" imaginary complex (z-) plane real-* Figure12. Pole-zeroplotsof thez-transformof Equation The basic purpose of a digital filter is to modify the spectrumof digitalsignalsin a desirableway. A digitalfilter is usuallypicturedas a "blackbox" with a signalinput and a signaloutput. The operationof the box is describedby what engineerscall its transferfunction, a mathematicalformulation of its operationin the transformdomain.For examplewe may wish to create a filter which will pass all frequency components lower in frequency than a certain value, Fc, and to remove all others. Such a filter would be a low-pass filter (LPF), and Fc is its cutofffrequency. Wemay desireto create high pass filters(HPF) or band-passfilters (BPF) to removeor attenuatecertain frequencieswhile passingothers. An all-pass filter may be used to modify only the phase on certaincomponents. In generalwe would like to be able to specify an arbitrarytransferfunction in terms of a frequency response curve, to be multiplied by the frequency spectrum of any signalwhich is treatedby it, and possibly also a phase response curve,which would modify the phaseof an arbitrarysignal. The transferfunction for an ideal LPF is shown in Figure 13. From now on, we will use "normalizedfrequency"in our plots, where o = ?ir correspondsto a frequency of ?R/2. This allows us to speak of frequencies in a way that is independentof any particularsamplingrate. 27. Top: M 1, N = 1. Bottom: M= 2, N = 2. andat w = ?n, the two poleswouldbe on top of eachotherat z = -1. Intermediate at interfrequenciesare represented mediatepositionsalongthe circumference of the unitcircle. Thez-transform,likethe DFT,alsohasaninverse,albeit a morecomplicatedone thanthe inverseDFT.The only methodof performing the inversez-transform whichwe will considerhereis calledpartialfractionexpansionof X(z). we changethe formof X(z) If, by algebraicmanipulation, fromthatof Equation(27) to N X(z)= 1 i= i (1 x(n) = (28) -piZ) ci(pi)n n >0 i= 1 0 - (28) wherethepi arethe poles(rootsof the denominator polynomial)andtheci areconstantsderivedin the algebraic manipulais givenby: tion, thenthe inversez-transform N f Amplitude (29) n<O0 Othermethodsexist for the inverse z-transform,but noneof themis simplerto performthanthis unfortunately, one. ++C +Wc -O Frequency-+ Figure13. Transferfunction for an ideal low pass filter with (normalized)cutoff frequency c,. The filter describedin Figure 13 operatesby multiplying the spectrumshown by the spectrumof the input signal,thereby setting to zero any frequencycomponentsin the signalthat are greaterthan ?+ wc If we think of Figure 13 as the spectrumof some signal, call it h (n), then we would be able to implement our ideal LPF by convolvingthe input signalwith h (n), since multiplying spectracorrespondsto convolutionof waveforms,just as multiplying waveforms corresponds to convolving their spectra.Thus, if x(n) is the input signaland X(z) is its transform,y(n) is the output signaland Y(z) is its transform,and F. R. Moore:An Introductionto the Mathematicsof DigitalSignalProcessing,PartII This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page55 of H(z), thenthe transferfunch(n) is the inversetransform tion of ourfilter, y(n) = h(n) * x(n) by anequationof the Anydigitalfiltermaybe described form: m (30) y(n) = to corresponds Y(z)= H(z)X(z) (31) The functionh(n) is calledthe impulseresponse(or unit sampleresponse)of the filter,since Y(z) = H(z) only if X(z) = 1, whichis the transformof the digitalimpulse function. We can uniquelydescribeany filter by its impulse responseh(n), whichis just the (inverse)transformof its transferfunctionH(z). Andoncewe knowH(z), whichwe its responseto an canobtainfromanyfilterby transforming impulsefunction,we knowwhatthe filterwill do to any input signal,sinceH(z) definesthe frequencyand phase responseof the filtercompletely. Wecalculatethe frequencyandphaseresponseof a filter withthe followingsteps: 1. Findthe impulseresponseh(n) of the filter.Thiscanbe doneempirically by excitingthe filterwithu(n) as an inputsignal;h (n) is then"whatcomesout." 2. Calculatethe z-transformof the impulseresponse: H(z) = Z h(n)z-n. Thissummaybe eitherfiniteor infinite,dependingon whetherh(n) is of finiteduration. 3. Set z equalto ejw. Thismakesthe transferfunction H(z) yieldthe spectrumof theimpulseresponse, H(eiw). 4. Thefrequencyresponseof the filteris thendefinedto be themagnitudeof the spectrumof its impulse response:IH(eiw)I. 5. Thephaseresponseis similarlypha[H(eiw) ]. Manysubtleproblemsareencounteredin digitalfilter design.For example,the idealLPFdescribedaboveis an If we exampleof a filterwhichis saidto be unrealizable. of the "rectangular examinethe transform box"functionin Figure8, we see thatit lookssomethinglikea cosinewavethat dies downin amplitudeon both sidesof the origin(this functionis calledthe "sinc"function).If the rectangular box werea digitalwaveform(a sortof singlehalfperiodof a squarewave),thenthissaysthatthe spectrumof thissignal graduallydies away as the magnitudeof the frequency increases,both in positive and negativedirections.But if the rectangularbox is the spectrum,as for the ideal LPF, this says that the impulse response of the filter begins before n = 0, the time at which the impulseoccurs. In other words, the ideal LPF beginsto respondto its input before it has occurred.No wonderthis filter is called unrealizable!It would need some sort of "crystalball" with which it could look into the future, see an impulse coming from the distant future, and begin its responsebefore its input arrives.The moralis: if anyone offers to sellyou anidealLPF,don'tbuyit (unlessyou checkit very carefully)! We will end our discussionof digital filters with the generalformulafor all digital filters and two simple examples: a first-orderrealizable(but hardlyideal) LPF, and a second orderBPF. Page56 i=0 N bix(n - i)- i=l aiay(n - i) (32) where x(n) is aninputsignalor sequence, y(n) is the outputsignal, howy(n) bi area set of M coefficientsdescribing dependson the currentinputsampleand the previousM inputsamples,and a set of N coefficientsdescribing are howy(n) ai dependson thepreviousN outputsamples. Thebi coefficientsof Equation(32) determinethe zeros of the transferfunction,andtheai coefficientsdeterminethe poles.Somefiltershaveonlyzerosandno poles(i.e.,N = 0), andhencetheiroutputdependsonly on the currentinput sampleandthe pastM samples.Suchfiltersarecalledtransversal,orfinite impulseresponse(FIR) filters,sincetheir responseto animpulsecannotlastanylongerthanM samples, as definedin Equation(32). If a filtercontainspoles,it is calleda recursive,orinfiniteimpulseresponse(IIR)filter. Manyfilterscontainbothpolesandzeroesin theirtransfer function,the simplestof whichis a first-orderLPF, definedby the differenceequation: y(n) =x(n) + Ky(n - 1) (33) whereK is an arbitrary constantless thanone. In termsof Equation(32), ourfirst-orderfilterhasbo= 1 andal = K. We can find its transferfunctionby transforming the impulse response,h(n); the frequencyresponsewillthenbe IH(eiw)I andthe phaseresponsewillbe givenby pha[H(eiw)]. If we startwiththe initialconditionthaty(- 1) = 0, thenwe canlist the valuesof h(n) as follows: x(n)= u(n) , y(-1) = 0 y(n)= u(n)+ Ky(n - 1) n 0 x(n) = u(n) 1 1 2 3 0 0 0 y(n) 1 +Ky(- 1) = 1 = Ko K=K' 0 +Ky(0) = 0+Ky(1)= K'K=K2 0+Ky(2)= K.K2=K3 etc. Since h(n) is infinitely long, our filter is of type IIR with impulseresponse h(n)= SKn n> 0 nn<0 ComputerMusicJournal,Box,E, MenloPark,CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions 0 Volume II Number2 Theone-sidedz-transformof h(n) is then H(z)= Z h(n)z-n= oo n=0 n=0 K(e-iw ew) ) =tan- S[2 -K(eiw + e-ij)] oo(Kz-')n - n=O Settingz equalto eiw nowyieldsthe spectrumforourfilter Againwe makeuseof Euler'srelationin two forms: eiw + e-iw = 2 cos c and eiw - e-Jiw= 2 jsin w. Also, note that the inversetangentis an odd function, i.e., tan' (- 0) =-tan-1 (0). 1 - 1KeW H(eJw) - e w) Stan Wenow haveonly to find the magnitudeandphaseof the functionH(eiw) in orderto makethe frequencyandphase explicit.A usefulpropertyof complexnumbersis responses thatthe productof a complexnumberwithits conjugateis of the complexnumber;i e., if equalto the squaredmagnitude K(e-'w [ j(2-K2 coso) zz * = (a +jb) (a -jb) = aZ+ b2 = zl2 = -tan' (wherez*= the conjugateof z = a - b) It is alsoeasy to verifythe followingrelations: = j(z + z*) Thus,in orderto findthe frequencyresponsewe calculatethe of H(ei"): magnitude [H(eiW)H(e-iW)]% 1~ I 1 -Ke-iw 1 -Ke* i - +e-w) +K2 1 K(eJw - Noting from Euler'srelationthat eiw + e-iJ = 2 cos w, we obtain 1 [ -- 2K +K cos 3 o The phaseresponseis equal to I L - Ke-J"+ (all (* sin w + z*)= a 1 Im z = the imaginarypart of z = (z z*) = b z - z* - tan' Im Inzz tan' IH(ejw)I = ) I -K cos w K -n = phaH(eiw) 1 Re z = the real partofz =(z Re z K2sin -tatann- z = a +b, then ll(eiw)l= L(1-Keiw + 1-Ke-iw) Knzn oo 00 phaz - Keiw-(1-Ke-iw) F =tan- K I - I - A'e. w Wecannowmakegraphsof IH(eiw)IandphaH(eiw) from--7r?<w < +rnto get a completepictureof the transfer functionas a frequencyresponseanda phaseresponse.Since the frequencyresponseof mostfiltersvariesoversucha great numerical vertical range,we typicallyplotit on a logarithmic scale,suchasis donein Figure14.Ifwe plot20 logloIH(eiw)l, we canevenreadtheverticalscaledirectlyin dB.A valueof 0 dB corresponds to IH(eiw)l = 1, +6 dB corresponds to to IH(eiw)l= .5, andso on. IH(eiw) = 2, -6 dBcorresponds The frequencyresponsecurvesin Figure14 indicate clearlythat our first-ordersystemis a low passfilter,but hardlyanidealone. A simplesecond-order filtermighthavethe equation y(n) = x(n) + K, y(n - 1) + K2y(n - 2) (34) to derivethe whereK, andK2 areconstants.Inattempting impulseresponseof thisfilter,h(n), we findthatthemathematicaltools which we have developedso far are unfortunately not adequateto the task. In orderto "solve"Equation(34) for h(n) we would need to develop the theory of difference equations, which are the digital correspondentsof differential equations in the analog world. If we aregiven h(n), however,we could then carryout the necessarycalculationsto obtain the frequency and phase response. The impulse responsecorrespondingto Equation(34) will be given here without derivation,since it is possible to make use of this filter without being able to solve difference equations. With the followinggift from differenceequation theory, then, we can proceed. The impulseresponseof Equation(34) is either the sum of two exponentiallydecayingfunctions, or a sinusoidwhose amplitudedecays exponentially,dependingon the relationship F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 57 1 20og/o 1 - 2K cosco + K2 30- 20 ..K =.95 K=.5 10-- K sin w -tan-1 1-K cos C, SK= .95 .K=.5 - I\ - ~I+=r -n Figure14. Frequencyand phaseresponseof a first-order system (see text) for two valuesof K (.95 and .5). Page 58 Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2 between the K1 and K2 coefficients. If K1 > impulseresponseis of the form K2 h(n) = a0 (pl)n + a2(P2)n /4, the (35a) (sum of exponentials).If K1 < - K22/4, then h(n)= al rn sin (bn+ O) (35b) where r = -K2 b = cos-1 K1 0 = b,and I sin b The frequencyand phase responseof this filter can be calculatedfrom the z-transformof Equation(35b), evaluated at z = eJw: H(ejw) 1 = 1 - 2r (cos b) e-jw + r2 e-2(w (36) likely that computermusic will continue to be fruitfullypracticed by smallgroupsof cooperatingindividualsratherthan by single personsworkingalone. Also, it meansthat individuals must be willingto learnas much as possible,and to rely on good sourcesfor that which is unknown.The trick is to be able to discerna good source from a not-so-good source. It would help a great deal if computermusicians,with their variousspecialities,were able to speak a common language wherebytheir individualspecialitiescould be described.One such languageexists in the form of computer programming principles.It is possible for a composer to explain the rules of harmonyin computerprogrammingtermsto a mathematician, and for the mathematicianto explain integrationsin computer programmingterms to a composer,if both can program.Thus the core of computer music must be in programmingand music composition or performance.Beyond a good workingknowledgeof these two fields, mathematics seems to be a unifying element for the remainingareasof acoustics, psychology and signalprocessing.An overviewof each of these areaswould be useful for work in computer music. Tutorialsin the fields of acoustics and psychological acousticssimilarto this one in mathematicsand signalprocessingwould providean even more solid groundfor discussion. Wehave coveredabout as much signalprocessingas the nonspecialistpractitionerof computermusic is likely to need, at least in the way of basics.We arenow in a position to build on this foundation. The interestedreaderwill find it an exhileratingexercise to plot curvesfor IH(ejw)I and phaH(eiw) based on Equation(36). WHERETO GO FROMHERE Wehave only scratchedthe surfaceof mathematicsand digitalsignalprocessing,but we have scratchedit pretty well. Many of the concepts we have discussedare subtle and not easily understood,and it should come as no surpriseif the uninitiatedreaderis a bit confused at this point. Confusionis the naturalcompanionof learning.WhenJean BaptisteJoseph Fourierfirst establishedthe fact that an arbitrarymathematical function could be representedas a sum of sinusoids (in 1807), leading mathematiciansof the day refused to believe it, so new and startlingwas the idea. No doubt the idea confused them, too. People have been confused by this idea eversince, of course, but we now know it to be one of the most useful basic facts about mathematics,science, engineering, and sound, includingmusic. The truth and beauty of the idea makes all the initial confusion worthwhile. If the confusion persists,however, after a few careful readingsand problemsolutions, the time has come to attack it before it turnsinto frustration.Some recommendedreferences for furtherstudy are given after the problems,organizedby topic. An hour or two with a good teacher can be worth months of self study in certaincases, but skill alwayscomes from practice, and no numberof hours watching a teacher solve problemswill make a student skilledat this. Finally, a word about scope. Computermusic is a highly interdisciplinaryfield, containing portions of computer and engineering.Just science, music, science, mnathematics, which partsof all of these fields are partof computermusic and which are not is unclearat this time, though it is clearthat no individualis likely to have the utopian qualificationof expertise in all of these fields at once. Consequently,it is Some MoreProblems 1. (Sampling).Non-bandlimitedwaveformsarecharacterized by largejumps (discontinuities)in theirwaveforms;such a waveformis depicted in analog form in FigureP1. The Fourierrepresentationof the sawtooth as it is shown is 00 F(t)= 22 k=l k (--!)k-1 sin kt a) Derive an expression for the equivalent waveform sampled at R samplesper second with a frequency of F Hz. b) SupposeR = 10 kHz. andF = 440 Hz. Whatis the lowest frequencycomponent of this waveformwhich will be folded (aliased)to an incorrectfrequency?Whatis the amplitudeof this component? c) Supposewe cannot hear componentswhich are 40 dB weakerin amplitudethan the fundamentalof the sawtooth. Whatis the highest frequency sawtooth which will have aliasedcomponents40 dB less strongthan the fundamental(again,assumeR = 10000 Hz.)? 2. (Sampling). Supposewe programa computerto producea sine wave of constantlyincreasingfrequencyaccordingto the relation Frequency= 1000 (time in seconds). The samplingrate of the digital-to-analogconverteris set to 32 kHz. Makea graphshowingthe frequencywe will actuallyhearcoming from the DAC as a function of time for t = 0 to t = 60 seconds. F. R. Moore: An Introduction to the Mathematics of Digital Signal Processing, Part II This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Page 59 ACKNOWLEDGEMENTS -3n / -i -2r 2i I wish to thank John Snell, John Strawn,John Gordon, and MarcLeBrunfor their many helpful suggestions;without their dedicationsuch articlesas this, and indeed the Computer MusicJournalitself, could not exist. The editors of ComputerMusicJournalwould also like to thank John Gordonfor providingdrawingsfor Fig. 3-6 and 9-10. 3ir t - ,0 REFERENCES FigurePl: A non-bandlimitedsawtooth waveform. Acoustics and Psychoacoustics 3. (DFT). Makea spectralplot for the DFT workedout in Tables 1 through5. Labelthe horizontalaxis with the frequencies -4000, -3000, -2000, - 1000, 0, 1000, 2000, 3000, and 4000 Hz. and plot the magnitudeof X(k) at each of these frequencieson a dB scale. 4. (Convolution). A squarewave is characterizedby a spectrumcontainingonly odd-numberedmultiplesof the fundamentalfrequency,with amplitudesequal to one-overthe -component-number: -I sin (kwn), k odd. x(n) = k=l Foldovercan be avoidedby choosinga suitably smallvalue for M. Suppose that we producethis waveformat F Hz. << R/2 Hz., and that we multiply it by y(n) = (1 + cos 2on) i.e., a cosine waveformat twice the fundamentalfrequency of the squarewave. a) Whatis the spectrumof the resultingwaveform? b) Is it still a squarewave? 5. (DFT). Supposex(n) represents16 samplesof a digitized sine waveformwith a period of 16 samples,i.e. x(n) = sin (2rn/iN) 0 < n< N- 1 withN= 16. a) Whatis the DFT of x(n)? b) Supposewe took the DFT of only the first 8 samplesof x(n) as defined above. Is the DFT the same? Why or why not? 6. (z-transform). Makea pole-zero plot of the z-transformof the first-orderfilter discussedin the text. Whathappensto the location of the pole as K goes from 0 to 1? 7. (Digital Filters). Makea plot of the frequencyresponseof the second orderfilter (Equation(36)) for r = .95, b = rT/4. How does it differ from the frequencyresponseof a firstorderfilter? Page 60 Benade, ArthurH. FundamentalsofMusical Acoustics. New York: Oxford UniversityPress, 1976. Helmholtz,H. von. On the Sensationsof Toneas a Physiological Basisfor the Theoryof Music. Trans.by A. Ellis. New York: Dover, 1954. * Roederer,JuanG. Introductionto the Physicsand Psychophysics ofMusic. New York: Springer,1975. ComputerMusicand ElectronicMusic Appleton, Jon H., and Ronald C. Perera,ed. The Development and Practice of ElectronicMusic. Englewood Cliffs, NJ.: Prentice-Hall,1975. * Mathews,Max V. The Technology of ComputerMusic. Cambridge:MITPress, 1969. ComputerProgramming * Knuth, Donald E. The Art of Computer Programming. Vol. 1: FundamentalAlgorithms. Reading, Mass: Addison-Wesley,1969. Digital SignalProcessing Oppenheim,Alan V., and RonaldW. Schafer. DigitalSignal Processing. EnglewoodCliffs, N.J.: Prentice-Hall, 1975. * Rabiner,LawrenceR., and BernardGold. TheoryandApplication of Digital SignalProcessing. EnglewoodCliffs, N.J.: Prentice-Hall,1975. Robinsen, EndersA., and ManuelT. Silvia. Digital Signal Processingand Time Series Analysis. San Francisco: Holden-Day, 1978. Mathematics Bracewell,Ron. TheFourier Transformand its Applications. New York: McGraw-Hill,1965. Polya, George,and Gordon Latta. Complex Variables.New York: Wiley, 1974. Spiegel, MurrayR. MathematicalHandbookof Formulasand Tables. Schaums'sOutline Series in Mathematics. New York: McGraw-Hill,1965. * Steward,Ian, and DavidTall. The Foundationsof Mathematics. New York: Oxford UniversityPress, 1977. (*-first choice) Computer Music Journal, Box E, Menlo Park, CA 94025 This content downloaded from 165.123.228.54 on Wed, 03 Feb 2016 12:02:37 UTC All use subject to JSTOR Terms and Conditions Volume II Number 2
© Copyright 2026 Paperzz