Integer-Based Wavetable Synthesis for Low-Computational Embedded Systems

Ben Wright and Somsak Sukittanon
University of Tennessee at Martin, Department of Engineering, Martin, TN, USA

Abstract— The evolution of digital music synthesis spans from tried-and-true frequency modulation to string modeling with neural networks. This project begins from a wavetable basis. The audio waveform was modeled as a superposition of formant sinusoids at various frequencies relative to the fundamental frequency. Piano sounds were reverse-engineered to derive a basis for the final formant structure, which was used to create a wavetable. The quality of reproduction depends on a trade-off between the richness of the notes and the sampling frequency. For low-computational systems, this calls for an approach that avoids burdensome floating-point calculations. To speed up calculations while preserving resolution, floating-point math was avoided entirely: all numbers involved are integers. The method was built into the Laser Piano. Along its 12-foot length are 24 “keys,” each consisting of a laser aligned with a photoresistive sensor connected to an 8-bit MCU. The synthesis is processed in the sensor array by four microcontrollers running a strictly synchronous wavetable synthesis algorithm. A microcontroller integrated into the laser array can toggle each laser, allowing the piano to play itself or to limit the playable keys.

I. INTRODUCTION

A. Background Review

Over the years, music synthesis has branched into many different techniques. FM synthesis is classically favored for its simplicity but is limited in signal realism. The Karplus-Strong algorithm is favored for its rich harmonic content and its authenticity, particularly in the transient responses of plucked and struck strings. In [1], Karplus-Strong polynomials are used to define zeros that selectively cancel harmonics from a traditional Karplus-Strong transfer function.
The time-domain convolution associated with this method, however, requires more RAM than may be practical for a low-computational system. In [2], the authors proposed an impressive system that synthesizes music by modeling the vibration of a string with Scattering Recurrent Networks, with very accurate results. Implementing this system, however, would still be far beyond the capacity of a typical small embedded system. Techniques like these have been combined in [3] by a control structure that dynamically selects and combines synthesis techniques to benefit from the advantages of each. This control structure could prove invaluable to a system synthesizing a wide range of timbres and pitches but would be unnecessary for a system with sufficiently limited scope.

For embedded systems, wavetable synthesis offers an attractive combination of realism and speed. In this paper, a lightweight implementation of wavetable synthesis is discussed. Section I.B discusses the mathematics explaining the components of the wavetables. Section II examines the music theory used to derive them and the algorithm coded to tie it all together. Section III covers the applied results.

B. Mathematics and Music Theory

Many periodic signals can be represented by a superposition of sinusoids, conventionally called a Fourier series. Eigenfunction expansions, a superset of these, can represent solutions to partial differential equations (PDE) where Fourier series cannot. In these expansions, the frequency of each wave (here called a formant) in the superposition is a function of the properties of the string. This is said merely to emphasize that the response of a plucked string may need a model more complete than a Fourier series to accurately capture its timbre. The governing wave equation and its initial conditions are given in equation (1).

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 u}{\partial t^2}, \qquad u(x,0) = f(x), \qquad \frac{\partial u}{\partial t}(x,0) = g(x) \tag{1}$$

The function $u(x,t)$ represents the displacement of a vibrating string as a function of position $x$ along the string and time $t$. $f(x)$ and $g(x)$ represent the initial displacement and velocity, respectively. d’Alembert’s solution [4] to this PDE is given in the form

$$u(x,t) = \Phi(x - ct) + \Theta(x + ct) \tag{2}$$

where $\Phi$ and $\Theta$ describe travelling waves that move in opposite directions at wave speed $c$. A solution of this PDE is given by the eigenfunction expansion in (3). Initial conditions $f(x)$ and $g(x)$ are used, by an appeal to orthogonality, to derive a pair of coefficients for each eigenfunction. There exists one eigenfunction for each eigenvalue. These eigenvalues need not be associated nor even countable. Each eigenfunction is called a formant. These formants are not as easily determined for a generalized eigenfunction expansion as they might be for a Fourier series.

These formants are assumed to relate to the fundamental frequency of each note in identical fashion. A vibrating string, for example, should produce the same timbre (i.e. quality of sound as described by its formant structure) even as it is tuned higher or lower. It was assumed that the relations of the formants to the fundamental would fit into the model of a diatonic scale as described by Western music theory. This limited the formants to a set of the most significant few, which provides a skeleton into which the formants can fit. If not perfectly accurate, this assumption proved a fair approximation and simplified the enforcement of periodicity.

Adding formants at eigenvalue frequencies extends the fundamental period of the superposition. The wavetable is not easily truncated because it is important that no discontinuities exist in it. However, the wavetable must be small enough to fit in the limited RAM of a small MCU. Once the pattern of formant relationships and gains was recognized, the waveform was reconstructed in MATLAB (as shown by the code in Fig. 4a, and plotted with the ADSR envelope in Fig. 1) using a superposition of waves with a similar mapping of frequencies and amplitudes.
Multiple MATLAB files were written to simulate the on-chip synthesis as closely as possible. Every note was sampled from an array of 256 8-bit numbers and amplitude-modulated according to the ADSR (Attack, Decay, Sustain, Release) envelope as illustrated in Fig. 2.

II. ALGORITHM DESIGN

A. Waveform Construction

How to produce a desired timbre could be a subject of considerably deep inquiry. Applying an idealized partial differential equation to such a pursuit would be difficult enough, yet still easier than modeling a string with all of its many non-uniformities. The limitations of this implementation make work this precise needless.

For this design, spectral analyses of professionally recorded piano notes were studied as a first step toward reverse-engineering piano sound. To lessen RAM consumption on-chip, a wavetable only large enough to capture the note’s fundamental frequency was used. To maintain periodicity, formants were chosen that satisfied periodicity within this window, most notably the note’s perfect fifth and compound major third. This way there are no discontinuities in the final wave.

Fig. 2. A set of notes plotted in MATLAB to illustrate the effect of the amplitude modulation on the repetitive waveshape. Each enlarged section shows the signal within 50-ms-wide cuts.

The amplitude modulation proved to be more important to the final sound than was initially expected. The use of an ADSR envelope to modulate the amplitude of the waveform provided the striking attack characteristic of a piano note. Intuitively speaking, this envelope could be just as important to other instruments, particularly to drums and woodwinds.

B. Firmware Implementation

A program was written in C, using the CodeVisionAVR C compiler [5], to synthesize a set of simultaneous piano notes on an embedded system. A timer interrupt function was used to synchronize the wavetable sampling. The code controlling amplitude modulation was written inside an interruptible loop.
The output, a superposition of all notes, was written through a byte register to a DAC, whose analog output was filtered and sent straight to the speakers. The flowchart is shown in Fig. 3b.

Fig. 1. A MATLAB plot of the waveform and ADSR envelopes. The waveform shapes the timbre of the notes and the ADSR further controls the amplitude of each note to realistically depict its attack and decay.

A timer built into the microcontroller increments a byte register every 32 clock cycles. Each time this register overflows, an interrupt subroutine is called and the counter is reinitialized to tune the frequency of such calls. At every timer interrupt, an 8-bit number for each note is sampled from an index in the wavetable, then amplitude-modulated according to its position in the ADSR table. The sum of these numbers is normalized and output to the DAC.

Fig. 3. (a) A system diagram of the piano hardware, (b) Flowchart describing the piano synthesis algorithm.

Two counters increment at each interrupt to synchronize ADSR timings for both damped and undamped notes. Each note's indices are advanced at different rates: those controlling position in the wavetable are incremented according to the frequency of each note, so the indices for higher notes step through the wavetable faster than those for lower notes. The index controlling a note’s position in the ADSR table is incremented slowly enough to be handled within the interruptible loop.

Critical to this program’s success was the tuning of the notes. With 8-bit precision, the per-sample increment for each index loses accuracy, because the increments become small for lower notes and higher sampling rates. Floating-point math, on the other hand, proved to process very slowly. Instead, each index was represented by a 16-bit unsigned integer holding the index scaled up by 256. This yielded tuning more than precise enough for this application while demanding much less CPU time than floating-point math.
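The scaled 16-bit index described above can be sketched in C as follows. This is a minimal illustration, not the authors' firmware: the names `phase_increment` and `next_index`, the 10 kHz sample rate, and the 261.63 Hz value for C4 are assumptions. Under those assumptions, the ideal step through a 256-entry table is about 6.7 entries per sample, so a whole-number step could only be 6 or 7 (roughly 234 Hz or 273 Hz), whereas the index scaled by 256 resolves pitch in steps of about 0.15 Hz.

```c
#include <stdint.h>

/* Host-side helper (hypothetical name): per-sample increment for a note,
 * expressed as a table index scaled by 256. One full pass through a
 * 256-entry table is 256 * 256 = 65536 counts, so the 16-bit value wraps
 * exactly once per waveform period. Rounded to the nearest integer. */
uint16_t phase_increment(double note_hz, double sample_hz)
{
    return (uint16_t)(note_hz * 65536.0 / sample_hz + 0.5);
}

/* Integer-only step suitable for a timer interrupt: advance the scaled
 * index and return the upper byte, which is the actual table position. */
static inline uint8_t next_index(uint16_t *phase, uint16_t increment)
{
    *phase = (uint16_t)(*phase + increment); /* wraps modulo 65536 */
    return (uint8_t)(*phase >> 8);           /* drop the fractional byte */
}
```

Only `next_index` would need to run on the microcontroller; the increments can be computed in advance and stored as constants, so no floating-point code reaches the chip.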
The indices for the ADSR table, in contrast, were fairly simple to process. These indices, also 8-bit, were incremented by one every so often according to the length of the note, rather than by a certain amount at every timer interrupt. The increment timings were scaled down from the interrupt frequency by incrementing a counter in the interrupt function that the interruptible amplitude-modulating code would check. Also within this loop, checks are made concerning user input and which notes should start and stop.

The best performance comes from the highest sampling frequency that still leaves enough CPU time to handle amplitude modulation and to sample user inputs. Traversing the ADSR table is not as important to the user’s ear as synchronously traversing the wavetable. Output from the wavetable must be synchronous. This is why the amplitude modulation is interruptible but wavetable sampling is not. In addition, every interrupt must consume exactly the same amount of CPU time to remain synchronous and consistent. For this reason, the use of control statements was avoided within the timer interrupt code. The use of Boolean values in formulae allowed the complete avoidance of if conditions. In this way, a logical 0 or 1 can be used as a coefficient just like a numeric 0 or 1. For example, a number to be incremented if a condition is true may always be incremented without an if condition, even when that means the number is incremented by zero.

III. RESULTS

The final product was a laser piano synthesizer. Along its 12-foot length, an array of lasers (each 5 mW, 650 nm) shoots along the floor into an array of photoresistive sensors, one for each of its 24 fully independent notes, as Fig. 3a shows. Full 24-note polyphony from C4 to B5 is processed by four microcontrollers (AVR ATmega644), each governing its own range of 6 notes. Overclocked with 27 MHz piezoelectric crystals, each chip does its job consistently at a sampling rate just above 10 kHz.
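The relation between the 27 MHz clock, the timer that counts once every 32 clock cycles, and the sampling rate can be worked through with integer arithmetic. The sketch below is only an assumption-laden illustration: the paper does not state the reload value, so the function names and the resulting count of 84 timer ticks per interrupt are hypothetical, and interrupt latency is ignored.

```c
#include <stdint.h>

#define F_CPU_HZ       27000000UL /* crystal frequency stated in the text */
#define TIMER_PRESCALE 32UL       /* timer counts once per 32 clock cycles */

/* Largest tick count per interrupt whose rate is still at or above the
 * target (integer division rounds the count down, so the rate rounds up).
 * Assumes target_hz is nonzero and the result fits in 8 bits. */
uint8_t ticks_per_interrupt(uint32_t target_hz)
{
    return (uint8_t)(F_CPU_HZ / (TIMER_PRESCALE * target_hz));
}

/* Value to reload into an 8-bit up-counter so it overflows after `ticks`. */
uint8_t timer_reload(uint8_t ticks)
{
    return (uint8_t)(256u - ticks);
}

/* Resulting sample rate in whole hertz (fractional part truncated). */
uint32_t sample_rate_hz(uint8_t ticks)
{
    return F_CPU_HZ / (TIMER_PRESCALE * ticks);
}
```

With a 10 kHz target these helpers give 84 ticks, a reload value of 172, and a rate of about 10044 Hz, which is consistent with "just above 10 kHz". It also implies a budget of 32 × 84 = 2688 clock cycles per sample for each chip's six notes, which shows why floating-point math was too slow.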
The synthesis algorithm uses two separate ADSR timings. A note is stepped through the ADSR more slowly as long as the user input corresponding to that note stays present. This way a laser blockage that remains causes a note to sustain longer, like holding a key down on a real piano. Each sampling period, the amplitude of the superposition of synthesized notes is output to an 8-bit DAC (DAC-08CN).

A microcontroller (ATmega32) integrated into the laser array can toggle each laser. Programmed with the Harry Potter theme, the Imperial March, and Für Elise, the piano can effectively play itself. This microcontroller is also programmed with a set of scales to make playing the piano easier. Because this microcontroller could not source enough current for all 24 lasers, the lasers are driven by Darlington pairs that the MCU controls.

The sensors were constructed using photoresistors aligned with plastic tubes and wired in series with a fixed resistor. Each photoresistor has high resistance when darkened and lower resistance when light shines upon it. The sensor circuit uses voltage division to sense the difference between a laser shining and a blockage. The series resistor was selected to maximize the difference between the maximum and minimum voltages across the photoresistor. It can be shown that the best series resistance is the geometric mean of the highest and lowest resistances of the photoresistor: for a supply voltage $V$, series resistance $R$, and photoresistor extremes $R_{\min}$ and $R_{\max}$, the swing $V R_{\max}/(R + R_{\max}) - V R_{\min}/(R + R_{\min})$ is maximized at $R = \sqrt{R_{\min} R_{\max}}$. The low and high voltages were reliably distinguishable as TTL logic levels, so the sensors could be wired directly to tri-state pins on the chip.

The DAC selected outputs a current signal, so all the DAC output signals are added simply by wiring them to a common node. This current signal from the DACs was filtered by a parallel RC filter with a cutoff frequency of 1500 Hz, meant to smooth the discontinuities in the DAC output signal. Finally, a series capacitor was added to normalize the voltage output to the speaker and eliminate popping.

IV. CONCLUSION

Even quite complicated timbres can be reproduced with wavetable synthesis. On systems with more computational power, more complicated string modeling techniques can be put to better use and produce higher quality sound. There is much left to do in modeling the responses of musical instruments for the purpose of synthesizing them. There may be formants at frequencies lower than the fundamental frequency of the note or outside diatonic scales. When formants are not as limited, it can quickly become important to sample from a larger waveform envelope to ensure periodicity of the output signal. It is important to implement these with an algorithm that keeps wavetable output synchronous while keeping ADSR timings steady. Using integer calculations wherever possible may sacrifice some precision, but yields considerable savings in CPU time. Fig. 4b shows the envelopes initialized as 8-bit integral types (unsigned char in C).

These methods proved effective in an implementation on an 8-bit embedded platform. Similar techniques may be effective on larger systems. Another step to make such an algorithm more effective could be to add transient effects, e.g. for synthesizing the music of an acoustic guitar, the “scratch” when a string is plucked. At some point, however, it may become more efficient to pursue a different algorithm.

Fig. 4. (a) Code snippet depicting waveform construction in MATLAB, (b) envelope initialization in C generated by MATLAB for the synthesizer firmware.

ACKNOWLEDGEMENT

The authors would like to thank Robert Reeves for his discussion and help in this work.

REFERENCES

[1] I. A. Cummings, R. Venugopal, J. Ahmed, and D. S. Bernstein, “Generalizations of the Karplus-Strong Transfer Function for Digital Music Sound Synthesis,” in IEEE Proceedings of the American Control Conference, 1999, pp. 2210-2214.
[2] A. W. Y. Su and L. San-Fu, “Synthesis of Plucked-String Tones by Physical Modeling With Recurrent Neural Networks,” in IEEE Workshop on Multimedia Signal Processing, 1997, pp. 71-76.
[3] S. D. Trautmann and N. M. Cheung, “Wavetable Synthesis for Multimedia and Beyond,” in IEEE Workshop on Multimedia Signal Processing, 1997, pp. 89-94.
[4] D. L. Powers, “The Wave Equation,” in Boundary Value Problems and Partial Differential Equations, 6th ed., Academic Press, 2009, pp. 229-231.
[5] CodeVisionAVR C compiler, http://hpinfotech.ro/html/cvavr.htm