Acoustic phonetics review

Ch. 7 Waves, spectra, and resonance

1 Sine waves & complex waves: When we speak, we push out a stream of air whose pressure varies constantly as a result of many individual actions in the vocal tract. These variations are propagated as a compression wave through the air molecules to the listener’s ears.
   a. Simple sine (sinusoidal) waves. (Ex: Tuning fork)
      i) The greater the amplitude (difference between the greatest pressure and neutral), the greater the intensity (measured in decibels), and the louder the sound is generally perceived to be.
      ii) The higher the frequency (cycles per second, Hz), the higher pitched the sound is perceived to be.
         (1) Average pitch range produced in speech: 120 Hz (men) to 265 Hz (children).
         (2) Average pitch range able to be perceived: 20 – 20,000 Hz.
   b. Complex waves.
      i) Most sounds are not simple sine waves but complex repetitive waves.
      ii) Fourier analysis can tell us the series of sine waves which together compose the complex wave.
2 The individual components of a complex sound wave are known as harmonics.
   a. The first (lowest-frequency) harmonic is known as the fundamental harmonic.
      i) Its frequency is the fundamental frequency, or F0.
      ii) The frequency of every subsequent harmonic is a whole-number multiple of the F0.
      iii) The frequency of the entire complex wave will be the same as that of F0.
3 Sound spectra display frequency and amplitude of a wave’s harmonics.
   a. A spectrogram is a visual representation of a sound spectrum as it changes over time, showing three parameters: time (x-axis), frequency (y-axis), and intensity (darkness).
      i) Visible intensity will generally correspond to the sonority hierarchy, with vowels and glides the darkest and stops the lightest.
      ii) When the harmonics rise and fall, this means the pitch is rising and falling.
4 Resonance is the natural tendency of a body to vibrate at a certain frequency.
   a. 
      Glottal wave: Airflow from the lungs is broken up into a series of pressure variations by the opening and closing (vibration) of the vocal folds, forming a wave.
   b. The vocal tract acts as a resonance chamber for the glottal wave, modifying the glottal spectrum and producing its formants. (Below: Glottal wave spectra before and after modification.)
   c. Rogers: “Remember:
      o The frequency of the fundamental harmonic determines which pitch we hear;
      o The frequencies of the other harmonics are whole multiples of the fundamental;
      o The frequencies of the formants determine vowel quality.”
5 Noise: some sounds do not have regular waveforms, but instead exhibit irregular energy.

Ch. 8 Acoustics of English

1 Spectrograms
   a. Narrow-band vs. wide-band spectrograms: Narrow-band spectrograms show harmonics and pitch best; wide-band spectrograms show formants and timing best.
2 Formants are clusters of harmonics with relatively high intensities, particularly helpful in distinguishing vowels (and other sounds); this is why in phonetics we usually use wide-band spectrograms.
3 Principles for reading spectrograms

Vowels
   Vowel quality
   o The lower the F1, the higher the vowel. [a] has the highest F1.
   o The lower the F2, the further back the vowel.
   o The lower the F3, the further back the vowel (although this is less significant).
   o The closer F1 and F2 are to each other, the lower the vowel (and possibly further back).
   o Rounded vowels will have all formants slightly lower than their unrounded counterparts (but we probably will not need to deal with this).
   Other things
   o Glides will look just like their vowel counterparts.
   o Diphthongs will show a smooth transition from one vowel to the next.
   o Vowels will be longest at the end of a word, shorter before voiced consonants, and shortest before voiceless consonants.
   o Stressed syllables (in English) will show a stronger intensity.
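The first two vowel-quality rules of thumb above (lower F1 means a higher vowel, lower F2 means a vowel further back) can be sketched as a small comparison. This is only an illustration, not a method from the readings: the function name compare_vowels and the formant values are assumptions made up for the example, chosen as round numbers rather than taken from any measurement.

```python
def compare_vowels(vowel_a, vowel_b):
    """Describe vowel_a relative to vowel_b from (F1, F2) pairs in Hz.

    Applies only two rules of thumb:
    lower F1 -> higher vowel, lower F2 -> vowel further back.
    """
    f1_a, f2_a = vowel_a
    f1_b, f2_b = vowel_b

    # Height is read from F1.
    if f1_a < f1_b:
        height = "higher"
    elif f1_a > f1_b:
        height = "lower"
    else:
        height = "same height"

    # Backness is read from F2.
    if f2_a < f2_b:
        backness = "further back"
    elif f2_a > f2_b:
        backness = "further front"
    else:
        backness = "same backness"

    return height, backness


# Illustrative round numbers (assumptions, not measurements from the review).
high_front = (280, 2250)  # an [i]-like vowel: low F1, high F2
low_back = (710, 1100)    # an [a]-like vowel: high F1, low F2

print(compare_vowels(high_front, low_back))  # ('higher', 'further front')
print(compare_vowels(low_back, high_front))  # ('lower', 'further back')
```

Reading a real spectrogram also involves F3, rounding, and the transitions into neighboring consonants, so a two-formant comparison like this is at best a first pass.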

Consonants
   Manner of articulation
   o Oral stops will show as a gap, though there may be a “voicing bar” (darkish strip at the bottom of the gap).
   o Aspirated stops will show a bit of noise before the next vowel.
   o Fricatives will show as noise.
   o [s z] will show the highest-frequency and strongest frication, while labial fricatives will be faint.
   o [h] will show very, very faint noise, but will mostly bleed into the following vowel’s formants. F2 may also be slightly raised, and F3 slightly lowered, towards the [h] (much like a velar pinch).
   o Nasals will have a weak formant-like pattern.
   o Rhotics will show a lowered F3 at the transition and even throughout much of the next vowel.
   o Laterals also show weak formants. Clear l will have three distinct formants, fairly evenly spread apart. Dark l will have F1 and F2 squeezed together near the bottom, with a high F3. Both have F3 at around 2500 Hz.
   Place of articulation
   o F2 at the transition (between vowel and consonant) is the most helpful in decoding the place of articulation.
   o For labials, F2 will point down towards the consonant.
   o For velars, F2 will point up, while F3 may point down a bit.
   o For coronals, F2 should remain approximately level.

Vocabulary from Johnson reading on acoustic theory
   o Quantal region: a “region of stability” in the vocal tract, i.e. voiceless, voicing, and glottal stop; so named because a small change at the boundary between regions can create a fundamentally different quality, as when we “leap” from voiceless to voicing at the moment the vocal folds begin to vibrate. Within the voicing region of stability, for example, a speaker can choose any one of several possible glottal widths and still produce, and be perceived as producing, voicing.
   o Nonlinear: a relationship between variables that is not directly proportional (cannot be plotted as a straight line), such as exponential and logarithmic relationships and the quantal regions described by Stevens.
   o Resonant frequency: the frequency at which a system oscillates at maximum amplitude.
   o Standing wave: a wave pattern in which the ends of a system oscillate between “peaks” of compression and “valleys” of rarefaction, while the midpoint maintains a pressure of zero.
   o Node: a point at which pressure stays at zero, such as the midpoint of the standing wave described above.
   o Antinode: a point at which pressure reaches both positive and negative maxima, such as the endpoints of the standing wave described above.
   o Simple harmonic system: a system that has only one resonant frequency, e.g., a pendulum.
   o Polarity change: a change in sign (+ to – or vice versa); in a tube with one end open and the other end closed, it occurs when the air stream reaches the open end.
   o Spectral envelope: the overall shape of a power spectrum.
   o Predictor coefficients: coefficients (a1 ... an) estimated by the LPC algorithm that define a filter which, when certain assumptions are met, approximates the characteristics of the vocal tract filtering function.
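
The terms resonant frequency, standing wave, and the tube with one end open and the other closed come together in the standard quarter-wavelength result for an idealized uniform tube: F_n = (2n - 1) * c / (4 * L). The formula is not stated in the review itself, and the values below are assumptions for illustration: a speed of sound of 35,000 cm/s and a tube length of 17.5 cm, roughly the scale of an adult vocal tract. The function name tube_resonances is likewise made up for this sketch.

```python
def tube_resonances(length_cm, n_resonances=3, speed_cm_per_s=35000.0):
    """Resonant frequencies (Hz) of a uniform tube closed at one end.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L),
    an idealization that ignores the actual shape of the tube.
    """
    return [
        (2 * n - 1) * speed_cm_per_s / (4 * length_cm)
        for n in range(1, n_resonances + 1)
    ]


# Assumed example values: L = 17.5 cm, c = 35,000 cm/s.
print(tube_resonances(17.5))  # [500.0, 1500.0, 2500.0]
```

Because the tube has several resonant frequencies rather than one, it is not a simple harmonic system in the sense defined above. Shortening the tube raises every resonance, since L appears in the denominator.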