Acoustic phonetics review
Ch. 7 Waves, spectra, and resonance
1. Sine waves & complex waves: When we speak, we push out a stream of air whose pressure
varies constantly as a result of many individual actions in the vocal tract. These
variations propagate as a compression wave through the air molecules to the listener’s ears.
a. Simple sine (sinusoidal) waves. (Ex: Tuning fork)
i) The greater the amplitude (difference between the greatest pressure and neutral),
the greater the intensity (decibels), and the louder the sound is generally perceived
to be.
ii) The higher the frequency (cycles per second, Hz), the higher pitched the sound is
perceived to be.
(1) Average pitch (F0) produced in speech: 120 Hz (men) to 265 Hz (children).
(2) Range of frequencies that can be perceived: 20 to 20,000 Hz.
b. Complex waves.
i) Most sounds are not simple sine waves but complex repetitive waves.
ii) Fourier analysis tells us which series of sine waves together compose the
complex wave.
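To make the idea of Fourier analysis concrete, here is a minimal sketch in plain Python. It builds a complex wave from three sine components and then recovers their frequencies and amplitudes with a naive discrete Fourier transform. The sample rate, the component list, and the function names are assumptions chosen for illustration; real analysis software uses a fast Fourier transform and windowing.

```python
import math

def complex_wave(components, sample_rate, n_samples):
    """Sum sine components given as (frequency_hz, amplitude) pairs."""
    return [
        sum(amp * math.sin(2 * math.pi * freq * n / sample_rate)
            for freq, amp in components)
        for n in range(n_samples)
    ]

def dft_amplitudes(signal, sample_rate):
    """Naive DFT: map each bin frequency (below Nyquist) to its amplitude."""
    n = len(signal)
    result = {}
    for k in range(n // 2):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        result[k * sample_rate / n] = 2 * math.hypot(re, im) / n
    return result

SAMPLE_RATE = 2000  # Hz (hypothetical value)
components = [(100, 1.0), (200, 0.5), (300, 0.25)]  # hypothetical sine components

wave = complex_wave(components, SAMPLE_RATE, 200)  # 0.1 s of signal
spectrum = dft_amplitudes(wave, SAMPLE_RATE)

# Keep only the bins that carry noticeable energy.
recovered = sorted((f, round(a, 3)) for f, a in spectrum.items() if a > 0.01)
print(recovered)
```

The analysis window here holds a whole number of cycles of every component, so each sine falls exactly on one frequency bin; with real speech that is rarely the case, and energy spreads across neighbouring bins.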
2. The individual components of a complex sound wave are known as harmonics.
a. The first (lowest-frequency) harmonic is known as the fundamental harmonic.
i) Its frequency is the fundamental frequency, or F0.
ii) The frequency of every subsequent harmonic is a multiple of the F0.
iii) The complex wave as a whole repeats at the same frequency as the fundamental, i.e. at F0.
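The relationship between F0 and its harmonics can be checked numerically. The sketch below lists the first few harmonics of an assumed F0 of 120 Hz and tests whether the summed wave repeats after one fundamental period. The function names, the equal amplitudes, and the sampling step are illustrative assumptions.

```python
import math

def harmonic_frequencies(f0, count):
    """Return the first `count` harmonics: F0, 2*F0, 3*F0, ..."""
    return [f0 * k for k in range(1, count + 1)]

def repeats_with_period(frequencies, period, checks=50):
    """Check numerically that a sum of unit-amplitude sines repeats after `period` seconds."""
    def wave(t):
        return sum(math.sin(2 * math.pi * f * t) for f in frequencies)
    times = (i * 0.00037 for i in range(checks))  # arbitrary sample instants
    return all(abs(wave(t) - wave(t + period)) < 1e-9 for t in times)

F0 = 120  # Hz (hypothetical fundamental)
harmonics = harmonic_frequencies(F0, 5)
print(harmonics)                                 # [120, 240, 360, 480, 600]
print(repeats_with_period(harmonics, 1 / F0))    # True: the whole wave repeats at F0
print(repeats_with_period(harmonics, 1 / 240))   # False: a shorter period does not fit
```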
3. Sound spectra display the frequency and amplitude of a wave’s harmonics.
a. A spectrogram is a visual representation of how a sound’s spectrum changes over time,
showing three parameters: time (x-axis), frequency (y-axis), and intensity (darkness).
i) Visible intensity will generally correspond to the sonority hierarchy, with vowels and
glides the darkest and stops the lightest.
ii) When the harmonics rise and fall, this means the pitch is rising and falling.
4. Resonance is the natural tendency of a body to vibrate at a certain frequency.
a. Glottal wave: Airflow from the lungs is broken up into a series of pressure variations by
the opening and closing (vibration) of the vocal folds, forming a wave.
b. The vocal tract acts as a resonance chamber for the glottal wave, modifying the glottal
spectrum and giving rise to formants. (Below: Glottal wave spectra before and after modification.)
c. Rogers: “Remember:
o The frequency of the fundamental harmonic determines which pitch we hear;
o The frequencies of the other harmonics are whole multiples of the fundamental;
o The frequencies of the formants determine vowel quality.”
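As a rough illustration of the vocal tract as a resonance chamber, a common textbook idealization (an assumption here, not something stated in these notes) treats it as a uniform tube closed at the glottis and open at the lips. Such a tube resonates at odd multiples of c / 4L. The tube length of 17.5 cm and the speed of sound of 35,000 cm/s below are assumed round values.

```python
def tube_resonances(length_cm, speed_of_sound_cm_s=35000.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other:
    F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_of_sound_cm_s / (4 * length_cm)
            for n in range(1, count + 1)]

# Hypothetical vocal tract length of 17.5 cm.
print(tube_resonances(17.5))  # [500.0, 1500.0, 2500.0]
```

A shorter tube gives higher resonances, which is one reason the same vowel has higher formants for speakers with shorter vocal tracts.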
5. Noise: Some sounds do not have regular waveforms but instead exhibit irregular energy.
Ch. 8 Acoustics of English
1. Spectrograms
a. Narrow-band vs. wide-band spectrograms: Narrow-band spectrograms show harmonics and
pitch best; wide-band spectrograms show formants and timing best.
2. Formants are clusters of harmonics with relatively high intensities. They are particularly
helpful in distinguishing vowels (and other sounds), which is why in phonetics we usually use
wide-band spectrograms.
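The narrow-band versus wide-band difference comes down to the length of the analysis window. The sketch below uses the rule of thumb that the analysis bandwidth is roughly the reciprocal of the window duration; the exact figure depends on the window shape, so treat the numbers as approximate. The window lengths, the F0 value, and the function names are assumptions for illustration.

```python
def analysis_bandwidth_hz(window_s):
    """Rule of thumb: bandwidth is roughly 1 / window duration."""
    return 1.0 / window_s

def describe(window_s, f0_hz):
    """Summarize what a given analysis window can resolve for a voice with this F0."""
    bandwidth = analysis_bandwidth_hz(window_s)
    return {
        "bandwidth_hz": bandwidth,
        # Harmonics are F0 apart, so the filter must be narrower than F0 to separate them.
        "resolves_harmonics": bandwidth < f0_hz,
        # Glottal pulses are 1/F0 apart, so the window must be shorter than one period.
        "resolves_glottal_pulses": window_s < 1.0 / f0_hz,
    }

F0 = 120.0  # Hz (hypothetical speaker)
wide_band = describe(0.005, F0)    # 5 ms window, roughly 200 Hz bandwidth
narrow_band = describe(0.025, F0)  # 25 ms window, roughly 40 Hz bandwidth
print(wide_band)
print(narrow_band)
```

With these assumed settings the short window blurs individual harmonics into broad formant bands but keeps fine timing detail, while the long window separates harmonics at the cost of timing.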
3. Principles for reading spectrograms
Vowels
Vowel quality
o The lower the F1, the higher the vowel. [a] has the highest F1.
o The lower the F2, the further back the vowel.
o The lower the F3, the further back the vowel (although this is less significant).
o The closer F1 and F2 are to each other, the lower the vowel (and possibly further back).
o Rounded vowels will have all formants slightly lower than unrounded counterparts (but
we probably will not need to deal with this).
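The F1 and F2 rules above can be applied mechanically when comparing two vowels. The sketch below does that for three vowels. The formant values are assumed round numbers in the general range reported for adult male speakers, included only for illustration, and the function names are hypothetical.

```python
# Illustrative (F1, F2) values in Hz; assumed for this sketch, not measured here.
VOWELS = {
    "i": (270, 2290),
    "u": (300, 870),
    "a": (730, 1090),
}

def higher_vowel(v1, v2):
    """The vowel with the lower F1 is the higher vowel."""
    return v1 if VOWELS[v1][0] < VOWELS[v2][0] else v2

def backer_vowel(v1, v2):
    """The vowel with the lower F2 is further back."""
    return v1 if VOWELS[v1][1] < VOWELS[v2][1] else v2

def f1_f2_gap(vowel):
    """Distance between F1 and F2; small gaps suggest a low (and possibly back) vowel."""
    f1, f2 = VOWELS[vowel]
    return f2 - f1

print(higher_vowel("i", "a"))                 # i
print(backer_vowel("i", "u"))                 # u
print(min(VOWELS, key=f1_f2_gap))             # a
```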
Other things
o Glides will look just like their vowel counterparts.
o Diphthongs will show a smooth transition from one vowel to the next.
o Vowels will be longest when at the end of a word, shorter before voiced consonants,
shortest before voiceless consonants.
o Stressed syllables (in English) will show a stronger intensity.
Consonants
Manner of articulation
o Oral stops will show as a gap, though there may be a “voicing bar” (darkish strip at the
bottom of the gap).
o Aspirated stops will show a bit of noise before the next vowel.
o Fricatives will show as noise.
o [s z] will show the highest-frequency and strongest frication, while labial fricatives
will be faint.
o [h] will show very, very faint noise, but mostly bleed into the following vowel’s
formants. F2 may also be slightly raised, and F3 slightly lowered, towards the [h]
(much like a velar pinch).
o Nasals will have a weak formant-like pattern.
o Rhotics will show a lowered F3 at the transition and even throughout much of the next
vowel.
o Laterals also show weak formants. Clear l will have three distinct formants, fairly evenly
spread apart. Dark l will have F1 and F2 smushed together near the bottom, with a high
F3. Both have F3 at around 2500Hz.
Place of articulation
o F2 at the transition (between vowel and consonant) is the most helpful in decoding the
place of articulation.
o For labials, F2 will point down towards the consonant.
o For velars, F2 will point up, while F3 may point down a bit.
o For coronals, F2 should remain approximately level.
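The F2 transition heuristics above can be written as a small decision rule. This is only a sketch of the rule of thumb in these notes: the 100 Hz tolerance for "approximately level" and the function name are assumptions, and real transitions also depend on the neighbouring vowel.

```python
def place_from_f2_transition(f2_vowel_hz, f2_at_consonant_hz, tolerance_hz=100):
    """Guess place of articulation from the direction F2 moves towards the consonant.

    tolerance_hz is a hypothetical threshold for treating F2 as level.
    """
    change = f2_at_consonant_hz - f2_vowel_hz
    if change < -tolerance_hz:
        return "labial"    # F2 points down towards the consonant
    if change > tolerance_hz:
        return "velar"     # F2 points up towards the consonant
    return "coronal"       # F2 stays approximately level

print(place_from_f2_transition(1800, 1400))  # labial
print(place_from_f2_transition(1800, 2200))  # velar
print(place_from_f2_transition(1800, 1780))  # coronal
```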
Vocabulary from Johnson reading on acoustic theory
o Quantal region: a “region of stability” in the vocal tract, i.e. voiceless, voicing, and
glottal stop; so named because a small change can create a fundamentally different
quality, as when we “leap” from voiceless to voicing the moment the vocal folds begin to
vibrate. Within the voicing region of stability, for example, a speaker can choose any one
of several possible glottal widths and still produce, and be perceived as producing,
voicing.
o Nonlinear: a relationship between variables that is not directly proportional (cannot be
plotted in a line), such as exponential, logarithmic, and the quantal regions described by
Stevens.
o Resonant frequency: the frequency at which a system oscillates at maximum
amplitude.
o Standing wave: a wave pattern in which the ends of a system oscillate between “peaks”
of compression and “valleys” of rarefaction, while the midpoint maintains a pressure of zero.
o Node: the midpoint of a standing wave as described above.
o Antinode: the endpoints of a standing wave, at which pressure reaches both positive and
negative maxima as described above.
o Simple harmonic system: a system that has only one resonant frequency, e.g., a
pendulum.
o Polarity change: a change in sign (+ to – or vice versa); in a tube with one end open
and the other end closed, it occurs when the air stream reaches the open end.
o Spectral envelope: the overall shape of a power spectrum.
o Predictor coefficients: coefficients (a1... an) predicted by the LPC algorithm that
define a filter which, when certain assumptions are met, approximates the characteristics
of the vocal tract filtering function.
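To illustrate how predictor coefficients define a filter, the sketch below estimates them with the autocorrelation method and the Levinson-Durbin recursion, one common way of computing LPC coefficients (not necessarily the procedure in the Johnson reading). The test signal is the impulse response of a single assumed resonance at 500 Hz with a 100 Hz bandwidth, sampled at 10,000 Hz; all of these values and the function names are hypothetical.

```python
import math

def autocorrelation(signal, max_lag):
    """r[lag] = sum over n of signal[n] * signal[n + lag]."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def predictor_coefficients(signal, order):
    """Levinson-Durbin recursion. Returns [a1, ..., a_order] such that
    signal[n] is approximated by sum(a_k * signal[n - k])."""
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)   # a[0] is unused padding
    error = r[0]              # prediction error energy
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]

# Hypothetical test signal: impulse response of one resonance.
SAMPLE_RATE = 10000.0
RESONANCE_HZ, BANDWIDTH_HZ = 500.0, 100.0
radius = math.exp(-math.pi * BANDWIDTH_HZ / SAMPLE_RATE)
true_a1 = 2 * radius * math.cos(2 * math.pi * RESONANCE_HZ / SAMPLE_RATE)
true_a2 = -radius ** 2

signal = [1.0, true_a1]
for _ in range(1998):
    signal.append(true_a1 * signal[-1] + true_a2 * signal[-2])

a1, a2 = predictor_coefficients(signal, 2)

# Convert the recovered pole pair back to a resonance frequency.
estimated_hz = math.acos(a1 / (2 * math.sqrt(-a2))) * SAMPLE_RATE / (2 * math.pi)
print(round(a1, 4), round(a2, 4), round(estimated_hz, 1))
```

Here the signal really is the output of an all-pole filter, so the coefficients are recovered almost exactly. Real speech only approximately meets that assumption, which is why the notes say the filter approximates the vocal tract "when certain assumptions are met".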