Courtney Scholl

Clinical Instrumentation Lab: Voice & Resonance Analysis
This assignment is due at your next class meeting. You will use the Praat program to record and analyze
various speech/voice samples. Show all of your work and provide typewritten answers. I will not be here on
Thursday of this week; however, I will be here on both Tuesday and Wednesday if you have questions about this
lab.
Recording Instructions (use the microphone attached to the Audiogram preamplifier).
You will record the following samples:
1. A normal pitch and quality sustained "ah" (2-3 seconds in duration).
2. A breathy sustained "ah" (2-3 sec.).
3. The following sentence: "The rainbow is a division of white light into many beautiful colors."
4. The following words, one after the other: "bead, bad, boot, bought."
5. The phrase "Go today or Saturday."
Normal /ah/
Mean pitch: 241.82 Hz
Standard deviation: 1.511 Hz
Locally unvoiced frames: 0/313
Jitter: 0.291%
Shimmer: 0.168 dB
HNR: 24.69 dB
You should be able to define the meaning of each of the aforementioned measures (use Praat help; your
handouts; other sources). When finished with your measurements, close the Edit screen and return to the
objects list.



Mean pitch- average fundamental frequency (pitch level) across the voiced frames of the sample
Standard deviation- statistical measure of how much the pitch varies around the mean
Fraction of locally unvoiced frames- the fraction of pitch frames that are analyzed as unvoiced

Jitter- measure of the cycle-to-cycle perturbations in the vocal period; measure of short-term frequency instability
Shimmer- measure of the cycle-to-cycle perturbations in amplitude; measure of short-term amplitude instability
Harmonics-to-noise ratio- ratio (in dB) of the energy in the periodic (harmonic) part of the signal to the
energy in the noise; it falls as cycle-to-cycle variations in frequency and amplitude or added noise increase
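The three perturbation measures above can be illustrated with a short calculation. The following is a simplified sketch in Python, assuming the commonly cited "local" definitions (mean absolute cycle-to-cycle difference relative to the mean period for jitter, mean absolute dB ratio of consecutive amplitudes for shimmer, and ten times the log of periodic energy over noise energy for HNR). The cycle values are hypothetical and are not taken from the lab recordings, and Praat's own analysis involves additional steps that are not reproduced here.

```python
import math

def local_jitter_percent(periods):
    """Mean absolute difference between consecutive periods, as a % of the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return 100 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer_db(amplitudes):
    """Mean absolute 20*log10 ratio of consecutive cycle amplitudes, in dB."""
    ratios = [abs(20 * math.log10(b / a)) for a, b in zip(amplitudes, amplitudes[1:])]
    return sum(ratios) / len(ratios)

def hnr_db(periodic_fraction):
    """10*log10 of periodic energy over noise energy, given the periodic share of total energy."""
    return 10 * math.log10(periodic_fraction / (1 - periodic_fraction))

# Hypothetical cycle data, for illustration only (not measured in this lab)
periods = [0.00413, 0.00414, 0.00412, 0.00414, 0.00413]  # seconds per glottal cycle
amplitudes = [0.80, 0.81, 0.79, 0.80, 0.81]              # peak amplitude per cycle

print(f"Jitter:  {local_jitter_percent(periods):.3f} %")
print(f"Shimmer: {local_shimmer_db(amplitudes):.3f} dB")
print(f"HNR:     {hnr_db(0.99):.2f} dB")  # 99% periodic energy is roughly 20 dB
```

A perfectly steady voice would give zero jitter and shimmer; the more the periods or amplitudes wobble from one cycle to the next, the larger both values become, and the lower the HNR tends to be.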


Repeat the aforementioned steps for your breathy sample – what has happened to each of the aforementioned
measures? In the event of changes, why? What do these measures reflect about the underlying vibration?

Compared with the normal sample, the breathy sample showed abnormal values for standard deviation, jitter,
and harmonics-to-noise ratio. The standard deviation should be less than 1% of the mean pitch, but it was
2.69 Hz (about 1.1% of the 238.97 Hz mean). Jitter (0.589%) exceeded the norm of less than 0.5%, and HNR
(18.53 dB) fell below the expected norm of greater than 20 dB. All of the other measures were within normal
limits. These changes occurred because turbulent airflow through the glottis disturbs the signal, producing a
dysphonic quality. Frequency varies abnormally from cycle to cycle, which accounts for the elevated jitter.
Additionally, insufficient vocal fold closure alters vibration, which results in an inconsistent mucosal wave.
Aperiodicity is illustrated in the
spectrogram as irregular striations and random energy caused by air escaping through the glottis. This
noise is superimposed on the harmonic signal, which lowers the HNR. It is also worth noting that females
with normal voices may exhibit mild breathiness depending on the degree of vocal fold adduction, although
whether a voice is judged breathy depends on the perception of the listener.
Breathy /ah/
Mean pitch: 238.97 Hz
Standard deviation: 2.69 Hz
Locally unvoiced frames: 0/294
Jitter: 0.589%
Shimmer: 0.293 dB
HNR: 18.53 dB
Repeat the aforementioned measures for your Rainbow sentence – how do your measures compare to your
normal sustained vowel? In the event of differences (I am sure there will be), why did they occur? Do you think
that measures of jitter, shimmer, etc. are valid in speech? Why or why not?

There is a noticeable difference in the mean pitch and standard deviation between the two voice samples.
The mean pitch is higher for the normal sustained /ah/ (241.82 Hz) than for the Rainbow sentence
(194.58 Hz). The standard deviations also differ markedly, and the expectations differ as well: for a
sustained vowel the standard deviation should be less than 1% of the mean pitch (here 1.511 Hz, about
0.6%), whereas for conversational speech it should be greater than 10% of the mean pitch (here 27.85 Hz,
about 14%), which reflects the influence of suprasegmentals. A sustained vowel is a single sound produced
at a nearly constant frequency, whereas connected speech involves many sounds and a changing
frequency. In addition, the number of locally unvoiced frames rises in connected speech (178/1043, about
17%, versus 0/313) because the sentence contains voiceless consonants, which the analysis software
classifies as unvoiced. Measurements of jitter, shimmer, etc. are therefore not valid for connected
speech: they assume steady phonation, and the variability they pick up in speech is expected and
intentional rather than a sign of vocal instability.
Rainbow Sentence
Mean pitch: 194.58 Hz
Standard deviation: 27.85 Hz
Locally unvoiced frames: 178/1043
Jitter: 2.209%
Shimmer: 0.701 dB
HNR: 17.27 dB
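The percentage comparisons behind these norms can be checked with a few lines of arithmetic. This is a minimal sketch in Python using the values reported in the three measurement blocks of this lab; the function and variable names are illustrative only.

```python
# Values copied from the Normal /ah/, Breathy /ah/, and Rainbow Sentence blocks
samples = {
    "Normal /ah/":      {"mean_hz": 241.82, "sd_hz": 1.511, "unvoiced": 0,   "frames": 313},
    "Breathy /ah/":     {"mean_hz": 238.97, "sd_hz": 2.69,  "unvoiced": 0,   "frames": 294},
    "Rainbow sentence": {"mean_hz": 194.58, "sd_hz": 27.85, "unvoiced": 178, "frames": 1043},
}

def sd_percent(mean_hz, sd_hz):
    """Standard deviation expressed as a percentage of the mean pitch."""
    return 100 * sd_hz / mean_hz

def unvoiced_percent(unvoiced, frames):
    """Share of pitch frames analyzed as unvoiced, as a percentage."""
    return 100 * unvoiced / frames

for name, s in samples.items():
    sd_pct = sd_percent(s["mean_hz"], s["sd_hz"])
    uv_pct = unvoiced_percent(s["unvoiced"], s["frames"])
    print(f"{name}: SD = {sd_pct:.2f}% of mean, unvoiced frames = {uv_pct:.1f}%")
```

The sustained vowels are judged against the "less than 1% of the mean" guideline, while the sentence is judged against the "greater than 10%" guideline for conversational speech.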
For the “bead, bad, boot, bought” example – select the sample in your objects list and press Edit. You are going
to zoom into each word and vowel production and measure the following:
Bead
F1: 294.7 Hz
F2: 2842 Hz
Bad
F1: 802 Hz
F2: 1838 Hz
Boot
F1: 454 Hz
F2: 1816 Hz
Bought
F1: 719.3 Hz
F2: 1197 Hz
[Figure: plot of the F1 and F2 coordinates for the four vowels above]
Connect the coordinates you have drawn – what is the name we give to this particular shape?
Vowel quadrilateral
Why are the coordinates different for each vowel production (i.e., what happened to the vocal tract for each
vowel – lips, tongue, etc.?)?




/i/ in bead: F1 is low (294.7 Hz) because /i/ is a high vowel; the raised tongue leaves more
pharyngeal space, which resonates at a lower frequency. F2 is high (2842 Hz) primarily because of front
tongue displacement and a more closed mouth posture.
/æ/ in bad: F1 is high (802 Hz) because of an open mouth posture and lowered jaw during speech
production. F2 (1838 Hz) is lower than for /i/.
/u/ in boot: F1 is low (454 Hz) because of a more closed mouth posture and protrusion of the lips,
which elongates the vocal tract and thus decreases the resonant frequency. This extension and
enlargement of the resonating cavity also accounts for an F2 (1816 Hz) that is lower than for /i/.
/a/ in bought: F1 is high (719.3 Hz) because of a more open mouth posture. F2 is low (1197 Hz)
because this is a back vowel; the tongue is located in the posterior oral cavity, which lowers the
resonant frequency.
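The relationship between tongue position and formant values can be made concrete by ordering the four measured vowels. This is a minimal sketch in Python, assuming the usual rules of thumb that a lower F1 goes with a higher tongue position and a higher F2 goes with a more forward tongue position; the variable names are illustrative.

```python
# (F1, F2) in Hz, copied from the measurements above
formants = {
    "bead":   (294.7, 2842),
    "bad":    (802.0, 1838),
    "boot":   (454.0, 1816),
    "bought": (719.3, 1197),
}

# Low F1 corresponds to a high (close) vowel, so sort by ascending F1
high_to_low = sorted(formants, key=lambda word: formants[word][0])

# High F2 corresponds to a front vowel, so sort by descending F2
front_to_back = sorted(formants, key=lambda word: formants[word][1], reverse=True)

print("High to low tongue position:", high_to_low)
print("Front to back tongue position:", front_to_back)
```

Plotting these four points with F1 on one axis and F2 on the other gives the corners of the vowel quadrilateral.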
What is a formant?
A resonance of the vocal tract, seen as a frequency band in which acoustic energy is concentrated
"Go today or Saturday" sample
What is VOT and how does it vary for different types of stop-plosives?


VOT (voice onset time)- duration of the period of time between the release of a plosive and the beginning
of vocal fold vibration
VOT is shorter for voiced stop-plosives, while it’s longer for voiceless ones.
Measure the VOT (in ms) for the /g/ in the word "Go" and the /t/ in "today" - do your VOT measurements
follow the "rules" we discussed for voiced vs. unvoiced stops?
VOT for /g/: 0.033 s (33 ms)
VOT for /t/ in today: 0.053 s (53 ms)
Yes, because the VOT for /g/ was 33 ms, which is less than the 50 ms criterion for perception
of a voiced stop. The VOT for /t/ was 53 ms, which is greater than the 50 ms criterion for perception of a
voiceless stop.
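The comparison against the criterion can be written out as a short check. This is a minimal sketch in Python, assuming the 50 ms boundary cited in the answer above; the function names are illustrative.

```python
CRITERION_MS = 50  # voiced/voiceless boundary cited in the answer above

def vot_ms(vot_seconds):
    """Convert a VOT measured in seconds to whole milliseconds."""
    return round(vot_seconds * 1000)

def perceived_voicing(vot_milliseconds):
    """Classify a stop as voiced or voiceless relative to the 50 ms criterion."""
    return "voiced" if vot_milliseconds < CRITERION_MS else "voiceless"

# VOT values in seconds, as measured in this lab
measurements = {"/g/ in go": 0.033, "/t/ in today": 0.053}

for stop, seconds in measurements.items():
    ms = vot_ms(seconds)
    print(f"{stop}: {ms} ms -> {perceived_voicing(ms)}")
```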
Measure the VOT for the /t/ in the word "Saturday" - compare to the /t/ in "today" - is the VOT longer, shorter,
or about the same? If there are differences (there probably will be), why is the VOT different for the initial
position /t/ vs. the /t/ in the articulatory context of "Saturday"?
VOT for /t/ in Saturday: 0.023 s (23 ms)


The VOT for the /t/ in Saturday (23 ms) is shorter than for the /t/ in today (53 ms).
The /t/ in Saturday is articulated between vowels within connected speech, which is why it is
produced as a vocalic, /d/-like /t/ (a flap). It assimilates to the voicing of the surrounding
sounds, which explains the decrease in VOT. The initial /t/ in today avoids this coarticulatory
effect and is produced with a normal build-up of pressure followed by a released puff of air,
which contributes to the distinguishable /t/ sound.