THE EFFECTS OF ARTICULATION ON THE PERCEIVED LOUDNESS
OF THE PROJECTED VOICE
by
Brett Raymond Myers
A thesis submitted in partial fulfillment
of the requirements for the Master of Arts
degree in Speech Pathology and Audiology
in the Graduate College of
The University of Iowa
May 2013
Thesis Supervisor: Associate Professor Eileen Finnegan
Graduate College
The University of Iowa
Iowa City, Iowa
CERTIFICATE OF APPROVAL
MASTER’S THESIS
This is to certify that the Master’s thesis of
Brett Raymond Myers
has been approved by the Examining Committee for the thesis requirement
for the Master of Arts degree in Speech Pathology and Audiology at the May
2013 graduation.
Thesis Committee:
Eileen Finnegan, Thesis Supervisor
Michael Karnell
Ann Fennell
If vowels are a river and consonants are the banks, it is necessary to reinforce the
latter lest there be floods.
S. M. Volkonski
The Expressive Word
Think what you’re dealing with. The majesty and grandeur of the English
language… its extraordinary, imaginative, and musical mixture of sounds.
G. B. Shaw
Pygmalion
ii ACKNOWLEDGEMENTS
I have been blessed to have the constant support of mentors and friends
throughout the progress of my Master’s thesis. I am deeply grateful to Dr. Eileen
Finnegan for her guidance, encouragement, and dependability throughout this
research. I also want to express warm thanks to Ann Fennell and Dr. Michael
Karnell for serving on my research committee, and for being instrumental in my
education at the University of Iowa. I would like to thank Dr. Ingo Titze and Vicki
Lewis for sharing their time and knowledge to contribute to my educational
development. I must thank my friends Darcey Hull, Erica Jones, Adam Lloyd, and
Lauren Richman for always allowing me to bounce ideas off of them. I would like to
give thanks to Dr. Karla McGregor, Dr. Tim Arbisi-Kelm, and Nichole Eden for
allowing me to use their Word Learning Laboratory, and to Dr. Rick Arenas for his
impeccable assistance with computer programming. Furthermore, I would like to
acknowledge all of the faculty and staff of the Department of Communication
Sciences and Disorders at the University of Iowa for their excellence in scholarship
and community. Lastly, I would like to thank my parents, Doug and Sue Myers, for
endlessly supporting and encouraging me in my endeavors. Thanks to all for this
fulfilling and rewarding experience.
ABSTRACT
Actors often receive training to develop effective strategies for using the voice
on stage. Arthur Lessac developed a training approach that concentrated on three
energies: structural action, tonal action, and consonant action. Together, these
energies help to create a more resonant voice, which is characterized by a fuller
sound that carries well over noise and distance. In Lessac-Based Resonant Voice
Therapy (LBRVT), voice clinicians help clients achieve a resonant voice through
structural posturing and awareness of tonal changes. However, LBRVT does not include the
third component of Lessac’s approach: consonant action. This study examines the
effect that increased consonant energy has on the speaking voice—particularly
regarding loudness. Audio samples were collected from eight actor participants who
read a monologue using three distinct styles: normal articulation, poor articulation
(elicited using a bite block), and over-articulation (elicited using a Lessac-based
training intervention). Participants learned about the “consonant orchestra,”
practiced producing each sound in a consonant cluster word list, and practiced
linking the consonants in short phrases. Twenty graduate students of speech-language pathology listened to speech samples from the different conditions, and
made comparative judgments regarding articulation, loudness, and projection.
Group results showed that the over-articulation condition was selected as having
the greatest articulation, loudness, and projection in comparison to the other
conditions, although vocal intensity (dB SPL) was not statistically different. These
findings indicate that articulation treatment may be beneficial for increasing
perceived vocal loudness.
TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

1. INTRODUCTION
   1.1 Vocal Loudness and Articulation
   1.2 Literature Review
      1.2.1 Loudness and Articulation
      1.2.2 Loudness and Dysarthric Speech
      1.2.3 Articulation for Impaired Listeners
      1.2.4 Loudness and Actors

2. METHODS
   2.1 Sample Collection
      2.1.1 Participants
      2.1.2 Speech Tasks
      2.1.3 Intervention
      2.1.4 Recording Procedures
   2.2 Data Analysis
      2.2.1 Acoustic Analysis
      2.2.2 Perceptual Analysis

3. RESULTS
   3.1 Acoustic Findings
   3.2 Perceptual Findings
   3.3 Acoustic vs. Perceptual Findings
   3.4 Inter-Rater Reliability

4. DISCUSSION

APPENDIX A. PERFORMANCE MONOLOGUE

APPENDIX B. LESSAC’S CONSONANT ORCHESTRA

APPENDIX C. OVER-ARTICULATION PRACTICE LISTS

REFERENCES
LIST OF TABLES

Table
1. Overall percentage that a condition was selected as having better articulation, loudness, or projection than its paired sample.
LIST OF FIGURES

Figure
1. Mean intensity (dB) levels for each speaker across articulation conditions.
2. Maximum intensity (dB) levels for each speaker across articulation conditions.
3. Mean sentence duration (in seconds) for each speaker across articulation conditions.
4. Percentage that judges selected each condition as having better articulation than its paired stimulus for each speaker.
5. Percentage that judges selected each condition as being louder than its paired stimulus for each speaker.
6. Percentage that judges selected each condition as having better projection than its paired stimulus for each speaker.
7. Percentage that judges selected each condition as having better articulation than its paired stimulus. Judges are displayed on the x-axis, and the data points represent ratings across speakers.
8. Percentage that judges selected each condition as being louder than its paired stimulus. Judges are displayed on the x-axis, and the data points represent ratings across speakers.
9. Percentage that judges selected each condition as having better projection than its paired stimulus. Judges are displayed on the x-axis, and the data points represent ratings across speakers.
10. Percentage that judges selected each condition as having better articulation than its paired stimulus for each of the five presented sentences.
11. Percentage that judges selected each condition as being louder than its paired stimulus for each of the five presented sentences.
12. Percentage that judges selected each condition as having better projection than its paired stimulus for each of the five presented sentences.
13. Comparison of acoustic findings with perceptual findings. Acoustic loudness is shown as the percentage that the mean loudness (in dB) of each condition was greater than that of its paired stimulus. Perceived loudness is represented by the percentage that judges selected each condition as sounding louder than its paired stimulus.
CHAPTER ONE: INTRODUCTION
1.1 Vocal Loudness and Articulation
Stage performers face the challenge of delivering their voices to a
large audience. The goal of projection also applies in other contexts: a teacher
lecturing to a large class, a politician speaking to a crowd, a drill sergeant
commanding troops, or an enthusiastic fan cheering at a football game. All of these
individuals aim to increase the loudness of their voices, and each finds different
ways to do it; they strive to speak loudly and clearly to achieve optimal projection.
In a study combining acoustic and perceptual analyses, Master et al. (2008) found
that actors were generally perceived as louder than non-actors when reading text
with a loud voice, even when the sound pressure level (SPL) was not significantly
different between groups. Therefore, the SPL alone does not explain the perceptual
differences between actor and non-actor loudness levels. Perhaps some aspect of
actor training is responsible for the greater perceived loudness of the actor’s
projected voice.
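A brief note on the physical measure helps frame this distinction. Sound pressure level quantifies only the amplitude of the acoustic signal relative to the threshold-of-hearing reference pressure:

$$ L_p = 20 \log_{10}\!\left(\frac{p_{\mathrm{rms}}}{p_{\mathrm{ref}}}\right) \ \text{dB SPL}, \qquad p_{\mathrm{ref}} = 20\ \mu\mathrm{Pa}. $$

Perceived loudness, by contrast, also depends on spectral content, duration, and listener factors, so two signals of equal SPL can differ in how loud they sound.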
A look into the acting literature gives some insight into what projection
means to an actor. Mayer (1968) defines projection as “controlled energy which gives
impact and intelligibility to sound.” Machlin (1966) defines it as “the vigorous
throwing out of the sounds that make up the words you speak” (p. 17). Rodenburg
(2000) refers to projection as “a marriage between support and the means of
articulation” (p. 72). Berry (1973) speaks of projection in terms of “filling the
space… with sharpness of diction and the precise placing of word and tone” (p. 130).
Linklater (1976) encourages “freeing the natural speaking voice” (p. 49) while
freeing muscle tension of the articulators. The common thread among these
definitions is that projection requires loud voicing with clear articulation.
Arthur Lessac (1967) described voice for the actor in terms of structural
action, tonal action, and consonant action. With these, the actor must develop a
physical awareness of the vocal tract, experience the sensation of vocal vibrations,
and create a structure for the intelligibility of speech. The Lessac system
emphasizes the necessity to combine these three actions or energies in order to
produce a voice that easily projects—that is, a resonant voice. The field of speech-language pathology has borrowed principles from the Lessac approach to treat vocal
pathologies with a program called Lessac-Based Resonant Voice Therapy (LBRVT;
Verdolini, 2000).
The LBRVT approach to voice therapy aims to minimize impact stress on
the vocal folds while maximizing vocal output (Berry, Verdolini, Montequin, Hess,
Chan, & Titze, 2001; Verdolini, 2000; Verdolini-Marston, Druker, Palmer, &
Samawi, 1998; Peterson, Verdolini-Marston, Barkmeier, & Hoffman, 1994). A
laryngeal formation that is barely abducted and barely adducted is said to produce
a resonant voice (Verdolini, 2000). In resonant voice therapy, the patient works to
configure the oral cavity in such a way that allows the sound source to resonate in
the vocal tract. Lessac (1967) describes this structural action as an arrangement of
an open pharynx, open teeth, and loose lips. This method of resonating is often used
by actors to produce good voice onstage, and by dysphonic patients to produce
healthy voicing.
Lessac (1967) taught that a resonant voice is achieved when structural
action, tonal action, and consonant action work in tandem to form a trinity of
energies. His philosophy was that a speaker must connect all three energies with
equal weight given to each. Verdolini (2000) seems to have borrowed two of the
energies—structural and tonal—to establish LBRVT. The structural component is
incorporated in the basic training gesture, which is the mechanical formation that
fosters a resonating oral cavity. The tonal component involves easy phonation at the
larynx and a sensation of anterior oral vibrations when phonating. These two
principles together are effective at establishing healthy voicing strategies. However,
LBRVT does not routinely include the third of Lessac’s energies—consonant action.
Let us consider the role of articulation in resonant voice. By producing a
heightened level of articulation, a speaker naturally increases the acoustic energy
of the voice, because precise articulation introduces an additional resonant
dimension to the airflow. Berry (1973) describes the production of consonants in three segments:
coming together, holding, and releasing. “Coming together” is the action of two
surfaces meeting to disrupt the airflow. “Holding” is when the surfaces stay
together for an unspecified period of time and allow pressure to build and vibrations
to resonate. “Releasing” is the final step in producing a consonant, where the two
surfaces separate, which decreases oral pressure and sends the resonant sound out
of the vocal tract. This procedure can be applied to all consonants, and the speaker
controls the resonance by the degree of articulatory precision.
The current study investigated the effects of various levels of articulatory
precision on the perceived loudness of speech in actors during a staged reading. If
we consider articulation as influencing resonance—as described by Berry (1973)—
then the degree of articulation would fundamentally influence vocal loudness. The
present hypothesis is that a positive correlation exists between articulatory
precision and perceived vocal loudness; that is, as articulation increases from weak
to strong, the perceived loudness also increases from soft to loud.
1.2 Literature Review
1.2.1 Loudness and Articulation
The relationship between loudness and articulation has previously been
explored but begs further investigation. Several studies have shown that speakers
heighten the degree of articulation when asked to increase loudness (Schulman,
1989; Dromey & Ramig, 1998; Wohlert & Hammen, 2000; McClean & Tasko, 2003).
Schulman (1989) found that loud speech is characterized by greater articulatory
excursions than normal speech. He measured displacement of the lips and jaw in
four typical speakers as they read six lists of words normally and while shouting.
The lists included the same words in varied orders, and the words used the same
consonant frame with different vowel nuclei. After measuring lip and jaw
movements, Schulman found that the jaw had greater peak displacement and the
lips were more separated in loud speech than in normal speech. This may lead the
reader to deduce that a greater mouth opening is a concomitant characteristic of
loud voicing.
Dromey and Ramig (1998) provide further support for this finding. They
analyzed the opening and closing movements of the lips in ten adults with normal
voice and speech functions. Using a head-mounted cantilever system in various
loudness contexts, they found that the lips had greater displacements and velocities
in louder speech compared to softer speech. Wohlert and Hammen (2000) reported
similar results from collecting perioral surface EMG measurements from twenty
normal adult subjects. They obtained EMG signals in loud and soft oral reading
tasks, and their data revealed that loud voicing was associated with significantly
greater amplitude of upper and lower lip activity compared to soft voicing. McClean
and Tasko (2003) also compared orofacial muscle activity with vocal intensity levels.
The researchers measured lip and jaw EMG parameters of three normal adult male
subjects, while the participants were asked to repeat a given phrase at distinct
loudness levels. Results showed that the loud speech condition elicited higher EMG
levels than the soft speech condition. Overall, there seems to be much support for
increases in vocal loudness triggering increases in articulatory excursions.
Dromey and Ramig (1998) describe this phenomenon by stating, “A more
open vocal tract allows a more efficient radiation of acoustic energy, thus the larger
articulatory excursions can contribute to higher SPL directly.” After all, the aim of a
loud speaker is to increase the intensity of the acoustic signal, so precise
articulation may be a means to an end. McClean and Tasko (2003) suspect that
laryngeal and facial functions share underlying neural pathways, which explains
the correlation of levels. However, Schulman (1989) explained the correlation in
terms of subglottic pressure. When subglottic pressure increases during speech,
supraglottic pressure likewise increases, which causes airflow to surge through the
oral cavity. This escalated airflow may compel the articulators to create a greater
opening. According to this theory, the articulatory modifications are secondary to
vocal intensity changes.
However, Cookman and Verdolini (1999) showed that the inverse is also
possible. They found that when they controlled the amount of jaw opening, vocal
loudness similarly changed. They collected data from twelve vocally healthy adults,
who produced tokens of /ʌ/ using varied amounts of jaw openings (10 mm, 25 mm,
40 mm). Jaw muscle pressure was also observed using a pressure transducer within
a bite block. The researchers used the electroglottographic closed quotient (EGG
CQ) to indirectly estimate vocal fold amplitude. They found that the largest jaw
openings and greatest biting pressures were associated with the greatest amplitude
of vocal fold vibration. Therefore, Cookman and Verdolini suggest that increased
loudness is associated with increased effort of articulatory gestures.
Given the evidence in the literature that loudness and articulation are
linked, Dromey et al. (2008) explored a related concept in the clinical setting. They
found that successful voice therapy for muscle tension dysphonia (MTD) patients
elicited improved articulation. The researchers collected samples of 111 women with
MTD—a voice disturbance that occurs without any structural or neurological
changes to the larynx—reading from “The Rainbow Passage” (Fairbanks, 1960) at a
comfortable loudness level before and after treatment for MTD. Treatment
consisted of circumlaryngeal massage and/or manual laryngeal re-posturing
maneuvers, which are commonly used to improve voicing in patients with MTD
without focusing on articulatory functions. Acoustic analysis of the diphthongs /aI/
and /eI/ revealed a significant increase in F2 slope from pre- to post-treatment
measures, as well as shorter sample duration and fewer pauses following MTD
treatment. These results are consistent with more precise articulatory excursions
and were not present in a control group—ruling out practice bias. This clinical
research provides further evidence that the laryngeal and articulatory subsystems
of the speech mechanism are co-dependent.
1.2.2 Loudness and Dysarthric Speech
There have been a number of studies to address the relationship between
loudness and articulation in the clinical setting—particularly in regard to patients
with dysarthric speech secondary to Parkinson’s disease (Tjaden & Wilding, 2004;
Sapir et al., 2007; Neel, 2009; Kim & Kuo, 2012). Dysarthria is a term given to a
group of neurological disorders affecting the motor speech system; the dysarthrias
impair the muscular control of speech, so speech articulation and intelligibility often
suffer. Parkinson’s disease (PD) is a degenerative neurological disorder that is characterized by
dysarthric speech, as well as a soft, breathy, monotonous voice. A popular therapy
program—the Lee Silverman Voice Treatment (LSVT)—stimulates PD patients to
increase and sustain respiratory and phonatory efforts during speech to create a
louder and clearer voice (Brin, Velickovic, Ramig, & Fox, 2004). Traditionally, LSVT
targets voice rather than directly addressing the articulatory component of dysarthric speech.
While the focus of LSVT is on loudness, this regimen may have an
unintentional impact on articulation, as well. Sapir and colleagues (2007) found
that LSVT elicited significant improvements in the quality of vowel articulation
among dysarthric speakers with PD. Three groups were included in the study: 14
patients with PD received treatment, 15 patients with PD did not receive
treatment, and 14 age-matched neurologically normal participants did not receive
treatment. All participants were asked to read a series of phrases before and after
the treatment group received the LSVT program. The researchers later extracted
the vowels /i/, /u/, and /a/ from the recorded speech samples. The vowels were paired
in pre- and post-treatment productions and presented to a panel of judges. The
judges were asked to rate each vowel pair using a 100-point visual analog scale
indicating which production was a better exemplar of the target vowel. The
investigators found that there were significant improvements in vowel goodness for
the LSVT treatment group, but there were no significant changes in vowel
productions for either of the control groups. These results demonstrate that a
treatment program with the single focus of increasing loudness can also have
therapeutic effects on articulation for individuals with PD.
In a similar study, Tjaden and Wilding (2004) found that loud speech was
significantly more intelligible than habitual speech in PD patients. This study
examined data from 12 individuals with dysarthria secondary to PD reading a
passage in habitual and loud conditions. Speakers were instructed that loud reading
should be double the loudness of habitual voicing, a comparable instruction to that
given in LSVT. Excerpts from the speech samples were played for 10 listeners, who
rated the intelligibility in terms of the ease with which speech could be understood,
paying particular attention to articulation. Listeners were asked to assign any
positive number to the first speech sample, and then assign numbers to all
subsequent samples based on their intelligibility relative to the first sample. These
listener ratings were later converted to a common scale and revealed an overall
greater intelligibility for loud than habitual speech.
The question remains whether the speakers are more intelligible simply because
their loud speech is easier to hear. Neel (2009) compared loud speech with amplified
speech and found that loud speech was considerably more intelligible than mere
amplification. For this study, five participants diagnosed with PD read 20 sentences
using LSVT loud voicing techniques, and then they read the same sentences in their
habitual manner without effortful voicing. Neel amplified the habitual speech
samples to a loudness level corresponding to that of the loud speech samples. The
three conditions of habitual, loud, and amplified were presented to 11 listener
participants. Listeners rated intelligibility of the samples on a 7-point scale. The
results showed that loud speech was significantly more intelligible than habitual
and amplified speech, and in most speakers amplified speech was not significantly
more intelligible than habitual speech.
Kim and Kuo (2012) provide further evidence that amplification does not
make speech more intelligible. The researchers extracted 50 utterances from
archived speech samples of 9 healthy speakers and 16 speakers with dysarthria
secondary to stroke, Parkinson’s disease, and multiple system atrophy. They
manipulated the speech samples to be presented at four distinct levels. For two
conditions, they manipulated the overall loudness of the samples by setting the
presentation levels at 80.5 dB SPL for the high level and 66.0 dB SPL for the low
level. For the other two conditions, they rescaled each sentence by setting the
intensity of its most intense vocalic nucleus to fill a ±6 dB range, then adjusting
the remaining vocalic nuclei in proportion to the target adjustment. This process
was applied to all sentences at both presentation levels of the previous two
conditions. The researchers presented these manipulated samples to a total of 60
listeners with no known hearing problems, each assigned to listen to one of the
four conditions. Listeners
were asked to rate speech intelligibility by “How easy it is to understand exactly
what the speaker said.” Ratings were scaled by using a direct magnitude estimation
technique. Based on these estimates of speech intelligibility, the researchers found
that an overall signal level increase was not associated with improved speech
intelligibility, and the equalization of vowels in an utterance resulted in a decrease
in speech intelligibility. Therefore, the reader may infer that the loud speaker
makes vocal tract adjustments that increase intelligibility, and perhaps those
adjustments are manifested in improved articulation.
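As a minimal illustration of the kind of manipulation Kim and Kuo describe, the Python fragment below applies a decibel gain to a single vocalic nucleus of a digitized signal; the sample indices, gain value, and function name are hypothetical and are not taken from their study.

```python
import numpy as np

def apply_gain_db(signal, start, end, gain_db):
    """Scale one segment of a mono signal (e.g., a vocalic nucleus) by a
    gain expressed in dB; the amplitude factor is 10**(gain_db / 20)."""
    out = signal.copy()
    out[start:end] = out[start:end] * 10 ** (gain_db / 20)
    return out

# Hypothetical usage: raise a nucleus spanning samples 4800-7200 by +6 dB;
# in Kim and Kuo's procedure the remaining nuclei would then be adjusted
# in proportion to this target change.
# louder = apply_gain_db(audio, 4800, 7200, 6.0)
```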
1.2.3 Articulation for Impaired Listeners
Schum (1997) described everyday conversational speech as having rapid
articulation, eliminated or blurred sounds, and poor vocal projection. As a result,
casual speech is prone to poor intelligibility. Amplifying speech
may not be an adequate solution for improving intelligibility, as Erber (1993) points
out that amplified speech is distorted from the original signal. An effective
alternative is to make articulatory adjustments to increase intelligibility. When
speaking with hearing impaired listeners, it may be helpful to use careful
articulation and to emphasize key words with varied intonation (Erber, 1996;
Schum, 1997).
Clear speech is a term given to the speaking style that one uses to voluntarily
maximize the intelligibility of one’s own speech for the benefit of the listener
(Uchanski, 2005). The fundamentals of this approach focus on precise enunciation,
rather than increasing loudness (Tye-Murray, 1988). In other words, the goal is to
clarify the speech signal—not intensify it. Picheny, Durlach, and Braida (1985)
argue that clear speech should be used when talking with hearing impaired
listeners to improve the intelligibility of the speech signal. They recorded three
male speakers reading 50 nonsense sentences using conversational and clear
speech. The sentences were presented to five hearing impaired listeners, who were
tasked with reproducing the sentences either orally or in writing. All listeners were
significantly more accurate in their reproductions of clear rather than
conversational speech. Accuracy was determined by the percentage of correct words.
The researchers also assessed accuracy of phoneme classes and found that all
phonemes were more intelligible in clear speech.
Clear speech has been shown to improve intelligibility for normal hearing
subjects, as well. Caissie and colleagues (2005) found that clear speech intervention
yields excellent speech recognition regardless of hearing acuity. They used two
typical adult males as the speakers in their study, and both had spouses with a
sensorineural hearing loss. One talker received intervention on producing clear
speech, and the other talker was simply instructed to produce clear speech. The
intervention for the experimental talker was based on the Clear Speech program
from the Oticon Otiset hearing aid fitting software, which discusses rate,
articulation, pausing, and stress. Both speakers recorded sentences at baseline
(conversational speech), one week post-intervention, and one month post-intervention. Sentences were presented to normal hearing subjects and subjects
with hearing loss, and the subjects were asked to repeat the sentences as they
heard them. The researchers found that a speaker’s intelligibility improves when
asked to speak clearly, but clear speech intervention elicits greater benefit to speech
intelligibility. These results were true for both normal hearing and hearing
impaired listeners. In fact, the hearing impaired listeners achieved the same speech
recognition accuracy as subjects with normal hearing when listening to the speaker
who received intervention. This demonstrates the effectiveness of clear speech as a
way to increase speech intelligibility.
Bradlow, Kraus, and Hayes (2003) broadened the scope of clear speech and
found that it is also effective in noisy backgrounds and with cognitively impaired
listeners. The researchers recorded two speakers reading a list of sentences in
conversational and clear speech. The sentences were presented at -8 dB and -4 dB signal-to-noise ratios (SNR) to 36 normally developing school-aged
children and 63 children with learning disabilities. The children were asked to
repeat each sentence that they heard. The researchers found that clearly spoken
sentences had better speech perception than conversational speech for both groups
of children and in both SNR conditions. These findings suggest that clear speech
has strong benefits for speech perception despite background noise or the overall
cognitive function of the listener.
It has been shown that in situations where listeners have difficulty hearing—
due to hearing impairment, cognitive impairment, or background noise—clear
speech is an effective method for improving the speaker’s intelligibility. While
intelligible speech is by nature adequately loud, the previous findings do not
address the perceived loudness of clear speech. Additional research is needed to
determine if the crispness of precise articulation adds to the perceptual intensity of
the speech signal.
1.2.4 Loudness and Actors
Actors have specific vocal demands that make them an appropriate
population for investigating the relationship between articulation and loudness.
Master and colleagues (2008) demonstrated that actors have unique voice
characteristics compared to non-actors. The researchers recorded eleven actors and
ten non-actors reading the same text using three conditions of loudness: habitual,
moderate, and forte. All participants were recorded in an acoustically treated booth
and were asked to imagine being in a small, medium, and large space to elicit
distinct volume levels. Acoustic analysis revealed unique mean SPL levels for each
condition, but there was not a significant SPL difference between the actor and non-actor groups. However, when eight speech therapists listened to the speech samples,
they rated the actor group as being significantly louder and better projected than
the non-actor group. These findings suggest that the actors involved had some
quality of voicing that made them seem perceptually louder than the non-actors,
even though there was no overall difference in SPL.
Resonant voicing may be a possible explanation for the perceptual differences
in loudness. Acker (1987) compared samples of resonant phonation and constricted
phonation, and claimed that resonant voice is both perceptually and acoustically
louder. The study used one subject who was a female actor trained in the Lessac
technique. The stimuli for the subject were loud sustained productions of the vowel
/o/ in resonant and constricted modes. Findings revealed that resonant phonation
was on average 6.7 dB louder than constricted phonation. In a listening test, ten
judges were presented with pairs of stimuli, and they selected the resonant
productions to be louder than the constricted productions 80% of the time. Perhaps
the actors in the Master et al. (2008) study were well trained in resonant voicing,
which may have accounted for those actors sounding louder than non-actors.
A resonant voice may seem louder because it is considered to be acoustically
richer than conversational voice production. Raphael and Scherer (1987) examined
spectral differences between actors’ normal conversational voice and their
performance voice. The study included two male and two female actors who were all
previously trained in Lessac’s call technique. The actors were asked to sustain the
first part of the diphthong /ou/ in the word “hello” on a predetermined pitch in
speech mode and call mode. They were asked to use the same degree of effort in
speech as in call. Spectral analysis revealed significant differences between the two
vocal modes; the authors reported finding enhancement at the first formant and the
third formant skirt for the call mode. These spectral enhancements give the acoustic
signal a rich quality, which may make the voice sound louder.
It has also been shown that an actor’s vocal loudness is contingent upon the
performance conditions. Emerich et al. (2005) claim that actors produce louder
phonation onstage than in a studio. The researchers obtained voice range profiles
(VRPs) of eight professional actors by having them sustain the vowel /a/ at various
frequencies in their pitch range using minimum and maximum loudness levels.
Then the actors performed a scene in a studio and on a stage. The researchers
analyzed speech samples by collecting speech range profiles (SRPs), which reflect
the many tones and intensities produced in a connected speech sample. They found
that no actor performed using the full physiologic ranges seen in the VRP
collections, but all actors produced louder phonations in SRPs than in VRPs. The
findings also showed that actors were louder at some frequencies during the staged
performance compared to the studio performance. Emerich and colleagues
demonstrated that actors may exceed their baseline intensity—collected in VRP—
during a performance, and they specifically get louder onstage than in a studio.
Interestingly, Acker (1987) found that actors have greater articulatory
excursions associated with a resonant voice than a constricted voice. Radiographic
measures revealed that resonant phonation greatly increased oral cavity size (by 36
mm) and jaw lowering (by 14.6 mm) when compared to constricted phonation. In
this regard, resonant voicing is comparable to loud voicing in that they both elicit
enhanced articulatory excursions (Schulman, 1989; Dromey & Ramig, 1998;
Wohlert & Hammen, 2000; McClean & Tasko, 2003). Accordingly, resonant voice is
characterized by both acoustic and structural properties that make it distinct from
conversational voice.
In conclusion, it has been shown that heightened levels of articulation often
accompany loud speech. However, the authors cited in this section do not consider
whether loud speech might follow from controlled articulatory precision.
The current study attempts to address this matter.
The purpose of this study is to investigate the effect that articulation may
have on vocal loudness. It is hypothesized that speech samples with clear
articulatory precision will be perceived as sounding louder and more projected than
samples with poorly enunciated speech. If this hypothesis is supported by the
findings, then we may have further insight on why actors are louder in performance
than in sustained phonation (Emerich et al., 2005). We may also have a foundation
for future research relating to voice therapy; articulation may potentially become a
focal point of treatment to target vocal loudness. The current study serves as a
preliminary attempt to determine if articulation control has a positive influence on
perceived vocal loudness.
CHAPTER TWO: METHODS
2.1 Sample Collection
2.1.1 Participants
Eight amateur actors—5 females and 3 males, whose ages ranged from 22
years to 54 years (with a mean age of 29 years)—volunteered to participate in this
study. Participants were recruited through the primary investigator’s acting
colleagues associated with theatre companies in Iowa City, Iowa. All participants
were amateur actors with experience performing in stage productions ranging from
3 years to 36 years (with a mean experience of 13 years). Amateur actors were
defined as those who had previously performed in stage productions and who were
not acting professionally at the time of the study. The amount of previous acting
training for these actors varied from 0 years to 7 years (with a mean training of 3
years). Amateur actors were selected under the assumption that they may be more
susceptible to change given an intervention, whereas professional actors may be
more likely to have a polished performance voice that uses sufficient articulation,
loudness, and projection. All actors reported that they were non-smokers and had
no hearing problems, voice problems, or upper respiratory tract illnesses at the time
of the study. Informed consent was obtained from participants in accordance with
the Institutional Review Board of the University of Iowa.
2.1.2 Speech Tasks
Each actor performed a monologue in three different conditions: normal
performance, bite block performance, and over-articulation performance. Actors
participated in individual recording sessions at the Wendell Johnson Speech and
Hearing Center at the University of Iowa. After giving consent to participate, all
participants were given the same monologue from Shakespeare’s King Lear to
perform (Appendix A). They were given five minutes to independently read through
the monologue and practice performing it in a room alone. This process was similar
to a cold reading at an acting audition. When the five minutes of practice time was
complete, each actor reported feeling comfortable with performing the monologue.
Performances were held on a small stage raised off the main floor in a 166-seat
lecture hall while the primary investigator administered an audio recording.
For the first performance, actors were instructed to read the monologue as if they
were performing it for an audition. They were given no further instructions
regarding articulation, loudness, or projection. After this, each actor was given a
Dynarex 6-inch by 3/4-inch wooden tongue depressor to serve as a bite block device.
Participants were asked to read the monologue again in exactly the same way as
before (i.e., keeping the same characterization, emphasis, vocal tone and quality),
and this time hold the bite block with their teeth. The instructor demonstrated how
to speak in this manner. Once the participants performed the monologue with the
bite block, they received an intervention focused on developing an over-articulated
style of speech production. This intervention lasted about thirty minutes for each
participant. When the intervention was completed, the actors performed the
monologue for a third time with the instruction to incorporate the newly developed
heightened articulation style. After this final performance, the actors were given a
debriefing of the study, and their participation was concluded.
The order of the three speech tasks was consistent for all participants (i.e.,
normal, then bite block, then over-articulation) to eliminate experience bias from
the over-articulation intervention.
2.1.3 Intervention
Intervention for over-articulation was based on Lessac’s (1967) articulation
exercises and was administered for all participants by the primary investigator. The
purpose of this intervention was to elicit performance samples that may be
perceived as heightened articulatory productions. The intervention was a
programmatic routine consisting of structural stretch exercises, basic training of
consonants, practice in words, practice in phrases, and practice in the text from
King Lear.
The intervention began with a series of structural stretch exercises. These
exercises aimed to reduce any chronic underlying contraction in the torso,
shoulders, neck, and face, to increase each participant’s sensory awareness to
his/her own body, and to allow each participant to attend to his/her breathing and
speech (Verdolini, 2000). Participants began by standing in a neutral posture with
feet shoulder-width apart, knees loose, spine elongated (with natural curves),
shoulders resting back, and head sitting lightly at the top of the spine. They were
instructed to take deep, slow breaths, breathing in through the nose to fill the lungs
and exhaling through the mouth. On an exhalation, they slowly raised their arms
above their heads, and held this position for several deep breaths. Then they
gradually dropped their arms and let the weight of the head pull the upper body
downward, so that the hands might touch the ground. After a few more deep
breaths in this bent-over position, participants slowly elongated the spine back to
the neutral position. These stretches aimed to alleviate undesirable tension in the
back and allow participants to attend to the breath.
Next, the participants stretched the muscles affecting the thoracic cavity.
First, they placed their hands on their hips and pointed their elbows to the wall
behind them in order to expand the chest wall. Then, they reached their arms
around to the front as if hugging a large tree, which expands the back and allows
the lower part of the lungs to fill with air. Finally they reached the left arm up and
over to the right side, bending at the waist—stretching the left side of the rib cage.
This was repeated for the right side, as well.
Then, the participants stretched their necks with side stretches, head rolls,
and massage. The side stretch involved placing the left hand onto the right side of
the head and letting the weight of the hand stretch out the right side of the neck.
This was repeated for both sides. The head rolls involved dropping the chin to the
chest and rolling the head up to each side. Finally, participants were instructed to
massage the back of their necks to loosen any tight muscles there.
Further exercises were intended to relax the oral-facial muscles. Participants
practiced trilling the lips by blowing air through them and letting them loosely
vibrate together. They also stretched out the inside of the cheeks by pressing the
tongue along the interior walls of the oral cavity, as if trying to remove peanut
butter from the mouth. Then, they engaged all facial muscles by moving them in all
directions; they were instructed to raise every muscle in their face toward the
ceiling, to the left, to the right, and toward the floor. Finally, participants engaged
the tongue in directional movements by protruding it from the mouth and pointing
it up, down, left, and right. Each of these exercises was aimed at warming up the
muscles to prepare them for the articulation exercises to follow.
The first articulation exercise was consonant control practice of individual
speech sounds. The instructor gave participants a copy of Lessac’s diagram of the
Consonant Orchestra (Appendix B). Lessac (1967) refers to each of the consonant
sounds as a musical instrument, and the consonants come together to play a
symphony that is speech. Hampton (1997) suggests that the purpose of the
Consonant Orchestra is to “taste [a consonant’s] particular identifying vibrations, to
explore its particular range, … and to incorporate this new-found vibratory and
rhythmic awareness into spoken language.” Using the Consonant Orchestra as a
reference, participants produced sound repetitions and elongations for each
consonant sound in the English language. While working at the sound level,
participants were encouraged to attend to the oral vibrations associated with each
consonant, and they were asked to make the vibrations as strong as possible. They
also used this time to focus on the structural characteristics that produced each
consonant sound. If the consonant were a stop, they were to feel a complete stop of
the airflow before letting the sound escape the vocal tract. If the sound were a
fricative, they were to feel the obstruction to the airflow as the air pushed through
the articulators. This exercise aimed to heighten each participant’s sensory
awareness of consonant energy.
Next, the instructor gave the participants a list of words and phrases to recite
(Appendix C). This list was borrowed from Lessac (1967), and it consisted of 68
words and 52 phrases that had one or more consonant clusters. Participants were
asked to maintain the sensory awareness that they developed in the previous
exercise for producing the current words and phrases, and to pronounce these
stimuli clearly and precisely. The instructor modeled the first word and phrase, and
then he simultaneously read the first few words and phrases with each participant.
After a few trials, he stopped reading the stimuli and allowed each participant to
continue independently. The instructor provided occasional feedback for all
participants. This exercise applied the concept of consonant energy to a more
meaningful speech-related context.
The final exercise related to precise articulation was selective sound practice.
In this activity, participants were asked to read the King Lear monologue using
only consonants. That is, they recited the monologue while omitting all vowel
sounds, making the speech a continuous cluster of plosives, fricatives, nasals, and so
on. For example, the line “Blow, winds, and crack your cheeks” would be read as [bl
wndz nd krk jr tʃks]. The instructor modeled the first sentence, and then he
simultaneously read the first few lines with each participant. After fading away to
let participants continue independently, the instructor provided occasional
feedback. This selective sound practice engaged the participants in feeling each
individual consonant and giving it adequate weight and emphasis, which would
then be applied to the final performance of the monologue.
After these exercises were complete, participants were allotted five minutes
to practice the monologue with over-articulation. Once each participant felt
comfortable that he/she could perform the monologue using over-articulation, the
intervention session was complete, and the final recording was administered.
2.1.4 Recording Procedures
Recordings were collected during all performances in the same manner. The
actors were asked to stand on a mark placed in the center of the stage and deliver
their readings from this point. The room was a 166-seat auditorium with ambient
noise level below 50 dB. The only people in the room during any given recording
were the performing actor and the primary investigator. A Shure PG48 dynamic
cardioid microphone (Shure Incorporated, Niles, IL) was used to collect the voice
samples. A constant microphone-to-mouth distance of approximately 1 m (91.44 cm) was used for
all participants. The microphone was connected to an M-Audio Fast Track MKII
USB audio interface (inMusic Brand, Cumberland, RI), which provided consistent
signal gain for all participants. The signal was processed through Pro Tools SE
(Avid Technology, Daly City, CA) recording software with a sampling rate of 48
kHz, 24-bit resolution.
A general purpose Quest Sound Level Meter (series 210; Quest Technologies,
Oconomowoc, WI) was used to calibrate the equipment prior to each recording
session. A white noise sound source was generated by SimplyNoise
(simplynoise.com) and played through a loudspeaker at a microphone-to-source
distance of approximately 1 m (91.44 cm). The signal was presented at three distinct levels (60 dB,
70 dB, 80 dB as measured by the sound level meter) for 10 seconds each and
recorded with the Pro Tools SE software. The recorded measurements were used to
calibrate the recorded audio signals.
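To make the arithmetic of this calibration concrete, the following Python sketch shows one way the offset between digital level and dB SPL could be derived from the three recorded noise segments. This is an illustrative reconstruction, not the procedure's actual implementation; the file names and the soundfile dependency are assumptions.

```python
import numpy as np
import soundfile as sf  # assumed here for reading the exported WAV files

def rms_level_db(x):
    """RMS level of a mono signal in dB relative to digital full scale."""
    return 20 * np.log10(np.sqrt(np.mean(np.square(x))))

# Sound level meter readings paired with hypothetical calibration files.
calibration_tones = [(60, "noise_60dB.wav"),
                     (70, "noise_70dB.wav"),
                     (80, "noise_80dB.wav")]

offsets = []
for spl, path in calibration_tones:
    audio, fs = sf.read(path)
    offsets.append(spl - rms_level_db(audio))  # dB SPL = digital dB + offset

offset = float(np.mean(offsets))  # average offset across the three levels

def digital_to_spl(x):
    """Map a recorded segment's digital RMS level to calibrated dB SPL."""
    return rms_level_db(x) + offset
```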
2.2 Data Analysis
2.2.1 Acoustic Analysis
After recording each actor in all three conditions, the collected voice samples
were segmented using the editing software Audacity (version 2.0.2;
audacity.sourceforge.net). Five distinct sentences were selected from the monologue
for analysis: “You cataracts and hurricanes, spout till you have drench’d our
steeples, drown’d the cocks!” “And thou, all-shaking thunder, strike flat the thick
rotundity o’ th’ world,” “Rumble thy bellyful! Spit, fire! Spout, rain!” “I never gave
you kingdom, call’d you children, you owe me no subscription,” “Here I stand your
slave, a poor, infirm, weak, and despis’d old man.” These sentences were chosen at
random and taken consistently from all recordings.
The Computerized Speech Lab software (Model 4400; Kay Elemetrics
Corporation, Lincoln Park, NJ) was used to determine the mean and maximum
loudness levels for each sentence per actor per condition. These data were used to
make comparisons between and across sentences, actors, and conditions.
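For readers without access to CSL, comparable mean and maximum intensity values can be approximated from the calibrated recordings with a frame-wise RMS contour. The sketch below is a stand-in for CSL's energy analysis, with frame and hop sizes chosen arbitrarily rather than matched to the software.

```python
import numpy as np

def intensity_contour(x, fs, frame_ms=40, hop_ms=10):
    """Frame-wise RMS intensity (dB) of a mono signal."""
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    levels = []
    for start in range(0, len(x) - frame + 1, hop):
        rms = np.sqrt(np.mean(np.square(x[start:start + frame])))
        levels.append(20 * np.log10(max(rms, 1e-12)))  # guard against log(0)
    return np.asarray(levels)

# For one sentence from one actor in one condition, adding the calibration
# offset from the sketch in section 2.1.4 converts digital dB to dB SPL:
# contour = intensity_contour(sentence_audio, fs) + offset
# mean_db, max_db = contour.mean(), contour.max()
```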
2.2.2 Perceptual Analysis
Twenty graduate students in the Department of Communication Sciences
and Disorders at the University of Iowa volunteered to participate as judges in this
study. The judges consisted of 17 females and 3 males, whose ages ranged from 22
years to 37 years (with a mean age of 25.7 years). The number of years that the
judges spent studying communication sciences and disorders varied from 2 years to
4.5 years (with a mean of 3.2 years of experience). The judges had no
known hearing loss at the time of the study. Informed consent was obtained from
judge participants in accordance with the Institutional Review Board of the
University of Iowa.
Judges independently listened to selected voice samples in a small laboratory
room at the Wendell Johnson Speech and Hearing Center at the University of Iowa.
The judges listened to stimuli through Sennheiser HD435 headphones that were set
at a consistent level for all samples. Stimuli were presented using E-Prime software
(version 2.0; Psychology Software Tools, Sharpsburg, PA). The computer program
presented randomized pairs of stimuli and asked judges to make a comparative
judgment between each pair of stimuli. The stimuli pairs were recordings of the
same actor reading the same sentence, but in different articulatory conditions. That
is, the only difference between each pair was that the stimuli were recorded in
different conditions. The judges were naïve to the performance conditions.
The stimulus presentation was divided into three sections, each consisting of
twenty randomized trials (for a total of sixty trials). The judges were first asked to
select which stimulus from each pair sounded louder than its partner. The next
section asked judges to select which stimulus had better articulation than its
partner. The final section asked judges to select which stimulus had better
projection than its partner. Projection was defined as “The extent to which a voice is
clear and carries naturally and effortlessly” (Michel & Willis, 1983). Judges were
also given the option to rate the two stimuli as the same if they believed the two
had equal levels of loudness, articulation, or projection. Judges were instructed to
respond according to their opinion. For each trial, judges listened to both stimuli
(with the option of listening multiple times), made a comparative judgment, and
moved on to the next trial. A practice trial began each session to ensure that the
judges comprehended the task. The sections were consistently presented in the
order described above. When the judges completed all three sections, their
participation in this study was concluded.
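Presentation was implemented in E-Prime, but the pairing and randomization logic can be sketched in Python. Everything below—the data structure, the counterbalancing, and all names—is an illustrative reconstruction rather than the actual experiment script.

```python
import itertools
import random

CONDITIONS = ["normal", "bite_block", "over_articulation"]

def build_trials(stimuli, n_trials=20):
    """stimuli maps (speaker, sentence, condition) -> audio file path.
    Returns randomized pairs that differ only in articulation condition."""
    pairs = []
    for speaker, sentence in {(s, t) for (s, t, _) in stimuli}:
        for a, b in itertools.combinations(CONDITIONS, 2):
            pair = [stimuli[(speaker, sentence, a)],
                    stimuli[(speaker, sentence, b)]]
            random.shuffle(pair)  # counterbalance order within each pair
            pairs.append(tuple(pair))
    random.shuffle(pairs)
    return pairs[:n_trials]  # one 20-trial section per perceptual question

# Each section posed one question (louder, better articulation, or better
# projection), and judges could also respond that the two sounded the same.
```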
CHAPTER 3: RESULTS
Data were analyzed with the StatPlus software package (version 2009;
AnalystSoft Incorporated). Since the aims of the present study had not been
addressed by previous research, a type I error rate of α = 0.05 was selected for
statistical analysis. For post hoc comparisons, a Bonferroni correction of α = 0.0167
was used to reduce the possibility of obtaining type I errors.
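The analyses reported below can be reproduced in outline with standard statistical tools. The Python sketch that follows assumes a hypothetical long-format table (one row per speaker and condition, with the dependent variable averaged over the five stimulus sentences); it mirrors the logic of the StatPlus analyses rather than their exact implementation.

```python
import itertools
import pandas as pd
from scipy.stats import ttest_rel
from statsmodels.stats.anova import AnovaRM

def rm_anova(data: pd.DataFrame, dv: str):
    """One-way repeated measures ANOVA with condition as the within factor."""
    return AnovaRM(data, depvar=dv, subject="speaker",
                   within=["condition"]).fit()

def bonferroni_pairs(data: pd.DataFrame, dv: str, alpha: float = 0.05):
    """Post hoc paired t-tests at a Bonferroni-corrected alpha level."""
    conditions = sorted(data["condition"].unique())
    pairs = list(itertools.combinations(conditions, 2))
    corrected = alpha / len(pairs)  # 0.05 / 3 = 0.0167, as in the text
    for a, b in pairs:
        x = data.loc[data["condition"] == a].sort_values("speaker")[dv]
        y = data.loc[data["condition"] == b].sort_values("speaker")[dv]
        t, p = ttest_rel(x, y)  # rows align because both are speaker-sorted
        print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f}, "
              f"significant: {p < corrected}")

# Hypothetical usage with a table of per-speaker mean intensities:
# print(rm_anova(df, "mean_db"))
# bonferroni_pairs(df, "mean_db")
```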
3.1 Acoustic Findings
Computerized Speech Lab (CSL) was used to measure mean vocal intensity
during the different articulation conditions for each speaker. A one-way repeated
measures ANOVA was carried out to determine the influence of articulation
condition on mean intensity (dB) level of each stimulus sentence for each speaker.
The results showed no significant effect of condition on mean vocal intensity,
F(2, 14) = 2.84, p = 0.09. The mean intensity of the normal condition was 75.37 dB (SD = 3.73);
the mean intensity of the bite block condition was 74.42 dB (SD = 5.12); the mean
intensity of the over-articulation condition was 76.58 dB (SD = 3.55). Figure 1
shows the mean intensity across conditions for each speaker.
A one-way repeated measures ANOVA was also used to determine the
influence of articulation condition on maximum intensity (dB) level of each stimulus
sentence for each speaker. The results showed a statistically significant effect of
condition on maximum intensity, F(2, 14) = 4.52, p = 0.03. Post hoc comparisons using
paired sample t-tests indicated that the over-articulation condition (M = 98.35, SD =
10.60) elicited a significantly greater maximum intensity than the bite block
condition (M = 95.34, SD = 13.61), t(7)= 2.79, p = 0.014. However, there was no
significant difference between maximum intensity of the normal condition (M =
96.91, SD = 12.17) and that of the over-articulation or bite block conditions.
Furthermore, it was observed that Speaker 8 was an outlier for the over-articulation condition, so the post hoc comparisons were analyzed again after
omitting the data for Speaker 8. These findings showed no significant difference
between maximum intensity of conditions. Figure 2 shows the maximum intensity
across conditions for each speaker.
Additionally, a one-way repeated measures ANOVA was conducted to
investigate the effect of articulation condition on time duration of speech samples.
The author did not initially intend to observe duration, but noticeable differences
between conditions elicited further examination. The results showed a statistically
significant effect of condition on duration, F(2, 14) = 14.58, p < 0.001. The mean duration of
speech samples in the normal condition was 5.56 seconds (SD = 1.15); the mean
duration in the bite block condition was 5.68 seconds (SD = 0.94); the mean duration
in the over-articulation condition was 8.50 seconds (SD = 2.87). Post hoc
comparisons using paired sample t-tests indicated that the over-articulation
condition elicited a significantly greater duration than both the bite block (t(7)= 3.50, p
= 0.005) and normal conditions (t(7)= 4.36, p = 0.002). However, there was no
significant difference between the duration of speech samples in the normal and
bite block conditions.
It was noted that Speaker 3 was an outlier for duration (i.e., an
atypically greater mean duration for over-articulation than other speakers), so a more
conservative analysis was conducted omitting the data for Speaker 3. Even with the
outlying data removed, similar findings were confirmed. The over-articulation
condition was associated with a significantly greater mean duration than both the bite
block (t(6)= 3.59, p = 0.002) and normal conditions (t(6)= 4.27, p < 0.001), but no
significant difference was observed between normal and bite block conditions.
Figure 3 shows the mean time duration of speech samples across conditions for each
speaker.
3.2 Perceptual Findings
Perceptual comparisons of articulation, loudness, and projection were
analyzed to observe the effects of articulation condition on each of the
aforementioned qualities. First, a one-way repeated measures ANOVA was
conducted to determine whether the articulation conditions produced perceptually
different articulation styles for each speaker. The results showed a statistically
significant effect of condition on the quality of articulation, F(2, 14) = 316.44, p < 0.001.
When a normal speech sample was presented, judges selected it as having better articulation than its paired sample 62.82% of the time on average (SD = 9.58); the bite block condition was selected as having better articulation 1.67% of the time (SD = 2.17); and the over-articulation condition, 89.80% of the time (SD = 5.40). Post hoc comparisons using paired
sample t-tests indicated that the over-articulation condition elicited significantly
better articulation than the normal condition (t(7) = 5.96, p < 0.001), and the normal condition elicited significantly better articulation than the bite block condition (t(7) = 15.80, p < 0.001). Figure 4 shows the percent of the time that each condition was
selected as having better articulation than its paired sample for each speaker.
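The percent-selected measure reported throughout this section can be tabulated directly from the pairwise judgments. The sketch below assumes each trial is recorded as the two conditions presented plus the judge's choice; this record format is an illustrative assumption, not the study's actual data structure.

    from collections import Counter

    def percent_selected(trials):
        """trials: iterable of (condition_a, condition_b, chosen) tuples."""
        presented, chosen = Counter(), Counter()
        for a, b, pick in trials:
            presented[a] += 1
            presented[b] += 1
            chosen[pick] += 1
        # Percentage of presentations on which each condition was chosen.
        return {c: 100.0 * chosen[c] / presented[c] for c in presented}

    print(percent_selected([("normal", "over_artic", "over_artic"),
                            ("bite_block", "normal", "normal")]))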
Next, a one-way repeated measures ANOVA was administered to observe
differences in perceived loudness of the articulation conditions for each speaker. The
results showed a statistically significant effect of condition on the perceived loudness of each speech sample, F(2, 14) = 32.70, p < 0.001. When a normal speech sample was presented, judges selected it as being louder than its paired sample 55.60% of the time on average (SD = 14.39); the bite block condition was selected as being louder 18.29% of the time (SD = 10.81); and the over-articulation condition, 79.76% of the time (SD = 12.20). Post hoc comparisons using
paired sample t-tests indicated that the over-articulation condition elicited
perceptually louder samples than the normal condition (t(7) = 2.92, p = 0.011), and the normal condition elicited perceptually louder samples than the bite block condition (t(7) = 4.50, p = 0.001). Figure 5 shows the percent of the time that each
condition was selected as sounding louder than its paired sample for each speaker.
Then, a one-way repeated measures ANOVA was used to determine whether
the articulation conditions were associated with different levels of projection for
each speaker. The results showed a statistically significant effect of condition on the
perceived level of projection, F(2, 14) = 159.27, p < 0.001. When a normal speech sample was presented, judges selected it as having better projection than its paired sample 59.25% of the time on average (SD = 9.69); the bite block condition was selected as having better projection 3.80% of the time (SD = 4.33); and the over-articulation condition, 85.02% of the time (SD = 7.70). Post hoc comparisons using paired sample t-tests indicated that
the over-articulation condition elicited significantly better projection than the
normal condition (t(7) = 4.31, p = 0.002), and the normal condition elicited significantly better projection than the bite block condition (t(7) = 13.30, p < 0.001).
Figure 6 shows the percent of the time that each condition was selected as having
better projection than its paired sample for each speaker. Table 1 shows the overall
percentage that a condition was selected as having better articulation, loudness, or
projection than its paired sample.
Figures 7, 8, and 9 show the percent of the time that each judge selected a
condition as having better articulation, loudness, and projection (respectively) than
its paired sample.
Figures 10, 11, and 12 show the percent of the time that each condition was
selected as having better articulation, loudness, and projection (respectively) than
its paired sample for each of the five stimulus sentences.
3.3 Acoustic vs. Perceptual Findings
Acoustic and perceptual findings were analyzed together to determine if there
were significant differences between the two sources. Paired sample t-tests for each
articulation condition were conducted to compare the percentage that a presented
stimulus was acoustically louder than its paired sample and the percentage that it
was perceived to be louder. Results indicated no significant differences between
acoustic and perceptual findings for the normal condition (t(7) = 1.73, p = 0.06), the bite block condition (t(7) = 1.76, p = 0.06), or the over-articulation condition (t(7) = 0.31, p = 0.38). Figure 13 shows the comparisons for each condition.
3.4 Inter-Rater Reliability
Reliability was determined by comparing ratings of trials that were
presented to multiple judges for each of the judgment tasks. The inter-rater
reliability for the judges during the articulation task, assessed with Fleiss’ kappa,
was found to be κ = 0.93. The inter-rater reliability for the judges during the
loudness task was found to be κ = 0.85. The inter-rater reliability for the judges
during the projection task was found to be κ = 0.92. According to Landis and Koch
(1977), the judges were in almost perfect agreement for each of the tasks. Intra-rater reliability was not measured because repeat trials were not administered during testing.
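For reference, Fleiss' kappa on the trials presented to multiple judges can be computed with statsmodels; the ratings matrix below (rows are shared trials, columns are judges, and 0/1 codes which member of the pair was chosen) is an assumed layout with placeholder values.

    import numpy as np
    from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

    ratings = np.array([   # rows: trials rated by every judge; columns: judges
        [0, 0, 0],
        [1, 1, 1],
        [0, 0, 1],
        [1, 1, 1],
    ])

    table, _ = aggregate_raters(ratings)   # trials x categories count table
    print(fleiss_kappa(table, method="fleiss"))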
Articulatory            Percent Condition Selected, by Judgment Task
Condition               Articulation      Loudness      Projection
Over-Articulation       89.80%            79.76%        85.02%
Normal                  62.82%            55.60%        59.25%
Bite Block              1.67%             18.29%        3.80%

Table 1. Overall percentage that a condition was selected as having better
articulation, loudness, or projection than its paired sample.
Figure 1. Mean intensity (dB) levels for each speaker across articulation
conditions.
Figure 2. Maximum intensity (dB) levels for each speaker across articulation
conditions.
Figure 3. Mean sentence duration (in seconds) for each speaker across articulation
conditions.
Figure 4. Percentage that judges selected each condition as having better
articulation than its paired stimulus for each speaker.
Figure 5. Percentage that judges selected each condition as being louder than its
paired stimulus for each speaker.
Figure 6. Percentage that judges selected each condition as having better
projection than its paired stimulus for each speaker.
Figure 7. Percentage that judges selected each condition as having better
articulation than its paired stimulus. Judges are displayed on the x-axis, and the
data points represent ratings across speakers.
Figure 8. Percentage that judges selected each condition as being louder than its
paired stimulus. Judges are displayed on the x-axis, and the data points represent
ratings across speakers.
Figure 9. Percentage that judges selected each condition as having better
projection than its paired stimulus. Judges are displayed on the x-axis, and the data
points represent ratings across speakers.
Figure 10. Percentage that judges selected each condition as having better
articulation than its paired stimulus for each of the five presented sentences.
Figure 11. Percentage that judges selected each condition as being louder than its
paired stimulus for each of the five presented sentences.
Figure 12. Percentage that judges selected each condition as having better
projection than its paired stimulus for each of the five presented sentences.
Figure 13. Comparison of acoustic findings with perceptual findings. Acoustic
loudness is shown as the percentage that the mean loudness (in dB) of each
condition was greater than that of its paired stimulus. Perceived loudness is
represented by the percentage that judges selected each condition as sounding
louder than its paired stimulus.
CHAPTER FOUR: DISCUSSION
The present study examined whether the degree of articulation influences the
perceived loudness of the projected voice. Although a repeated measures ANOVA
did not demonstrate a significant difference in mean sound pressure levels between
articulatory conditions of actors reading the same passage, significant differences
between conditions were confirmed for perceptual ratings of articulation, loudness,
and projection. In this study, the only manipulated variable was articulatory
context (i.e., normal, bite block, over-articulation). Perceptual ratings confirmed
that these styles of articulation were indeed distinct on the articulation parameter
as judged by a series of listeners. Although loudness was not directly manipulated,
the judges rated the articulation conditions as being markedly different regarding
loudness and projection.
The current study shows a notable dissociation between the acoustic and perceptual findings. Even though the sound pressure levels were not significantly different
between conditions, judges rated the conditions as having significantly different
loudness levels. These findings are consistent with the results of Master and
colleagues (2008), who found that actors and nonactors delivered speech samples of
similar SPL, but listeners rated the actors as being louder and better projected than
the nonactors. The present results suggest that such perceptual differences may be due to modifications in articulation. That is, if the actors in the Master et al. (2008) study were well trained in articulation, then articulatory
differences may have been responsible for significant differences in perceived
loudness.
A secondary finding of the current study was that the over-articulation
condition elicited a prolonged duration of speech samples. Although it was not the
intended purpose of the study to investigate rate, the present findings support
previous evidence that the degree of articulation influences rate (Dromey & Ramig,
1998; Wohlert & Hammen, 2000; Tjaden & Wilding, 2004; Picheny, Durlach, &
Braida, 1985). Lessac (1967) notes that actors may speak at a slower rate when first beginning his consonant action exercises, but that their goal should be to master this style of speech at a normal rate. In the present study,
actors were asked to be aware of the concentrated energy associated with each
consonant and with the combination of consonants. Consequently, they slowed their speech rate to accommodate the increased awareness of articulatory movements.
A number of limitations in the present study may provide useful directions
for future work in this area. First, the manner of perceptual measurement does not
provide information to infer the degree of perceptual differences. That is, if judges had used a rating scale to indicate loudness levels, we would be able to determine how much louder one sample sounds compared to another. The difficulty with
using such a scale is that inter-rater—and even intra-rater—reliability might be
low due to inconsistent interpretations of the rating scale. For this preliminary study, a
more conservative system was used to collect somewhat more objective data on
perception. The present results imply that there are indeed significant differences
in loudness levels for each articulatory condition, but further studies are needed to
demonstrate the extent of these differences.
Another limitation was the amount of time given to the over-articulation
treatment. The treatment presented in this study was based on Lessac’s (1967)
consonant action exercises, but it was a considerably modified program. Lessac
recommends consonant training that lasts several weeks to ensure proper
exploration of speech actions. However, the current training only lasted thirty
minutes. As a result, the actor participants responded to treatment in varying ways.
Rate of speech was notably inconsistent between actors, and Lessac even says, “You
will find yourself experimenting with a new kind of rhythm, speed, and melody—an
interplay between dynamic tempos….” (1967). Intensity of consonants was
subjectively noted to vary between actors, as well. A future study might implement a treatment protocol that adheres more closely to Lessac’s training recommendations.
The degree to which the treatment effects generalize to conversational speech is, at this point, unknown. The over-articulation condition was notably stilted and
not representative of natural-sounding speech. Lessac addresses this and
encourages his students to link consonants together and strive for a more natural
rate—although it may feel rapid to the speaker (1967). However, this
natural rate was not always achieved by the speakers in this experiment due to the
restricted amount of time given to treatment. In the time allotted, the actors only
worked on applying the articulation strategies to the performance text. If the
participants were given more time during treatment, then they could have worked
on generalizing the strategies to conversational speech.
The performance monologue was selected because it readily lends itself to
over-articulation. In a commentary about the monologue, Silverbush and Plotkin
(2002) describe Lear’s speech as containing “the same hard consonants that are
repeated throughout the piece—Bullying Bs, Spitting Ss, Ps, and Ts, Damning Ds,
Raging Rs, and Clobbering Ks” (p. 663). This style of writing is not representative of
speech in the Standard American dialect, so the naturalness and generalizability of the strategies to casual speech are unknown.
Based on the results of this study, we can infer that some benefit may be gained if articulation treatment were applied to voice therapy. After all, the goals
of resonant voice strategies often incorporate speaking with a forward focus, with
special attention given to anterior oral vibrations—and all of this can be
accomplished with consideration of articulatory movements of the lips, teeth, and
tip of the tongue. However, additional studies are needed to determine the effect of
articulation on vocal resonance, and it remains unknown what clinical outcomes
may result from articulatory focus in voice therapy. The current findings show that
over-articulation sounds louder and better projected than habitual speech, so
articulatory efforts may be useful in addressing therapy goals to increase vocal
loudness.
In conclusion, the current study provides evidence that the degree of articulation is strongly and positively related to the perceived loudness of the voice.
When speakers used a style of over-articulation, their speech was perceived by
listeners to be louder and better projected. Future studies may focus on pathological
populations, generalizing the strategies to conversation, and determining the degree
of loudness differences between conditions. This is a preliminary study that merely
begins to advocate for including articulation therapy as a component of voice care.
APPENDIX A. PERFORMANCE MONOLOGUE
From William Shakespeare’s King Lear
Lear:
Blow, winds, and crack your cheeks! Rage! Blow!
You cataracts and hurricanes, spout
Till you have drench'd our steeples, drown'd the cocks!
You sulph'rous and thought-executing fires,
Vaunt-couriers to oak-cleaving thunderbolts,
Singe my white head! And thou, all-shaking thunder,
Strike flat the thick rotundity o' th' world,
Crack Nature's moulds, all germains spill at once,
That makes ingrateful man!
Rumble thy bellyful! Spit, fire! Spout, rain!
Nor rain, wind, thunder, fire are my daughters.
I tax not you, you elements, with unkindness.
I never gave you kingdom, call'd you children,
You owe me no subscription. Then let fall
Your horrible pleasure. Here I stand your slave,
A poor, infirm, weak, and despis'd old man.
But yet I call you servile ministers,
That will with two pernicious daughters join
Your high-engender'd battles 'gainst a head
So old and white as this! O! O! 'tis foul!
APPENDIX B. LESSAC’S CONSONANT ORCHESTRA
APPENDIX C. OVER-ARTICULATION PRACTICE LISTS
Word List

firsts seconds thirds fourths fifths sixths sevenths eighths ninths tenths elevenths twelfths thirteenths fourteenths fifteenths sixteenths seventeenths eighteenths nineteenths patience patients petitions entrance entrants thieves Thebes fines finds Ben’s bends bashes batches tracks tracts acts axe asks tennis tens tends tense tents tenths whirls worlds whirly worldly wouldst couldst shouldst wouldn’t couldn’t shouldn’t didn’t hadn’t liaison serendipity synergistic extraterritorialism rather recalcitrance recapitulative recapitulance reconnaissance January February Wednesday characteristic

Word Combinations

grab it stop up bad actor get out drag along back away arrange everything catch on that’s enough leads on run off home owner give away enough of it because of it missed out on it breathe in birth of a nation massage it wash up sail away over all strong executive this is it

Linking Consonants

sob sister keep this stand back what for stack pack that’s mine leave soon has been loose talk smooth surface wisdom tooth wash clean hill country night report predict weather judge carefully room temperature hot wind those ships take time big deal canned goods watch Germany exciting game that’s bad business mysterious witch dark night ask not why

REFERENCES
Acker, B. F. (1987). Vocal tract adjustments for the projected voice. Journal of Voice,
1, 77-82.
Berry, C. (1973). Voice and the actor. New York: Wiley.
Berry, D. A., Verdolini, K., Montequin, D. W., Hess, M. M., Chan, R. W., & Titze, I.
R. (2001). A quantitative output-cost ratio in voice production. Journal of
Speech, Language, and Hearing Research, 44, 29-37.
Bradlow, A. R., Kraus, N., & Hayes, E. (2003). Speaking clearly for children with
learning disabilities: Sentence perception in noise. Journal of Speech,
Language, and Hearing Research, 46, 80-97.
Brin, M. F., Velickovic, M., Ramig, L. O., & Fox, C. (2004). Hypokinetic laryngeal
movement disorders. In R. D. Kent (Ed.), The MIT encyclopedia of
communication disorders. Cambridge: The MIT Press.
Caissie, R., McNutt Campbell, M., Frenette, W. L., Scott, L., Howell, I., & Roy, A.
(2005). Clear speech for adults with a hearing loss: Does intervention with
communication partners make a difference? Journal of the American Academy of
Audiology, 15, 157-171.
Cookman, S., & Verdolini, K. (1999). Interrelation of mandibular laryngeal
functions. Journal of Voice, 13, 11–24.
Dromey, C., & Ramig, L. (1998). Intentional changes in sound pressure level and
rate: Their impact on measures of respiration, phonation, and articulation.
Journal of Speech, Language, and Hearing Research, 41, 1003–1018.
Dromey, C., Nissen, S. L., Roy, N., & Merrill, R. M. (2008). Articulatory changes
following treatment of muscle tension dysphonia: Preliminary acoustic
evidence. Journal of Speech, Language, and Hearing Research, 51, 196-208.
Emerich, K., Titze, I. R., Svec, J. G., Popolo, P. S., & Logan, G. (2005). Vocal range
and intensity in actors: A studio versus stage comparison. Journal of Voice,
19, 78-83.
Erber, N. (1993). Communication and Adult Hearing Loss. Melbourne: Clavis
Publishing.
Erber, N. (1996). Communication Therapy for Adults with Sensory Loss. Melbourne:
Clavis Publishing.
Fairbanks, G. (1960). Voice and articulation drillbook (2nd ed.). New York: Harper &
Row.
Kim, Y., & Kuo, C. (2012). Effect of level of presentation to listeners on scaled
speech intelligibility of speakers with dysarthria. Folia Phoniatrica et
Logopaedica, 64, 26-33.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for
categorical data. Biometrics, 33, 159–174.
Lessac, A. (1967). The use and training of the human voice. New York: Drama Book.
Linklater, K. (1970). Freeing the natural voice. New York: Drama Book.
Machlin, E. (1966). Speech for the stage. New York: Theatre Arts Books.
Master, S., De Biase, N., Chiari, B. M., & Laukkanen, A. M. (2008). Acoustic and
perceptual analyses of Brazilian male actors’ and nonactors’ voices: Long-term average spectrum and the “actor’s formant.” Journal of Voice, 22, 146-154.
Mayer, L. (1968). Fundamentals of voice and diction. Dubuque, IA: William C.
Brown.
McClean, M. D., & Tasko, S. M. (2002). Association of orofacial with laryngeal and
respiratory motor output during speech. Experimental Brain Research, 146,
481–489.
Michel, J. F., & Willis, R. A. (1983). An acoustical study of voice projection. In:
V. L. Lawrence (Ed.). Transcripts of the twelfth symposium: Care of the
professional voice (pp. 52-56). New York: The Voice Foundation.
Neel, A. T. (2009). Effects of loud and amplified speech on sentence and word
intelligibility in Parkinson disease. Journal of Speech, Language, and
Hearing Research, 52, 1021-1033.
Park, S. A. (1997). Voice as a source of creativity for acting training, rehearsal, and
performance. In M. Hampton & B. Acker (Eds.), The vocal vision: Views on
voice by 24 leading teachers, coaches & directors (pp. 107-119). New York:
Applause.
Peterson, K. L., Verdolini-Marston, K., Barkmeier, J. M., & Hoffman, H. T. (1994).
Comparison of aerodynamic and electroglottographic parameters in
evaluating clinically relevant voicing patterns. Annals of Otology, Rhinology,
& Otolaryngology, 103, 335-346.
Picheny, M. A., Durlach, N. I., & Braida, L. D. (1985). Speaking clearly for the hard
of hearing I: Intelligibility differences between clear and conversational
speech. Journal of Speech and Hearing Research, 28, 96-103.
Raphael, B., & Scherer, R. (1987). Voice modifications of stage actors: Acoustic
analyses. Journal of Voice, 1, 83-87.
Rodenburg, P. (2000). The actor speaks: Voice and the performer. England: Palgrave
Macmillan.
Sapir, S., Spielman, J. L., Ramig, L. O., Story, B. H., & Fox, C. (2007). Effects of
intensive voice treatment (the Lee Silverman Voice Treatment [LSVT]) on
vowel articulation in dysarthric individuals with idiopathic Parkinson
disease: Acoustic and perceptual findings. Journal of Speech, Language, and
Hearing Research, 50, 899-912.
Schulman, R. (1989). Articulatory dynamics of loud and normal speech. The Journal
of the Acoustical Society of America, 85, 295–312.
Schum, D. J. (1996). Intelligibility of clear and conversational speech of young and
elderly talkers. Journal of the American Academy of Audiology, 7, 212–218.
Schum, D. J. (1997). Beyond hearing aids: Clear speech training as an intervention
strategy. Hearing Journal, 50, 36–39.
Silverbush, R., & Plotkin, S. (2002). Speak the speech! Shakespeare’s monologues
illuminated. New York: Faber and Faber.
Sundberg, J. (1987). The science of the singing voice. DeKalb, IL: Northern Illinois
University Press.
Titze, I. R. (1984). Principles of voice production. New York: Prentice Hall.
Tjaden, K., & Wilding, G. E. (2004). Rate and loudness manipulations in dysarthria:
Acoustic and perceptual findings. Journal of Speech, Language, and Hearing
Research, 47, 766-783.
Tye-Murray, N. (1998). Foundations of Aural Rehabilitation. San Diego: Singular
Publishing.
Uchanski, R. M. (2005). Clear speech. In D. B. Pisoni & R. E. Remez (Eds.), The
handbook of speech perception (pp. 207-235). Malden: Blackwell Publishing.
Verdolini, K. (2000). Case study: Resonant voice therapy. In J. Stemple (Ed.), Voice
therapy: Clinical studies (2nd ed., pp. 46-62). San Diego: Singular Publishing.
Verdolini-Marston, K., Drucker, D.G., Palmer, P.M., & Samawi, H. (1998).
Laryngeal adduction in resonant voice. Journal of Voice, 12, 315-327.
Wohlert, A. B., & Hammen, V. L. (2000). Lip muscle activity related to speech rate
and loudness. Journal of Speech, Language, and Hearing Research, 43, 1229-1239.