The Laryngoscope
© 2013 The American Laryngological, Rhinological and Otological Society, Inc.
Perceived Vocal Fatigue and Effort in Relation to Laryngeal
Functional Measures in Paresis Patients
Sheila V. Stager, PhD, CCC-SLP; Steven A. Bielamowicz, MD
Objectives/Hypothesis: To determine if differences in objective measures of laryngeal function can meaningfully explain
different levels of self-perceptions of effort or fatigue in patients with vocal fold paresis.
Study Design: A retrospective chart review of 72 patients with vocal fold paresis diagnosed using laryngeal electromyography, who had either been observed (n = 21), treated only by injection (n = 24), or treated only by surgery (n = 27).
Methods: Before and after treatment/observation, patients’ subjective ratings of severity of vocal effort and fatigue
were assessed using the Glottal Function Index. Laryngeal function was assessed using maximum phonation time and translaryngeal flow.
Results: None of the variables demonstrated a significant linear change across time. Post hoc Tukey analyses following
analysis of variance (ANOVA) found significant differences in flow among three groups, those rating symptoms of effort as no
problem, moderate problem, or severe problem. Post hoc Tukey analyses following ANOVA found significant differences in the
amount that flow changed among three groups, those demonstrating no difference, minor differences, or major differences in
ratings of effort before and after treatment.
Conclusions: Changes in reported symptom severity of effort were related to changes in translaryngeal midvowel flow
that were not explained by passage of time.
Key Words: Vocal effort, laryngeal electromyography, translaryngeal flow.
Level of Evidence: 4.
Laryngoscope, 124:1631–1637, 2014
INTRODUCTION
A significant number of studies have examined
whether measures obtained from one modality of evaluating voice correlate with measures obtained from other
modalities. The rationale is twofold: to help reduce the
number of outcome measures that need to be obtained
and to validate that two measures are assessing similar
aspects of the disorder. The results have been mixed.
For example, acoustic measures such as jitter and
shimmer have not been shown to be highly correlated
with perceptual measures such as breathiness, roughness, or hoarseness from sustained vowel productions.1
The overall scores from quality-of-life rating scales (i.e.,
Voice Handicap Index [VHI]) have shown poor agreement with acoustic measures.2–4 However, good correlations have been found between normalized measures of
glottal gap size from endoscopic examinations and aerodynamic measures of maximum phonation time (MPT)
and mean flow rate.5 Moderate correlations have been
reported between subscales of the VHI and MPT.6 In
doing this type of research, it is important to select variables that are both representative of the disorder and
can be validly assessed using some numeric rubric.
From the Voice Treatment Center, Medical Faculty Associates,
Department of Surgery, Division of Otolaryngology, The George
Washington University Medical Center, Washington, DC, U.S.A.
Editor’s Note: This Manuscript was accepted for publication
October 29, 2013.
Presented at the 33rd Annual Symposium, The Voice Foundation,
Philadelphia, Pennsylvania, U.S.A., June 2–6, 2004.
The authors have no funding, financial relationships, or conflicts
of interest to disclose.
Send correspondence to Sheila V. Stager, PhD, Voice Treatment
Center, Medical Faculty Associates, 2021 K Street NW, Suite 206,
Washington, DC 20006. E-mail: [email protected]
DOI: 10.1002/lary.24493
Laryngoscope 124: July 2014
Self-reported reduction in symptom severity is one
of the most salient outcome measures. Patients with
vocal fold paresis report increased effort in speaking as
well as vocal fatigue, so documenting reduced severity
for these two symptoms would be important treatment
outcome measures. The severities of vocal effort and
fatigue are rated separately in the Glottal Function
Index (GFI).7 This index was developed and validated by
comparing total scores with the endoscopic finding of
incomplete glottal closure—the higher the total GFI
score, the greater the glottal insufficiency. It assesses
the severity of symptoms on a six-point Likert scale,
where 0 represents no problem with the symptom, and 5
represents a severe problem. These different ratings
allow for categorizing symptom severity, categories that
can then be used to compare means of absolute values of
objective measures. Differences in ratings of symptom
severity across time in individuals allow another type of
categorization that can then be used to compare changes
in means of objective measures across the same time
frame. The goal would be to give a better idea about
functional factors that may be important to subjective
percepts of symptom severity.
Vocal fatigue could result from increased vocal
effort, which is why studies of vocal fatigue have often
assessed the degree of vocal effort. For example, changes in
vocal effort have generally been assessed in studies of
healthy voices designed to fatigue the voice using paradigms such as prolonged phonation tasks at high intensities and/or high pitch.8–12 Three other studies have
used self-rated visual analog scales, rather than a Likert
scale of symptom severity, to assess the degree of vocal
tiredness, a synonym for fatigue.13–15 Given this potential
interaction, it is important to examine the correspondence between
numerical ratings of effort and fatigue from the GFI. In
addition, including data from both before and after treatment
provides a more equal representation across all six rating
categories (the six points on the Likert scale), because ratings of severe symptom severity would likely be found
at the pretreatment examination, and ratings of
no or mild symptom severity would likely be found
at the post-treatment evaluation.
Stager and Bielamowicz: Self-Ratings of Laryngeal Function Measures
If functional measures are shown to significantly
differ depending on ratings of vocal fold effort and/or
fatigue, then our understanding of the physiology behind
those functional measures may in turn help us understand what might be involved in the patient’s percept of
symptom severity. Thus, the selection of the functional
measures is important and should be ones affected by
nerve damage to the system. Two measures of laryngeal
function typically reported in treatment outcome studies
are transglottal flow during voicing and MPT. Both of
these measures are affected by the size of the glottal
gap. Flow has also been reported to be significantly
related to reduced volitional activity as evidenced by a
reduced interference pattern seen in thyroarytenoid and
cricothyroid muscle laryngeal electromyography (EMG)
during high-pitched voicing, as well as the ratings of
number of normal motor units seen in these muscles
during volitional activity such as voicing recorded from
laryngeal EMG.16 In these studies, our methodology has
been to use single rater scales of activation.16
Patients in this study were compared across time to
get a broader range of ratings. However, with paresis
patients, there is always the possibility that the process of
spontaneous recovery may be occurring across time. To
understand how the passage of time may affect the changes
in ratings and measures, we compared patients who chose
not to be treated across similar time increments.
Once it was determined that rating changes were
not explained solely by passage of time, a two-pronged
approach was used to assess relationships between self-ratings and functional measures. First, we considered if
a relationship was present between the reported degree
of severity of the symptom and the degree of abnormality of the functional measure. Second, if a relationship
exists between a functional measure and a self-reported
measure, then the same degree of change should be in
evidence for both types of measures following treatment
or with spontaneous recovery.
Thus, the purpose of this study was to determine if
differences in objective measures of laryngeal function
could meaningfully explain different levels of self-perceptions of effort or fatigue in patients with vocal fold
paresis. We also assessed whether time following onset
affected symptom severity ratings.
MATERIALS AND METHODS
Subjects
This was a retrospective study, approved by the institutional review board of The George Washington University Medical Center. The charts of 72 patients (37 male), within the age
range of 21 to 89 years (mean, 58 years) were reviewed. All subjects were diagnosed with vocal fold paresis/paralysis based on
signs of reinnervation, denervation, and/or “poor tone” (defined
as reduced number of motor units and reduced volitional activity as evidenced by a reduced interference pattern16) in the
EMG signal from any thyroarytenoid and/or cricothyroid muscle. Possible etiologies were idiopathic, iatrogenic, viral, or intubation injuries. Of the 72 patients, 49 patients (26 male), ages
ranging from 21 to 87 years, had received a single type of treatment and had returned for at least one follow-up clinical visit
after treatment. If data were collected from more than one
follow-up visit, then the data from the visit most separated in
time from the initial evaluation were used. The time between
onset of symptoms and initial evaluation ranged from 0.5 to 12
months (mean, 4.2 months; standard deviation [SD], 3 months).
The time between onset of symptoms and final evaluation
ranged from 1.5 to 39 months (mean, 11.3 months; SD, 6.9
months).
The other 23 patients (11 male), with an age range of
36 to 89 years, received no treatment. Each subject had
returned for at least one follow-up clinical visit after a
period of time for observation. The time between onset of
symptoms and initial evaluation ranged from 0.5 to 7
months (mean, 3 months). The time between onset of symptoms and final observation ranged from 3.25 to 29 months
(mean, 10 months).
Assessment
Subjective rating scales. Patients were asked to complete the GFI.7 This index asked the patient to rate the severity
of their voice symptoms during the past month using a Likert
scale from 0 (no problem) to 5 (severe problem). Subjects were
asked to rate “speaking takes effort,” “throat discomfort or pain
after using your voice,” “vocal fatigue (voice weakens as you
talk),” and “voice cracks or sounds different.”
For statistical analysis of the relationships between symptom rating and measures of laryngeal function, instead of dividing subjects into six groups, one group per number on the
rating scale (0, 1, 2, 3, 4, 5), subjects were divided into three
groups: no problem (rating of 0 or 1), moderate problem (rating of 2
or 3), and severe problem (rating of 4 or 5). Because most of
the ratings from the initial visit were 4 and 5, and most of the
ratings from the final visit were 0 and 1, the initial and final
visits were combined for this analysis so that a relatively equal
representation of each rating was used. For statistical analysis
of the relationships between change in symptom ratings before
and after treatment and changes in functional measures before
and after treatment, instead of dividing subjects into six groups,
one group per difference in the rating (0, 1, 2, 3, 4, 5), subjects
were divided into three groups: no difference (differences
between ratings of 0 or 1), minor difference (differences between
ratings of 2 or 3), and a major difference (differences between
ratings of 4 or 5).
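The two groupings described above can be sketched in a few lines of Python. This is only an illustration of the stated cut points, not the authors' analysis code: the function names and group labels are hypothetical, and taking the absolute value of the rating difference is an assumption, since the text does not say how a worsening rating was handled.

```python
def severity_group(rating: int) -> str:
    """Collapse a 0-5 GFI item rating into the three analysis groups."""
    if not 0 <= rating <= 5:
        raise ValueError("GFI item ratings range from 0 to 5")
    if rating <= 1:
        return "0-1"   # rated as no problem
    if rating <= 3:
        return "2-3"   # middle category
    return "4-5"       # rated as a severe problem


def change_group(initial: int, final: int) -> str:
    """Collapse the difference between two ratings into three groups."""
    for rating in (initial, final):
        if not 0 <= rating <= 5:
            raise ValueError("GFI item ratings range from 0 to 5")
    # Assumption: only the size of the difference matters, not its direction.
    difference = abs(initial - final)
    if difference <= 1:
        return "no difference"
    if difference <= 3:
        return "minor difference"
    return "major difference"


# Hypothetical patient: effort rated 5 at the initial visit and 1 at the final visit.
print(severity_group(5), severity_group(1), change_group(5, 1))
```

Collapsing six ratings into three groups trades resolution for larger group sizes, which matters for the group comparisons that follow.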
Laryngeal Electromyography and Laryngeal
Function Measures
Details of the laryngeal electromyography examination
and laryngeal function measures are identical to those previously published by our group.16,17
Fig. 1. The relationship between time following onset (x-axis) and the change in effort rating between initial and final visit (y-axis) for those individuals who were observed. The linear regression and equation are also included.
RESULTS
The first issue to address was if there was a relationship between the two subjective ratings, vocal effort
and vocal fatigue. If patients routinely gave vocal fatigue
and vocal effort the same rating, then there would be no
reason to study both self-rating measures separately.
Overall, 49% of the effort and fatigue ratings were identical. Of the identical ratings from the initial
examination, 51% (about half) were the maximum of 5, representing a ceiling effect. Of
the identical ratings from the final examination, 54% (again about half)
were the minimum of 0, representing
a floor effect. Forty-two percent of the rating changes
in effort and changes in fatigue following observation/
treatment were identical. Because fewer than half of the ratings
agreed, separate analyses were done for fatigue and effort.
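The agreement figures above are simple proportions of paired ratings. A minimal sketch of the calculation follows, assuming paired lists of effort and fatigue ratings. The function names and the six example pairs are hypothetical and are not the study data.

```python
from typing import Sequence


def percent_identical(effort: Sequence[int], fatigue: Sequence[int]) -> float:
    """Percentage of patients who gave effort and fatigue the same rating."""
    if len(effort) != len(fatigue) or not effort:
        raise ValueError("need non-empty, paired ratings")
    same = sum(e == f for e, f in zip(effort, fatigue))
    return 100.0 * same / len(effort)


def percent_of_ties_at(effort: Sequence[int], fatigue: Sequence[int], value: int) -> float:
    """Among identical pairs, the percentage that sit at a given rating (e.g., 5 or 0)."""
    if len(effort) != len(fatigue):
        raise ValueError("ratings must be paired")
    ties = [e for e, f in zip(effort, fatigue) if e == f]
    if not ties:
        raise ValueError("no identical ratings")
    return 100.0 * sum(t == value for t in ties) / len(ties)


# Hypothetical ratings for six patients (illustration only).
effort_ratings = [5, 4, 3, 5, 0, 1]
fatigue_ratings = [5, 3, 3, 4, 0, 2]
print(percent_identical(effort_ratings, fatigue_ratings))       # share of identical pairs
print(percent_of_ties_at(effort_ratings, fatigue_ratings, 5))   # share of ties at the ceiling
```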
Our first question was whether a consistent
improvement was demonstrated across the first year following onset for each of the measures. Only patients
who chose to be observed for a period of time were evaluated. Change measures were calculated by subtracting
the ratings or measures of the second visit from the
ratings or measures of the first visit. If a linear
relationship exists between time following onset and
change in symptom rating, one would predict that the
more months following onset, the greater would be the
change in rating. Using linear regression, no significant
relationship existed between length of time following
onset and amount of change in the variables of interest
(Figs. 1–4).
Fig. 2. The relationship between time following onset (x-axis) and the change in fatigue rating between initial and final visit (y-axis) for those individuals who were observed. The linear regression and equation are also included.
Fig. 3. The relationship between time following onset (x-axis) and the change in maximum phonation time (MPT) in seconds between initial and final visit (y-axis) for those individuals who were observed. The linear regression and equation are also included.
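The time-course check described in the text (change between visits regressed on months since onset) can be sketched as below. This is an ordinary least-squares fit written with the Python standard library only. The names `change_scores`, `fit_line`, and `LinearFit` are hypothetical, the example values are invented for illustration, and the sketch stops at the t statistic because converting it to a P value needs a t distribution that the standard library does not provide.

```python
import math
from typing import List, NamedTuple, Sequence


class LinearFit(NamedTuple):
    slope: float
    intercept: float
    r: float        # Pearson correlation between x and y
    t_stat: float   # t statistic for the slope, n - 2 degrees of freedom


def change_scores(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Change measure as described in the text: first visit minus second visit."""
    if len(first) != len(second):
        raise ValueError("visits must be paired")
    return [a - b for a, b in zip(first, second)]


def fit_line(x: Sequence[float], y: Sequence[float]) -> LinearFit:
    """Least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    unexplained = 1.0 - r * r
    if unexplained <= 0:
        t_stat = math.copysign(math.inf, r)  # perfect fit
    else:
        t_stat = r * math.sqrt((n - 2) / unexplained)
    return LinearFit(slope, intercept, r, t_stat)


# Hypothetical observed patients: months since onset and effort ratings at two visits.
months = [3, 6, 9, 12]
changes = change_scores([4, 3, 5, 4], [3, 3, 3, 3])
print(fit_line(months, changes))
```

A slope near zero with a small t statistic would correspond to the finding reported above, that change did not scale with time since onset.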
The next question was whether a relationship
existed between the reported degree of severity of effort
or fatigue and the degree of abnormality on the functional measures of MPT and flow. Single-factor analysis
of variance (ANOVA) of the functional measures using
the rating categories from the rating scales were
completed. If significant between-group differences were
found, then post hoc Tukey analyses were completed. A
relationship was defined as demonstrating both significant differences using the ANOVA analysis and significant differences between groups using the post-hoc
Tukey analysis. The single-factor ANOVA analyses
between effort ratings and MPT (P = .0005), effort ratings and flow (P = .000006), fatigue ratings and MPT
(P = .0003), and fatigue ratings and flow (P = .02) were
significant. Only for effort ratings
and flow did every pair of groups differ
significantly (Tukey, P < .01). Table I summarizes
the means and standard deviations of the functional
measures based on categories defined by subject rating.
Fig. 4. The relationship between time following onset (x-axis) and the change in flow in milliliters per second between the initial and final visit (y-axis) for those individuals who were observed. The linear regression and equation are also included.
TABLE I.
Summary of the Means and Standard Deviations of the Functional Measures Based on Categories Defined by Subject Rating.
Symptom   Rating   Mean MPT, s   SD of MPT, s   Mean Flow, mL/s   SD of Flow, mL/s
Effort    0–1      18.6          8.8            267*              135.5*
Effort    2–3      17.1          9.4            420*              319.3*
Effort    4–5      10.9          7.9            560*              365.5*
Fatigue   0–1      17.2          9              341               287.8
Fatigue   2–3      18.8          9.9            361               281.4
Fatigue   4–5      11.7          7.7            502               325.8
All analysis of variance analyses were significant among groups.
*Significant differences between each pair of groups.
MPT = maximum phonation time; SD = standard deviation.
From Table I, mean values for the groups that rated
the symptom as 0 or 1 (no problem) or rated the symptom as 2 or 3 (moderate problem) were within normal
limits for MPT, so these two groups did not significantly
differ. For flow, the mean value for the group that rated
the symptom as no problem was within normal limits,
but not for the group that rated the symptoms as a moderate problem, so these two groups did significantly differ. The mean values for the group that rated the
symptom as a 4 or 5 (severe problem) were outside normal limits for both MPT and flow. Because we defined
that a relationship only existed if there were significant
differences between all three groups (no problem, moderate problem, and severe problem), a relationship was
only established between effort ratings and flow. Thus,
the remainder of the results will only focus on analyses
between these two variables.
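The group comparison described above rests on a single-factor ANOVA. A minimal sketch of the F statistic is given below using only the Python standard library. The function name and the three small example groups are hypothetical and are not the study's flow data. The sketch does not compute the P value or the post hoc Tukey comparisons, which need the F and studentized range distributions; a statistics package (for example, SciPy's `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`) would be one way to obtain them.

```python
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and spare observations")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for the three rating groups (0-1, 2-3, 4-5), illustration only.
low, middle, high = [1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]
print(one_way_anova_f([low, middle, high]))
```

A significant F only says that at least one group differs. The stricter definition used in this study also requires every pair of groups to differ on the post hoc test.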
Having established the relationship between degree
of severity of effort ratings and degree of abnormality of
flow measures, the final step was to determine whether
a relationship existed between reported change in the
degree of severity of a symptom and the degree of
change in the functional measure following treatment.
Single-factor ANOVA analyses of the functional measures using the differences in rating categories, followed
by post hoc Tukey analyses, were completed. Figure 5
illustrates the relationship between change in degree of
symptom severity and change in functional measure.
As can be seen in Figure 5, no difference between
ratings pre- to postevaluation was associated with very
small changes in flow (mean, 68 mL/s), minor differences
between ratings pre- and postevaluation were associated
with minor changes in flow (mean, 193 mL/s), and major
differences between ratings pre- to postevaluation were
associated with major changes in flow (mean, 359 mL/s).
The ANOVA analysis revealed significant differences in
the change in flow based on the three categories of
degree of change (P = .024). Post hoc Tukey analysis
revealed a significant difference between each pair of
groups (P < .05).
Fig. 5. Changes in effort ratings (x-axis) that correspond to the changes in flow in milliliters per second (y-axis).
Having confirmed a relationship between effort
ratings and flow, it would be reasonable to determine
the sensitivity/specificity of these measures. Using the
effort rating as the variable that determines if disease
is present or not, we can determine the number of individuals whose flow measures were not within normal
limits versus those whose measures were within normal
limits. Sensitivity was 0.79, but specificity was only
0.46.
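The sensitivity and specificity calculation can be sketched as follows, treating the effort rating as the reference for whether the symptom is present and an abnormal flow measure as the positive test. The function name and the six example patients are hypothetical. How a rating was converted to "present" and which flow limits counted as normal are not stated in this section, so the sketch takes those decisions as already-made booleans.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    symptom_present: Sequence[bool], flow_abnormal: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity and specificity of abnormal flow against the effort rating."""
    if len(symptom_present) != len(flow_abnormal):
        raise ValueError("inputs must be paired")
    tp = fn = tn = fp = 0
    for present, abnormal in zip(symptom_present, flow_abnormal):
        if present and abnormal:
            tp += 1      # symptom reported, flow outside normal limits
        elif present:
            fn += 1      # symptom reported, flow within normal limits
        elif abnormal:
            fp += 1      # no symptom reported, flow outside normal limits
        else:
            tn += 1      # no symptom reported, flow within normal limits
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without the symptom")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classification of six patients (illustration only).
reports_effort = [True, True, True, True, False, False]
abnormal_flow = [True, True, True, False, True, False]
print(sensitivity_specificity(reports_effort, abnormal_flow))
```

Low specificity, as reported above, means many patients who no longer report effort still show flow outside normal limits.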
DISCUSSION
The purpose of this study was to determine if differences in objective measures of laryngeal function could
meaningfully explain different levels of self-perceptions
of effort or fatigue in patients with vocal fold paresis.
The results indicated that a significant relationship
existed between self-ratings of effort and the functional
variable of flow. Increased vocal effort in the healthy
voice can be associated with either increasing the power
source and/or increasing medial compression of the vocal
folds. In paresis patients, because the nerves innervating the muscles responsible for adduction and medial
compression are injured either unilaterally or bilaterally,
they can only partially bring the vocal fold toward midline, resulting in a glottal gap. One possible compensatory strategy for overcoming the lack of vocal intensity,
because air is escaping through the glottal gap rather
than being used to power vibration of the vocal folds,
would be to try to increase the power source. Individuals with vocal fold paresis have demonstrated increased
subglottal pressure (Psub), estimated indirectly during production of voiceless bilabial
plosives.18 Thus, the relationship between increased flow
and perceived effort could be explained by this compensatory strategy.
In a previous study, increased flow was reported to
be significantly related to “poor tone,”18 as defined by
fewer normal motor units and reduced volitional activity
as evidenced by a reduced interference pattern of motor
neurons during modal voice followed by high-pitched
voice. This lack of activation of motor neurons would
reduce the ability of the affected vocal fold(s) to both
adduct and medially compress, which would reduce the
ability to narrow the glottal gap. Just to get more sound
production, the individual may use the same compensatory behavior of trying to increase the power supply,
thus increasing perceived effort.
Thus, getting the vocal folds to midline to close the
glottal gap may improve symptoms of effort; however,
this alone may not be enough to eliminate vocal effort. It
may also require the restoration of muscle tone. Currently, there is no treatment that will improve muscle
tone, unless spontaneous recovery occurs. It would be
interesting to determine if voice therapy using vocal
exercises following medialization in those individuals
who still report effort would be helpful. An analysis of
subjects treated with reinnervation procedures would be
another interesting follow-up study to determine the
effects of neural regrowth on the symptoms of effort.
This study also helps to validate the use of the
GFI7 as a self-rating measure of vocal symptoms. Because at least one of the symptoms it assesses
relates to a measure of vocal function, the findings give further reason
to use this index in assessing patients with paresis.
One criticism of the current study is the lack of
blinded, multiple raters of the EMG activity. The current
study only used a single rater of EMG findings in a nonblinded fashion. The study could be strengthened by the
use of multiple blinded raters of the EMG signals using
the tasks described previously. Blinding of the raters
would significantly strengthen the study and allow for
an analysis of the reliability of the rating scales used.
Another method that could be used to evaluate neuromuscular recruitment would be to apply a numerical
analysis of recruitment. However, the density of neuromuscular activity in the larynx is difficult to quantify because of the
high number of motor units seen in this small muscle
with a dense neuromuscular territory.19
Another method of EMG analysis that might reduce
the potential bias of EMG interpretation seen in the current study would be to use a “turns and amplitude” analysis in the larynx.20 This technique
performs a quantified analysis of the EMG interference signal
and holds promise for reducing the
rating bias of the technique used in this article.
Thus, this study has provided us with some insight
into the relationship between perceived, self-rated variables and vocal function in patients with paresis. Clearly,
further work needs to be done with a larger group. A
prospective study would also be beneficial.
CONCLUSION
Changes in reported symptom severity of effort
were related to changes in translaryngeal midvowel
flow. Increased flow has also been demonstrated to relate
to reduced volitional activity as evidenced by a reduced
interference pattern during volitional tasks from electromyography. Reduced neuromuscular activation, as seen
in patients with paresis, may result in poor medial compression, which may be compensated for by increasing
the power source, which in turn produces increased flow
and perceived effort.
BIBLIOGRAPHY
1. Maryn Y, Roy N, De Bodt M, Van Cauwenberge P, Corthals P. Acoustic
measurement of overall voice quality: a meta-analysis. J Acoust Soc Am
2009;126:2619–2634.
2. Niebudek-Bogusz E, Woznicka E, Zamyslowska-Szmytke E, Sliwinska-Kowalska M. Correlation between acoustic parameters and Voice Handicap Index in dysphonic teachers. Folia Phoniatr Logop 2010;62:55–60.
3. Wheeler KM, Collins SP, Sapienza CM. The relationship between VHI
scores and specific acoustic measures of mildly disordered voice production. J Voice 2006;20:308–317.
4. Woisard V, Bodin S, Yardeni E, Puech M. The voice handicap index: correlation between subjective patient response and quantitative assessment
of voice. J Voice 2007;21:623–631.
5. Omori K, Slavit DH, Kacker A, Blaugrund SM. Influence of size and etiology of glottal gap in glottic incompetence dysphonia. Laryngoscope 1998;
108(4 pt 1):514–518.
6. Schindler A, Mozzanica F, Vedrody M, Maruzzi P, Ottaviani F. Correlation
between the Voice Handicap Index and voice measurements in four
groups of patients with dysphonia. Otolaryngol Head Neck Surg 2009;
141:762–769.
7. Bach KK, Belafsky PC, Wasylik K, Postma GN, Koufman JA. Validity and
reliability of the glottal function index. Arch Otolaryngol Head Neck
Surg 2005;131:961–964.
8. Welham NV, Maclagan MA. Vocal fatigue: current knowledge and future
directions. J Voice 2003;17:21–30.
9. Verdolini K, Titze IR, Fennell A. Dependence of phonatory effort on hydration level. J Speech Hear Res 1994;37:1001–1007.
10. Bagnall AD, McCulloch K. The impact of specific exertion on the efficiency
and ease of the voice: a pilot study. J Voice 2005;19:384–390.
11. Shewmaker MB, Hapner ER, Gilman M, Klein AM, Johns MM III. Analysis of voice change during cellular phone use: a blinded controlled study.
J Voice 2010;24:308–313.
12. Chang A, Karnell MP. Perceived phonatory effort and phonation threshold
pressure across a prolonged voice loading task: a study of vocal fatigue.
J Voice 2004;18:454–466.
13. Buekers R. Are voice endurance tests able to assess vocal fatigue? Clin
Otolaryngol Allied Sci 1998;23:533–538.
14. Laukkanen AM, Ilomaki I, Leppanen K, Vilkman E. Acoustic measures and
self-reports of vocal fatigue by female teachers. J Voice 2008;22:283–289.
15. Sodersten M, Ternstrom S, Bohman M. Loud speech in realistic environmental noise: phonetogram data, perceptual voice quality, subjective ratings, and gender differences in healthy speakers. J Voice 2005;19:29–46.
16. Bielamowicz S, Stager SV. Diagnosis of unilateral recurrent laryngeal
nerve paralysis: laryngeal electromyography, subjective rating scales,
acoustic and aerodynamic measures. Laryngoscope 2006;116:359–364.
17. Stager SV, Bielamowicz S. Evidence of return of function in patients with
vocal fold paresis. J Voice 2010;24:614–622.
18. Ryu IS, Nam SY, Han MW, Choi S-H, Kim SY, Roh J-L. Long-term voice
outcomes after thyroplasty for unilateral vocal fold paralysis. Arch Otolaryngol Head Neck Surg 2012;138:347–351.
19. Sandbrink F. Motor unit recruitment in EMG definition of motor unit recruitment and overview. Medscape website. Available at:
http://emedicine.medscape.com/article/1141359-overview#aw2aab6b4. Accessed June 26, 2012.
20. Statham MM, Rosen CA, Nandedkar SD, Munin MC. Quantitative laryngeal electromyography: turns and amplitude analysis. Laryngoscope
2010;120:2036–2041.