Error-Revision in the Spontaneous Speech of Apraxic Speakers

BRAIN AND LANGUAGE 62, 342–360 (1998)
ARTICLE NO. BL971907
Julie M. Liss
Motor Speech Disorders Laboratory, Arizona State University
Spontaneous speech samples from four men diagnosed with apraxia of speech
were transcribed to examine the ways in which they attempted to repair their speech
errors. The study sought evidence for the presence of production or perceptual constraints in error revision and for the presence of a functional prearticulatory monitor.
Three judges independently evaluated the transcriptions and audiotapes to identify
instances in which speakers revised speech errors. They then coded the nature of
the relationship between the error and the revision. In previous reports, the form
of error repairs among normal speakers has been attributed to perceptual constraints,
that is, determined by the needs of the listener. Results of the present study suggest
that the form of some error repairs among these speakers with apraxia of speech
is not in the service of the listener; rather, it conforms with production constraints.
It may be argued that some forms of error repair evidenced by these speakers, such
as the prosodic marking of phonetic errors and prosodic marking in the temporal
domain (syllable segregation), may actually serve to exacerbate the listener’s task of
message decoding. In addition, these speakers offered little evidence of an efficient
prearticulatory monitor. The time delays between interrupting the flow of speech
in recognition of an error and the initiation of a revision suggest an impaired ability
to plan revisions prior to the production of the error. © 1998 Academic Press
This research was supported, in part, by NINCDS Grant No. N518797, by Grant No. 5 R29
DC 02672-02 from the National Institute on Deafness and Other Communication Disorders,
National Institutes of Health, and by the University of Minnesota Graduate School Grants-in-Aid. Gratitude is extended to Stephanie Spitzer, Shelley Brundage, Kristin Tjaden, Stephanie
Hanson, and Julie Campbell for their contribution to the analysis of this data set. A portion
of these data were presented at the Meeting of the Acoustical Society of America, New Orleans,
November 1992.
Address reprint requests to Julie M. Liss, Ph.D., Motor Speech Disorders Laboratory, Arizona State University, Box 871908, Tempe, AZ 85281.
0093-934X/98 $25.00
Copyright © 1998 by Academic Press
All rights of reproduction in any form reserved.
1.0 INTRODUCTION
A substantial amount of research has been conducted on the self-correction
of speech errors produced spontaneously and on speech errors induced experimentally. This work has demonstrated that people tend to correct their
speech errors in fairly systematic and predictable ways, depending on the
nature of the error and the communicative environment in which the error
occurs (Levelt & Cutler, 1983; Nooteboom, 1980). The explanation for this
predictability is a matter of some debate. Levelt (1983) and others have suggested that speakers abide by certain error-correction conventions to facilitate the listener’s continuous processing of the speech signal (Levelt & Cutler, 1983; Cutler, 1983). The implication is that a speaker understands that
their errors challenge or interfere with a listener’s ability to maintain continuous processing of the message, a dilemma Levelt has called the ‘‘continuation problem.’’ By following certain conventions in their repairs, speakers
can let listeners know how to relate the repair to the flawed original utterance.
Brédart (1991), echoing Levelt’s view, suggested that, ‘‘self-repairs are
shaped in such a way that the listener has the critical information to solve
the continuation problem as soon as the first word of the repair occurs’’ (p.
125). The repair notifies the listener that the original utterance was indeed
in error, how and where the original utterance was in error, and how that
error is to be corrected. Thus, the form of speech error repairs is determined
by perceptual constraints.
Berg (1992) challenged the purely listener-oriented perspective of speech
error correction, claiming that the patterns of data reported in the literature
reveal more about a speaker’s knowledge of her own production system than
about the speaker’s consideration of listener needs. His assertion hinges on
the observation that the repair of speech errors can be attempted only if the
error is detected by the speaker, and detection is largely listener-independent.
As Berg (1992) observed, many experimental results used to promote the
listener-oriented explanation, such as the potency of word-initial errors, can
be interpreted equally well as stemming from either production or perceptual
constraints. This is not surprising given the highly integrated
nature of speech production and perception. One develops speech by way
of alternately listening and being listened to. A lifetime of communication
breakdowns and successes no doubt shapes the amount of monitoring or
attention a speaker devotes to specific aspects of his or her output. Berg
concluded that although speakers obviously consider listener needs and adjust their output accordingly, listener-based constraints may be subordinate
to or indistinguishable from speaker-based constraints.
A slightly different interpretation of Berg’s argument is that it is not possible to parse out the relative contributions of perceptual and production constraints to error correction among normal speakers because the patterns are
not a simple by-product of the two variables. But is it possible to address
this issue in situations where the relationship between these constraint categories is altered adventitiously? Consider adult-onset apraxia of speech
(AOS) as a potential test case.
AOS is regarded as a disturbance of motor programming, typically as the
result of an anterior cortical lesion in the dominant hemisphere (Darley, Aronson, & Brown, 1975). The motor speech disturbance is presumed to be at
a planning and programming level, impairing the speaker’s ability to properly select and sequence phonemes (Wertz, LaPointe, & Rosenbek, 1984;
see also Seddoh, Robin, Sim, Hageman, Moon, & Folkins, 1996). AOS is
characterized by a slow speaking rate, a labored prosodic pattern, and the
presence of varied articulation errors, successive approximations of articulatory targets (groping), and part- and whole-word repetitions (Darley, 1982).
Although AOS often co-occurs with aphasia and dysarthria, there is no concomitant language disturbance, muscle weakness, or incoordination in its
pure form.
It can be assumed that the speech deficit in AOS (and other disorders)
alters the premorbid relationship between production and perceptual constraints. Speech production becomes more effortful, speech errors become
more frequent, and new criteria for listener satisfaction must be learned. It
also can be assumed that the speaker with AOS recognizes, at some level,
the alteration of the production–perception relationship. The very presence
of self-corrections is evidence of a self-monitoring system that is at least
partially functional (Nickels & Howard, 1995; Schlenck, Huber, & Willmes,
1987; Valdois, Joanette, & Nespoulous, 1989). Thus, by evaluating how
closely these speakers conform to the predictions offered by studies of neurologically normal speakers, we can gain insight into the underlying pathology
of the disorder and the locus of speech deficit in the speech production process, and we can hypothesize about how the relative contributions of perceptual and production constraints explain the pattern of results.
2.0 METHOD
In the context of a larger investigation on apraxia of speech, spontaneous speech samples
were elicited from four people who presented with ‘‘pure’’ apraxia of speech. These speech
samples each contain many instances of self-correction. The present investigation explored
these speech samples to determine whether overt revisions of errors produced in spontaneous
speech by apraxic subjects followed predictions generated by the study of aphasic and neurologically normal speakers. This investigation focused on frank sound and word errors that
appeared to reflect incorrect word choice or phonemic mis-selection, misplanning, or misexecution.
2.1 Subjects
Spontaneous speech samples from four speakers diagnosed with apraxia of speech (designated A1–A4) were examined for this investigation. These speakers have been the subjects
of numerous other investigations and have been described in detail in this journal and elsewhere (e.g., McNeil, Liss, Tseng, & Kent, 1990; McNeil, Weismer, Adams, & Mulligan, 1990;
Odell, McNeil, Rosenbek, & Hunter, 1991). The four men (59, 62, 54, and 72 years of age)
were free of significant concomitant dysarthria or aphasia, as determined by performance on
a battery of tests. Structural–functional exams completed by a neurologist and two certified
speech–language pathologists revealed an absence of articulatory weakness or incoordination
that would indicate a dysarthria. Test results disconfirming aphasia symptoms in these four
subjects included normal level performance on the pantomime, auditory comprehension, visual
matching, and reading comprehension subtests of the PICA; and normal level performance
on the auditory comprehension subtests of the Boston Diagnostic Aphasia Examination. Oral
speech was judged by two certified speech–language pathologists to be free of agrammatic
syntax or other language impairments. It was therefore concluded that these subjects were
reasonable exemplars of ‘‘pure’’ apraxia of speech.
All four subjects did exhibit patterns of speech that were consistent with the diagnosis of
AOS including slow rate, effortful articulatory groping, phoneme substitutions, additions and
omissions, and phoneme sequencing errors. Despite the similarities in speech error patterns,
the subjects differed in terms of the severity of their speech impairment. Subjects A1 and A2
were regarded as the least severely impaired by the judges in the present investigation and
A3 and A4 as the most severely impaired.
Brain scan data (summarized from the report of Odell et al., 1991) indicated the most severely affected brain locations were the precentral gyrus (A2, A4); the postcentral gyrus (A1,
A2, A4); the superior parietal region (A3); the supramarginal gyrus (A1, A4); the angular
gyrus (A1); and Wernicke’s area (A1). Less severely involved brain regions included the
premotor cortex (A2); Broca’s area (A2, A4); the middle temporal gyrus (A1); the insula (A1,
A2); the lenticular nuclei (A1); and the posterior internal capsule (A1). Behaviorally, three
of the four subjects (A1, A2 and A4) exhibited concomitant oral apraxia according to the
Apraxia Battery for Adults.
2.2 Speech Samples
Subjects were seated individually in a quiet room and were shown a 7-minute videotape,
without audio, of a young man and woman on a picnic. In this video, certain objects (a basket,
coffee thermos, a pizza, etc.) were shown as a part of the action several times. Subjects viewed
the videotape once and were asked to recount the story as though they were describing it to
someone who had never seen it. Subjects then watched the video and retold the story a second
time. These spontaneous speech samples were recorded in a sound-treated booth with a Shure
SM-10a microphone and Tandberg recorder.¹
¹ Stemberger (1992) compared the literature on speech errors produced in naturalistic versus
experimental settings and reported similarities as well as important differences in results. The
present procedures did not attempt to elicit speech errors in any experimental or systematic
way. Despite the stress inherent in task participation, these errors are regarded as being most
comparable to naturalistic rather than experimentally derived errors.
2.3 Transcription
The transcription and codings of the spontaneous speech samples were accomplished by
four experienced listeners, including the author of this paper. Three of the listeners were graduate assistants who had been employed approximately 6 months prior to the initiation of this
study to conduct acoustic and perceptual analyses of apraxia of speech. Their work had included the analysis of experimentally elicited phrases produced by the subjects of this study,
but the listeners had no prior experience with the spontaneous speech samples per se. All
listeners were therefore highly familiar with the speech patterns and speaking styles for this
group of subjects.
The first task was to obtain an accurate gloss of all intelligible words spoken by each of
the four speakers in their descriptions of the videoplay. To accomplish this task, one listener
orthographically transcribed all of the speech samples. This listener then used the gloss transcript in subsequent listening sessions to identify and broadly transcribe (phonetically) utterances that were deemed as distorted, unintelligible, or nonwords. The resulting transcript was
independently verified and/or modified by three other experienced listeners, yielding a total
of four transcripts for each of the eight speech samples. The four transcripts were then compared and merged to produce a composite transcript that reflected agreement of at least three
of the four listeners. Utterances within the transcripts that did not reach this level of agreement
were not considered further in any analysis of this investigation. No effort was made to resolve
discrepancies among broad transcriptions, except in rare instances where the differences affected subsequent coding.
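The three-of-four agreement rule used to build the composite transcript can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the example forms, and the premise that the four transcripts are already aligned utterance by utterance are hypothetical and are not taken from the study's materials.

```python
from collections import Counter

def merge_transcripts(transcripts, min_agree=3):
    """Build a composite transcript from aligned listener transcripts.

    For each utterance slot, keep the form shared by at least `min_agree`
    listeners; slots without that level of agreement are set to None,
    meaning they are excluded from further analysis.
    """
    composite = []
    for versions in zip(*transcripts):  # one tuple of listener versions per slot
        form, votes = Counter(versions).most_common(1)[0]
        composite.append(form if votes >= min_agree else None)
    return composite

# Hypothetical example: four listeners, three utterance slots.
listeners = [
    ["took", "papkan", "ait"],
    ["took", "papkan", "ate"],
    ["took", "papan", "eight"],
    ["took", "papkan", "ait"],
]
composite = merge_transcripts(listeners)
```

In this made-up example the first two slots reach the criterion and the third does not, so the third slot would be dropped rather than resolved.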
2.4 Coding
Following the completion of the transcription, listeners were familiarized with the coding
procedures of the speech errors and the relevant events which surround them. Listeners then
independently coded the composite transcripts while listening to the audiotapes. Because the
listeners demonstrated a reasonable level of comfort with the coding task during training and
because the conservative criterion of majority consensus was required for acceptance of a
given token, no precoding interjudge reliability measures were made.
Table 1 contains an excerpt of subject A3’s transcript on which reparandum–revision sequences are marked along with the associated coding decisions. The coding procedures and
operational definitions of the events are based on those offered by others who have investigated
these phenomena, especially Levelt (1983) and Blackmer and Mitton (1991). Some modifications to their definitions and procedures were necessary to accommodate the utterances, and
these modifications are described herein.
Levelt (1983) outlined a detailed classification system for the variety of speech errors that
are subject to repair among normal speakers. These include instances where the content of
the original message is replaced by a different message (different repairs), instances where
the words chosen to convey the message are replaced by more appropriate terminology (appropriateness repairs), and instances of frank errors in the semantic, syntactic, or phonetic form
of the words (error repairs). Each of these types of errors is classified as overt, because the
error is produced before it is intercepted. A second type, the covert error, is more difficult to
understand because these errors are presumed to exist based on the presence of repetitions,
interruptions, pauses, or editing terms such as ‘‘er’’ and ‘‘uh.’’
All of these types of repair sequences were present in the speech samples of these four
subjects but only overt error repairs were examined. This focus stems from the theoretical
predisposition of this speaker group to produce these types of errors in disproportionately large
numbers. Because speakers did not suffer concomitant aphasia, different and appropriateness
repairs were not expected to occur with exceptional frequency. Moreover, previous studies
have reported relatively infrequent occurrences of these error repairs in normal and aphasic
populations so they have not been well characterized (Levelt, 1983; Schlenck et al., 1987).
TABLE 1
Spontaneous Speech Excerpt with Associated Coding
‘‘and [he she] uh [kthu took] a bag of [papan papkan] from her basket and [ait ate] the
[pap-pa papkan] and uh a she gave the young man some [papa papkan] and he ate some
of that’’
Note. Excerpt from spontaneous speech sample of subject A3 and the associated coding
decisions. Reparandum–revision sequences are presented in brackets, and revisions are underlined. Revisions with prosodic marking are in bold print. All reparanda were interrupted immediately following their production with the exception of the second one. The reparandum of
[kthu took] was interrupted within its production. All restarts here were classified as occurring
after a very brief pause of less than 500 ms.
2.41 Error–revision sequences. The three judges listened to the audiotaped spontaneous
speech samples and followed along with the composite transcripts to identify overt error–
revision sequences. In these sequences, judges coded the reparandum, defined as the error or
trouble spot that is to be repaired, and the revision, or the modification of the error.
A reparandum may be a sound, syllable, word, or even larger unit. It was found in the
development of our coding procedures that it was difficult to obtain acceptable interjudge
reliability on reparanda that extended beyond a word because the target was less obvious and
judges were required to speculate. Phrase-level reparanda often included subphrase reparanda
and covert repairs which also hindered reliable coding. Therefore, although these subjects
produced errors at all levels, the present analysis focused on reparanda that were at the word
or sound levels.
As a slight departure from conventional terminology, the term revision rather than repair
is used to describe a subject’s attempt to modify the error, or reparandum. Neurologically
normal speakers generally have little difficulty repairing their speech errors when they are
detected. However, the speakers of the present investigation did not always accomplish what
might be described as an acceptable repair; hence, the term revision is more appropriate. There
also were instances in which the revision following a reparandum was itself revised so that
the initial revision was rendered a reparandum as well, a sort of conduite d’approche (Joanette,
Keller, & Lecours, 1980). For example, for the word ‘‘bag,’’ pronounced by the speaker as
/bag/, the following sequence was observed: ‘‘paf (reparandum) . . . maf (revision, reparandum) . . . bag (the final revision).’’
2.42 Type of error. Judges then decided if the reparandum was predominantly an error in
lexical selection or in the selection, sequencing, or execution of phonemes. These two categories were arbitrarily labeled linguistic and phonetic, respectively. Linguistic errors were defined
as those that appear to reflect language-based mistakes, such as errors in semantic selection
or syntactic form. Phonetic errors were defined as errors that apparently have their origins in
articulatory planning or execution.
This terminology does not converge completely with that used by other investigators. The
use of the terms linguistic and phonetic to distinguish language and motoric origins reflects
the presupposition that speakers with AOS should produce and revise errors in different proportions on these levels. Any labeling system that attempts to differentiate among the presumed
levels of speech production is destined to be flawed because of the overlap and dependencies
among categories (compare the categories of Blackmer & Mitton, 1991; Levelt & Cutler,
1983; Schlenck et al., 1987).
Despite its imperfect nature, a conservative approach to the linguistic–phonetic distinction
was attempted in the present investigation to gain some understanding of the errors’ sources.
Judges were instructed to select the most obvious attribution. For example, ‘‘put, a, took a
cup’’ was coded as a linguistic error, where the verb ‘‘put’’ was the reparandum and ‘‘took’’
was the revision. A feasible, but less obvious interpretation would be that ‘‘put’’ was the
culmination of a metathesis and a substitution of the consonants in ‘‘took’’ and was therefore
a phonetic error. As another example, production of ‘‘wa (reparandum) . . . walou (revision)’’
for the word ‘‘yellow’’ was coded as a phonetic error. A less obvious interpretation, particularly given the form of the revision, is that ‘‘wa’’ was a semantic selection error (‘‘why’’).
2.43 Moment of interruption and way of restarting. Sometime following error detection,
the flow of speech is interrupted and then the revision or repair is initiated. Nooteboom (1980)
originally defined the main interruption rule as an interruption in the flow of speech as soon
as the error is detected. Levelt and Cutler (1983) subsequently described a modification to
the main interruption rule to accommodate a finding in their large corpus of repairs that word
boundaries tend to be preserved unless the word proper is incorrect. Brédart (1991) tested
Levelt and Cutler’s modification and found that there likely is nothing sacrosanct about the
word, rather that the detection mechanism requires time to operate. This was based on his
data that longer error words were interrupted more frequently before their completion than
were shorter words. Brédart suggested that the main interruption rule would only require the
modification that the monitoring mechanism may not be evenhanded. However, Blackmer and
Mitton (1991) reported that some repairs are initiated immediately (0 ms) after the moment
of interruption. If it is assumed that the moment of interruption is the moment of error detection, there is no time for replanning to occur in these cases. Therefore, some moments of
interruption may be more closely aligned with the preparedness of the repair than with error
detection.
In the present investigation, judges coded moments of interruption to examine the mechanisms of self-monitoring and revision planning in AOS. It only was assumed that moments
of interruption occurred sometime after detection of the error. In this way, moments of interruption that occur within the error word (coded as Within), are not necessarily associated with
earlier points of detection than those moments of interruption which occur immediately after
completion of the error word (coded as Right After) or when the error word is followed by
some or all of the remaining message (coded as Delay). In all conditions, error detection may
occur earlier than the moment of interruption but not after.
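The three moment-of-interruption codes amount to a simple decision rule, sketched below. The function name and its two inputs (whether the error word was completed, and how many message words followed it before the interruption) are hypothetical simplifications of judgments that the listeners made perceptually.

```python
def code_moment_of_interruption(error_word_completed: bool, words_after_error: int) -> str:
    """Return the interruption code for one reparandum (illustrative rule only)."""
    if not error_word_completed:
        return "Within"       # speech stopped inside the error word
    if words_after_error == 0:
        return "Right After"  # stopped immediately after the completed error word
    return "Delay"            # some or all of the remaining message came first

# Table 1 notes that the reparandum of [kthu took] was interrupted within its production.
example = code_moment_of_interruption(error_word_completed=False, words_after_error=0)
```

Note that the code says nothing about when the error was detected; as stated above, detection is assumed only to precede the interruption.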
The greatest explanatory power in the examination of error–revision sequences is in the
joint consideration of moment of interruption and way of restarting the revision. This allows
us to speculate about the point of error detection and the point at which the replanning for
error revision occurs. As was stated earlier about the main interruption rule, revisions that
occur immediately following the instantaneous moment of interruption are evidence that both
error detection and revision planning precede the production of the error. Their presence in
this data set would be evidence of perhaps an inner speech monitor or monitoring of buffered
phonetic plans (Béland, Paradis, & Bois, 1993; Blackmer & Mitton, 1991).
The way in which this restart is initiated also is thought to provide the listener with some
information about the nature of the error and how to optimize continuous processing of the
message (Levelt, 1983). Therefore, it has been regarded as a function of perceptual constraints.
Three restart coding categories in the present investigation were used to examine the relative
contributions of perceptual and production constraints on error repair. These categories are
based only loosely² on the well-formedness rules postulated by Levelt (1983). In the present
investigation, the following restart coding procedures were adopted: revisions that began
within 500 ms³ of the offset of the error with no intervening vocalizations (coded as Pause);
vocalizations that follow the reparandum that are extraneous to the message, but that indicate
trouble has occurred (‘‘er’’ and ‘‘uh’’), and/or silent pauses that exceed 500 ms in duration
(coded as Edit Term); and revisions that included backtracking to repeat a portion of the
message that preceded the moment of interruption (coded as Backtrack).
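The restart categories can be written as a rule over the reparandum-to-revision latency and two perceptual flags. The sketch below is a hedged illustration: the function and parameter names are hypothetical, a latency of exactly 500 ms is assumed to count as Pause, and checking Backtrack before the other two categories is an assumption, since the text does not state an order of precedence.

```python
PAUSE_LIMIT_MS = 500  # criterion stated in the text

def code_restart(reparandum_offset_ms, revision_onset_ms,
                 editing_term=False, backtracked=False):
    """Classify the way of restarting for one reparandum-revision sequence."""
    latency_ms = revision_onset_ms - reparandum_offset_ms
    if backtracked:
        return "Backtrack"   # revision repeats material preceding the interruption
    if editing_term or latency_ms > PAUSE_LIMIT_MS:
        return "Edit Term"   # "er"/"uh" and/or a silent pause longer than 500 ms
    return "Pause"           # revision begins within 500 ms, nothing intervening

# Hypothetical timings in milliseconds from the start of a recording.
quick = code_restart(1000, 1300)
slow = code_restart(1000, 1800)
filled = code_restart(1000, 1200, editing_term=True)
```

The same latency value (revision onset minus reparandum offset) is the quantity measured acoustically in the latency analysis.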
2.44 Prosodic marking. Finally, judges attended to the audiotapes to determine whether
the revisions in each of the reparandum–revision sequences were prosodically marked relative
to the reparanda (coded as Marked or Unmarked). This was simply a binary decision: For
each sequence, listeners determined the presence or absence of a perceptible difference in
duration, pitch, or loudness that set the revision apart from the reparandum (Levelt & Cutler,
1983; Goffman, 1981).
Prosodic marking is thought to be exceptionally important, in combination with ways of
restarting, to aid the listener in the continuation problem. Levelt and Cutler (1983) hypothesized that prosodic marking of the repair tells the listener that the original utterance actually
was in error rather than just inappropriate. They also found that marking was most prevalent
when the set of semantic items from which the error was drawn was small. Although AOS
is associated with prosody deficits, particularly slow speaking rate, even the most severely
impaired speakers of this group produced perceptible prosodic marking in a portion of their
error revisions. If prosodic marking is exhibited by these speakers, it might be considered
evidence of perceptual constraints if, in fact, it improved message transmission.
² Overt error repairs comprised such a small portion of Levelt’s data set that he did not
attempt to explain restarting relative to them.
³ Blackmer and Mitton (1991) reported error-to-repair times for a corpus of spontaneously
produced normal speech. Their category of ‘‘production errors,’’ defined as misselection of
words or phonemes or mispronunciations, seems to correspond to the combined lexical and
phonetic categories of the present investigation. The average time from the moment of interruption to the onset of the repair was 119 ms (SD 71) for their set of 43 production errors. A
500 ms criterion was used in the present study to accommodate the slow speech rates and
substantially longer pauses produced by these disordered speakers.
2.5 Data Set
The perceptual decisions of the three judges were compared and only reparandum–revision
sequences identified by all judges were admitted into the analysis. The pooled data across the
four speakers yielded a total of 80 sequences. Examples are provided in Table 2.
Interjudge agreement on the coding categories was 100% for all but seven instances, in
which only two of the three judges agreed. These seven coding decisions were discussed and
consensus was reached. It should be emphasized that this corpus represents a reliably identified
set of revised errors. It does not represent how many errors were produced in total nor what
kinds of errors were produced in total. Rather, it constitutes a set of ‘‘linguistic’’ and ‘‘phonetic’’ errors that were detected and revised by the speakers and the ways in which these
errors were detected and revised.
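The admission rule for the data set (a sequence enters the analysis only if all three judges identified it) is a set intersection. The sketch below uses hypothetical sequence identifiers; the function name is likewise illustrative.

```python
def admit_unanimous(identified_by_judge):
    """Return the sequence IDs that every judge identified."""
    judge_sets = [set(ids) for ids in identified_by_judge]
    return set.intersection(*judge_sets)

# Hypothetical IDs flagged by each of three judges.
judges = [
    {"s1", "s2", "s3", "s5"},
    {"s1", "s2", "s5"},
    {"s1", "s2", "s4", "s5"},
]
admitted = admit_unanimous(judges)
```

Sequences flagged by only one or two judges (here "s3" and "s4") are simply left out, which is why the corpus cannot be read as a count of all errors produced.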
2.6 Acoustic Analysis
Acoustic analysis focused on two issues: the latency between the offset of the reparandum
(which corresponded to the moment of interruption) and the onset of the revision, and the
acoustic correlates of prosodic marking. All 80 reparandum–revision sequences were digitized
at a sampling rate of 22 kHz and saved in computer files using the software program CSpeech
(Milenkovic & Read, 1992). Acoustic measures were made by two of the listeners who participated in the transcription and coding, and indices of inter- and intrajudge reliability were
obtained.
2.61 Latencies. Segment, word, and utterance durations were measured using the cursor
function on a waveform or spectrographic display. Latencies between reparanda and revisions
were measured from the offset of the reparandum (defined as the last glottal pulse that extended
through at least two formants for voiced segments and as the offset of frication noise for
unvoiced segments) to the onset of the revision (defined as the first glottal pulse that extended
through two formants for the voiced segments and as the onset of the frication noise for
unvoiced segments).⁴
TABLE 2
Examples of Reparandum–Repair Sequences and Associated Error-Type Judgments
Reparandum    Repair         Error-type coding
paspat        baskets        phonetic
footh         blue           phonetic
piken         picnic         phonetic
piz           spills         phonetic
rech          reaches        phonetic
give          said           linguistic
carried       carrying       linguistic
glass         coffee cups    linguistic
her           he             linguistic
put           picks          linguistic
Note. Examples of reparandum–repair sequences and the
judgment of error type. The sequences are presented in orthographic transcription, but all productions contained
varying degrees of phonemic distortions, omissions, and
substitutions.
2.62 Marking. As stated earlier, the construct of prosodic marking typically has been described in perceptual terms, in which there is the perception that the repair differs from the
reparandum along the dimensions of pitch, loudness, and/or duration (Goffman, 1981). In
the present investigation, the perceptual identification of prosodic marking was regarded as
definitive. However, acoustic analysis was undertaken to assess the perceptual experiences.
Fundamental frequency (F0) traces were obtained for each reparandum–revision sequence
using CSpeech’s autocorrelation pitch extraction method. Highest and lowest fundamental
frequencies for each vocalic segment of the sequences were measured, as were peak amplitude
values for these same segments. Simple difference values were then calculated to determine
whether the repair was higher or lower than the reparandum in F0, amplitude, or duration.
Duration differences were only considered when both the reparandum and repair were completely produced words.
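The simple difference values can be illustrated as revision-minus-reparandum subtractions. The sketch below is hypothetical in several respects: the field names and the example values are invented, and using the highest F0 of each item as the value to be differenced is an assumption, since the text does not say which F0 value entered the subtraction.

```python
def marking_differences(reparandum, revision, both_complete_words):
    """Revision minus reparandum; positive means higher, louder, or longer."""
    diffs = {
        "f0_hz": revision["f0_max_hz"] - reparandum["f0_max_hz"],
        "amplitude_db": revision["peak_db"] - reparandum["peak_db"],
    }
    # Duration is compared only when both items are completely produced words.
    if both_complete_words:
        diffs["duration_ms"] = revision["duration_ms"] - reparandum["duration_ms"]
    return diffs

# Hypothetical measurements for one sequence.
reparandum = {"f0_max_hz": 118, "peak_db": 62, "duration_ms": 340}
revision = {"f0_max_hz": 131, "peak_db": 66, "duration_ms": 455}
diffs = marking_differences(reparandum, revision, both_complete_words=True)
```

These values only describe the direction and size of a difference; in the study the perceptual judgment of marking, not the acoustic difference, was treated as definitive.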
2.63 Reliability of acoustic measures. Approximately 20% of all duration, F0, and amplitude measures were conducted a second time to calculate inter- and intrajudge reliability. This
set was selected in a quasi-random fashion to ensure that both judges had remeasured portions
of samples from all speakers. Both inter- and intrajudge reliability were found to be exceptionally high on the F0 and amplitude measures, and the inter- and intrajudge reliability results
are presented as pooled. The mean difference in F0 between the original measurements and remeasurements
was 2.6 Hz (S.D. = 3.27 Hz). Differences between original measurements and remeasurements of amplitude
were all less than 3 dB, and over 95% were identical. Ninety-five percent of the measure–
remeasure differences in duration were less than 20 ms. Larger discrepancies were reviewed
to determine the source of the error, and values were adjusted accordingly. Inter- and intrajudge
reliability is regarded as highly acceptable for all acoustic measures.
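The measure–remeasure summary statistics reported above (mean difference, standard deviation, and the share of differences under a tolerance) can be computed as in the following sketch. The data are hypothetical, and the use of absolute differences and of the sample standard deviation are assumptions; the text does not specify either.

```python
from statistics import mean, stdev

def remeasure_summary(original, remeasured, tolerance):
    """Summarize agreement between original measures and remeasurements."""
    diffs = [abs(a - b) for a, b in zip(original, remeasured)]
    share_within = sum(d < tolerance for d in diffs) / len(diffs)
    return mean(diffs), stdev(diffs), share_within

# Hypothetical duration measurements in ms, with the 20 ms tolerance from the text.
original = [310, 455, 280, 600, 512]
remeasured = [305, 470, 280, 575, 510]
mean_diff, sd_diff, share_within = remeasure_summary(original, remeasured, tolerance=20)
```

With real data, the same function could be run separately for F0, amplitude, and duration by changing the tolerance.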
3.0 RESULTS AND DISCUSSION
3.1 Error Types
Table 3 shows the types of errors that were revised and the proportion of
these revisions that were prosodically marked relative to the reparanda. In
all, 49 of the 80 revised errors were judged to be phonetic and 31 were
linguistic in origin. These findings are inconsistent with previous literature
on two counts. First, these speakers with apraxia produced, as expected, large
numbers of phonetic errors. Levelt and Cutler (1983) found that, among a large
number of self-corrections produced by neurologically normal speakers, repairs of syntactic
and semantic errors occurred with roughly the same frequency, whereas repairs of phonetic
errors comprised less than 1% of their corpus. Similarly, Brédart (1991) reported that only
about 12% of the errors in his 1225 error–repair corpus were phonetic. The fact that
self-repairs of phonetic errors are much less common among normal speakers than repairs of
syntactic or semantic errors has led perceptual constraint advocates to conclude
that phonetic errors in speech cause less of a problem for the listener, so the
speaker may be less concerned about correcting them (Levelt & Cutler,
1983). Certainly in the present study, phonetic errors can be regarded as
serious impediments to message transmission. However, these speakers did
not attempt to revise all phonetic errors, even when the errors were multiple
and pervasive within a word. There is no obvious explanation for which
phonetic errors the speakers chose to revise.

⁴ This measure is analogous to the cut-off-to-repair times described by Blackmer and Mitton (1991).

TABLE 3
Instances of Marked and Unmarked Linguistic and Phonetic Errors

                        Marked    Unmarked
Linguistic (n = 31)
  n                     19        12
  n/80                  23.75%    15%
  n/31                  61%       39%
Phonetic (n = 49)
  n                     32        17
  n/80                  40%       21.25%
  n/49                  65%       35%

Note. Percentages of total revised errors and percentages of error category also are
provided.
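The percentages in Table 3 follow directly from the four cell counts. As a small worked check (the variable and function names here are illustrative, not the author's), each count can be divided by the 80 revised errors and by its own category total; 17 of 80 works out to 21.25%.

```python
# Counts of revised errors from Table 3: (marked, unmarked) per error category.
counts = {"linguistic": (19, 12), "phonetic": (32, 17)}

total = sum(marked + unmarked for marked, unmarked in counts.values())  # all revised errors


def pct(n: int, d: int) -> float:
    """Percentage of n out of d, rounded to two decimals."""
    return round(100 * n / d, 2)


table = {}
for category, (marked, unmarked) in counts.items():
    n_category = marked + unmarked
    table[category] = {
        "marked_of_total": pct(marked, total),
        "unmarked_of_total": pct(unmarked, total),
        "marked_of_category": pct(marked, n_category),
        "unmarked_of_category": pct(unmarked, n_category),
    }
```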
A second finding not predicted from previous literature on AOS is the
relatively high proportion of errors judged to be linguistic in origin. These
speakers showed no signs of aphasia, yet all produced linguistic errors that
were revised. At first glance, this finding is inconsistent with the prevailing
notion that AOS is primarily a motor speech disorder (see, however, McNeil,
Weismer et al., 1990). However, it is entirely consistent with the notion of
a ‘‘higher order’’ motor speech disorder that results, in part, from faulty
phonological encoding.
Levelt (1989) proposed levels of phonological encoding in which the phonetic plans for speech production derive from specifications about syllables
and segments that occur in three primary levels of processing. In the first
level (‘‘morphological/metrical spellout’’), lemmas are used to retrieve morphemes and metrical structure of a word. ‘‘Segmental spellout’’ follows, in
which the morphemes are used to retrieve specifications about syllables and
segments. Finally, syllables and segments are used to access the phonetic
plans (‘‘phonetic spellout’’ stage). Parallel distributed processing models of
speech production (Dell, 1986; Stemberger, 1985) permit the interaction
among these stages as well as earlier stages (such as lemma retrieval). In
other words, we would expect linguistic errors to occur, especially when the
errors share phonemes with the targets (see Béland et al., 1993), as was the
case for many of the linguistic error tokens. At least one third of the errors
classified as linguistic did share phonetic features with the target. Moreover,
these were errors of gender attribution of the two main characters of the
videoplay. The phonetic similarities among the words of this set (he, she,
him, her, his, hers) would predispose them to error in ways that appear to
be lexical.
3.2 Prosodic Marking
Recall that prosodic marking was defined in perceptual terms for this investigation, but
that acoustic analysis also was undertaken. Mean difference values in F0 and amplitude
were calculated for those reparandum–revision sequences defined perceptually as marked
and unmarked. These data are presented in Table 4. The mean acoustic measures
are consistent with the perceptual experiences to the extent that larger difference
values are associated with marked as compared to unmarked sequences.
However, the correlation between perceptually determined and acoustically determined
prosodic marking in the present investigation was only moderate (r = .715): the two
designations differed for 12 of the 80 sequences. That is, in twelve instances, the
acoustic evidence was contrary to the perceptual designation of marking.
Interestingly, all of the discrepancies between the perceptual decisions and
acoustic evidence ran in one direction: The judges heard no
marking despite acoustic evidence to the contrary. Moreover,
it was not the case that the acoustic changes were small in magnitude. Peak
F0 changes in these instances ranged from 9 to 115 Hz. A possible explanation
may lie in the fact that speakers produced words between the reparandum
and revision in 10 of these 12 instances. This interceding speech may have contributed
to the perception of the absence of marking.
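One way to picture the agreement figures is as a 2 × 2 table of perceptual against acoustic marking. The sketch below rests on two assumptions drawn from the description above: that all 12 disagreements were sequences judged unmarked that nonetheless showed acoustic change, and that the 51 sequences judged marked (19 linguistic plus 32 phonetic in Table 3) all showed acoustic change. The phi coefficient is also an assumption, since the text does not say how r was computed. Under these assumptions phi comes out near .69, close to but not identical with the reported .715, so the original computation presumably differed in some detail.

```python
import math

# 2 x 2 agreement table (assumed layout): perceptual judgment by acoustic evidence.
both_marked = 51      # judged marked, acoustic change present
perceptual_only = 0   # judged marked, no acoustic change (no such discrepancies reported)
acoustic_only = 12    # judged unmarked, acoustic change present
neither = 17          # judged unmarked, no acoustic change


def phi(a: int, b: int, c: int, d: int) -> float:
    """Phi coefficient for a 2 x 2 table with cells a, b (top row) and c, d (bottom row)."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


n_sequences = both_marked + perceptual_only + acoustic_only + neither
agreement = (both_marked + neither) / n_sequences  # proportion of matching designations
phi_value = phi(both_marked, perceptual_only, acoustic_only, neither)
```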
Table 3 shows that relatively more of the linguistic and phonetic errors
were marked than were unmarked. Because the prosodic marking of linguistic revisions is a complex matter, the fact that more linguistic error revisions
were marked than unmarked is not particularly revealing. Perusal of the actual errors suggests that this finding may be a function of the types of linguistic errors that were committed, and that the prosodic marking is precisely
in line with predictions from earlier investigations.
TABLE 4
Mean Changes in Peak Fundamental Frequency and Amplitude between the
Reparanda and Their Associated Repairs

                                   Marked          Unmarked
Mean F0 change in Hz (SD)          12.17 (20.6)    −5.60 (30.2)
Mean RMS amplitude change (SD)     −.017 (.126)    .063 (.120)

Recall that prosodic marking is more likely to occur in situations where
the set of lexical alternatives is small and where an error in the context of
discourse would be very detrimental to the transmission of the message
(Levelt & Cutler, 1983). Each of the four speakers in this study produced errors
in which they confused gender pronouns (he/she, him/her, etc.). In the context of
discourse, this error would be costly for the listener because the videoplay depicted
one female and one male. Attributing an action to the wrong
character would misrepresent the facts of the story. Therefore, one would
predict that errors in gender attribution would tend to be marked as, in fact,
they were in 70% of the cases.
Examples are provided in Fig. 1. Each of the four panels contains a waveform display of the reparandum–revision sequence and the associated amplitude and fundamental frequency traces, respectively. The revision in the first
panel is not prosodically marked relative to the reparandum. However, the
revisions in the subsequent three panels are prosodically marked.
The unexpected finding here is that more phonetic error revisions were
marked than were unmarked. This is counter to the prediction offered by
studies of prosodic marking in normal speech. According to Levelt and Cutler (1983), repairs of phonetic errors should not be marked because ‘‘the
element to be repaired is below the morphemic level . . . to apply accent to
the word as a whole would be to mislead the hearer into thinking that one
word was to be contrasted with another’’ (p. 216). They maintain that marking phonetic errors could serve to exacerbate the listener’s continuation
problem.
The presence of prosodic marking of phonetic error revisions in this corpus
can be interpreted in several ways. If, indeed, phonetic errors should not be
prosodically marked, doing so may prove problematic for the listener, as
Levelt and Cutler have suggested. Another possibility is that the data from
neurologically normal speakers are not a fair comparison for this type of
error-repair. Neurologically normal speakers make and repair few phonetic
errors; therefore, conclusions are based on a relatively small set of exemplars.
Finally, this may serve as evidence that the construct of prosodic marking
may not be solely in the service of the listener: It may be a by-product of
the production mechanism that accompanies increases in speech effort. If
this is the case, at least some instances of prosodic marking derive from
production rather than perceptual constraints.
FIG. 1. Examples of prosodic marking for gender confusions. Each panel contains a waveform
display of the reparandum–revision sequence followed by the associated amplitude and
fundamental frequency traces, respectively. Prosodic marking is evident in the second, third,
and fourth panels. Revisions are of either greater duration, amplitude, or fundamental
frequency as compared with their reparanda.

The preceding examples of prosodic marking show primarily differences
in the domains of fundamental frequency and amplitude, with less prominent
duration changes. Prosodic marking via duration changes between reparandum and
revision has been described relative to normal self-corrections (Levelt, 1983).
However, the apraxic speakers in this investigation frequently
displayed another type of durational manipulation which was perceived and
coded by the judges as prosodic marking. In these cases, the reparandum
tended to be produced with a relatively smooth flow of syllables, despite the
presence of an error. The revision then was marked by a staccato production
of the syllables, with or without concomitant changes in F0 and amplitude.
This type of production has been termed ‘‘syllable segregation’’ and
‘‘articulatory prolongation’’ in other neurologically induced speech disorders
(Kent & Rosenbek, 1982), and usually is attributed to speakers’ attempts to
improve articulatory displacement and contact.

FIG. 2. Examples of apparent marking in the temporal domain (syllable segregation). The
top panel contains the speaker’s attempt to produce and revise the word ‘‘parka.’’ The arrows
bracket the closure interval of the /k/ in both the reparandum and revision. The increased
revision closure interval is associated with the perception of syllabification of the word. The
bottom panel contains the speaker’s attempt to produce and revise the phrase ‘‘he not know
any better.’’ The arrows exemplify temporal divisions among the words and syllables of the
revision that contribute to the impression of prosodic marking.
Figure 2 shows two examples of this apparent marking in the temporal
domain. The top panel contains the speaker’s attempt to produce and revise
the word ‘‘parka.’’ Note the increased closure interval (marked with arrows)
in the revised version of the word. The bottom panel contains the speaker’s
attempt to produce and revise the phrase ‘‘he not know any better.’’ Again,
the arrows indicate breaks in speech between words and syllables of the
revision that distinguish it from the reparandum.
From this data set, it is not possible to determine precisely the motivation
for syllable segregation. It may be an inevitable by-product of the speech
production disorder, or it may be used by the speaker to facilitate
self-monitoring, articulatory control, or listener understanding. If it is indeed a
prosodic marking strategy, it is, by definition, intended to inform the listener
about the relationship between the error and repair. However, syllable segregation could have quite the opposite effect by impeding a listener’s ability to
distinguish syllable boundaries from word boundaries (Cutler & Butterfield,
1992). It is more probable that the practice of syllable segregation is motivated primarily by production constraints in which the speaker slows and
segments the message to gain articulatory accuracy—perceptual constraints
may be a subordinate consideration.
3.3 Moments of Interruption and Ways of Restarting
The present study offers limited support for the notion of prearticulatory
editing of internal speech in AOS (Levelt, 1992). These speakers, despite
their compromised speech systems, tend to interrupt their errors (at least the
ones to be revised) soon after they are produced. The point should be made
that many more errors were produced than were revised; nonetheless, some
form of prearticulatory monitoring must be at least partially functional. Table
5 shows the percentages of loci for moments of interruption for the present
corpus of errors. There was no significant relationship between prosodic
marking and error type for moment of interruption, so these values were collapsed.
Although comparison data from studies of normal speech errors are limited, one would expect that phonetic errors most likely would be interrupted
within the word containing the reparandum, particularly if the error occurs
early in the word, because the error is sublexical (Blackmer & Mitton, 1991;
Brédart, 1991; Levelt, 1992; Levelt & Cutler, 1983). This was often true;
45% of phonetic errors involved moments of interruption within the word
containing the reparandum (e.g., ‘‘bʌk . . . bæskət’’ for the target ‘‘basket’’).
Moreover, 53% of the interruptions occurred immediately following
the word containing the phonetic error (e.g., ‘‘papan . . . papkan’’ for the
target ‘‘popcorn’’). Even among this disordered population, the moments of
interruption were temporally close to the phonetic errors.
TABLE 5
Moments of Interruption by Error Type, Collapsed Across Markedness

                        Within      Right After    Delay
Linguistic (n = 31)     4 (13%)     23 (74%)       4 (13%)
Phonetic (n = 49)       22 (45%)    26 (53%)       1 (2%)

Note. Three possible moments of interruption include: (1) within the
word containing the reparandum (Within), (2) immediately following the
word containing the reparandum (Right After), and (3) a delayed moment
of interruption in which the reparandum is followed by some or all of
the remaining message (Delay).

TABLE 6
Restart by Error Type, Collapsed Across Markedness

                        Pause       Edit           Backtrack
Linguistic (n = 31)     1 (3%)      21 (68%)       9 (29%)
Phonetic (n = 49)       5 (10%)     31 (63%)       13 (27%)

Note. Pause refers to revisions that began within 500 ms of the offset
of the error, with no intervening vocalizations. Edit refers to instances
in which vocalizations follow the reparandum that are extraneous to the
message, and indicate trouble has occurred (‘‘er’’ and ‘‘uh’’), and/or silent
pauses that exceed 500 ms in duration. Backtrack refers to revisions
that include repeating a portion of the message that preceded the moment
of interruption.

Though few moments of interruption came within the reparanda for linguistic
errors, the majority immediately followed the reparanda: 74% were
associated with interruptions immediately following completion of the incorrect
word (e.g., ‘‘him . . . her’’ for the target ‘‘her’’), and 13% were delayed
moments of interruption (e.g., ‘‘placed two glasses down for him . . . (pause)
. . . two coffee cups down for him’’). Again, this pattern is predicted from
studies of normal speech error revisions in which the unit of error is the
word. Chi-square analysis revealed that the type of error was significantly
related to the moment of interruption (χ²[2] = 10.9, P = .0042).
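The chi-square result can be retraced from the counts in Table 5. The sketch below is a plain-Python recomputation, not the author's procedure; the function name is illustrative. It takes the Right After count for linguistic errors as 23, the value implied by the reported 74% and by the row total of 31.

```python
import math

# Moments of interruption (Within, Right After, Delay) by error type, from Table 5.
observed = {
    "linguistic": [4, 23, 4],
    "phonetic": [22, 26, 1],
}


def chi_square(table: dict) -> tuple:
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


stat, dof = chi_square(observed)
# With 2 degrees of freedom the chi-square upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-stat / 2)
```

With these counts the statistic is about 10.95 on 2 degrees of freedom and the p-value about .0042, in line with the values reported in the text.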
Even though moments of interruption were very close in time to the phonetic and
linguistic errors, the restarts rarely occurred immediately (as defined herein). The
vast majority of phonetic error restarts (90%) and virtually
all of the linguistic error restarts (97%) occurred following editing terms,
pauses that exceeded 500 ms, or backtracking, as shown in Table 6. This
suggests that, for these speakers, the moment of interruption is more closely
associated with detection of the error than with the preparedness of the revision.
It indicates that the planning of the revision occurs sometime after
any prearticulatory monitoring process and after the error has been committed.
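The share of restarts that were not immediate can be read off the counts in Table 6. A brief worked check, with illustrative names, treats every restart other than a Pause as non-immediate:

```python
# Restart counts (Pause, Edit, Backtrack) by error type, from Table 6.
restarts = {
    "linguistic": {"pause": 1, "edit": 21, "backtrack": 9},
    "phonetic": {"pause": 5, "edit": 31, "backtrack": 13},
}


def non_immediate_share(counts: dict) -> float:
    """Percentage of restarts that were not classified as Pause."""
    total = sum(counts.values())
    return 100 * (total - counts["pause"]) / total


shares = {error_type: round(non_immediate_share(counts), 1)
          for error_type, counts in restarts.items()}
```

From these counts, 30 of 31 linguistic restarts (about 97%) and 44 of 49 phonetic restarts (about 90%) followed an editing term, a long pause, or backtracking.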
Latencies between reparanda and revisions are presented in Table 7. Unmarked
linguistic and phonetic revisions occurred on average earlier than
did the marked, and phonetic revisions occurred earlier than linguistic. However,
a two-way analysis of variance revealed no significant differences
among these latencies, and no interaction between error type and marking.
Similarly, no differences in median latency values for the different moments
of interruption were identified with a Kruskal-Wallis One Way ANOVA on
Ranks (H[2] = 4.51, P = .105). As expected, the median latency of pauses
was significantly shorter than those of either edit or backtrack (H[2] = 19.7,
P < .0001) when the data were sorted by types of restart.
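The reported probabilities can be related to the H statistics with a short calculation. The Kruskal-Wallis H is conventionally referred to a chi-square distribution with one fewer degrees of freedom than the number of groups, and with 2 degrees of freedom the upper-tail probability reduces to exp(−H/2). The function name below is illustrative, and the sketch assumes this chi-square approximation was the basis for the reported values.

```python
import math


def p_from_h(h: float) -> float:
    """Upper-tail probability for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-h / 2)


p_interruption = p_from_h(4.51)  # comparison across moments of interruption
p_restart = p_from_h(19.7)       # comparison across types of restart
```

Under that assumption the first value is about .105 and the second is well below .0001, matching the figures given above.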
TABLE 7
Ranges, Means, and Standard Deviations of Latencies between
Reparanda and Revisions

                      N     Range (s)    Mean (s)    S.D.
Linguistic errors
  Marked              19    .50–8.81     2.72        2.54
  Unmarked            12    .62–5.07     2.29        1.53
Phonetic errors
  Marked              32    .00–7.12     2.10        2.01
  Unmarked            17    .00–3.7      1.72        1.15

The fact that immediate restarts were seen only for some of the phonetic
error revisions and none of the linguistic error revisions, at first glance, may
be taken as evidence of a prearticulatory monitor functioning on some phonetic
buffer. However, phonetic revisions frequently were not adequate
repairs of the errored production. Phonetic revisions were often revised
themselves, thereby becoming reparanda. This was especially true for rapid
sequences that would be called ‘‘articulatory groping’’ or ‘‘successive
articulatory approximations’’ by perceptual standards (Valdois et al., 1989).
Considering the preponderance of very long average latencies, these data do not
offer a compelling argument for the presence of a functional prearticulatory
buffer or for the presence of adequate incremental processing (Kempen &
Hoenkamp, 1987). In other words, there is little evidence that the planning
of the revision occurs prior to the production of the error. This is in contrast
with studies of normal error repairs that show very rapid restarts (Blackmer & Mitton, 1991).
4.0 CONCLUSIONS
Does this study of the form of speech error–repairs in AOS provide any
insight into the perception–production constraint controversy? This study
indicates that these speakers with relatively pure forms of apraxia of speech do
recognize and modify at least a portion of their speech errors in spontaneous
speech. However, the error repair strategies do not coincide entirely with
those reported for normal speakers. The phenomenon of prosodic marking
has been hypothesized to assist listeners in the recognition of what spoken
segment is in error, and how that error is to be corrected. However, these
speakers exhibited two types of prosodic marking that may actually serve
to confuse the listener: The prosodic marking of phonetic errors and the use
of durational modifications of syllables to prosodically mark revisions were
exhibited by all four speakers. It may be the case that the nonpredicted types
of prosodic marking evidenced here are simply byproducts of the motor
speech disorder and are not exemplars of the prosodic marking that has been
described for the neurologically normal population. This possibility is difficult to assess, given that there is no comparable data base of phonetic errors
produced by neurologically normal speakers who rarely make—and hence
repair—such errors in spontaneous speech. It may also be the case that, as
Berg (1992) suggested, perceptual constraints may be subordinate to production
constraints and that this is only apparent under certain conditions. In
either case, this study offers an example of prosodic marking that does not,
according to current theoretical accounts, benefit the listener.
This study also offers some insight into the nature of the deficit in AOS.
Investigations of speech errors in AOS collectively have found evidence of
‘‘higher level’’ deficits in phonological encoding (e.g., Canter, Trost, &
Burns, 1985), and ‘‘lower level’’ deficits in the specification of muscle activity
and sequencing (Itoh & Sasanuma, 1984; McNeil, Weismer et al., 1990;
Seddoh et al., 1996; Ziegler & von Cramon, 1986).⁵ If we accept a more
unifying view of the phonological encoding stage such as that offered by
Levelt (1989), the presence of errors coded as linguistic and as phonetic
in the present data set is not unexpected for these speakers. The present
investigation also provides some evidence for the presence of an inconsistent
or inadequate prearticulatory buffer in phonological encoding. Despite relatively rapid moments of interruption (which are the theoretical outside limits
for error recognition), restarting tended to be delayed among these speakers.
Their slow repairs may reflect the need to plan and then implement the repair
following the moment of interruption.
⁵ These apparently disparate findings have promoted the longstanding debate on whether
AOS is a form of nonfluent aphasia or whether it is more closely aligned with other motor
speech disorders (see Kent, 1990, and Rosenbek, Kent, & LaPointe, 1984, for reviews of the
history of this controversy).

REFERENCES
Béland, R., Paradis, C., & Bois, M. 1993. Constraints and repairs in aphasic speech: A group
study. Canadian Journal of Linguistics, 38(2), 279–302.
Berg, T. 1986. The aftermath of error occurrence: Psycholinguistic evidence from cut-offs.
Language and Communication, 6, 195–213.
Berg, T. 1992. Productive and perceptual constraints on speech-error correction. Psychological
Research, 54, 114–126.
Blackmer, E., & Mitton, J. 1991. Theories of monitoring and the timing of repairs in
spontaneous speech. Cognition, 39, 173–194.
Brédart, S. 1991. Word interruption in self-repairing. Journal of Psycholinguistic Research,
20(2), 123–138.
Canter, G. J., Trost, J. E., & Burns, M. S. 1985. Contrasting speech patterns in apraxia of
speech and phonemic paraphasias. Brain and Language, 24, 204–222.
Cutler, A. 1983. Speaker’s conceptions of the functions of prosody. In A. Cutler & D. Ladd
(Eds.), Prosody: Models and measurements. Heidelberg, Germany: Springer.
Cutler, A., & Butterfield, S. 1992. Rhythmic cues to speech segmentation: Evidence from
juncture misperception. Journal of Memory and Language, 31, 218–236.
Darley, F. 1982. Aphasia. New York: Saunders.
Darley, F., Aronson, A., & Brown, J. 1975. Motor speech disorders. Philadelphia: Saunders.
Dell, G. S. 1986. A spreading-activation theory of retrieval in sentence production.
Psychological Review, 93(3), 283–321.
Goffman, E. 1981. Radio talk. In E. Goffman (Ed.), Forms of talk. Oxford, UK: Blackwell.
Itoh, M., & Sasanuma, S. 1984. Articulatory movements in apraxia of speech. In J. C. Rosenbek, M. R. McNeil, & A. E. Aronson (Eds.), Apraxia of speech: Physiology, acoustics,
linguistics, management (pp. 135–166). San Diego: College-Hill.
Joanette, Y., Keller, E., & Lecours, A. R. 1980. Sequences of phonemic approximations in
aphasia. Brain and Language, 11, 30–44.
Kempen, G., & Hoenkamp, E. 1987. An incremental procedure grammar for sentence formulation. Cognitive Science, 11, 201–258.
Kent, R., & Rosenbek, J. 1982. Prosodic disturbance and neurologic lesion. Brain and Language, 15, 259–291.
Levelt, W. J., & Cutler, A. 1983. Prosodic marking in speech repair. Journal of Semantics,
2(2), 205–217.
Levelt, W. 1983. Monitoring and self-repair in speech. Cognition, 14, 41–104.
Levelt, W. J. M. 1989. Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. 1992. Accessing words in speech production: Stages, processes and representations.
Cognition, 42, 1–22.
McNeil, M., Liss, J., Tseng, C., & Kent, R. 1990. Effects of speech rate on the absolute
and relative timing of apraxic and conduction aphasic sentence production. Brain and
Language, 38, 135–158.
McNeil, M., Weismer, G., Adams, S., & Mulligan, M. 1990. Oral structure nonspeech control
in normal, dysarthric, aphasic and apraxic speakers: Isometric force and static position
control. Journal of Speech and Hearing Research, 33, 255–268.
Milenkovic, P., & Read, C. 1992. CSpeech Version 4, Laboratory Automation Reference.
Madison, Wisconsin.
Nickels, L., & Howard, D. 1995. Phonological errors in aphasic naming: Comprehension,
monitoring and lexicality. Cortex, 31, 209–237.
Nooteboom, S. 1980. Speaking and unspeaking: Detection and correction of phonological and
lexical errors in spontaneous speech. In V. Fromkin (Ed.), Errors in linguistic performance (pp. 287–305). New York: Academic Press.
Odell, K., McNeil, M. R., Rosenbek, J. C., & Hunter, L. 1991. Perceptual characteristics of
vowel and prosody production in apraxic, aphasic, and dysarthric speakers. Journal of
Speech and Hearing Research, 34(1), 67–80.
Schlenck, K., Huber, W., & Willmes, K. 1987. ‘‘Prepairs’’ and repairs: Different monitoring
functions in aphasic language production. Brain and Language, 30, 226–244.
Seddoh, S., Robin, D., Sim, H., Hageman, C., Moon, J., & Folkins, J. 1996. Speech timing
in apraxia of speech versus conduction aphasia. Journal of Speech and Hearing Research,
39, 590–603.
Stemberger, J. 1985. An interactive activation model of language production. In A. Ellis (Ed.),
Progress in the psychology of language, Vol. 1, (pp. 143–186). London, UK: Erlbaum.
Stemberger, J. 1992. Vocalic underspecification in English language production. Language,
68(3), 492–524.
Valdois, S., Joanette, Y., & Nespoulous, J. 1989. Intrinsic organization of sequences of phonemic approximations: A preliminary study. Aphasia, 3(1), 55–73.
Wertz, R., LaPointe, L., & Rosenbek, J. 1984. Apraxia of speech in adults: The disorder and
its management. Orlando, FL: Grune & Stratton.
Ziegler, W., & von Cramon, D. 1986. Disturbed coarticulation in apraxia of speech: Acoustic
evidence. Brain and Language, 29, 34–47.