BRAIN AND LANGUAGE 62, 342–360 (1998)
ARTICLE NO. BL971907

Error-Revision in the Spontaneous Speech of Apraxic Speakers

Julie M. Liss
Motor Speech Disorders Laboratory, Arizona State University

Spontaneous speech samples from four men diagnosed with apraxia of speech were transcribed to examine the ways in which they attempted to repair their speech errors. The study sought evidence for the presence of production or perceptual constraints in error revision and for the presence of a functional prearticulatory monitor. Three judges independently evaluated the transcriptions and audiotapes to identify instances in which speakers revised speech errors. They then coded the nature of the relationship between the error and the revision. In previous reports, the form of error repairs among normal speakers has been attributed to perceptual constraints, that is, determined by the needs of the listener. Results of the present study suggest that the form of some error repairs among these speakers with apraxia of speech is not in the service of the listener; rather, it conforms with production constraints. It may be argued that some forms of error repair evidenced by these speakers, such as the prosodic marking of phonetic errors and prosodic marking in the temporal domain (syllable segregation), may actually serve to exacerbate the listener’s task of message decoding. In addition, these speakers offered little evidence of an efficient prearticulatory monitor. The time delays between interrupting the flow of speech in recognition of an error and the initiation of a revision suggest an impaired ability to plan revisions prior to the production of the error. © 1998 Academic Press

This research was supported, in part, by NINCDS Grant No. N518797, by Grant No. 5 R29 DC 02672-02 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health, and by the University of Minnesota Graduate School Grants-in-Aid. Gratitude is extended to Stephanie Spitzer, Shelley Brundage, Kristin Tjaden, Stephanie Hanson, and Julie Campbell for their contribution to the analysis of this data set. A portion of these data were presented at the Meeting of the Acoustical Society of America, New Orleans, November 1992. Address reprint requests to Julie M. Liss, Ph.D., Motor Speech Disorders Laboratory, Arizona State University, Box 871908, Tempe, AZ 85281.

0093-934X/98 $25.00 Copyright 1998 by Academic Press. All rights of reproduction in any form reserved.

1.0 INTRODUCTION

A substantial amount of research has been conducted on the self-correction of speech errors produced spontaneously and on speech errors induced experimentally. This work has demonstrated that people tend to correct their speech errors in fairly systematic and predictable ways, depending on the nature of the error and the communicative environment in which the error occurs (Levelt & Cutler, 1983; Nooteboom, 1980). The explanation for this predictability is a matter of some debate. Levelt (1983) and others have suggested that speakers abide by certain error-correction conventions to facilitate the listener’s continuous processing of the speech signal (Levelt & Cutler, 1983; Cutler, 1983). The implication is that a speaker understands that their errors challenge or interfere with a listener’s ability to maintain continuous processing of the message, a dilemma Levelt has called the ‘‘continuation problem.’’ By following certain conventions in their repairs, speakers can let listeners know how to relate the repair to the flawed original utterance. Brédart (1991), echoing Levelt’s view, suggested that, ‘‘self-repairs are shaped in such a way that the listener has the critical information to solve the continuation problem as soon as the first word of the repair occurs’’ (p. 125).
The repair notifies the listener that the original utterance was indeed in error, how and where the original utterance was in error, and how that error is to be corrected. Thus, the form of speech error repairs is determined by perceptual constraints. Berg (1992) challenged the purely listener-oriented perspective of speech error correction, claiming that the patterns of data reported in the literature reveal more about a speaker’s knowledge of her own production system than about the speaker’s consideration of listener needs. His assertion hinges on the observation that the repair of speech errors only can be attempted if the error is detected by the speaker, and detection is largely listener-independent. As Berg (1992) observed, many experimental results used to promote the listener-oriented explanation, such as the potency of word-initial errors, can be interpreted equally well as stemming from either production or perceptual constraints (Berg, 1992). This is not surprising given the highly integrated nature of speech production and perception. One develops speech by way of alternately listening and being listened to. A lifetime of communication breakdowns and successes no doubt shapes the amount of monitoring or attention a speaker devotes to specific aspects of his or her output. Berg concluded that although speakers obviously consider listener needs and adjust their output accordingly, listener-based constraints may be subordinate to or indistinguishable from speaker-based constraints. A slightly different interpretation of Berg’s argument is that it is not possible to parse out the relative contributions of perceptual and production constraints to error correction among normal speakers because the patterns are not a simple by-product of the two variables. But is it possible to address this issue in situations where the relationship between these constraint categories is altered adventitiously? 
Consider adult-onset apraxia of speech (AOS) as a potential test case. AOS is regarded as a disturbance of motor programming, typically as the result of an anterior cortical lesion in the dominant hemisphere (Darley, Aronson, & Brown, 1975). The motor speech disturbance is presumed to be at a planning and programming level, impairing the speaker’s ability to properly select and sequence phonemes (Wertz, LaPointe, & Rosenbek, 1984; see also Seddoh, Robin, Sim, Hageman, Moon, & Folkins, 1996). AOS is characterized by a slow speaking rate, a labored prosodic pattern, and the presence of varied articulation errors, successive approximations of articulatory targets (groping), and part- and whole-word repetitions (Darley, 1982). Although AOS often co-occurs with aphasia and dysarthria, there is no concomitant language disturbance, muscle weakness, or incoordination in its pure form.

It can be assumed that the speech deficit in AOS (and other disorders) alters the premorbid relationship between production and perceptual constraints. Speech production becomes more effortful, speech errors become more frequent, and new criteria for listener satisfaction must be learned. It also can be assumed that the speaker with AOS recognizes, at some level, the alteration of the production–perception relationship. The very presence of self-corrections is evidence of a self-monitoring system that is at least partially functional (Nickels & Howard, 1995; Schlenck, Huber, & Willmes, 1987; Valdois, Joanette, & Nespoulous, 1989). Thus, by evaluating how closely these speakers conform to the predictions offered by studies of neurologically normal speakers, we can gain insight into the underlying pathology of the disorder and the locus of speech deficit in the speech production process, and we can hypothesize about how the relative contributions of perceptual and production constraints explain the pattern of results.
2.0 METHOD

In the context of a larger investigation on apraxia of speech, spontaneous speech samples were elicited from four people who presented with ‘‘pure’’ apraxia of speech. These speech samples each contain many instances of self-correction. The present investigation explored these speech samples to determine whether overt revisions of errors produced in spontaneous speech by apraxic subjects followed predictions generated by the study of aphasic and neurologically normal speakers. This investigation focused on frank sound and word errors that appeared to reflect incorrect word choice or phonemic mis-selection, misplanning, or misexecution.

2.1 Subjects

Spontaneous speech samples from four speakers diagnosed with apraxia of speech (designated A1–A4) were examined for this investigation. These speakers have been the subjects of numerous other investigations and have been described in detail in this journal and elsewhere (e.g., McNeil, Liss, Tseng, & Kent, 1990; McNeil, Weismer, Adams, & Mulligan, 1990; Odell, McNeil, Rosenbek, & Hunter, 1991). The four men (59, 62, 54, and 72 years of age) were free of significant concomitant dysarthria or aphasia, as determined by performance on a battery of tests. Structural–functional exams completed by a neurologist and two certified speech–language pathologists revealed an absence of articulatory weakness or incoordination that would indicate a dysarthria. Test results disconfirming aphasia symptoms in these four subjects included normal level performance on the pantomime, auditory comprehension, visual matching, and reading comprehension subtests of the PICA; and normal level performance on the auditory comprehension subtests of the Boston Diagnostic Aphasia Examination. Oral speech was judged by two certified speech–language pathologists to be free of agrammatic syntax or other language impairments.
It was therefore concluded that these subjects were reasonable exemplars of ‘‘pure’’ apraxia of speech. All four subjects did exhibit patterns of speech that were consistent with the diagnosis of AOS including slow rate, effortful articulatory groping, phoneme substitutions, additions and omissions, and phoneme sequencing errors. Despite the similarities in speech error patterns, the subjects differed in terms of the severity of their speech impairment. Subjects A1 and A2 were regarded as the least severely impaired by the judges in this present investigation and A3 and A4 as the most severely impaired. Brain scan data (summarized from the report of Odell et al., 1991) indicated the most severely affected brain locations were the precentral gyrus (A2, A4); the postcentral gyrus (A1, A2, A4); the superior parietal region (A3); the supramarginal gyrus (A1, A4); the angular gyrus (A1); and Wernicke’s area (A1). Less severely involved brain regions included the premotor cortex (A2); Broca’s area (A2, A4); the middle temporal gyrus (A1); the insula (A1, A2); the lenticular nuclei (A1); and the posterior internal capsule (A1). Behaviorally, three of the four subjects (A1, A2 and A4) exhibited concomitant oral apraxia according to the Apraxia Battery for Adults.

2.2 Speech Samples

Subjects were seated individually in a quiet room and were shown a 7-minute videotape, without audio, of a young man and woman on a picnic. In this video, certain objects (a basket, coffee thermos, a pizza, etc.) were shown as a part of the action several times. Subjects viewed the videotape once and were asked to recount the story as though they were describing it to someone who had never seen it. Subjects then watched the video and retold the story a second time.
These spontaneous speech samples were recorded in a sound-treated booth with a Shure SM-10a microphone and Tandberg recorder.¹

¹ Stemberger (1992) compared the literature on speech errors produced in naturalistic versus experimental settings and reported similarities as well as important differences in results. The present procedures did not attempt to elicit speech errors in any experimental or systematic way. Despite the stress inherent in task participation, these errors are regarded as being most comparable to naturalistic rather than experimentally derived errors.

2.3 Transcription

The transcription and codings of the spontaneous speech samples were accomplished by four experienced listeners, including the author of this paper. Three of the listeners were graduate assistants who had been employed approximately 6 months prior to the initiation of this study to conduct acoustic and perceptual analyses of apraxia of speech. Their work had included the analysis of experimentally elicited phrases produced by the subjects of this study, but the listeners had no prior experience with the spontaneous speech samples per se. All listeners were therefore highly familiar with the speech patterns and speaking styles for this group of subjects. The first task was to obtain an accurate gloss of all intelligible words spoken by each of the four speakers in their descriptions of the videoplay. To accomplish this task, one listener orthographically transcribed all of the speech samples. This listener then used the gloss transcript in subsequent listening sessions to identify and broadly transcribe (phonetically) utterances that were deemed as distorted, unintelligible, or nonwords. The resulting transcript was independently verified and/or modified by three other experienced listeners, yielding a total of four transcripts for each of the eight speech samples.
The four transcripts were then compared and merged to produce a composite transcript that reflected agreement of at least three of the four listeners. Utterances within the transcripts that did not reach this level of agreement were not considered further in any analysis of this investigation. No effort was made to resolve discrepancies among broad transcriptions, except in rare instances where the differences affected subsequent coding.

2.4 Coding

Following the completion of the transcription, listeners were familiarized with the coding procedures of the speech errors and the relevant events which surround them. Listeners then independently coded the composite transcripts while listening to the audiotapes. Because the listeners demonstrated a reasonable level of comfort with the coding task during training and because the conservative criterion of majority consensus was required for acceptance of a given token, no precoding interjudge reliability measures were made. Table 1 contains an excerpt of subject A3’s transcript on which reparandum–revision sequences are marked along with the associated coding decisions. The coding procedures and operational definitions of the events are based on those offered by others who have investigated these phenomena, especially Levelt (1983) and Blackmer and Mitton (1991). Some modifications to their definitions and procedures were necessary to accommodate the utterances, and these modifications are described herein. Levelt (1983) outlined a detailed classification system for the variety of speech errors that are subject to repair among normal speakers. These include instances where the content of the original message is replaced by a different message (different repairs), instances where the words chosen to convey the message are replaced by more appropriate terminology (appropriateness repairs), and instances of frank errors in the semantic, syntactic, or phonetic form of the words (error repairs).
Each of these types of errors is classified as overt, because the error is produced before it is intercepted. A second type, the covert error, is more difficult to understand because these errors are presumed to exist based on the presence of repetitions, interruptions, pauses, or editing terms such as ‘‘er’’ and ‘‘uh.’’ All of these types of repair sequences were present in the speech samples of these four subjects but only overt error repairs were examined. This focus stems from the theoretical predisposition of this speaker group to produce these types of errors in disproportionately large numbers. Because speakers did not suffer concomitant aphasia, different and appropriateness repairs were not expected to occur with exceptional frequency. Moreover, previous studies have reported relatively infrequent occurrences of these error repairs in normal and aphasic populations so they have not been well characterized (Levelt, 1983; Schlenck et al., 1987).

TABLE 1
Spontaneous Speech Excerpt with Associated Coding

‘‘and [he she] uh [kthu took] a bag of [papan papkan] from her basket and [ait ate] the [pap-pa papkan] and uh a she gave the young man some [papa papkan] and he ate some of that’’

Note. Excerpt from spontaneous speech sample of subject A3 and the associated coding decisions. Reparandum–revision sequences are presented in brackets, and revisions are underlined. Revisions with prosodic marking are in bold print. All reparanda were interrupted immediately following their production with the exception of the second one. The reparandum of [kthu took] was interrupted within its production. All restarts here were classified as occurring after a very brief pause of less than 500 ms.

2.41 Error–revision sequences. The three judges listened to the audiotaped spontaneous speech samples and followed along with the composite transcripts to identify overt error–revision sequences.
In these sequences, judges coded the reparandum, defined as the error or trouble spot that is to be repaired, and the revision, or the modification of the error. A reparandum may be a sound, syllable, word, or even larger unit. It was found in the development of our coding procedures that it was difficult to obtain acceptable interjudge reliability on reparanda that extended beyond a word because the target was less obvious and judges were required to speculate. Phrase-level reparanda often included subphrase reparanda and covert repairs which also hindered reliable coding. Therefore, although these subjects produced errors at all levels, the present analysis focused on reparanda that were at the word or sound levels. As a slight departure from conventional terminology, the term revision rather than repair is used to describe a subject’s attempt to modify the error, or reparandum. Neurologically normal speakers generally have little difficulty repairing their speech errors when they are detected. However, the speakers of the present investigation did not always accomplish what might be described as an acceptable repair; hence, the term revision is more appropriate. There also were instances in which the revision following a reparandum was itself revised so that the initial revision was rendered a reparandum as well, a sort of conduite d’approche (Joanette, Keller, & Lecours, 1980). For example, for the word ‘‘bag,’’ pronounced by the speaker as /bag/, the following sequence was observed: ‘‘paf (reparandum) . . . maf (revision, reparandum) . . . bag (the final revision).’’ 2.42 Type of error. Judges then decided if the reparandum was predominantly an error in lexical selection or in the selection, sequencing, or execution of phonemes. These two categories were arbitrarily labeled linguistic and phonetic, respectively. Linguistic errors were defined as those that appear to reflect language-based mistakes, such as errors in semantic selection or syntactic form. 
Phonetic errors were defined as errors that apparently have their origins in articulatory planning or execution. This terminology does not converge completely with that used by other investigators. The use of the terms linguistic and phonetic to distinguish language and motoric origins reflects the presupposition that speakers with AOS should produce and revise errors in different proportions on these levels. Any labeling system that attempts to differentiate among the presumed levels of speech production is destined to be flawed because of the overlap and dependencies among categories (compare the categories of Blackmer & Mitton, 1991; Levelt & Cutler, 1983; Schlenck et al., 1987). Despite its imperfect nature, a conservative approach to the linguistic–phonetic distinction was attempted in the present investigation to gain some understanding of the errors’ sources. Judges were instructed to select the most obvious attribution. For example, ‘‘put, a, took a cup’’ was coded as a linguistic error, where the verb ‘‘put’’ was the reparandum and ‘‘took’’ was the revision. A feasible, but less obvious interpretation would be that ‘‘put’’ was the culmination of a metathesis and a substitution of the consonants in ‘‘took’’ and was therefore a phonetic error. As another example, production of ‘‘wa (reparandum) . . . walou (revision)’’ for the word ‘‘yellow’’ was coded as a phonetic error. A less obvious interpretation, particularly given the form of the revision, is that ‘‘wa’’ was a semantic selection error (‘‘why’’). 2.43 Moment of interruption and way of restarting. Sometime following error detection, the flow of speech is interrupted and then the revision or repair is initiated. Nooteboom (1980) originally defined the main interruption rule as an interruption in the flow of speech as soon as the error is detected. 
Levelt and Cutler (1983) subsequently described a modification to the main interruption rule to accommodate a finding in their large corpus of repairs that word boundaries tend to be preserved unless the word proper is incorrect. Brédart (1991) tested Levelt and Cutler’s modification and found that there likely is nothing sacrosanct about the word, rather that the detection mechanism requires time to operate. This was based on his data that longer error words were interrupted more frequently before their completion than were shorter words. Brédart suggested that the main interruption rule would only require the modification that the monitoring mechanism may not be evenhanded. However, Blackmer and Mitton (1991) reported that some repairs are initiated immediately (0 ms) after the moment of interruption. If it is assumed that the moment of interruption is the moment of error detection, there is no time for replanning to occur in these cases. Therefore, some moments of interruption may be more closely aligned with the preparedness of the repair than with error detection. In the present investigation, judges coded moments of interruption to examine the mechanisms of self-monitoring and revision planning in AOS. It only was assumed that moments of interruption occurred sometime after detection of the error. In this way, moments of interruption that occur within the error word (coded as Within), are not necessarily associated with earlier points of detection than those moments of interruption which occur immediately after completion of the error word (coded as Right After) or when the error word is followed by some or all of the remaining message (coded as Delay). In all conditions, error detection may occur earlier than the moment of interruption but not after. The greatest explanatory power in the examination of error–revision sequences is in the joint consideration of moment of interruption and way of restarting the revision.
This allows us to speculate about the point of error detection and the point at which the replanning for error revision occurs. As was stated earlier about the main interruption rule, revisions that occur immediately following the instantaneous moment of interruption are evidence that both error detection and revision planning precede the production of the error. Their presence in this data set would be evidence of perhaps an inner speech monitor or monitoring of buffered phonetic plans (Béland, Paradis, & Bois, 1993; Blackmer & Mitton, 1991). The way in which this restart is initiated also is thought to provide the listener with some information about the nature of the error and how to optimize continuous processing of the message (Levelt, 1983). Therefore, it has been regarded as a function of perceptual constraints. Three restart coding categories in the present investigation were used to examine the relative contributions of perceptual and production constraints on error repair. These categories are based only loosely² on the well-formedness rules postulated by Levelt (1983). In the present investigation, the following restart coding procedures were adopted: revisions that began within 500 ms³ of the offset of the error with no intervening vocalizations (coded as Pause); vocalizations that follow the reparandum that are extraneous to the message, but that indicate trouble has occurred (‘‘er’’ and ‘‘uh’’), and/or silent pauses that exceed 500 ms in duration (coded as Edit Term); and revisions that included backtracking to repeat a portion of the message that preceded the moment of interruption (coded as Backtrack).

2.44 Prosodic marking. Finally, judges attended to the audiotapes to determine whether the revisions in each of the reparandum–revision sequences were prosodically marked relative to the reparanda (coded as Marked or Unmarked).
This was simply a binary decision: For each sequence, listeners determined the presence or absence of a perceptible difference in duration, pitch, or loudness that set the revision apart from the reparandum (Levelt & Cutler, 1983; Goffman, 1981). Prosodic marking is thought to be exceptionally important, in combination with ways of restarting, to aid the listener in the continuation problem. Levelt and Cutler (1983) hypothesized that prosodic marking of the repair tells the listener that the original utterance actually was in error rather than just inappropriate. They also found that marking was most prevalent when the set of semantic items from which the error was drawn was small. Although AOS is associated with prosody deficits, particularly slow speaking rate, even the most severely impaired speakers of this group produced perceptible prosodic marking in a portion of their error revisions. If prosodic marking is exhibited by these speakers, it might be considered evidence of perceptual constraints if, in fact, it improved message transmission.

² Overt error repairs comprised such a small portion of Levelt’s data set that he did not attempt to explain restarting relative to them.

³ Blackmer and Mitton (1991) reported error-to-repair times for a corpus of spontaneously produced normal speech. Their category of ‘‘production errors,’’ defined as misselection of words or phonemes or mispronunciations, seems to correspond to the combined lexical and phonetic categories of the present investigation. The average time from the moment of interruption to the onset of the repair was 119 ms (SD 71) for their set of 43 production errors. A 500 ms criterion was used in the present study to accommodate the slow speech rates and substantially longer pauses produced by these disordered speakers.
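The three restart categories defined in Section 2.43 can be read as a small decision rule. The sketch below is illustrative only: in the study the categories were assigned perceptually by judges, not by a program. The field names and the `code_restart` helper are hypothetical, and the precedence given to Backtrack is an assumption, because the text does not say how a sequence meeting more than one criterion was coded.

```python
from dataclasses import dataclass

PAUSE_CRITERION_MS = 500  # criterion stated in Section 2.43


@dataclass
class Restart:
    """One reparandum-revision sequence (field names are hypothetical)."""
    silent_pause_ms: float  # offset of the reparandum to onset of the revision
    has_editing_term: bool  # extraneous vocalization such as "er" or "uh"
    backtracks: bool        # revision repeats material preceding the interruption


def code_restart(r: Restart) -> str:
    """Return 'Backtrack', 'Edit Term', or 'Pause' for one sequence."""
    # Assumption: Backtrack takes precedence; the paper gives no priority order.
    if r.backtracks:
        return "Backtrack"
    # An editing term and/or a silent pause exceeding 500 ms
    if r.has_editing_term or r.silent_pause_ms > PAUSE_CRITERION_MS:
        return "Edit Term"
    # Revision begins within 500 ms with no intervening vocalization
    return "Pause"


if __name__ == "__main__":
    print(code_restart(Restart(silent_pause_ms=119, has_editing_term=False, backtracks=False)))
```

Under these assumptions a restart at exactly 500 ms is coded as Pause, since the Edit Term criterion requires a silent pause that exceeds 500 ms.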
2.5 Data Set

The perceptual decisions of the three judges were compared and only reparandum–revision sequences identified by all judges were admitted into the analysis. The pooled data across the four speakers yielded a total of 80 sequences. Examples are provided in Table 2. Interjudge agreement on the coding categories was 100% for all but seven instances, in which only two of the three judges agreed. These seven coding decisions were discussed and consensus was reached. It should be emphasized that this corpus represents a reliably identified set of revised errors. It does not represent how many errors were produced in total nor what kinds of errors were produced in total. Rather, it constitutes a set of ‘‘linguistic’’ and ‘‘phonetic’’ errors that were detected and revised by the speakers and the ways in which these errors were detected and revised.

2.6 Acoustic Analysis

Acoustic analysis focused on two issues: the latency between the offset of the reparandum (which corresponded to the moment of interruption) and the onset of the revision, and the acoustic correlates of prosodic marking. All 80 reparandum–revision sequences were digitized at a sampling rate of 22 kHz and saved in computer files using the software program CSpeech (Milenkovic & Read, 1992). Acoustic measures were made by two of the listeners who participated in the transcription and coding, and indices of inter- and intrajudge reliability were obtained.

2.61 Latencies. Segment, word, and utterance durations were measured using the cursor function on a waveform or spectrographic display.
Latencies between reparanda and revisions were measured from the offset of the reparandum (defined as the last glottal pulse that extended through at least two formants for voiced segments and as the offset of frication noise for unvoiced segments) to the onset of the revision (defined as the first glottal pulse that extended through two formants for the voiced segments and as the onset of the frication noise for unvoiced segments).⁴

TABLE 2
Examples of Reparandum–Repair Sequences and Associated Error-Type Judgments

Reparandum    Repair          Error-type coding
paspat        baskets         phonetic
footh         blue            phonetic
piken         picnic          phonetic
piz           spills          phonetic
rech          reaches         phonetic
give          said            linguistic
carried       carrying        linguistic
glass         coffee cups     linguistic
her           he              linguistic
put           picks           linguistic

Note. Examples of reparandum–repair sequences and the judgment of error type. The sequences are presented in orthographic transcription, but all productions contained varying degrees of phonemic distortions, omissions, and substitutions.

2.62 Marking. As stated earlier, the construct of prosodic marking typically has been described in perceptual terms, in which there is the perception that the repair differs from the reparandum along the dimensions of pitch, loudness, and/or duration (Goffman, 1981). In the present investigation, the perceptual identification of prosodic marking was regarded as definitive. However, acoustic analysis was undertaken to assess the perceptual experiences. Fundamental frequency (F0) traces were obtained for each reparandum–revision sequence using CSpeech’s autocorrelation pitch extraction method. Highest and lowest fundamental frequencies for each vocalic segment of the sequences were measured, as were peak amplitude values for these same segments. Simple difference values were then calculated to determine whether the repair was higher or lower than the reparandum in F0, amplitude, or duration.
Duration differences were only considered when both the reparandum and repair were completely produced words.

2.63 Reliability of acoustic measures. Approximately 20% of all duration, F0, and amplitude measures were conducted a second time to calculate inter- and intrajudge reliability. This set was selected in a quasi-random fashion to ensure that both judges had remeasured portions of samples from all speakers. Both inter- and intrajudge reliability were found to be exceptionally high on the F0 and amplitude measures, and the inter- and intrajudge reliability results are presented as pooled. The mean difference in F0 between the original measurements and the remeasurements was 2.6 Hz (S.D. = 3.27 Hz). Differences between original and remeasurements of amplitude were all less than 3 dB, and over 95% were identical. Ninety-five percent of the measure–remeasure differences in duration were less than 20 ms. Larger discrepancies were reviewed to determine the source of the error, and values were adjusted accordingly. Inter- and intrajudge reliability is regarded as highly acceptable for all acoustic measures.

3.0 RESULTS AND DISCUSSION

3.1 Error Types

Table 3 shows the types of errors that were revised and the proportion of these revisions that were prosodically marked relative to the reparanda. In all, 49 of the 80 revised errors were judged to be phonetic and 31 were linguistic in origin. These findings are inconsistent with previous literature on two counts. First, these speakers with apraxia produced, as expected, large numbers of phonetic errors. Levelt and Cutler (1983) found that, of large numbers of self-corrections produced by neurologically normal speakers, repairs of syntax and semantic errors occurred with roughly the same frequency, but that phonetic error-repairs comprised less than 1% of their corpus. Similarly, Brédart (1991) reported that only about 12% of the errors in his 1225 error–repair corpus were phonetic.
The fact that self-repairs of phonetic errors are much less common among normal speakers than repairs of syntax or semantic errors has led perceptual constraint advocates to conclude that phonetic errors in speech cause less of a problem for the listener, so the speaker may be less concerned about correcting them (Levelt & Cutler, 1983). Certainly in the present study, phonetic errors can be regarded as serious impediments to message transmission. However, these speakers did not attempt to revise all phonetic errors, even when the errors were multiple and pervasive within a word. There is no obvious explanation for the speakers’ decisions about which phonetic errors to revise.

TABLE 3
Instances of Marked and Unmarked Linguistic and Phonetic Errors

                        Marked    Unmarked
Linguistic (n = 31)     19        12
  n/80                  23.75%    15%
  n/31                  61%       39%
Phonetic (n = 49)       32        17
  n/80                  40%       21.25%
  n/49                  65%       35%

Note. Percentages of total revised errors and percentages of error category also are provided.

Footnote 4. This measure is analogous to the cut-off-to-repair times described by Blackmer and Mitton (1991).

A second finding not predicted from previous literature on AOS is the relatively high proportion of errors judged to be linguistic in origin. These speakers showed no signs of aphasia, yet all produced linguistic errors that were revised. At first glance, this finding is inconsistent with the prevailing notion that AOS is primarily a motor speech disorder (see, however, McNeil, Weismer et al., 1990). However, it is entirely consistent with the notion of a ‘‘higher order’’ motor speech disorder that results, in part, from faulty phonological encoding. Levelt (1989) proposed levels of phonological encoding in which the phonetic plans for speech production derive from specifications about syllables and segments that occur in three primary levels of processing. In the first level (‘‘morphological/metrical spellout’’), lemmas are used to retrieve the morphemes and metrical structure of a word.
‘‘Segmental spellout’’ follows, in which the morphemes are used to retrieve specifications about syllables and segments. Finally, syllables and segments are used to access the phonetic plans (‘‘phonetic spellout’’ stage). Parallel distributed processing models of speech production (Dell, 1986; Stemberger, 1985) permit interaction among these stages as well as with earlier stages (such as lemma retrieval). In other words, we would expect linguistic errors to occur, especially when the errors share phonemes with the targets (see Béland et al., 1993), as was the case for many of the linguistic error tokens. At least one third of the errors classified as linguistic did share phonetic features with the target. Moreover, these were errors of gender attribution of the two main characters of the videoplay. The phonetic similarities among the words of this set (he, she, him, her, his, hers) would predispose them to error in ways that appear to be lexical.

3.2 Prosodic Marking

Recall that prosodic marking was defined in perceptual terms for this investigation, but that acoustic analysis also was undertaken. Mean difference values for F0 and amplitude were calculated for those reparandum–revision sequences defined perceptually as marked and unmarked. These data are presented in Table 4. The mean acoustic measures are consistent with the perceptual experiences to the extent that larger difference values are associated with marked as compared to unmarked sequences. However, the correlation between perceptually determined and acoustically determined prosodic marking in the present investigation was only moderate: the two designations differed for 12 of the 80 sequences (r = .715). That is, in twelve instances, the acoustic evidence was contrary to the perceptual designation of marking.
Interestingly, all of the discrepancies between the perceptual decisions and acoustic evidence were in one direction: The judges heard no marking despite acoustic evidence to contradict this perception. Moreover, it was not the case that the acoustic changes were small in magnitude. Peak F0 changes in these instances ranged from 9 to 115 Hz. A possible explanation may lie in the fact that speakers produced words between the reparandum and revision in 10 of these 12 instances. This interceding speech may have contributed to the perception of the absence of marking.

TABLE 4
Mean Changes in Peak Fundamental Frequency and Amplitude between the Reparanda and Their Associated Repairs

                                  Marked          Unmarked
Mean F0 change in Hz (SD)         12.17 (20.6)    −5.60 (30.2)
Mean RMS amplitude change (SD)    −.017 (.126)    .063 (.120)

Table 3 shows that relatively more of the linguistic and phonetic errors were marked than were unmarked. Because the prosodic marking of linguistic revisions is a complex matter, the fact that more linguistic error revisions were marked than unmarked is not particularly revealing. Perusal of the actual errors suggests that this finding may be a function of the types of linguistic errors that were committed, and that the prosodic marking is precisely in line with predictions from earlier investigations. Recall that prosodic marking is more likely to occur in situations where the set of lexical alternatives is small and where an error in the context of discourse would be very detrimental to the transmission of the message (Levelt & Cutler, 1983). Each of the four speakers in this study produced errors in which they confused gender pronouns (he/she, him/her, etc.). In the context of discourse, this error would be costly for the listener because the videoplay depicted one female and one male. Attributing an action to the wrong character would misrepresent the facts of the story.
Therefore, one would predict that errors in gender attribution would tend to be marked as, in fact, they were in 70% of the cases. Examples are provided in Fig. 1. Each of the four panels contains a waveform display of the reparandum–revision sequence and the associated amplitude and fundamental frequency traces, respectively. The revision in the first panel is not prosodically marked relative to the reparandum. However, the revisions in the subsequent three panels are prosodically marked. The unexpected finding here is that more phonetic error revisions were marked than were unmarked. This is counter to the prediction offered by studies of prosodic marking in normal speech. According to Levelt and Cutler (1983), repairs of phonetic errors should not be marked because ‘‘the element to be repaired is below the morphemic level . . . to apply accent to the word as a whole would be to mislead the hearer into thinking that one word was to be contrasted with another’’ (p. 216). They maintain that marking phonetic errors could serve to exacerbate the listener’s continuation problem. The presence of prosodic marking of phonetic error revisions in this corpus can be interpreted in several ways. If, indeed, phonetic errors should not be prosodically marked, doing so may prove problematic for the listener, as Levelt and Cutler have suggested. Another possibility is that the data from neurologically normal speakers are not a fair comparison for this type of error-repair. Neurologically normal speakers make and repair few phonetic errors; therefore, conclusions are based on a relatively small set of exemplars. Finally, this may serve as evidence that the construct of prosodic marking may not be solely in the service of the listener: It may be a by-product of the production mechanism that accompanies increases in speech effort. If this is the case, at least some instances of prosodic marking derive from production rather than perceptual constraints. 
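The marked and unmarked proportions discussed above follow directly from the four cell counts in Table 3. As a minimal sketch (the dictionary layout and the function name `marking_percentages` are illustrative assumptions, not part of the original analysis), the percentage of all revised errors and the percentage within each error category can be recomputed as follows:

```python
# Cell counts from Table 3 (revised errors by error type and perceived marking).
TABLE_3_COUNTS = {
    "linguistic": {"marked": 19, "unmarked": 12},
    "phonetic": {"marked": 32, "unmarked": 17},
}


def marking_percentages(counts):
    """Return, per cell, its share of all errors and of its own error category.

    Hypothetical helper written for illustration only.
    """
    grand_total = sum(sum(cells.values()) for cells in counts.values())
    result = {}
    for error_type, cells in counts.items():
        category_total = sum(cells.values())
        for marking, n in cells.items():
            result[(error_type, marking)] = {
                "n": n,
                "pct_of_all": 100 * n / grand_total,
                "pct_of_category": 100 * n / category_total,
            }
    return result


if __name__ == "__main__":
    for (error_type, marking), row in marking_percentages(TABLE_3_COUNTS).items():
        print(
            f"{error_type:10s} {marking:8s} n={row['n']:2d} "
            f"{row['pct_of_all']:6.2f}% of all, "
            f"{row['pct_of_category']:5.1f}% of category"
        )
```

On these counts the phonetic category comes out at roughly 65% marked (32 of 49), which is the proportion that runs counter to the prediction for normal speech.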
The preceding examples of prosodic marking show primarily differences in the domains of fundamental frequency and amplitude, with less prominent duration changes. Prosodic marking via duration changes between reparandum and revision has been described relative to normal self-corrections (Levelt, 1983). However, the apraxic speakers in this investigation frequently displayed another type of durational manipulation, which was perceived and coded by the judges as prosodic marking. In these cases, the reparandum tended to be produced with a relatively smooth flow of syllables, despite the presence of an error. The revision then was marked by a staccato production of the syllables, with or without concomitant changes in F0 and amplitude. This type of production has been termed ‘‘syllable segregation’’ and ‘‘articulatory prolongation’’ in other neurologically induced speech disorders (Kent & Rosenbek, 1982), and usually is attributed to speakers’ attempts to improve articulatory displacement and contact. Figure 2 shows two examples of this apparent marking in the temporal domain. The top panel contains the speaker’s attempt to produce and revise the word ‘‘parka.’’ Note the increased closure interval (marked with arrows) in the revised version of the word. The bottom panel contains the speaker’s attempt to produce and revise the phrase ‘‘he not know any better.’’ Again, the arrows indicate breaks in speech between words and syllables of the revision that distinguish it from the reparandum.

FIG. 1. Examples of prosodic marking for gender confusions. Each panel contains a waveform display of the reparandum–revision sequence followed by the associated amplitude and fundamental frequency traces, respectively. Prosodic marking is evident in the second, third, and fourth panels. Revisions are of either greater duration, amplitude, or fundamental frequency as compared with their reparanda.

FIG. 2. Examples of apparent marking in the temporal domain (syllable segregation). The top panel contains the speaker’s attempt to produce and revise the word ‘‘parka.’’ The arrows bracket the closure interval of the /k/ in both the reparandum and revision. The increased revision closure interval is associated with the perception of syllabification of the word. The bottom panel contains the speaker’s attempt to produce and revise the phrase ‘‘he not know any better.’’ The arrows exemplify temporal divisions among the words and syllables of the revision that contribute to the impression of prosodic marking.

From this data set, it is not possible to determine precisely the motivation for syllable segregation. It may be an inevitable by-product of the speech production disorder, or it may be used by the speaker to facilitate self-monitoring, articulatory control, or listener understanding. If it is indeed a prosodic marking strategy, it is, by definition, intended to inform the listener about the relationship between the error and repair. However, syllable segregation could have quite the opposite effect by impeding a listener’s ability to distinguish syllable boundaries from word boundaries (Cutler & Butterfield, 1992). It is more probable that the practice of syllable segregation is motivated primarily by production constraints in which the speaker slows and segments the message to gain articulatory accuracy; perceptual constraints may be a subordinate consideration.

3.3 Moments of Interruption and Ways of Restarting

The present study offers limited support for the notion of prearticulatory editing of internal speech in AOS (Levelt, 1992). These speakers, despite their compromised speech systems, tend to interrupt their errors (at least the ones to be revised) soon after they are produced.
The point should be made that many more errors were produced than were revised; nonetheless, some form of prearticulatory monitoring must be at least partially functional. Table 5 shows the percentages of loci for moments of interruption for the present corpus of errors. There was no significant relationship between prosodic marking and error type for moment of interruption, so these values were collapsed. Although comparison data from studies of normal speech errors are limited, one would expect that phonetic errors most likely would be interrupted within the word containing the reparandum, particularly if the error occurs early in the word, because the error is sublexical (Blackmer & Mitton, 1991; Brédart, 1991; Levelt, 1992; Levelt & Cutler, 1983). This was often true; 45% of phonetic errors involved moments of interruption within the word containing the reparandum (e.g., ‘‘b k . . . baeskət’’ for the target ‘‘basket’’). Moreover, 53% of the interruptions occurred immediately following the word containing the phonetic error (e.g., ‘‘papan . . . papkan’’ for the target ‘‘popcorn’’). Even among this disordered population, the moments of interruption were temporally close to the phonetic errors.

TABLE 5
Moments of Interruption by Error Type, Collapsed Across Markedness

                       Within      Right After   Delay
Linguistic (n = 31)    4 (13%)     23 (74%)      4 (13%)
Phonetic (n = 49)      22 (45%)    26 (53%)      1 (2%)

Note. Three possible moments of interruption include: (1) within the word containing the reparandum (Within), (2) immediately following the word containing the reparandum (Right After), and (3) a delayed moment of interruption in which the reparandum is followed by some or all of the remaining message (Delay).

Though few moments of interruption came within the reparanda for linguistic errors, the majority immediately followed the reparanda: 74% were
associated with interruptions immediately following completion of the incorrect word (e.g., ‘‘him . . . her’’ for the target ‘‘her’’), and 13% were delayed moments of interruption (e.g., ‘‘placed two glasses down for him . . . (pause) . . . two coffee cups down for him’’). Again, this pattern is predicted from studies of normal speech error revisions in which the unit of error is the word. Chi-square analysis revealed that the type of error was significantly related to the moment of interruption (χ2[2] = 10.9, P = .0042).

Even though moments of interruption were very close in time to the phonetic and linguistic errors, the restarts rarely occurred immediately (as defined herein). The vast majority of phonetic error restarts (90%) and virtually all of the linguistic error restarts (97%) occurred following editing terms, pauses that exceeded 500 ms, or backtracking, as shown in Table 6. This suggests that, for these speakers, the moment of interruption is more closely associated with detection of the error than with the preparedness of the revision. It indicates that the planning of the revision is occurring sometime after any prearticulatory monitoring process and after the error has been committed.

TABLE 6
Restart by Error Type, Collapsed Across Markedness

                       Pause      Edit        Backtrack
Linguistic (n = 31)    1 (3%)     21 (68%)    9 (29%)
Phonetic (n = 49)      5 (10%)    31 (63%)    13 (27%)

Note. Pause refers to revisions that began within 500 ms of the offset of the error, with no intervening vocalizations. Edit refers to instances in which vocalizations follow the reparandum that are extraneous to the message and indicate trouble has occurred (‘‘er’’ and ‘‘uh’’), and/or silent pauses that exceed 500 ms in duration. Backtrack refers to revisions that include repeating a portion of the message that preceded the moment of interruption.

Latencies between reparanda and revisions are presented in Table 7.
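The chi-square statistic reported above for the relationship between error type and moment of interruption can be checked against the Table 5 counts. The sketch below computes a Pearson chi-square by hand for the 2 × 3 table. It assumes that the linguistic Right After count is 23 (the value implied by the reported 74% of 31 and by the row total of 31), and the function names are illustrative rather than anything used in the original analysis. For two degrees of freedom the upper-tail probability has the closed form exp(−χ²/2), so no statistics library is needed.

```python
import math

# Moments of interruption (Within, Right After, Delay) by error type.
# Assumption: the linguistic "Right After" count is 23, as implied by the
# reported 74% of n = 31 and by the row total.
OBSERVED = [
    [4, 23, 4],   # linguistic errors, n = 31
    [22, 26, 1],  # phonetic errors, n = 49
]


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail probability of a chi-square variate with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2)


if __name__ == "__main__":
    chi2, dof = pearson_chi_square(OBSERVED)
    print(f"chi-square({dof}) = {chi2:.2f}, p = {p_value_two_dof(chi2):.4f}")
```

With these counts the statistic is approximately 10.95 with a probability close to .0042, in line with the values reported in the text. Two of the expected counts (the Delay cells) fall below 5, so the chi-square approximation should be read with some caution.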
Unmarked linguistic and phonetic revisions occurred on average earlier than did the marked, and phonetic revisions occurred earlier than linguistic. However, a two-way analysis of variance revealed no significant differences among these latencies, and no interaction between error type and marking. Similarly, no differences in median latency values for the different moments of interruption were identified with a Kruskal-Wallis One Way ANOVA on Ranks (H[2] = 4.51, P = .105). As expected, the median latency of pauses was significantly shorter than either those of edit or backtrack (H[2] = 19.7, P < .0001) when the data were sorted by types of restart.

TABLE 7
Ranges, Means, and Standard Deviations of Latencies between Reparanda and Revisions

                     N     Range (s)    Mean (s)   S.D.
Linguistic errors
  Marked             19    .50–8.81     2.72       2.54
  Unmarked           12    .62–5.07     2.29       1.53
Phonetic errors
  Marked             32    .00–7.12     2.10       2.01
  Unmarked           17    .00–3.7      1.72       1.15

The fact that immediate restarts were seen only for some of the phonetic error revisions and none of the linguistic error revisions, at first glance, may be taken as evidence of a prearticulatory monitor functioning on some phonetic buffer. However, phonetic revisions frequently were not adequate repairs of the errored production. Phonetic revisions were often revised themselves, thereby becoming reparanda. This was especially true for rapid sequences that would be called ‘‘articulatory groping’’ or ‘‘successive articulatory approximations’’ by perceptual standards (Valdois et al., 1989). Considering the preponderance of very long average latencies, these data do not offer a compelling argument for the presence of a functional prearticulatory buffer or for the presence of adequate incremental processing (Kempen & Hoenkamp, 1987). In other words, there is little evidence that the planning of the revision occurs prior to the production of the error.
This is in contrast with studies of normal error repairs that show very rapid restarts (Blackmer & Mitton, 1991).

4.0 CONCLUSIONS

Does this study of the form of speech error–repairs in AOS provide any insight into the perception–production constraint controversy? This study indicates that these speakers with relatively pure forms of apraxia of speech do recognize and modify at least a portion of their speech errors in spontaneous speech. However, the error repair strategies do not coincide entirely with those reported for normal speakers. The phenomenon of prosodic marking has been hypothesized to assist listeners in the recognition of what spoken segment is in error, and how that error is to be corrected. However, these speakers exhibited two types of prosodic marking that may actually serve to confuse the listener: The prosodic marking of phonetic errors and the use of durational modifications of syllables to prosodically mark revisions were exhibited by all four speakers. It may be the case that the nonpredicted types of prosodic marking evidenced here are simply byproducts of the motor speech disorder and are not exemplars of the prosodic marking that has been described for the neurologically normal population. This possibility is difficult to assess, given that there is no comparable database of phonetic errors produced by neurologically normal speakers, who rarely make, and hence rarely repair, such errors in spontaneous speech. It may also be the case that, as Berg (1992) suggested, perceptual constraints may be subordinate to production constraints and that this is only apparent under certain conditions. In either case, this study offers an example of prosodic marking that does not, according to current theoretical accounts, benefit the listener. This study also offers some insight into the nature of the deficit in AOS.
Investigations of speech errors in AOS collectively have found evidence of ‘‘higher level’’ deficits in phonological encoding (e.g., Canter, Trost, & Burns, 1985), and ‘‘lower level’’ deficits in the specification of muscle activity and sequencing (Itoh & Sasanuma, 1984; McNeil, Weismer et al., 1990; Seddoh et al., 1996; Ziegler & von Cramon, 1986) (see Footnote 5). If we accept a more unifying view of the phonological encoding stage such as that offered by Levelt (1989), the presence of errors coded as linguistic and as phonetic in the present data set is not unexpected for these speakers. The present investigation also provides some evidence for the presence of an inconsistent or inadequate prearticulatory buffer in phonological encoding. Despite relatively rapid moments of interruption (which are the theoretical outside limits for error recognition), restarting tended to be delayed among these speakers. Their slow repairs may reflect the need to plan and then implement the repair following the moment of interruption.

REFERENCES

Béland, R., Paradis, C., & Bois, M. 1993. Constraints and repairs in aphasic speech: A group study. Canadian Journal of Linguistics, 38(2), 279–302.
Berg, T. 1986. The aftermath of error occurrence: Psycholinguistic evidence from cut-offs. Language and Communication, 6, 195–213.
Berg, T. 1992. Productive and perceptual constraints on speech-error correction. Psychological Research, 54, 114–126.
Blackmer, E., & Mitton, J. 1991. Theories of monitoring and the timing of repairs in spontaneous speech. Cognition, 39, 173–194.
Brédart, S. 1991. Word interruption in self-repairing. Journal of Psycholinguistic Research, 20(2), 123–138.
Canter, G. J., Trost, J. E., & Burns, M. S. 1985. Contrasting speech patterns in apraxia of speech and phonemic paraphasias. Brain and Language, 24, 204–222.
Cutler, A. 1983. Speaker’s conceptions of the functions of prosody. In A. Cutler & D. Ladd (Eds.), Prosody: Models and measurements. Heidelberg, Germany: Springer.
Cutler, A., & Butterfield, S. 1992. Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218–236.
Darley, F. 1982. Aphasia. New York: Saunders.
Darley, F., Aronson, A., & Brown, J. 1975. Motor speech disorders. Philadelphia: Saunders.
Dell, G. S. 1986. A spreading-activation theory of retrieval in sentence production. Psychological Review, 93(3), 283–321.
Goffman, E. 1981. Radio talk. In E. Goffman (Ed.), Forms of talk. Oxford, UK: Blackwell.
Itoh, M., & Sasanuma, S. 1984. Articulatory movements in apraxia of speech. In J. C. Rosenbek, M. R. McNeil, & A. E. Aronson (Eds.), Apraxia of speech: Physiology, acoustics, linguistics, management (pp. 135–166). San Diego: College-Hill.
Joanette, Y., Keller, E., & Lecours, A. R. 1980. Sequences of phonemic approximations in aphasia. Brain and Language, 11, 30–44.
Kempen, G., & Hoenkamp, E. 1987. An incremental procedure grammar for sentence formulation. Cognitive Science, 11, 201–258.
Kent, R., & Rosenbek, J. 1982. Prosodic disturbance and neurologic lesion. Brain and Language, 15, 259–291.
Levelt, W. J., & Cutler, A. 1983. Prosodic marking in speech repair. Journal of Semantics, 2(2), 205–217.
Levelt, W. 1983. Monitoring and self-repair in speech. Cognition, 14, 41–104.
Levelt, W. J. M. 1989. Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. 1992. Accessing words in speech production: Stages, processes and representations. Cognition, 42, 1–22.
McNeil, M., Liss, J., Tseng, C., & Kent, R. 1990. Effects of speech rate on the absolute and relative timing of apraxic and conduction aphasic sentence production. Brain and Language, 38, 135–158.
McNeil, M., Weismer, G., Adams, S., & Mulligan, M. 1990. Oral structure nonspeech control in normal, dysarthric, aphasic and apraxic speakers: Isometric force and static position control. Journal of Speech and Hearing Research, 33, 255–268.
Milenkovic, P., & Read, C. 1992. CSpeech Version 4, Laboratory Automation Reference. Madison, Wisconsin.
Nickels, L., & Howard, D. 1995. Phonological errors in aphasic naming: Comprehension, monitoring and lexicality. Cortex, 31, 209–237.
Nooteboom, S. 1980. Speaking and unspeaking: Detection and correction of phonological and lexical errors in spontaneous speech. In V. Fromkin (Ed.), Errors in linguistic performance (pp. 287–305). New York: Academic Press.
Odell, K., McNeil, M. R., Rosenbek, J. C., & Hunter, L. 1991. Perceptual characteristics of vowel and prosody production in apraxic, aphasic, and dysarthric speakers. Journal of Speech and Hearing Research, 34(1), 67–80.
Schlenck, K., Huber, W., & Willmes, K. 1987. ‘‘Prepairs’’ and repairs: Different monitoring functions in aphasic language production. Brain and Language, 30, 226–244.
Seddoh, S., Robin, D., Sim, H., Hageman, C., Moon, J., & Folkins, J. 1996. Speech timing in apraxia of speech versus conduction aphasia. Journal of Speech and Hearing Research, 39, 590–603.
Stemberger, J. 1985. An interactive activation model of language production. In A. Ellis (Ed.), Progress in the psychology of language, Vol. 1 (pp. 143–186). London, UK: Erlbaum.
Stemberger, J. 1992. Vocalic underspecification in English language production. Language, 68(3), 492–524.
Valdois, S., Joanette, Y., & Nespoulous, J. 1989. Intrinsic organization of sequences of phonemic approximations: A preliminary study. Aphasia, 3(1), 55–73.
Wertz, R., LaPointe, L., & Rosenbek, J. 1984. Apraxia of speech in adults: The disorder and its management. Orlando, FL: Grune & Stratton.
Ziegler, W., & von Cramon, D. 1986. Disturbed coarticulation in apraxia of speech: Acoustic evidence. Brain and Language, 29, 34–47.

Footnote 5. These apparently disparate findings have promoted the longstanding debate on whether AOS is a form of nonfluent aphasia or whether it is more closely aligned with other motor speech disorders (see Kent, 1990, and Rosenbek, Kent, & LaPointe, 1984, for reviews of the history of this controversy).