CLINICAL LINGUISTICS & PHONETICS, 1995, VOL. 9, NO. 2, 139-154

The role of listener familiarity in the perception of dysarthric speech

KRISTIN K. TJADEN* and JULIE M. LISS**

115 Shevlin Hall, University of Minnesota, Minneapolis, MN 55455, USA

(Received 30 November 1993; accepted 29 April 1994)

Abstract
The effects of two qualitatively different familiarization procedures on listeners' perceptions of speech produced by a Korean woman with moderate-severe spastic-ataxic dysarthria were investigated. Thirty listeners were randomly assigned to one of three listening conditions. Prior to a sentence transcription task, members of the two experimental listening groups were familiarized with a speech sample produced by the woman with dysarthria. Listeners assigned to the Paragraph group heard the speaker read a paragraph twice as they followed along with a script. Listeners assigned to the Word List group heard the speaker read the same words comprising the paragraph, but the words were presented in a random order; this list was also heard twice as the listeners followed along with a script. The Control group performed the transcription task without receiving prior exposure to the dysarthric speech. All 30 listeners then orthographically transcribed sentences produced by the dysarthric speaker. Results showed that listeners familiarized with this dysarthric person's speech pattern (Word List and Paragraph groups) performed significantly better on the sentence transcription task than listeners who were not familiarized (Control group). Although the Paragraph group did not significantly outperform the Word List group, as was predicted, a trend in that direction did exist.
Keywords: dysarthria, speech perception, familiarization

Introduction

The human perceptual system is extraordinarily facile in its ability to select and interpret information available from a complex acoustic signal (Pisoni and Luce, 1986; Engstrand, 1992). It is this proficiency that has permitted researchers to regard perception as somewhat of a constant in speech intelligibility research. That is, intelligibility has been viewed largely as a byproduct or function of the integrity of the speech signal and the acoustic attributes therein (see Weismer and Martin, 1992, for a review of intelligibility research). This approach has provided information about the bottom-up component of speech processing, in which signal characteristics drive perception. This causal relationship has been explored in studies of deaf speech (e.g. Maassen and Povel, 1985; Metz, Samar, Schiavetti, Sitler and Whitehead, 1985; Monsen, 1978), dysarthric speech (e.g. Kent, Weismer, Kent and Rosenbek, 1989; Tikofsky, Glattke and Tikofsky, 1966), and synthesized speech (e.g. Nye and Gaitenby, 1973). However, there is reason to believe that listeners confronted with a degraded speech signal must invoke more top-down, or higher-level, cognitive processing to decipher and reconstruct the message than would be necessary for the processing of non-degraded speech (Duffy and Pisoni, 1992; Greene and Pisoni, 1988; Pisoni and Luce, 1986).

*Currently affiliated with the University of Wisconsin-Madison.
**Currently affiliated with Arizona State University.

0269-9206/95 $10.00 © 1995 Taylor & Francis Ltd.
Numerous variables have been identified that could influence this process, including word-frequency effects (Rosenzweig and Postman, 1957), semantic or syntactic predictability (Duffy and Giolas, 1974; Giolas, Cooker and Duffy, 1970; Hammen, Yorkston and Dowden, 1991), the amount of available contextual information (Kreider, 1988), and a person's own prior listening experience (Platt, Andrews, Young and Quinn, 1980). A successful model of the perception of degraded speech must be able to accommodate the interaction between listener variables and signal attributes.

Familiarity and synthetic speech

A first step towards this goal is to identify and define potent listener variables that affect the listener-signal interaction when the speech signal is substantially degraded. This task is fairly straightforward when speech signal attributes are well specified, as with synthesized or digitally filtered speech. For example, consider the hypothesis that listeners can decipher a degraded signal more readily if they have had prior exposure to it (listener familiarization). Greenspan, Nusbaum and Pisoni (1988) evaluated the effects of various listener familiarization training procedures on the intelligibility of synthetic speech. They found that listeners who received prior exposure to the Votrax speech signal outperformed those listeners who did not receive the same training. Perhaps more importantly, these investigators found a relationship between certain aspects of the training procedure and listener benefits. Listener performance patterns were influenced by the type of training (sentence versus word level), the extent of training (single exposure versus repeated exposure), and the type of listening task subjects were asked to perform (sentence versus single-word transcription).
It was suggested that training at the sentence level may be superior to word-level training because it provides the listener with information about prosodic patterns that can facilitate segmentation of the acoustic stream into words. The Greenspan et al. (1988) study is an example of how listener performance, and perhaps perceptual strategies, can be altered differentially with various familiarization training procedures. The listener variable of familiarization was manipulated, and the resulting listener perceptions were interpreted relative to the information available in the synthetic speech signal. The assumption is that, by knowing the acoustic information available to listeners in the speech signal and by combining this with knowledge of how the listener was familiarized with the speech signal, it is possible to hypothesize as to the nature of the constructive, top-down perceptual strategies that listeners invoked to interpret the message. Central to Greenspan et al.'s (1988) study is that the acoustic patterns of synthesized speech are systematic, consistent, and easily characterized. The examination of listener variables (e.g. listener familiarization) is less straightforward when the speech signal is degraded in a non-systematic way, as is the case with dysarthria.

Familiarity and dysarthric speech

Although one would expect from anecdotal evidence and clinical experience that familiarization would improve a listener's ability to understand dysarthric speech, the few attempts to document the phenomenon have met with limited success. There are at least two possible reasons for this: the heterogeneity of dysarthric speech, and methodological differences involving various definitions and levels of familiarity. These reasons are discussed in turn.
Heterogeneity of dysarthric speech

Dysarthric speakers comprise a highly heterogeneous group (Yorkston and Beukelman, 1980). Even people whose speech is regarded as representative of a specific dysarthric subtype have speech patterns that reflect some combination of the motor speech disorder, the idiosyncrasies of their premorbid speech style, and their attempts to compensate for their loss of intelligibility. The picture is complicated further by differences in severity of speech impairment, and by the lack of stability or consistency of certain segmental and suprasegmental error patterns. The fact that there are no 'typical' dysarthric speakers is perhaps the greatest barrier to developing generalizations regarding the perceptual processes listeners use to decipher dysarthric speech. Dysarthric speech differs on a variety of levels, and we do not yet know which of these levels might be associated with certain preferences in perceptual strategies. It is quite possible that listeners invoke different perceptual strategies for different dysarthric speakers, selecting the strategies that are most efficient and effective. Until the phenomenon of familiarization is better understood, the variables on which to categorize groups of speakers in order to obtain generalizable results will remain virtually unknown. Only then can predictions about potentially potent speech signal variables, such as severity of impairment and pattern of errors, be explored.

Methodological differences among familiarization studies

The term 'familiarization' can be defined along a variety of continua, including the duration of the exposure, the type of material that is used, and whether feedback is provided. The present investigation examined the effects of a brief prior exposure to dysarthric speech on listeners' ability to transcribe a series of sentences produced by that dysarthric speaker.
Because listeners were permitted to follow along on a written transcript while they heard the dysarthric speaker's familiarization sample, it was hypothesized that they would learn something about her articulatory and prosodic patterns, and consequently be more successful in a transcription task than those listeners who did not receive prior exposure. The method used in the present investigation can be regarded as a form of specific familiarization training, because listeners were given the opportunity to hear a sample of a specific person's speech before transcribing sentences produced by that speaker. By contrast, previous studies have examined the effects of more general forms of familiarization, such as exposure to a group of disordered speakers, or prior experience with a particular disorder rather than with a specific individual. Yorkston and Beukelman (1983) investigated the effects of a more general form of familiarization in the perception of dysarthric speech. Listeners in the experimental groups were exposed to the speech of several dysarthric speakers. Nine dysarthric subjects produced two sets of sentences (List I and List II). Nine listeners, who were assigned to one of three familiarization conditions, transcribed these sets of sentences. The first listener group, which received no familiarization, transcribed List I and then transcribed List II 2 weeks later. The second group transcribed List I, listened to the List I sentences three more times without feedback, and then immediately transcribed List II. The third group transcribed List I, listened to the List I sentences three more times while following along with an accurate transcript, then transcribed List II. Results indicated that neither familiarization group
benefited from the exposure to the dysarthric speech, nor did they perform better than the non-familiarized group. Yorkston and Beukelman interpreted these findings as evidence that judges could be used repeatedly to transcribe intelligibility samples from the same speakers, without the threat of artifactual increases in intelligibility scores. However, random assignment of listeners to the three conditions resulted in the most experienced listeners being assigned to the 'no-familiarization' condition. As the investigators noted, this may have accounted for the higher prefamiliarization intelligibility scores in the 'no-familiarization' condition. Thus, differences in listener experience may have obscured any specific familiarization effects. This possibility is supported by evidence from Platt et al. (1980), who reported performance differences between experienced and naive listeners in the transcription of speech produced by people with cerebral palsy (see, however, Hunter, Pring and Martin, 1991, for counter-evidence in this same population). Monsen (1983) described similar results for the perception of deaf speech, wherein people who had personal or professional experience listening to deaf speech performed a transcription task with higher accuracy than those listeners who had little or no prior experience with deaf speech. These mixed results suggest that familiarization is a complex phenomenon that goes beyond allowing a listener simply to map the acoustic-phonetic structure of the speech signal onto prototypical phonemes (see Verbrugge, Strange, Shankweiler and Edman, 1976). Because so little is known about the variables that influence the perceptual processing of dysarthric speech, the present investigation explored the phenomenon of familiarization as it pertained to one sample of dysarthric speech.
The goals were to determine whether listeners familiarized with a dysarthric person's speech benefited from the exposure, and whether the form of the familiarization material had an impact on the degree of perceptual benefit. It was hypothesized that listeners who were exposed to the speaker's sentence-level prosody and inter-word coarticulation would perform better on a sentence transcription task than those listeners who were exposed only to a list of single words.

Method

Speech sample
A speech sample was collected, as part of a larger investigation, from a dysarthric speaker. This speaker was a 26-year-old Korean woman with cerebral palsy and a spastic-ataxic dysarthria of moderate-severe involvement. Because the purpose of this study was not to generalize the results of the listening experiment to the perception of all dysarthric speech, but to explore the phenomenon of familiarization with a single dysarthric speech pattern, any dysarthric speaker with substantially reduced intelligibility¹ would have provided suitable stimuli to test the impact of the two familiarization procedures. The benefit of using this particular speaker was the opportunity to assess listener responses to both relatively variable and relatively consistent error patterns. In this case the variable articulatory errors resulted from the spastic-ataxic dysarthria, and the more consistent error patterns were judged to be related to the speaker's acquisition of English as a second language (ESL). This distinction is important, especially in this initial attempt to document familiarization as it is defined herein. The Assessment of Intelligibility of Dysarthric Speech (Yorkston and Beukelman, 1981), as scored by a certified speech-language pathologist unfamiliar with the speaker, revealed a mean single-word intelligibility of 46%, and a sentence intelligibility range of 0-85% with a mean of 32.7%.
A motor speech examination, conducted by the authors, revealed substantial deficits in speech articulation and in respiratory-laryngeal control. Prosodic patterns were characterized by inappropriate loudness changes, short phrases, and inappropriate pauses. Some of the more variable articulatory errors were related to the breakdown in respiratory-laryngeal control. These included the intermittent devoicing of voiced consonants and vowels, resulting in a whispered quality for some speech segments. Although devoicing of voiced phonemes might be considered a result of this speaker's ESL errors, the fact that she was unable to produce a sustained vowel for more than several seconds suggests the devoicing to be a laryngeal deficit resulting from the dysarthria. Subjectively, this woman's low intelligibility entirely masked the fact that she was a non-native English speaker. However, some of her articulatory errors were believed to be the result of her acquisition of English as a second language. These more consistent errors included her tendency to substitute [l] for /r/ in the word-initial position. It should be noted that the Korean language uses two variants of the liquid /l/, including a lateral [l] in word-final position and a flap [r] in word-initial position (Kim, 1990). Thus, it appears that substitution of word-initial [l] for /r/ would not be predicted based upon transfer of native Korean contrasts to learning English as a second language. However, Major (1994) notes that a speaker's underlying or mental representation of a particular sound may reflect the speaker's native language, the second language, or something intermediate. Furthermore, the processes that act on the underlying representation to produce a surface form may be the same as in the native or second language, or may be different from both.
To illustrate, Major (1994) gives the example of a Korean learner of English whose underlying representation of English /r/ is unlike either the native Korean flap [r] or the English /r/; instead, when [r] is produced it may sound to English speakers as intermediate between English /r/ and /l/. Thus, the interaction between native and non-native contrasts is complex and may not be predictable across speakers. For our purposes we were interested in distinguishing between variable (dysarthric) errors and consistent (ESL) errors. The mechanism by which the consistent errors occurred was not of specific interest in the present investigation. However, Major's (1994) observations suggest that our Korean speaker's substitution of [l] for /r/ word-initially was related to her acquisition of English as a second language in some complex manner. Audio recordings of the speech samples were made in a quiet room using a Tascam 112 audio recorder, a BK-1 Electrovoice condenser microphone mounted on a table-top microphone stand, and high-fidelity recording media. For the present study, three audio tapes were constructed over two 1-hour sessions. The first tape (Transcription) consisted of sentences that were used as the stimuli for the sentence transcription task performed by all listeners in this investigation. The second and third audio tapes (Word List and Paragraph) were used for the familiarization procedures to which some listeners were exposed. The dysarthric speaker, who was a graduate student proficient in English, read through all of the speech material several times before the audio taping, to increase her familiarity with the material. Utterances judged to be poorly read during the taping (read with dysfluencies or word-level additions or omissions) were repeated. The Transcription tape contained 48 six-word sentences², including 16 questions, 16 declaratives, and 16 imperatives (see Table 3).
The sentences were created by the investigators and were designed to sample the speaker's productions of a wide variety of phonemes and a range of prosodic variation. The speaker was instructed to produce these sentences in her customary way. The Paragraph tape was created by asking the speaker to read a paragraph (see Appendix) that consisted of 12 six-word sentences (72 words). The Word List tape was obtained by asking the speaker to read a list of 72 words. This list consisted of the words contained in the Paragraph tape arranged in a randomized order. Thus, the corpora of words in the Paragraph and Word List tapes were identical. The purpose of creating two familiarization tapes identical in content but not form was to identify differential benefits between prior exposure to sentence-level prosody and inter-word coarticulation (Paragraph tape), and exposure to word-level articulation patterns only (Word List tape). None of the sentences in the Paragraph tape was identical to any sentence on the Transcription tape. Overlap in content was restricted to common, frequently occurring words (e.g. 'the, is, it, off, can, to').

Subjects
Thirty female³ college students from the University of Minnesota served as subjects for this investigation. Subjects reported normal hearing, Standard American English as a first language, and little or no experience listening to dysarthric speech. The listening task took approximately 45 minutes, and subjects were compensated for their time.

Listening task
The 30 subjects were randomly assigned, in equal numbers, to one of three listening groups: Control, Word List, or Paragraph.
One to five subjects belonging to the same group were assembled in a room equipped with a master tape deck and rows of study carrels that contained high-quality headphones (University of Minnesota Language Laboratory). This setting was ideal for data collection because the study carrels reduced visual distractions and prohibited subjects from being influenced by the written responses of others. General instructions were identical for all listening groups. Subjects were told they would be hearing the speech of a woman whose first language was not American English, and who had suffered damage to her nervous system that affected the way she could use her muscles to talk. Subjects were told that they would hear her produce a list of six-word sentences, each sentence followed by a 20-second pause. During this pause, listeners should write what they heard on an answer sheet. Listeners were encouraged to guess if they were unsure of the speaker’s utterance. Immediately prior to performing the sentence transcription task, members of the Word List and Paragraph groups participated in similar familiarization procedures. Subjects in the Word List familiarization group were given a written list of words corresponding to the content of the Word List tape. Subjects followed along with this written list while they listened to the Word List tape twice. The Paragraph familiarization procedure was conducted in essentially the same way. However, these subjects followed a written script of the paragraph while listening to the Paragraph tape twice. Subjects in the familiarization groups did not transcribe their respective familiarization tapes, but only followed along with the written scripts while they listened to the tape. Listeners in the Control group did not participate in any familiarization procedure. Table 1 contains a summary of the experimental conditions and procedures. 
Data analysis

Two scores were calculated from the listener response sheets and were used for statistical and descriptive purposes. Each score and its method of calculation is described in detail below. Scores were calculated for individual listeners and then averaged within each experimental group.

Table 1. Summary of the experimental groups and procedures

Listening group     Familiarization procedure                                    Task
Control (n = 10)    None                                                         Transcribe 48 six-word sentences
Word List (n = 10)  Hear Word List tape twice; follow along with written list    Transcribe 48 six-word sentences
Paragraph (n = 10)  Hear Paragraph tape twice; follow along with written script  Transcribe 48 six-word sentences

Words-correct score
The number of words correctly transcribed of the six words in each sentence was tallied and then summed across the 48 sentences, resulting in a words-correct score. The possible range of this score was from 0 to 288 (48 sentences × 6 words) words correctly transcribed. A word was considered to be transcribed correctly if the listener's response exactly matched the target word; that is, the word the speaker was instructed to read.

Total word-response score
A total word-response score was obtained by tallying all of the words, correct and incorrect, that a listener attempted to transcribe. This score had a range of possible values from 0 to 288. Because listeners were instructed to 'guess' if unsure of the speaker's productions, it was expected that this score would be considerably higher than the words-correct score. The purpose of this score was to identify any differences in response rates across listener groups.

Item analysis
An item analysis was conducted to assess whether all groups performed similarly for each of the 48 sentences.
This assessment was designed to identify inherent differences in intelligibility among the sentences, and to reveal any item-specific differences in group performance.

Error pattern assessment
Finally, a qualitative assessment of listener error patterns was accomplished by comparing listener responses with selected perceptual/acoustic attributes of the 48 stimulus sentences. Toward this end, the two investigators performed a broad phonetic transcription of the 48 sentences using selected diacritics. Because of the highly distorted nature of the speech, broad-band (300 Hz) spectrographic displays created on a Kay 5500 workstation were used in conjunction with the phonetic transcription to yield a perceptual/acoustic characterization of each of the 48 stimulus sentences. These spectrograms served as an additional source of information to help explain the perceptual judgements made by the listeners. This error pattern analysis was undertaken to identify and compare perceptual errors made across listener groups.

Results

Parametric results
Table 2 contains performance data for each of the three listener groups.

Table 2. Summary data for experimental groups. Mean words-correct scores were calculated on a possible 288 words for each subject. Total response values are the means of the absolute number of words (correct + incorrect) transcribed by each group.

Group       Mean (SD) no. of words correct   Mean (SD) total no. of word responses
Control     84.1 (10.5)                      219.2 (37.3)
Word List   99.5 (16.5)                      229.0 (36.4)
Paragraph   110.2 (13.5)                     238.9 (29.1)

A one-way analysis of variance (ANOVA) of the words-correct scores revealed significant differences among these group means, F(2,29) = 9.1333; p < 0.001.
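The scoring rule and the group comparison reported above can be illustrated with a short, self-contained sketch. The function names and the example data are ours (the individual listener scores are not published), and the position-by-position word matching is an assumption; the paper states only that a response word had to exactly match the target word.

```python
from statistics import mean

def score_transcript(response_sentences, target_sentences):
    """Words-correct and total word-response scores for one listener.

    Illustrative sketch: assumes each response word is matched against
    the target word in the same position of the six-word sentence.
    """
    words_correct = 0
    total_word_responses = 0
    for response, target in zip(response_sentences, target_sentences):
        resp_words = response.lower().split()
        targ_words = target.lower().split()
        # Every attempted word counts toward the total word-response
        # score, whether or not it is correct.
        total_word_responses += len(resp_words)
        words_correct += sum(r == t for r, t in zip(resp_words, targ_words))
    return words_correct, total_word_responses

def one_way_anova(groups):
    """One-way ANOVA F statistic computed from sums of squares.

    groups is a list of score lists, one per listening group; returns
    (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group variability: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

Applied to three groups of ten words-correct scores, `one_way_anova` corresponds to the F test reported above; the post hoc Newman-Keuls procedure is a separate step-down range test not sketched here.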
In addition, post hoc Newman-Keuls tests identified significant differences between the Control and Word List groups (q = 3.547; p < 0.05), and between the Control and Paragraph groups (q = 6.012; p < 0.05). The 10-point difference between the Paragraph and Word List group means was not statistically significant. The second data column of Table 2 contains the group means of the total word-response score. A one-way ANOVA of these total word-response scores revealed no significant differences among the three group means, indicating no differences in response rate among the three groups.

Descriptive results

Item analysis
Table 3 contains the target sentences and the total number of words correctly transcribed for each sentence by group. The first data column contains total words-correct values for which the Paragraph (P) group performed better than either the Word List (W) or the Control (C) group (P > W,C). For example, for sentence no. 2 (Open the door for your mother) the Paragraph group produced 50/60 correct word transcriptions, which exceeded the 49 and 41 correct words transcribed by the Word List and Control groups, respectively. This trend, in which the Paragraph group outperformed both the Word List and Control groups, occurred for 25 of the 48 sentences. The second data column contains total words-correct values for which the Word List group performed best (W > P,C). This pattern of performance held for 14 of the 48 sentences.

Table 3. Target sentences, with the total number of words correctly transcribed for each sentence by group (Paragraph (P), Word List (W), Control (C)). The Paragraph group outperformed both other groups (P > W,C) for 25 sentences; the Word List group performed best (W > P,C) for 14 sentences.

1. Old shoes are usually most comfortable.
2. Open the door for your mother.
3. Turn off the television before bed.
4. Shall we go out for lunch?
5. Who is your college major advisor?
6. Write your name on the paper.
7. Was it raining outside yesterday morning?
8. Too much coffee makes me jumpy.
9. Will she go to the movie?
10. What time are we going home?
11. Throw the garbage in the dumpster.
12. When do you get to work?
13. Move your books off the table.
14. Wash the dishes and dry them.
15. When can I go to sleep?
16. Feed the dog canned food only.
17. Will the mail be here soon?
18. Take the children to the zoo.
19. Be sure to lock the door.
20. Rain may ruin our spring picnic.
21. Bring a friend to the party.
22. Who won the game last night?
23. The husband and wife got divorced.
24. Are we having steak for dinner?
25. Let me carry that heavy package.
26. Working allows people to earn money.
27. Can you come for dinner tonight?
28. Please walk the dog for exercise.
29. Biking is a common summer sport.
30. Order a sausage pizza for dinner.
31. Green plants supply oxygen for us.
32. Will you dust the furniture?
33. Are flat tyres difficult to change?
34. The girl bought rice and potatoes.
35. Water the plants twice a week.
36. Glass shattered all over the counter.
37. Kittens sleep under beds and chairs.
38. Turn the radio to another station.
39. Find the recipe for cherry pie.
40. Indian jewellery sells well in shops.
41. Where will the band perform next?
42. Fans keep rooms cool during summer.
43. Chocolate cake is sweet to eat.
44. Is the shopping centre still open?
45. Telephones allow two people to communicate.
46. How many brothers do you have?
47. The gravel crunched beneath our feet.
48. Put the files in the cabinet.

The item analysis also revealed a range of inherent intelligibility across sentences. Eight sentences elicited group sums across all listener groups (see Table 3) that met or exceeded 30/60 correct (50%): sentences 2, 8, 10, 12, 15, 19, 27, and 46. These high intelligibility sentences were at the upper end of this speaker's intelligibility range, as defined by her performance on the Assessment of Intelligibility of Dysarthric Speech (Yorkston and Beukelman, 1981). Similarly, 12 sentences elicited group sums across all listener groups that fell at or below 15% correct (9/60): sentences 1, 7, 20, 29, 31, 34, 35, 36, 37, 40, 43, and 47. These were identified as low intelligibility sentences. The differences between the high and low intelligibility sentences cannot be explained entirely by any single attribute; however, sentence length may play a role. For example, the high intelligibility sentences contained one- and two-syllable words and typically contained fewer than eight syllables in total. In contrast, many of the low intelligibility sentences contained three-syllable words and had more than eight syllables in total. One interpretation of these findings is that longer sentences may have been more difficult for the speaker to produce, and that listeners would be challenged in distinguishing word boundaries from syllable boundaries.
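The P > W,C and W > P,C tallies behind Table 3 amount to a per-sentence comparison of group totals. A minimal sketch follows; the function name is ours, and only the two per-sentence values quoted in the text are used as example data.

```python
def classify_items(per_sentence_totals):
    """Tally the per-sentence performance patterns of Table 3.

    per_sentence_totals maps a sentence number to its (Paragraph,
    Word List, Control) words-correct totals out of 60 (10 listeners
    x 6 words per sentence).  Returns the sentences for which the
    Paragraph group strictly outperformed both other groups, and
    those for which the Word List group did.
    """
    p_best = [n for n, (p, w, c) in sorted(per_sentence_totals.items())
              if p > w and p > c]
    w_best = [n for n, (p, w, c) in sorted(per_sentence_totals.items())
              if w > p and w > c]
    return p_best, w_best
```

For example, `classify_items({2: (50, 49, 41), 3: (46, 26, 20)})` returns `([2, 3], [])`, matching the pattern described above for sentences 2 and 3; over all 48 sentences the two tallies were 25 and 14.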
The item analysis also revealed that none of the listener groups displayed a learning effect across the task. Comparison of performance on items 1-25 was not different from performance on items 26-50 for any group. The absence of feedback regarding accuracy during the transcription task, combined with the speaker's generally low intelligibility, may have prevented listeners from learning more about her speech on-line.

Error pattern assessment

Although statistical analysis failed to reveal a significant difference between the two familiarization procedures, descriptive results point to a pattern of performance in which the Paragraph group outperformed the Word List group, with both groups outperforming the Control group. For example, the words-correct scores of Table 2 indicate a consistent hierarchy of performance in which the Paragraph group exceeded the Word List group, which in turn surpassed the Control group. The item analysis presented in Table 3 shows that the Paragraph group outperformed both the Word List and Control groups in 25 of 48 sentences. Also, of the 480 sentence responses per group (48 sentences × 10 listeners per group), the Paragraph group transcribed 61 sentences without error, while the Word List and Control groups transcribed 48 and 37 sentences completely without error, respectively. These results could reflect the importance of exposure to patterns of sentence-level prosody and inter-word coarticulation in providing cues for segmenting the acoustic stream into its lexical components (Greenspan et al., 1988; see also Verbrugge et al., 1976). To further illustrate differences in performance between the Paragraph and Word List groups for individual sentences, consider listener responses for sentence no. 3 (Table 3). Table 3 indicates that the Paragraph group outperformed the other two groups for number of words correctly transcribed (46 > 26 > 20).
Although these group differences did not reach statistical significance, the qualitative differences in response patterns across the three groups are remarkable. Table 4 shows response differences across listener groups for sentence no. 3. First, it is noteworthy that four members of the Paragraph group, three of the Word List group, and two of the Control group transcribed this sentence without error. Also, four additional members of the Paragraph group correctly transcribed the words 'turn off the', and five additional members correctly transcribed 'before bed'. By comparison, only one additional member from each of the Word List and Control groups correctly transcribed 'before bed', and the rest of the sentence transcription for each of these subjects (e.g. 'Put away potato chips' for the Word List member and 'Take the temperature' for the Control member) was more different from the target than were those of the Paragraph group members (e.g. 'Turn off the light', 'Turn off the light', 'Turn off the tape', 'Turn off the temperature', and 'The day begins'). This suggests that the type of familiarization procedure administered has an impact on a listener's ability to identify word boundaries in the acoustic stream.

Table 4. Responses of listeners to sentence no. 3, 'Turn off the television before bed'. Words scored as correctly transcribed were typeset in bold in the original; numbers in parentheses indicate the total number of words correctly transcribed by each group.

Paragraph (46 correct)
Turn off the television before bed
Turn off the television before bed
Turn off the television before bed
Turn off the television before bed
Turn off the light before bed
Turn off the light before bed
Turn off the tape before bed
Turn off the temperature before bed
The day begins before bed
[No response]

Word List (26 correct)
Turn off the television before bed
Turn off the television before bed
Turn off the television before bed
Talk to all the people there
People people bad
Talk with the teacher people there
Put away potato chips before bed
In the picture people
The picture be perfect
The television

Control (20 correct)
Turn off the television before bed
Turn off the television before bed
I be prepared
Try the taping people
Carry on the baby
Tunnel vision of picture people vary
Trying to hold people back
Take the temperature before bed
Turn the tape recorder people back
Picture before that

Table 5. Examples of errors that were judged to be primarily phonetic (P), or closely related to the acoustic structure of the distorted production, and those that were judged to be semantically (S) related to other percepts in the utterance rather than to the acoustic structure of the production.

Target: Too much coffee makes me jumpy.
Responses: Too much coffee makes me grumpy (P)
           Too much coffee makes me dumpy (P)
           Too much company (P) can be exhausting (S)

Target: Put the files in the cabinet.
Responses: Put the pie (P) in the oven (S)
           Put the fire (P) in the chimney (S)

Inspection of the listener error patterns also revealed that subjects across groups made similar types of perceptual errors, including phonemic and semantic. Examples of these types of errors are presented in Table 5. Phonemic errors were assumed to be a direct byproduct of the distorted acoustic signal. The transcriptions 'grumpy' and 'dumpy' for the target word 'jumpy' are plausible given the nature of the distortion produced by this speaker, in which the /dʒ/ was realized as a distorted stop rather than an affricate.
In contrast, 'exhausting' is not a plausible substitution for 'jumpy'; rather, it appears to be related to the earlier percept of 'company'. Similarly, 'oven' and 'chimney' do not resemble this speaker's production of the target, 'cabinet', but they are semantically related to 'pie' and 'fire'. Although no striking differences between groups were noted relative to phonemic and semantic error patterns, a third error category suggests a particular benefit of the familiarization procedure. This category comprised misperceptions that occurred relative to the speaker's consistent (ESL) consonant substitution patterns. Recall that the speaker in this investigation produced both variable and consistent segmental errors. Variable errors were predominantly inappropriate voicing and devoicing, while consistent errors included the word-initial substitution of [l] for /r/. In cases of this consistent speaker error, subjects in the Control group were more likely than members of the familiarization groups to rely on the surface phonetic form to interpret these consonant substitutions (see Duffy and Pisoni, 1992), while members of the Word List and Paragraph groups responded more variably to these consistent speaker errors. That is, they did not always rely on the acoustic signal, but presumably called upon their knowledge of the speaker's articulatory patterns to sometimes override the acoustic evidence. Table 6 illustrates this differential group performance in response to an ESL consonant error. Here the target word 'write' was produced by the speaker as 'light', according to the broad transcription and acoustic evidence. Six of the nine Control group listeners who attempted a transcription of this word interpreted the initial consonant as /l/. In contrast, 14 of the 17 listeners from the two familiarization groups who attempted a transcription of this word perceived the initial consonant as /r/.
Table 6. Listener responses to an English-as-a-second-language consonant error produced by the speaker (target: 'write', produced /lait/)

Responses from the Control (C1-C10), Word List (W1-W10), and Paragraph (P1-P10) groups, as given in the source (the column alignment is not fully recoverable): lie, light, live, let, let's, let's, write (× 17), what's, light, learning, 'no response'.

Discussion

The results of the present investigation indicate that listeners familiarized with a single dysarthric person's speech pattern (Word List and Paragraph groups) performed significantly better on a sentence transcription task than listeners who were not familiarized with her speech (Control group). Results of the statistical analysis did not support the prediction that the form of the familiarization procedure provides differential perceptual benefits to the listeners. That is, the Paragraph group did not significantly outperform the Word List group. Despite the absence of a statistically significant difference in performance between the Paragraph and Word List groups, the Paragraph group consistently received higher scores than the Word List group on the measures made in this investigation. These included the mean number of words accurately transcribed, the total number of correct and incorrect words transcribed by each group, and the total number of sentences transcribed without error. Assessment of error patterns, such as the responses in Table 4, suggests that differential benefits of the Paragraph procedure may need to be captured by a systematic qualitative analysis. Although the trend in which the Paragraph group outperformed the Word List group was pervasive, there is some question whether there was an additional perceptual benefit for perceiving segmental acoustic information.
Recall the lack of quantifiable benefit (Table 6, 'write') for the Paragraph group on the lexical items targeted herein. There was also no notable difference in the types of perceptual errors committed by the three groups. All groups made errors that were related to the surface characteristics of the acoustic signal (phonemic), and errors that were related to other percepts in the utterance but did not resemble the features of the acoustic signal (semantic). It was not possible in the present investigation to draw definitive conclusions about the nature of the relationships between signal integrity, error type, and perceptual strategies. However, it is expected that utterance characteristics such as word frequency, semantic predictability, and syntactic complexity may also play a role. These types of variables should be controlled in future investigations. Because so little is known about the variables that affect the processes underlying the perception of dysarthric speech, the results of the present investigation must be considered only relative to the conditions specific to this study. A rather limited range of semantic and syntactic sentence structures was produced by the speaker, and it is likely that manipulations of these two factors would influence transcription results. Specifically, it is probable that less complex semantic and syntactic structures, such as those used in the present investigation, would place the least burden on the speaker's productive ability and the listener's perceptual abilities. It is likely that the combination of speaker, listener, and procedural characteristics permitted the effects of familiarization to be measurable and significant.
This investigation used a single speaker with severe intelligibility deficits whose speech contained both consistent and inconsistent error patterns. Although it is possible that the relative consistency of some of her articulatory errors enhanced the effects of familiarization, the severity of her dysarthria renders it highly unlikely that the familiarization effects demonstrated here were attributable solely to these consistent substitutions. A more plausible interpretation is that the phenomenon of familiarization, as it was defined here, provided perceptual benefits to listeners. These perceptual benefits derived from exposure to substantially unintelligible dysarthric speech containing both consistent and variable error patterns. It would be expected that familiarization would not have similar effects for all speech patterns, all dysarthria types, and all levels of speech impairment (see also Hunter et al., 1991). This is supported by the fact that the item analysis revealed some sentences that were 'highly intelligible' and others that were 'unintelligible' for all groups; thus, there may be certain floor and ceiling areas of intelligibility in which the effects of familiarization are less robust. The characteristics of the listeners used in this investigation should also be considered. The listeners were female, educated, and had little or no experience listening to dysarthric speech. The gains in performance afforded by the familiarization procedure might not be significant for a group of experienced listeners. Experienced listeners also may possess fine-tuned perceptual strategies that tend to be less speaker-specific. Finally, the familiarization procedures used here were specific in nature: a single speaker was used, and feedback about the speaker's word targets was provided to the listeners in the form of a script. No feedback about their performance was provided to the listeners during the actual transcription task.
This final point may account for the absence of an apparent learning effect across the task; that is, this speaker's low intelligibility did not permit listeners to capitalize on their exposure to her speech during the task. Subsequent research on the phenomenon of familiarization should attempt to control and manipulate the variables identified in this investigation. The results of such studies could have a profound impact on clinical practice, particularly in the areas of intelligibility testing, the determination of progress during treatment, and the development of treatment strategies that target the listener (such as the spouse or caregiver) in addition to the dysarthric patient.

Acknowledgements

Portions of this paper were presented at the Conference on Motor Speech Disorders and Speech Motor Control, Boulder, CO, 1992. The authors thank Gary Weismer and Keith Kluender for their comments on an earlier version of this manuscript. This work was supported by the Bryng Bryngelson Fund of the University of Minnesota Department of Communication Disorders. Requests for reprints should be directed to Kris Tjaden, 437 Waisman Center, 1500 Highland Ave, Madison, WI 53705.

Notes

1. It was expected that the transcription task used here to document the impact of the familiarization procedure might not be sufficiently sensitive to capture the potentially smaller effects for speakers with very high or very low intelligibility.
2. The transcription tape actually contained a total of 50 sentences. However, after the speech sample had been obtained, it was realized that two sentences contained five rather than six words. Because subjects were told they would be transcribing six-word sentences, responses from these sentences have not been included in the analysis. Thus, all analyses are based on 288 target words rather than the intended 300.
3. Female subjects were chosen because of ease of recruitment from classes in the College of Liberal Arts.
It is an empirical question whether men and women perform differentially on the type of listening task employed in this investigation.

References

Duffy, J. R. and Giolas, T. G. (1974) Sentence intelligibility as a function of key word selection. Journal of Speech and Hearing Research, 17, 631-637.
Duffy, S. A. and Pisoni, D. B. (1992) Comprehension of synthetic speech produced by rule: a review and theoretical interpretation. Language and Speech, 35, 351-389.
Engstrand, O. (1992) Systematicity of phonetic variation in natural discourse. Speech Communication, 11, 337-346.
Giolas, T. G., Cooker, H. and Duffy, J. R. (1970) The predictability of words in sentences. Journal of Auditory Research, 10, 328-334.
Greene, B. G. and Pisoni, D. B. (1988) Perception of synthetic speech by adults and children: research on processing voice output from text-to-speech systems. In L. E. Bernstein (Ed.), The Vocally Impaired: Clinical Practice and Research (Philadelphia: Grune & Stratton).
Greenspan, S. L., Nusbaum, H. C. and Pisoni, D. B. (1988) Perceptual learning of synthetic speech produced by rule. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 421-433.
Hammen, V. L., Yorkston, K. M. and Dowden, P. (1991) Index of contextual intelligibility: impact of semantic context in dysarthria. In C. Moore, K. Yorkston, and D. Beukelman (Eds), Dysarthria and Apraxia of Speech: Perspectives on Management (Baltimore, MD: Paul H. Brookes).
Hunter, L., Pring, T. and Martin, S. (1991) The use of strategies to increase speech intelligibility in cerebral palsy: an experimental evaluation. British Journal of Disorders of Communication, 26, 163-174.
Kent, R., Weismer, G., Kent, J. and Rosenbek, J. (1989) Toward phonetic intelligibility testing in dysarthria. Journal of Speech and Hearing Disorders, 54, 484-499.
Kim, N. K. (1990) Korean. In B. Comrie (Ed.), The World's Major Languages (New York: Oxford University Press).
Kreider, M. A. (1988) The effects of context and rate on the intelligibility of the conversational speech of moderately disordered speakers. Unpublished MS thesis, University of Wisconsin-Madison.
Maassen, B. and Povel, D. J. (1985) The effect of segmental and suprasegmental corrections on the intelligibility of deaf speech. Journal of the Acoustical Society of America, 78, 877-886.
Major, R. (1994) Current trends in interlanguage phonology. In M. Yavas (Ed.), First and Second Language Phonology (San Diego, CA: Singular Publishing Group).
Metz, D. E., Samar, V. J., Schiavetti, N., Sitler, R. W. and Whitehead, R. L. (1985) Acoustic dimensions of hearing-impaired speakers' intelligibility. Journal of Speech and Hearing Research, 28, 345-355.
Monsen, R. (1978) Toward measuring how well hearing-impaired children speak. Journal of Speech and Hearing Research, 21, 197-219.
Monsen, R. (1983) The oral speech intelligibility of hearing-impaired talkers. Journal of Speech and Hearing Disorders, 48, 286-296.
Nye, P. W. and Gaitenby, J. H. (1973) Consonant intelligibility in synthetic speech and in a natural speech control (modified rhyme test results). Haskins Laboratories Status Report on Speech Research, SR-33, 77-91.
Pisoni, D. B. and Luce, P. A. (1986) Speech perception: research, theory, and the principal issues. In E. C. Schwab and H. C. Nusbaum (Eds), Pattern Recognition by Humans and Machines: Speech Perception, I (Orlando, FL: Academic Press).
Platt, L. J., Andrews, G., Young, M. and Quinn, P. T. (1980) Dysarthria of adult cerebral palsy: intelligibility and articulatory impairment. Journal of Speech and Hearing Research, 23, 28-40.
Rosenzweig, M. R. and Postman, L. (1957) Intelligibility as a function of frequency of usage.
Journal of Experimental Psychology, 54, 412-421.
Tikofsky, R. S., Glattke, T. J. and Tikofsky, R. (1966) Listener confusions in response to dysarthric speech. Folia Phoniatrica, 18, 280-292.
Verbrugge, R. R., Strange, W., Shankweiler, D. P. and Edman, T. R. (1976) What information enables a listener to map a talker's vowel space? Journal of the Acoustical Society of America, 60, 198-212.
Weismer, G. and Martin, R. (1992) Acoustic and perceptual approaches to the study of intelligibility. In R. D. Kent (Ed.), Intelligibility in Speech Disorders (Amsterdam: John Benjamins).
Yorkston, K. and Beukelman, D. (1980) A clinician-judged technique for quantifying dysarthric speech based on single-word intelligibility. Journal of Communication Disorders, 13, 15-31.
Yorkston, K. and Beukelman, D. (1981) Assessment of the Intelligibility of Dysarthric Speech (Tigard, OR: CC Publications).
Yorkston, K. and Beukelman, D. (1983) The influence of judge familiarization with the speaker on dysarthric speech intelligibility. In W. Berry (Ed.), Clinical Dysarthria (San Diego, CA: College-Hill Press).

Appendix

College is different than high school. High expectations push everyone to excel. The competition makes studies even harder. Students can find ways to relax. Many go biking, walking, or read. Others will treat themselves to naps. When tests begin, nights get shorter. Burning the midnight oil is common. When exams get returned most rejoice! Hard work and discipline pay off! It is time to reward yourself. That is what school is about.