EFFECT OF MASKER FLUCTUATIONS ON LEXICAL SEGMENTATION IN COCHLEAR IMPLANT LISTENERS
Trevor T. Perry & Bomjun J. Kwon
Department of Hearing, Speech, and Language Sciences, Gallaudet University, Washington D.C., USA
The study was supported by NIDCD (R03 DC009061)

Background
• Normal-hearing (NH) listeners can take advantage of amplitude fluctuations in the noise to improve their speech understanding, compared to their performance in steady noise at the same nominal signal-to-noise ratio (SNR).
• On the other hand, most cochlear implant (CI) users show little to no benefit from masker fluctuations.
• We tried to artificially promote masking release in CI users by creating a condition with little temporal overlap between speech and noise (“+MR”, Kwon et al. (2012)¹). Some proficient CI users demonstrated masking release: they were able to glimpse and understand the speech in the dips of the masker.

Fig. 1: Speech in Noise Conditions. Panels: A (Speech), B (Noise), C (Noise), A+B (“+MR”), A+C (“Steady”).

Some CI users fail to show masking release. Why?
• What enabled those good CI users to benefit from the masker dips in the artificial condition?

HYPOTHESIS: For CI users, the fluctuating noise around the speech might weaken or obscure speech segmentation cues, reducing the usefulness of the exposed speech.

• Listeners must segment speech into meaningful linguistic units to understand the message.
• When other segmentation cues are unavailable or unreliable, listeners can use metrical stress as a segmentation guide.
• The Metrical Segmentation Strategy (MSS) uses syllable strength as an indicator of word boundaries.²
• The MSS is useful because most common English content words begin with a strong syllable.³
• Sometimes listeners make Lexical Boundary Errors (LBEs).
• LBEs can be split into 4 types: incorrectly Inserting a word boundary before a Strong (IS) or Weak (IW) syllable, or incorrectly Deleting a word boundary before a Strong (DS) or Weak (DW) syllable.

Table 1: Lexical Boundary Error (LBE) Examples

Stimulus Presented            Subject Response              LBE Type
Platoons deserve respect      The tunes deserve respect     IS
Clever women wrote this       Clever windows open           DW
He tapped the black device    Attack the black device       DS
The farmer waters crops       The farmer bought his crops   IW

Consistent with the MSS, NH listeners make many more IS and DW errors than DS and IW errors, particularly when the speech signal is noisy or degraded.⁴,⁵ NH listeners tend to treat strong syllables as indicating the start of a word, but weak syllables as continuing a word.

• A listener’s use of the MSS can be given by an index, IMSS, ranging from 0 to 1, with values near 0.5 indicating little use of metrical cues and values near 1 indicating very strong use of the MSS.

IMSS = (#IS + #DW) / #Total LBEs

If masking release in fluctuating noise is related to the use of metrical segmentation cues…
• CI users who show masking release should also show more robust use of the MSS (higher IMSS value) in the +MR condition than in Steady noise.
• Users who show no masking release should also show little difference in their use of the MSS between noise conditions.

Method
• 10 Nucleus CI users; data from 8 users included in analysis
• Sentence keyword identification was measured at a fixed SNR in two noise conditions (see Fig. 1):
  • Steady noise
  • +MR noise
• +MR noise is an artificial noise condition designed to promote opportunities for “dip-listening” by lessening the temporal overlap between speech and noise. Noise energy is low when speech energy is high, and vice versa.
• The noise in both conditions has the same spectral shape as the sentences.
• As performance of CI users in quiet varies between subjects, SNRs were selected individually for each subject to minimize ceiling and floor effects. Tested SNRs ranged from -2 to 7 dB.
• 60 sentences per condition, 3 keywords per sentence
• Sentences were constructed so that opportunities for all 4 types of LBE were roughly equal.
• Presented via custom software based on NIC2 and NMT to emulate actual CI processing (courtesy of Cochlear Ltd.)
• Sentence level: 50 dB SPL, default sensitivity
• Subjects’ verbal responses were transcribed and their LBEs were coded by 2 raters.
• When rater discrepancies could not be resolved, the disputed LBEs were discarded from analysis.
• To mitigate potential under-sampling of LBEs within subjects, a subject’s data were included for further analysis only if they made at least 15 LBEs (of any type) in each noise condition. This criterion excluded data from 2 subjects, leaving data from 8 subjects for further analysis.
• Making fewer LBEs does not necessarily mean better speech understanding. It could merely indicate a subject’s reluctance to guess when uncertain.

Results
• Masking release was defined as the improvement in proportion-correct keyword identification performance from the Steady to the +MR condition.
• 4 subjects showed substantial masking release.
• 1 subject showed poorer performance in +MR than in Steady.
• All 8 subjects made more IS and DW errors than DS and IW errors in both conditions (IMSS > 0.5).
• This suggests they could perceive some syllable strength cues and attempted to use these cues for segmentation.

Fig. 2: Difference in IMSS as a Function of Masking Release
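The IMSS index and the masking-release measure defined above can be sketched in a few lines of Python, applied here to the counts reported for subjects S7 and S8 in Table 2. This is only an illustrative sketch: the names `LBECounts`, `imss`, `masking_release`, and `summarize` are assumptions made for this example and are not part of the study's analysis software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LBECounts:
    """Lexical boundary error (LBE) counts for one subject in one noise condition."""
    n_is: int  # boundary inserted before a strong syllable
    n_dw: int  # boundary deleted before a weak syllable
    n_ds: int  # boundary deleted before a strong syllable
    n_iw: int  # boundary inserted before a weak syllable

    @property
    def total(self) -> int:
        return self.n_is + self.n_dw + self.n_ds + self.n_iw


def imss(counts: LBECounts) -> float:
    """IMSS = (#IS + #DW) / #Total LBEs."""
    if counts.total == 0:
        raise ValueError("IMSS is undefined when no LBEs were made")
    return (counts.n_is + counts.n_dw) / counts.total


def masking_release(p_steady: float, p_mr: float) -> float:
    """Improvement in keyword proportion correct from Steady to +MR."""
    return p_mr - p_steady


# Values copied from Table 2: (keyword proportion correct, LBE counts).
TABLE_2 = {
    "S7": {"Steady": (0.44, LBECounts(17, 8, 3, 4)),
           "+MR": (0.72, LBECounts(14, 5, 1, 1))},
    "S8": {"Steady": (0.50, LBECounts(22, 9, 0, 1)),
           "+MR": (0.53, LBECounts(20, 11, 1, 1))},
}


def summarize(subject_data):
    """Return (masking release, IMSS(+MR) - IMSS(Steady)) for one subject."""
    p_steady, lbe_steady = subject_data["Steady"]
    p_mr, lbe_mr = subject_data["+MR"]
    return masking_release(p_steady, p_mr), imss(lbe_mr) - imss(lbe_steady)


for subject, subject_data in TABLE_2.items():
    release, imss_diff = summarize(subject_data)
    print(f"{subject}: masking release = {release:+.2f}, "
          f"IMSS difference = {imss_diff:+.2f}")
```

With the Table 2 counts, this prints a masking release of +0.28 and an IMSS difference of +0.12 for S7, and +0.03 and -0.03 for S8.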
Table 2: Representative Results from 2 Subjects

Subject                       S7       S7       S8       S8
Condition                     Steady   +MR      Steady   +MR
SNR (dB)                      3        3        3        3
Keyword Proportion Correct    .44      .72      .50      .53
#IS                           17       14       22       20
#DW                           8        5        9        11
#DS                           3        1        0        1
#IW                           4        1        1        1
# Total LBE                   32       21       32       33

Fig. 2 axes: Y, IMSS(+MR) - IMSS(Steady); X, Masking Release.
Subjects are grouped here based on masking release:
• Group showing masking release: Mean Masking Release: 0.25, Mean IMSS +MR: 0.90, Mean IMSS Difference: 0.08
• Group showing little or no masking release: Mean Masking Release: 0.01, Mean IMSS +MR: 0.73, Mean IMSS Difference: -0.09
• The Y axis is the difference in IMSS between conditions. More positive Y values can be interpreted as stronger use of the MSS in the +MR condition than in the Steady condition.
• Mean IMSS Steady for both groups was the same: 0.82

Discussion
• CI users who showed masking release showed more robust use of segmentation cues in the fluctuating masker.
• CI users who show no benefit from masker dips appear to have a harder time segmenting the speech in a fluctuating masker than in steady noise.
• These two groups differ specifically in their ability to segment speech exposed by masker fluctuations.
• The effect of noise is more than energetic masking per se. Noise fluctuations can disrupt speech segmentation, hindering the ability to understand speech, even when the instantaneous SNR is quite favorable.

References
1 Kwon, B. J., Perry, T. T., Wilhelm, C. L., & Healy, E. W. (2012). Sentence recognition in noise promoting or suppressing masking release by normal-hearing and cochlear-implant listeners. J. Acoust. Soc. Am., 131, 3111.
2 Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. J. Mem. Lang., 31(2), 218-236.
3 Cutler, A., & Carter, D. M. (1987). The predominance of strong initial syllables in the English vocabulary. Comput. Speech Lang., 2(3), 133-142.
4 Liss, J. M., Spitzer, S. M., Caviness, J. N., Adler, C., & Edwards, B. W. (1998). Syllabic strength and lexical boundary decisions in the perception of hypokinetic dysarthric speech. J. Acoust. Soc. Am., 104, 2457-2466.
5 Mattys, S. L., White, L., & Melhorn, J. F. (2005). Integration of multiple segmentation cues: A hierarchical framework. J. Exp. Psychol. Gen., 134, 477-500.

W32, CIAP 2013