J Am Acad Audiol 4: 296-306 (1993)

Role of Perceptual Acclimatization in the Selection of Frequency Responses for Hearing Aids

Stuart Gatehouse*

Abstract

Previous work concerning the late-onset auditory deprivation and/or acclimatization effect in adult hearing-aid users has suggested that the benefits of a particular frequency response from a hearing aid may not become apparent until material exposure to that frequency response has been achieved. The generality of that finding was tested further. A group of subjects who were established users (12 to 15 months) of a particular frequency response (limited at high frequencies by the system of provision) were re-prescribed with a theoretically advantageous frequency response according to the NAL prescription. On a speech-in-noise test (word identification) and a sentence verification test, the benefits of the re-prescription were not (or at best only marginally) evident upon immediate testing but became statistically significant and of material clinical magnitude following experience with the re-prescription for 8 and 16 weeks. These results suggest that comparative selection regimes and research designs based upon little or no experience of the listening environment through the hearing aid are likely to seriously misrepresent the benefits available to the hearing-impaired listener.

This special issue of the Journal of the American Academy of Audiology is a reflection of the growing interest and the activity in the area of auditory deprivation. The content also reflects the two strands of work, which cover effects of deprivation in developing auditory systems (with particular interest in the development of binaural processing abilities) and also deprivation effects in adults suffering from hearing impairments who are managed by hearing aids.
*MRC Institute of Hearing Research, Glasgow, Scotland
Reprint requests: Stuart Gatehouse, MRC Institute of Hearing Research, Royal Infirmary, Alexandra Parade, Glasgow G31 2ER, Scotland, UK

Auditory deprivation of late onset was put forward as a concept by Silman and his colleagues (Silman et al, 1984; Gelfand et al, 1987) as an explanation for their finding that subjects using monaural amplification exhibit a relative decrement in speech identification scores for the usually unaided ear relative to the usually aided ear, while in contrast individuals using no amplification or using binaural amplification show no such interaural discrepancies. Further results from other subject groups have reinforced the evidence for this experimental finding (Silverman, 1989; Stubblefield and Nye, 1989). The experimental results have been interpreted as a reflection of a decrease in the analyser capacity of an (impaired) ear, which suffers a deprivation of auditory stimulation relative to the (aided) contralateral ear. These putative deprivation effects have been suggested to have mechanisms of action somewhat similar to those occurring in the developing auditory system.

Gatehouse (1989) has presented an acceptable but alternative hypothesis for apparent deprivation effects in terms of perceptual acclimatization. This was prompted by the finding that in long-term users of a single hearing aid, the ear that is usually aided performs better than the usually unaided ear at high presentation levels, while at lower presentation levels the converse occurs. This intensity dependence was put forward as evidence that an ear accustomed to receiving a high level of stimulation will "acclimatize" to the pattern of speech cues present and be most efficient at analyzing at high presentation levels.
At lower presentation levels, the usually unaided ear receives its accustomed pattern of cues and so performs better than the usually aided ear.

Further experiments (Gatehouse, 1992a) have shown that in users of a single hearing aid, significant increases in the benefit from amplifying speech in the aided ear occurred across a time course of 12 weeks after fitting, but not in the control (unfitted) ear. Furthermore, it appeared that the benefits from providing a particular frequency spectrum did not emerge immediately, but over a time course of some 6 to 12 weeks. These experiments additionally showed a small, but statistically significant, decrease in the speech identification abilities of the control (not-fitted) ear over the 12-week period, which might be interpreted as early signs of deprivation effects.

The debate between the alternative hypotheses of late-onset auditory deprivation and perceptual acclimatization may therefore be resolved into an empirical estimation of magnitudes and time parameters of two comparable processes. So far we have been concerned with speech identification abilities (both in the usually aided ear and in the usually not aided ear) following provision of a single hearing aid, and the roles of the two processes might differ for other types, materials, or configurations of stimulation. More detailed information and evidence on the underlying psychoacoustic (and perhaps physiological) functional changes is required to underpin the choice of the most appropriate ways to examine, compensate for, or even exploit acclimatization in practice, while taking account of the potential effects of deprivation.
Experiments to test some specific hypotheses underlying the changes in speech identification ability of users of a single hearing aid are underway, but in the interim the available experimental data have potentially important and far-reaching implications for both research on hearing aids and the fitting of hearing aids in clinical practice. There is some evidence (Hurley, 1991) that the long-term decline in the speech identification ability of the usually unaided ear of single hearing-aid users might be reversed by application of a second hearing aid. At any rate, two important implications for hearing-aid fitting emerge from the experimental results so far: (1) the benefits of amplification do increase across time, and (2) the benefits of a theoretically advantageous frequency response do not emerge immediately, but only after exposure to that listening environment. This article pursues the latter thread rather than distinguishing between the two theoretical concepts of acclimatization and deprivation, which remain the focus of other experiments.

The suggestion that hearing-impaired listeners take time to "get used" to amplified speech in some general sense is not new to audiologists, and indeed the evidence is relatively strong (Kapteyn, 1977; Berger and Hagberg, 1982; Brooks, 1999). Hearing-aid use times and satisfaction do increase with time (indeed structured rehabilitation regimes aim to actively encourage manipulation of surroundings and acoustic conditions to encourage motivation whilst warning against early over-expectation). The evidence for underlying improvements in speech perception abilities, and the underlying perceptual rationale for them, is much more limited (Barford, 1979; Cox and Alexander, 1992; Gatehouse, 1992a).
The audiologic literature contains many methodologies and formulae for the prescription of the frequency response and gain of a hearing-aid fitting based on measurements such as pure-tone thresholds (McCandless and Lyregaard, 1983; Berger et al, 1984; Byrne and Dillon, 1986), on comfortable and/or uncomfortable listening levels (Skinner et al, 1982; Cox, 1985), or on strategies such as the articulation index (Pavlovic, 1988; Berger, 1990; Rankovic, 1991). By whatever means employed, these procedures aim to provide an appropriate degree of amplification at each frequency. These procedures characteristically lack validation on large-scale clinical populations. Where attempts at evaluation have taken place, there has been considerable difficulty in showing benefits from theoretically advantageous approaches. It may be argued (Gatehouse, 1992b) that part of the difficulty in distinguishing a theoretically advantageous prescription from a control condition has been a reliance on simple identification scores. Real-world benefit from amplification can be considered to have other dimensions, including ease of listening.

Another possible reason for the difficulties relates to the acclimatization hypothesis. If the comparative evaluations have taken place prior to the completion of any acclimatization (that is, before the changes in speech identification ability on provision of amplification have the opportunity to asymptote), then the underlying differences between fitting strategies may not be apparent. It was suggested in Gatehouse (1992a) that this very process was occurring where the benefits of a sloping high-frequency emphasis over a flat-frequency spectrum only became apparent after some 4 to 6 weeks' experience with the high-frequency emphasis condition.

Journal of the American Academy of Audiology/Volume 4, Number 5, September 1993
The general aim of this article is to identify a situation where one prescription would be unequivocally accepted as superior to an alternative, and then to chart any differences as a function of exposure to the listening environments. In setting up such an experimental paradigm, care should be taken in the choice of the control condition. If its characteristics were deemed to be totally inappropriate to the hearing impairments fitted, the results would have little applicability and, indeed, it is well established that speech identification indices can distinguish between a broadly appropriate and a broadly inappropriate frequency response (Walden et al, 1983). It would be tempting to take two of the advocated prescription strategies from the literature and to run a comparative trial, but in the absence of any overall acceptance of one particular methodology as superior over a competitor, choice of the conditions proves problematic.

However, in the United Kingdom (UK) there is a system of socialized medicine under which all hearing-impaired individuals can receive a standard hearing aid free of charge from a fairly extensive range. Such a system of course has many advantages in terms of the service costs per patient and in particular to the pocket of the individual hearing-impaired listener, but these advantages are conferred at the expense of quality and flexibility in technology. In the UK, standard postaural hearing aids and standard ear molds are offered free of charge, but these can, and do, result in limited frequency responses. The ethics of a trial in this circumstance, however, do not pose any problems as this is the standard system of provision in the UK. Given the system in place, there is then the opportunity to change the standard prescription to one of the methodologies advocated in the literature and to chart the putative advantages of the re-prescription.
This then became the detailed aim of the experiment, during the course of which individuals who had been fitted under (and acclimatized to) the standard UK procedure were re-prescribed under a theoretically more advantageous procedure and the speech identification abilities compared. Given the suggestions in Gatehouse (1992b) concerning the limitations of simple identification tasks and the demonstration of the ability of a more complex procedure to distinguish between two signal processing strategies, which were similar in terms of traditional identification performance (Baer et al, 1992), the experiment contained not only a measure of traditional identification ability, but also the recently developed sentence verification test.

METHOD

Audiometry

Pure-tone thresholds were measured in each ear by air conduction at 0.25, 0.5, 1, 2, 4, and 8 kHz and by bone conduction at 0.25, 0.5, 1, and 2 kHz using a manual audiometry method recommended by the British Society of Audiology/British Association of Otolaryngologists (1981). This is a modified Hughson-Westlake procedure using 5-dB steps. Measurements were performed using a clinical audiometer calibrated to BS2497 (British Standards Institution, 1969, for air conduction; British Standards Institution, 1972, for bone conduction).

Hearing-Aid Insertion Gain

The real-ear insertion gain of the hearing aid was measured using a clinical flexible probe-tube system, the Acoustimed HA2000. Continuous speech was played in a sound-deadened room from a loudspeaker 2 m distant and 0 degrees azimuth at a sound pressure level (SPL) of 65 dB for the mean of the speech peaks at a point equivalent to the subject's head with the subject removed from the sound field. Initially the subject was seated and asked to adjust the gain on the hearing aid to achieve maximum intelligibility. Thereafter the insertion gain of the hearing aid was assessed using a speech-shaped wide-band input signal at 65 dB SPL.
The whole procedure was repeated three times and the results averaged.

Word Identification in Noise

Traditional speech identification performance was assessed for single words in a background of noise using the four alternative auditory feature (FAAF) test. This is a forced-choice word identification test based on the rhyme test principle, described by Foster and Haggard (1979, 1987). The material consists of 20 sets of four minimally paired words, each based on two binary auditory/phonetic distinctions giving an 80-item vocabulary, and filtered noise with the same long-term spectrum as the test items. Two examples of these sets are: (1) mail, bale, nail, dale; and (2) rose, rove, robe, road. Nine sets vary the initial consonant, and eleven the final consonant.

The test was administered free-field with the subject seated 2 m from a loudspeaker at 0 degrees azimuth. A fixed speech intensity of 65 dB SPL (measured at the center of the subject's head with the subject removed from the sound field) was used with a noise level of 58 dBA. The speech level was defined from a 1-kHz calibration tone, which had a sound pressure level equal to the mean of the speech peaks of the test words, and the overall level of the noise was measured as the A-weighted sound level. The 80 items are stored as digitized waveform files and presented to the subject via a standard clinical audiometer and loudspeaker using 12-bit digital-to-analog conversion. Because the study involved multiple presentations of this procedure under different conditions, a random number table was employed to generate sequences of unique ordering for the 80 items. Subject performance is assessed as the number of words (out of a total of 80) correctly identified and is expressed as percentage correct.
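As an illustration of the ordering and scoring just described, the following is a minimal sketch only. The study used a random number table; a seeded pseudo-random generator is swapped in here as a stand-in, and the function names `presentation_order` and `percent_correct` are hypothetical, not part of the FAAF test materials.

```python
import random

N_ITEMS = 80  # 20 sets of four minimally paired words


def presentation_order(seed):
    """Return one shuffled ordering of the 80 item indices.

    Assumption: a seeded PRNG stands in for the random number table
    used in the study; a different seed gives a different ordering.
    """
    rng = random.Random(seed)
    order = list(range(N_ITEMS))
    rng.shuffle(order)
    return order


def percent_correct(targets, responses):
    """Score a run as the percentage of items identified correctly."""
    if len(targets) != len(responses):
        raise ValueError("targets and responses must have the same length")
    n_correct = sum(t == r for t, r in zip(targets, responses))
    return 100.0 * n_correct / len(targets)
```

Each test session would draw its own ordering, and the score for a run is the proportion of the 80 forced-choice responses that match the presented word.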
Sentence Verification Test

The construction, validation, and evaluation of the sentence verification test continues (Gatehouse, 1992b), but a brief outline is given here. The test uses a closed vocabulary to construct four-word sentences from an overall vocabulary of 32 words. There are 4 alternatives for the first word in the sentence (Liz, Lynne, Len, Ben), 12 alternatives for the second word (sold, showed, stole, stored, wore, stitched, drove, crashed, cracked, corked, read, tore), 12 alternatives for the third word (four, more, two, few, tweed, cloth, fast, sports, glass, jam, road, street), and 4 alternatives for the fourth word (caps, cars, jars, maps). Of the 144 combinations of the second and third words, there are 82 for which there is at least one fourth word that makes the sentence unequivocally silly (nonsense) and at least one fourth word that makes the sentence unequivocally sensible (e.g., Ben sold street maps is sensible, while Ben sold street jars is silly). Any combination of a fourth word with a second word-third word pair previously judged to be equivocal with regard to sense/nonsense is not employed in the test. The eventual sentences require identification of the second, third, and fourth words in the sentence before a decision regarding the sense/nonsense of the sentence may be made.

The 32 words were stored as digitized individual waveform files, which were isolated from sentences spoken by a single male talker and were concatenated to produce the desired sentences. During the construction of the test, care was taken to ensure that the intonation contours of the items and other aspects, such as duration of voicing, were similar across items. This was done to remove extraneous cues not directly associated with the intelligibility of the individual word.
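The size of the candidate sentence space can be checked with a short sketch. The word lists are taken from the description above; the variable names are illustrative. The code enumerates only the raw combinations: which of them count as sensible, silly, or equivocal was decided by judgment during test construction and is not reproduced here.

```python
from itertools import product

# Closed vocabulary as listed in the text (4 + 12 + 12 + 4 = 32 words).
FIRST = ["Liz", "Lynne", "Len", "Ben"]
SECOND = ["sold", "showed", "stole", "stored", "wore", "stitched",
          "drove", "crashed", "cracked", "corked", "read", "tore"]
THIRD = ["four", "more", "two", "few", "tweed", "cloth",
         "fast", "sports", "glass", "jam", "road", "street"]
FOURTH = ["caps", "cars", "jars", "maps"]

vocabulary_size = len(FIRST) + len(SECOND) + len(THIRD) + len(FOURTH)

# Second word-third word pairs: the 144 combinations referred to in the text.
middle_pairs = list(product(SECOND, THIRD))

# Every possible four-word string before any sense/nonsense screening.
all_sentences = [" ".join(words)
                 for words in product(FIRST, SECOND, THIRD, FOURTH)]

print(vocabulary_size, len(middle_pairs), len(all_sentences))
```

Only a subset of these strings is used in the test, since pairs judged equivocal are excluded.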
Following presentation of the sentence to the listener, the subject was asked to indicate whether the sentence was "silly" or "sensible" via a touch-sensitive computer screen, and the response time for that decision (verification time) was recorded. This verification element was followed by the identification element, for which four potential alternatives for the first word in the sentence, four for the second, four for the third, and four for the fourth word were displayed on the touch-sensitive computer screen. The subject was required to identify the components of the sentence. The test is run adaptively and yields a signal-to-noise ratio for criterion performance. The verification component of the test yields the median response time for the cognitive decision concerning sense/nonsense of the sentences. Previous work (Gatehouse, 1992b) has documented the within-session and between-session stability of the test for both normal-hearing and hearing-impaired subjects, and has shown no significant long-term learning effects associated with repeated administration of the closed vocabulary.

A 1-kHz sinewave was included in the waveform files, and the signal level and noise level (shaped noise with the same long-term spectrum as the single male speaker) were defined in a manner similar to that described above for the FAAF test. The test was administered via a Grason-Stadler GSI 16 audiometer and a Goodmans B41 loudspeaker in the sound-treated room with the subject seated 2 m from the loudspeaker at 0 degrees azimuth. A fixed speech intensity of 65 dB SPL was used, and the level of filtered noise with the same long-term spectrum as the SVT items was adjusted to achieve criterion performance. The test was configured to follow a two-up, one-down procedure described by Levitt (1971) converging on the 70.7 percent correct identification point on the psychometric function. The test started at a signal-to-noise ratio of +20 dB and proceeded with a step size of 2 dB.
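The adaptive track can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study. It assumes the usual form of the Levitt (1971) transformed rule that converges on 70.7 percent: the signal-to-noise ratio is lowered after two consecutive correct trials and raised after any incorrect trial. It also assumes the stopping rule given in the text, ten reversals with the last eight averaged. The callable `respond` and the guard `max_trials` are hypothetical additions.

```python
def run_staircase(respond, start_snr=20.0, step=2.0,
                  n_reversals=10, n_average=8, max_trials=1000):
    """Run a transformed up-down track and return the mean reversal SNR.

    respond(snr) is a hypothetical callable that presents one sentence at
    the given signal-to-noise ratio (dB) and returns True if the trial
    was scored correct.
    """
    snr = start_snr
    consecutive_correct = 0
    last_direction = 0        # -1: SNR last lowered, +1: last raised
    reversal_snrs = []
    for _ in range(max_trials):
        if respond(snr):
            consecutive_correct += 1
            if consecutive_correct < 2:
                continue      # wait for a second correct trial in a row
            direction = -1    # two in a row correct: make the task harder
        else:
            direction = +1    # any error: make the task easier
        consecutive_correct = 0
        if last_direction and direction != last_direction:
            reversal_snrs.append(snr)   # SNR at which the track turned
            if len(reversal_snrs) == n_reversals:
                return sum(reversal_snrs[-n_average:]) / n_average
        last_direction = direction
        snr += direction * step
    raise RuntimeError("required number of reversals not reached")
```

Under this rule the track settles where the probability of two consecutive correct trials is 0.5, that is, where the single-trial probability is the square root of 0.5, or about 0.707.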
The test continued until ten reversals of signal-to-noise ratio occurred, and the last eight reversals were averaged to produce a mean signal-to-noise ratio for 70.7 percent correct identification. For the verification component of the test, only those sentences that were correctly verified (correctly labelled as either being silly or sensible) and identified (each of the four constituent words in the sentence identified correctly) were used. The median of the response times for the verification process using this subset of sentences was then derived. Thus each run of the sentence verification test yielded an identification component (the signal-to-noise ratio in dB for 70.7% correct identification of the sentences) and a verification element (response time for the decision regarding the sense/nonsense of the sentence).

Subjects

Subjects were identified from the clinical records at the Audiology Department at Glasgow Royal Infirmary. Individuals were selected who had been fitted with a single UK National Health Service postaural hearing aid between 12 and 15 months previously. They were then invited to attend for review, and a measurement of real-ear insertion gain (REIG) was performed using the procedure and equipment described above. The socialized system of medicine currently present in the UK (the National Health Service, NHS) uses relatively unsophisticated (though low cost) hearing-aid technology with limited bandwidth, particularly at high frequencies; this, in conjunction with standard earmold technology (again achieved at low cost at the expense of output), can lead to limitations in the high-frequency output of the hearing-aid fitting. The REIGs were inspected to identify subjects who, it might be considered, were using less high-frequency gain than would be recommended by some external standard.
Although there is no internationally recognized standard and many prescription procedures have been proposed in the literature, the formula produced by the National Acoustics Laboratories (NAL) in Australia (Byrne and Dillon, 1986) has probably achieved the widest acceptance. It was therefore decided to use the NAL predictions as the external standard. Subjects were selected whose aids gave REIG less than the NAL predictions by 12.5 dB or more averaged across the frequencies of 2 and 3 kHz (the frequency of 4 kHz was not included in this selection process as almost all subjects would fall into this category, and it was desired to identify those subjects who might be regarded as substantially "underfitted"). This selection procedure identified 36 such subjects with a mean age of 64 years (range 46-81 years) and air-conduction thresholds in the fitted ear of 31 dB HL (range 20-45 dB) at 0.25 kHz, 31 dB HL (range 15-35 dB) at 0.5 kHz, 33 dB HL (range 20-55 dB) at 1 kHz, 43 dB HL (range 20-60 dB) at 2 kHz, 57 dB HL (range 30-80 dB) at 4 kHz, and 59 dB HL (range 40-85 dB) at 8 kHz. The mean values by which these subjects were "underfitted" with reference to the NAL predictions were 10.6 dB, 17.2 dB, and 19.7 dB at the frequencies 2, 3, and 4 kHz respectively.

The 36 subjects were then refitted, again with a single postaural hearing aid, using hearing-aid and earmold technology available from the commercial sector in an attempt to achieve the NAL target figures for the REIG. In selecting the hearing aids, the technology was restricted to devices that were linear (no compression) and achieved their output limiting by peak clipping in a manner similar to the UK NHS technology. In addition, the standard coupler performance of the NHS fittings and the NAL re-prescriptions was similar in terms of harmonic and intermodulation distortion.
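The selection criterion can be written as a small check. This is a sketch under assumptions: the NAL target gains are taken as given inputs (the formula itself is not reproduced), and the function names and example gain values are invented for illustration rather than taken from the study.

```python
CRITERION_DB = 12.5                 # minimum mean shortfall for selection
SELECTION_FREQS_KHZ = (2.0, 3.0)    # 4 kHz is excluded from the criterion


def mean_shortfall_db(target_reig, measured_reig, freqs=SELECTION_FREQS_KHZ):
    """Mean of (target minus measured REIG) across the selection frequencies.

    Both arguments map frequency in kHz to insertion gain in dB.
    """
    return sum(target_reig[f] - measured_reig[f] for f in freqs) / len(freqs)


def is_underfitted(target_reig, measured_reig):
    """True if the fitting falls short of target by 12.5 dB or more on average."""
    return mean_shortfall_db(target_reig, measured_reig) >= CRITERION_DB


# Hypothetical gains (dB) for one ear, invented for this example.
example_target = {2.0: 25.0, 3.0: 27.0}
example_measured = {2.0: 14.0, 3.0: 10.0}
print(is_underfitted(example_target, example_measured))
```

The group means reported in the text are consistent with this rule: shortfalls of 10.6 dB at 2 kHz and 17.2 dB at 3 kHz average to 13.9 dB, which is above the 12.5 dB criterion.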
These restrictions were employed in an attempt to ensure that any differences in performance between the original fittings and re-prescriptions were due to changes in the frequency response, as opposed to any other parameters of the hearing-aid characteristic. The solutions adopted did differ across subjects, particularly with regard to the necessary modification to the earmold technology, but a judicious selection of hearing-aid and earmold characteristics enabled all of the 36 subjects to be fitted to within ±3 dB of the NAL targets for the frequencies between 0.25 kHz and 2 kHz and ±5 dB for the frequencies of 3 and 4 kHz. Throughout this article the original fitting is designated the "UK NHS" fitting and the re-prescription the "NAL" fitting. Thus, the selection procedure identified 36 subjects who, according to the NAL predictions, initially received substantial underprovision in terms of high-frequency gain, which upon application of the NAL fitting was materially alleviated.

Experimental Conditions

Each of the 36 subjects selected by the process described above had been using their UK NHS fitting for between 12 and 15 months, and therefore any process of acclimatization to that fitting should have been substantially complete. Each subject was then tested on three occasions designated Week 0, Week 8, and Week 16. Week 0 was the first occasion upon which the NAL target was achieved and represents the initial comparison of the benefits of providing the extra high-frequency information. Tests at Week 0 reflect the comparison between the UK NHS fitting and the NAL fitting when the subject had no experience of the NAL characteristic and was acclimatized to the UK NHS fitting. Following the assessment at Week 0, the subject ceased to use the UK NHS fitting and was switched to the NAL fitting, thus providing the subject the opportunity to acclimatize to the NAL fitting.
Testing thereafter took place in both conditions with the subject gaining increasing exposure, and hence acclimatization, to the NAL fitting. During this time the earmold and hearing aid associated with the UK NHS fitting were retained in the laboratory and used only during the test sessions. At each of the test sessions (at Week 0, Week 8, and Week 16) the FAAF test and sentence verification test (SVT) were applied in a counterbalanced order between subjects and across visits. For each of the tests (FAAF and SVT) the subject was tested in the aided and unaided conditions (order counterbalanced) with an initial practice run in a randomly selected (aided/unaided) condition. The practice run served to stabilize initial performance; its results were discarded and not used in the analysis.

RESULTS

The design of the experiment produces results for performance on the FAAF and sentence verification tests at Week 0, Week 8, and Week 16 in the two conditions corresponding to the UK NHS fitting and NAL fitting (remembering that prior to Week 0, the subject listened in everyday life through the UK NHS fitting and between Week 0 and Week 16 through the NAL fitting). The results for the percentage correct score on the FAAF test are shown graphically in Figure 1, which contains the mean score for each fitting at Week 0, Week 8, and Week 16 accompanied by the 95 percent confidence interval (±2 standard errors) on that mean. Inspection of Figure 1 would suggest that at Week 0 (experience of the UK NHS fitting but not the NAL fitting) there is little difference in performance. However, as time progresses to Week 8 and Week 16, performance in the NAL fitting improves whereas performance under the UK NHS fitting remains stable.
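The error bars described for Figure 1 correspond to the following calculation, shown here as a minimal sketch. The function name is illustrative and the example scores are hypothetical values, not data from the study.

```python
import math
import statistics


def mean_with_2se(scores):
    """Return (mean, lower, upper) with bounds at mean -/+ 2 standard errors.

    The standard error is the sample standard deviation divided by the
    square root of the number of subjects.
    """
    n = len(scores)
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - 2 * se, mean + 2 * se


# Hypothetical percentage correct scores for four subjects.
mean, lower, upper = mean_with_2se([80.0, 84.0, 88.0, 92.0])
print(f"{mean:.1f} ({lower:.1f} to {upper:.1f})")
```

With 36 subjects, ±2 standard errors is a close approximation to a 95 percent confidence interval on the mean, which is how the text describes the plotted intervals.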
Figures 2 and 3 contain the corresponding results for the identification component of the sentence verification test and the verification element respec- = UK NHS Fitting 95 U o U c U U1 a NAL Fitting 95 90 90 85 85 80 80 75 WEEK 0 WEEK 8 75 WEEK 16 Figure 1 Mean (± 2 standard errors on the mean) of the percentage correct score on the FAAF test for the 36 subjects at Week 0, Week 8, and Week 16 for the UK NHS fitting and the NAL fitting . tively . Figure 2 shows a somewhat more complex pattern with improvements across time for both the UK NHS and NALfittings, but with the advantage of the NAL fitting over the UK NHS fitting becoming more apparent with time . The results in Figure 3 for the response time (verification element) from the sentence verification test show a pattern more similar to those in Figure 1 from the FAAF test, with no apparent difference between the UK NHS and NAL fittings at Week 0 but one emerging at Weeks 8 and 16 . Preliminary inspection of the data showed the distributions to be approximately normal and therefore parametric statistical methods have been employed. As a first stage, a series of Student's paired t-tests was performed and a digest of the results is shown in Table 1 . It can = UK NHS Fitting NAL Fitting Figure 2 Mean (± 2 standard errors on the mean) of the identification component (signal-to-noise ratio in dB for 70 .7% correct identification of the sentences) on the sentence verification test for the 36 subjects at Week 0, Week 8, and Week 16 for the UK NHS fitting and the NAL fitting. Note that a decrease in signal-to-noise ratio for criterion performance represents an improvement in performance. 
301 Journal of the American Academy of Audiology/Volume 4, Number 5, September 1993 = UK NHS Fitting NAL Fitting 1600 1600 ~ 1400 E ro E 1400 1200 1200 1000 1000 N C O d 800 WEEK 0 WEEK 8 WEEK 16 800 Figure 3 Mean (± 2 standard errors on the mean) of the verification component (median response time for decision concerning sense/nonsense of the sentences) on the sentence verification test for the 36 subjects at Week 0, Week 8, and Week 16 for the UK NHS fitting and the NAL fitting. Note that a decrease in response time represents an improvement in performance . be seen that for the UK NHS versus NAL comparison at Week 0 neither the percentage correct score on the FAAF test nor the identification element of the sentence verification test achieves statistical significance, although the verification element of the sentence verification test just achieves statistical significance at p < .05, with an advantage in response time of 112 msec for the NAL fitting over the UK NHS fitting. Contrast these findings with those at Table 1 Summary of the Student's Paired t-Tests FAAF* UK NHS vs NAL at Week 0 UK NHS vs NAL at Week 8 Week 8 and Week 16 where all three aspects of performance show significant advantages ofthe NAL fitting. Paired comparisons were also taken within fitting across time and comparisons for Week 0 versus Week 8, Week 0 versus Week 16, and Week 8 versus Week 16 are also shown in Table 1. The paired comparisons in Table 1 indicate the likely outcome of experiments, which might have been conducted at single points in time, but such comparisons on repeated data sets could be misleading due to the problem of multiple comparisons capitalizing on chance statistical results. Accordingly, a series of repeated measures of analysis of variance was performed using the FAAF test score and both the identification and verification elements of the sentence verification test as dependent variables. These are summarized in Tables 2, 3, and 4 respectively . 
In these analyses, the fitting (NAL versus UK NHS) and time of assessment (Week 0, Week 8, or Week 16) were designated as within-subject factors, and the interaction term between them was included. Table 2 shows that for the FAAF test there is a significant effect of fitting, leading to an overall improvement of 2.87 percent correct score for the NAL as opposed to the UK NHS fitting, and also an overall effect of assessment time, with an improvement in score of 1.52 percent from Week 0 to Week 8 and 2.49 percent from Week 0 to Week 16. However, the interaction term is also highly significant and its predominant parameter estimate at 2.01 percent is of comparable magnitude to the overall main effects described above.

SVT-SIN† SVT-RT‡
NS NS p < .05 p < .05 (+2.3%) p < .001 p < .001 p < .001 (+4.4%) p < .005 (+0.7 dB)
Week 0 vs Week 8 for UK NHS; Week 0 vs Week 16 for UK NHS; Week 8 vs Week 16 for UK NHS
NS NS NS NS p < .01 NS p < .05 (-1.1%) p < .05 (+0.8 dB) NS
Week 0 vs Week 8 for NAL; Week 0 vs Week 16 for NAL; Week 8 vs Week 16 for NAL
p < .001 (+3.6%) p < .001 (+4.5%) NS p < .05 (+0.9 dB) NS (+1.2 dB) NS (+230 msec) p < .001 (+151 msec)
UK NHS vs NAL at Week 16
(+1.2 dB) (+1.1 dB) p < .01 (+112 msec) (+223 msec) p < .001 (+312 msec) p < .001

Results of the Student's paired t-tests are summarized for *the percentage correct score on the FAAF test; †the identification component (signal-to-noise ratio in dB for 70.7% correct identification of sentences) of the sentence verification test; and ‡the verification component (median response time) from the sentence verification test. The significance level of each comparison is listed and, where this achieved p < .05, the magnitude of the significant difference. The sign of this magnitude is arranged to be positive if the NAL fitting gives an advantage over the UK NHS fitting (irrespective of the direction of the individual metric employed), and also to be positive if there is an improvement across time.

Table 2 Summary of the Repeated Measures Analysis of Variance with Percentage Correct Score on the FAAF Test as the Dependent Variable

| Source of Variation | Sum of Squares | DF | Mean Square | F | P | Parameter Estimate (%) | SE of Estimate (%) |
|---|---|---|---|---|---|---|---|
| Within cells | 1724.8 | 35 | 49.3 | | | | |
| NAL vs UK NHS | 296.3 | 1 | 296.3 | 6.01 | p < .02 | 2.87 | 1.17 |
| Within cells | 826.8 | 70 | 11.8 | | | | |
| Time | 305.2 | 2 | 152.6 | 12.92 | p < .001 | 1.52, 2.49 | 0.62, 0.52 |
| Within cells | 341.3 | 70 | 4.9 | | | | |
| NAL vs UK NHS by Time interaction | 146.0 | 2 | 73.0 | 14.97 | p < .001 | 0.01, 2.01 | 0.39, 0.34 |

The within-subject factors are the hearing-aid fitting (NAL vs UK NHS) and the time of test (Week 8 and Week 16 vs Week 0). The interaction term is included. The parameter estimates are the magnitude of the effects with the reference condition set as the UK NHS fitting and Week 0. The sign of the parameter estimate is arranged so that a positive value corresponds to an improvement in performance in the NAL vs UK NHS fitting and an improvement in performance of Week 8 with respect to Week 0 and Week 16 with respect to Week 0.

Table 3 Summary of the Repeated Measures Analysis of Variance with Identification Component (Signal-to-Noise Ratio in dB for 70.7% Correct Identification) on the Sentence Verification Test as the Dependent Variable

| Source of Variation | Sum of Squares | DF | Mean Square | F | P | Parameter Estimate (dB) | SE of Estimate (dB) |
|---|---|---|---|---|---|---|---|
| Within cells | 174.8 | 35 | 5.0 | | | | |
| NAL vs UK NHS | 37.3 | 1 | 37.3 | 7.47 | p < .01 | 1.02 | 0.37 |
| Within cells | 198.8 | 70 | 2.8 | | | | |
| Time | 49.4 | 2 | 24.7 | 8.69 | p < .001 | 1.17, 0.02 | 0.25, 0.31 |
| Within cells | 106.4 | 70 | 1.5 | | | | |
| NAL vs UK NHS by Time interaction | 14.7 | 2 | 7.4 | 4.93 | p < .02 | 0.07, 0.84 | 0.23, 0.26 |

The within-subject factors are the hearing-aid fitting (NAL vs UK NHS) and the time of test (Week 8 and Week 16 vs Week 0). The interaction term is included. The parameter estimates are the magnitude of the effects with the reference condition set as the UK NHS fitting and Week 0. The sign of the parameter estimate is arranged so that a positive value corresponds to an improvement in performance in the NAL vs UK NHS fitting and an improvement in performance of Week 8 with respect to Week 0 and Week 16 with respect to Week 0.

Table 4 Summary of the Repeated Measures Analysis of Variance with Verification Component (Median Response Time for Sense/Nonsense Decision) on the Sentence Verification Test as the Dependent Variable

| Source of Variation | Sum of Squares | DF | Mean Square | F | P | Parameter Estimate (msec) | SE of Estimate (msec) |
|---|---|---|---|---|---|---|---|
| Within cells | 2,660,249 | 35 | 76,007 | | | | |
| NAL vs UK NHS | 2,506,265 | 1 | 2,506,265 | 32.97 | p < .001 | 264.0 | 46.1 |
| Within cells | 4,010,406 | 70 | 57,291 | | | | |
| Time | 698,026 | 2 | 349,013 | 6.01 | p < .01 | 47.4, 130.9 | 35.0, 44.3 |
| Within cells | 2,021,398 | 70 | 28,877 | | | | |
| NAL vs UK NHS by Time interaction | 361,133 | 2 | 180,566 | 6.25 | p < .01 | 6.5, 99.9 | 14.0, 37.5 |

The within-subject factors are the hearing-aid fitting (NAL vs UK NHS) and the time of test (Week 8 and Week 16 vs Week 0). The interaction term is included. The parameter estimates are the magnitude of the effects with the reference condition set as the UK NHS fitting and Week 0. The sign of the parameter estimate is arranged so that a positive value corresponds to an improvement in performance in the NAL vs UK NHS fitting and an improvement in performance of Week 8 with respect to Week 0 and Week 16 with respect to Week 0.
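As a reader's check on the analysis of variance summaries, the F ratios can be recomputed from the sums of squares and degrees of freedom reported for the FAAF test in Table 2. The sketch below assumes that each effect is tested against the "within cells" term listed with it, so that F is the effect mean square divided by the error mean square. The names `TABLE_2` and `f_ratio` are illustrative only and do not come from the original analysis.

```python
# Sums of squares (SS) and degrees of freedom (DF) as reported in Table 2.
# Assumption: each effect is tested against its own "within cells" error term.
# Tuple layout (hypothetical): (SS_effect, DF_effect, SS_error, DF_error)
TABLE_2 = {
    "NAL vs UK NHS": (296.3, 1, 1724.8, 35),
    "Time": (305.2, 2, 826.8, 70),
    "NAL vs UK NHS by Time": (146.0, 2, 341.3, 70),
}


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return F = (SS_effect / DF_effect) / (SS_error / DF_error)."""
    return (ss_effect / df_effect) / (ss_error / df_error)


for source, row in TABLE_2.items():
    # row[1] and row[3] are the numerator and denominator degrees of freedom.
    print(f"{source}: F({row[1]}, {row[3]}) = {f_ratio(*row):.2f}")
```

Under that assumption the recomputed values agree, to two decimal places, with the reported F ratios of 6.01, 12.92, and 14.97.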
Thus the repeated measures analysis of variance confirms the impression obtained from the graphic representation in Figure 1 of not only significant changes across time and within fitting, but also a significant interaction between the two, such that the advantages of the NAL fitting over the UK NHS fitting are not apparent immediately and emerge only following significant experience. Inspection of Tables 3 and 4 shows very similar findings for the two elements of the sentence verification test, again with overall main effects of fitting and assessment time, but with significant interaction components, which yield parameter estimates of comparable magnitude to the main effects. Thus the impressions from the graphic representations in Figures 1, 2, and 3 and from the multiple paired t-tests in Table 1 are confirmed by the more robust multivariate analysis.

DISCUSSION

This article has described a group of subjects fitted with single hearing aids who, according to the requirements of a NAL prescription for gain as a function of frequency, had been underfitted at the upper end of the frequency range. The subjects used a standard UK NHS postaural aid and mold for between 12 and 15 months, and were then fitted according to the NAL prescription. Immediate testing of the NAL fitting versus the UK NHS fitting using a word-in-noise identification test and a sentence verification test suggested little, if any, additional benefit from the provision of the high-frequency information, with only the verification element (response time) of the sentence verification test yielding statistically significant results. However, following experience with the NAL fitting of 8 and 16 weeks, all three of the performance indices showed clear and statistically significant advantages of the NAL fitting.
An appropriate multivariate analysis of the results shows that these findings were not attributable to task learning (although, contrary to Gatehouse, 1992b, there was a significant improvement in performance on the identification element of the sentence verification test, suggesting that further investigations of task-learning effects in the SVT might be required). However, there was a definite interaction between the NAL versus UK NHS fitting and time of exposure to the re-prescription. Whilst the initial significant differences on the response time element of the sentence verification test are encouraging from the point of view of demonstrating the potential advantages of such a configuration over traditional identification paradigms (Gatehouse, 1992b), the results as a whole have major implications for the interpretation of previous research into hearing aids and the design of future research and perhaps also clinical practice.

The experiments did not attempt to address the issue of acclimatization versus deprivation as competing hypotheses for explaining some of the changes in speech identification ability in users of a single hearing aid. There is certainly little evidence here of decline in the now unfamiliar condition (if the single finding of a decrease in the FAAF scores between Week 8 and Week 16 is ignored). The initial comparison at Week 0 closely mimics many reports in the literature of comparative evaluations of hearing aid prescriptions and/or hearing aid prescription strategies either in individuals or in groups, where either null or only marginally statistically significant results are achieved. The results from the present experiments suggest that the potential benefits of a theoretically advantageous frequency response require exposure for the benefits to become apparent.
As such, the results reinforce the arguments already put forward in Cox and Alexander (1992) and Gatehouse (1992a) that, when new patterns of speech cues are presented to hearing-impaired listeners, the auditory system takes a certain amount of learning time to make optimum use of the particular cues. The present results throw little light, and indeed are not intended to, on the underlying psychoacoustic nature of the changes that take place, which could conceivably be in the domain of streaming/grouping (for an introduction to general principles, see Yost, 1991) or perhaps in a complex re-mapping of internal loudness coding across frequencies.

In making the above interpretations of the experimental finding, it has been necessary to attribute the changes in test performance to the differing frequency responses rather than changing patterns of hearing aid use, differing distortion characteristics across the aids employed, or other as yet unidentified differences between the conditions. Although such effects cannot be categorically ruled out, there were no differences in reported daily responses or patterns of use as assessed by simple self-report, but detailed information from daily diaries was not available. Similarly there were no measured differences in distortion as characterized by simple harmonic or intermodulation indices, but measures based on broad-band speech-like signals were not performed. Given the substantial differences in frequency response, it appears reasonable to ascribe the changes to that domain rather than to other potentially second-order effects. The gains measured for the low/mid frequencies did not change across time, although by definition the high-frequency measures did do so.
The direct implications in terms of selecting the frequency response for a particular individual or group of individuals reinforce the already present evidence concerning the potential drawbacks of relying on immediate consumer preference or intelligibility judgments, by extending those limitations to performance tests. Hitherto, the limitations of performance tests have largely been confined to the problems of configuring stable and sensitive instruments (Shore et al, 1960; Resnick et al, 1963; Walden, 1983). The present results suggest that, even if stable sensitive instruments can be achieved, the fundamental limitation of acclimatization would still potentially debar particular optima from being identified prior to material exposure to the particular frequency responses. There is, however, some encouraging evidence in the significant differences even at Week 0 in the response time element of the sentence verification test, which suggest some promise of escape from uncertainty.

The data in Gatehouse (1992b) showed that for a small group of 4 new hearing aid users, the benefits of a rising over a flat frequency response only became apparent following some 4 to 6 weeks of experience. The present study on a larger subject group further expands the argument, though here in experienced hearing aid users. Given the previous results, the new data suggest that the benefits of making a range of high-frequency speech cues newly available to the listener only become apparent when the listener has substantial experience and exposure to those new cues. The two sets of results make it difficult to interpret the finding in terms of subjects' motivation and effort in familiar circumstances, but rather suggest an underlying change in perceptual mechanisms.
In choosing a group of listeners with substantial "under-amplification" at high frequency, the present experiment has allowed potentially large acclimatization effects, due to the potentially large change in patterns of speech available to the listener. Further experiments are required to quantify the magnitude of any improvements in performance for differing frequency regions. Although the present data have shown mean improvements across a group of subjects, without extensive repeat measures as in Gatehouse (1992b) it is not possible to quantify the number of subjects who, as individuals, show a performance improvement with time, given the magnitude of a critical difference of approximately 15 percent for a procedure such as the 80-item FAAF test. The relationship between magnitude of impairment and the auditory and nonauditory characteristics of the listeners remains the subject of further study.

The present results suggest that the current literature on comparative evaluation of frequency responses be re-interpreted, not only in the light of the now known limitations of the particular speech instruments employed, but also in terms of the time given to the various frequency responses tested and whether this achieved a meaningful comparison. It seems sensible to suggest that the same process might apply to intelligibility judgments rather than performance tests themselves, although the present experimental results do not speak directly to that issue. It is also tempting to speculate that similar results might be found for other types of processing, in particular some of the signal processing schemes that attempt to radically alter the nature of the speech signal presented to the listener. If this does prove to be the case, then the development of wearable digital signal processing devices (as opposed to simulations on laboratory computers where listening exposure and experience is by necessity limited) becomes a high priority.
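The critical difference of approximately 15 percent quoted above for the 80-item FAAF test can be illustrated with a simple binomial model. This is only a sketch under stated assumptions: each of the 80 items is treated as an independent trial with the same probability of a correct response, and the critical difference is taken as the two-sided 95 percent limit on the difference between two such scores. The source does not say how its figure was derived, and the function name `critical_difference` and its parameters are hypothetical.

```python
import math


def critical_difference(n_items, p_correct=0.5, z=1.96):
    """Approximate critical difference, in percentage points, between two
    scores on an n_items test under an assumed binomial model.

    p_correct is the assumed underlying proportion correct; z = 1.96 gives
    a two-sided 95 percent criterion (normal approximation).
    """
    # Standard error of a single proportion-correct score.
    se_single = math.sqrt(p_correct * (1.0 - p_correct) / n_items)
    # The difference of two independent scores has sqrt(2) times that error.
    se_difference = math.sqrt(2.0) * se_single
    return 100.0 * z * se_difference


# 80 items, mid-range performance (the worst case for binomial variance).
print(f"{critical_difference(80):.1f} percentage points")
```

With 80 items and mid-range scores this model gives a value close to 15 percentage points, which is much larger than the mean group differences of a few percent reported here and is consistent with the point that individual improvements cannot be established from single test scores.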
With such wearable devices, experiments could be configured in such a way that hearing-impaired listeners are given the opportunity to be exposed to, and maximally benefit from, potentially advantageous signal processing strategies before they are assessed. Although it is likely that only those schemes with demonstrable potential benefits in the laboratory will proceed to field trials, it may be that the laboratory benefits are rather small and that potentially advantageous schemes may be unnecessarily rejected because of limitations in the experimental design.

Unlike the previous experiments by Gatehouse (1989, 1992a), the current results have been obtained from a relatively heterogeneous population in terms of age and hearing level and use not only word identification in noise but perhaps a more representative sentence-based speech test. Coupled with the results in Cox and Alexander (1992), where changes in both measured and reported hearing-aid benefit across time are documented (although not relative changes of one hearing-aid prescription against another), and similar findings in a study on cochlear implant patients (Tyler et al, 1986), we suggest that the role of acclimatization to particular hearing aid characteristics must be seriously considered both in the design of experimental research studies and in clinical practice where comparative selection regimes are in operation.

REFERENCES

Baer T, Moore BCJ, Gatehouse S. (1992). Spectral contrast enhancement of speech in noise for listeners with sensorineural hearing impairment: effects on intelligibility, quality and response times. J Rehabil Res Dev, in press.
Barfod J. (1979). Speech perception processes and fitting of hearing aids. Audiology 18:430-441.
Berger KW, Hagberg EN. (1982). Hearing aid users attitudes and hearing aid usage. Monogr Contemp Audiol 3:24.
Berger KW, Hagberg EN, Raine RL. (1984). Prescription of Hearing Aids: Rationale, Procedures and Results. 4th ed. Kent, OH: Herald.
Berger KW. (1990). The use of an articulation index to compare three hearing aid prescriptive methods. Audecibel 39:16-19.
British Society of Audiology/British Association of Otolaryngologists. (1981). Recommended procedures for pure tone audiometry using a manually operated instrument. Br J Audiol 15:213-216.
British Standards Institution. (1969). Specification for a Reference Zero for the Calibration of Pure Tone Audiometers. Data for Certain Earphones Used in Commercial Practice. BS2497 Part II. London: British Standards Institute.
British Standards Institution. (1972). Specification for a Reference Zero for the Calibration of Pure Tone Audiometers. Normal Threshold of Hearing for Pure Tones by Bone Conduction. BS2497 Part IV. London: British Standards Institute.
Brooks DN. (1989). Adult Auditory Rehabilitation. London: Chapman and Hall.
Byrne D, Dillon H. (1986). The National Acoustics Laboratories (NAL) new procedure for selecting the gain and frequency response of a hearing aid. Ear Hear 7:257-265.
Cox RM. (1985). Hearing aids and aural rehabilitation: a structured approach to hearing aid selection. Ear Hear 6:226-239.
Cox RM, Alexander GC. (1992). Maturation of hearing aid benefit: objective and subjective measurements. Ear Hear 13:131-141.
Foster JR, Haggard MP. (1979). FAAF - An efficient analytical test of speech perception. Proc Inst Acoust 182:9-12.
Foster JR, Haggard MP. (1987). The four alternative auditory feature test (FAAF): the acoustic and psychometric properties of the material with normative data in noise. Br J Audiol 21:165-174.
Gatehouse S. (1989). Apparent auditory deprivation effects of late onset: the role of presentation level. J Acoust Soc Am 86:2103-2106.
Gatehouse S. (1992a). The timecourse and magnitude of perceptual acclimatisation to frequency responses: evidence from monaural fitting of hearing aids. J Acoust Soc Am 92(3):1258-1268.
Gatehouse S. (1992b). The Evaluation of a Sentence Verification Test for the Assessment of Hearing Aid Benefit. Unpublished data.
Gelfand SA, Silman S, Ross L. (1987). Long-term effects of monaural, binaural and no amplification in subjects with bilateral hearing loss. Scand Audiol 16:201-207.
Hurley RM. (1991). Hearing Aid Use in Auditory Deprivation: A Prospective Study. Paper presented at the meeting of the American Academy of Audiology in Denver, Colorado, April 1991.
Kapteyn K. (1977). Satisfaction with fitted hearing aid. Scand Audiol 6:147-156.
Levitt H. (1971). Transformed up-down methods in psychoacoustics. J Acoust Soc Am 49:467-477.
McCandless GA, Lyregaard PE. (1983). Prescription of gain/output (POGO) for hearing aids. Hear Instr 34:16-21.
Pavlovic CV. (1988). Articulation index predictions of speech intelligibility in hearing aid selection. Asha 30:63-65.
Rankovic CM. (1991). An application of the articulation index to hearing aid fitting. J Speech Hear Res 34:391-402.
Resnick DM, Becker M. (1963). Hearing aid evaluation - a new approach. Asha 5:659-699.
Shore I, Bilger IC, Hirsh IH. (1960). Hearing aid evaluations: reliability of repeated measures. J Speech Hear Disord 25:152-170.
Silman S, Gelfand SA, Silverman CA. (1984). Late onset auditory deprivation: effects of monaural versus binaural aids. J Acoust Soc Am 76:1357-1362.
Silverman CA. (1989). Auditory deprivation. Hear Instr 40(9):26-29.
Skinner MW, Pascoe DP, Miller JD, Popelka GR. (1982). Measurements of the region for the optimum placement of speech energy within the listener's auditory area: a basis for selecting amplification characteristics. In: Studebaker GA, Bess FH, eds. Vanderbilt Hearing Aid Report. Monogr Contemp Audiol 161-169.
Stubblefield J, Nye C. (1989). Aided and unaided time related differences in word discrimination. Hear Instr 40:38-78.
Tyler RS, Preece JP, Lansing CR, Otto ST, Gantz BJ. (1986). Previous experience as a complementary factor in comparing cochlear implant processing schemes. J Speech Hear Disord 29:282-287.
Walden BE, Schwartz DM, Williams DL, Holum-Hardegen LL, Crowley JM. (1983). Test of the assumptions underlying comparative hearing aid evaluations. J Speech Hear Disord 48:264-273.
Yost WA. (1991). Auditory image perception and analysis - The basis for hearing. Hear Res 56:8-18.