XDP strategy LD white paper

Design and effects of post-spectral output
compression in cochlear implant coding
strategy
Manuel Segovia-Martinez, PhD., Senior Signal Processing Engineer, Neurelec
Dan Gnansia, PhD., Director Clinical & Scientific Research, Neurelec
Cochlear implant coding strategies classically apply automatic gain control to the input audio stream, followed by a front-end or output compression function that optimizes the mapping from the acoustical to the electrical dynamic range. This classic signal processing method aims to narrow the acoustical dynamic range so that more loudness quantization steps are available (Loizou 1998).
The goal of the present study is to describe the new signal processing strategy for Neurelec’s forthcoming speech processor, which integrates post-spectral analysis compression (named the XDP strategy). A novel frequency-selective compression is used. The compression settings use a new paradigm that eases and speeds up tuning by the audiologist.
Presets were designed to maximize the speech information sent to the patient while ensuring comfort in noisy and loud conditions. These presets were statistically determined in order to preserve 95% of the speech information in quiet, medium and loud environments.
Twelve unilateral patients and 7 binaural patients were fitted with this output compression function and tested on pure-tone thresholds and on speech perception in quiet and in noise. Patients were also asked to rate their preference when comparing XDP to previous strategies. Results showed improved intelligibility in quiet at the loudest presentation levels and in noise, with loudness growth observed in the normal range.
Altogether, these results suggest that output compression designed without automatic gain control allows full control in dB SPL in the fitting software and shows promising results.
Introduction
The signal processing strategy plays a central role in delivering the electrical stimuli to the patient. It must solve several challenges in order to give patients useful intelligibility in any environmental condition.
Since the early 1980s, most cochlear implant (CI) systems have included an input automatic gain control (AGC) to preserve the signal characteristics regardless of the input sound level (Loizou 1998). However, the use of input AGC introduces distortion into the signal (see Van Hoesel, Ramsden, and O’Driscoll (2002) for binaural distortion in CI, and Moore (2008), Stone and Moore (2007), Stone and Moore (2008), and Boyle et al. (2009) for monaural distortion in hearing aids (HA)). It also increases the complexity of the signal processing algorithm. Moreover, input AGC does not allow a direct mapping between the energy of the input signal and the energy of the electrical stimuli delivered to the patient.
Neurelec has developed a novel approach based on backend, frequency-based compression.
This signal processing strategy is named XDP.
The input to the backend compression transfer function (Figure 1) is the acoustic energy received at the microphones, expressed on a logarithmic scale, which is mapped directly to the electrical stimulation levels.
A statistical method has been proposed to maximize the percentage of speech information found in the pre-knee-point area. Different presets were then calculated for predefined environments.
Frequency ranges
Each electrode has an independent compression transfer function. To reduce the complexity of fitting, four frequency ranges have been determined by hierarchical clustering, as shown in Figure 2. Within each of these groups, the distributions of the Energy Density Spectrum are similar, and an identical compression transfer function is hence used.
The frequency ranges are as follows (obtained using a large database of speech signals from Western languages):
195 Hz to 846 Hz
846 Hz to 1497 Hz
1497 Hz to 3451 Hz
3451 Hz to 8000 Hz
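As an illustration, assigning an electrode to one of the four ranges can be sketched as a simple lookup. The function name, the handling of a frequency that falls exactly on a shared edge, and the example centre frequency are assumptions made for this sketch; they are not taken from the XDP implementation.

```python
from bisect import bisect_right

# Band edges in Hz, taken from the four frequency ranges listed above.
BAND_EDGES_HZ = [195, 846, 1497, 3451, 8000]


def frequency_range_index(center_hz: float) -> int:
    """Return the 0-based index of the frequency range containing center_hz.

    Assumption: a frequency lying exactly on a shared edge (e.g. 846 Hz)
    is assigned to the upper range; 8000 Hz belongs to the last range.
    """
    if not BAND_EDGES_HZ[0] <= center_hz <= BAND_EDGES_HZ[-1]:
        raise ValueError("frequency outside 195 Hz to 8000 Hz")
    index = bisect_right(BAND_EDGES_HZ, center_hz) - 1
    # Clamp so that the top edge maps to the last range, not past it.
    return min(index, len(BAND_EDGES_HZ) - 2)


# Hypothetical electrode centre frequency, for illustration only.
example_index = frequency_range_index(2000.0)
```

All electrodes that map to the same index would then share one compression transfer function.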
Figure 1. Backend Compression Transfer Function
Acoustic Energy Mapping
Previous coding strategies also include output compression functions; however, their mapping of the acoustic range onto the electrical range has usually not allowed an explicit mapping between dB SPL and stimulation level.
The Energy Density Spectrum per electrode band, $S_{xx}(n)$, is used to estimate the sound level in dB SPL:
$$\mathrm{EstdBSpl}(n) = 10 \log_{10}\left(S_{xx}(n)\right) + \mathrm{inputOffsetdB}$$
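The estimate above can be sketched in a few lines. This is a minimal sketch, not the processor's implementation: the function name is hypothetical, and the offset value used in the example is an arbitrary calibration constant chosen for illustration.

```python
import math


def estimate_db_spl(band_energy: float, input_offset_db: float) -> float:
    """Estimate the level of one electrode band in dB SPL.

    Implements EstdBSpl(n) = 10*log10(Sxx(n)) + inputOffsetdB, where
    band_energy is Sxx(n) and input_offset_db is inputOffsetdB.
    """
    if band_energy <= 0.0:
        raise ValueError("band energy must be positive")
    return 10.0 * math.log10(band_energy) + input_offset_db


# Hypothetical calibration offset (not from the source document).
EXAMPLE_OFFSET_DB = 94.0
example_level = estimate_db_spl(0.001, EXAMPLE_OFFSET_DB)
```

A tenfold change in band energy moves the estimate by 10 dB, which is what makes the input axis of the transfer function readable directly in dB SPL.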
Figure 2. Electrode Frequency clustering
Mapping dB SPL or dB HL levels to predefined stimulus levels gives the audiologist a more intuitive fitting procedure than was possible with prior art.
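A minimal sketch of such a mapping is given below, assuming a piecewise-linear transfer function with a single knee point. The 25 to 105 dB SPL input range and the 61 dB SPL knee are values quoted elsewhere in this document; the piecewise-linear shape, the output percentage at the knee (80%) and the function names are assumptions made for illustration only.

```python
def xdp_output_percent(level_db_spl: float,
                       idr_min: float = 25.0,
                       idr_max: float = 105.0,
                       knee_db_spl: float = 61.0,
                       knee_percent: float = 80.0) -> float:
    """Map an input level in dB SPL to a percentage of the electrical range.

    Assumption: two linear segments, 0% at idr_min, knee_percent at the
    knee point and 100% at idr_max. knee_percent is a hypothetical value.
    """
    # Levels outside the input dynamic range are clipped to its limits.
    level = min(max(level_db_spl, idr_min), idr_max)
    if level <= knee_db_spl:
        return knee_percent * (level - idr_min) / (knee_db_spl - idr_min)
    upper_span = idr_max - knee_db_spl
    return knee_percent + (100.0 - knee_percent) * (level - knee_db_spl) / upper_span


def stimulation_level(percent: float, t_level: float, c_level: float) -> float:
    """Convert a percentage into a level between the T and C levels."""
    return t_level + (c_level - t_level) * percent / 100.0
```

With this shape, most of the electrical range is spent on input levels below the knee point, and levels above it are compressed into the remaining part.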
Input dynamic control
Given that the input levels are represented using dB SPL or dB HL, the input dynamic range (IDR) can be controlled directly on the XDP compression transfer function.
Figure 1 shows a particular example where the IDR has been set from 25 dB SPL to 105 dB SPL.
Default knee points choice
The intermediate threshold level knee points are determined such that 95% of (output energy) levels typically appearing in speech situations are below these thresholds.
Three types of speech situations have been identified: 1) quiet, 2) medium and 3) loud.
Quiet: average speech 60 dB SPL
Medium: average speech 70 dB SPL
Loud: average speech 80 dB SPL
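The 95% criterion can be sketched as a percentile computation over the band levels observed in a speech database. The source does not detail the statistical method, so the nearest-rank percentile used here, the function name and the example levels are all assumptions.

```python
def knee_point_db(levels_db, coverage_percent: int = 95) -> float:
    """Return the smallest observed level that covers coverage_percent of the data.

    Uses the nearest-rank percentile: at least coverage_percent of the
    observed levels are less than or equal to the returned value.
    """
    if not levels_db:
        raise ValueError("at least one level is required")
    ordered = sorted(levels_db)
    # Integer ceiling of coverage_percent * n / 100 (1-based rank).
    rank = (coverage_percent * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical band levels in dB SPL, for illustration only.
example_levels = [48, 52, 55, 57, 58, 59, 60, 60, 61, 66]
example_knee = knee_point_db(example_levels)
```

Running the same computation per frequency range and per speech situation would yield one knee point for each preset and range.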
The medium knee points are shown in the following table (obtained using a large database of speech signals from Western languages):
Frequency range       Knee point (dB SPL)
195 Hz to 846 Hz      61
846 Hz to 1497 Hz     61
1497 Hz to 3451 Hz    57
3451 Hz to 8000 Hz    50
Table 1. Frequency Ranges and knee points for Medium Environment Conditions.
Fitting interface
Figure 3 shows the fitting interface currently available to set the transfer function of the XDP backend compression.
Figure 3. New Fitting System
Each transfer function controls a given frequency range (Table 1). The x-axis shows the IDR. The y-axis shows the percentage of the stimulation level (with respect to the T and C levels).
The default IDR is from 25 to 105 dB SPL. This range can be adjusted if required in another dedicated control panel.
The knee points can be set according to the speech situation, or can be adjusted by the audiologist if required.
Outcomes
Material and methods
Subjects
21 subjects were included from two different ENT departments of university hospitals (Pellegrin hospital in Bordeaux, Pasteur hospital in Nice). They were fully informed and provided written consent before participating in this study. This study was carried out in accordance with the Declaration of Helsinki, and was approved by the ethical committee CPP Sud Méditerranée Marseille I (Ref: 2012-A00112-41).
The subjects were adults from 32 to 72 years old (mean = 57, standard deviation = 12), native French speakers, implanted with a Neurelec cochlear implant for more than 9 months. 12 were Digisonic® SP users, and 9 were Digisonic® SP Binaural users.
Setup and procedure
Tests were performed in free field, in a sound booth. The testing consisted of pure-tone thresholds and speech perception in quiet and in noise.
For open-set speech identification, 50 recorded lists of 10 French disyllabic words (Fournier’s lists; Fournier, 1951) were presented at 40, 55, 70 and 85 dB SPL. For the noisy condition, a cocktail-party noise was presented at 55 dB SPL, with a fixed signal-to-noise ratio (SNR) of +10 dB. Both signal and noise were presented through the same loudspeaker facing the subject. A correct response corresponded to a fully identified word; two lists were presented for each condition.
All subjects completed the pure-tone threshold, speech-in-quiet and speech-in-noise tasks with their standard fitting. Then the XDP strategy was activated, and another evaluation was performed immediately. Subjects were asked to keep and use their XDP program for 30 days, and were evaluated again at a second visit.
All subjects were fitted and evaluated with the XDP strategy at medium-level presets. Digisonic® SP Binaural users were also evaluated with the Quiet and Loud presets at visit 2 for speech in quiet.
Data analysis and statistics
Paired t-tests were conducted between all scores measured with and without XDP, at the first and second visits.
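For readers unfamiliar with the paired t-test used in the data analysis, the statistic can be sketched as follows. The function name is hypothetical and the scores in the example are invented for illustration; they are not data from this study.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for two paired lists of scores.

    t is the mean of the per-subject differences (b - a) divided by the
    standard error of those differences.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two paired lists with at least two subjects")
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_value = mean(diffs) / (spread / math.sqrt(len(diffs)))
    return t_value, len(diffs) - 1


# Invented scores (percent correct) for two subjects, illustration only.
standard_scores = [50, 60]
xdp_scores = [51, 63]
t_value, dof = paired_t_statistic(standard_scores, xdp_scores)
```

The p-value then follows from Student's t distribution with the returned degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) computes both in one call.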
Results
Digisonic® SP users
Average pure-tone thresholds ranged between 35 and 20 dB HL, and no statistical difference was observed when activating XDP, even after 30 days of use (Figure 4).
Figure 4. Average pure-tone thresholds in standard fitting and with XDP at first and second visit.
Average scores for speech in quiet are shown in Figure 5. With standard fitting, scores showed a maximum of 66% at 70 dB SPL. Evaluation with the XDP strategy showed no improvement at the first visit; however, a significant improvement of 12% was noted at 70 dB SPL after 30 days of use.
Figure 5. Speech intelligibility in quiet as a function of presentation level for standard fitting and XDP at first and second visit.
Speech identification in quiet at 70 dB SPL was also performed with the old standard program after 30 days of XDP use, as a control condition (Figure 6). Results showed no difference with standard fitting between the two visits, whereas a statistically significant improvement was observed at the second visit.
Figure 6. Speech intelligibility in quiet with standard and XDP fittings in immediate evaluation and after 30 days of XDP use.
Scores for speech intelligibility tests in the noisy condition are shown in Figure 7. With standard fitting, the average score was 28%, and a large and significant improvement of 27% was observed with XDP after 30 days. Scores with standard fitting were statistically similar for the two visits.
Figure 7. Speech intelligibility in cocktail party noise (10 dB SNR) with standard and XDP fittings in immediate evaluation and after 30 days of XDP use.
Digisonic® SP Binaural users
Average pure-tone thresholds ranged between 35 and 10 dB HL, and no statistical difference was observed when activating XDP, even after 30 days of use (Figure 8).
Figure 8. Average pure-tone thresholds in standard fitting and with XDP at first and second visit.
Figure 9 shows intelligibility in quiet with standard fitting, with XDP fitting at the first and second visits, and in a last condition where the XDP presets were changed. In this so-called ‘Optimized’ condition, the Quiet preset was used for evaluations at 40 and 55 dB SPL, and the Loud preset was used for evaluations at 70 and 85 dB SPL. Compared with standard fitting at the first visit, significant differences were observed with XDP after 30 days of use and with the Optimized condition, both at 55 and 85 dB SPL.
Figure 9. Speech intelligibility in quiet as a function of presentation level for standard fitting, XDP at first and second visit, and optimized XDP presets at second visit.
Scores for speech intelligibility tests in the noisy condition are shown in Figure 10. With standard fitting, the average score was 39%, and, as for unilateral Digisonic® SP users, a large and significant improvement of 30% was observed with XDP after 30 days.
Figure 10. Speech intelligibility in cocktail party noise (10 dB SNR) with standard fitting and XDP at first and second visit.
Discussion
The main difference between XDP and standard fitting lies in the management of the electrical dynamic range. As this should not affect absolute thresholds, it is consistent that no effect of XDP on pure-tone thresholds was observed.
Improvements in quiet and in noise, for both unilateral and Binaural users, may be related to the maximization of speech information below the knee point: 95% of levels in a given speech situation are below this threshold. For example, in the high-frequency band, speech consonants (fricatives) are strongly enhanced, owing to the value of the knee point. Moreover, for high SNRs such as the 10 dB tested here, the low-intensity noise is coded on a smaller electrical dynamic range than with standard fitting. This allowed subjects to benefit from a larger dynamic range for speech.
The effect of XDP presets was not systematically assessed; however, results obtained with the ‘Optimized’ condition shown in Figure 9 suggest that these presets influence speech reception in quiet for different sound environments. Several adapted programs could then be used in clinical practice.
XDP changes the way sound information is coded, which can affect CI users with experience of standard fitting. For this reason, adaptation time was needed to observe a significant benefit. It is difficult to evaluate this duration; however, a benefit was observed in the present study after 30 days of use.
Conclusions
• XDP shows great speech intelligibility improvement in quiet and in noise.
• Adaptation time is needed to observe improvements.
• Where speech in quiet is not improved, modifying XDP presets can help.
• All patients preferred the XDP strategy, right from the beginning.
References:
1. Boyle, Patrick J, Andreas Büchner, Michael A Stone, Thomas Lenarz, and Brian C J Moore. 2009. “Comparison of Dual-time-constant and Fast-acting Automatic Gain Control (AGC) Systems in Cochlear Implants.” International Journal of Audiology 48 (4) (April): 211–221.
2. Fournier JE. 1951. “Audiométrie vocale: les épreuves d’intelligibilité et leurs applications au diagnostic, à l’expertise et à la correction prothétique des surdités”, Maloine.
3. Loizou, P.C. 1998. “Mimicking the Human Ear.” IEEE Signal Processing Magazine 15 (5): 101–130.
4. Moore, Brian CJ. 2008. “The Choice of Compression Speed in Hearing Aids: Theoretical and Practical Considerations and the Role of Individual Differences.” Trends in Amplification 12 (2): 103–112.
5. Stone, Michael A., and Brian CJ Moore. 2007. “Quantifying the Effects of Fast-acting Compression on the Envelope of Speech.” The Journal of the Acoustical Society of America 121: 1654.
6. Stone, Michael A., and Brian CJ Moore. 2008. “Effects of Spectro-temporal Modulation Changes Produced by Multi-channel Compression on Intelligibility in a Competing-speech Task.” The Journal of the Acoustical Society of America 123: 1063.
7. Van Hoesel, Richard, Richard Ramsden, and Martin O’Driscoll. 2002. “Sound-direction Identification, Interaural Time Delay Discrimination, and Speech Intelligibility Advantages in Noise for a Bilateral Cochlear Implant User.” Ear and Hearing 23 (2) (April): 137–149.
Phone: +33 4 93 95 18 18
Fax: +33 4 93 95 38 01
[email protected]
www.neurelec.com
DOCEXT0243-A
NEURELEC
2720 Chemin Saint Bernard
06224 VALLAURIS Cedex
France