Masking of speech by amplitude-modulated noise
HansA. Gustafsson
andStigD.Arlinger
Department
of Technical
.4udiology,
University
Hospital,$-$8185 LinkSping,
Sweden
(Received11 September
1991;revised1 June1993;accepted21 August1993)
The maskingof speechby amplitude-modulated
and unmodulated
speech-spectrum
noisehas
beenevaluatedby the measurement
of monauralspeechrecognition
in suchnoiseon youngand
elderlysubjects
with normal-hearing
and elderlyhearing-impaired
subjects
with and withouta
hearingaid. Sinusoidal
modulationwith frequencies
coveringthe range2-100 Hz, as well asan
irregularmodulationgenerated
by the sumof four sinusoids
in randomphaserelation,wasused.
Modulationdegrees
were 100%, + 6 dB, and q-12 dB. Root mean-square
soundpressure
level
wasequalfor modulated
andunmodulated
maskors.
For thenormal-hearing
subjects,
essentially
all types of modulatednoiseprovidedsomereleaseof speechmaskingas comparedto
unmodulated
noise.Sinusoidal
modulationprovidedmorereleaseof maskingthan the irregular
modulation.The releaseof maskingincreasedwith modulationdepth.It is proposedthat the
numberand duration of low-levelintervalsare essentialfactorsfor the degreeof masking.The
releaseof maskingwasfoundto reacha maximumat a modulationfrequencybetween10 and
20 Hz for sinusoidalmodulation.For elderlyhearing-impaired
subjects,the releaseof masking
obtainedfrom amplitudemodulationwas consistently
smallerthan in the normal-hearing
groups,presumably
relatedto changesin auditorytemporalresolutioncausedby the hearing
loss.The averagespeech-to-noise
ratio requiredfor 30% correctspeechrecognitionvaried
greatlybetweenthe groups:For youngnormal-hearing
subjectsit was --15 dB, for elderly
normal-hearing
it was -9 dB, for elderlyhearing-impaired
subjectsin the unaidedlistening
conditionit was +2 dB and in the aided conditionit was +3 dB. The resultssupportthe
conclusionthat within the methodological
contextof the study,age as well as sensorineural
hearingloss,assuch,influencespeech
recognition
in noisemorethanwhatcanbe explained
by
the lossof audibility,accordingto the audiogramand the maskingnoisespectrum.
PACS numbers: 43.71.Lz, 43.71.Ky, 43.72.Dv
INTRODUCTION
One seriouseffectof noiseis its negativeimpact on
speechcommunication.
The speechmaskingabilityof a
noisedependsprimarilyon the relationbetweenspeech
and noiseintensities,i.e., on the speech-to-noise
ratio (S/
N). Usually, the noiseintensityis expressed
as a timeaveragelevel, e.g., the rms level measuredwith a time
constantof 125 ms (i.e., the "fast" responseof the sound
levelmeter). However,the presenceof amplitudemodulation (AM) in the noisemay also influenceits speechinterferingeffect.Two noisesof equal mean intensity,one
containingAM andtheothernot,maywell differin speech
masking.One might speculatethat the presence
of AM
speech-spectrum
noise,singletalker competition,or periodicallymodulatedwhite or speech-spectrum
noise.Most
often however, constant level (unmodulated) white or
speech-spectrum
noisehasbeenusedasa reference
masker.
The results from such studies can be summarized as follows.
When the maskeris the speechfrom a singletalkeror
noisemodulatedeitherperiodicallyor by the speechof a
singletalker, speechintelligibilityimprovescomparedto
when unmodulated noise is used (Carhart etal., 1966,
1967, 1969; Dirks et al., 1969; Festen and Plomp, 1990;
Miller and Licklider, 1950; Wilson and Carhart, 1969),
evenif the modulatedand unmodulatednoiseshave equal
level of
mightincrease
thespeech
maskingeffectdueto activation averagepowers.In suchcases,the instantaneous
of neuronsthat are sensitivespecificallyto AM (M•ller,
1971) and thusmake them lesscapableof reactingto amplitude modulationsin the speechsignal.On the other
hand, the intensityfluctuationsmight provide opportunities for the listenerto pick up importantfragmentsof the
speechsignalduringthe lessloud intervalsof the modulation cycleandthusprovidea releaseof speechmaskingas
comparedto unmodulatednoise.
A number of studieson speechrecognitionin AM
noisehavebeenpublished.The maskersusedhavegenerally beenoneor moreof the following:Multitalkerbabble,
speech-modulated
white or speech-spectrum
noise,single
talker competition,or periodicallymodulatedwhite or
518
J. Acoust.So½.Am. 95 (1), January1994
such maskersis either higher or lower than the average
(rms) level during a relativelylarge portionof time. Presumablytheseresultsare explainedby more speechinformation being detectedduring noiseintervals of lower level
than what is lost during noiseintervalsof higherlevel.
Multitalker noiseor noisemodulatedby multiple talkershasbeenfoundto generallyreducespeechintelligibility
as comparedto unmodulatednoise(Carhart et al., 1969;
Danbauerand Leppler, 1979). Multitalker noisewasmore
effectiveas a maskerthan random noisemodulatedby multipletalkers,andtheeffectwasattributedto theperceptual
similaritybetweenthis maskerand the targetsignal(Carhart et al., 1975;Young et al., 1975).
0001-4966/94/95(1)/518/12/$6.00
¸ 1994 AcousticalSocietyof America
518
Resultscontradicting
thosementionedhavealsobeen
reported.For example,Berry and Nerbonne(1972) and
Horii et el. (1970) foundthat speech
modulated
by a single talker masksspeechmore than unmodulatednoise
Hearing
includes
threesubject
groups:
Young,normalhearing,elderly,normalhearingandelderly,hearingimpaired.Since
hearingaidsmayaffectthe temporalcharacteristics
of the
sound transmitted,the third group was chosento be
hearing-aiduserswho were testedboth with and without
their hearingaids.Speechrecognition
scoresin different
kinds of amplitude-modulatednoise were measuredand
comparedto the scorein unmodulatednoise.
Level
(dr})
o
lO
does. Danhauer etal. (1985) found multitalker noise to
maskspeechlesseffectively
thanunmodulated
noise.These
discrepancies
may be dueto differentwaysof measuring
maskerpoweror failure to compensate
for differences
in
averagemaskerpower.
Younger,normal-hearingsubjectswere usedin all of
theabovestudies.
Therehavealsobeeninvestigations
concerning elderly and hearing-impairedsubjects,which
clearlyshowdifferentresultsfor thesegroupsascompared
to normal-hearing
subjects.
The advant•,ge
of a periodically
modulatedmaskerappearsto be much lessfor hearingimpairedsubjects(Wilson and Carhart, 1969). Also, for
hearing-impaired
subjects,
a forwardor backwardspeech
maskerresultsin equalor more maskingthan unmodulated noise does;contrary to the relation for normalhearing(FestenandPlomp,1990;Plomp,1986).Elderly
normal-hearing
andelderlyhearing-impaired
listeners
both
have elevatedsentencerecognitionthresholdsin babble
noise(Gelfendet el., 1988).The useof hearingaidsdoes
notimprovethe performance
(Tillmanet el., 1970;WelzlMuller and Stephan,1988).
In view of the conflictingresultsfound in different
studies
with differentmaskingnoisesanddifferentsubject
populations,
it wasdecidedto investigate
severallistener
groupswith severaldifferentmaskers
in a singlestudy.It
Fhreshold
2o
qo
5o
125
250
500
i
k
F requen½•j
2
k
4
k
8
k
(Hz)
FIG. 1. Mean hearingthresholdlevelsplus/minusonestandarddeviation
for the groupof elderlynormal-hearingsubjects.
old men accordingto ISO 7029 (1984). Further, the limits
correspond
to the clinicallynormalrangefor frequencies
up to and including3 kHz, at the higherfrequencies
the
limitscorrespond
to a mild hearingloss.A maximumHTL
difference
of 10 dB betweenleft and right earsat any frequencywas allowed.Six left earsand 14 right earswere
tested. The mean HTLs and standard deviations are shown
in Fig. 1. Only at 8 kHz doesthe mean HTL exceedthe
normal range.
3. Elderly, hearing-impairedsubjects
Twenty subjects,nine men and elevenwomen,aged
55-70 years,participated.All subjectsusedhearingaids
for at leastoneyear ( 12 postauricular;8 in-the-ear).Nine
left earsand 11 right earsweretested.The meanHTLs and
standarddeviationsare shownin Fig. 2. Figure3 showsthe
averageinsertiongainfor the aidswhenworn in the tested
ears.
B. Noise
The maskerwas a speech-shaped
randomnoiseob-
I. METHOD
tainedby low-pass
filteringa whitenoisewith a slopeof
-6 dB/octabove1000Hz. Thisnoisewaspresented
either
A. Subjects
1. Young,normal-hearingsubjects
Eleven young, normal-hearingindividuals, 17-33
years,six malesand five females,participated.For each
subject,a testear waschosen;
fiveleft earsand six right
ears.The selection
criteriawerehearingthresholdlevels
(HTLs) of no worsethan 20 dB HL (re':ISO389, 1991) at
all test frequencies125-8000 Hz. The HTL differencebe-
tweenearswas not allowedto exceed10 dB at any frequency.The averageHTLs wereall in the rangefrom0 to
unmodulatedor modulatedin one of two ways:Sinusoidallyby a singlesinusoid
or irregularly.For the latter,the
Hearing
Threshold
Level
(dB)
o
20
40
6O
5 dB HL.
80
2. Elderly, normal-hearingsubjects
tO0
Twentysubjects,
ten womenandten men,aged54-69
years,participated.For eachsubjecta testear waschosen
accordingto the audiogram.The maxirnumHTLs allowed
were20 dB HL in the frequencyrange125-3000Hz, 30 dB
at 4 kHz, 35 dB at 6 kHz, and 40 dB at 8 kHz. These limits
correspond
approximately
to themedianHTLs for 60-year
519 d.Acoust.
Soc.Am.,Vol.95,No.1, January
1994
120
125
250
500
!
Frequencu
k
2 k
4 k
8 k
(Hz)
FIG. 2. Meanhearingthresholdlevelsplus/minusonestandarddeviation
for the groupof elderlyhearing-impaired
subjects.
H.A. Gustafsson
andS. D.Arlinger:
Masking
of speech
13y
AMnoise 519
50
Insertion
gain
(dB)
Sound
pressure
tel.
the
5 envelopeof unmodulated
noise
40
4
//\\
30
20
2
10
1
0
o
0
Y/2
(a)
Frequenc•
--
+/-
12
....
+/-
6
--
100
%
dB
dB
T
Time
(Hz)
Sound
18
FIG. 3. Mean insertiongainplus/minusone standarddeviationfor the
hearingaidsusedby the elderly,hearing-impaired
testsubjects.
the
pressure
envelope
level
of
unmod.
in
d8
noise
1z
G
sumof four sinusoidsof equalamplitudein randomphase
relationwasusedas modulatingsignal.The samerangeof
amplitudevariationwas usedfor singlesinusoidalmodulation as for the irregularmodulation.
The degreesof modulationwereeither 4-6 dB, 4- 12
dB, or + 100%, as illustratedin Fig. 4 on both linear (a)
andlogarithmic(b) ordinatescales.
The logarithmicmodulations+ 6 dB and 4- 12 dB causeasymmetricsoundpressurevariationswhereasthe changesin soundpressurelevel
are symmetric.The oppositeappliesto the linear 4- 100%
modulation.The useof both linear and logarithmicmodulation types in the study would thereforeillustrate the
relativeimportancefor maskingof the higher-levelandthe
lower-level half intervals of the amplitude-modulated
noise.Mathematically,the modulatingsignal,u(t), was
one of the following:
1 •v
o
-G
-12
-lB
+/-6
...................................
-24
Time
45)
(b)
lB
dB
0
-18
-24
-36
u]00%(t)=l+•
• sin(2IIfit+Oi),
i=1
(1)
-42
1
48
U6dB(t)
=2(I/N)Y•=•
sin(2nfit+Oi),
(2)
U12
dB(t)=2
(2/•v)Z•=
lsin(2n/it+oi),
(3)
where
N= 1forsinusoidal
modulation
andN----4
forirregular modulation. For unmodulated noise, u(t)=
1.
In order to obtainthe samerms soundpressurelevel
(i.e., the sameaveragepower) for the modulatedand unmodulated noises,the modulated noise had to be attenuated between 0.5 and 6 dB relative to the rms level of
unmodulatednoise,dependingon modulationdegreeand
whethersinusoidal
or irregularmodulationwasused.The
attenuation factors were calculated mathematically and
were verifiedby measuringthe rms level of the different
modulatedsignals,after correction.The correctionfactors
are listedin Table I. The rms soundpressurelevelwasset
individuallyfor eachsubject.Figure4(c) showsthe resulting rangesof amplitudevariationrelativeto the rmslevel
of the unmodulatednoisefor the differenttypesof modulation.
The frequencies
of the modulatingsinusoids
were2.1,
4.9, 10.2, and 19.9 Hz, coveringthe range of significant
modulationfrequencies
in human speech(Fastl, 1987).
520
dB
100 k
d. Acoust.Sec. Am., Vol. 95, No. 1, January1994
SG
S12
SlO0
IRR6
IRR12
IRR100
FIG. 4. The differentmodulationdegreesillustratedfor sinusoidal
modulation in (a) linear scale,(b) logarithmicscale.The figuresshowthe
soundpressureenvelopeof the differentmodulatednoiseswithoutrms
compensation
applied.On they axis,I (in linearscale)and0 dB (in log
scale)correspond
to theenvelope
of unmodulated
noise.T is theperiod.
In (c) is shownthe envelopevariationrangesfor differentmodulations
with rms correctionapplied.Zero dB is the level of the envelopeof
unmodulated
noise.The actualmodulationdepthof 100% modulationis
- •0, indicatedby downwardpointingarrows.
They werechosennot to be evenmultiplesof eachotherin
orderto assuremaximumirregularitywhenaddedtogether
to producethe irregularmodulationcondition.Five of the
subjectsin the experimentwere alsotestedat the higher
modulationfrequencies
of 50, 80, and 100Hz with a modulationdegreeof i 12 dB. The differentcombinations
of
modulationparameters
usedare listedin TableII.
C. Speech
The speechmaterial(Hagerman,1982) consists
of 11
lists with ten Swedishlow-redundancyfive-word sentences
H.A. Gustafssonand S. D. Arlinger:Maskingof speechby AM noise
520
TABLE I. Attenuationfactorsusedto equatethe rms soundpressure
level of each modulated masker to that of unmodulated noise. N is the
numberof sinusoids
in the modulatingsignal.
Modulationtype
Attenuation(dB)
4- 100%, N= 1
1.8
groupof fiveof the young,normal-hearing
subjects
participatedin an additionalexperiment
to gaininformationon
the effectof highermodulationfrequencies.
Monaural speechrecognitionscorewas measuredfor
the first 13 of the maskers listed in Table II. The S/N was
At=4
individually chosenfor each subject to achieve a 30%
0.5
4- 6 dB, N= I
1.9
N=4
0.5
4- 12 dB, •r=l
6.1
N=4
2.O
in each.The sentences
are spokenby a femalevoice.All
sentenceshave the samestructure:{name) {verb) {number) {adjective){noun), e.g.,"Peterheldninenewboxes"
(in Swedish"Peter h611nio nya lfidor"). The same 50
wordsappearin all of the 11 listsbut in differentcomputereditedcombinations.
There is a 7-s pausebetweenconsecutive sentencesin a list during which time the subjectis
able to repeat the sentence.For each ten-sentencelist, the
numberof wordscorrectlyrecognizedwas notedand convertedto a speechrecognitionscorein percent.
Test tapeswere cassetterecordingscopiedfrom the
masterrecordingof the materials and included a 1000-Hz
calibration tone. The level of the calibration
tone was 3.4
dB below the speechlevel not exceeded90% of the time,
measured with the "fast" time constant of the sound level
meter.All speechlevelsreportedbelowreferto the levelof
this tone. The soundpressurelevel was kept at 65 dB for
all testsexceptfor testingthe elderly, hearing-impaired
subjectswithouttheir hearingaids.In thiscase,the speech
was adjustedto a comfortablelisteninglevel (mean 87 dB
SPL).
D. Test procedures
The main experimentusedmodulationfrequenciesin
the 2- to 20-Hz rangeand comprisedall subjects.A subTABLE II. The maskersusedin the mainstudy."Irregular" denotesthe
modulationwherethe modulatingsignalconsistsof the sum of four sinusoicls
in randomphaserelation.AM maskerNos. 2-13 were usedin the
main experimentand Nos. 14-16 in thc additionalexperiment.
Masker no.
I
none
2
3
4
irregular,
irregular,
irregular,
5
sinusoidal,
Short form
UNMOD
4-6 dB
4-12 dB
4-100%
IRR6
IRRI2
IRR 100
4-6 dB 4.9 Hz
S6/4.9
6
4-12 dB 2.1 Hz
S12/2.1
7
+ 12 dB 4.9 Hz
S12/4.9
S12/10.2
S12/19.9
S100/2.1
S100/4.9
S100/10.2
5100/19.9
S12/50
S12/80
S12/100
8
9
10
I1
12
13
14
15
16
5:)1
Modulation
4-12 dB
4-12 dB
4-100%
4-100%
4-100%
•- 100%
-• 12 dB
4-12 dB
=e12 dB
10.2 Hz
19.9 Hz
2.1 Hz
4.9 Hz
10.2 Hz
19.9 Hz
50 Itz
80 Hz
100 Hz
J. Acoust.Soc.Am.,Vol. 95, No. 1, January1994
speechrecognitionscore with the unmodulatednoise
masker.Pilot experimentsrevealedthat AM offereda releaseof maskingratherthan an increaseand thusthe 30%
criterion was adopted to obtain a test with the highest
sensitivitypossible.The order of maskerpresentationwas
randomly selectedfor each subject.The speechlevel was
kept constantat 65 dB SPL and S/N adjustmentswere
accomplished
by varyingthe noiseleveliu 3-dB steps.Each
test session lasted for about Ih.
The youngand elderly,normal-hearingsubjectswere
testedusingheadphones
as follows:Three training lists
werepresented
to aquaintthesubjectwith thetestsituation
and to reducelearningeffects(Hagermar., 1982), usingthe
unmodulatednoise masker. The S/N was initially large
enoughfor the subjectto recognizeall the words,but was
graduallyreduceduntil the speechrecognitionscorewas
about 30%.
Two more lists (randomly selectedfor each subject)
with the unmodulatednoisemasker were presented.The
speech-to-noise
ratioswere chosento yield speechrecognition scoreson both sidesof 30%. The S,qqcorresponding
to a speechrecognitionscoreof 30% wasthen calculated
by linear interpolation.Keepingthe S/N constantat the
value just described,speechrecognitionscoresfor AM
maskers2-13 (Table II) were measuredusingonespeech
list for each masker. The same lists in the same order were
usedfor eachsubjectwhereasthe order of maskerpresentation was randomly selectedfor each subject.Finally,
withoutalteringthe S/N, the scorefor unmaodulated
masking noisewas determinedonce more usingone randomly
selectedlist for each subject.The mean of this scoreand
the initially determinedscorewastaken.asthe speechrecognitionscorefor unmodulatednoise.This procedurewas
applied to compensatefor learning or fatigue effectsthat
might haveoccurredduringthe courseof the experiment.
In addition,five of the youngnormal-hearingsubjects
werealsotestedusingsinusoidalmodulationfrequencies
of
50, 80, and 100 Hz. The S/N was determined in the same
way as in the main experiment.
The elderly, hearing-impairedwere testedboth with
and withouttheir hearingaidsusingloudspeaker
presentation. The sameprocedureas just de•:ribed abovewas
usedwith the followingexceptions:The aidedand unaided
experiments
wereperformeddifferenttes!days.For half of
the groupthe aidedtestsprecededthe unaided,and for the
otherhalf the oppositeorderwasused.The nontestear was
alwayspluggedwith an EAR foam plug. In the aidedexperiments, the speechlevel was 65 dB SPL as for the
normal-hearingsubjects.The subjectwas askedto adjust
the volume control of his hearing aid to a setting that
provideda comfortableloudness
levelof the speechsignal
presentedfrom the loudspeaker.However, in the unaided
experimentthe speechlevelwas individuallyadjustedto a
H.A. Gustafsson
and S. D. Arlinger:Maskingof speechby AM noise
521
TABLE IlL ANOVA tablefortheexperiment
ontheyoung,normal-hearing
subjects.
Themeansquare
fortheinteraction
between
maskers
andsubjects
is usedastheerrorestimatein all F ratios.SS=sum of squares,
DF= degrees
of freedom,MS= meansquare.MOD is a collective
denomination
of the
modulatedmaskers2-13. Likewise,IRR designates
maskers2-4, S maskers5-13, S12maskers6-9, andSI00 maskers10-13. m(...) denotesthe mean
scoreacrossmaskerswithin parenth•es.The comparisonbetweenmodulationfrequencies
appliesto the mean scoresacross4-12 dB and 100%
modulation.in the interactionbetweenmodulationdegreeand frequency,
comparisons
are not madebetweenscoresbut betweenscoredifferences,
indicatedby a D. For example,D2.1 denotesthe scoredifference
between+ 12 dB and 100% modulationat 2.1 Hz.
Comparison
Between maskers
SS
DF
40379.52
Betweensubjects
4317.45
Interaction
5347.64
Decompositionof $S betweenmaskers:
a/UNMOD-m(MOD)
b/m(IRR)-m(S)
MS
F
p
12
3364.96
75.52
< 0.001
10
431.75
9.69
< 0.001
120
8763.92
20281.71
I
I
44.56
8763.92
20281.71
......
196.68
455.16
<0.001
< 0.001
Within IRR
1733.09
2
866.55
19.45
<0.001
c/IRR6-m(IRR12&IRRI00)
1590.95
I
1590.95
35.69
< 0.001
d/IRRI2-IRRI00
Within S
142.55
9600.81
1
8
142.55
1200.10
3.20
26.93
<0.01
< 0.001
c/S6/4.9-m(S 12&S100)
d/m(S12)-m(S100)
Betweenrood. freq.
e/m(2.1&4.9)-m(
10.2& 19.9)
f/2.1-4.9
g/ 10.2-19.9
6749.17
35.64
1796.73
1313.64
384.09
99
I
I
3
I
I
I
6749.17
35.64
598.91
1313.64
384.09
99
151.46
0.80
13.44
29.48
8.62
2.22
< 0.001
N.S.
<0.001
< 0.001
< 0.005
N.S.
h/Interaction, degr&freq.
1019.27
3
339.76
7.62
<0.001
968.91
I
968.91
21.74
< 0.001
D4.9-D10.2
48.09
I
48.09
1.08
N.S.
D2.1-D19.9
2.27
1
2.27
0.05
N.S.
m( D2. I&D 19.9)-m(D4.9&D10.2)
level of comfortable
loudness which was then used consis-
tently for all unaided tests on that subject.The mean
speechlevelchosenwas87 dB SPL (standarddeviation4.1
dB).
pressurelevelsweremeasuredat the referencepoint (in the
absenceof the subject)by meansof a B&K 4165 microphone,preamplifierB&K 2619, and microphoneamplifier
B&K 2607. For calibrationof the equipmenta calibrator
type B&K 4230 was used.
E. Equipment
The noisewas generatedby a signalprocessing
computer basedon the Texas InstrumentsTMS 32010 signal
processor.The rms level of the AM noisewas equatedto
that of the unmodulatednoiseby meansof a built-in attenuator. Changesof modulationfrequencyand modulation degreewere controlledby a personalcomputer.The
noisesignaland the speechsignalfrom a TandbergTCD
310 cassette
recorderwereroutedto separatechannelsof a
Madsen OB 822 audiometer,which was usedto control the
soundlevels.From the audiometer,noiseand speechwere
separatelyfed to a mixer, which allowed any combination
of noiseand speechlevelsto be set individually for either
ear. The mixedsignalswerethen fed to the SennheiserHD
250 circumauralearphones.
The earphonemarkedleft was
alwaysplacedover the test ear.
Since the elderly, hearing-impaired subjects were
testedboth with and without heatingaid the speechand
noisehad to be presentedto them by meansof a loudspeaker(anechoicfrequencyresponse
flat within + 3 dB in
the range90 Hz-17 kHz). The test subjectwas seatedin a
chair in a soundproofroom approximately1 m in front of
and facingthe loudspeaker.
Sound level calibrationswere accomplishedfor the
headphoneusinga Bruel & Kjaer 4153 artificialear with
adaptersDB 0843 and YJ 0304 and B&K 2209 precision
soundlevelmeter. For loudspeakerpresentationthe sound
522
J. Acoust.Soc.Am.,Vol. 95, No. 1, January1994
II. RESULTS
A. Effects on speech masking by modulation
parameters
Differencesin speechrecognitionscorebetweenmaskers were analyzedby analysisof variance (ANOVA). A
two-way repeatedmesuresmodelwasused,the two factors
beingsubjectandmaskerwithsubjects
treatedasa random
factor.To furtherexploredifferences
betweenmaskers,the
sumof squaresfor maskerswassplitin orthogonalpartsas
detailedin Table III and discussed
below.An exceptionto
this was the analysisof the additionalexperiment.Since
this wasonly a supplementary
experiment,orthogonaldecompositionwas abandonedin favor of the possibilityof
extracting as much information as possible.Each F test
consideringthe comparisonof two mean values correspondsto a two-tailed t test.
1. Young, normal-hearing subjects
Speechrecognitionscoresobtained with the first 13
maskersof Table II on the young,normal-hearingsubjects
are shown in Fig. 5. The scorewith unmodulatedmasker is
the meanof the scoresmeasuredbeforeand after presentation of the modulatedmaskers.It is approximately30%
as expectedfrom the pilot work usedto setthe noiselevel
for this condition.
H.A. Gustafsson
and S. D. Arlinger:Maskingof speechby AM noise
522
100Speechrecognition score (%)
TABLE IV. ANOVA table for the additionalCXlxrimenton five young,
normal-hearingsubjects.The mean squarefor the interactionbetween
maskersandsubjects
is usedastheerrorestimatein all Fratios. SS=sum
of squares,DF=degrees of freedom,MS=mean square.LOW=the lowfrequencymaskers2.1-19.9 Hz and HIGH = the high-frequency
maskers
80
40
20
-
50, 80, and 100 Hz. m(...) denotesthe mean scoreacrossmaskerswithin
1111 IIII
il
IIII i111
I1:
IIII IIII
0
Unmod
12dE{
6d•
lOOk
Ir•.
4.9
S6dB
2.[
10.2
4.9
[9.9
S12d8
2.1
parenthesis.
Comparison
SS
Between maskers
9052.80
Betweensubjects
10.2
4.9
19.9
77.60
Interaction
SlOOk
839.20
UNMOD-m{HIGH}
m(LOW)-m{HIGH)
DF
7
MS
F
p
1293.26
43.15
< 0.001
4
28
19.40
0.65
29.9?
......
627.27
4680.01
!
I
627.2?
4680.01
20.93
156.15
<0.001
< 0.001
< 0.001
FIG. 5. Resultsof the main experimenton the groupof young,normalhearingsubjects.The bars show the mean speechrecognitionscoreob-
Within HIGH
746.13
2
373.0?
12.45
80-100
129.6
I
129.6
4.32
tained with each masker. The horizontal line above each bar shows mean
50-m(80&100)
m(80&I00)-UNMOD
616.53
235.2
I
I
616.53
235.2
plusone standarddeviation.
Table III showsthat the averagespeechrecognition
scorewith the amplitude-modulated
maskersis significantly differentfrom speechrecognitionin unmodulated
noise (a in Table III), and furthermore that the speech
maskingeffectof the irregularlymodulatednoiseis significantly different from that of the sinusoidallymodulated
noise(b). Modulationdepthof 4-6 dB resultedin significantly lessreleaseof speechmaskingthan 4-12 dB or
4-100% (c). However,therewasno significantdifference
betweenthe latter two degreesof modulation (d).
Significantdifferences
betweenmodulationfrequencies
werefound.The averagescorewith the two lowerfrequenciesdifferssignificantlyfrom the averagescorewith the
two higher (e). The internaldifferences
betweenthe two
lower and betweenthe two higherfrequencies
are not significant (f,g). Further, there is a significantinteraction
betweenmodulationdegreeand frequencyfor the 4-12 dB
and 4- 100% maskers(h). The interactionappearsin such
a way that 4-12-dB modulationyieldsspeechrecognition
scoresabout 5% greaterthan 4-100% does,when modulation frequencyis 2.1 or 19.9 Hz. When modulationfrequencyis 4.9 or 10.2 Hz, on the other hand, speechrecognitionwith 4-12-dB modulationis on the average8%
less than with 100%.
N.S.
20.57
7.84
<0.05
<0.001
<0.01
thehighermodulationfrequencies
50, 80, and 100Hz, with
modulationdegreea: 12 dB. The resultsfor 2-20 Hz (from
theoriginalexperiment)is alsoshownfor comparison.
The
ANOVA (Table IV) revealed significantdifferences(p
< 0.001) betweenmodulationfrequencies
and that the average score at high frequenciesis significantlydifferent
both from the unmodulated noise score ;and from the av-
eragescoreat low frequencies.
Further, the scoreat 50 Hz
differssignificantly
from the averageof the scoresat 80 and
100 Hz, whereasthe scoresat 80 and 100 Hz do not differ.
Finally, the averageof the 80- and 100-Hz scoresdiffers
from the unmodulated
score.
2. Elderly, normal-hearing subjects
Figure7 showsthe meanspeechrecognitionscoresin
the variousAM noisemaskersfor the group of elderly
subjectswith normal audiograms.Table V presentsthe
resultsof the ANOVA. When comparingthe outcomewith
the resultsfrom the young,normal-heariaggroupthe situationis very muchthe sameboth qualitativelyand quantitatively.The differences
concernthe interactionbetween
modulationdegreeand modulationfrequency,whichwas
Figure6 showsthe resultsfor the fivesubjects
testedat
1005Peechrecognition%core
Speechreco9nitionscore
75
60
5O
40
---
25
UNHOD
20
o
0
i''
•'''
•''
lb' '2b'' 'sb'sb'
[00
/11
il IIIl_t
IIII
II
Irr.
FIG. 6. Resultsof theadditionalexperimenton fiveof theyoung,normalhearingsubjects.
The mean + onestandarddeviationis indicatedat each
modulationfrequencytested.
J. Acoust.Soc. Am., VOL95, No. 1, January1994
ifil
Unmod
12dB 4.9 2.14.9
10.2
2.1 10.2
6dl3100Z
19o9 4.9 19.9
Modulation;requenc9<Hz>
523
-TI
80
$6dB
Sl2dB
SlOOk
FIG. 7. Mean speechrecognitionscoresfor the groupof elderly,normalhearingsubjectsobtainedwith each masker.The l'orizontal line above
each bar showsmean plus one standarddeviation.
H.A. Gustafssonand S. D. Arlinger:Maskingof speechhy AM noise
523
TABLE V. ANOVA tablefor theresultsobtained
on thegroupof elderlynormal-hearing
subjects.
The meansquarefor theinteraction
between
maskera
and subjects
is usedas the error estimatein all F ratios.SS=sumof squares,DF=degreesof freedom,MS=mean square.MOD is a collective
denomination
of themodulatedmaskers2-13. Likewise,IRR designates
maskers2-4, S maskers5-13, S12maskers6-9 andS100masketa10-13.m(... }
denotesthe meanscoreacrossmaskerswithinparenthesis.
The comparison
betweenmodulationfrequencies
appliesto the meanscoresacross+ 12 dB
and 100% modulation.In the interactionbetweenmodulationdegreeand frequency,comparisons
are not made betweenscoresbut betweenscore
differences,
indicatedby a D. For example,D2.1 denotesthe scoredifference
between4-12 dB and 100%modulation
at 2.1 Hz.
Comparison
SS
Between maskers
DF
78840.54
Betweensubjects
9984.66
Interaction
9612.69
12
19
MS
F
p
6570.04
155.83
< 0.001
525.51
228
12.46
42.16
< 0.001
......
Decompositionof SSbetweenmaskers:
IJNMOD-m ( MOD }
m{IRR}-m(S}
19480.01
36751.02
i
I
19480.01
36751.02
462.04
871.68
< 0.001
<0.001
Within IRR
3283.2
IRR6-m(IRRI2&IRRI00)
2116.8
2
1641.6
38.94
< 0.001
1
2116.8
50.21
IRR 12-IRR 100
Within S
< 0.001
1166.4
19326.31
I
8
I 166.4
2415.79
27.67
57.30
< 0.001
< 0.001
S6/4.9-m(SI2&SI001
m(S12)-m(S100}
14263.21
4.9
1
I
14263.21
4.9
338.30
0.12
<0.001
N.S.
4768.1
3
1589.37
37.70
<0.001
884.45
I
884.45
20.98
< 0.001
42.05
I
42.05
1.00
3841.6
I
3841.6
91.12
Interaction,degr&freq.
290.1
3
96.7
2.29
< 0.1
re{D2. I&D 19.91-m (D4.9&DI0.2)
211.6
I
211.6
5.02
< 0.05
Betweenrood.freq.
2. I-4.9
10.2-19.9
m(2.1&4.9)-m(10.2&19.9)
N.S.
< 0.001
D4.9-D 10.2
2.45
!
2.45
0.06
N.S.
D2.1 -D 19.9
76.05
1
76.05
1.80
N.S.
not statisticallysignificantfor the groupof elderlylisteners
with normalhearingbut wassignificantfor the youngsubjects (Table III).
3. Elderly, hearing-impaired subjects
conditions.However, there was again significantdifferences between the various m_askers used. The differences
generallycorrespondwith those found in the elderly,
normal-hearinggroup although they are much lesspronounced here.
The test results obtained without hearing aids (unaided} are shownin Fig. 8 and thoseobtainedwith hearing
aids (aided} in Fig. 9. The maximum releaseof speech
maskingfound,usingsinusoidalmodulationof + 12 dB at
10-20-Hz modulationfrequency,increasedthe speechrecognition scorefrom 30% to closeto 55%. This is significantly lessthan in the normal-hearinggroupswhere the
corresponding
increasewas from 30% to around80% (p
<0.001, Student's t test}. As shown in Table VI, there
were no significantdifferences
betweenunaidedand aided
B. Speech-to-noise ratio for 30% correct speech
recognition
The speech-to-noise
ratio that yieldedthe criterion
scoreof 30% correctly repeatedwords in nonmodulated
noisevariedconsiderablybetweenthe groups.The average
S/N differedsignificantlybetweenthe threesubjectgroups
(p<0.001, Student'st test) but not betweenaided and
unaidedlisteningfor the elderly hearingimpairedsubjects
(Fig. 101.
Speechrecognition
8O
iOoSpeech
recognition
score
80
40
II
20
40'
II IIii IIII
6•
Irr.
lOOk
S6dB
4.9
19.9
Sl•dB
4.9
19.9
0
SlO0•
Un•
'i-Ji
6dB
I rr.
J. Acoust.Soc. Am., Vol. 95, No. 1, January1994
4.9
l•dB
FIG. 8. Mean speechrecognitionscores(plusonestandarddeviation)for
the groupof elderly,hearing-impaired
subjectsfor eachmaskerin the
unaidedlisteningcondition.
524
10•
IIII
IIII
mmmm
mmm
2.1
10.2
4.9
•
•l•d•
19.9
2.1
10.2
4.•
•1•
FIG. 9. As Fig. 8 but in the aidedlisteningcondition.
H.A. Gustafssonand S. D. Arlinger:Maskingof speech by AM noise
524
TABLE VI. ANOVA tablefor theresults
fromtheelderlyhearing-impaired
subjects.
Individual
errormeansquares
areusedin lheF ratios.SS=sum
of squares,
DF=degreesof freedom,MS=meansquare.ERRMS=error mcansquare,DFERR=degreesof freedomfor ERRMS.The F ratio is
MS/ERRMS. AIDED denotes
measurements
with hearingaid andUNAIDED without.MOD is a collective
denomination
of the modulated
maskers
2-13. Likewise,IRR designatea
maskers
2-4, S maskms5-13, SI2 maskers
6-9, andSI00 maskers10-13.m(...) denotes
themeanscoreacross
maskers
withinparenthesis.
Thecomparison
between
modulation
frequencies
applies
tothemeanscores
across
+ 12dBand100%modulation.
In theinteraction
between
modulation
degree
andfrequency,
comparisons
arenotmadebetween
scores
butbetween
score
differences,
indicated
bya D. Forexample,
D2.1
denotes the score difference between 4-12 dB and 100% modulation at 2.1 Hz.
Comparison
SS
DF
MS
DFERR
ERRMS
Between maskers
32758.89
25
1310.36
475
Betweensubjects
24344.7
19
1281.3
475
Interaction
37262.80
475
78.45
F
p
78.45
16.70
< 0.001
78.45
16.33
<0.001
664.43
47.32
47.87
75.11
33.41
47.26
82.60
20.47
0.00
25.91
58.50
76.95
2.31
14.•7
25. fi3
92.37
N.S.
<0.001
<0.001
< 0.00 !
N.S.
< 0.001
< 0.001
<0.001
............
Decomposition
of SS betweenmaskers:
m(AIDED)-m(
UNAIDED}
Between maskers, unaided
UNMOD-m(MOD}
m (IRR)-m ( S)
Within IRR
Within S
S6/4.9-m(S 12&S100)
m(S12)-m(S100}
Betweenmod. freq.
2.1-4.9
10.2-19.9
m(2. l&4.9)-m(
10.2&19.9}
0.28
14201.15
2800.62
5780
154.53
5466
2117.02
1890.62
1
12
I
1
2
8
I
I
0.28
1183.43
2800.62
5780
77.27
683.25
2117.02
1890.62
19
228
19
19
38
152
19
19
1413.08
3
471.02
57
59.78
7.38
<0.001
80
22.05
!
I
80
22.05
19
19
57.37
53.52
1.59
0.41
N.S.
N.S.
1311.02
I
1311.02
19
68.45
19.15
< 0.001
Interaction,degr&freq.
45.27
3
15.09
57
31.88
0.47
N.S.
D2.1-D4.9
DI0.2-DI9.9
33.8
0.45
I
1
33.8
0.45
19
19
35.38
31.92
0.%
0.01
N.S.
N.S.
11.02
1546.46
2977.66
8187.76
37.27
19
228
19
19
38
28.34
60.74
62.28
85.26
79.09
0. 39
25.46
47.81
96.03
0.,17
N.S.
<0.001
< 0.001
< 0.001
N.S.
m(D2. I&D4.9)-m(D 10.2&D 19.9)
Between maskers, aided
UNMOD-m ( MOD )
m (IRR)-m (S)
Within IRR
11.02
18557.46
2977.66
8187.76
74.53
I
12
I
I
2
Within S
7317.51
8
914.69
152
52.90
17.29
< 0.001
S6/4.9-m(S 12&S100)
m(S12)-m(SI00)
2586.74
2387.02
I
I
2586.74
2387.02
19
19
59.53
37.50
43.46
63.66
<0.001
<0.001
Betweenmod. freq.
2317.48
3
57
62.97
12.;!7
<0.001
19
83.63
0.07
N.S.
19
45.67
0.35
N.S.
<0.001
2.1-4.9
6.05
10.2-19.9
16.2
m(2.1&4.9)-m(10.2&19.9)
1
I
772.49
6.05
16.2
2295.22
i
2295.22
19
59.59
38.51
Interaction,degr&freq.
26.28
3
8.76
57
45.76
0.].9
N.S.
D2.1-D4.9
I 1.25
1
11.25
19
53.67
0.21
N.S.
1.8
I
1.8
19
25.8
0.(}7
N.S.
13.22
I
13.22
19
57.80
0.23
N.S.
DI0.2-DI9.9
m (D2. I&D4.9)-m(DI0.2&DI
9.9)
III. DISCUSSION
A. Effectsof modulation
characteristics
For normal-hearinglisteners,an amplitude-modulated
noisemasksspeechno more (and usuallyless) than unmodulatednoiseof equalspectrumand rms level.Thus no
evidencehasbeenfoundfor amplitudemodulationsof the
maskerto increase
its speech-masking
effect.The releaseof
maskingcausedby the regularsinusoidal
modulationis
considerable,
increasing
thespeechrecognition
scoresfrom
about 30% to about 80% with constant S/N ratio under
optimal
conditions.
Assuming
thatHagerman's
(1982)
psychometric
functionfor the speechtestmaterialfor normal hearingsubjects
is alsovalid for our subjects,
the observedmaximumrelease
of maskingcorresponds
to an improvementin S/N ratio of approximately3 dB. For the
irregular modulation,intended to bear a rough resemblanceto themodulationspectrumof speech,thereleaseof
speechmaskingwasrelativelysmall.Generally,the results
525
J. Acoust.Soc.Am.,Vol. 95, No. 1, January1994
EldertyNH
EldHI u.a.
EldI• aided
FIG. 10. Mean speech-to-noise
ratio for 30% correct in unmodulated
maskingnoisefor thethreegroupsof testsubjects.
NH=normal hearing,
Eld HI u.a.=elderly, heating impaired in unaidedcondition,Eld HI
aided=elderly,hearingimpairedin aidedcondition.
H.A. Gustafsson
and $. D. Arlinger:Maskingof speechb,/AM noise
525
suggestthat predictivemethodsfor speechrecognition
such as the Articulation Index may overestimatethe
speech-masking
effectof a noisecontainingsignificantamplitude modulationssincethosemethodsare basedessentially on the averagemaskerpowerin eachfrequencyband
without consideringany modulations.
Festenand Plomp (1990) usedbroadbandnoisemaskers with long-termaveragespectrumof either the male or
the femalevoiceusedfor the speechsignaland with the
intensitycontrolledby the speechsignalenvelope.The improvementin maskedspeechrecognitionthresholdthey
foundon youngnormal-hearing
subjects
wasin the range3
to 4 dB, which is of the sameorder of magnitudeas our
resultsfor sinusoidalAM. In an earlier study (Festenand
Plomp, 1986), wherethey usedsinusoidalintensitymodulation (100%) of the masker,normal-hearingtestsubjects
on averageimprovedtheir speechrecognitionthresholdin
noiseby 5.5 dB in S/N ratio for the optimummodulation
as comparedto the unmodulated
noise.Considering
the
differencebetweensinusoidalintensityandamplitudemodulation,theseresultsseemto agreewell with ours.
Festenand Plomp (1990) suggested
that comodulation maskingrelease(CMR) might at leastpartly explain
the releaseof maskingfoundin AM maskingnoise.However, the resultsof a recentstudy (Grose and Hall, 1992)
showedthat whereasCMR may occur for speechdetection, no suprathreshold
effectswere found (i.e., there was
no influenceof CMR on speechrecognitionscoresabove
0.4 Relative •requenc•
0.3
0.2
O.l
.....
S100/4.9
....
IrrlO0
--Un•od
0.0
- Max
0
+Max
FIG. 11. Amplitude histogramsfor unmodulatednoise(solid line),
regularlymodulatednoise100% (dashedline) and sinusoidally
modulated noise100%, 4.9 Hz (dotted line). The histogramswere obtainedby
samplingthe signalsat 20 kHz for about 10 s thus yieldinga total of
204 800 sampleper signal.The curvesshowthe relativeportionof sampies in each amplitudeclass. 4- Max is the input rangeof the A/D
converterusedduringthe samplingprocess.
2. The Influence of modulation degree
The releaseof speechmaskingfoundat i 6-dB modulation was much smaller than at 4-12 dB and 4-100%.
However,between4- 12 dB and 4-100% no significantdifferencesexisted for the normal-hearinggroups. (In the
groupof elderly,hearing-impairedsubjectsthesetwo modulation degreesdid producedifferentspeechmaskingeffects. This is discussed below in section on the effect of
heatingimpairment.) Theserelationsheld for both irregular and sinusoidalmodulation and are consequences
of
the modulationdepthsof the signals.Figure 4(c) shows
presenceof silent intervalsof suffientlylong durationto
allow the normal-hearinglistenerto pick up important the envelopevariationrangewith rms correctionapplied,
speechelements.This positiveeffectmustbe considerably for each modulationtype and degree.The more deeply
modulated maskersproduced the highest scoresfor the
greaterthan any negativeeffectcausedby the increasein
normal hearingsubjects.This findingis in agreementwith
masking during the louder than averageintervals of the
AM noise.
earlier studies using periodic modulation (Dirks et al.,
1969;Carhart et al., 1966). The absenceof signitieantdifferencebetween4- 12 dB and 4-100% modulationdespite
1. The influence of modulation type
the large differencein modulationdepth indicatesthat
when a certain modulationdepth is reached,very little is
The two modulationtypes, irregular and sinusoidal
gainedby increasingit further.
modulation,producedvery differentresults.This is most
The "depth of the valleys" in the modulated noise
likely explainedby the lowerprobabilityof noiseintervals
evidentlyhas a greaterinfluenceon the speechmasking
of sufficientlylow-leveland long durationto allow the lispropertiesthan the "heightof the peaks."If not, irregular
tenerto pick up meaningfulfragmentsof the speechsignal, modulation 4-12 dB would be a more effective masker than
the irregularityitselfhavingno likely effect.Recallthat the
irregular modulation 4- 100%, which is not the case.It
amplitudehistograms
differbetweenthe two typesof modseemsthat the listenercan take advantageof the lessloud
ulation,with the sinusoidal
modulationshowinghigherrelintervalsof the noiseto increasespeechrecognition,and
ativeoccurrence
of verylow and veryhigh amplitudethan
that this advantageis not counterbalancedduring the
doesthe irregular modulation (Fig. 11). In addition, each
louder noise intervals.
periodof the sinusoidalmodulationguarantees
an interval
of relativdy long duration with reducednoiseamplitude.
In contrast,when usingirregularmodulationthe intervals 3. The influence of modulation frequency
The effectof varyingthe modulationfrequencyis seen
with reducedamplitude noise are on averageof much
shorter duration. The range of variation of the instanta- in Fig. 6 which appliesto 4-12-dB modulation.The scores
neousnoise level was not very different between sinusoidal
are highestat 10.2 or 19.9 Hz and somewhatlower at 2.1
and irregular modulationfor a givendegreeof modulation and 4.9 Hz. At frequenciesabove20 Hz the scoresseemto
[Fig. 4(c)], whilethe speech-masking
effectdifferedsignif- drop off and asymptoticallyreach the samescoreas for
icantly, and therefore this characteristicis not very likely
unmodulatednoise. This general pattern seemsto agree
to contribute to the influence seen.
well with the resultsreportedby Festenand Plomp (1986)
zcro ).
It appearsthat the releaseof maskingis relatedto the
526
d. Acoust.Soc. Am., VoL 95, No. 1, January1994
H.A. Gustafssonand S. O. Arlinger:Maskingof speech by AM noise
526
usingsinusoidalintensitymodulationof the masker.The
trend is supportedby the significantdifferencesbetween
the averagescorewith 2.1 and 4.9 Hz and the average
scoreof 10.2 and 19.9 Hz as seenin all three groupsof
subjects(Figs. 5, 7, and 8).
The frequencydependence
cannotbe explainedby the
amplitudehistograms,
sincetheseare meanvaluesover a
relativelylong time (comparedto the modulationperiod)
and thusare equalfor all sinusoidalmodulations.Instead,
the speechrecognitionscoreobtainedwith a certainAM
maskermodulationfrequencyis more likely to be related
to the temporalcharacteristics
of the maskingnoise,i.e.,
the main differencebeing quantitative:The releaseof
maskingwas much lessfor the hearingi•paired. A likely
explanationfor this findingis the impairedtemporalresolution commonly found in subjectswith sensorineurai
h6aring loss (Fitzgibbonsand Wightman, 1982; Nelson
and Freyman, 1987). Baconand Viemei,.•ter(1985) noted
a reducedsensitivityto detect amplitudemodulationin
broadbandnoise(i.e., the degreeof modulationhad to be
larger than normal for detection), but the main influence
of varyingthe modulationfrequencyw•a very similarin
normal-hearing
and hearing-impaired
listeners.This finding alsoagreeswith the presentstudy.hnpairedtemporal
the number and duration of the low-level intervals that
resolutionin the subjectswith hearinglosscouldhypothetprovide releaseof maskingand high-levelintervalsthat
ically consistof severalcomponents,one beingincreased
providecompletemasking.At low modulationfrequencies post-masking
from the louderinto the les.';
loud intervalsof
(below about 5 Hz) the high-levelintervalsare relatively the maskerand anotherrelatedto a type of temporalintelongbut sparse,whichmeansthat phoneroes
and syllables gration,resultingin a reducedabilityto retrieveimportant
or even whole words may be masked,resultingin lower
speechinformationduring the shortintervalsof lessloud
masker,However, the presentdata provideno further inspeechrecognition.For frequencies
between10 and 20 Hz,
the low-level intervals are shorter but more frequent, sight into this.
which partly counterbalances
the lossof informationdurIn the groupof elderlyhearing-impaired
subjectsthe
ing the higherlevelintervalsand thusspeechrecognition patternof unmaskingdid differ somewhatalso qualitaincreases.
At higherfrequencies(aboveabout30 Hz), the
tively from that seenin the normal-hearinggroupswith
regardto modulationtype:Sinusoidalmodulationof 4-12
low-levelintervalsare very frequentbut too shortto aid
speechrecognition,which again decreases.
dB gave rise to more releaseof speechmaskingthan
4- 100% for all four modulationfrequencies
tested(Figs. 8
An alternativeor additional interpretationof the increasedspeechmaskingeffectof the AM noiseat higher and 9 and Table V). Figure4 offersno evidentexplanation
modulationfrequencyis in termsof auditorytemporalres- for this findingin terms of range of variation of instantaexplaolution. The limited temporalresolutionof the auditory neoussoundpressurelevel. We haveno reasonable
nation
to
offer
for
this
particular
result.
systemmakesthe maskerenvelopevariationslessaudible
and lessusefulwith regardto the collectionof informative
fragmentsof the speechsignalasthe modulationfrequency C. The effect of age
is increased.
Considering
the curveshownin Fig. 6 asrepIn additionto the effectof heatingimpairmenton the
resentinga low-passfunction,the cutofffrequencyof this
degreeof unmasking
providedby amplitudemodulationas
low-passfunctioncan be estimatedto be about40 Hz. This
discussed
above,our studyalsoshowedlargedifferencesin
agreeswell with the approximately50 Hz found by ViS/2q requiredby the subjectgroupsto reachthe criterion
emeister(1979) and Formby (1985) in experimentsconscoreof 30% correctin unmodulatednoise(Fig. 10). One
cerningthe detectionof sinusoidalamplitudemodulation unexpectedlylarge differencewas that betweenthe groups
of a wideband
noise.
Therefore,a likely contributingfactor is impairedspectral
of youngandof elderlynormal-hearing
subjects.
As shown
in Fig. 1 the averageaudiogramof the elderlygroupwas
essentiallynormalwith mean hearingthresholdlevelsof
only 17 dB at 4 kHz and 24 dB at 8 kHz. Applying the
ArticulationIndex to theseaveragehearingthresholdlevelsandthespectrum
of thespeech
andthenoiseusedin the
study,the averagehearinglossof the groupaccountedfor
lessthan 2 of the 6-dB groupdifference.'['hemostreasonable explanationfor the remainingdifferenceof 4 dB is
age-relateddifferencesin auditory signal processingbetween the two subjectgroups.Whether thesedependon
peripheralor centralauditorypathwaysis not possibleto
infer from the presentstudy.
The questionof whetherageas suchhasany effecton
speechrecognitionthroughaging in central pathwaysor
resolution,anothertype of distortionstronglycorrelated
with sensorineural
hearingloss(Ludvigsen,1985).
The generaleffectof releaseof speechmaskingby amplitude modulationof the maskingnoisefound amongthe
elderly with hearing impairment was qualitatively very
similar to the findingsin the normal-hearinggroupswith
encesin peripheralauditoryfunctionhasbeendiscussed
in
severalstudies.Poulsenand Keidser ( 1991) found no age
effectin speechrecognitionin noisewhen usingmonosyllabic testwordswith carrier phraseand a four alternative
forcedchoiceparadigm.Using sentencematerial in noise
B. The effect of hearing impairment
The most striking effect of the hearing impairment
foundwastheneedfor speech-to-noisc
ratioson average11
to 12 dB betterthan the normal-hearing
groupof the same
agein orderto reachthe criterionperformance
usedin the
study,i.e., 30% of the speechtest materialcorrectin unmodulatedmaskingnoise.A partial explanationfor this
findingis that the higher frequencyrange of the speech
signal was inaudible due to the subjects'high-frequency
lossandinsufficient
hearing-aidgainat frequencies
above2
kHz. However, Articulation Index calculations can account for no more than 4 dB of this difference in S/N.
527
J. Acoust.Soc. Am., Vol. 95, No. 1, January1994
whether all differences found can be attributed to differ-
H.A. Gustafssonand S. D. Arlinger:Maskingof speech by AM noise
527
and a variety of cognitivetests,van Rooij and Plomp
(1992) concludedthat age-relatedchangesin cognitive
ability,relatedto speechrecognitionin noiseare very unlikely;"agedifferences
with respectto speechperception
are mostlikelydueto differences
in audifive factors,notably differences
in auditorysensitivity."However,Dubno
et al. (1984) showedthat the choiceof speechmaterialhas
a clearinfluenceon agedifferences
in speechperception.
For bisyllabietestwordsand sentences
with high predictabilitytheyfounda difference
in speech-to-noise
ratioof 2
dB betweena youngerandan elderlygroup,whereaswith
low-predictabilitysentences
the differenceincreasedto 4
dB. Glasbergand Moore (1990) founda statisticallysignificantcorrelationbetweenspeechrecognitionin noise
andagewhenpartiallingouttheaverage
puretonehearing
loss,and concludedthat "ageemergesas a significant
predictorof the abilityto understand
speechin noise."
Distortedspeechof varioustypeshas beenusedin
clinical tests of central auditory dysfunction.KorsanBengtsen(1973) foundstatistically
significant
differences
betweenyoungand elderlynormal-hearing
listenersusing
interruptedspeech(4, 7, and 10interruptions
persecond)
and time-compressed
speech.Her elderlygroupwasvery
similarto ours:Twentysubjectsaged50-60, otologically
andneurologically
normalwith an averagehearingthreshold level at 4 kHz of 20 dB HL (our group 17 dB). Rodriguezet al. (1990) alsoevaluated
a groupof elderlysubjectswith closeto normalpure tone audiograms.
Their
resultsindicatecentralauditoryinvolvementrelatedto age
that can occurwithout concomitantdeclinein peripheral
hearingsensitivity,
cognitivefunctionor linguisticcompetence.
The speech
testmaterialusedin thepresentstudyis of
the low predictability
category.The taskof repeating
the
five word sentences
is also sensitiveto changesin shortterm memoryfunctions.The conclusion
from our study
remains:Differences
in auditorysensitivityexplainsonly a
part of the difference
in speech-to-noise
ratio foundbetweenyoungandelderlynormal-hearing
listeners.
The remainingpart of the difference
is mostlikely dueto aging
effectswhichare not reflectedby the puretoneaudiogram.
IV. CONCLUSIONS
Speechrecognitionis generallybetter in amplitudemodulatedmaskingnoisethan in unmodulatednoiseof
equalrms soundpressurelevel.The unmaskingdue to
amplitudemodulationincreases
with increasing
modulation depth.For sinusoidal
modulation,modulationfrequencies
in the 10-to 20-Hz rangeprovidethe largestreleaseof masking.
The influenceof the modulationparameterson the
releaseof speechmaskingwassimilarfor youngand elderly subjectswith normalhearing.However,elderlysubjects with impairedhearingshowedsignificantly
smaller
releaseof speechmaskingas a consequence
of amplitude
modulatingthe maskingnoise.With irregularAM no unmaskingat all was recorded.No significant
differences
werefoundwhenthis groupof listenerswas testedwith
and without hearingaid.
For all typesof maskers
used(unmodulated
aswellas
amplitude-modulated
noise),therewereeffects
of bothage
andauditoryfunctionon the speech-to-noise
ratiorequired
to reacha criterionspeechrecognitionscore.
ACKNOWLEDGMENTS
The financialsupportby the NationalSwedishEnvironmentalProtectionBoard is gratefullyacknowledged.
Valuable statistical advice was receivedby E. Leander,
Ph.D., and E. Enqvist,Ph.D. Carl Ludvigsen,M.Sc., provided valuablehelp with measurements
and a computer
programfor articulationindexcalculations.
Arlinger,S., Kinnefors,
C., andJerlvall,L. (1988). "A comparison
betweenPOGO and conventional
hearingaid fitting,"in HearingAid
Fitting,editedby J. Hartrig Jensen(Proc. 13thDanavoxSymposium,
Copenhagen),pp. 97-113.
Bacon,S. P., andViemeister,N. F. (1985). "Temporalmodulationtransfer functions
in normal-hearing
andhearing-impaired
listeners,"
Audiology24, 117-134.
Berry,R. C., andNerbonne,O. P. (1972). "Comparison
of the masking
functionsof speech-modulated
and whitenoise,"J. Acoust.Soc.Am.
51, 121 (A).
Carhart,R., Tillman, T. W., andJohnson,K. R. (1966). "Binauralmask-
ingof speech
by periodically
modulated
noise,"J. Acoust.Soc.Am. 39,
1037-1050.
Carhart, R., Tillman, T. W., and Johnson,K. R. (1967). "Releaseof
masking
forspeech
throughinteraural
timedelay,"J. Acoust.Soc.Am.
42, 124 (A).
Carhart, R., Tillman, T. W., and Greetis, E. $. (1969). "Perceptual
maskingin multiplesoundbackgrounds,"
J. Acoust.Soc.Am. 45, 694-
D. Effect of wearing hearing aids
703.
For the groupmeanresults,no significantdifferences
werefoundbetweenaidedand unaidedconditions(Figs. 8
and 9). In a previousstudy,Arlinger et al. (1988) found
that aidedspeechrecognitionthresholdsin noisewere superiorto unaidedfor subjects
usingIn-The-Canalhearing
aids.The findingin the presentstudyof a tendencytoward
the oppositeresultwasnot expected,and may be at least
partiallyexplainedby the relativelypooraveragehearing
aid gain at frequencies
above2 kHz (Fig. 3). The articulation indexpredicteda slightlybetterunaidedrecognition
of the speechat the elevatedlevel than the aidedrecognition at normalspeechlevel,just as wasfoundin the experiment reportedhere.
528
J. Acoust.Soc. Am., Vol. 95, No. 1, January1994
Carhart, R., JohnsonC., and Goodman,J. (1975). "Perceptualmasking
of spondees
by combination
of talkers,"J. Acoust.Soc.Am. Suppl.1,
$8, S35.
Danhauer,J. L., and Leppler,J. G. (1979). "Effectsof four noisecom-
petitorson the Californiaconsonant
test,"J. SpeechHear. Disord.44,
354-362.
Danhauer,J. L., Doyle, P. C., and Lucks,L. (1985). "Effectsof noiseon
NST and NU 6 stimuli," Ear Hear. 6, 266-269.
Dirks, D. D., Wilson,R. H., and Bower,D. R. (1969). "Effectof pulsed
maskingon selected
speechmaterials,"J. Acoust.Soc.Am. 46, 898906.
Dubno,J. R., Dirks, D. D., and Morgan,D. E. (1984). "Effectsof age
and mild hearinglosson speechrecognitionin noise,"J. Acoust.Soc.
Am. 76, 87-96.
Fastl,H. (1987). "A background
noisefor speechaudiometry,"Audiol.
Acoust. 26, 2-13.
H.A. Gustafssonand S. D. Arlinger:Maskingof speechby AM noise
528
Festen,$. M., and Plomp, R. (1986). "The extra effectof maskerfluctuationson the SRT for hearing-impaired
listeners,"Proc. 12th Int.
Congr.Acoust.,Toronto,Vol. 1, PaperB!I-4.
Festen,J. M., and Plomp R. (1990). "Effectsof fluctuatingnoiseand
interferingspeechon the speech-reception
thresholdfor impairedand
normalhearing,"J. Acoust.Soe.Am. 88, 1725-1736.
Fitzgibbons,
P. J., and Wightman,F. L. (1982). "Gap detectionin normal and hearing-impaired
listeners,"
$. Acoust.Soc.Am. 72, 761-765.
Formby,C. (1985). "Differentialsensitivityto tonal frequencyand to
rate of amplitudemodulationof broadbandnoiseby normallyhearing
Gelland, S. A., Ross,L., and Miller, S. (1988). "Sentencereceptionin
noisefromoneversustwo sources:
Effectsof agingandhearingloss,"J.
Acoust. sec. Am. 83, 248-256.
Glasberg,B. R., and Moore,B. C. J. (1990). "Psychoacoustie
abilitiesof
subjectswith unilateraland bilateralcochlearheatingimpairmentsand
their relationshipto the ability to understandspeech,"Searid.Audiol.
Suppl. 32.
Gruse,I. H., and Hall, J. W. (1992). "Comodulationmaskingreleasefor
speechstimuli,"J. Acoust.see. Am. 91, 1042-1050.
Hagerman, B. (1982). "Sentencesfor testingspeechintelligibility in
noise," seahal.Audiol. 11, 79-87.
Horii, Y., House,A. S., and Hughes,G. W. (1970). "A maskingnoise
with speech-envelope
characteristics
for studyingintelligibility,"J.
Acoust. So:. Am. 49, 1849-1856.
ISO 389 (1991). "Acoustics--Standardreferencezero for the calibration
of pore-toneair conductionaudiometers,"
InternationalOrganization
for Standardization,Geneva.
ISO 7029 (19M). "Acoustics---Threshold
of hearingby air conductionas
a functionof ageandsexfor otologically
normalparsons,"
International
Organizationfor Standardization,
Geneva.
Korsan-Bengtsen,
M. (1973). "Distorted speechaudiomerry,"Acta Otolaryngol.(Sto•kh.), Suppl.310.
Ludvigsen,C. (1985). "Relationsamongsomepsycho•.oustic
parame-
J. Acoust. Soc. Am., Vol. 95, No. 1, January 1994
Miller, G. A., and LickliderJ. C. R. (1950). "The intelligibilityof interruptedspeech,"$. Acoust.Soe.Am. 22, 167-173.
Moller, A. R. (1971). "Unit responses
in the rat cooblearnucleusto tones
of rapidlyvaryingfrequencyand amplitude,"Acta Physiol.seand.81,
540-556.
Nelson, D. A., and Freyman, R. L. (1987). "Temporal resolutionin
sensorineural
hearing-impaired
listeners,"J. Acoust.Soc.Am. $1, 709720.
Plomp,R. (1986). "The hearingaid and the noiseproblem,"Hear. In-
listeners,"J. Aeoust. Soe. Am. 78, 70-77.
529
tersin normaland cochleaflyimpairedlisteners,"J. Acoust.Soc.Am.
78, 1271-1280.
strum. 37(6), 28-32.
Poulsen,T., and Keidser,G. {1991), "Speechrecognitionin youngand
elderlynormal-hearinglistenersin a closedrespoasetest," Scand.AudioL 20, 245-249.
Rodriguez,G. P., DiSarno,N.J., and Hardiman,C. J. (1990). "Central
auditory processing
in normal-hearingelderly adults," Audiology29,
85-92.
Tillman, T. W., Carhart, R., and Olsen, W. O. (1970}. "Hearing aid
efi%iencyin a competingspeechsituation,"J. SlouchHear. Res. 13,
789-811.
van Rooij,J. C. G. M., and Plomp,R. (1992), "Auditire and cognitive
factorsin speechperception
by elderlylisteners.
IlL Additionaldata
and final discussion,"$. Acoust. Soc. Am. 91, 1028-1033.
Viemeister,
N. F. (1979). "Temporalmodulationtr• nsferfunctions
based
uponmodulationthresholds,"
J. Acoust.Soc.Am. 66, 1364-1380.
Welzl-Muller,K., and Stephan,K. (1988). "Speechrecognition
performancewith hearing-aidin noiseless
and noisyconditions,"AudioL
Akustik 27, 108-119.
Wilson, R. H., and Carhart, R. (1969). "Influenceof pulsedmaskingon
the thresholdfor spondees,"
J. Acoust.Soe.Am. 46, 998-1010.
Young,L. L., Jr., Parker,C., and Carhart,R. (1•'5). "Effectiveness
of
speechand noisemaskerson numbersembedde:l
in continuous
discourse,"J. Acoust.Soe.Am. Suppl. I 58, S35.
H.A. Gustafssonand S. D. Arlinger:Maskingof speech I:,yAM noise
529
© Copyright 2026 Paperzz