A comparison of three speech coding strategies using an acoustic
model of a cochlear implant
P. J. Blamey, L. F. A. Martin, and G. M. Clark
Department of Otolaryngology, University of Melbourne, The Royal Victorian Eye and Ear Hospital, 32 Gisborne Street, East Melbourne, Victoria 3002, Australia
(Received 17 May 1984; accepted for publication 4 September 1984)
Three alternative speech coding strategies suitable for use with cochlear implants were compared in a study of three normally hearing subjects using an acoustic model of a multiple-channel cochlear implant. The first strategy (F2) presented the amplitude envelope of the speech and the second formant frequency. The second strategy (F0 F2) included the voice fundamental frequency, and the third strategy (F0 F1 F2) presented the first formant frequency as well. Discourse level testing with the speech tracking method showed a clear superiority of the F0 F1 F2 strategy when the auditory information was used to supplement lipreading. Tracking rates averaged over three subjects for nine 10-min sessions were 40 wpm for F2, 52 wpm for F0 F2, and 66 wpm for F0 F1 F2. Vowel and consonant confusion studies and a test of prosodic information were carried out with auditory information only. The vowel test showed a significant difference between the strategies, but no differences were found for the other tests. It was concluded that the amplitude and duration cues common to all three strategies accounted for the levels of consonant and prosodic information received by the subjects, while the different tracking rates were a consequence of the better vowel recognition and the more natural quality of the F0 F1 F2 strategy.
PACS numbers: 43.66.Ts, 43.71.Ky, 43.66.Sr
INTRODUCTION
In recent years, several research centers have achieved a useful level of speech communication for profoundly deaf cochlear implant patients using a variety of speech processing strategies. A representative collection of relevant studies appeared in Parkins and Anderson (1983). Some of the studies included comparative data for several alternative speech processing strategies. The interpretation of comparative studies is often complicated by a number of practical difficulties: (a) The number of available implant patients is small (usually only one or two). (b) The implanted device imposes limitations on the stimulation patterns that can be produced. (c) The amount of training under different conditions varies, with patients often being tested with a versatile speech processor in the laboratory while using a different portable device outside. (d) The results depend on uncontrolled factors affecting the patients' cochlear pathology, such as the number and frequency distribution of surviving auditory neurons. A comparison of results from different research centers is even more difficult because of the wide variation among patients from the same center, as well as the different training procedures, test conditions, and test materials that are used.
In this confusing situation, there is a need for comparative studies of speech coding schemes under more strictly controlled conditions. One way of achieving this is to use an acoustic model or simulation of a cochlear implant with normally hearing listeners. If the validity of the model can be established for a sufficiently wide range of speech coding schemes, all of the factors listed above may be controlled without difficulty. An acoustic model for a multiple-channel cochlear implant was established by matching the results of identical psychophysical tests using electrical stimulation of cochlear implant patients and acoustic stimulation of normally hearing subjects (Blamey et al., 1984a). The psychophysical tasks included pulse rate difference limen measurements, pitch scaling as a function of pulse rate, pitch scaling as a function of electrode position, and a multidimensional scaling analysis of the dissimilarities of stimuli differing in pulse rate or electrode position or both. A match was achieved by a suitable choice of the acoustic stimulus parameters. The equivalent performance of cochlear implant patients and acoustic model subjects using the same speech coding scheme was confirmed for a wide range of speech tests (Blamey et al., 1984b). A number of acoustic analogs of single-channel electrical stimulation of the auditory nerve have been used by other authors (Rosen et al., 1981; Risberg, 1974; Risberg and Lubker, 1978; Risberg and Agelfors, 1982). These models have been used to compare speech coding schemes that use voice fundamental frequency (F0), the amplitude envelope, and lipreading information in different combinations (reviewed by Summerfield, 1983). These studies have shown that F0 and amplitude envelope information can effectively supplement lipreading in speech tasks where the recognition of syntactic structure, word stress, and juncture is involved. To produce a worthwhile improvement of lipreading scores in single word or nonsense syllable tests, or to provide enough information for speech recognition without lipreading, some coding of higher-frequency information from the speech signal is necessary. Summerfield (1983) suggests that the frequencies of the front cavity resonance and the first formant are suitable parameters to use.
J. Acoust. Soc. Am. 77 (1), January 1985
0001-4966/85/010209-09$00.80 © 1985 Acoustical Society of America
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 152.14.136.96 On: Tue, 08 Nov 2016 23:09:55
This study compared three speech processing strategies that could be used with the multiple-channel cochlear implant produced by Nucleus Limited (Crosby et al., 1983). Each strategy included information approximating the second formant frequency and each strategy used the same amplitude envelope derived from the speech signal. The strategies differed in the information they presented and in the manner in which this information was encoded: The F2 strategy encoded the second formant frequency as a pulse rate in the range 50 to 300 pps via a single channel. The F0 F2 strategy encoded the fundamental voice frequency as a pulse rate in the range 50 to 300 pps via one of eight channels determined by the second formant frequency. The F0 F1 F2 strategy was the same as the F0 F2 strategy, with the additional information encoded by excitation of a second channel determined by the F1 frequency. The F0 F2 strategy was very similar to the strategy used in the wearable speech processor produced by Nucleus Limited (Tong et al., 1980, 1982, 1983). The F2 strategy is a simplification of the F0 F2 strategy, using only one channel. Strategies similar to the F2 strategy have been tested with cochlear implant patients by Atlas et al. (1983) and Dillier et al. (1983). The F0 F1 F2 strategy is the logical extension of the F0 F2 strategy to include first formant frequency information. This extension was expected to produce an improvement in the speech recognition scores and to increase the "naturalness" of the encoded signal. Speech synthesizers and vocoders based on formant extraction have been studied extensively in the literature (for example, Flanagan and Rabiner, 1973). Remez et al. (1981) have shown that it is possible to code speech using frequency modulation of two or more tones to represent the formants. They demonstrated that the intelligibility and naturalness of coded speech increased as the number of formants increased from one to two and from two to three. It is hoped that this result will hold true for cochlear implants when formant frequencies and amplitudes are coded in a similar fashion.
TABLE I. Details of the mapping from formant frequency ranges to acoustic model filter frequencies for the three different speech coding strategies. The F1 and F2 frequency estimates were derived from zero crossings of two bandpass-filtered speech signals. The acoustic model filters of fixed frequency and bandwidth (equal to 40% of the center frequency) were selected according to the table. All frequencies are in Hz.

F2 range for the    F2 range for the F0 F2     F1 range for the     Acoustic model filter
F2 strategy         and F0 F1 F2 strategies    F0 F1 F2 strategy    center frequency
                    3300-4400                                       10 880
                    2400-3300                                         7880
                    1650-2400                                         5710
                    1450-1650                                         4140
                    1250-1450                                         3000
                    1050-1250                                         2170
                     850-1050                                         1570
                       0-850                   820-1000               1140
                                               680-820                 830
                                               540-680                 600
                                               400-540                 430
                                                 0-400                 320

I. METHODS
A. The acoustic model
The acoustic model was based on the results of psychophysical tasks carried out using electrical stimulation of deaf patients using the University of Melbourne cochlear implant (Clark et al., 1977). The auditory nerves of the patients were electrically stimulated with biphasic current pulses produced by an array of electrodes spaced at 1.5-mm intervals around the basal turn of the scala tympani. In a normal cochlea, the nerve fibers in this region have characteristic frequencies between about 1 and 15 kHz.
The model used amplitude-modulated white noise bursts to represent current pulses. A 50% duty cycle was used and the amplitude was varied smoothly to avoid "clicks" that might have been associated with discontinuities in the amplitude envelope of each pulse. The noise bursts were bandpass filtered with a different filter to represent each electrode. Each filter was a simple two-pole bandpass configuration with a bandwidth of 40% of the center frequency. The center frequencies of the filters used for each speech coding strategy are shown in Table I. The F2 strategy used a single filter centered at 1140 Hz. The F0 F2 strategy used eight filters equally spaced on a logarithmic scale from 1140 to 10 880 Hz. The F0 F1 F2 strategy used four additional filters with logarithmic frequency spacing from 320 to 830 Hz. The frequency spacing and bandwidth of the filters were chosen to approximate the spatial distribution of nerve fibers excited by different electrodes in the cochlea. According to the frequency-place map of Greenwood (1961), the lowest frequency filter corresponds to a position about 27 mm inside the round window. This position is the most apical accessible with the electrode arrays now in use in Melbourne (Shepherd et al., in press).
The psychophysical tasks on which the model was based did not include any studies of the interactions that may occur when two electrodes are stimulated at once. In evaluating the F0 F1 F2 strategy, it was assumed that the interactions between pulses on different electrodes would be small if the pulses were noncoincident in time. Initial investigations (Tong and Clark, 1984) have shown that noncoincident stimulation of closely spaced electrodes leads to incomplete loudness summation but there is little interaction between widely spaced electrodes. The noise bursts used for the acoustic model coding of F1 and F2 had nonoverlapping amplitude envelopes, with a duty cycle of 30% for each of the two filters used. No attempt was made to compensate for any masking effects that may have been present.
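As a rough check on the filter layout described above, the following sketch generates eight center frequencies equally spaced on a logarithmic scale between 1140 and 10 880 Hz, rounds them to the nearest 10 Hz, and derives bandwidths equal to 40% of each center frequency. The function names and the rounding step are illustrative assumptions; the text does not say how the published values were rounded.

```python
def log_spaced_centers(f_low, f_high, n_filters):
    """Return n_filters center frequencies equally spaced on a log scale."""
    ratio = (f_high / f_low) ** (1.0 / (n_filters - 1))
    return [f_low * ratio ** k for k in range(n_filters)]


def round_to_10_hz(freq_hz):
    """Round to the nearest 10 Hz (an assumed presentation step)."""
    return int(round(freq_hz / 10.0)) * 10


# Eight filters spanning 1140 Hz to 10 880 Hz, as stated in the text.
centers = [round_to_10_hz(f) for f in log_spaced_centers(1140.0, 10880.0, 8)]

# Bandwidth of each two-pole bandpass filter: 40% of its center frequency.
bandwidths = [0.4 * f for f in centers]

for fc, bw in zip(centers, bandwidths):
    print(f"center {fc:6d} Hz, bandwidth {bw:7.1f} Hz")
```

Under these assumptions the computed centers agree with the upper eight filter frequencies listed in Table I, which suggests a constant ratio of about 1.38 between adjacent filters.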
B. Speech parameter estimation
The speech parameter values used were derived from the acoustic speech signal by a hardwired speech processor operating in real time. Every 5 ms, the speech parameters were digitized and read by a minicomputer which controlled the acoustic synthesizer that produced the filtered noise bursts required for the acoustic model. In the case of live voice testing, the speech signal was picked up with a Superscope cardioid condenser microphone 30 cm from the speaker in a quiet room. In the case of the recorded tests, the tape recorder output was connected directly to the speech processor.
All three strategies used the same amplitude envelope which represented the compressed overall acoustic signal amplitude at the output of the automatic gain control (AGC) of the speech processor. The amplitude envelope presented covered a range of 30 dB.
The first and second formant frequencies were estimated by zero-crossing detectors at the outputs of two bandpass filters covering frequency ranges of 300 Hz to 1 kHz and 800 Hz to 4 kHz, respectively. The slope of the low frequency skirt of the higher frequency filter was adjusted to ensure that the spectral peak corresponding to F2 was greater than that corresponding to F1 for the full set of vowels for male and female speakers. This emphasis of the higher frequencies is similar to that required by Summerfield's (1983) characterization of the front cavity resonance as "the major spectral peak in the frequency range from 800 Hz to 8 kHz in spectrograms of speech produced naturally but boosted by a 6 dB/octave lift." Strictly speaking, the zero crossing frequencies may not be interpreted as the first and second formants for all phonemes but this nomenclature is used here for simplicity. All three strategies used the same F2 extraction hardware.
C. Speech coding strategies
The F2 strategy encoded the second formant frequency as a pulse rate. The rate was proportional to the logarithm of F2 such that 1 kHz to 4 kHz was mapped onto the range 50 to 300 pps. The F0 F2 strategy encoded the second formant frequency by the choice of filter to be used for each noise burst. The F2 frequency ranges corresponding to the eight filters are shown in Table I. It should be noted that this scheme involved a substantial upward translation of the second formant frequency. The pulse rate used to excite the filter was a linear function of the fundamental voice frequency such that 50-400 Hz was mapped onto the range 75-300 pps. Unvoiced sounds were encoded as a fixed 50 pps rate. The fundamental frequency of voiced sounds was measured using a peak detection algorithm on the speech waveform. This method worked well for speech with a good signal to noise ratio, as in all cases reported here. The F0 F1 F2 strategy was identical to the F0 F2 strategy except that two filters were excited at the same rate. The number of filters was increased to 12 and the lower five were used to encode F1 while the upper eight were used to encode F2 as in the F0 F2 strategy. The F1 and F2 frequency ranges corresponding to the 12 filters are shown in Table I. Each of the two filters was excited with the same pulse rate and amplitude level, but not simultaneously, so that the pulses were noncoincident with a duty cycle of 30% on each channel. It was shown in an earlier paper (Blamey et al., 1984a) that varying the duty cycle between 30% and 50% did not affect the difference limens for pulse rate changes.
D. Subjects
Three normally hearing subjects took part in the acoustic model testing. All three were members of the staff of the Department of Otolaryngology between the ages of 20 and 35. None of them had been a subject for speech testing with the acoustic model before this study commenced. The subjects will be designated D, E, and F to distinguish them from A, B, and C who were subjects in a previous acoustic model study (Blamey et al., 1984b). Subject D was an audiologist who had previously administered the live voice tests to cochlear implant patients but was unfamiliar with the two recorded tests that were used. Subjects E and F played musical instruments but otherwise had no special training relevant to this study.
E. Training and testing procedures
All training and testing sessions were conducted with the subject in a soundproof room. During live voice testing, the speaker was outside the room and a black and white closed circuit television monitor was used to present the visual signal for lipreading when required. Recorded tests were conducted with the tape recorder outside the soundproof room. The acoustic model signal was presented to the subjects binaurally through Sennheiser HD 410 headphones. The live voice testing was conducted by a male speaker with a mild Australian accent. The recorded material used a different Australian male speaker and a female speaker with an English accent.
Care was taken to provide each subject with equal opportunities for training with each strategy. This was particularly important, as the subjects continued to improve their scores for specific tasks over a long period of time. Rather than train the subjects until an asymptotic performance was reached, we have used a balanced experimental design in which the speech processing strategies have been treated equivalently.
The initial training was conducted using the method of tracking described by De Filippo and Scott (1978). In the tracking method, the speaker read from a text which was unseen by the subject. The subject was required to repeat exactly the words of the text. Errors were resolved by an interactive exchange between the speaker and the subject, using a variety of strategies including repetition, rephrasing, segmentation, and spelling or identification of individual phonemes as a last resort. The subject had to repeat every word before moving on. The number of words correctly repeated in a fixed time was used as a measure of performance.
Half-hour training sessions were held approximately twice weekly for each subject. In every session, 10 minutes of tracking with each strategy was given in the lipreading plus hearing (LH) condition. The order of strategies was randomized differently for each subject and for each session. A total of 12 sessions was completed before the other tests were commenced.
The next four sessions for each subject were taken up with vowel and consonant confusion studies conducted live voice in the hearing alone (HA) condition. The eleven Australian vowels were presented in an /h/-/d/ context, the words being heed, who'd, heard, hard, hoard, hid, head, had, hud, hood, hod. The twelve consonants /p, t, k, b, d, g, m, n, s, z, v, f/ were presented in an intervocalic /a/-/a/ context. Each test consisted of four utterances of each stimulus in random order. The subjects were given a list of the possible stimuli and asked to indicate which stimulus was presented on each trial. Feedback indicating whether the response was correct or incorrect was given to maintain the subject's attention. The subject was not informed of the correct response in the case of incorrectly identified stimuli because of the difficulty of providing unambiguous correction via the acoustic model. A more complicated visual feedback system could have been devised but the authors felt that this was not necessary for this experiment.
The next two sessions were used for presentation of a recorded test of prosody in the HA condition (Atkinson, 1976). The subject was asked to indicate which one of six alternative versions of the sentence "Bev loves Bob" was presented. The six alternatives consisted of the combinations of statement or question combined with stress on one of the three words. Each test contained four different utterances of each sentence spoken by a single speaker, randomized in a different order. The subject was given a list of stimuli and the correct response was indicated to the subject after each trial. In each of these sessions, a Latin square design with subjects, strategies, and order of presentation as the factors was used. The test lists in the first and second sessions were spoken by a male and a female speaker, respectively. The subjects had not heard the voices of these speakers via the acoustic model speech processor prior to these tests.
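The maximum and chance scores for the closed-set tests described above follow from the number of trials and the number of response alternatives. The sketch below works this out for the vowel, consonant, and prosody tests. It assumes that chance performance means guessing uniformly among the alternatives; the helper name is illustrative.

```python
def closed_set_scores(n_trials, n_alternatives):
    """Return (maximum score, chance score) for a closed-set test.

    Chance is taken as uniform guessing among the alternatives,
    which is an assumption rather than a statement from the paper.
    """
    return n_trials, n_trials / n_alternatives


# 11 vowels and 12 consonants, four utterances of each stimulus per test.
vowel_max, vowel_chance = closed_set_scores(11 * 4, 11)
consonant_max, consonant_chance = closed_set_scores(12 * 4, 12)

# Prosody: six sentence versions, four utterances each.
# Stress is a three-way judgment, question/statement a two-way judgment.
stress_max, stress_chance = closed_set_scores(6 * 4, 3)
qs_max, qs_chance = closed_set_scores(6 * 4, 2)

print(vowel_max, vowel_chance)
print(consonant_max, consonant_chance)
print(stress_max, stress_chance)
print(qs_max, qs_chance)
```

These values agree with the maxima and chance scores quoted in the caption of Fig. 2 and in the heading of Table IV.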
The final six sessions were used for presentation of a recorded nonsense syllable test (Dubno and Dirks, 1982) in the HA condition. The same male and female speakers and a Latin square design, as for the prosody test, were used. The test consisted of eleven separate sets of consonants in a VC or CV context where the vowel was /a/, /i/, /u/, or /•/. The only difference between the procedure of Dubno and Dirks and the one used here was the replacement of /a/ by /•/ for one list since /as/ has an offensive meaning in Australian English. Each set of stimuli contained either voiced or unvoiced consonants but not both. A total of 23 consonants was used in the test compared with 12 for the consonant confusion study above.

FIG. 2. Results of the vowel confusion study (a) and consonant confusion study (b) for the three acoustic model subjects using the F2 strategy marked with A, the F0 F2 strategy marked with O, and the F0 F1 F2 strategy marked with D. For the vowel study, the maximum number correct is 44 and the chance score is 4. For the consonant study, the maximum score is 48 and the chance score is 4. [Panels: Subject D, Subject E, Subject F; abscissa: test number.]
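The parameter mappings of Sec. I C can be summarized in a short sketch. It assumes that "proportional to the logarithm of F2" means a mapping that is linear in log frequency between the stated end points, that inputs outside the stated ranges are clamped, and that a frequency lying exactly on a range boundary in Table I is assigned to the higher filter. None of these details are given in the text, and all names are illustrative.

```python
import bisect
import math

# Range boundaries and filter center frequencies (Hz) read from Table I.
F2_EDGES_HZ = [850, 1050, 1250, 1450, 1650, 2400, 3300]
F2_CENTERS_HZ = [1140, 1570, 2170, 3000, 4140, 5710, 7880, 10880]
F1_EDGES_HZ = [400, 540, 680, 820]
F1_CENTERS_HZ = [320, 430, 600, 830, 1140]


def clamp(value, low, high):
    """Limit value to the interval [low, high] (an assumed behavior)."""
    return max(low, min(high, value))


def f2_pulse_rate_pps(f2_hz):
    """F2 strategy: 1-4 kHz mapped logarithmically onto 50-300 pps."""
    f2 = clamp(f2_hz, 1000.0, 4000.0)
    fraction = math.log(f2 / 1000.0) / math.log(4000.0 / 1000.0)
    return 50.0 + fraction * (300.0 - 50.0)


def f0_pulse_rate_pps(f0_hz, voiced=True):
    """F0 F2 and F0 F1 F2 strategies: 50-400 Hz mapped linearly onto
    75-300 pps, with a fixed 50 pps rate for unvoiced sounds."""
    if not voiced:
        return 50.0
    f0 = clamp(f0_hz, 50.0, 400.0)
    return 75.0 + (f0 - 50.0) * (300.0 - 75.0) / (400.0 - 50.0)


def select_filter_hz(freq_hz, edges, centers):
    """Pick the filter whose Table I range contains freq_hz."""
    return centers[bisect.bisect_right(edges, freq_hz)]


# Example frame: voiced speech with F0 = 120 Hz, F1 = 700 Hz, F2 = 1500 Hz.
rate = f0_pulse_rate_pps(120.0)
f1_filter = select_filter_hz(700.0, F1_EDGES_HZ, F1_CENTERS_HZ)
f2_filter = select_filter_hz(1500.0, F2_EDGES_HZ, F2_CENTERS_HZ)
print(rate, f1_filter, f2_filter)
```

In the F0 F1 F2 strategy both selected filters would be excited at the same rate with noncoincident noise bursts; the sketch covers only the parameter mapping, not the burst synthesis.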
II. RESULTS
A. Speech tracking
The speech tracking rates achieved by each subject are shown in Fig. 1 together with the results of four patients using the Nucleus Limited 22-electrode cochlear implant with the F0 F2 strategy (Dowell et al., in press). The acoustic model results for the F0 F2 strategy are in accord with the best implant patients' results.

FIG. 1. Speech tracking rates (in words per minute) for the acoustic model subjects using the F2 strategy marked with A, the F0 F2 strategy marked with O, and the F0 F1 F2 strategy marked with D. Tracking rates for the four best patients from a clinical trial of eight cochlear implant patients using the F0 F2 strategy are shown for comparison (Dowell et al., in press). All results are for lipreading plus hearing. [Panels: Subject D, Subject E, Subject F, Implant Patients; abscissa: test session.]

All of the data show improvements as the training progressed, although some of the patients improved more slowly than the normally hearing subjects. An analysis of variance of the acoustic data was carried out with a mixed model using strategies as a fixed effect and subjects as a random effect. The results from the first three sessions were excluded from the analysis to minimize the effect of the initial rapid improvement of the scores. There was a significant difference between the strategies, as shown by the F ratio for the strategies main effect, using the subjects by strategies interaction as the error term: F(2,4) = 29.3, p < 0.01. There was a significant difference between the subjects compared with the residual error: F(2,72) = 23.1, p < 0.01. The interaction between strategies and subjects was less significant: F(4,72) = 3.25, p < 0.05. Figure 1 clearly shows that the F0 F1 F2 strategy was superior to the others and that for subjects E and F the F0 F2 strategy was superior to the F2 strategy. Subject D performed about equally well on this task with the single-channel F2 strategy and the multiple-channel F0 F2 strategy.
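The degrees of freedom quoted for the tracking analysis can be recovered from the design. The sketch below assumes that the nine sessions retained after dropping the first three of twelve were treated as replicates within each strategy by subject cell; the paper does not state this explicitly, and the function names are illustrative.

```python
def mixed_model_df(n_strategies, n_subjects, n_sessions):
    """Degrees of freedom for a two-factor design with replication.

    Strategies are a fixed effect, subjects a random effect, and the
    sessions are assumed to act as replicates within each cell.
    """
    df_strategies = n_strategies - 1
    df_subjects = n_subjects - 1
    return {
        "strategies": df_strategies,
        "subjects": df_subjects,
        "interaction": df_strategies * df_subjects,
        "residual": n_strategies * n_subjects * (n_sessions - 1),
    }


def f_ratio(ms_effect, ms_error):
    """F ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


# Three strategies, three subjects, sessions 4 to 12 (nine sessions).
df = mixed_model_df(3, 3, 9)
print(df)
```

With these counts the strategies effect is tested against the interaction on (2, 4) degrees of freedom, while the subjects effect and the interaction are tested against the residual on (2, 72) and (4, 72) degrees of freedom, matching the values reported above.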
B. Vowel confusions
TABLE III. Information transmitted in the consonant confusion study.

Strategy              F2      F0 F2   F0 F1 F2
Total                 37%     43%     49%
Voicing               35%     34%     50%
Nasality              86%     84%     98%
Affrication           31%     32%     40%
Duration              62%     71%     81%
Place                 19%     28%     28%
Amplitude envelope    47%     46%     61%
High F2               48%     68%     64%

Figure 2(a) shows the results obtained in the vowel confusion study. A separate analysis of variance for the Latin square design of each session was carried out. The strategies effect was significant at the 1% level for one session and at the 5% level for two sessions. The subjects effect was significant at the 5% level in only one session and the order of presentation was not a significant effect in any session. A repeated measures analysis of the combined results from all sessions showed a significant difference between strategies, using the subjects by strategies interaction as the error term: F(2,4) = 231.2, p < 0.01. A comparison of the main effect for sessions with the interaction term of sessions by subjects showed no significant training effect. Correlation coefficients of the scores and session numbers were calculated for each strategy separately, combining the results from all three subjects. All three coefficients were significantly greater than zero, showing that a training effect occurred as the testing progressed [r = 0.783, t = 3.98, df = 10, p(r = 0) < 0.01 for F2; r = 0.744, t = 3.52, df = 10, p(r = 0) < 0.01 for F0 F2; r = 0.506, t = 1.855, df = 10, p(r = 0) < 0.1 for F0 F1 F2]. A relatively large session by subject interaction term obscured this training effect in the analysis of variance which is less powerful than the correlation analysis for this purpose.
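The t values quoted with the correlation coefficients are consistent with the standard test of a zero correlation, t = r sqrt(df / (1 - r^2)). The check below recomputes them from the reported r and df. The paper does not say which formula was used, so this is an assumed reconstruction, and the function name is illustrative.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a correlation coefficient against zero."""
    return r * math.sqrt(df / (1.0 - r * r))


# Reported values for the vowel study: (strategy, r, df).
reported = [("F2", 0.783, 10), ("F0 F2", 0.744, 10), ("F0 F1 F2", 0.506, 10)]

for strategy, r, df in reported:
    print(f"{strategy}: r = {r}, df = {df}, t = {t_from_r(r, df):.3f}")
```

The recomputed values round to the reported 3.98, 3.52, and 1.855. With df = 10, the 12 data points per strategy presumably correspond to three subjects in each of four sessions.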
An information transmission analysis of the vowel confusion matrices was carried out with the vowels classified according to three groupings: long (heed, heard, who'd, hard, hoard) or short (hid, head, had, hud, hood, hod); high F2 (heed, hid, head, had), mid F2 (who'd, heard, hud, hard), or low F2 (hod, hood, hoard); and high F1 (hard, hud, had, hod), mid F1 (heard, head, hoard, hood), or low F1 (heed, hid, who'd). The data on formant frequencies and vowel durations were taken from Bernard (1970). Table II shows the percentage of information transmitted for each grouping. There is a clear increase in the total information transmission as the strategies encode new speech information. The F0 F1 F2 strategy is the only one that transmits a large proportion of the F1 information. A much greater proportion of the F2 information is transmitted when coded as a filter frequency rather than a pulse rate.

TABLE II. Information transmitted in the vowel confusion study.

Strategy              F2      F0 F2   F0 F1 F2
Total                 34%     56%     72%
Duration              83%     85%     94%
F1 grouping           12%     27%     81%
F2 grouping           25%     68%     55%

C. Consonant confusions
Figure 2(b) shows the results of the live voice consonant confusion study. A separate analysis of variance for each session, carried out in the same way as for the vowel study, showed that strategies, subjects, and order of presentation were not significant factors at the 5% level. A repeated measures analysis of the combined results showed a marginally significant effect for the different strategies: F(2,4) = 8.0, p < 0.05. The correlation coefficient for the scores of all three strategies and all three patients with the session number was calculated to be 0.656. This shows that a significant training effect occurred during the four sessions of testing [t = 5.068, df = 34, p(r = 0) < 0.001]. Small differences between the strategies appear obvious in Fig. 2 although they disappear after training occurs. The average score for the final session including all subjects and all strategies was 58% correct.

FIG. 3. Schematic diagrams of the amplitude envelopes for the grouping of consonants used in the information transmission analysis. Time is the variable along the abscissa and amplitude (at the output of the AGC) along the ordinate. [Four vowel-consonant-vowel envelope panels: unvoiced plosives /p, t, k/; unvoiced fricatives /f, s/; voiced plosives and fricatives /b, d, g, v, z/; nasals /m, n/.]

Table III shows the results of an information transmission analysis using the features of Miller and Nicely (1955). The last two features listed have been formulated to represent the main groupings of the consonant confusion matrices in a more economical fashion. The amplitude envelope feature classifies the consonants into four groups as shown by Fig. 3. These groups were easily recognized by eye from traces of the amplitude envelopes produced by the real time speech processor. The high F2 feature refers to the output of the speech processor's F2 frequency extraction circuit during the burst of the stops /t/ and /k/ or during the frication noise of /s/ and /z/. /f/ and /g/ do not give rise to this feature because the amplitude of the signal is too low during the period that the F2 frequency is high. Thus the high F2 feature is a binary grouping with /t/, /k/, /s/, /z/ in one group and the remainder of the consonants in the other. The values in Table III show a small increasing trend from left to right. The addition of F1 information appears to help in the transmission of all features except place while the use of F0 makes very little difference to the voicing feature.
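A feature-based information transmission analysis of the kind described above can be sketched as follows. The confusion counts are pooled according to a feature grouping, and the transmitted information is expressed as a fraction of the stimulus information. The confusion counts and the two-way voicing grouping in the example are hypothetical and are not data from this study. Treating the percentages in Tables II and III as this relative measure is an assumption.

```python
import math
from collections import defaultdict


def relative_transmission(confusions, feature_of):
    """Relative information transmitted for one feature.

    confusions: dict mapping (stimulus, response) to a count.
    feature_of: dict mapping each stimulus or response label to its
                feature category (for example 'voiced' or 'unvoiced').
    Returns T(x; y) / H(x), a value between 0 and 1.
    """
    # Pool the confusion counts by feature category.
    joint = defaultdict(float)
    for (stimulus, response), count in confusions.items():
        joint[(feature_of[stimulus], feature_of[response])] += count
    total = sum(joint.values())

    # Marginal probabilities of stimulus and response categories.
    p_x = defaultdict(float)
    p_y = defaultdict(float)
    for (x, y), count in joint.items():
        p_x[x] += count / total
        p_y[y] += count / total

    # Transmitted information and stimulus entropy, both in bits.
    transmitted = sum(
        (count / total) * math.log2((count / total) / (p_x[x] * p_y[y]))
        for (x, y), count in joint.items()
        if count > 0
    )
    entropy_x = -sum(p * math.log2(p) for p in p_x.values() if p > 0)
    return transmitted / entropy_x if entropy_x > 0 else 0.0


# Hypothetical example: place errors only, voicing always preserved.
voicing = {"p": "unvoiced", "t": "unvoiced", "b": "voiced", "d": "voiced"}
place_errors_only = {("p", "t"): 4, ("t", "p"): 4, ("b", "d"): 4, ("d", "b"): 4}
print(relative_transmission(place_errors_only, voicing))
```

In this example every response is the wrong consonant, yet the voicing feature is transmitted perfectly. This is the sense in which a feature can score highly while overall identification remains poor.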
D. Prosody test
Table IV shows the results for the stressed word and question/statement judgments required in the prosody test. The stress judgments were easily performed at an almost perfect level. The question/statement judgments were only slightly better than chance. There were no significant differences between strategies or between speakers or between subjects.

TABLE IV. Results of the recorded prosody test. The maximum score in each case is 24, with a chance score of 8 for the stress judgments and 12 for the question/statement judgments.

Subject                         D       E       F
Stress judgments
  Male speaker
    F2                          24      24      23
    F0 F2                       23      24      24
    F0 F1 F2                    24      23      23
  Female speaker
    F2                          24      22      24
    F0 F2                       24      22      24
    F0 F1 F2                    24      24      24
Question/statement judgments
  Male speaker
    F2                          12      8       ?
    F0 F2                       17      16      12
    F0 F1 F2                    14      12      13
  Female speaker
    F2                          ?       ?       ?
    F0 F2                       17      20      13
    F0 F1 F2                    10      15      17
[The four values 15, 12, 14, 12 belong to the cells marked "?" but their assignment could not be recovered.]

E. Nonsense syllable test
The results of this test are tabulated in Table V. The odd "half" scores in Table V arise because one item from each consonant set is repeated and scores half for each presentation (see Dubno and Dirks, 1982). The analysis of variance of the combined results for male and female speakers indicated no significant differences between subjects, strategies, or order of presentation. Because of the small number of presentations of each test item, no consonant confusion analysis or analysis by consonant set was attempted.

TABLE V. Results of the recorded nonsense syllable test. The maximum score is 91 and the chance score is 11 in each case.

Subject                         D       E       F
Male speaker
  F2                            32.5    24      31.5
  F0 F2                         29.5    17      40.5
  F0 F1 F2                      37.5    24      30.5
Female speaker
  F2                            ?       ?       ?
  F0 F2                         ?       ?       ?
  F0 F1 F2                      25.5    31.5    38.5
[The six values 26, 19, 30, 21.5, 25.5, 22.5 belong to the cells marked "?" but their assignment could not be recovered.]

III. DISCUSSION
The data presented above for the F0 F2 strategy may be directly compared with data from recent cochlear implant studies using the Nucleus portable speech processor (Dowell et al., in press). There is good agreement between the two sets of results, provided that only average and above-average cochlear implant patients are considered. The exclusion of a small number of patients with very poor results is justified by the fact that the acoustic model makes no allowance for possible variations in the number of surviving nerve fibers that may adversely affect the performance of the implant. A much more detailed comparison of implant and model results for the F0 F2 strategy was presented by Blamey et al. (1984b). The present acoustic model data and the recently acquired implant data may be incorporated into that comparison without modifying the conclusion that the acoustic model gives results that are empirically equivalent to those achieved by cochlear implant patients.
Because of the multitude of cochlear implant devices, speech processing schemes, and word tests that have appeared in the literature, it is beyond the scope of this report to compare these results extensively with those of other groups. Brief comparisons with those strategies most similar to the ones tested here and with the strategies that have provided outstanding results are included in the discussion of the differences between the results for the individual strategies.
The speech tracking rates from Fig. 1 show a clear superiority of the F0 F1 F2 speech processing strategy over the other two strategies. The average rates of 60 to 70 wpm achieved by all three subjects make it possible to converse at a normal rate under the hearing plus lipreading condition. The subjects reported that the F0 F1 F2 auditory signal sounded more natural and was consequently less fatiguing to listen to than the other strategies despite the distortion introduced by translation and quantization of the formant frequencies. For subjects E and F the F0 F2 strategy is clearly better than the F2 strategy but for subject D this difference is not so marked. The average tracking rates for the F0 F2 strategy are almost identical to those observed for subjects A, B, and C in our previous study (Blamey et al., 1984b). It is of interest that the intersubject differences in the average tracking rates are greater for the F2 than for the F0 F2 and F0 F1 F2 results. A possible explanation is that the abilities of subjects to discriminate different pulse rates are more widely varying than the abilities to discriminate spectral changes of the type used here (Blamey et al., 1984a).
To the best of our knowledge, the only reports of cochlear implant patients that have been evaluated by the speech tracking method are those of the English group (Rosen et al., 1981; Moore et al., 1983), the Zurich group (Dillier et al., 1983), and our own (Martin et al., 1981; Clark et al., 1983; Dowell et al., in press). The English group have used a single-channel device with an extracochlear electrode mounted near the round window to present charge-balanced square waves whose frequency is either equal to the fundamental voice frequency of the speaker (F0) or equal to F0 - 50 Hz. Results were reported for one patient only (FS) who achieved an average tracking rate of 35 wpm with lipreading plus electrical stimulation and 22 wpm with lipreading alone. A further study (Rosen et al., 1981) with normal hearing subjects listening to an acoustic model of the single-channel implant gave comparable results for two subjects and much poorer results for the other three subjects evaluated, although the hearing plus lipreading condition showed an improvement over the lipreading alone condition in each case. The best of these results are slightly better than the performance of subject E with the single channel F2 strategy but not as good as the results of subjects D and F. The Zurich group have evaluated three subjects using a single-channel
two studieswere accountedfor mainly on the basisof F 1
frequencydifferencesbetween the vowels. Our acoustic
model subjectsachievedcomparableaveragesof 54% and
75% correctfor theF0 F 2 andF0 F 1F 2 strategies,
respectively. Chouardet al. (1983)have reported74% and 88%
correctin a testof sevenFrenchvowelsfor a patientusinga
multiple-channel
devicecapableof presenting
F 1andF 2 in a
similarmannerto theF 0 F 1F 2 strategyusedhere.In viewof
the differentnumbersof stimuli, the differentlanguages,the
small number of patients, the different conditions(live
voice/pre-recorded),
and the trainingeffectsthat occurfor
repeatedclosedsettests,it wouldbe prematureto draw any
finalconclusions
fromthiscomparison
of resultsfrom different researchgroups.
The consonantconfusionstudyshowedno clear differencebetweenthe threestrategies.Figure 3 andTable II indicate the salient acoustic features that are common to all three
strategiesand which seem to accountfor the results obtained.It shouldbenotedthat the amplitudeenvelopeclassification includes the voiced/unvoiced
and nasal/nonnasal
distinctions.Apart from the grossseparationinto high F 2
and low F 2 consonants no details of the formant transitions
are perceivedby the subjects(thiswouldbe expectedto producea higherscorefor the placefeature).This result indion the zero-crossing
intervalsof the speechsignal.The recatesthat improvementin consonantrecognitionmay arise
sultswere 13.2,23.3, and 43.2 wpm for lipreadingpluselecfrom a more salientpresentationof the formant transitions
trical stimulation.It is hazardousto comparethesesetsof
in vowelsadjacentto the consonants.The difficultythat the
results because of the uncontrolled variables such as the difsubjectshaveat presentmay arisefrom the limitationsof the
ficulty of the material, the speakercharacteristics,
the subreal-timespeechprocessor
or from the codingstrategyused
jects'-lipreadingabilities,and the etiologyof the deaf pato representthis information.Further experimentswill be
tients. Neverthelessthere is no data to suggestthat the F2
requiredto determinethe causeof this problem. Several
strategyis lesseffectivethan the the other single-channel studiesby otherimplantgroupshaveachieveda similarlevel
strategiesfor speechtracking.
of performancein closedsetconsonantrecognitionstudies
The remaining experimentsreported here were deandit maybethat thissimilarityariseslargelyfrom informasignedto determinewhich aspectsof the speechprocessing tiontransmitted
to thepatientsfromtheamplitudeenvelope
wereresponsible
for the differences
betweenthe strategies.
of the signal.Hochmairand Hochmair-Desoyer
(1983) reThe onlytestwhich showeda significantdifferencewas
ported 56% correct for a study of 17 consonantswith one
the voweltest.The informationtransferanalysisof Table II
subjectin the HA condition.Edgerton(1983)reported58%
leadsto the followingconclusions:
(a) the amplitudeenve- correctfor a groupof subjectsusingthe Houseimplant in a
lope wasrepresentedwell in eachstrategy,leadingto accustudyof 12 consonants.
Rosenet al. (1983)reported65%
ratejudgmentof vowelduration.(b)F 2 frequencyinformacorrectfor onepatientin a studyof 12consonants
in the LH
condition. In this latter case, much of the information was
tion waspresentin eachstrategybut waslesssalientwhen
codedas a pulserate rather than by filter frequency.(c)F1
derivedfrom lipreadingand the auditorysignalcontained
frequencyinformationwasnot presentin theF 2 andF 0 F 2
voicinginformationonly.
strategiesbut wasclearlypresentin the F0 F 1 F 2 strategy.
The resultsfromtheprosodytestindicatedthat enough
Vowel identificationstudieshavebeenreportedby mostimprosodicinformationwastransmittedto enablesubjectsto
plant researchgroups.Only thoserelevantto the present identifystressed
words,but not enoughfor goodquestion/
study will be mentionedhere. Dillier et al. (1983)have restatement
judgments.Fry (1955,1958)foundthat fundamenportedresultsfor a setof fivevowelswith onepatientusinga
tal frequency,duration, and intensitywere three acoustic
single-channel
strategythat presentedpulsesat a rate equal correlatesof stress,in decreasingorder of importancefor
to F 2 dividedby 18.The 68% correctrecognitionreportedis
normallyheatinglisteners.Duration and intensityare premuchhigherthanthe averageof 34% achievedby the acous- sentedin all three strategies,with fundamentalfrequency
tic modelsubjectsfor elevenvowelsusingthe F 2 strategy. absentfrom theF 2 strategy.The similarityof the resultsfor
Hochmair-Desoyeret al. (1981)reported18%, 58%, and
thethreestrategies
suggests
that fundamentalfrequencywas
60% correctfor threepatientsusinga single-channel
speech not sucha salientcuefor acousticmodelsubjects.The results
round window stimulator
which stimulates at a rate based
processor
in a recognition
taskwith eightGermanvowels.
here are consistent with results for the accent test of the
Eddington(1983)reported58% correctfor onepatientusing
a four-channelspeechprocessorfor recognitionof ten
Americanvowelsand diphthongs.
The resultsof the latter
Minimal Auditory Capabilities(MAC) battery (Blameyet
al., 1984b)whichshowedalmostperfectperformanceon the
judgmentof stressed
words.The question/statement
testof
215
J. Acoust.Soc.Am.,Vol.77, No.1, January1985
Blarneyeta/.' Threespeechcodingstrategies
215
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 152.14.136.96 On: Tue, 08 Nov 2016 23:09:55
the MAC battery, however, gave scores significantly above chance. To investigate this discrepancy the recorded sentences of the prosody test were analyzed with digital speech analysis programs and the amplitude envelopes and fundamental frequency contours were examined. It was clear from this analysis that the intensity and duration of stressed words could be used to make the stressed word judgments without reference to the F0 contours. The F0 contours did not follow the pattern of those published by Atkinson (1976). Atkinson's sentences showed F0 contours that rose or fell over the whole sentence for questions or statements, respectively, with the maximum F0 usually occurring on the stressed word. The sentences used here also had maximum F0 values on the stressed word but did not show marked rising or falling patterns over the whole sentence. Instead, the F0 pattern within the stressed word had a rising or falling contour. This characteristic makes the question/statement judgments much more difficult for this recording than for the MAC battery test recording, which did have rising or falling F0 contours over the whole sentence. The prosody test has been used with one cochlear implant patient using the F0 F2 strategy (Martin, in preparation). The results for this patient were similar to those presented here, although intensive training improved the question/statement judgments to a level significantly above chance. The prosody test has not been used by other implant research groups, but greater than chance performance for the MAC battery question/statement and accent tests has been reported in several instances, in which either amplitude envelope or F0 or both were presented (Owens et al., 1983; Atlas et al., 1983).

The recorded nonsense syllable test confirmed the result of the consonant confusion study that was presented with live voice. The level of performance is significantly above chance in every case but the percentage scores are well below those achieved in the consonant confusion study. This poorer performance arose from a combination of factors including the following: a wider range of consonants and vowel contexts was used, the stimuli were presented in CV or VC rather than VCV contexts, the speakers were unfamiliar to the subjects, and no feedback or training was given. The authors are unaware of any published study of cochlear implant patients with this test. Martin (in preparation) has tested one patient using the F0 F2 strategy who scored at a similar level to the acoustic model subjects.

IV. CONCLUSIONS

This study predicts that high levels of speech recognition can be attained with a multiple-channel cochlear implant coding strategy that presents F1 and F2 encoded by electrode position and F0 encoded by pulse rate. This strategy is predicted to produce better results than one which codes only F2 and F0, with the improvement occurring mainly through enhanced vowel recognition and a more natural quality of the encoded speech. The third strategy, which coded F2 in terms of the pulse rate on a single channel, was found to be less effective than the strategies that coded F2 in terms of electrode position.

The presentation of consonants and prosodic information was similar for all three strategies tested, indicating that amplitude and duration information provided the major cues used by the subjects. It is possible that the speech coding strategies do not adequately present the F2 transitions and F0 variations or that these frequencies are not traced with sufficient accuracy by the real time speech processing hardware. Investigations into both of these areas are presently under way.

ACKNOWLEDGMENTS

We wish to acknowledge the financial support provided by the Lions International Deafness Fellowship, the National Health and Medical Research Council of Australia, and the Deafness Foundation of Victoria. We would like to thank Dr. P. M. Seligman and M. Harrison for electrical engineering support, R. C. Dowell and Dr. Y. C. Tong for helpful discussions, A. M. Brown, G. Cook, and H. McDermott for their patience as subjects and H. Hodgens for the typing.

Atkinson, J. E. (1976). "Inter- and intraspeaker variability in fundamental voice frequency," J. Acoust. Soc. Am. 60, 440-445.
Atlas, L. E., Herndon, M. K., Simmons, F. B., Dent, L. J., and White, R. L. (1983). "Results of stimulus and speech-coding schemes applied to multichannel electrodes," Ann. N.Y. Acad. Sci. 405, 377-386.
Bernard, J. R. L. (1970). "Toward the acoustic specification of Australian English," Z. Phonetik 23, 113-128.
Blamey, P. J., Dowell, R. C., Tong, Y. C., and Clark, G. M. (1984a). "An acoustic model of a multiple-channel cochlear implant," J. Acoust. Soc. Am. 76, 97-103.
Blamey, P. J., Dowell, R. C., Tong, Y. C., Brown, A. M., Luscombe, S. M., and Clark, G. M. (1984b). "Speech processing studies using an acoustic model of a multiple-channel cochlear implant," J. Acoust. Soc. Am. 76, 104-110.
Chouard, C. H., Fugain, C., Meyer, B., and Lacombe, H. (1983). "Long-term results of the multichannel cochlear implant," Ann. N.Y. Acad. Sci. 405, 387-411.
Clark, G. M., Black, R. C., Dewhurst, D. J., Forster, I. C., Patrick, J. F., and Tong, Y. C. (1977). "A multiple electrode hearing prosthesis for cochlear implantation in deaf patients," Med. Progr. Technol. 5, 127-140.
Clark, G. M., Dowell, R. C., Brown, A. M., Luscombe, S. M., Pyman, B. C., Webb, R. L., Bailey, Q. R., Seligman, P. M., and Tong, Y. C. (1983). "Clinical trial of a multiple-channel cochlear prosthesis: An initial study in four patients with profound total hearing loss," Med. J. Aust. 2, 430-433.
Crosby, P. A., Seligman, P. M., Patrick, J. F., Kuzma, J. A., Money, D. K., Ridler, J., and Dowell, R. C. (1983). "The Nucleus multi-channel implantable hearing prosthesis," in Proceedings of the Second International Symposium on Cochlear Implants, Paris, September, 1983 (Acta Otolaryngol. Suppl. 1984, Stockholm).
De Filippo, C. L., and Scott, B. L. (1978). "A method for training and evaluating the reception of ongoing speech," J. Acoust. Soc. Am. 63, 1186-1192.
Dillier, N., Spillman, T., and Guntensperger, J. (1983). "Computerized testing of signal-encoding strategies with round-window implants," Ann. N.Y. Acad. Sci. 405, 360-369.
Dowell, R. C., Martin, L. F. A., Clark, G. M., and Brown, A. M. (in press). "Results of a preliminary clinical trial on a multiple-channel cochlear prosthesis," Ann. Otol. Rhinol. Laryngol.
Dubno, J. R., and Dirks, D. D. (1982). "The evaluation of hearing impaired listeners using a nonsense syllable test. 1. Test reliability," J. Speech Hear. Res. 25, 135-141.
Eddington, D. K. (1983). "Speech recognition in deaf subjects with multichannel intracochlear electrodes," Ann. N.Y. Acad. Sci. 405, 241-258.
Edgerton, B. (1983). Presentation to 2nd International Conference on Cochlear Implants, Paris, September, 1983 (Acta Otolaryngol. Suppl. 1984, Stockholm).
Flanagan, J. L., and Rabiner, L. R., Editors (1973). Speech Synthesis (Dowden, Hutchinson & Ross, Stroudsburg, PA).
Fry, D. B. (1955). "Duration and intensity as physical correlates of linguistic stress," J. Acoust. Soc. Am. 27, 765-768.
Fry, D. B. (1958). "Experiments in the perception of stress," Language Speech 1, 126-152.
Greenwood, D. D. (1961). "Critical bandwidth and the frequency coordinates of the basilar membrane," J. Acoust. Soc. Am. 33, 1344-1356.
Hochmair, E. S., and Hochmair-Desoyer, I. J. (1983). "Percepts elicited by different speech-coding strategies," Ann. N.Y. Acad. Sci. 405, 268-279.
Hochmair-Desoyer, I. J., Hochmair, E. S., Burian, K., and Fischer, R. E. (1981). "Four years of experience with cochlear prostheses," Med. Prog. Technol. 8, 107-119.
Martin, L. F. A. (in preparation). "Evaluation of speech processing strategies for patients with implanted hearing prostheses for profound or total deafness," M.Sc. thesis.
Martin, L. F. A., Tong, Y. C., and Clark, G. M. (1981). "A multiple-channel cochlear implant: Evaluation using speech tracking," Arch. Otolaryngol. 107, 157-159.
Miller, G. A., and Nicely, P. E. (1955). "An analysis of perceptual confusions among some English consonants," J. Acoust. Soc. Am. 27, 338-352.
Moore, B. C. J., Fourcin, A. J., Rosen, S., Walliker, J. R., Howard, D. M., Abberton, E., Douek, E. E., and Frampton, S. (1983). "Extraction and presentation of speech features," in Proceedings of 10th Anniversary Conference on Cochlear Implants, San Francisco, June, 1983 (Raven, New York).
Owens, E., Kessler, D., and Raggio, M. (1983). "Results for some patients with cochlear implants on the Minimal Auditory Capabilities (MAC) battery," Ann. N.Y. Acad. Sci. 405, 443-450.
Parkins, C. W., and Anderson, S. W., Editors (1983). "Cochlear prostheses: An international symposium," Ann. N.Y. Acad. Sci. 405.
Remez, R. E., Rubin, P. E., Pisoni, D. B., and Carrell, T. D. (1981). "Speech perception without traditional speech cues," Science 212, 947-950.
Risberg, A. (1974). "The importance of prosodic speech elements for the lipreader," in Visual and Audio-visual Perception of Speech, edited by H. Birk Nielsen and E. Klamp, Sixth Danavox Symposium, Scand. Audiol. Suppl. 4, 153-164.
Risberg, A., and Agelfors, E. (1982). "Speech perception based on nonspeech signals," in The Representation of Speech in the Peripheral Auditory System, edited by R. Carlson and B. Granstrom (Elsevier Biomedical P., Amsterdam), 209-215.
Risberg, A., and Lubker, J. L. (1978). "Prosody and speechreading," Speech Transmission Laboratory, Quarterly Progress and Status Report, Royal Inst. of Technol., Sweden, STL-QPSR 4, 1-16.
Rosen, S. M., Fourcin, A. J., and Moore, B. C. J. (1981). "Voice pitch as an aid to lipreading," Nature 291, 150-152.
Rosen, S., Fourcin, A., Abberton, E., Walliker, J., Douek, E., Moore, B., Frampton, S., and Howard, D. (1983). "Assessment of speech receptive and productive ability with electrically stimulated hearing," in Proceedings of the 11th International Congress on Acoustics, Paris, Vol. 4, pp. 297-300.
Shepherd, R. K., Clark, G. M., Pyman, B. C., and Webb, R. L. (in press). "The banded intracochlear electrode array: An evaluation of insertion trauma in human temporal bones," Ann. Otol. Rhinol. Laryngol.
Summerfield, Q. (1983). "Speech-processing alternatives for electrical auditory stimulation," in Proceedings of 10th Anniversary Conference on Cochlear Implants, San Francisco, June, 1983 (Raven, New York).
Tong, Y. C., Millar, J. B., Clark, G. M., Martin, L. F. A., Busby, P. A., and Patrick, J. F. (1980). "Psychophysical and speech perception studies on two multiple channel cochlear implant patients," J. Laryngol. Otol. 94, 1241-1256.
Tong, Y. C., Clark, G. M., Blamey, P. J., Busby, P. A., and Dowell, R. C. (1982). "Psychophysical studies for two multiple-channel cochlear implant patients," J. Acoust. Soc. Am. 71, 153-160.
Tong, Y. C., Blamey, P. J., Dowell, R. C., and Clark, G. M. (1983). "Psychophysical studies evaluating the feasibility of a speech processing strategy for a multiple-channel cochlear implant," J. Acoust. Soc. Am. 74, 73-80.
Tong, Y. C., and Clark, G. M. (1984). "Loudness summation for electrical stimulation with two electrode-pairs in the human cochlea," presented to Aust. Physiol. Pharm. Soc., May, Monash Univ.