Perceptual separation of simultaneous vowels: Within and acrossformant grouping by F0
John F. Cullingand C. J. Darwin
Laboratoryof ExperimentalPsychology,
University
of Sussex,
Falmer,Brighton,EastSussexBNI 9QG,
United Kingdom
(Received30 June 1992;revised29 January 1993; accepted22 February 1993)
Six experimentsexploredwhy the identificationof the two membersof a pair of diotic,
simultaneous,
steady-state
vowelsimproveswith a differencein fundamentalfrequency(AF0).
Experiment1 confirmedearlierreportsthat a AF0 improvesidentificationof 200-msbut not
50-msduration"doublevowels";identificationimprovesup to 1 semitoneAFo and then asymptotes.In such •timuli, all the formantsof a given vowel are excitedby the sameF 0,
providinglistenerswith a potentialgroupingcue. Subsequent
experiments
askedwhetherthe
improvement
in ideatification
with AF0 for the longervowelswasdue to listenersusingthe
consistent
F o within eachvowelof a pair to groupformantsappropriately.
Individualvowels
weresynthesized
wil:ha differentF 0 in the regionof the firstformantpeakfrom in the regionof
the higherformantpeaks.Suchvowelswerethenpairedsothat the firstformantof onevowel
bore the sameF o as the higherformantsof the other vowel.Theseacross-formant
inconsistenciesin F o did not substantiallyreducethe previousimprovementin identificationrateswith
increasing
AF0'sof up to 4 semitones
(experiment2). The subjects'
improvementwith increasing AF0 in the inconsistent
conditionwas not producedby identifyingvowelson the basisof
informationin the first-formantor higher-formantregionsalone,sincestimuli which contained
either of theseregionsin isolationwere difficultfor subjectsto identify.In addition,the inconsistentconditiondid producepooreridentificationfor larger AF0's (experiment3). The improvementin identification
with AF0 foundfor the inconsistent
stimulipersisted
whenthe AF0
betweenvowelpairswasconfinedto the firstformantregion(experiment4) but not whenit was
confinedto the higherformants(experiment6). The resultsreplicateat differentoverallpresentation
levels(experiment
5). The experiments
showthatat smallAF0'sonlythefirst-formant
regioncontributesto improvements
in identificationaccuracy,whereaswith larger AF0's the
higher formant regionmay alsocontribute.This differencemay be relatedto other resultsthat
demonstratethe superiorityof resolvedrather than unresolvedharmonicsin codingpitch.
PACS numbers: 43.71.Es, 43.66.Hg
INTRODUCTION
dence has been collected that illuminates the actual pro-
Listenerscan recognizemore easilythe speechof one
speakerwhen it has a different fundamentalfrequency
(F 0) from other speechoccurringat the sametime. Evidence has come from two types of experiments.First, intelligibilityof LPC-resynthesized
sentences
maskedby similar connecteddiscourseis improvedby a differencein F 0
between the two voices (Brokx and Nooteboom, 1982).
Second, pairs of simultaneoussteady-state vowels
("double-vowels") are identified more accurately when
they have different rather than the sameF0's (Scheffers,
1983; Zwicker, 1984; Chalikia and Bregman, 1989;
Assmann and Summerfield, 1990).
A number of computationalmodels have been designedto exploitdifferences
in F 0 betweenvoicesin various
ways.Thesemodelshave beentestedboth on continuous
speech(Parsons,1976;Weintraub,1985;Stubbsand Summerfield, 1990) and on double vowels (Scheffers, 1983;
Assmann and Summerfield, 1990; Meddis and Hewitt,
1992). Althoughthe differentprocesses
employedby the
programshavemodeledthe availableexperimental
data
with varyingdegreesof success,
little experimental
evi3454
J. Acoust.
Soc.Am.93 (6),June1993
cessingemployedby the humanauditorysystem.
It is possibleto distinguishtwo differentwaysin which
differencesin fundamentalfrequency(AFo's) could help
separatecompetingvoicesand facilitate the identification
of speechsoundswhen F o differsacrossspeakers:( 1) segregation of harmonicsof different Fo's within the same
frequencyregion,and (2) groupingof distinctspectralregions, such as those around successive
formant peaks,
which are excitedby a commonF o (Broadbentand Ladefoged, 1957). Each of the computationalmodelsabove
identifiesthe Fo's which are presentand then attemptsto
group harmonicsfrom any part of the spectrumwhich
shareeachof thoseFo's. In the caseof Scheffers'(1983)
model and Assmannand Summerfield(1990)'s "place"
model, the resolutionof the initial frequencyanalysisis
sufficientlypoor at high frequenciesthat the Fo's of the
vowelsare only likely to be distinguishedin the regionof
the first formant peak. These modelscan, therefore,only
groupharmonics
of commonF 0 in that region.The other,
more successfulmodels are more likely to distinguishthe
Fo's underlyinghigherformantpeaksand so may alsobe
able to performacross-formant
grouping.
0001-4966/93/063454-14506.00@ 1993Acoustical
Society
of America
3454
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
--Combined
Envelope
I
I
,
I
I (DB)
140
....Envelope
2
I•SHZ
IOOUZ
130
•120
dno
125HZ
•-•
9O
80
70
(a)
Frequency
I
I
1000
2000
(b)
J
3000
•000
50
5000
FREQOEN•
FIG. 1. Twomechanisms
of vowelseparation:
(a) schematic
illustration
of twooverlapping
formants
mergingin thecombined
spectrum
to forma single
peak; (b) eochlearexcitationpatternof the vowelpair/i/+/o/with
the Fo'sof eachformantmarked.
Grouping of harmonicsof each F 0 within the same
frequencyregionmay improveformant-frequency
estimation. Where formantsfrom competingvowelsshare the
sameregionof the spectrum,the selectionof two different
series
of harmonies
mayhelprevealthefrequencies
of the
two formant peaks.Figure 1 (left panel) showsschematically two formant-likeshapeswhich sumto make a shape
with a singlepeak. For vowelson the sameF 0 only this
singlepeakwouldbe availablefor phoneticinterpretation,
but for vowelson differentFo's the two constituentpeaks
could be derivablefrom the spectralenvelopesof the segregatedharmonicseries.
Where formantsdo not overlap,or where overlapping
formants have already been separatedby component
grouping,the Fo's of the harmonieswhich excite them
might then be used to group them with other formants
which share the sameFo in "across-formantgrouping"
(Broadbentand Ladefoged,1957). Figure 1 (fight panel)
showsa cochlearexcitationpattern, derivedusingfilters
with ERB bandwidths(Moore and Glasberg,1983) for the
pair of vowels/i/+/o/.
At low frequenciesthe pattern
resolvesa number of individual frequencycomponents,
but, giventhat the auditorysystemcaninterpolatebetween
thesecomponents
to estimatethe formantenvelopes,
the
pattern showsfive clear formant peaks.Thesepeakscould
be groupedin variouswaysto form differentpercepts,but
the Fo's markedshowthat the two formantsjust below 1
kHz come from a different source (the/a/)
from the other
formants.If the auditory systemis able to label and segre-
gateformantswhichare excitedby differentF0's thenthis
mechanism
wouldclearlybe valuablein pereeptua!ly
separating competing voices.
Studieswhich have usedAF0's betweenformants,in
orderto explorethe role of F 0 in combiningformantswith
commonF o and in segregatingformantsexcitedby differ3455
J. Acoust.Soc. Am., VoL 93, No. 6, June 1993
entF0's,havehaddifficultyin demonstrating
any_tendency
for listenersto group/segregate
formantsa•ording to
their Fo's (Cutting, 1976;Darwin, 1981). Cuttingfound
that listeners had no difficulty in identifying synthetic
speechsyllableswhich containedtwo formants,synthesizedon F0'swhichdifferedby asmuchas 10.2semitones.
Darwin (1981) found similarlyrobustphoneticlabeling
for vowelscomposed
of threeformants,eachwith different
Fo's (expt. 1). In order to ensurethat this resultdid not
comeaboutthroughsubjectsidentifyingvowelsfrom individual formants, Darwin went on to use formant trajecto-
rieswhich formeddifferentdiphthongsin differentcombinations (expt. 2), thus forcing subjectsto combine
information
from the two formants in order to derive a
particulardiphthongpercept.He foundno evidencethat
differentF 0 glides (140 Hz rising to 180 Hz and 120 Hz
falling to 80 Hz) on two formantsof thesediphthongs
could disrupt diphthong identification. Darwin then
looked into the questionof whether F o can nonetheless
affectthe groupingof formantsinto competingperceptual
organizations(expt. 3). He preparedfour diphthongformants;eachof two first formantsgavea uniquediphthong
perceptwhen presentedin combinationwith each of two
secondformants.He presentedall four formantssimultaneouslyin sucha way that two pairsof formantscould be
groupedeither by ear of presentationor by commonF o
glide, but found no conclusiveevidencein listeners'reporteddiphthongperceptsfor groupingof eitherkind. Finally, however,Darwin (1981, expt.4), and subsequently,
Gardner et al. (1989) found that the 2nd formant of a four
formant syntheticsyllable (/ru/) was excludedfrom subjeets' phonetic percept to producea different perceivedsyllable (/li/) when it had beensynthesizedon a differentF o
from the rest of the formants.
Even in this case however
the effectoccurredonly at larger AF0's ( --•4.4 semitones)
J.F. Cullingand C. J. Darwin:Formantgroupingof vowelsby F'o
3455
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
than thosenecessary
for detectionof a secondsource( < 1
semitone)and much larger than thosewhich showhigher
TABLE I. Formantfrequencies
and 3-dBbandwidths
usedto synthesize
the five individual
vowels.
scores
in Scheffers'
double-vowel
paradigm
(¬and«semi-
Vowel
Formant
tone).
Onepossible
explanation
of the relativeineffectiveness FI
of AF0'sin segregating
formantsisthat thehumanlistener F2
has a tendencyto recombinepotentiallyseparatedsounds
whichtogetherproducea phonetically
meaningful
unit.If
so,in the casesof Cutting(1976) and Darwin ( 1981,expt.
1), in whicheachformantof a speechsoundwasexcitedby
a differentF 0, the formantsin isolationwerenot identifiable speechsoundsand only in combinationdid they acquire phoneticsignificance.
Listenersconsequently
heard
/i/
/o/
/u/
/3/
/•/
Bandwidth
250
650
250
450
350
90
2250
950
850
1250
750
110
F3
3050
2950
1950
2650
2850
170
F4
3300
3300
3300
3300
3300
250
F5
3850
3850
3850
3850
3850
300
provementin accuracyof identificationwith increasing
AF0 which asymptotesabove1 semitoneAF 0.
(Darwin, 1981,expt.4; Gardneret al., 1989),in whichthe
(2) For vowelsof 50-msduration, there is no improvement in identificationaccuracywith increasingAF 0, but a
1
dip in accuracyis foundat • semitoneAF0.
secondof four formants is excitedby a differentF 0, the
secondformant must be perceptuallyexcludedfor the re-
I. METHODS
the combinedsound.In the caseof the "ru/li" paradigm
mainingformantsto be perceivedas/]i/, leavingan isolatedandmeaningless
secondformant,heardasa buzzing
sound.Listenersconsequently
heard/li/only when there
was a large AF 0.
This phoneticconstraintexplanation
may,however,be
inconsistentwith the resultsof Darwin (1981, expt. 3), in
which two first formant (F1) and two secondformant
(F2) soundswere employed.Each F1/F2 combination
formeda differentdiphthong,
but synthesizing
a givenF1/
F2 pairwith oneF oglideandthe otherpairwith theother
F 0 glidehad no effectuponthe perceived
combination
of
soundswhen all four formants were presentedsimultaneously.Here therewascompetitionbetweentwo organizations,eachof whichgroupsall the components
into phoneticallymeaningfulunits,yet evidencefor across-formant
groupingwas not found.
The experimentsof Cutting, Darwin, and Gardner
COMMON
TO EACH EXPERIMENT
A. Stimuli
FollowingAssmannand Summerfield
(1990), the experimentsusedthe five British-Englishtensevowels,/i/,
/o/, /u/, /3/, and/•/.
The vowelswere synthesizedon a
VAX 11/780 computerusinga programwhich addedto-
gethersinewaveswith amplitudes
andphases
determined
by the transferfunctionproducedby the cascadeconfiguration of Klatt (1980)'s parallel/cascade
speechsynthesizer.The formantfrequencies
and bandwidthswere identical to thoseof Assmannand Summerfieldand are given
in Table I, but the relativeintensityof the/a/vowel was6
dB lowerthan in their experimentto reducethe rangeof
intensities
amongthe constituent
vowels.The vowelswere
synthesizedwith 10-ms raised cosineonset and offset
rampsand on variousF0's, the lowestof which was 100
Hz.
et al. have addressedthe role of across-formantgrouping
Double vowelswere made by digitally adding wave-
asa cueandfoundonly weakeffectsrequitinglargeAF0's.
The secondpotentialmechanism,
improvedformantfre-
forms. In each double-vowelstimulus, one vowel always
quencyestimation,
may thereforebe responsible
for the
largeimprovements
in double-vowel
identification
thatoccur with the introductionof smallAF0's.The presentex-
periments
setoutto investigate
themismatch
between
the
data from double-vowelexperimentsand from formantgrouping
experiments
by usingacross-formant
inconsistenciesin F 0 in a double-vowel
paradigm.
To do this, vowelswere synthesizedwith a discrete
changein F 0 betweenF1 andF2. Thesevowelswerethen
combinedin such a way that the first formant of each
vowelhad the sameFo as the higherformantsof the competing vowel. If across-formantgroupingaffectsdoublevowel identificationthen an inconsistency
in F 0 acrossthe
first two formants(which are the mostinfluentialin vowel
identification(Petersenand Barney,1952), shouldconfuse
the listenerand producea decrement
in performance.
Beforetacklingthe mechanisms
of perceptualseparation by AF0, experimentI replicatedtwo basicdouble-
vowelphenomena,
as exemplified
by the resultsof Assmann and Summerfield (1990).
(1) For vowelsof 200-msduration,thereis an im3456
J. Acoust.
Soc.Am.,Vol.93,No.6, June1993
had an F o of 100Hz, the otherhadthe sameor a higher
F 0. Sinceno vowelwaseverpairedwith itself(evenon a
differentF0), therewereten phonetically
differentvowel
pairs,requiringdifferentresponses
fromthe subject.Each
pairwasrepresented
bytwostimuliat eachAF0, eachwith
a differentallocationof F0's to vowels(making20 vowel
combinationsat each AFo).
B. Procedure
Subjects
werefirstgiventhe individualvowelsto identify. Eachvowelwasplayedoncein a randomorderwith
no feedbackand subjects
who mademorethan two errors
were required to repeat the practice sessionuntil they
achieved this criterion.
Soundswereplayedat a 10-kHz samplingrate via a
12-bitdigital-to-analog
converterthroughan antialiasing
filter (4.5-kHz low pass) and presentedto subjectsover
SennheiserHD414 headphonesin a sound-attenuating
booth.The presentation
levelsof constituent
vowelsat 100
Hz F 0 lay in the range77-85 dB(A). Subjects
wereinstructedto registertheir two responses
sequentially
by
pressing
one of five keys,marked"EE," "AR," "OO,"
J.F. Culling
andC.J. Darwin:
Formant
grouping
ofvowels
byFo
3456
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
"ER," and"OR." The subjects
wereableto pressthesame
key twice, but were aware that the stimuli always containedtwo differentvowels.No feedbackwas given.
C. Data analysis
1 oo
75
Following Seheffers(1982,1983), Zwicker (1984),
and Assmannand Summerfield(1990), we calculatedfor
eachstimulustype the percentage
of trialson whichboth
vowelswere correctlyidentified.
50
II. EXPERIMENT
1
A. Stimuli
Each of the five vowelswas synthesizedwith 50- and
200-msoveralldurationson six differentFo's: 100, 101.46,
25
0$Oms
[
102.93,105.95,112.46,and125.99Hz (0, •, «, 1, 2, and4
equal-temperedsemitones,respectively,above 100 Hz).
The 100-Hz version of each the five vowels was combined
with eachof the otherfour vowelsat eachof the sixF0's to
give 120 differentdouble-vowelstimuli at each duration
(20 vowelcombinationsX6 AFo's).
0
1/2
200ms
1
2
4
Fo Difference (semitones)
B. Procedure
Thirteen subjects,with no declaredhearingproblems
and inexperienced
in double-vowelexperiments,attended
an hour-longsession.
After identifyingthe singlevowelsin
thepractice,theyheardtwo experimental
blocksof stimuli,
one of 50-ms stimuli and one of 200-ms stimuli. Six of the
subjects
listenedfirstto the 50-msstimuliand sevento the
200-msstimuli. Within eachblock, subjectsheard eachof
the 120 stimuli twice in a randomized
1/4
[]
order.
C. Results
Figure 2 showsthat the identificationof the 200-ms
stimuliimprovesmarkedlyas AF0 increases
from zero to
FIG. 2. Percentcorrectlyidentifiedvowelpairsand standarderrorsfor
subjectsin experimentI for (I} 50 stimuli (circles) and (2} 200-ms
stimuli (squares}.
each nonzero AFo [F(1)=7.03, p<0.05; F(1)=11.0,
p<0.01; F(1)=5.8, p<0.05; F(1)=5.8, p<0.05; F(1)
= 8.4, p < 0.02]. Within the 200-mscondition,identification wassignificantly
betterin a Tukey (HSD) pairwise
comparison
at eachnonzeroAF0 from that at zero AF0
(q=6.04, 9.31, 8.83, 9.50, 12.09, respectively,
p<0.01).
The 4-semitoneAF 0 conditionwasalsosignificantlybetter
thantheJ-semitone
condition
(q= 6.04,p < 0.01). Within
Jand«semitone
AF0's,andthenasymptotes.
Bycontrast, the 50-mscondition,however,the only significantdifferences
wereproduced
bythedipat • and«semitones.
The
identificationof the 50-msstimulidoesnot improvewith
J-semitone
identification
rateswerelowerthanin the 2increasingAF0, in fact there is a slightdrop in perfor-
manceat J and« semitones.
The pattern of data is broadly similar to that of
Assmannand Summerfield(1990); both durationsyield
similaraccuracyof identificationfor zero AF0, but identificationratesimproveasymptoticallywith increasingAF0
in the 200-msconditionand showno improvementin the
50-ms condition.Comparedwith Assmannand Summerfield's data, improvement in the 200-ms condition is
greater
forJsemitone
AF0 andasymptotes
moreslowlyas
&n oincreases,
whilethe 50-msdata reproduces
their dip at
J semitone
AFo, but showssomecontinued
depression
of
identification
rate at « semitone.
An analysis
of variance
wasconducted,
whichcovered
thesixAF0's(0, ¬,«, 1, 2,
and 4 semitones) and the two durations (50 ms/200 ms).
The analysis showed significantmain effects of AFo
[F(5,60)=12.89, p<0.0001] and duration IF(1,12)
=31.21, p<0.0002] and an interaction between AF o and
duration[F(5,60) ----7.62;p < 0.0001].
The 200-msstimuli gavesignificantlybetteridentification than the 50-ms stimuli in the simple main effectsat
3457
J. Acoust.Soc. Am., Vol. 93, NO.6, June 1993
and 4-semitoneconditions(q=4.32, p < 0.05 and q= 5.18,
p < 0.01) andthe«-semitone
identification
rateswerelower
than in the 4-semitonecondition(q = 4.32, p < 0.05).
D. Discussion
and conclusion
Our resultsgenerallyreplicatethoseof Assmannand
Summerfield(1990), despitethe useof only the exclusive
set of vowel pairs and the useof a presentationlevel approximately30 dB higher.Like them,we foundthat AF0's
improveidentificationfor 200 msbut not for 50-msdouble
vowels.There are, however, two differencesbetweentheir
results and ours.
First, our overall scoresare lower despitethe use of
only exclusivevowel combinations.This differencecan
probablybe accountedfor by the greaterpracticeand experienceof Assmannand Summerfield'ssubjects.Although our subjectsgave variable overall responserates, all
showed increasedscoreswith increasingAF0 for the
200-ms stimuli. However, in subsequentexperimentswe
rejectedsubjectswho scorelessthan 55% overall.
J.F. Cullingand C. J. Darwin:Formantgroupingof vowelsby Fo
3457
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
Second,the drop in performancefound here in the
B. Stimuli
50-msdataat ¬and« semitone
AFowasonlyobserved
by
Vowel duration was extended to 1000 ms, since other
Assmann
andSummerfield
in their¬-semitone
condition. experiments
had produceda morereliableimprovement
in
They interpretedthe dropascausedby the patternof beats
betweencorresponding
lownumberedharmonicsfrom different vowels, which are not resolvedseparatelyby the
peripheralauditorysystem.The overallbeatingpatternis
cyclic,havinga periodequalto the differencein F 0. These
beatsmay make the identity of the constituentsof a double
vowelmore or lessrecognizablefrom the combinedspectrum at differentpointsin the cycle.The observedperformancedip can be explainedif the majorityof the vowel
pairsare relativelyunrecognizable
in the smallportionof
identificationwith increasingAF 0 at this longerduration
(Culling, 1991).
The first stagein the synthesisprocedureproduced
1000ms "half-vowels"which containedeitherjust the first
formantor just the higherformantregionsof eachvowel.
A cross-overfrequencyfx wasselectedbetweenthe F 1 and
F2 of each vowel, near the minimum of its spectralenvelope (/i/, 1250Hz;/o/, 800 Hz;/u/, 650 Hz;/•/, 600 Hz;
/$/, 900 Hz). Each half-vowelwas then synthesizedwith
eachofthesixFo's(0, •,• 5,
• 1, 2, and 4 semitonesabove100
thecyclewhichwasheardbythesubjects
in the50ms/¬- Hz) by digitallyaddingtogetheronly frequencycomposemitone condition.
nents that were either lower than fx-50 Hz, or higher
Thedipin performance
in ourdataat¬and• semitone thanfx+ 50 Hz. The resulting100-Hzspectralnotchcenfrequencywasusedto eliminatethe
is consistentwith this interpretation,sincewe useddouble teredon the cross-over
possibility
of
beating
between
componentsof the same
vowelswhoserelativeglottal phaseswere very similar to
vowel
above
and
below
the
crossover.
The presenceof the
the relative phasesused by Assmannand Summerfield
notch
had
little
influence
on
the
phonetic
quality of the
(Summerfield,
1992),andsincefor• and«semitone
AFo's
vowels.
50 ms is only 7% and 14% of the beatingcycle,respecHalf-vowelswerethen combinedto producefull vowtively. Presumably,the identities of the constituentsbecome,on average,more recognizablefrom the combined els which had either the sameF 0 acrossthe whole specspectrumin laterpartsof thecycle,allowingsubjects'
iden- trum ("normal" vowels),or vowelswith an abruptchange
in F 0 betweenF1 and F2 ("split" vowels).The split vowtificationratesto recoverat 1 semitoneAF o. The contrielseithersteppedup from 100 Hz F 0 in the F1 regionto a
butionof beatingto the identification
of doublevowelswill
higher F0 in the rest of the spectrum,or steppeddown
be discussedfurther in a forthcomingpaper.
from the higherF 0 in the F1 regionto 100 Hz in the rest
of the spectrum.There were thus 30 normal vowels(6
Fo'sX5 vowels)and 60 split vowels(6 F0'sX5 vowelsž2
step directions).It shouldbe noted that the split vowels
with no AF0 were identicalto the corresponding
normal
III. EXPERIMENT
2
vowels,and were included only for statisticalreasons.
A. Introduction
Normal vowelswere paired with other normal vowels
to
produce
"normal" double vowels. Split vowels were
If, asBroadbent
andLadefoged
(1957)suggested,
a
paired with other split vowels,which usedthe sametwo
commonF0 allowsthe formantsfrom a particularspeaker
F0's, but in complementaryspectralregions,to produce
to be groupedtogetherappropriately,
thenidentification
of
"Fo-swapped"
doublevowels.In suchF0-swapped
double
doublevowelsshouldbe impairedwhen suchgroupingis
vowels,the lower harmonicsof one vowelhad the sameF 0
disrupted.In thisexperimentacross-formant
groupingwas
as the higher harmonicsof the other. Example spectrafor
disruptedby swappingthe F0's of the two vowelsof a pair
the constituentvowelsof an F0-swappedpair are shownin
in a particularfrequencyregion.The constituents
of these Fig. 3. With 20 vowelcombinations
X 6 AFo's,therewere
"F0-swapped"doublevowelsthus had an inconsistentF0
120 normal and 120 F0-swappeddouble vowels.
acrosstheir formants;the first formant of one vowel was
synthesised
on the sameF 0 as the higherformantsof the
c. Procedure
other vowel, and vice versa (for the purposesof the experNine subjects(eight of whom had participatedin at
imentsin this paper,the frequencyregionof the first forleast one double-vowelexperimentbefore) attended one
mant is definedas extendingup to the spectralminimum
After the practicetestwith the 30 indibetweenF1 and F2 of the vowel in question). Across- hour-longsession.
formant groupingshouldbe severelyimpaired by this ma-
nipulation,sinceit would lead to the wrong groupingsof
first and higherformants.On the other hand, this manipulationgenerallymaintainsa AF o betweenthe vowelsof a
pair withina particularformantregion.If across-formant
groupingis responsible
for the improvementin identification of doublevowelsfoundin experiment1, then the improvementwith increasing
AF0 will beseverelyreduced,or
evenreversedin theFo-swapped
conditionof the following
experiment.
3458
J. Acoust.Soc.Am.,Vol.93, No. 6, June1993
vidual normal vowels, subjectsfirst received a further pre-
test usingthe 60 half-vowels,for which there was no performancecriterion.Then subjectsheardtwo tokensof each
of the 240 normal and F0-swappeddouble vowelsin a
random order which was changedfor every third subject.
D. Results
Figure4 showsthe percentage
of trialson whichsubjectscorrectlyidentifiedbothmembersof a pair of normal
or Fo-swapped
vowelsasa functionof the &Fo. Thereis a
J.F. Cullingand C. J. Darwin:Formantgrouping
of vowelsby F0
3458
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
(DB)
(DB)
150
{
150
140
140
131
130
130
120
œ10
t00
lO0
90
90
8O
80
70
{,
0
1000
200
3000
4000
5000
70
1000
2000
3000
FREQUENCY (HZ)
(HZ)
4000
5000
FIG.
3. Examples
ofthespectra
oftheconstituents
ofanF0-swapped
stimulus:/i/+/3/(for
illustrative
purposes
anexaggerated
AF0of9semitones
is shown).
significant
improvement
in correctidentification
rate with
half-vowels,
presented
in thehalf-vowel
pretest,
aregiven
increasing
AF0 [F(5,40)= 16.73,p < 0.0001],butnosignif- in Table II. Subjectscorrectlyidentified43% of vowels
icantdifference
between
thenormalandFo-swapped
con- from their FI regionsaloneand 40% of vowelsfrom the
ditions,
whichareindistinguishable
in thefigureat AF0's remainingformantsalone.
of¬and«semitone.
TheFo-swapped
condition
produced
slightlypooreraverage
identification
ratesthanthenormal
conditionat AFo'sof 1, 2, and 4 semitones.
This trendis
reflected
in aninteraction
between
AFoandstimulus
type
in theanalysis
of variance
[F(5,40)=3.08,p<0.02],but
the simplemain effectsdid not showthat the two condi-
tionsdiffered
significantly
at anyAF0 (p> 0.05).
Confusion matrices for identification of the isolated
1 oo
•
75
I
•> 50
If listeners
grouptogether
theformants
of a vowelby
their commonF0, thentheywouldgroupthe F1 of one
vowelof anF0-swapped
stimulus
withthehigherformants
of the competing
vowel.Suchinappropriate
grouping
shouldimpairtheirperformance;
but asFig. 4 shows,
the
F0-swapped
doublevowels
showed
onlyslightlyworseperformance
thannormaldoublevowelsandthenonlywith
AF0'sof 1, 2, and4 semitones.
The improvement
in identification
for theF0-swapped
vowelswithsmallAF0'scannot be attributedto across-formant
grouping.It remains
possiblethat improved formant-frequency
estimation,
ratherthanacross-formant
grouping,
is responsible
for the
improvement
in identification
at smallAF0's.This improvedformantestimation
couldarisein theregionof the
first formant,the higherformants,or in both regions.
Whichfrequency
regioncontributes
is addressed
in experiment 4.
0
o
•
E. Discussion
Normol
TABLE ll. Confusion
matrices
showing
percent
responses
of eachclass
n F-o-swapped
to each"half-vowel,"for (a} FI only (b} F2-5 onlyin the "half-vowel"
pretest
of experiment
2. Also,percent
correct
forcombined
stimuli,predictedfromthe"half-vowel"
datausingBoothroyd
andNittrouer[1988,
25
•.
(1)1.
Responses(%)
Fl-only
0
1/4
1/2
1
2
4
Stim. lil Iol lul 131 Iol
Ill Iol lu/ 131 131 (% correct)
/i/
i00
0 0 0
0
0 96 0
0
4
/o/
Fo Difference(semitones)
lul
FIG. 4. Percent
bothvowels
correctly
identified
in experiment
2 for (1}
Prediction
full vowel
F2-5
0
2 96
2 0
0 37
0
59 4
22 O 76
0
2 0
0 56 9 30 5
13/
0
0
100 0
0 70
0
0
30
/3/
0
0 33 67 0
0 ii
2 74 13
100
98
78
100
13
normalstimuli(circles}and(2} F0-swapped
stimuli(squares}.
3459
d.Acoust.
Soc.
Am.,
VoL93,No.6,June
1993
J.F.Culling
andC.d.Darwin:
Formant
grouping
ofvowels
byFo
3459
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms.j
The slightlylowerperformance
for F0-swapped
stimuli
relativeto normalstimuliat aF0's largerthan half a semitone raisesthe possibilitythat this differencemight continueto increase
with largerAF0's.This issueis addressed
in experiment3.
However,an alternativeexplanation
for the resultsof
e•perfment
2 must
also
beexcluded.
ARhough
the"halfvowel" test showsthat vowelsare not easilyidentifiedfrom
the upperor lowerportionsof their spectraalone,in the
double-vowelexperimentssubjectsmay still have combined informationfrom differentfrequencyregionsafter
the phoneticlabelingstage.Had they donesotheir results
would have been insensitiveto the disruptionof acrossformantgroupingcues.Boothroydand Nittrouer[1988,
Eq. (1)] relate the probabilities
for identifyinga token
from either of two statisticallyindependentsourcesof in-
formationin isolation(p• and P2) to the probabilityof
identificationfrom both in combination (Pc):
B. Stimuli
Half-vowels were preparedwith F0's of 100, 102.29,
105.95,112.25,125.99,141.42,168.18,and 200 Hz. The F 0
of 102.29wasintended
to be«semitone
above100Hz, but
this value, enteredin error (the correctF0 being 102.93
Hz), was only 0.39 semitonesabove 100 Hz. The programmednatureof the stimuluspreparationensuredthat
thiserrorwasconsistent
across
all stimuliin the«-semitone
condition.Therewerethus8 Z•F0 conditions
altogether0,
"«,"1, 2, 4, 6, 9, and12semitones.
The normaland F0-swapped
stimulustypeswerepreparedin the sameway as in experiment2 by addingtogetherfour half-vowels.
The Fl-only stimuliwereprepared
by addingtogetheronly the two F1 half-vowels.
The F2-5
stimuli were preparedby combiningonly the two F2-5
half-vowels.
Sinceonly two half-vowelscomposedthe Fl-only and
F2-5 stimuli, thesestimuli were lower in intensitythan the
normal and F0-swappedstimuli,which were eachcomposedof four half-vowels.In addition,due to the --6 riB/
This equationwasappliedto the probabilityof identifying oct. spectraltilt on the Klatt synthesiser's
combinedvoice
a singlevowel from its completespectrum.Information sourceand radiation functions,the F2-5 half-vowelswere
from the first-formantregionand from the regionof the
considerablylessintensethan the F1 half-vowels,making
higherformantsweretakento be statistically
independent. the Fl-only stimulimoreintensethan the F2-5 stimuli.No
The identificationratesfor the singlecompletevowelspreattemptwasmadeto compensate
for theseintensitydifferdictedunderthis assumption
are includedin Table lI and
ences.With 8 •Fo's X 4 conditions• 20 vowelcombinapc= 1- ( 1-œ•) ( 1-œ2)-
( 1)
show that most of the vowds could be identified well from
independentphoneticcategorizationof the differentfrequencyregions.The equationis somewhatgenerous,in
tions,there were 640 stimuli altogether.
C. Procedure
that it assumesthat new information always servesto im-
provethe chances
of a correctresponse
ratherthanpotentially misleadingthe listener.Nonetheless,
as an additional
control, experiment 3 also compared normal and F0-
swappedstimuliwith thosewhichcontainedonlytheFI or
only the F2-5 regionsof the constituentvowels.
Eight subjects,all of whom were experiencedin
double-vowelexperiments,
attendedonehour-longsession.
The practicecontainedthe 40 individualvowelswhich
would composethe normal stimuli in the experimentto
follow. The 640 stimuli were presentedonce in a random
order which waschangedfor everythird subject.
D. Results
IV. EXPERIMENT
3
A. Introduction
The Fo-swapped
stimuliin experiment2 weredesigned
to misleadmechanisms
of across-formant
grouping.Experiment 3 investigatedtwo possibleexplanationsfor the failure to observethe substantial
declinein performance
predictedby across-formant
grouping.First, across-formant
groupingby F0 may be weak only for the rangeof small
/XFo'semployedin experiment2; and second,an individual
spectralregionmay providesufficientinformationfor each
vowelto be identifiedwithout integratingit with information from other spectralregions.
Accordingly,experiment3 repeatedthe conditionsof
the previous experiment using normal and Fo-swapped
double vowels, but extended the stimulus set to include
larger AF0's. As a controlfor vowel identificationbeing
basedon information available only in a single frequency
region,experiment3 alsoaskedsubjectsto identifyvowel
pairsconsisting
of onlythe firstformantor onlythehigher
formants
and "F2-5"
of the
conditions,
constituent
respectively.
vowels:theseare the "Fl-only"
3460
J. Acoust.Soc.Am.,Vol.93, No.6, June1993
Figure5 showsthatconditions
whichhavefull spectra
(normal and F0-swapped)givemarkedlybetteridentification rates than those which have only a portion of the
spectrum(Fl-only and F2-5). This resultwasreflectedin
an analysisof variancecoveringthe normal, Fo-swapped,
Fl-only,andF2-5stimulus
typesandthe8 •xF0's(0, "«,"
1, 2, 4, 6, 9, and 12 semitones)by a main effectof stimulus
type [F(3,21 ) = 138.2,p <0.0001]. Betteridentificationof
the normal and Fo-swappedstimuli was confirmed by
Tukeypairwisecomparisons;
Fl-only andF2-5 conditions
differedsignificantlyfrom eachother, and eachgavesignificantlylower identificationratesthan either the normal
or F0-swappedconditions(p < 0.01 in eachcase).The normal and Fo-swappedconditionsdid not differ significantly
(q = 3.09, p > 0.05).
Figure5 alsoshowsthat identification
ratesimproved
with fifo only in the normaland F0-swapped
conditions.
This interpretationwas corroboratedby the analysisof
variancewhich showeda main effect of AFo [F(7,49)
= 3.36,p < 0.01] anda significant
interactionbetweenstimulustypeandfiJv0 [F (21,147) = 4.20,p < 0.0001]. Thesim-
d.F. CullingandC. d. Darwin:Formantgrouping
of vowelsbyFo
3460
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms.j
0
ment with increasingAF0. Apparently,subjectsdid not
identify the vowelsby independently
labelingseparate
Normal
[] Fo-swapped
A F1 -only
lOO
spectralregions.
We now turn to the difference in identification between
v
2-5
¸
F1+F2-5
the normaland the F0-swapped
conditions.
The significantlyimpairedperformance
for F0-swapped
ascompared
info
to normalstimuliat a AF0 of 9 semitones
showsthat the
effectof misleading
across-formant
groupingmechanisms
in the F0-swapped
conditionis stronger
at higherA/o'S.
Across-formant
groupingmechanisms
couldberesponsible
75
for this clear deterioration in vowel identification in the
F0-swappedcondition.
50
Why shouldacross-formant
groupingmechanisms
exert their effectonly at relativelylargeAF0's?For mostof
the vowelsusedhere,the frequencycomponents
in the
second-and-higher-formant
regionwill not be resolved.In
orderfor listenersto usea AF0 to segregate
formants,they
25
would have to be able to detect two different fundamental
o 1/4
1/2
1
2
4
6
9
Fo Difference(semitones)
FIG. 5. Percentbothvowelscorrectly
identified
in experiment
3 for ( 1)
Normalstimuli(circles),(2) F0-swapped
stimuli(squares),(3) F2-5
stimuli(uprighttriangles),
(4) Fl-onlystimuli(inverted
triangles),
and
(5) predictedfrom Fl-only and F2-5 scores(diamonds).
pie main effectsconfirmedthat the effectof AFo was
significant
only in the normaland Fo-swapped
conditions
IF(7) =699.7,p<0.0001 andF(7) =431.3,p<0.005, respectively].Although the figure showsthat the normal
stimuli gave higheridentification
rates than the Foswapped
stimuliat AFo'sof 2-9 semitones,
Tukeypairwise
comparisons
at eachAFo showeda significantdifference
betweentheseconditions
only at 9 semitones
AFo (q
=4.64, p <0.05).
E. Discussion
frequencies
in eachfrequencyregionand correctlymatch
upF0'swhichwerecommonto differentfrequency
regions.
Listenerscandetectdifferences
in F 0 acrossregionsof resolvedand unresolvedharmonics(Carlyon et al., 1992),
but relativelylarge differences
in F o are required(1-2
semitones).This resultis undoubtedlyrelatedto the fact
that the fundamental
frequencydifference
limenfor complex periodicsoundsconsisting
of only unresolvedharmonicsisaboutanorderof magnitude
greaterthanthatfor
soundsthat contain some resolvedharmonics (Houtsma
and Smurzynski,1990).
The largeAF0 of 9 semitones
requiredto producea
significantdifferencebetweenthe normal and the F oswappedconditionsin the presentexperimentcan be comparedwith previousresultsby Gardneret al. (1989). They
foundthat a AF0 of about4.4 semitones
wasrequiredto
produceperceptualseparationof F2 from the other simul-
taneouslypresented
formantsof a composite
ru/li syllable.
V. EXPERIMENT
4
A. Introduction
The primarygoalof this experiment
wasto identify
thefrequency
regionresponsible
for theimprovedidentificationwith smallAF0'sin the F0-swapped
doublevowels
of the previoustwo experiments.
The experiments
by
and conclusions
Houtsma and $murzynskireferredto aboveshowedthat
First we consider the identification scores for the control conditions. What level of identification would we ex-
listeners
are moresensitive
to sequentially
presented
F0
pectin the normalandtheFo-swapped
conditions
if listeners were simply basingtheir judgementson pairs of
phoneticlabelsderivedindependently
from the two different formantregions?Again, takinginformationfrom the
first-formant
regionandthe higherformantregionsasstatisticallyindependent,
Eq. (1) from Boothroydand Nittrouer (1988) canbe usedto predictidentificationratesfor
completedoublevowelsfrom scoresin the Fl-only and
monics.If listenersare relativelyinsensitive
to small differencesin F o betweensoundsconsisting
of only unresolvedharmonics,thenthe improvement
in identification
scoresthat we havefoundat smallAF0'swith the F 0swappedstimulicouldbe dueentirelyto F 0 differences
in
the first formantregionnearthe FI peak,wherethe har-
F2-5 conditions.The results of this analysisare shown as
double-vowelstimulusin which the differencein F o occurred only in first formant region;both the higher-
thestarsin Fig. 5. The predictions
notonlyfall far shortof
the level of performancefoundin the normaland in the
F0-swapped
conditions
but alsodo notshowanyimprove3461
J.Acoust.
Soc.Am.,Vol.93,No.6, June1993
differences between resolved than between unresolved har-
monics are resolved.
To testthishypothesis
we produced
anothertypeof
formantregions
hadthesameF 0. These"sameF2:5"double vowelsshouldbe identifiedasaccuratelyasthe normal
J.F. Culling
andC.J. Darwin:
Formant
grouping
ofvowels
byFo
3461
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
normal,Fo-swapped,and sameF2-5conditionsacrossthe
1 oo
6 AF0's(0, {, «, 1, 2, and4 semitones).
Thisanalysis
re-
0
yealed significantmain effectsof stimulustype [F(2,7)
=10.1, p<0.002] and AFo [F(5,35)=18.6, p<0.0001],
but no interaction. Overall, the accuracyof identification
for Fo-swapped
andsameF2-5stimuliwassomewhatlower
than for normal stimuliand the conditionsbeginto diverge
more sharplyat 4 semitones.
Althoughthis divergencedid
not produce an interaction in the ANOVA, the simple
main effectsshowedthat there were significantdifferences
betweenthe three stimulustypesonly at 4 semitonesAF 0
IF (2) = 7.18,p < 0.01]. All threestimulustypesproduced
significantsimple main effects of AF0 [F(5)=5.58,
p <0.001; F(5) =6.53, p < 0.0005;F(5) =4.62, p < 0.005].
Normal
[]
Fo-swapped
•
Same
Fo for
F2-5
E. Discussion
Identification
J
I
I
o
•/•
•/2
I
1
I
1
2
4
of the double vowels increased over the
firstsemitoneof AF0by aboutthesameamountin all three
conditionsof this experiment.In particular,the sameF2-5
conditionshowsas large an increaseas the F0-swapped
condition.
Fo Difference (semitones)
FIG. 6. Percentboth vowelscorrectlyidentifiedin experiment4 for ( 1)
Normal stimuli (circles), (2) Fo-swapped
stimuli (squares),and (3)
sameF2-5stimuli (triangles).
and the F0-swapped
doublevowelsif listenersare insensitive to small AFo'sin the higherformants.
This increase in identification
cannot then be
due to mechanisms
operatingin the higher-formantregion
(includingacross-formant
grouping)since,in this region,
eachvowelin the sameF2-5conditionhad the sameF 0 as
its partner. The increasemust be due to mechanisms
operating solelyin the first formant region.
With a 4 semitoneAF 0 thereis againsomesuggestion
that mechanismsconcernedwith the higher-formantregion can influenceidentification.The Fo-swappedand the
sameF2-5 conditionswere both slightly worse than the
normal
condition.
B. Stimuli
Vl. EXPERIMENT
The half-vowelsfrom experiment2 wereusedagainin
this experimentin order to recreatethe normal and F 0swappedconditions, plus the new sameF2-5 condition.
This conditionwas madeby combiningthe half-vowelsin
a differentway. As before,the sameF2-5stimuliwere composedfrom FI half-vowelswith differentF0's,but for this
conditionthe F2-5 half-vowelsboth had the sameF 0, so
that only oneFo waspresentin thisfrequencyregion.Two
versionsof each sameF2-5 stimulus were made, which dif-
feredin whichof the two Fo'swasusedfor the F2-5 region.
5
A. Introduction
Experiment4 showedthat AF0'sin the F1 regionare
sufficientto producethe improvementin identificationaccuracyfor doublevowelswith small Ateo'S.One explanation for this effectis offeredby physiologicaldata. Miller
and Sachs(1984) foundthat the neuralrepresentation
of
within-channelamplitudemodulation,thoughtto encode
pitch for high numbered harmonics (Schouten, 1940),
cannot be detected in auditory nerve microelectrode re-
With 6 AF0's(0, {, «, 1, 2, and4 semitones)
X20 vowel cordingsat high presentationlevels.
combinations,this designresultedin 120 normal, 120 F 0swapped,and 240 sameF2-5doublevowels.
C. Procedure
Sevensubjectsfrom previousexperimentsand one who
was inexperiencedin double-vowelexperimentsattended
Experiments1-4 usedstimuli at levelsaround 85-90
dB(A). At thesehigh levels,perceptualseparationin the
F2-5 region may, therefore,be lost through impaired encodingof amplitudemodulation.In contrast,Scheffers
(1983) presented his stimuli at around 60 dB (SPL),
Zwicker presentedhis stimuli at 63 dB(A), Assmannand
Summerfield used only 53 dB(A), while Chalikia and
one hour-longsession.The 480 experimentalstimuli were
presentedonce to each subject in a random sequence, Bregmanused65 dB(A). In somerespectsthe higherpresentationlevelsusedhere representa more realisticmodel
which was changedfor every secondsubject.
of the cocktail party effect,which is intrinsicallyconcerned
with the auditory system'sresponseto noisy listeningenD. Results
vironments, rather than the 65-70 dB levels typical of
speechin a quiet listeningenvironment.
Figure 6 showsthat all threeconditionsyieldedsimilar
Experiment5 was designedto assessthe influenceof
improvements
in identification
ratesas AFoincreased
from
presentation
level on acrossformant groupingby F 0. Ex0 to 1 semitone.An analysisof variancecomparedthe
3462
J. Acoust.Soc. Am., Vol. 93, No. 6, June 1993
J.F. Cullingand C. J. Darwin:Formantgroupingof vowels by Fo
3462
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms.
lOO
periment5 replicatedexperiment2 usingtwo presentation
levels: 85-90 dB(A), as used in experiments1-4, and
55-60 dB(A), typicalof previousresearch.If the encoding
of amplitudemodulation,and hencepitch, has beenimpairedat high presentation
levelsthen a lower level will
improveF2-5 segregation
by AFo; at the lowerlevel,performancewithFo-swapped
stimulishoulddecline(relative
to that with normal stimuli) at smallerAFo's than at the
higher level.
o
B. Stimuli
50
0
The stimuli were identical to those of experiment2,
with normaland Fo-swapped
stimulustypesand 6 AF0's
0
Loud
(0, ¬,«, 1, 2, and4 semitones)
giving240stimuli.Attenuation of 30 dB in the low intensitycondition was achieved
Norm•$1
[] Fo-swapped
o
---
25
Quiet
by insertingseparateAdvanceElectronicsA64 stepattenuatorsinto eachear'ssignalchannelafter digital-to-analog
conversion.
c. Procedure
Eight subjects,experiencedin double-vowelexperiments,attendedone hour-longsession,which was divided
into two blockswith different presentationlevels. Each
stimuluswas presentedonce in each block. Four subjects
receiveda block with the higher intensityfirst, while the
other four startedwith the lower intensityblock. Two different random sequences
were usedacrosssubjects.
0
1/4
1/2
1
2
4
Fo Difference (semitones)
FIG. 7. Percentboth vowelscorrectlyidentifiedin experiment5 for ( l )
loudandnormal(squares,
solidline), (2) loudandF0-swapped
(circles,
solidline), (3) quiet and normal (squares,dashedline), and (4) quiet
and F0-swapped(circles,dashedline).
higherfrequencyregionsmay be more effectiveat lower
levels,this data offersonly limited supportin the form of
Figure 7 showsthat the overallpattern of resultsis
an appropriate
but smallandnonsignificant
trend(the difsimilar to thoseof experiment2; in eachconditionidentiferencebetweenthe normal and Fo-swappedconditionsat
fication improvesover the first semitoneand decreases only 2 semitones
AF 0 at the lowerpresentation
level). The
somewhatfor the F0-swappedconditionsat 4 semitones. resultsare consistent
with the hypothesis,
but indicatethat
These resultswere reflectedin an analysisof variancecovany effectis quitesmall.The resultsfrom this experiment
ering stimulustype (normal and F0-swapped),stimulus do not, therefore,detract from the generalfinding that
level(quietandloud),andAF0 (0, ¬,«, 1, 2, and4 semi- pitchmechanisms
basedon unresolved
harmonics
canplay
tones)by a maineffectof AF0 [F(5,35)= 11.5,p < 0.0001] only a minor role in the improvementin double-vowel
and an interaction between stimulus type and AF 0
identificationwith AF0'sof a semitoneor less.
Since stimulus level has little effect on across-formant
[F( 5,35)=6.1, p <O.0005].
Identificationwasbetterfor the loud presentationlevel
grouping,the ineffectiveness
of across-frequency
grouping
[F(1,7)=15.7, œ<0.01] and better for normal than for
at small AF0'shassomegenerality.It remains,therefore,
F0-swapped
stimuli[F(1,7) = 8.1,p < 0.05]. The figureapto explorethe conditionsin which a role can be demonpearsto showa dip in correctidentificationrate at the loud
stratedfor AF0'sin the higherformant region.
presentation
levelfor a AF0 of « semitone
andalsothat
Since the results of experiment5 showedthat the
there is some differencebetween the normal and F 0higherstimuluslevelusedhithertoimproveslisteners'acswappedconditionsat 2 semitonesusing the quiet level.
curacyof identification,but doesnot havedifferenteffects
There was,however,no significantchangewith levelin the
in different conditions,further experimentscontinued to
pattern of improvementwith AF 0, as reflectedin a threeuse this high level.
D. Results
way interaction between stimulus type, level, and AF 0
[F(5,30) =2.10, p> 0.05].
E. Discussion
VII. EXPERIMENT
A. Introduction
and conclusions
The resultsof experiment5 are consistentat each presentationlevel with the resultsof experiments2 and 4, in
that normaland F0-swappedstimuligivesimilarimprovements in identificationat low AF0's. With regard to the
specifichypothesisthat pitch mechanismsoperatingin the
3463
J. Acoust. Soc. Am., Vol. 93, No. 6, June 1993
6
Experiment 6 examined further the contribution of
AF0's in the higher formant region to double-vowelidentificationat the larger AF0's used in experiment3. The
normal,F0-swapped,andsameF2-5conditionsfrom experiment 4 were extendedto an octaverangeof AF0's.Exper-
J.F. Culling and C. J. Darwin: Formant grouping of vowels by F0
3463
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
iment4 showedthat across-formant
inconsistencies
in F0
impaired identificationof Fo-swappeddouble-vowels
at
lOO
AFo's greater than about 2 semitones.On the basisof this
findingthe sameF2-5conditionshouldalsobe worsethan
the normalconditionat larger AFo's,sinceit alsowill be
susceptibleto the effect of across-formantinconsistencies
in F0.
A newstimulusconditionwasintroduced,
sameF1,in
which vowelpairshad differentFo's only in the higher
formantregion.In this conditionthe AFo in the higher
formant regionmust be responsible
for any observedimprovementin identificationwith increasingAF0. On the
basisof previousexperiments,we predictedthat any improvementin identificationwith increasingAFo in this
conditionshouldbe slight.Small AFo'sshouldbe ineffective in the higherformantregionbecausethey are closeto
the differencelimen for fundamentalfrequencyfor unresolvedharmonics•while at higherAFo'sgroupingof the
first formantwith higherformantsshouldbe disruptedby
an inconsistency
betweenthe F0's in the first and the
higherformant regions.
O
Normal
• Same
FoforF1 J
[] Fo-swapped
A
0
1
Some
2
Fo for
4
6
F2-5
9
12
Fo Difference (semitones)
B. Stimuli
A newsetof "half-vowels"
wassynthesized
withFo's
of 100, 105.95, 112.46, 125.99, 141.42, 168.18,and 200 Hz,
usingthe samecross-over
frequencies
asexperiment2. The
higherFo's of the stimuli were thus 0, 1, 2, 4, 6, 9, or 12
equal-temperedsemitonesabove 100 Hz.
Thesehalf-vowelswerecombinedin orderto produce
the normal,F0-swapped,and sameF2-5conditionsusedin
experiment4, plus a fourth condition,sameF1, for which
theF1 half-vowels
werebothat thesameF0, whiletheF0's
of the higherformantsdiffered.As in experiment4 there
weretwo versionsof eachsameF2-5pair at eachAF0, one
with the higherformantregionat 100Hz F0, anda second
usingthe otherF 0. Similarly,two versionswerealsoused
FIG. 8. Percentbothvowelscorrectlyidentifiedin experiment6 for ( 1)
normal stimuli (circles), (2) F0-swappedstimuli (squares), (3)
sameF2-5(uprighttriangles),and (4) sameFl (invertedtriangles).
B first, and then session
A. Overall eachsubjectreceived
280 stimuli from each condition: 280 sameF2-5 stimuli in
session
A, 280 sameF1stimuliin session
B, and two presentations (in different sessions)of the 240 normal and 240
F0-swappedstimuli.
D. Results
The overall pattern of results(Fig. 8) is consistent
with
previousexperiments.Accuracyof identificationimfor the sameF1 condition, so that, with 20 vowel combiproves
steeplywith a AF 0 of one semitonein the normal,
nationsX7 AF0's,therewere 140normalstimuli,140F 0the
F0-swapped,
and the sameF2-5conditions.The latter
swappedstimuli, 280 sameF2-5 stimuli, and 280 sameF1
two conditionsshowa declinein identification
accuracyat
stimuli.
higherAF0's.The effectof AFo'sin the sameF1condition
is smallerand growsmore slowly with increasingAFo,
C. Procedure
beforedecliningrapidly at 6 semitonesAF 0. All four conNine subjects(of whom sevenhad performedexperi- ditions show similar identificationaccuracywith one ocments2 and/or 4 and two were inexperienced
in double- tave (12 semitones)AF0 as at zero AF0.
vowelexperiments)attendedtwo hour-longsessions.
BeAn analysisof variancewas conductedcoveringthe
fore each sessionsubjectsidentifiedall the 35 individual four stimulustypes(normal,F0-swapped,
sameF2-5,and
vowels which would constitute the normal stimuli in the
sameFl), two repeatedmeasures(repeatedpresentations
experiment.
of normal and Fo-swappedstimuli; different versionsof
Due to extensionsof the experimentintroduced after
sameF2-5and sameF1stimuli) and 7 AF0's (0, 1, 2, 4, 6,
somedata had alreadybeencollected,counterbalancing 9, and 12 semitones).The analysisrevealedsignificantdifwasincomplete.The two sessions
consistedof differentsets ferencesbetweenthe four stimulustypes[F(3,24)= 13.9,
of conditions:sessionA contained the normal stimuli, the
p <0.0001] and betweenthe sevenAFo's[F(6,48)= 18, 0,
Fo-swapped
stimuli,andthe sameF2-5stimuli;session
B p < 0.0001]. There werealsointeractionsbetweenthesetwo
containedthe normal stimuli, the Fo-swappedstimuli and
factors [F(18,144) ----3.75, p < 0.0001], betweenrepeated
the sameF1 stimuli. Each stimulus from each of these conmeasuresand AF o [F(6,48) ----2.7, p < 0.05] and between
ditions was presentedonce per session.Sevenof the suball threefactorsIF(18,144) = 2.93,p < 0.0005].
jectshad alreadycompletedsessionA beforesessionB was
The simplemain effectsshowedthat the effectof AF o
addedto the experiment.A further two subjectsdid session was highly significantfor the normal, Fo-swapped,and
3464
d. Acoust. Soc. Am., Vol. 93, No. 6, June 1993
d.F. Cullingand C. d. Darwin:Formant groupingof vowels by Fo
3464
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
lOO
stimuliin whichthe first formantregionis excitedby the
higher of the two F0's. This effectof sameF1 versionis
probablyresponsible
for the interactions
betweenrepeated
measures(version) and AFo and betweenstimulustype,
repeatedmeasuresand AF 0.
75
E. Discussion
1. Effects of across-formant grouping
The useof across-formant
groupingin the identification of double-vowel
stimuliis demonstrated
in experiment
6 by the significantlylower identification scoresfor the
Fo-swappedconditioncomparedto the normal condition
at 9 semitonesAF o (seeFig. 8). Performanceis depressed
in the F0-swappedcondition becauseacross formant
groupingmechanismsare beingmisled.
5o
0
25
Both
Fls
on 100Hz
Fo
[] Both Fls on higher Fo
One of the constituent vowels in a sameF2-5 stimulus
0
1
2
4
6
9
12
Fo Difference (semitones)
FIG. 9. Percentboth vowelscorrectly identifiedin experiment6 for
sameFl stimuli,the Fl's of which wereboth excitedby ( 1) 100 Hz F o
(circles)and (2) the higherF o (squares).
sameF2-5 conditions [F(6)=9.39, p<0.0001; F(6)
----7.37,p<0.001; F(6)=5.23, p<0.0005], but was only
just significantin the sameF1 condition [F(6)=2.34,
p < 0.05], reflectingthe smallerincreasein correctidentificationwith z•,F0 in thiscondition.The differences
between
the four stimulustypeswere significantat 1, 4, 6, and 9
semitonesAF0 [F(3)=3.5, p<0.05; F(3)=3.7, p<0.05;
F(3) =4.78, p<0.01; F(3)=7.4, p <0.002], reflectingsuperior identificationaccuracy for normal double vowels
comparedto one or more of the other conditions.A Tukey
pairwise comparisonbetween the four stimulus types
showedthat the normal conditiongavesignificantlymore
accurate identification that the sameF1 condition at 1, 4, 6,
and9 semitones
z•F0 (q=4.37, 4.37, 5.32,and 5.46,respectively). The nonsignificance
of the simplemain effectand
the normal-sameF1pairwisecomparisonat 2 semitones
AF oreflectsthe relativelygoodrecognitionaccuracywhich
subjectsachievedwith sameF1 stimuli at that AF 0. The
pairwisecomparisons
alsoshowedthat the normal condition was significantlybetter than the Fo-swappedand
sameF2-5 conditions at 9 semitones&F o (q=4.44,
p <0.05; q=5.97, p <0.01).
All four conditionsshowa declinein performancebetween 4 and 6 semitonesAFo. For the normal condition
performancerecoversat 9 semitones,
forminga dip in recognitionaccuracyat 6 semitones
AFo. For the Fo-swapped
and sameF2-5 conditions it forms part of a progressive
decline which begins at 4 semitonesAF o, while in the
sameF1conditionidentificationaccuracycollapsesat this
point to zero fia*o levels. Figure 9 showsthat the latter
drop in accuracy is due mainly to those versionsof the
3465
J. Acoust.$oc. Am., Vol. 93, No. 6, June 1993
pair also containsan across-formant
inconsistency
in F o,
which may upsetacross-formantgrouping.The identification rates for sameF2-5 stimuli follow closelythe trends
shownby the F0-swappeddata acrossall AFo's and are
also significantlypoorer than the normal data at 9 semitones AF0. This outcome suggeststhat across-formant
grouping mechanismsare being confusedby sameF2-5
stimuli in the sameway as by F0-swappedstimuli.
2. Improvement in F2 and F3 separation
In the sameF1conditionthere is a AF 0 only between
the higher formants of the vowels. Given that improvement in identificationaccuracyis mediatedby separation
processes,
it is clearfrom the resultsof the sameF1condition (see Figs. 8 and 9) that separationprocesses
which
exploitAF0'sare activein the F2-5 region.In addition,the
effectsof across-formantgrouping,discussedabove,are
indirect evidence for such processes.The maximum
sameF1 performanceoccurs at 2-4 semitonesAF 0. In
comparison,the much bigger effect mediatedby the F1
region (observedhere and in experiment 4 through the
sameF2-5 stimuli), is completeafter the introductionof
only I semitoneAF o. Thus larger AFo's are required in
order to separatethe higher formantsthan to separatethe
first formants.The needfor larger AFo'sfor separationin
higherfrequencyregionsaccordswith the resultsof Gard-
ner et al. (1989) who foundthat the •F 0 requiredto detect the presenceof a secondsource(with around 90%
reliability) was greater for a mistuned fourth formant
(=4.4
semitones) than for a mistuned second formant
( =0.6 semitones).
On the other hand, comparisonof the Fo-swappedand
sameF2-5conditionsdoesnot supporta role for AFo'sin
the F2-5 region. Figure 8 shows no trend for the Foswappedstimuli, which have Z•Fo'sin the F2-5 region, to
give progressively
superiorscoresto the sameF2-5stimuli,
which do not. Indeed, averagedacrossAF0's the F0swappedstimuli tend to give lower scores.
The apparently conflicting evidencefor and against
separationin the F2-5 region can be resolvedif one considersthe relative dominanceof the F1 region.In the F oswappedand sameF2-5 conditionsthe AF0 in the F1 re-
J.F. Cullingand C. J. Darwin:Formantgroupingof vowelsby F0
3465
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
gion has a powerfuleffect,raisingperformancemarkedly.
Any small contribution from improved F2 and F3 frequencyestimationmay be swampedby the F1 effect.
It shouldbe notedthat a separationeffectobservedin
the F2-5 regiondoesnot necessarily
imply an effectmediated by unresolvedharmonics,since for some vowels the
secondformant was at leastpartially excitedby peripherally resolvableharmonics.A recentextensionof the "ru/
li" paradigm(Darwin, 1992), in which the secondformant
wasmistunedin F0 from syllablessynthesized
with F0's of
80, 120, or 200 Hz, suggeststhat perceptualexclusionof
that formant (and hence perception of/li/, rather than
/ru/) is more dependentupon the absoluteF o of the formant than of the AF 0. The F 0 at which a categoricaltransitionoccurredwasconsistentwith the hypothesisthat the
formant had to be excitedby resolvedharmonicsfor perceptual exclusionof that formant to occur.
3. Other
effects
An unexpectedissuearose in the sameF1 condition.
Here eachvowelcombinationwasrepresented
by two stimuli at eachAFo; for one,the F1 regionof both vowelswas
excitedby 100 Hz F o, while for the other,both Fl's were
excitedby the higherF 0. In the sameF1condition,there
wasa drop in performancefor stimuliwhich had the Fl's
of bothvowelson an Fo of 100Hz+6 semitones
or 141Hz
(see Fig. 9). It seemslikely that this drop is causedby a
poordefinitionof the combinedF 1 spectralenvelope.Similar dropsin performancewere observedin the other three
conditionsas well as in the 6 semitoneAF o conditionsof
experiment3 (for normal and F0-swappedstimuli). The
poor definitionof the F1 envelopeappearsto be relatedto
the particular harmonic spacingof 141 Hz, since two
sourcesof evidenceshow that other large harmonicspacingsdo not havethe sameeffect.First, in both experiments
3 and 6, there is somerecoveryin identificationaccuracyat
9 semitonesAF o, beforethe drop at one octave,and sec-
to segregate
formantsand that neededto improvelistners'
identificationaccuracyfor doublevowelsby usingconstituent vowelswith across-formant
inconsistencies
in F 0 in
double-vowelexperiments.These inconsistencies
were designedto confuseany grouping/segregation
of formants
accordingto their Fo's.
Experiments2-6 comparedthe accuracyof identification for stimuli composedfrom vowelswith and without
across-formant
inconsistencies
in F 0 (the normaland F oswapped conditions). In either case, performanceincreasedmarkedly with increasingAF 0 up to 1 semitone,
showingthat this increasecannotbe attributableto acrossformantgroupingmechanisms,
whichoughtto be confused
by the inconsistent
F0's of the constituentvowels.AF0'sof
at least4 semitoneswererequiredbeforean across-formant
groupingeffect,in the form of slightlylower performance
in the conditionswith inconsistent
F0's,waslargeenough
to be significant,although the normal condition always
yieldedslightlyhigherscoresat eventhe smallestz•F0's.
Experiments3 and 6 showedthat whenlarger AFo'sof
6 and 9 semitonesare used,the inconsistent
F0's can disrupt performancemore, indicating that, in line with the
results of Darwin
and Gardner et al., across-formant
Experiments with simultaneous("double") vowels
showthat differences
in fundamentalfrequency(AF0's) as
grouping/separation
mechanisms
requirelarge AF0's. Experiment3 also showedthat the doublevowelswere not
recognizableusing either the first-formantregion or the
higherformant regionalone,and that identificationbased
eitheron theseisolatedregions,or uponthe phoneticlabels
derivedindependently
from bothregions,doesnot improve
with increasingAF 0. So, althoughacross-formant
grouping by F 0 has only a minor role in double-vowelexperiments, informationfrom the first-formantregion and the
higher formant region must be integratedin some way
prior to phoneticcategorization.
Experiment5 showedthat the role played by acrossformant groupingwas not significantlyaltered when the
presentationlevel was reduced from around 90 to 60
dB(A), but that overall performancewas significantly
poorer with the lower presentationlevel.
The powerfuleffectof smallAF0'son the identification
of doublevowelswasinvestigatedin experiments4 and 6.
Sinceacross-formant
groupingby F 0 has little effectupon
identification
ratesfor low AF0's,it waspossibleto construct stimuli which possessed
z•dr0'sonly in the firstformantregionor only in the higherformantregion,without inappropriate formant grouping disrupting
performance.
Theseexperiments
showedthat z•dr0's
in the
first-formantregionwere requiredfor large improvements
smallas ¬ semitone(1.5%) makethe two vowelsmuch
in identification, and that AFo's in the higher formant re-
ond, Assmann (1992) has found that the identification
accuracyfor double-vowelswhich have a baselineF o of
200 Hz is very similar to that for doublevowelswhich have
a 100-Hz baseline.These resultsshow that widely spaced
harmonicsdo not in themselvesdisrupt listeners'identification accuracy.
VIII. GENERAL
DISCUSSION
easierto identify ($cheffers,1983; Zwicker, 1984; Chalikia
and Bregman, 1989; Assmann and Summerfield, 1990).
This effecthasbeenattributedto mechanismswhich group
gion producedsmallereffects,which requiredlarger zXF0's.
So, the large effectof small AFo's in double-vowelexperimentsmust be attributedchieflyto mechanismsoperating
parts of the soundwhich have the sameF o and segregate in the first-formantregion.This patternof resultsis compatiblewith Houtsmaand Smurzynski'sdata on pitch difpartswhichhavedifferentFo's.However,the limitednumference limens (Houtsma and Smurzynski, 1990). They
ber of experimentswhich have demonstratedthat the auditory systemsegregates
the formantsof speechsoundson
found that the pitch differencelimen grew sharply when
the basisof commonF o have requiredmuch larger z•dro'S the stimuli contained only harmonics above the 10th,
which tend not to be resolvedby the peripheralauditory
(Darwin, 1981; Gardner et al., 1989). The presentstudy
investigatedthe mismatchbetweenthe sizeof AF o needed system.Listeners'ability to identifythe F 0 which excitesa
3466
J. Acoust.Soc. Am., Vol. 93, No. 6, June 1993
J.F. Cullingand C. J. Darwin:Formantgroupingof vowelsby F'0
3466
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
particularfrequencyregionof a soundis, therefore,highly
dependenton the harmonic number of the harmonics in
Carlyon,R. P., Demany,L., and Semal,C. (1992). "Detectionof acrossfrequencydifferencesin fundamentalfrequency,"J. Acoust. Soc. Am.
that region,and in a double-vowelexperimentthe higher
formantsaremuchmorelikelyto beexcitedby highnum-
Chalikia,M. H., andBregman,
A. S. (1989). "The perceptual
separation
of simultaneous
auditorysignals:Pulsetrain segregation
andvowelseg-
bered harmonics than the first formant. Hence, when the
91, 279-292.
regation,"Percept.Psychophys.46, 487-496.
AF0 is small, listenersmay be able to determinethe two
F0'sin the first-formant
region,and sogroupcomponents
of commonF 0 within that region,but they are poorerat
determiningthe œ0'swhich excite the higher formants.
When the F0's in the higher formant regionare not determined,listenersare able neitherto groupthe components
which excite the higher formants, nor to group the formantsthemselveson the basisof commonF o.
Culling,J. F. (1991). "The perceptual
separation
of concurrentvowels,"
Ph.D. thesis,SussexUniversity.
Cutting,J. E. (1976). "Auditory and linguisticprocesses
in speechperception:inferences
from sixfusionsin dichoticlistening,"Psychol.Rev.
IX. CONCLUSIONS
Gardner, R. B., Gaskill, S. A., and Darwin, C. J. (1989). "Perceptual
groupingof formantswith staticand dynamicdifferences
in fundamental frequency,"J. Acoust.Soc.Am. 85, 1329-1337.
Houtsma, A. J. M., and Smurzynski,J. (1990). "Pitch identificationand
discriminationfor complextoneswith many harmonics,"J. Acoust.
83, 114-140.
Darwin, C. J. (1981). "Perceptualgroupingof speechcomponentsdiffering in fundamentalfrequencyand onset-time,"Q. J. Exp. Psychol.A
33, 185-207.
Darwin, C. J. (1992). "Listeningto two thingsat once,"in The/luditory
Processing
of Speech,editedby M. E. H. Schouten(Mouton & Gruyer,
Berlin ).
The resultsof thesesix experiments
showa fairly consistentoverallpattern.
( 1) At eventhesmallest
AF0'sused({ semitone)
there
Soc. Am. 87, 304-310.
is a powerfuleffecton identificationaccuracymediatedby
Klatt, D. H. (1980). "Softwarefor a cascade/parallelformant synthethe FI region, which accountsfor most of the 25% insizer," J. Acoust. Soc. Am. 67, 838-844.
Meddis, R., and Hewitt, M. J. (1992). "Modeling the identificationof
creasein performanceover the first semitoneof AF 0.
concurrentvowelswith differentfundamentalfrequencies,"J. Acoust.
(2) There is little evidenceeffectsof AF0'sin the F2-5
Soc. Am. 91, 233-245.
regionat the smallestAF0's. Firmer evidenceof its smaller
Miller, I. M., and Sachs,M. B. (1984). "Representationof voicepitch in
contributionis only visibleat AF0'sof 2--4 semitones.
dischargepatternsof auditory-nervefibers,"Hear. Res. 14, 257-279.
(3) Across-formantgroupingeffectsalso show little
Moore, B. J. C., and Glasberg,B. R. (1983). "Suggested
formulaefor
signof emergingat low AF0's,regardless
of presentation calculating auditory-filter bandwidthsand excitation patterns," J.
Acoust. Soc. Am. 74, 750-753.
level,and only emergestronglyfor AF0'sof at leastfour
Parsons,T. W. (1976). "Separationof speechfrom interferingspeechby
semitones.
means of harmonic selection," J. Acoust. Soc. Am. 60, 911-918.
ACKNOWLEDGMENTS
We wishto thank QuentinSummerfield,David Bailey,
and Ray Meddisfor readingand providingvaluablecomment on variousversionsof this manuscript.The SERC
providedthe laboratorywith equipmenton variousgrants
and supportedJohnCulling with a researchstudentship.
Assmann, P. F. (1992). Personal communication.
Assmann,P. F., and Summerfield,Q. (1990). "Modelingthe perception
of concurrentvowels:Vowelswith differentfundamentalfrequencies,"
J. Acoust. Soc. Am. 88, 680-697.
Boothroyd,A., and Nittrouer, S. (1988). "Mathematicaltreatmentof
contexteffectsin phonemeand word recognition,"J. Acoust.Soc.Am.
84, 101-114.
Broadbent,D. E., and Ladefoged,P. (1957). "On the fusion of sounds
reachingdifferentsenseorgans,"J. Acoust.Soc.Am. 29, 708-710.
Brokx, J.P.
L., and Nooteboom, S. G. (1982). "Intonation and the
perceptualseparationof simultaneousvoices,"J. Phon. 10, 23-36.
3467
J. Acoust.Soc. Am., Vol. 93, No. 6, June 1993
Peterson,G. E., and Barney, H. L. (1982). "Control methodsusedin a
study of vowels," J. Acoust. Soc. Am. 24, 175-184.
Scheffers,
M. T. M. (1982). "The roleof pitchin perceptualseparationof
simultaneous
vowelsII," IPO Ann. Prog.Rep. 17, 41-45.
Scheffers,M. T. M. (1983). "Sifting vowels:Auditory pitch analysisand
soundsegregation,"
Ph.D. thesis,Gronigen.
Schouten,J. F. (1940). "The residueand the mechanismof heating,"
Proc. Kon. Ned. Akad. Wetensch. 41, 1086-1093.
Stubbs,R. J., and Summerfield,Q. (1990). "Algorithms for separating
the speechof interferingtalkers:Evaluationswith voicedsentences,
and
normal-hearingand hearing-impairedlisteners,"J. Acoust. Soc. Am.
87, 359-372.
Summerfield,Q., and Assmann,P. F. (1991). "Perceptualseparationof
concurrentvowels: Effects of pitch pulse asynchronyand harmonic
misalignment,"J. Acoust.Soc.Am. 89, 1364-1377.
Summerfield,Q. (1992). Personalcommunication.
Weintraub,M. (1985). "A theory and computationalmodelof auditory
monaural sound separation,"Ph.D. thesis,Stanford Univ., Stanford,
CA.
Zwicker, U. T. (1984). "Auditory recognitionof diotic and dichotic
vowel pairs," SpeechCommun. 3, 265-277.
J.F. Cullingand C. J. Darwin:Formant groupingof vowels by Fo
3467
wnloaded 12 Oct 2010 to 131.251.143.145. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms
© Copyright 2026 Paperzz