Speaker- and group-specific information in formant dynamics: a forensic perspective
Vincent Hughes
Paul Foulkes
LabPhon 15 Satellite
Speech dynamics, social meaning, and phonological categories
13th July 2016
e.g. Roberts (2012), Rhodes (2012), Harrison (2013)
Outline
1. the forensic problem
2. formant dynamics in forensics
3. research questions
4. method
5. experiments:
– speaker discrimination
– group discrimination
6. discussion
1. The forensic problem
• forensic voice comparison (FVC): unknown offender vs. known suspect
1. The forensic problem
FVC
– defence (innocent)
– prosecution (guilty)
1. The forensic problem
• properties of ideal features:
– high between-speaker variability
– low within-speaker variability
– resistance to disguise
– robustness in transmission
– measurability
– availability
from Nolan (1983)
2. Formant dynamics in forensics
• commonly used in forensics for the last 20 years
– starting with… Greisbach et al. (1995)
– McDougall (2006)
• value of parametric representations
• polynomials better than raw Hz input
– Morrison (2009)
• comparison of different parametric representations
2. Formant dynamics in forensics
why dynamics?
• targets = learned by speech community
• transitions = “acquired… by trial and error”
• “speakers' ‘vocal signatures’ lie in the rapid, transitional movements of the speech organs between sounds”
from Nolan (1997) / McDougall (2004)
2. Formant dynamics in forensics
• so… phonology is all about targets?
– speech (language, contrast etc.) vs. speaker: Mokhtari (1998)
– group (e.g. targets) vs. individual (e.g. transitions): Garvin & Ladefoged (1963)
2. Formant dynamics in forensics
• but… inconsistent with e.g. usage-based models?
– any element of phonetic/phonological structure can be learned & represented cognitively
– thus potential for transitions to carry ‘group’ information
• formant dynamics increasingly used to explore group patterns in sociophonetics
3. Research questions
• to what extent is speaker- and group-specific information encoded in the dynamics of formant trajectories?
– implications for models of phonology
– value of the forensic perspective
4. Method
variable
• PRICE /aɪ/
– subject of considerable analysis in forensics
– covers a wide range of the vowel space
• potential for considerable formant movement across the duration of the vowel
4. Method: datasets
(1) Standard Southern British English (SSBE)
– DyViS corpus (Nolan et al. 2009)
– 97 male speakers
– 18-25 years
– mock police interview (map task)
4. Method: datasets
(2) Newcastle (Milroy et al. 1994-97)
(3) Manchester (Haddican et al. 2013)
(4) Derby (Milroy et al. 1994-97)
– 8 male speakers
– 18-31 years
– sociolinguistic interviews in peer-group pairs
4. Method
dynamics
• c. 10 tokens/speaker
• measurements at +10% steps
• pre-testing for optimal fit
– cubic polynomials
• 4 coefficients/formant
statics
• +20% & +80% Hz values/formant
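The dynamic and static representations described above can be sketched in a few lines. This is a minimal illustration rather than the authors' procedure: it assumes nine measurements per formant at 10% to 90% of normalised vowel duration (the slides only say "+10% steps"), and the names `fit_cubic` and `static_values` are hypothetical. A plain least-squares fit via the normal equations gives the 4 coefficients per formant.

```python
def fit_cubic(times, values):
    """Least-squares cubic fit.

    Returns [c0, c1, c2, c3] for c0 + c1*t + c2*t**2 + c3*t**3.
    """
    n = 4  # a cubic has 4 coefficients
    # Normal equations A c = b: A[i][j] = sum(t^(i+j)), b[i] = sum(y * t^i)
    A = [[sum(t ** (i + j) for t in times) for j in range(n)] for i in range(n)]
    b = [sum(y * t ** i for t, y in zip(times, values)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / A[i][i]
    return coeffs


def static_values(values):
    """Return the +20% and +80% measurements (indices 1 and 7 of nine points)."""
    return values[1], values[7]


# Invented F2 values (Hz) for one PRICE token, for illustration only.
times = [i / 10 for i in range(1, 10)]  # assumed 10%..90% of vowel duration
f2 = [1150, 1210, 1290, 1390, 1500, 1610, 1700, 1770, 1810]
coefficients = fit_cubic(times, f2)  # dynamic representation: 4 numbers
f2_static = static_values(f2)        # static representation: 2 numbers
```

The dynamic representation summarises the whole trajectory in four numbers, while the static one keeps only two points near the onset and offset.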
5. Results: speaker discrimination
• SSBE speakers:
– 20 test speakers
– 57 reference speakers
• same- (SS) & different-speaker (DS) comparisons
• likelihood ratios (LRs) used for discrimination
$$\mathrm{LR} = \frac{p(E \mid H_p)}{p(E \mid H_d)}$$
p = probability
E = evidence
| = ‘given’
H_p = prosecution hypothesis
H_d = defence hypothesis
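As a toy illustration of the likelihood ratio, the sketch below scores a single scalar measurement against two univariate Gaussian models. This is an assumption made for clarity: the slides do not say how p(E|Hp) and p(E|Hd) were modelled, and the function name `log10_lr` and all numbers are invented.

```python
import math
from statistics import NormalDist


def log10_lr(evidence, suspect_model, population_model):
    """log10 of p(E|Hp) / p(E|Hd) for one scalar measurement."""
    p_hp = suspect_model.pdf(evidence)     # p(E | Hp): density under the suspect model
    p_hd = population_model.pdf(evidence)  # p(E | Hd): density under the reference population
    return math.log10(p_hp / p_hd)


# Invented models: the suspect varies little, the population varies a lot.
suspect = NormalDist(mu=1500.0, sigma=50.0)
population = NormalDist(mu=1400.0, sigma=150.0)

typical_of_suspect = log10_lr(1500.0, suspect, population)    # positive: supports Hp
atypical_of_suspect = log10_lr(1200.0, suspect, population)   # negative: supports Hd
```

A value that is typical of the suspect but unremarkable in the population yields a positive log10 LR. A value far from the suspect's distribution yields a negative one.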
5. Results: speaker discrimination
• output = log10 LRs:
– centered on 0 (no evidence)
– >0 = support for prosecution
– <0 = support for defence
• error metrics:
– equal error rate (EER)
– log LR cost function (Cllr)
Closer to 0, the better the performance
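Both error metrics can be computed directly from the log10 LRs of the SS and DS comparisons. The sketch below uses the commonly cited definition of Cllr, which averages log2 penalties over each comparison type, and a simple threshold sweep for EER. The function names are hypothetical, and the EER here is a coarse approximation with no interpolation between thresholds.

```python
import math


def cllr(ss_log10_lrs, ds_log10_lrs):
    """Log LR cost: penalises SS comparisons with low LRs and DS comparisons with high LRs."""
    ss_penalty = sum(math.log2(1 + 10 ** (-x)) for x in ss_log10_lrs) / len(ss_log10_lrs)
    ds_penalty = sum(math.log2(1 + 10 ** x) for x in ds_log10_lrs) / len(ds_log10_lrs)
    return 0.5 * (ss_penalty + ds_penalty)


def eer(ss_log10_lrs, ds_log10_lrs):
    """Approximate equal error rate: the point where miss and false-alarm rates are closest."""
    best = None
    for threshold in sorted(set(ss_log10_lrs) | set(ds_log10_lrs)):
        # SS comparison scored below the threshold = miss
        miss = sum(x < threshold for x in ss_log10_lrs) / len(ss_log10_lrs)
        # DS comparison scored at or above the threshold = false alarm
        false_alarm = sum(x >= threshold for x in ds_log10_lrs) / len(ds_log10_lrs)
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, (miss + false_alarm) / 2)
    return best[1]
```

A system that always outputs log10 LR = 0 (no evidence either way) has Cllr = 1, which is why values closer to 0 indicate better performance.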
5. Results: speaker discrimination
[Figure: two panels, Static and Dynamic, each plotting EER (%) (y-axis, 0 to 35) against Log LR Cost (Cllr) (x-axis, 0.0 to 1.0) for F1-only, F2-only, F3-only, and F1, F2 and F3 combined]
5. Results: group discrimination
• predicting regional background
• cross-validated discriminant analysis:
– each token assigned to 1 of 4 regional groups
– models built on all data excluding target token
• generates classification rate based on posterior probability
• chance = 25% (1/4)
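The leave-one-out logic described above can be sketched as follows. A nearest-centroid classifier stands in for discriminant analysis here, purely to keep the example short. It is a deliberate simplification, not the method used in the study, and it picks the closest group mean rather than computing posterior probabilities. The function name and data layout are hypothetical.

```python
import math


def nearest_centroid_loo(tokens):
    """Leave-one-out classification rate.

    tokens: list of (group_label, feature_vector) pairs, e.g. the cubic
    coefficients of one formant for one vowel token.
    """
    correct = 0
    for i, (true_group, features) in enumerate(tokens):
        # Build the model on all data excluding the target token
        training = tokens[:i] + tokens[i + 1:]
        sums, counts = {}, {}
        for group, vec in training:
            acc = sums.setdefault(group, [0.0] * len(vec))
            for k, v in enumerate(vec):
                acc[k] += v
            counts[group] = counts.get(group, 0) + 1
        centroids = {g: [s / counts[g] for s in sums[g]] for g in sums}
        # Assign the held-out token to the group with the closest mean
        predicted = min(centroids, key=lambda g: math.dist(features, centroids[g]))
        correct += predicted == true_group
    return correct / len(tokens)
```

With four regional groups, a rate near 0.25 would indicate chance-level performance.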
5. Results: group discrimination
Formant   Classification rate
F1        63.8%
F2        64.7%
F3        40.6%
6. Discussion
speaker discrimination
• formant dynamics contain considerable speaker-specific information:
– better performance than static values
• higher formants = greater speaker discriminatory power
– speech-speaker dichotomy (Mokhtari 1998)
– F1~F2 responsible for contrast
6. Discussion
group discrimination
• group-specific information isn’t all about targets
– individual cubic coefficients capable of predicting regional background above chance
– all coefficients in combination outperform any one in isolation
• so… fine-grained phonetics clearly shared across speech communities
6. Discussion
• results challenge underlying phonological model for formant dynamics
– groups = not all about targets
– individuals = not all about transitions
• need to rethink the dichotomies:
– speech-speaker (Mokhtari 1998)
– group-individual (Garvin & Ladefoged 1963)
– maybe it’s about continua?
7. Conclusion
• formant dynamics capable of encoding both speaker- and group-information
– consistent with usage-based approaches?
• focus on the individual may help us better understand acquisition of variation
– therefore a role for forensics (methodological and theoretical) in understanding phonology
Thanks!
Questions?