Bayesian Model of Stress Assignment

A Bayesian Model of Stress Assignment in Reading
Olessia Jouravlev & Stephen J. Lupker
University of Western Ontario
MUSket or musKET
Models of Stress Assignment
∗ Dual-route model (Rastle & Coltheart, 2000)
∗ Connectionist model (Seva et al., 2010)
∗ CDP++ (Perry et al., 2010)
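Simulation Results: Words
[Figure: bar chart, y-axis from 0 to 100, with Trochaic Stress and Iambic Stress bars for each model: Rastle & Coltheart; Seva et al. (training set); Seva et al. (testing set); CDP++. Data labels as extracted: Trochaic Stress 99, 92, 95, 67; Iambic Stress 99, 97, 77, 88.]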
Simulation Results: Nonwords
[Figure: bar chart, y-axis from 0 to 100, with Trochaic Stress and Iambic Stress bars for each model: Rastle & Coltheart; Seva et al.; CDP++. Data labels as extracted: 93, 89, 78, 44, 42, 45.]
Probabilistic nature of human cognition
Bayesian Decision Making
A: Pneumonia
B: No Pneumonia
Prior Probabilities:
Pneumonia - .1; No Pneumonia - .9
Likelihood of Evidence (Coughing) given:
Pneumonia - .8; No Pneumonia - .2
$$P(\text{Pneumonia} \mid \text{Evidence}) = \frac{(.1)(.8)}{(.1)(.8) + (.9)(.2)} = \frac{.08}{.26} = .31$$
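The pneumonia calculation can be checked with a few lines of Python. This is a minimal sketch of Bayes' rule for two mutually exclusive hypotheses. The function name `posterior` and its parameter names are chosen here for illustration and do not come from the slides.

```python
def posterior(prior_a, like_a, prior_b, like_b):
    """Bayes' rule for two mutually exclusive, exhaustive hypotheses A and B.

    prior_*: prior probability of each hypothesis
    like_*:  probability of the evidence given each hypothesis
    Returns P(A | evidence).
    """
    joint_a = prior_a * like_a  # P(evidence and A)
    joint_b = prior_b * like_b  # P(evidence and B)
    return joint_a / (joint_a + joint_b)


# Values from the slide: priors .1 / .9, likelihoods of coughing .8 / .2
p_pneumonia = posterior(prior_a=0.1, like_a=0.8, prior_b=0.9, like_b=0.2)
print(round(p_pneumonia, 2))  # 0.31
```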
Bayesian Model of Stress
Assignment
[Diagram: the word MUSKET with two candidate stress patterns, Stress1 and Stress2.]
Prior probability of a stress pattern, P(Stress): the frequency of that stress pattern in the language.
Likelihood of evidence, P(Evidence | Stress): the probability that a piece of non-lexical evidence is present in a word, given a particular stress pattern.
$$P(\text{Stress1} \mid \text{Evidence}) = \frac{P(\text{Evidence} \mid \text{Stress1})\,P(\text{Stress1})}{P(\text{Evidence} \mid \text{Stress1})\,P(\text{Stress1}) + P(\text{Evidence} \mid \text{Stress2})\,P(\text{Stress2})}$$
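As a worked illustration of the formula above, take the priors reported for Russian disyllabic words in Study 1 (.55 for Stress1, .45 for Stress2) and assume, purely hypothetically, a cue with P(Evidence | Stress1) = .6 and P(Evidence | Stress2) = .3. These likelihood values are invented for the example and are not from the studies.

$$P(\text{Stress1} \mid \text{Evidence}) = \frac{(.6)(.55)}{(.6)(.55) + (.3)(.45)} = \frac{.33}{.465} \approx .71$$

Under these assumed values, the cue raises the belief in Stress1 from .55 to about .71.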
Is it a universal model?
∗ YES, but the prior probabilities of stress patterns and the sources of evidence for stress are language-specific.
What is evidence for stress?
∗ Non-lexical information provided by a word that cues the most probable stress pattern of the word (aka “stress cue”)
∗ A reliable stress cue is characterized by
a) high validity (there is a relationship between the cue and a stress pattern in a language)
b) high utility (readers use the presence of this relationship to assign stress)
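Can the model consider multiple cues?
∗ YES, it does so in a stepwise fashion.
Step 1, cue A:
$$P(\text{stress} \mid A) = \frac{P(A \mid \text{stress})\,P(\text{stress})}{\sum_{\text{stress}' \in \text{STRESS}} P(A \mid \text{stress}')\,P(\text{stress}')}$$
The posterior becomes the updated prior: $P(\text{stress} \mid A) = P(\text{stress})^{*}$
Step 2, cue B:
$$P(\text{stress} \mid A, B) = \frac{P(B \mid \text{stress})\,P(\text{stress})^{*}}{\sum_{\text{stress}' \in \text{STRESS}} P(B \mid \text{stress}')\,P(\text{stress}')^{*}}$$
With two stress patterns, $P(\text{stress}')^{*}$ for the alternative pattern equals $1 - P(\text{stress})^{*}$.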
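The stepwise procedure can be sketched in Python as a loop in which each posterior serves as the prior for the next cue. This is an illustrative sketch, not the authors' implementation. The function names are made up here, the priors are the .55/.45 values reported in Study 1, and the cue likelihoods are hypothetical numbers chosen only to show the mechanics.

```python
def update(prior, likelihood):
    """One Bayesian step.

    prior:      dict mapping stress pattern -> current belief P(stress)
    likelihood: dict mapping stress pattern -> P(cue | stress)
    Returns the normalized posterior as a new dict.
    """
    joint = {s: likelihood[s] * prior[s] for s in prior}
    total = sum(joint.values())  # the sum over stress' in STRESS
    return {s: joint[s] / total for s in joint}


def posterior_stepwise(prior, cue_likelihoods):
    """Apply cues one at a time; each posterior becomes the next prior."""
    belief = dict(prior)
    for likelihood in cue_likelihoods:
        belief = update(belief, likelihood)
    return belief


prior = {"trochaic": 0.55, "iambic": 0.45}  # priors from Study 1
cue_a = {"trochaic": 0.6, "iambic": 0.3}    # hypothetical P(A | stress)
cue_b = {"trochaic": 0.2, "iambic": 0.5}    # hypothetical P(B | stress)

result = posterior_stepwise(prior, [cue_a, cue_b])
print(result)
```

In this sketch the order of the cues does not change the final posterior, because each step only multiplies by a likelihood and renormalizes.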
Bayesian Model of Stress
Assignment in Russian
∗ Russian is opaque in its rules of
spelling-to-stress mapping
∗ Stress is assigned only as a result of
lexical processing (Gouskova, 2010)
Overview of Studies Completed
Prior Probabilities:
∗ Study 1: Corpus Analysis
Sources of Evidence for Stress:
∗ Study 2: Corpus Analysis (binary logistic regression)
∗ Study 3: Word Naming (linear mixed effects model)
Simulations:
∗ Study 4: Word Naming
∗ Study 5: Nonword Naming
Study 1: Prior Probabilities of
Stress Patterns in Russian
Goal: identify the frequency of Trochaic and
Iambic Stress in Russian disyllabic words
Method:
corpus analysis of 13,923 disyllabic words.
Results:
Trochaic Stress – 55%
Iambic Stress – 45%
Study 2: Binary Logistic
Regression
Goal: identify cues that have high validity
Method:
∗ DV: stress patterns in 13,943 disyllabic words
∗ IVs: (1) Grammatical Category, (2) Log Frequency, (3) Length, (4) Word Onset Complexity, (5) Word Coda Complexity, (6-11) Six Orthographic Components
Study 2: Binary Logistic
Regression: Results
∗ Stress cues that are probabilistically
associated with stress patterns in Russian
(i.e., have high validity) are:
∗ Onset Complexity
∗ Ending Complexity
∗ CVC1
∗ CVC2
∗ VC2
Study 3: Linear Mixed Effects
Model
Goal: identify cues that have high utility
Method:
∗ IVs: a set of 11 predictors (fixed factors), Subjects
and Items (random crossed factors)
∗ DV: stress pattern assigned to 500 disyllabic words
by 34 native speakers of Russian
Study 3: Linear Mixed Effects
Model: Results
∗ Stress cues that readers of Russian use
in stress assignment (i.e., have high
utility) are:
∗ CVC1
∗ CVC2
∗ VC2
Lexical Stress in Russian:
Conclusions
∗Prior Probabilities:
Stress 1 – .55; Stress 2 - .45
∗Reliable Stress Cues:
CVC1, CVC2, VC2
Study 4: Simulation of Stress
Assignment in Word Naming
Q: Can the model predict stress assignment
performance of native readers on words?
Method:
∗ Bivariate Regression, 500 Russian disyllabic words
∗ IV: Probability of Trochaic Stress computed by the model
∗ DV: Ratio of Trochaic Stress assigned by 34 readers
∗ The model significantly predicted the ratio of trochaic stress that readers assigned to words: r(498) = .76, F(1, 498) = 681.25, p < .001
Q: Can the model predict stress patterns of words
in the language?
Method:
∗ The posterior probabilities of Trochaic stress patterns
were interpreted in the following way:
∗ < .45 – Prediction is Iambic Stress
∗ > .55 – Prediction is Trochaic Stress
∗ .45 - .55 – Unclear prediction
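The decision rule above can be written as a small function. This is a minimal sketch; the function name `classify` and the label strings are illustrative choices, while the .45 and .55 cutoffs come from the slide.

```python
def classify(p_trochaic, low=0.45, high=0.55):
    """Map a posterior probability of trochaic stress to a prediction.

    Values strictly below `low` count as iambic, values strictly above
    `high` count as trochaic, and everything in between is unclear.
    """
    if p_trochaic < low:
        return "iambic"
    if p_trochaic > high:
        return "trochaic"
    return "unclear"


for p in (0.30, 0.50, 0.70):
    print(p, classify(p))
```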
Results:
∗ Stress patterns in the language predicted:
correctly: 78%
incorrectly: 16%
unclear: 6%
Q: Do readers tend to make stress errors on words
for which the model predicts the incorrect stress
pattern?
Method:
∗ Mixed Effects Model
∗ IV: Degree of Inconsistency of Prediction
= 1 – Probability of Correct Stress Pattern
∗ DV: correct stress (0) vs. incorrect stress (1)
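The predictor can be computed from the model's posterior for trochaic stress, as in the sketch below. It assumes only two stress patterns, so that P(iambic) = 1 − P(trochaic). The function name and the label strings are hypothetical.

```python
def inconsistency(p_trochaic, correct_pattern):
    """Degree of inconsistency = 1 - P(correct stress pattern).

    p_trochaic:      model posterior for trochaic stress
    correct_pattern: "trochaic" or "iambic", the word's actual stress
    Assumes two patterns, so P(iambic) = 1 - P(trochaic).
    """
    if correct_pattern == "trochaic":
        p_correct = p_trochaic
    else:
        p_correct = 1.0 - p_trochaic
    return 1.0 - p_correct


# A word the model considers strongly trochaic:
print(inconsistency(0.8, "trochaic"))  # low inconsistency
print(inconsistency(0.8, "iambic"))    # high inconsistency
```

Higher values mark words whose actual stress the model considers unlikely.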
Results: χ²(1) = 194.10, p < .001; z = 14.76, p < .001
Study 5: Simulation of Stress
Assignment in Nonword Naming
Q: Can the model predict stress assignment
performance of native readers on nonwords?
Method:
∗ Bivariate Regression, 200 disyllabic nonwords
∗ IV: Probability of Trochaic Stress computed by the
model
∗ DV: Ratio of Trochaic Stress assigned by 30 readers
∗ The model significantly predicted the ratio of trochaic stress that readers assigned to nonwords: r(198) = .87, F(1, 198) = 600.35, p < .001
Q: Can the model predict the most frequent stress
pattern being assigned to a nonword?
Results:
∗ The most frequent stress patterns predicted:
correctly: 86%
incorrectly: 8%
unclear: 6%
Conclusions
∗ A new approach to the modeling of stress assignment
∗ A reader estimates posterior probability by adjusting a
prior belief about the likelihood of a stress pattern based
on non-lexical sources of evidence for stress.
∗ The Bayesian model of stress assignment was
implemented in Russian.
∗ The model could accomplish stress assignment in Russian
disyllabic words and nonwords with a high degree of
accuracy.
Thanks
Steve Lupker’s Lab
Supervisory Committee
Jason Perry
Mark McPhedran
Jimmie Zhang
Debra Jared
Marc Joanisse
Ken McRae
Jouravlev, Lupker, & Jared,
Cross-language phonological activation:
Evidence from masked onset priming and ERPs,
Friday 12 pm poster session, 54th Psychonomic
Society Annual Meeting
Some Remarks…
∗ The selection of a response unfolds in a way
similar to a random walk
[Figure: random walk between two response boundaries, STRESS 1 and STRESS 2.]