The phonological grammar is probabilistic: New evidence pitting abstract representation against analogy
Claire Moore-Cantwell
Yale University
October 9, 2015
Introduction
Speakers extend probabilistic trends in their lexicons to new words
Example: Initial stress in English
a majority of 2-syllable words have initial stress (about 75%)
but stable exceptions are plentiful: guitár, garáge, devíce
English speakers prefer initial stress in novel words (Guion
et al., 2003)
Probabilistically:
They sometimes produce finally-stressed nonwords as well
The rate of initial stress can be influenced by other factors
- Part of speech
- Syllable weight
What is the cognitive mechanism that underlies this
ability?
Introduction
Speakers extend probabilistic trends in their lexicons to new words
Example: Dutch voicing alternations (Ernestus and Baayen, 2003)
[vɛrʋɛidən], [vɛrʋɛitən] → [vɛrʋɛit]
[Figure: % voicing by obstruent pair (p/b, t/d, s/z, f/v, x/ɣ), in two panels: Lexicon and Production]
Similar results: Hayes et al. (2009); Becker et al. (2011); Zuraw (2000,
2010) and many others
Introduction
Speakers extend probabilistic trends in their lexicons to new words
They ‘probability match’
Rather than categorically choosing the most common pattern
Grammar contains probabilistic generalizations?
Represents not just what to do, but also how often to do it
Or are these trends represented some other way?
Analogy to existing items
Statistical learning: Cognitively general mechanism
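The difference between probability matching and categorically choosing the most common pattern can be sketched as a small simulation. This is an illustration only: the 0.75 rate is the approximate lexical rate of initial stress in 2-syllable words mentioned earlier, and the function names, trial count, and fixed seed are assumptions made for the example.

```python
import random


def categorical_choice(p_initial: float) -> str:
    """Always pick whichever pattern is more common in the lexicon."""
    return "initial" if p_initial >= 0.5 else "final"


def probability_match(p_initial: float, rng: random.Random) -> str:
    """Pick each pattern at the rate it occurs in the lexicon."""
    return "initial" if rng.random() < p_initial else "final"


rng = random.Random(0)   # fixed seed, chosen arbitrarily for repeatability
p_lexicon = 0.75         # approximate lexical rate of initial stress (assumed input)
n_trials = 10_000        # hypothetical number of simulated productions

matched = [probability_match(p_lexicon, rng) for _ in range(n_trials)]
categorical = [categorical_choice(p_lexicon) for _ in range(n_trials)]

rate_matched = matched.count("initial") / n_trials
rate_categorical = categorical.count("initial") / n_trials

print(f"probability matching: {rate_matched:.2f} initial")
print(f"categorical choice:   {rate_categorical:.2f} initial")
```

A probability-matching speaker ends up near the lexical rate, while a categorical speaker produces the majority pattern every time.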
Introduction
1 Case of probability matching in the English stress system
2 Is analogy happening?
Use nonwords with no near lexical neighbors
Ask participants to provide potential analogical bases
Compare: Stress of analogical base to produced stress
Guion et al. (2003): Effects of analogical base AND
phonological generalizations
Analogy
How do you choose what to analogize to?
Randomly choose a word
No guarantee that your word will have the necessary properties
Use the entire lexicon
Divide the lexicon up into categories; choose the one where all
the words match your nonword in some relevant way (Skousen,
1989)
Calculate the phonetic similarity between your nonword and
each actual word (Nakisa et al., 2001)
Choose a word based on similarity
Lookup words using feature(s) of the nonword
Use Lexical access mechanism?
e.g. TRACE (McClelland and Elman, 1986)
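The similarity-based option above can be sketched as follows. This is a minimal illustration, not the measure used by Nakisa et al. (2001) or by TRACE: plain edit distance over segments stands in for phonetic similarity, and the three-word lexicon with its rough transcriptions is hypothetical.

```python
def edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance between two segment sequences."""
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        curr = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # delete seg_a
                            curr[j - 1] + 1,      # insert seg_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_base(nonword: list[str], lexicon: dict[str, list[str]]) -> str:
    """Return the real word whose segments are nearest to the nonword."""
    return min(lexicon, key=lambda word: edit_distance(nonword, lexicon[word]))


# Hypothetical toy lexicon with rough, illustrative transcriptions.
toy_lexicon = {
    "banana": ["b", "ə", "n", "æ", "n", "ə"],
    "canopy": ["k", "æ", "n", "ə", "p", "i"],
    "monarchy": ["m", "ɑ", "n", "ɚ", "k", "i"],
}
nonword = ["b", "ə", "m", "æ", "k", "i"]

base = closest_base(nonword, toy_lexicon)
print(base, edit_distance(nonword, toy_lexicon[base]))
```

A feature-weighted distance would be closer to a genuinely phonetic measure; unweighted edit distance is used here only to keep the sketch short.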
English Stress
Chomsky and Halle (1968); Halle and Vergnaud (1987):
‘Latin Stress Rule’
(A) Stress a heavy penultimate syllable (amálgam)
Very few exceptions in the lexicon (galaxy, character)
Obeyed in speakers’ productions (Domahs et al., 2014;
Olejarczuk, 2014)
(B) else stress antepenult (cánopy)
Exceptions abound (vanílla, banána, spaghétti, canáry . . . )
(Pater, 1994)
Not obeyed in speakers’ productions (Domahs et al., 2014)
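Clauses (A) and (B) can be written as a short function. This is a sketch under stated assumptions: syllable weights are supplied by hand as 'H' (heavy) or 'L' (light), the function name is hypothetical, and the weight sequences in the usage lines are illustrative rather than taken from the corpus.

```python
def latin_stress_rule(weights: list[str]) -> str:
    """Predict the stressed syllable of a word with three or more syllables.

    weights: one 'H' (heavy) or 'L' (light) per syllable, left to right.
    """
    if len(weights) < 3:
        raise ValueError("this sketch only covers words of 3+ syllables")
    penult = weights[-2]
    # (A) stress a heavy penult; (B) otherwise stress the antepenult
    return "penult" if penult == "H" else "antepenult"


# Hand-assigned, illustrative weights (assumptions for the example).
amalgam = ["L", "H", "H"]   # heavy penult
canopy = ["L", "L", "L"]    # light penult
vanilla = ["L", "L", "L"]   # light penult, but actually stressed on the penult

print(latin_stress_rule(amalgam))  # penult
print(latin_stress_rule(canopy))   # antepenult
print(latin_stress_rule(vanilla))  # antepenult
```

The last call shows why clause (B) has exceptions: the rule predicts antepenultimate stress for a light-penult word like vanilla, which in fact has penultimate stress.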
English Stress
Corpus search:
Corpus: CMU pronouncing dictionary (Weide, 1994)
Frequency threshold: SubtLex
(Brysbaert and New, 2009)
All words 3+ syllables
Automatic annotation: syllable structure, vowel qualities,
stress pattern
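The stress-pattern part of the annotation can be sketched from the dictionary's format, in which each vowel phone carries a stress digit (0, 1, or 2). This is not the script used for the corpus search: the function name is hypothetical, and the three sample entries are illustrative lines written in the dictionary's format.

```python
def stress_position(pronunciation: str) -> str:
    """Locate primary stress, counting syllables from the end of the word."""
    phones = pronunciation.split()
    # Vowel phones end in a stress digit; consonants do not.
    stresses = [ph[-1] for ph in phones if ph[-1].isdigit()]
    if "1" not in stresses:
        return "none"
    from_end = len(stresses) - stresses.index("1")
    names = {1: "final", 2: "penult", 3: "antepenult"}
    return names.get(from_end, "pre-antepenult")


# Illustrative entries in the dictionary's "WORD  PHONES" layout.
sample_entries = [
    "CANOPY  K AE1 N AH0 P IY0",
    "BANANA  B AH0 N AE1 N AH0",
    "GUITAR  G IH0 T AA1 R",
]

annotated = {}
for entry in sample_entries:
    word, pronunciation = entry.split(maxsplit=1)
    n_syllables = sum(ph[-1].isdigit() for ph in pronunciation.split())
    if n_syllables >= 3:  # keep only words of 3+ syllables
        annotated[word] = stress_position(pronunciation)

print(annotated)
```

Syllable weight and vowel quality would need a syllabifier on top of this; only the stress pattern falls out of the digits directly.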
English Stress
Chomsky and Halle (1968); Halle and Vergnaud (1987): ‘Latin Stress Rule’
H: CVV, CVC*
L: CV
Heavy penult: aróma, bonánza
Light penult: tobóggan, elícit
[Figure: Antepenult vs. Penult stress by penult weight (L, H)]
English Stress
Stress is partially conditioned by the final vowel
If [ə]-final, no preference
If [i]-final, then Antepenultimate
Final [i] drives stress leftward
(Hayes, 1982; Liberman and Prince, 1977)
Lexicon: light penults
Final vowel   Antepenult   % Antepenult
-ə            689          57%
-i            792          96%
total: 2035
Strategy
(1) Does the i-final trend generalize to nonwords?
(2) Do speakers use analogy to do so?
Methods very similar to Guion et al. (2003)
Part 1: wug test
Part 2: same nonce words again, this time fill-in-the-blank
What real word does it remind you of?
Web-based experiment using Amazon Mechanical Turk
Methods
Wug test
Isolated syllables presented auditorily: [bǽ] [mǽ] [kí]
Participants speak the word ‘fluently’
Both stress options presented: [bəmǽki], [bǽməki]
Participants choose one
→ Forced choice as proxy for production
Methods
Getting potential analogical bases
Isolated syllables presented again: [bǽ] [mǽ] [kí]
‘What English word does the sequence of syllables remind you of?’
Participants filled in a blank
→ Word most likely to serve as analogical base
Methods
Details:
48 Participants recruited through Amazon Mechanical Turk
Presented using Experigen (Becker and Levine) plus a plugin
for recording over the web
32 nonword items, 8 real word fillers
Nonwords selected to have very low neighborhood density under the
Generalized Neighborhood Model (Bailey and Hahn, 2001): GNM value < 0.01
20 minutes total
Results
General:
Most participants succeeded at the production task
Produced e.g. [bǽməki] not [bǽmǽkí]
Chose the sound file that corresponded to their production
→ Can trust forced choice data
Analogical base task was harder
Provided an actual word about 58% of the time
Rest of the time: transcribed the nonword
or gave no answer
Results
Results of production task
Forced choice responses
Final vowel   Antepenult   % Antepenult
-ə            474          58%
-i            695          77%
total: 1728
ə-final: Equal
i-final: More Antepenult
Results
Compare
              Forced choice responses      Lexicon: light penults
Final vowel   Antepenult   % Antepenult    Antepenult   % Antepenult
-ə            474          58%             689          57%
-i            695          77%             792          96%
total         1728                         2035
Results
Properties of analogical bases:
Favored 3-syllable words
Number of syllables   1     2     3     4    5
Responses             194   221   411   58   3
Share                 22%   25%   53% (3 or more syllables)
Matched final vowel 91% of the time
Results
Properties of analogical bases:
Chosen Bases
Final vowel   Antepenult   % Antepenult
-ə            46           43%
-i            126          79%
total: 266
More Antepenult in i-final bases
Results
Does base stress predict produced stress?
[Figure: produced stress (Antepenult vs. Penult) by stress of chosen base (Antepenult vs. Penult), in two panels: i-final and ə-final. Cell labels: 121 (85%) and 62 (78%) for i-final; 52 (58%) and 69 (52%) for ə-final. Panel totals: 222 and 223]
Results
Does base stress predict produced stress?
Logistic regression with two factors:
Model: Produced Stress ~ Final Vowel + Analogical Base Stress
                                   Estimate   p
Intercept                          -0.54      0.02
Final Vowel = i                    -1.22      0.0001
Analogical Base Stress = Penult.    0.42      0.20
AIC: 290
remove:                  change in AIC   Likelihood ratio   p
Final Vowel              +13             15.66              0.0001
Analogical Base Stress   0               1.7                0.20
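The model-comparison figures can be checked by hand. The sketch below assumes that each removed factor costs one parameter, so that the likelihood ratio is compared against a chi-square distribution with 1 degree of freedom; the function names are hypothetical, and this is a worked check rather than the original analysis script.

```python
import math


def lrt_p_value_df1(likelihood_ratio: float) -> float:
    """p-value of a likelihood-ratio statistic under chi-square with 1 df."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(likelihood_ratio / 2.0))


def delta_aic(likelihood_ratio: float, n_params_removed: int = 1) -> float:
    """Change in AIC when a term is removed: LR - 2 * (parameters removed)."""
    return likelihood_ratio - 2.0 * n_params_removed


p_final_vowel = lrt_p_value_df1(15.66)   # removing Final Vowel
p_base_stress = lrt_p_value_df1(1.7)     # removing Analogical Base Stress

print(f"Final Vowel:            p = {p_final_vowel:.5f}, dAIC = {delta_aic(15.66):+.2f}")
print(f"Analogical Base Stress: p = {p_base_stress:.2f}, dAIC = {delta_aic(1.7):+.2f}")
```

Under these assumptions, dropping Final Vowel raises AIC by about 13.7 with p below 0.001, and dropping Analogical Base Stress changes AIC by about -0.3 with p near 0.19. Both are in rough agreement with the reported values, given that the likelihood ratios are rounded.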
Results
What if participants access a different real word each time they hear the nonword stimulus?
But they’re still using analogy
What behavior is predicted for each nonword based on the set of nearby real words?
Stimulus: [rɛ vɛ si]
Analogical Base   légacy  lívery  prívacy  régistry  rémedy  revéal  receive
no. Responses     1       1       1        1         1       1       1
83% Antepenult, 17% Penult
Stimulus: [sɛ fɛ ni]
Analogical Base   sýmphony  fámily  sésame  safári  sapphire  say  save
no. Responses     8         1       1       1       1         1    1
91% Antepenult, 9% Penult
Results
What if participants access a different real word each time they hear the nonword stimulus?
But they’re still using analogy
What behavior is predicted for each nonword based on the set of nearby real words?
Stimulus: [rɛ vɛ sə]
Analogical Base   revísion  revérsal  revise  rabbit  vista  vivid
no. Responses     5         1         2       1       1      1
0% Antepenult, 100% Penult
Stimulus: [sɛ fɛ nə]
Analogical Base   sýmphony  savánna  secondary  seven  saffron  safe
no. Responses     2         2        1          1      1        1
50% Antepenult, 50% Penult
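The per-item prediction can be sketched as the share of antepenultimate stress among the bases given for a stimulus. The sketch assumes that only bases with a marked stress position are counted; that assumption reproduces the 91% and 50% figures for the two [sɛ fɛ]-initial stimuli, but whether it was the exact procedure is not stated. The function and variable names are hypothetical.

```python
def predicted_antepenult_rate(bases: dict[str, tuple[str, int]]) -> float:
    """Share of stress-classified base responses with antepenultimate stress."""
    total = sum(count for _, count in bases.values())
    antepenult = sum(count for stress, count in bases.values()
                     if stress == "antepenult")
    return antepenult / total


# Stress-marked bases and response counts for two stimuli.
sefeni_bases = {
    "symphony": ("antepenult", 8),
    "family": ("antepenult", 1),
    "sesame": ("antepenult", 1),
    "safari": ("penult", 1),
}
sefena_bases = {
    "symphony": ("antepenult", 2),
    "savanna": ("penult", 2),
}

rate_i = predicted_antepenult_rate(sefeni_bases)
rate_schwa = predicted_antepenult_rate(sefena_bases)
print(f"{rate_i:.0%}")      # 91%
print(f"{rate_schwa:.0%}")  # 50%
```

If analogy were driving production, these per-item rates should track the rate of antepenultimate stress participants actually produce for each nonword.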
Results
[Figure: % Antepenultimate stress by item. x-axis: % bases antepenultimate; y-axis: % antepenultimate in production; points marked by final vowel (-i, -ə)]
Results
Participants ‘probability matched’ antepenultimate stress on i-final words
They also observe this trend in their choice of analogical bases
→ But the stress of the base does not predict stress in production
Participants’ probability matching seems not to be the result of analogy to existing items
Conclusions
Analogy is not responsible for the productivity of the i-final
trend
Previous studies (Guion et al., 2003; Baker and Smith, 1976) showed effects of BOTH analogy and abstract generalization
→ Used words with richer neighborhoods, in some cases near neighbors (cinempa)
Here: no effect of analogy at all
Nonwords were very far from any actual word
Speakers can extend the i-final trend to nonwords even when analogy is difficult
→ Abstract representation of the i-final trend
Thank You
Individual Subjects
d' = Z(% Initial, i-final) - Z(% Initial, ə-final)
Lexical values: i-final: 88% Initial; ə-final: 54% Initial
Experiment: i-final: 77% Initial; ə-final: 57% Initial
[Figure: histograms of participants by % Initial stress (i-final vs. ə-final) and by d', with the Lexicon value marked]
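The d' measure can be computed from two proportions with the inverse normal CDF. The sketch below uses the aggregate percentages reported for the lexicon and the experiment; the function name is hypothetical, and per-participant values (which the histogram plots) are not available here.

```python
from statistics import NormalDist


def d_prime(p_i_final: float, p_schwa_final: float) -> float:
    """d' = Z(% Initial, i-final) - Z(% Initial, schwa-final)."""
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return z(p_i_final) - z(p_schwa_final)


d_lexicon = d_prime(0.88, 0.54)      # lexical values
d_experiment = d_prime(0.77, 0.57)   # experiment, pooled over participants

print(f"lexicon d':    {d_lexicon:.2f}")
print(f"experiment d': {d_experiment:.2f}")
```

`inv_cdf` is undefined at 0 and 1, so a participant who chose initial stress on none or all of the items in a condition would need a correction (for example, nudging the proportion slightly inward) before d' could be computed.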
Morphology?
Morphologically complex
Final vowel   Antepenult   % Antepenult
-ə            165          64%
-i            201          89%
total: 483
Morphologically simple
Final vowel   Antepenult   % Antepenult
-ə            524          55%
-i            591          98%
total: 1552
Introduction
Categorical phonology: Grammar
Inexorably applies to new words
Regardless of similarity to actual words (Prasada and Pinker, 1993)
Speakers cannot veridically perceive violations: [dla] → ‘gla’
(Moreton, 2002; Breen et al., 2013)
Hard to un-learn
Learning the sound pattern of a second language is not simply
a matter of learning the words
Experimental cases: (Finn and Kam, 2008; Whalen and Dell,
2006)
Limited range of possible patterns
Some categorical patterns are common: Antepenultimate stress
Others surprisingly rare: Post-peninitial stress