Connectionist models of language

Thomas R. Shultz
Department of Psychology & School of Computer Science
McGill University

Overview of lecture
- Computational modeling
- Connectionism in general
- Examples applied to language
  - Semantics: English personal pronouns
  - Phonology: word stress
Why computational modeling?
- Precise, concrete, easy to manipulate
- Covers (generates) phenomena
- Explanation
- Links different observations
- Prediction
- Improvement

Connectionism in brief
- Network of units & weights
- Each unit runs a simple program:
  - Compute a weighted sum of inputs from other units
  - Output a number, a non-linear function of that weighted sum
  - Send output to other units doing the same
  - Modify weights to reduce error
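The unit program above can be sketched in a few lines of Python. This is an illustrative toy, not any particular simulator; the function name and weights are my own.

```python
import math

def unit_output(inputs, weights, bias=0.0):
    """One connectionist unit: a weighted sum of incoming activations,
    passed through a non-linear (sigmoid) squashing function."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-net))

# Two incoming activations weighted +0.8 and -0.4: net input = 0.6
print(round(unit_output([1.0, 0.5], [0.8, -0.4]), 3))  # 0.646
```

Learning then amounts to nudging the weights so that this output moves closer to a target value.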
Contrast with symbolic rule models
- Functionalist models are symbolic and serial
- Rules with conditions and actions

Neurons
[Figure: neurons]

Copyright 2005 Thomas R. Shultz
Translation: Brain to neural net
- neuron = unit
- firing rate = activation
- synapse = weight
- synaptic reception = sum of products
- cell threshold = sigmoid function

Psychological equivalents
- Active memory = pattern of activation across units
- Long-term memory = connection weights
- Learning = adjustment of connection weights, plus growth & pruning of the network
Spreading activation
[Figure: sending units with outputs y1, y2, y3 feed unit 4 through weights w14, w24, w34, which sums the products]

Distributed representations
[Figure]

Sigmoid activation function
[Figure: output = 1 / (1 + e^-x), rising from 0 to 1 as net input runs from -10 to 10]

Multi-layer feed-forward network
[Figure: input units feed hidden units, which feed output units]
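A minimal sketch of activation flowing through such a feed-forward network (the weights and function names here are illustrative, not from the lecture's simulations):

```python
import math

def sigmoid(x):
    """The sigmoid activation function: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(acts, weights):
    """Each row of weights holds one unit's incoming weights plus a bias."""
    return [sigmoid(sum(a * w for a, w in zip(acts, ws[:-1])) + ws[-1])
            for ws in weights]

def forward(inputs, hidden_w, output_w):
    """Feed activation from input units through hidden units to outputs."""
    return layer(layer(inputs, hidden_w), output_w)

# 2 input units -> 2 hidden units -> 1 output unit
hidden_w = [[1.0, -1.0, 0.0], [-1.0, 1.0, 0.0]]
output_w = [[2.0, 2.0, -2.0]]
print(forward([0.3, 0.9], hidden_w, output_w))
```

Every unit runs the same simple program; the layered wiring is what makes the whole network compute something non-trivial.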
Personal pronouns: me & you

Semantic rules for me & you
- Me refers to the person who uses the pronoun
- You refers to the person who is addressed when the pronoun is used

Cannot be learned by imitation
- Mother calls herself me & calls her child you
- Imitation produces reversal errors
- Child calls himself you, his mother me

How can pronouns be learned?
Me-you: Addressee condition
- Yuriko Oshima-Takane
- Listening to non-addressed (overheard) speech
- Directly addressed speech produces reversal errors
[Figure: Father and Mother each address the Child directly, saying me of themselves and you of the child]

Me-you: Non-addressee condition
[Figure: Father addresses Mother, saying me and you, while the Child listens without being addressed]

Me-you game: Test
[Figure: the Child must produce me? or you?]
Pronoun results
- Reversal errors from addressee speech
- Correct rules from non-addressee speech
- Firstborns (9:1) have more reversal errors than second-borns (5:5)

Pronoun training
- Inputs: speaker, addressee, referent
- Outputs: pronoun (me or you)
- Train in 2 phases:
  - Parent-speaking patterns
  - Include child-speaking patterns
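The structure of those training patterns can be sketched as follows. This coding is my own simplification for illustration, not the actual simulation setup; it shows why overheard parent-to-parent speech carries the information the child needs.

```python
# Each speech event pairs (speaker, addressee, referent) with the pronoun
# the speaker would use -- the semantic rule the network must induce.

def target_pronoun(speaker, addressee, referent):
    """The semantic rule: me = refers to the speaker, you = the addressee."""
    if referent == speaker:
        return "me"
    if referent == addressee:
        return "you"
    return None  # neither pronoun applies

# Phase 1 (parent-speaking patterns, overheard by the child):
# parents address each other and refer to themselves or the other.
phase1 = [(s, a, r)
          for s, a in [("mother", "father"), ("father", "mother")]
          for r in ("mother", "father")]

for speaker, addressee, referent in phase1:
    print(speaker, "to", addressee, "about", referent, "->",
          target_pronoun(speaker, addressee, referent))
```

In the overheard patterns both me and you get used by and about people other than the child, so the shifting-reference rule is recoverable; in directly addressed speech the child only ever hears you for himself, which invites reversal errors.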
Error-free pronoun learning
- Yoshio Takane
- Error-free generalization in phase 2
- 5 speakers
- Implicit input coding of kind person

Therapy for reversal errors
- Massive doses of overheard speech using pronouns
[Figure: Speaker 1 and Speaker 2 exchange me and you while the Child overhears]

Pronoun conclusions
- Rule-following emerges from statistical regularities
- Stage sequences emerge from environmental bias & network growth
- Transitions are due to:
  - Weight adjustment (synaptic potentiation)
  - Growth (neurogenesis, synaptogenesis)
Phonology: word stress
- Gerken, L. A. (2004). Nine-month-olds extract structural principles required for natural language. Cognition, 93, B89-B96.
- Shultz, T. R., & Gerken, L. A. (2005). A model of infant learning of word stress. Proceedings of the Twenty-seventh Annual Conference of the Cognitive Science Society (pp. 2015-2020). Mahwah, NJ: Erlbaum.
Stress constraints (ranked)
A. 2 stressed syllables cannot occur in sequence
B. Heavy syllables (ending in a consonant) are stressed
C. Syllables are stressed if they are 2nd to last (2nd in L2)
D. Alternating syllables are stressed starting from the left (from the right in L2)

Gerken experiment (2004, Cognition)
L1 words          L2 words          Ranking
TON ton do RE mi  do RE mi ton TON  A > B
TON do re         do re TON         B > C
DO re TON         TON do RE         B > C
DO re TON mi fa   do re TON mi FA   B > C
DO re mi FA so    do RE mi fa SO    C > D
do TON re MI fa   do RE mi TON fa   A > D
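The ranked constraints A-D above can be sketched as a violation scorer. The encoding is my own (not Gerken's or Shultz's code): a word is a list of syllables, each a (heavy, stressed) pair, and violation tuples compare lexicographically, so A outranks B, which outranks C, which outranks D.

```python
def violations(sylls, language="L1"):
    """Count violations of each ranked constraint for a stressed word."""
    a = sum(1 for i in range(len(sylls) - 1)
            if sylls[i][1] and sylls[i + 1][1])        # A: no adjacent stresses
    b = sum(1 for heavy, stressed in sylls
            if heavy and not stressed)                 # B: stress heavy syllables
    c_pos = len(sylls) - 2 if language == "L1" else 1  # C: 2nd-to-last (L1), 2nd (L2)
    c = 0 if sylls[c_pos][1] else 1
    start = 0 if language == "L1" else len(sylls) - 1  # D: alternate from the left
    step = 2 if language == "L1" else -2               #    (L1) or the right (L2)
    d = sum(1 for i in range(start, -1 if step < 0 else len(sylls), step)
            if not sylls[i][1])
    return (a, b, c, d)

# "TON do re" (L1): heavy stressed first syllable, two light unstressed
print(violations([(True, True), (False, False), (False, False)], "L1")) # (0, 0, 1, 1)
```

"TON do re" satisfies A and B but violates the lower-ranked C and D, which is exactly the B > C trade-off that word type tests.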
Gerken infant lab
[Figure]

Familiarization paradigm
- 7 examples of each word type
- L1 words: TON do re, TON re mi, TON mi fa, TON fa so, TON so la, TON la ti, TON ti do

Gerken's (2004) results
[Figure, Expt. 1: mean looking after L1 familiarization, at L1 test vs. L2 test]
[Figure, Expt. 2: mean looking after L2 familiarization, at L1 test vs. L2 test]
Category-building
- Infant no longer needs to work on stimuli in the category
- New stimulus compared to stored representations
- If match, then no attention
- If novel, then additional processing
Encoder networks
- Reproduce inputs on outputs
- Stimulus features are abstracted in hidden-unit representations as connection weights are adjusted
- Error corresponds to the need to direct current processing

Sonority
- Articulatory: openness of vocal tract
- Acoustic: loudness, vowel-likeness
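The encoder-network idea can be sketched with a toy autoencoder (illustrative code, not the SDCC simulator used in the actual model; only the output weights are trained here, by a simple delta rule). After training to reproduce a familiar pattern, reconstruction error is low for that pattern and higher for a novel one, mirroring the looking-time logic.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyEncoder:
    """Toy encoder net with one hidden unit, trained to reproduce its input."""
    def __init__(self, n, lr=1.0, seed=0):
        rng = random.Random(seed)
        self.win = [rng.uniform(-0.5, 0.5) for _ in range(n)]
        self.wout = [rng.uniform(-0.5, 0.5) for _ in range(n)]
        self.lr = lr

    def reconstruct(self, x):
        h = sigmoid(sum(w * xi for w, xi in zip(self.win, x)))
        return h, [sigmoid(w * h) for w in self.wout]

    def error(self, x):
        """Summed squared reconstruction error = the novelty signal."""
        _, y = self.reconstruct(x)
        return sum((t - o) ** 2 for t, o in zip(x, y))

    def train(self, x):
        """Delta-rule update of the output weights (input weights left fixed)."""
        h, y = self.reconstruct(x)
        for j, (t, o) in enumerate(zip(x, y)):
            self.wout[j] += self.lr * (t - o) * o * (1 - o) * h

net = TinyEncoder(3)
familiar, novel = [0.9, 0.1, 0.9], [0.1, 0.9, 0.1]
for _ in range(500):
    net.train(familiar)
print("familiar error:", round(net.error(familiar), 3))
print("novel error:   ", round(net.error(novel), 3))
```

High residual error directs additional processing to the novel stimulus, which is the network-side analogue of longer infant looking.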
Sonority coding: actual sonority codes
- low vowels /a/ /æ/: 6.0
- mid vowels /ε/ /e/ /o/: 5.0
- high vowels /I/ /i/ /U/ /u/: 4.0
- semi-vowels /w/ /y/
- laterals /l/ /r/: -1.0
- nasals /n/ /m/ /ŋ/: -2.0
- voiced fricatives /z/ /v/: -3.0
- voiceless fricatives /s/ /f/: -4.0
- voiced stops /b/ /d/ /g/: -5.0
- voiceless stops /p/ /t/ /k/: -6.0
SDCC network structure & coding

Syllable  Consonant 1  Vowel  Consonant 2
do        -5.0         5.0     0.0
re        -1.0         5.0     0.0
mi        -2.0         4.0     0.0
fa        -4.0         6.0     0.0
so        -4.0         5.0     0.0
la        -1.0         6.0     0.0
ti        -6.0         4.0     0.0
ton       -6.0         5.0    -2.0
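The coding table can be generated from the sonority values: each syllable becomes a (consonant 1, vowel, consonant 2) triple, with 0.0 for an absent final consonant. The dictionary layout and helper name are my own; the values are the slide's sonority codes for the segments that occur in the stimuli.

```python
# Sonority values for the segments in the stimulus syllables,
# keyed by their orthographic letters.
SONORITY = {
    "a": 6.0,               # low vowel
    "e": 5.0, "o": 5.0,     # mid vowels
    "i": 4.0,               # high vowel
    "l": -1.0, "r": -1.0,   # laterals
    "m": -2.0, "n": -2.0,   # nasals
    "s": -4.0, "f": -4.0,   # voiceless fricatives
    "d": -5.0,              # voiced stop
    "t": -6.0,              # voiceless stop
}

def code_syllable(syll):
    """Code a CV or CVC syllable as (consonant 1, vowel, consonant 2)."""
    c1, v = SONORITY[syll[0]], SONORITY[syll[1]]
    c2 = SONORITY[syll[2]] if len(syll) > 2 else 0.0
    return (c1, v, c2)

for s in ("do", "re", "mi", "fa", "so", "la", "ti", "ton"):
    print(s, code_syllable(s))
```

Note that ton is the only closed (heavy) syllable, so it is the only one with a non-zero final-consonant code.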
Network 0, L1 familiarization
[Figure: network with inputs (bias, cvc1-cvc5, s1-s5), two recruited hidden layers (h1-h4; h5-h6), and outputs (cvc1-cvc5, s1-s5)]
[Figure: mean train error declining across output epochs]
Test error after training
[Figure: networks' mean error after L1 and L2 familiarization, at L1 test vs. L2 test]

Looking & error: Infants & networks
[Figure: infants' mean looking alongside networks' mean error, after L1 and L2 familiarization, at L1 and L2 test]

Hidden-unit structures
[Figure: numbers of hidden units recruited by networks trained on L1 and L2 words]

Deletion predictions
- A = not 2 in sequence
- B = heavy syllable
- C = 2nd-to-last or 2nd
- D = alternate from left or right
[Table: the same L1/L2 word pairs and rankings as in the Gerken experiment]
Test error: deletions
[Figure: mean error for networks with constraints B & C deleted and with C & D deleted, after L1 and L2 familiarization, at L1 and L2 test]

Serial position of heavy syllable
- Mean serial position of TON in training words: L1 train = 2.5, L2 train = 3.5
[Table: the L1/L2 word pairs, with some word types occurring twice (x2)]
Test error equating serial position of TON
[Figure: mean error after L1 and L2 familiarization, at L1 test vs. L2 test]

What does it all mean?
- The 2 test words have:
  - Different stress patterns than the familiarization words
  - The same stress pattern, differing only in the location of TON
- Generalize beyond familiar stress patterns to an abstract system
- Poverty-of-the-stimulus?

Knowledge representations for transitive inference
[Figure: mean weights from stick inputs L1-L6 and R1-R6 to the Left and Right side outputs]

Overall conclusions
- Challenges of language are formidable
- Some interesting progress has been made with connectionist simulations
The end
- Tutorial on cascade-correlation
- Simulation papers
- http://www.psych.mcgill.ca/perpg/fac/shultz/personal/default.htm