The CLS theory: Basic features, role of
replay and responses to recent
challenges
Psychology 209 – Winter 2017
Feb 28, 2017
The Rumelhart Model
The Training Data:
All propositions true of
items at the bottom level
of the tree, e.g.:
Robin can {grow, move, fly}
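A minimal sketch of how one training pattern such as "Robin can {grow, move, fly}" might be coded as input and target vectors. The vocabularies below are assumptions for illustration; the slide itself names only the robin example.

```python
import numpy as np

# Assumed vocabularies for illustration only; the slide names just
# "robin", the relation "can", and the attributes grow, move, fly.
ITEMS = ["robin", "canary"]
RELATIONS = ["isa", "can"]
ATTRIBUTES = ["grow", "move", "fly", "swim"]

def one_hot(name, vocab):
    """Localist input vector: one unit per item or relation."""
    vec = np.zeros(len(vocab))
    vec[vocab.index(name)] = 1.0
    return vec

def encode(item, relation, attributes):
    """Return (item input, relation input, target output) for one item-relation pair."""
    target = np.zeros(len(ATTRIBUTES))
    for attr in attributes:
        target[ATTRIBUTES.index(attr)] = 1.0
    return one_hot(item, ITEMS), one_hot(relation, RELATIONS), target

item_in, relation_in, target = encode("robin", "can", ["grow", "move", "fly"])
```

Each item-relation pair is one input, and all attributes that are true of it are switched on together in the target.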
[Figure: learned representations at three points in training (Early, Later, Later Still); axis: Experience]
Emergence of Meaning in Learned
Distributed Representations
• Distributed representations that capture aspects of
meaning emerge through a gradual learning process
• The progression of learning and the representations
formed capture many aspects of cognitive
development
– Differentiation of concepts
– Generalization, illusory correlations and
overgeneralization
– Domain-specific variation in importance of feature
dimensions
– Reorganization of conceptual knowledge
What happens in this system if we
try to learn something new?
Such as a Penguin
Learning Something New
• Used network already trained
with eight items and their
properties.
• Added one new input unit
fully connected to the
representation layer
• Trained the network with
the following pairs of
items:
– penguin-isa
living thing-animal-bird
– penguin-can
grow-move-swim
Rapid Learning Leads to
Catastrophic Interference
Effects of Hippocampal
Lesions
• Intact performance on tests of
intelligence, general knowledge,
language, other acquired skills
• Dramatic deficits in formation of
some types of new memories:
– Explicit memories for
episodes and events
– Paired associate learning
– Arbitrary new factual
information
• Temporally graded retrograde
amnesia:
– Lesion impairs recent
memories, leaving remote
memories intact.
Note: HM’s lesion
was bilateral
A Complementary Learning System
in the Medial Temporal Lobes
[Figure labels: name, action, motion, color, valence, form; Temporal pole; Medial Temporal Lobe]
Avoiding Catastrophic Interference
with Interleaved Learning
Initial Storage in the Hippocampus Followed by
Repeated Replay Leads to the Consolidation of
New Learning in Neocortex, Avoiding
Catastrophic Interference
[Figure labels: name, action, motion, color, valence, form; Temporal pole; Medial Temporal Lobe]
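The contrast between focused and interleaved learning can be sketched with a much simpler stand-in than the Rumelhart network: a single linear layer trained with the delta rule. Everything in the block is an assumption made for illustration (the hand-assigned feature coding, the four-animal subset, the learning rate and step count); only the penguin's "can" properties come from the slides.

```python
import numpy as np

# Hypothetical feature coding: [bird, fish, robin, canary, sunfish, salmon, penguin]
OLD_X = np.array([
    [1, 0, 1, 0, 0, 0, 0],   # robin
    [1, 0, 0, 1, 0, 0, 0],   # canary
    [0, 1, 0, 0, 1, 0, 0],   # sunfish
    [0, 1, 0, 0, 0, 1, 0],   # salmon
], dtype=float)
# Targets for the "can" relation: [grow, move, fly, swim]
OLD_T = np.array([
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
], dtype=float)
PENGUIN_X = np.array([[1, 0, 0, 0, 0, 0, 1]], dtype=float)  # bird-like, plus its own unit
PENGUIN_T = np.array([[1, 1, 0, 1]], dtype=float)           # grow, move, swim

def train(W, X, T, steps=500, lr=0.1):
    """Batch delta-rule learning on a linear layer; returns updated weights."""
    W = W.copy()
    for _ in range(steps):
        W += lr * (T - X @ W.T).T @ X
    return W

def mse(W, X, T):
    """Mean squared error of the layer's outputs on the given items."""
    return float(np.mean((X @ W.T - T) ** 2))

# Prior knowledge: the four old items are learned first.
W_old = train(np.zeros((4, 7)), OLD_X, OLD_T)

# Focused learning: train on the penguin alone.
W_focused = train(W_old, PENGUIN_X, PENGUIN_T)

# Interleaved learning: mix the penguin in with the old items.
ALL_X = np.vstack([OLD_X, PENGUIN_X])
ALL_T = np.vstack([OLD_T, PENGUIN_T])
W_interleaved = train(W_old, ALL_X, ALL_T)

focused_old_error = mse(W_focused, OLD_X, OLD_T)
interleaved_old_error = mse(W_interleaved, OLD_X, OLD_T)
```

Both regimes learn the penguin, but focused training pushes the shared "bird" weights toward swimming and away from flying, so error on the robin and canary rises. Interleaving lets the penguin-specific unit absorb the exception while the old items stay correct.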
Inside the MTL…
• Pattern separation:
– Sparse random conjunctive coding
– Floating threshold idea
– How learning can increase pattern
separation
– Cheating during ‘retrieval’ by bypassing the
dentate
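A minimal sketch of pattern separation through sparse random conjunctive coding. It assumes Gaussian random weights and a k-winners-take-all rule as one way to read the "floating threshold" idea; the layer sizes and sparsity level are illustrative, not taken from the slides.

```python
import numpy as np

def k_winners(pre, k):
    """Binary code with exactly k active units: the threshold floats
    to whatever level leaves the k most strongly driven units on."""
    code = np.zeros(pre.shape, dtype=int)
    code[np.argsort(pre)[-k:]] = 1
    return code

def overlap(a, b):
    """Shared active units as a fraction of the active units in a."""
    return (a & b).sum() / a.sum()

rng = np.random.default_rng(0)
n_in, n_out, k = 200, 2000, 100      # assumed sizes; 5% of output units active

# Two similar input patterns: 40 active units each, 32 of them shared.
a = np.zeros(n_in, dtype=int)
a[:40] = 1
b = np.zeros(n_in, dtype=int)
b[8:48] = 1

# Each output unit responds to a random weighted conjunction of inputs.
W = rng.standard_normal((n_out, n_in))
code_a = k_winners(W @ a, k)
code_b = k_winners(W @ b, k)

input_overlap = overlap(a, b)
output_overlap = overlap(code_a, code_b)
```

Because only the most strongly driven units survive the threshold, a small change in the input swaps out a sizable share of the winners, so `output_overlap` comes out lower than `input_overlap`.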
In more detail…
• Input from neocortex
comes into EC
• Drastic pattern
separation occurs in
DG
• Downsampling in CA3
• Moderately sparsified,
invertible
representation in CA1
• One- or few-shot
learning in DG, CA3,
CA3-CA1 allows
reconstruction of EC
pattern from partial
input.
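Reconstruction from partial input after one-shot learning can be illustrated with a Hopfield-style autoassociator. This is a stand-in, not the DG/CA3/CA1 circuit described above: the +1/-1 coding, the outer-product learning rule, and the sizes are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_patterns = 200, 3

# Random +1/-1 patterns standing in for events to be stored.
patterns = rng.choice([-1, 1], size=(n_patterns, n_units))

# One-shot Hebbian storage: each pattern is seen once and adds its outer product.
W = np.zeros((n_units, n_units))
for p in patterns:
    W += np.outer(p, p) / n_units
np.fill_diagonal(W, 0.0)

# Partial cue: the first half of pattern 0, with the remaining units silent.
cue = patterns[0].copy()
cue[n_units // 2:] = 0

# One recurrent update fills in the missing half.
recalled = np.where(W @ cue >= 0, 1, -1)
```

With only a few stored patterns, the cued half drives every unit toward its stored value far more strongly than the crosstalk from the other patterns, so a single update recovers the whole pattern.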
Challenges to the theory
• Inference and generalization can depend on
the MTL
– Better learning of ‘premises’ leads to better
ability to make inferences
• Sometimes new information can be integrated
into neocortical learning systems quickly
How might hippocampus support
inference and generalization?
‘Inference’
• Finding missing
links in the
transitive inference
task
‘Similarity based
generalization’
• Relying on partial
activation of
multiple memories
to decide if a
stimulus is familiar
or unfamiliar
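Similarity-based generalization from partial activation of multiple memories can be sketched as a global-match familiarity signal. The cubed-similarity rule is borrowed from global-matching memory models such as MINERVA 2; that choice, and all the sizes below, are assumptions rather than something the slides specify.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_memories, n_flips = 100, 10, 20

prototype = rng.choice([-1, 1], size=n_features)

# Each stored memory is a distorted copy of the prototype
# (20 of its 100 features flipped); the prototype itself is never stored.
memories = np.tile(prototype, (n_memories, 1))
for i in range(n_memories):
    flipped = rng.choice(n_features, size=n_flips, replace=False)
    memories[i, flipped] *= -1

def familiarity(probe, memories):
    """Sum of cubed similarities: every memory is partially activated by the probe."""
    similarity = memories @ probe / probe.size   # between -1 and 1 for each memory
    return float(np.sum(similarity ** 3))

unrelated = rng.choice([-1, 1], size=n_features)

prototype_familiarity = familiarity(prototype, memories)
unrelated_familiarity = familiarity(unrelated, memories)
```

No single memory matches the prototype exactly, yet the summed partial activations make it feel far more familiar than an unrelated stimulus.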
Richard
Morris
The Second Challenge to the Theory
Rapid Consolidation of
Schema Consistent
Information
Tse et al. (Science, 2007, 2011)
During training, 2 wells
uncovered on each trial
[Figure labels: Day 2, Day 9, Day 16; Lesion, Control; After New]
• Initial learning of flavor-place
associations is gradual.
• After initial learning, one new pair
of flavor-place associations
learned in one trial.
• Performance is unaffected by HPC
lesion 48 hrs after learning new
associations.
• Not only was the new material
learned quickly, it appears to
have been rapidly integrated into
the neocortex
Rapid Gene Induction for New
Schema Consistent Information
Old paired associates (OPA)
New paired associates (NPA)
New map (NM)
Caged Control (CC)
Implications for Theory
“These findings indicate that the rate at which
systems consolidation occurs in the neocortex
can be influenced by what is already known. In
contrast, in the complementary learning systems
approach, the hippocampus is said to be
‘specialized for rapidly memorizing specific
events’ and the neocortex for ‘slowly learning
the statistical regularities of the environment.’”
Are These Findings Really
Inconsistent with Complementary
Learning Systems Theory?
Or did I simply fail to convey the full
schema underlying the theory?
Why, after all, did I choose the penguin to
demonstrate the importance of the MTL in
new learning?
Schemata and
Schema Consistent
Information
• What is a ‘schema’?
– An organized knowledge
structure into which existing
knowledge is organized.
• What is schema consistent
information?
– Information that can be
added to a schema without
disturbing it.
• What about a penguin?
– Partially consistent
– Partially inconsistent
• In contrast, consider
– a trout
– a cardinal
New Simulations
• Initial training with eight
items and their properties
as before.
• Added one new input unit
fully connected to the
representation layer also as
before
• Trained the network on one
of the following pairs of
items:
– penguin-isa & penguin-can
– trout-isa & trout-can
– cardinal-isa & cardinal-can
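A rough sketch of why a schema-consistent item should disturb prior knowledge less than a partially inconsistent one, using a single linear layer trained with the delta rule in place of the full network. The hand-assigned feature coding, the four-animal subset, and the training settings are assumptions for illustration; only the "can" relation is modeled.

```python
import numpy as np

# Hypothetical feature coding: [bird, fish, robin, canary, sunfish, salmon, new item]
OLD_X = np.array([
    [1, 0, 1, 0, 0, 0, 0],   # robin
    [1, 0, 0, 1, 0, 0, 0],   # canary
    [0, 1, 0, 0, 1, 0, 0],   # sunfish
    [0, 1, 0, 0, 0, 1, 0],   # salmon
], dtype=float)
# Targets for the "can" relation: [grow, move, fly, swim]
OLD_T = np.array([
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
], dtype=float)
NEW_ITEMS = {
    "penguin":  ([1, 0, 0, 0, 0, 0, 1], [1, 1, 0, 1]),  # a bird that swims, cannot fly
    "trout":    ([0, 1, 0, 0, 0, 0, 1], [1, 1, 0, 1]),  # an ordinary fish
    "cardinal": ([1, 0, 0, 0, 0, 0, 1], [1, 1, 1, 0]),  # an ordinary bird
}

def train(W, X, T, steps=500, lr=0.1):
    """Batch delta-rule learning on a linear layer; returns updated weights."""
    W = W.copy()
    for _ in range(steps):
        W += lr * (T - X @ W.T).T @ X
    return W

def mse(W, X, T):
    """Mean squared error of the layer's outputs on the given items."""
    return float(np.mean((X @ W.T - T) ** 2))

W_old = train(np.zeros((4, 7)), OLD_X, OLD_T)

initial_error = {}   # how wrong prior knowledge is about the new item
interference = {}    # error on old items after focused training on the new item
for name, (x, t) in NEW_ITEMS.items():
    X_new = np.array([x], dtype=float)
    T_new = np.array([t], dtype=float)
    initial_error[name] = mse(W_old, X_new, T_new)
    W_new = train(W_old, X_new, T_new)
    interference[name] = mse(W_new, OLD_X, OLD_T)
```

In this sketch the trout and cardinal start out mostly predicted by what is already known, so there is little to learn and little to disturb. The penguin starts with a larger error, and correcting it through the shared "bird" weights produces more interference with the robin and canary.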
New Learning of Consistent and
Partially Inconsistent Information
[Figure panels: Learning; Interference]
Connection Weight Changes after
Simulated NPA, OPA and NM Analogs
Tse et al. 2011
How Does It Work?
Take home messages
• The brain clearly contains many learning systems
– Hippocampus
– Neocortex
– Basal Ganglia
– Amygdala
• Generally speaking, learning is likely to depend on many
systems working together at the same time.
• These systems can be parameterized very differently
– Sparse vs. dense
– Different learning rates
– Reward-driven vs. prediction-error-driven
• We can also make explicit conscious inferences sometimes
– This ability likely depends on many systems working
together, especially if the information needed to link
inferences is stored in memory