Internal representations and self-organization in recurrent neural networks

Daniel Krieg
FIGSS seminar, June 21, 2010
Overview

- Artificial neural networks
- Self-organization
- Recurrent networks
- Adaptation of representations
- Conclusion
About me

- I'm working in the Core project of the Vision Initiative of the Bernstein Focus.
- There, my task is to provide a communication framework for the different projects and to contribute to the integration of the different components.
- Scientifically, I'm interested in neural representations, how they can arise in recurrent networks, and how they can be utilized computationally.
Artificial neural networks

- My work is based on artificial neural networks, where "artificial" means the reduction of the very complex, three-dimensional neuronal structure to a simple point-like building block.
- Disregarding the pulsed or spiking nature of the neurons, a neuron can be described by a rate model:

  y_i = f( \sum_j W_{ij} x_j )

  A (mostly non-linear) element acting on the weighted sum of its inputs.
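A minimal sketch (not from the slides) of such a rate unit in Python/NumPy; the tanh non-linearity and the weight scale are illustrative choices:

```python
# Minimal sketch of a layer of rate neurons: y = f(W x), with tanh as an
# illustrative non-linearity (any saturating function would do).
import numpy as np

def rate_layer(x, W, f=np.tanh):
    """Weighted sum of the inputs passed through a (mostly non-linear) element."""
    return f(W @ x)

rng = np.random.default_rng(0)
x = rng.random(10)                       # input rates
W = rng.normal(scale=0.3, size=(4, 10))  # synaptic weights (illustrative scale)
y = rate_layer(x, W)                     # output rates of 4 neurons
```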
Learning

- A feed-forward network consists of an input layer, zero or more hidden layers, and an output layer.
- Learning is mainly done by the adaptation of the weights.
- There are several different forms of learning, with the coarsest division being:
  - Supervised learning: learn an input-output mapping from an omniscient teacher.
  - Reinforcement learning: learn to act optimally in an environment by maximizing rewards.
  - Unsupervised learning: optimize an objective function with respect to the given data.
Self-organization

- Self-organization is a widely known phenomenon in many different areas of science.
  "Self-organization is the spontaneous, often seemingly purposeful formation of spatial, temporal, spatio-temporal structures or functions in systems composed of few or many components." [1]
- It is intimately related to emergence, where complex behaviour arises from simple interactions.
- In the context of self-organized criticality, the attractor of the dynamical system tends towards a critical regime.
- In the setting of a neural network, self-organization can mainly be regarded as a subset of unsupervised learning rules.

[1] Hermann Haken (2008), Self-organization. Scholarpedia, 3(8):1401
- When trained in a supervised fashion, a feed-forward network can be used as a general function approximator by minimizing the objective function

  E = \sum_i ( f(x_i) - y_i )^2

- An example of a self-organizing feed-forward network is the Kohonen map. It is a topological mapping of the input data to a low- (mostly two-) dimensional space. The learning rule

  W_v(t+1) = W_v(t) + \Theta(v) \, ( y_v(t) - W_v(t) )

  depends only on the input data (see the sketch after this list).
- This type of mapping is also employed in the brain, where the different sensory modalities are mapped topographically onto the cortex.
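A minimal sketch of the Kohonen-map update on a 2-D grid of units; the grid size, Gaussian neighbourhood function, learning rate, and neighbourhood width are illustrative assumptions not given on the slide:

```python
# Minimal sketch of a Kohonen map (SOM): map input vectors onto a 2-D grid of
# units; only the neighbourhood Theta around the best-matching unit learns.
import numpy as np

rng = np.random.default_rng(0)
grid = np.array([(i, j) for i in range(10) for j in range(10)])  # 10x10 map
W = rng.random((100, 3))                  # one weight vector per map unit (3-D inputs)

def som_step(x, W, eta=0.1, sigma=2.0):
    bmu = np.argmin(np.sum((W - x) ** 2, axis=1))      # best-matching unit
    d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)        # grid distance to the BMU
    theta = np.exp(-d2 / (2 * sigma ** 2))              # neighbourhood function Theta(v)
    W += eta * theta[:, None] * (x - W)                 # W_v <- W_v + Theta(v) (x - W_v)
    return W

for _ in range(1000):                      # unsupervised: depends only on the input data
    W = som_step(rng.random(3), W)
```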
Figure: Sensory homunculus (http://www.harmonicresolution.com/Sensory%20Homunculus.htm)
Recurrent networks

- But the brain is not a simple feed-forward network: recurrent and feedback connections are very prominent.
- Compared to a feed-forward network, a recurrent network is a dynamical system.
- Its dynamics is sensitive to its own state; therefore it can incorporate the history of its input.
- In the case of rate neurons, the dynamical system is described by an iterative map
  y(t+1) = f_t( y(t), z(t) )

  but where the map itself is time dependent due to learning.
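A minimal sketch of such an iterative map for a small random reservoir; the tanh non-linearity and the weight scales are illustrative assumptions:

```python
# Minimal sketch of the iterative map y(t+1) = f(y(t), z(t)) for rate neurons:
# the next state depends on the current state and the input, so the trajectory
# carries the history of the input.
import numpy as np

rng = np.random.default_rng(0)
N = 50
W = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))   # recurrent weights
W_in = rng.normal(scale=0.5, size=N)                   # input weights

def step(y, z):
    return np.tanh(W @ y + W_in * z)                    # y(t+1) = f(W y(t) + W_in z(t))

y = np.zeros(N)
states = []
for z in rng.random(200):                               # scalar input sequence z(t)
    y = step(y, z)
    states.append(y)
```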
- It can show different behaviour depending on the equations and the initial conditions: fixed points (e.g. the Hopfield network), limit cycles, attractors.
- The dynamics can be either stable, critical, or chaotic.
- Understanding a recurrent network helps in analyzing more indirect loops, like feedback from higher areas (top-down).
Learning in RNNs

- Supervised training of recurrent networks is a computationally hard problem.
- The backpropagation algorithm used in feed-forward networks is not directly applicable.
- Several different variants and other solutions have been proposed.
- I'm focusing on the echo state network (ESN), which emphasizes the character of neural representations.
- It shares the basic idea of reservoir computing with the liquid state machine (LSM).
Echo state network

- The ESN tries to overcome the problem of supervised training by adapting only the weights to the output population.
- The recurrent weights of the 'hidden' layer (called the reservoir) are kept constant.
- They are a transformation into a high-dimensional space and are used only for the representation of the input and its history (a minimal readout-training sketch follows below).

Problem: what if the representation doesn't fit your data?
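A minimal sketch of the ESN idea: drive a fixed random reservoir with the input, collect the states, and fit only the output weights. The ridge-regression readout, the toy memory task, and all scales are assumptions for illustration, not the specific setup of the talk:

```python
# Minimal ESN sketch: the reservoir weights stay fixed; only the readout weights
# to the output population are trained (here with ridge regression).
import numpy as np

rng = np.random.default_rng(1)
N, T = 100, 1000
W = rng.normal(scale=0.9 / np.sqrt(N), size=(N, N))   # fixed recurrent reservoir
W_in = rng.normal(scale=0.5, size=N)                   # fixed input weights

u = rng.random(T)                                      # input signal
target = np.roll(u, 3)                                 # toy task: recall the input from 3 steps ago

# collect reservoir states (the representation of the input and its history)
X = np.zeros((T, N))
y = np.zeros(N)
for t in range(T):
    y = np.tanh(W @ y + W_in * u[t])
    X[t] = y

# train only the readout: minimize ||X w_out - target||^2 + alpha ||w_out||^2
alpha = 1e-4
w_out = np.linalg.solve(X.T @ X + alpha * np.eye(N), X.T @ target)
prediction = X @ w_out
```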
Adaptation of representations

- Apply self-organization schemes to let the network learn its own representation.
- Biologically inspired local adaptation rules:

1. Intrinsic plasticity: each neuron adapts its own transfer function according to some objective function.
   - Maintain homeostasis: keep the neuron's firing rate in a reasonable regime between silence and saturation.
   - Make the neuron maximally informative, i.e. maximize the entropy of the distribution of output firing rates.
     → For a given mean rate (energy) this leads to an exponential distribution.
   - With these objectives one can derive gradient rules for the parameters of the transfer function. [2]
For a sigmoidal function

  y = \frac{1}{1 + \exp(-(a x + b))}

the gradient rules are

  \Delta a = \eta \frac{1}{a} + x \Delta b

  \Delta b = \eta \left( 1 - \left( 2 + \frac{1}{\mu} \right) y + \frac{y^2}{\mu} \right)
Figure: Sigmoid function
[2] J. Triesch, 'A Gradient Rule for the Plasticity of a Neuron's Intrinsic Excitability', ICANN 2005
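A minimal sketch of this intrinsic-plasticity update for a single neuron, following the gradient rules above; the learning rate eta and the target mean rate mu are illustrative values:

```python
# Minimal sketch of the intrinsic plasticity (IP) gradient rule for one neuron
# with a parameterized sigmoid y = 1/(1 + exp(-(a x + b))).
import numpy as np

def ip_step(x, a, b, eta=0.01, mu=0.1):
    y = 1.0 / (1.0 + np.exp(-(a * x + b)))
    db = eta * (1.0 - (2.0 + 1.0 / mu) * y + (y ** 2) / mu)   # Delta b
    da = eta / a + x * db                                      # Delta a
    return a + da, b + db

rng = np.random.default_rng(0)
a, b = 1.0, 0.0
for x in rng.normal(size=10000):           # adapt to the statistics of the input
    a, b = ip_step(x, a, b)
# after adaptation the output-rate distribution approaches an exponential with mean mu
```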
2. Synaptic plasticity: use a local Hebbian-like rule, but respect causality.
   - Spike-timing dependent plasticity (STDP) adapts the weights depending on the difference between pre- and post-synaptic spike timings.
Figure: source: http://www.scholarpedia.org/article/STDP
   - Normally parameterized by two exponentials.
   - Make it binary for rate neurons:

     \Delta W = \eta \left[ y_{post}(t) \, y_{pre}(t-1) - y_{post}(t-1) \, y_{pre}(t) \right]
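A minimal sketch of this binary, rate-based STDP update applied to a weight matrix; the learning rate and the non-negativity clipping are assumptions:

```python
# Minimal sketch of the rate-based STDP rule above: potentiate W_ij when the
# presynaptic rate at t-1 precedes the postsynaptic rate at t, depress it for
# the reversed order (i indexes the postsynaptic, j the presynaptic neuron).
import numpy as np

def stdp_step(W, y_t, y_prev, eta=0.001):
    dW = eta * (np.outer(y_t, y_prev) - np.outer(y_prev, y_t))
    return np.clip(W + dW, 0.0, None)      # keep excitatory weights non-negative (assumption)
```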
Influence of plasticity rules

- Investigated by Andreea Lazar, a former PhD student.
- Using simple threshold neurons, she showed that with both plasticity types the internal representations improve in a counting task.
Figure: taken from Lazar et al., 'SORN: a self-organizing recurrent neural network', 2009
Influence of plasticity rules

- The distance between network states with the same input increased.
- This leads to better prediction performance at the supervised output population.
Analytical considerations

- Is this just a coincidence or a general feature of STDP-like learning?
- We can derive an objective function to explain the functional role of the STDP rule.
- Use the distance between successive network states as an energy function:
  E(t) = ( \vec{y}(t) - \vec{y}(t-2) )^2

  \Delta W(t) = \frac{\partial E(t)}{\partial W} = F'(t) \left[ y(t) \, y(t-1) - y(t-2) \, y(t-1) \right]

  with F'(t) = \mathrm{diag}( f'(t)_1, ..., f'(t)_N ).

- Rearranging terms, this is exactly the STDP rule for linear neurons (F = Id).
- The additional term prevents learning in the case of saturation or silence.
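A minimal numerical check of the gradient above (not from the slides): holding y(t-1) and y(t-2) fixed and considering only the direct dependence y(t) = f(W y(t-1) + z), the finite-difference gradient of E(t) matches F'(t) (y(t) - y(t-2)) y(t-1)^T up to a constant factor of 2 that can be absorbed into the learning rate:

```python
# Finite-difference check that dE/dW reproduces the rate-STDP-like update.
import numpy as np

rng = np.random.default_rng(0)
N = 5
W = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))
y_prev, y_prev2 = rng.random(N), rng.random(N)   # y(t-1), y(t-2), treated as constants
z = rng.normal(size=N)                           # external input

f = lambda x: 1.0 / (1.0 + np.exp(-x))           # sigmoid rate function
df = lambda x: f(x) * (1.0 - f(x))

def energy(W):
    y_t = f(W @ y_prev + z)                      # y(t)
    return np.sum((y_t - y_prev2) ** 2)          # E(t)

# analytic gradient: 2 * F'(t) (y(t) - y(t-2)) y(t-1)^T
x = W @ y_prev + z
grad_analytic = 2.0 * (df(x) * (f(x) - y_prev2))[:, None] * y_prev[None, :]

# finite-difference gradient
eps = 1e-6
grad_fd = np.zeros_like(W)
for i in range(N):
    for j in range(N):
        Wp = W.copy(); Wp[i, j] += eps
        Wm = W.copy(); Wm[i, j] -= eps
        grad_fd[i, j] = (energy(Wp) - energy(Wm)) / (2 * eps)

print(np.max(np.abs(grad_analytic - grad_fd)))   # should be close to zero
```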
Network states
Figure: No SP
Figure: With STDP
PCA

- The network states get more variable and occupy a larger subspace.
- With more principal components, classification gets harder, but the representational power increases (see the sketch below).
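A minimal sketch of how such a PCA over collected reservoir states could look; the array of states is assumed to come from a run like the simulation sketches above:

```python
# Minimal PCA sketch over collected network states (rows = time steps, columns
# = neurons): the eigenvalue spectrum of the state covariance shows how many
# principal components the states effectively occupy.
import numpy as np

def pca_spectrum(states):
    X = states - states.mean(axis=0)                 # center the states
    cov = X.T @ X / (len(X) - 1)                     # covariance of the states
    eigvals = np.linalg.eigvalsh(cov)[::-1]          # sorted principal-component variances
    return eigvals / eigvals.sum()                   # fraction of variance per component

# effective dimensionality: number of components needed for 95% of the variance
# dim = np.searchsorted(np.cumsum(pca_spectrum(states)), 0.95) + 1
```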
Separation of states

- 4000 presentations of a random alternation of two input sequences.
Figure: Without synaptic plasticity
Separation of states

- The distance between states increases on average and they spread more evenly in the state space.
Figure: With STDP
Synaptic weights

- The weights get more specific.
Criticality

- The network's criticality is measured by its Lyapunov exponents \lambda_i, which are the log eigenvalues of the time-averaged Jacobian \bar{J}_t:

  y_i(t+1) = f( \sum_j W_{ij} y_j(t) + z_i(t) )

  J(t) = \frac{\partial y(t)}{\partial y(t-1)} = F'(t) W(t)

  \bar{J}_t = \left[ \prod_i F'(i) W(i) \right]^{1/t}

- A change of the state along the eigenvector of the corresponding exponent will evolve through time like

  |\delta X_i(t)| \propto e^{\lambda_i t} \, |\delta X_i(0)|

- The critical regime between stability and chaos is beneficial for computations.
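A minimal sketch of estimating this Lyapunov spectrum by propagating an orthonormal frame with the instantaneous Jacobians F'(t) W along a trajectory; the QR re-orthonormalization at every step is a standard numerical trick assumed here, and all network scales are illustrative:

```python
# Minimal sketch of the Lyapunov spectrum of y(t+1) = tanh(W y(t) + z(t)):
# propagate an orthonormal frame with J(t) = F'(t) W and re-orthonormalize
# with QR at every step; the averaged log diagonal of R gives lambda_i.
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 2000
W = rng.normal(scale=0.95 / np.sqrt(N), size=(N, N))
z = rng.normal(scale=0.5, size=(T, N))

y = np.zeros(N)
Q = np.eye(N)
log_sums = np.zeros(N)
for t in range(T):
    y = np.tanh(W @ y + z[t])
    J = (1.0 - y ** 2)[:, None] * W          # F'(t) W, since tanh'(x) = 1 - tanh(x)^2
    Q, R = np.linalg.qr(J @ Q)
    log_sums += np.log(np.abs(np.diag(R)))

lyapunov = np.sort(log_sums / T)[::-1]       # max < 0: stable, ~0: critical, > 0: chaotic
```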
Criticality

- The instantaneous Lyapunov exponents (from the weight matrix) get more separated.
- The spectrum shifts to more negative values (under-critical), but a few exponents become more critical.
Criticality

- Most of the high-dimensional space is insensitive to changes, but a small subspace is critical and can distinguish between different histories.
Criticality

- Averaged over a longer sequence of inputs, the spectrum becomes more and more peaked.
- Overall, the reservoir is stable on average (homeostasis), due to the intrinsic plasticity.
Outlook

- Is it possible to relate STDP in spiking neuronal networks, with real timing dependence, to a similar objective-function interpretation?
- Further investigation of the data-dependence of criticality and subspace formation.
- Study the relevance of self-organizing representations for statistical inference in neural networks.
Conclusions

- Self-organization is useful for adapting the internal states of an (initially) random network to the task.
- The improved representations in recurrent neural networks under STDP can be explained through an objective function.
- The interplay of synaptic and intrinsic plasticity seems to lead to an on-average stable system with a small critical subspace.
Thanks for your attention.