Transforming the BCPNN Learning Rule for Spiking Units to a Learning Rule for Non-Spiking Units
ANTOINE BERGEL
Master of Science Thesis
Stockholm, Sweden 2010
Master’s Thesis in Biomedical Engineering (30 ECTS credits)
at the School of Computer Science and Engineering
Royal Institute of Technology year 2010
Supervisor at CSC was Örjan Ekeberg
Examiner was Anders Lansner
TRITA-CSC-E 2010:059
ISRN-KTH/CSC/E--10/059--SE
ISSN-1653-5715
Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc
Abstract
The Bayesian Confidence Propagation Neural Network (BCPNN) model has been developed over the past thirty years for specific tasks such as classification, content-addressable memory and data mining. It uses a Bayesian-Hebbian learning rule, which exhibits fairly good performance, both as a counter model and as a continuously operating incremental learner. This learning rule has never been implemented in networks of spiking units, so one is bound to carry out the learning with non-spiking units and to transpose its outcome to the spiking context afterwards, which is highly restrictive.
The aim of this Master's thesis project is to transform the existing BCPNN learning rule for non-spiking units, including the bias term, to the domain of spiking neural networks based on the Hodgkin-Huxley cell model. The goal is a model running in NEURON which exhibits the same features observed with non-spiking units. A secondary goal of this thesis is to compare the new learning rule to the old one, as well as to other Spike-Timing Dependent Plasticity learning rules.
To achieve this goal, we introduce a new version of the BCPNN learning rule which can account for spiking input activities. This learning rule is based on synaptic traces, local variables that keep track of the frequency of and timing between spikes. It includes three stages of processing, all based on low-pass filtering with three different temporal dynamics, in order to estimate the probabilities used to compute the Bayesian weights and biases. The Bayesian weights are mapped to a synaptic conductance, updated according to the values of these synaptic traces, and the bias term is mapped to an activity-regulated potassium channel.
We present results of the new spiking version of the BCPNN learning rule in single-synapse learning and retrieval. We implement two main models: the first based on abstract units in MATLAB, the second based on Hodgkin-Huxley spiking units in NEURON. The latter model accounts for spike-frequency adaptation and can be used to study the effect of the exact timing between presynaptic and postsynaptic spikes under repeated stimulation.
Acknowledgements
I would first like to thank Anders Lansner for allowing me to work at the department of Computational Biology and Neuroscience at KTH, for devoting time and patience to assuming both roles of supervisor and examiner of this Master's thesis, and for always helping me, guiding me and leaving me in the best conditions to produce valuable work. This first step into research at a high-level scientific department has been a very enriching experience, which I will always remember. I would also like to thank Örjan Ekeberg, for accepting to tutor this Master's thesis from abroad at first, and later for all the precious comments about the report, presentation and structure of this work.
This past year at the department, I have had the chance to meet a lot of people from different backgrounds and countries. They have made the working atmosphere very special, warm and welcoming: Claudia, who has been here from the very beginning; Charles, for his ping-pong and chess skills when a break was needed; Aymeric, Dave, Simon, Pawel, Pierre and all the others, for helping me discover new bars and restaurants. I want to give special thanks to Mikael, for interesting talks; to Pradeep and David, for their availability, kindness and help with NEURON; and finally to Bernhard, who has not only always been eager to answer my numerous questions and investigate new problems with me, but has also been a devoted friend, who offered me tremendous support and help when time pressure was high.
I cannot cite all the people I have met these past two years, but I want to say how getting to know them, and all the conversations and moments we had together, have changed me and made me realise that there are no geographic borders to friendship and love. So, I want to thank Natasha, for the time she spent improving the language in my report, and simply for always being supportive and making me feel that she was here with me, though on the other side of the world. This year would have been so much different without my lovely room-mates Birte, Isabel, Stefan F., Stefan T. and Volker, for August Mondays among other things, and my two French buddies Fred and Joseph, for lunch breaks, poker sessions and crazy parties. I want to give special thanks to my two Italian friends, who showed that beyond neighbourly rivalry we just have so much in common and so much to share: Enrico, the craziest person I have ever lived with, and Sara, probably the best pizza and focaccia cook ever.
Finally, I want to thank my parents, who have always helped me with all the problems one can encounter when studying abroad for two years: I know how lucky I am to have them with me and I hope they know the respect I have for them. A little word to my siblings, my sister Karen and my brother Samuel, whom I will be very happy to meet and live with again.
Contents

1 Introduction
  1.1 Context
  1.2 Motivations
  1.3 Outline

2 The BCPNN Model
  2.1 Context and Definitions
  2.2 Bayesian Confidence Propagation
    2.2.1 Using Neurons as probability estimators
    2.2.2 Derivation of Network Architecture
    2.2.3 Bayesian-Hebbian Learning
  2.3 Gradual Development of the BCPNN model
    2.3.1 Naive Bayes Classifier
    2.3.2 Higher Order Bayesian Model
    2.3.3 Graded units
    2.3.4 Recurrent Network
  2.4 BCPNN Learning Implementations
    2.4.1 Counter Model
    2.4.2 Incremental Learning
  2.5 Performance Evaluation and Applications

3 A spiking BCPNN Learning Rule
  3.1 Formulation
  3.2 Features
    3.2.1 Synaptic traces as local state variables
    3.2.2 Spike-timing Dependence
    3.2.3 Delayed-Reward Learning
    3.2.4 Long-term Memory
    3.2.5 Probabilistic features
  3.3 Biological relevance

4 Abstract Units Implementation
  4.1 Pattern presentation
    4.1.1 Non-spiking Pattern Presentation
    4.1.2 Spiking frequency-based Pattern Presentation
    4.1.3 Spiking Poisson-generated Pattern Presentation
  4.2 Learning Rule Implementation
  4.3 Retrieval

5 Hodgkin-Huxley Spiking Implementation in NEURON
  5.1 Cell Model
    5.1.1 Hodgkin-Huxley Model
    5.1.2 Spike Frequency Adaptation
  5.2 Pattern presentation
  5.3 Learning Rule Implementation
    5.3.1 Synaptic Integration
    5.3.2 Bias term
  5.4 Retrieval

6 Results
  6.1 Abstract units
    6.1.1 Learning
    6.1.2 Retrieval
  6.2 Hodgkin-Huxley Spiking Units
    6.2.1 Steady-State Current Discharge
    6.2.2 Learning
    6.2.3 Parameter tuning
    6.2.4 Retrieval
    6.2.5 Spike Timing Dependence

7 Discussion
  7.1 Model Dependencies
    7.1.1 Learning Rule Parameters
    7.1.2 Pattern Variability
    7.1.3 Learning-Inference Paradigm
  7.2 Comparison to other learning rules
    7.2.1 Spiking vs Non-spiking Learning Rule
    7.2.2 Spike-timing dependence and real data
    7.2.3 Sliding threshold and BCM Rule
  7.3 Further Developments and limitations
    7.3.1 Network implementation
    7.3.2 RSNP cells and inhibitory input
    7.3.3 Hypercolumns, basket cell and lateral inhibition
    7.3.4 Parallel computing

8 Conclusion

Bibliography

Appendices

A NMODL files
  A.1 Synapse modelisation
  A.2 A-Type Potassium Channel

B Hodgkin-Huxley Delayed Rectifier Model
  B.1 Voltage Equations
  B.2 Equations for Gating Variables

C NEURON stimulation parameters
Chapter 1
Introduction
1.1 Context
Since Hebb's theory in 1949, synaptic plasticity (the ability of the synaptic connection between two neurons to change its strength according to a certain conjunction of presynaptic and postsynaptic events) has been thought to be the biological substrate for high-level cognitive functions like learning and memory. This idea is actually much older and was formalized by the Spanish neuroanatomist Santiago Ramón y Cajal in 1894, who suggested "a mechanism of learning that did not require the formation of new neurons", but proposed that "memories might instead be formed by strengthening the connections between existing neurons to improve the effectiveness of their communication" [29]. Hebb went a step further by proposing his ideas about the existence of a metabolic growth process associating neurons that tend to have correlated firing activity [13].
For the brain to be able to form, store and retrieve memories, as well as learn specific tasks, the biological changes at the synapse level need to be long-lasting. This is called long-term potentiation (LTP) or long-term depression (LTD): a persistent increase or decrease in synaptic strength, which is said to be the key mechanism underlying learning and memory. The biological mechanisms responsible for long-term potentiation are not exactly known, but specific protein synthesis, second-messenger systems and N-methyl-D-aspartate (NMDA) receptors are thought to play a critical role in its formation [20].
In 1995, Fuster defined memory as "a functional property of each and all of the areas of the cerebral cortex, and thus of all cortical systems". He distinguishes several types of memories: short-term/long-term, sensory/motor, declarative/non-declarative and individual/phyletic. He proposes that all memory is associative and that its strength depends on the number of associations we make to a specific word or mental object [11]. He introduced several key concepts, such as working memory, as a gateway to long-term memory waiting to be consolidated, and active memory, as a cortical network of neurons with activity above a certain baseline. Also, his perception-action cycle, suggesting a constant flow of information between sensory and motor memory, has proved to be a matter of interest for future experimentation.
More recently, investigations have focused on spike-timing-dependent plasticity (STDP), which refers to synaptic changes sensitive to the exact timing of action potentials between two connected neurons: one refers to pre-post or positively-correlated timing when the presynaptic neuron fires a few milliseconds before the postsynaptic neuron, and to post-pre or negatively-correlated timing when it goes the other way around. STDP has become a popular subject since the experimental work of Bi and Poo [6], who first demonstrated the strong influence of the exact timing of presynaptic and postsynaptic spikes (typically a time window of 20 ms for cultured hippocampal neurons) on synaptic long-term potentiation. Their work with cultures of hippocampal neurons, seconded by the work of others, e.g. Rubin et al. 2005 and Mayr et al. 2009 [30, 25], has resulted in the formulation of STDP-type learning rules [27, 9].
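A common phenomenological formulation of such pair-based STDP rules uses an exponential plasticity window. The sketch below illustrates this generic textbook form only (it is not the BCPNN rule developed in this thesis, and the amplitudes and time constants are illustrative assumptions):

```python
import math

def stdp_dw(delta_t_ms, a_plus=0.01, a_minus=0.012,
            tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair.

    delta_t_ms = t_post - t_pre: positive (pre-before-post) gives
    potentiation (LTP), negative (post-before-pre) gives depression
    (LTD). The magnitude decays exponentially as |delta_t| grows.
    """
    if delta_t_ms > 0:
        return a_plus * math.exp(-delta_t_ms / tau_plus)   # LTP branch
    if delta_t_ms < 0:
        return -a_minus * math.exp(delta_t_ms / tau_minus)  # LTD branch
    return 0.0
```

With time constants around 20 ms, pairs separated by much more than that window produce negligible change, consistent with the effective window Bi and Poo reported for cultured hippocampal neurons.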
One must be aware, however, that these rules are considered rather crude approximations by relevant experimentalists. There is a constant duality between the two possible ways to approach neuroscience: some aim to understand the biological mechanisms at the cell and membrane level, so that they can build models to reproduce them, whereas others aim to reproduce cell behaviour for applications and fit their models to experimental data rather than to theory. Both approaches have their justification, and it is likely that they are complementary. However, although results have accumulated, our understanding of the mechanisms of the brain is still partial and a great deal remains to be done.
In this project, we focus on the Bayesian Confidence Propagation Neural Network (BCPNN) model, which was first studied by Lansner and Ekeberg (1989) [22] and Kononenko (1989) [18]. Its main features are a network architecture directly derived from Bayes' rule and unit activities representing the probabilities of stochastic events. The BCPNN model will be thoroughly described in Chapter 2.
1.2 Motivations
In 2003, Sandberg et al. proposed that "a possible future extension of the existing BCPNN model would be to implement the model using spiking neurons to further examine its generality and properties, such as the effect of spike synchrony in memory reset and the effects of AHP modulation on network dynamics" [32]. At that time, the model had just been improved from a counter model to a continuously operating incremental learning model. In this respect, the work presented here continues what has already been done and addresses the need for such a learning rule operating in a spiking context.
Artificial neurons are a very crude approximation of real neurons: given input from other neurons, they generate an output through an activity function. Spiking neurons, however, mimic the behaviour of real neurons: in particular, they exhibit spikes (they "fire" and take a high positive value) only when their potential crosses a threshold, and only for a very short amount of time. These neurons simulate the all-or-nothing behaviour and action potentials observed in real neurons [20]. The variables attached to them, such as membrane voltage, capacitance and synaptic conductance, have a real biological meaning.
Since large-scale implementations of neural networks are often based on spiking units, it is valuable to have such a formulation, so that online learning can also run in large-scale spiking neural networks. The project aims to end up with a network model in which the new online learning rule operates, and to use it to test some of the emergent phenomena. Evaluating the model by comparing it to the original BCPNN rule, to other STDP rules, as well as to some experimental data on LTP [30], is our prime motivation. Because of its specific features (both Hebbian and Bayesian), the BCPNN learning rule can always be used as a reference for other STDP learning rules to be implemented in the future. With regard to the bias term, a comparison can also be made with the threshold in the BCM learning rule, developed in 1982 by Bienenstock, Cooper and Munro [7].
The transformation of the BCPNN learning rule to a spiking neuron environment is somewhat challenging and has never been done before. This opens up the extent of our work tremendously, so the scope of this Master's thesis has to be limited for the sake of simplicity. We narrow our work to two main objectives. The first is the comparison to other existing learning rules, as explained above. The second, somewhat more abstract, is to reconcile the probabilistic features of the original BCPNN learning rule with the spike-timing dependent features developed in STDP models (Bi and Poo 1998, Morrison 2008, Clopath 2008) [6, 27, 9]. The new learning rule presented in Chapter 3 is built to take STDP-like features into account, and we aim to fit our model to existing experimental data relating to the spike-timing dependent plasticity window (Bi and Poo 1998) [6] and intrinsic excitability (Jung et al. 2009) [19], following a phenomenological approach to the problem.
A further improvement of our work would be to modify our learning rule so that it could run on parallel computers in a large-scale context. This work is not meant to state decisive results, or to study one specific feature of the BCPNN model exhaustively, but rather to initiate the conversion of the BCPNN model to a spiking unit environment.
1.3 Outline
We will first redefine, in Chapter 2, the basics of the BCPNN model and its mathematical context, from its most basic form (Naive Bayes Classifier) to more recent ones (Higher Order Model, Recurrent Network). We will also describe the existing implementations (counter model, incremental learning) and their applications. In Chapter 3, the 'spiking' version of the learning rule is presented, along with its new features and their biological motivation. The two following chapters contain the core of the thesis: we develop how we implemented the new learning rule with abstract units in MATLAB (Chapter 4) and in a spiking context in NEURON (Chapter 5). The results are presented in Chapter 6, covering single-synapse learning, the network implementation and the phenomenological approach to fitting STDP data. Dependence on model parameters and comparisons to other existing learning rules are discussed in Chapter 7. Finally, Chapter 8 is dedicated to further developments and to the conclusion.
Chapter 2
The BCPNN Model
2.1 Context and Definitions
Artificial Neural Networks
An artificial neural network (ANN) is a computational model that aims to simulate structural and/or functional aspects of biological neural networks. It consists of a group of computational units, connected by weighted links through which activation values are transmitted.

The reader can find documentation about ANNs in the literature, and the purpose here is not to discuss neural networks in a general fashion. Still, we think it is valuable in our context to recall the main features of artificial neural networks.
Nodes The functional unit or node of the network is its basic constituting element. Even if, in the first place, it has a biological equivalent, like a neuron or, more recently, a minicolumn, it is an abstract unit, which means that the variables attached to it are artificial and do not have an intrinsic biological meaning. A node $i$ is assigned a random variable $x_i$ that can be binary, discrete or continuous. It takes its input from other units $x_j$ and generates an output $y_i$.
Activity Function The activity function or transfer function gives the input-output relationship for one node. Common activity functions include linear, thresholded and sigmoid functions. The input-output relationship for unit $i$ is given by $y_i = \varphi(\beta_i + \sum_{j=1}^{n} \omega_{ij} x_j)$, where $\varphi$ is the activity function, $\omega_{ij}$ the weight between unit $i$ and unit $j$, and $\beta_i$ the bias of unit $i$.

Learning Rule The learning rule is an algorithm that modifies the connections between units, the so-called weights, in response to the presentation of an input pattern. It is often the key point of the implementation, because it determines the response of the network to specific input, and hence its applications. Classical learning rules include Perceptron learning, the Delta rule and error backpropagation.
Network Architecture A network can have several topologies. It can be composed of layers (single-layer or multi-layer networks) that communicate in only one direction (feedforward network) or in both directions (backpropagation or recurrent network). Connections between units can be sparse or all-to-all. Networks can include one or several hidden layers (internal computational units which are not accessible from the network interface, but are used to create a specific internal representation of the data).
Input and Output units In a feedforward network, the network receives information from input units and proposes an interpretation available at the output units. In a recurrent network, though, the difference between input and output units is less clear: input consists of the activation of a set of units representing an input pattern, and an output pattern is read from the activity of the units after a phase called relaxation.
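The node and activity-function definitions above can be sketched in a few lines (a generic illustration, not code from the thesis; the sigmoid choice and the example values are arbitrary assumptions):

```python
import math

def node_output(x, w, beta):
    """y_i = phi(beta_i + sum_j w_ij * x_j) with a sigmoid transfer phi."""
    s = beta + sum(w_j * x_j for w_j, x_j in zip(w, x))
    return 1.0 / (1.0 + math.exp(-s))   # sigmoid activity function

# One unit with three inputs:
y = node_output(x=[1.0, 0.0, 1.0], w=[0.5, -0.3, 0.2], beta=-0.1)
```

With zero net input the sigmoid returns 0.5; the thresholded and linear variants mentioned above would simply replace the last line of `node_output`.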
Learning and Retrieval
A network can be used in one of the two following modes: learning or retrieval. During the learning phase, the network input units are clamped to certain values (clamping means that the units are assigned values set by the operator through a controlled process) for a certain amount of time (a set of input unit values represents an input pattern). During clamping, the learning rule operates, so that the weights are updated and retain the information contained in the pattern that has been presented. In other words, during learning, the network adapts to reality (the clamped input pattern) and changes its internal connections to remember it in the future: learning is said to be stimulus-driven.
During the retrieval phase, the weights of the network are assumed to be fixed, keeping the internal structure of the network unchanged. Distorted, incomplete or different patterns than the ones used during learning are presented to the network, and an output pattern is generated. In the case of layered networks, inference is realized by feeding a pattern to the input units and collecting the result at the output units. In other words, the network interprets the input data, using its internal representation or knowledge.
For a recurrent network, however, the input pattern is fed to all input units (all units in the network except for the hidden units), and the network starts a phase called relaxation. Relaxation consists of taking a pattern as input and incrementally updating the units' activities according to an inference rule; this stops when stability is reached, i.e. when the change in the units' activities becomes very small. When the weight matrix is symmetric, convergence is guaranteed and relaxation always converges to a stable attractor state [16].
For correct knowledge to be acquired, one must learn a pattern (learning phase) and then check whether the pattern has been stored correctly (retrieval). It is important, however, to alternate these two phases, so that the information stored by the network is constantly updated and corrected. One must take care that the network does not learn its own interpretation of the data, by shutting off the learning phase from time to time.
Hebb’s postulate
Introduced by Donald Hebb in 1949, Hebb's postulate, also called cell assembly theory, is one of the earliest rules about synaptic plasticity. It has been formulated as follows:

When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased [13].
The theory is often summarized as "cells that fire together, wire together" and is commonly evoked to explain some types of associative learning in which simultaneous activation of cells leads to pronounced increases in synaptic strength. Such learning is known as Hebbian learning. The general idea is that cells or groups of cells that are repeatedly active at the same time will tend to become associated, so that activity in one facilitates activity in the other [1]. Work in the laboratory of Eric Kandel has provided evidence for the involvement of Hebbian learning mechanisms at synapses in the marine gastropod Aplysia californica [21].
Associative Memory
Fuster describes associative memory as "a system of memory, usually constituted by associations between stimuli and reinforcement" [11], as opposed to recognition or episodic memories. However, according to him, association is an attribute of all memories, from the root of their genesis to their evocation. More widespread is the definition of auto-associative and hetero-associative memories as forms of neural networks that enable one to retrieve an entire memory from only a tiny sample of it. Hetero-associative networks can produce output patterns of a different size than that of the input pattern (mapping from a pattern x to a pattern y with a non-square connection matrix W), whereas auto-associative networks work with a fixed pattern size (mapping of the same pattern x with a square connection matrix W).
The Hopfield network (Hopfield 1982 [16]) is the most widely implemented auto-associative memory network and serves as a content-addressable memory with binary threshold units. Under the following restrictions: $w_{ii} = 0$ (no unit has a connection with itself) and $w_{ij} = w_{ji}$ (connections are symmetric), convergence to a local minimum of a certain energy function is guaranteed. During learning, the connection matrix W is modified to allow for attractor dynamics, so that relaxation of the network causes the input pattern to converge towards the closest attractor state.
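The outer-product learning and relaxation dynamics can be illustrated with a toy Hopfield network (a standard textbook construction, sketched here for illustration only; it is not code from the thesis):

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian outer-product rule: symmetric weights, zero diagonal."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:                 # patterns are +/-1 vectors
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)           # w_ii = 0
    return W / len(patterns)

def relax(W, x, max_steps=100):
    """Synchronous threshold updates until the state stops changing.

    (Synchronous updates are used for brevity; the convergence guarantee
    strictly holds for asynchronous updates with symmetric weights.)
    """
    for _ in range(max_steps):
        x_new = np.where(W @ x >= 0, 1, -1)
        if np.array_equal(x_new, x):
            break
        x = x_new
    return x

p = np.array([1, -1, 1, -1, 1, -1])
W = train_hopfield(p[None, :])
noisy = p.copy()
noisy[0] = -noisy[0]                   # corrupt one bit
recovered = relax(W, noisy)            # relaxes back to the attractor p
```

Relaxation here plays exactly the role described above: the corrupted input is pulled back to the closest stored attractor state.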
2.2 Bayesian Confidence Propagation

2.2.1 Using Neurons as probability estimators
The main idea underlying the BCPNN learning rule is to use neurons as probability estimators: the input and output unit activities represent probabilities. Each neuron estimates its probability of firing in a given context, i.e. knowing the information carried by the activities of the other neurons in the network. Confidence propagation relies on the fact that the conditional probability $P(y_i|x)$ of a given neuron $y_i$ firing given the context $x$ is a better estimate than the a priori probability $P(y_i)$. By updating units like this, one propagates the confidence of one unit to the other units in the network.
Figure 2.1: Using Neurons as probability estimators
The BCPNN learning rule is based on a probabilistic view of learning and retrieval: input and output unit activities represent, respectively, confidences of feature detection (the input to unit i from unit j is a number between 0 and 1 representing the confidence that $x_j$ is part of the pattern) and posterior probabilities of outcomes (the output of unit j is a number between 0 and 1 representing the probability of outcome of $x_j$ given the pattern context).
One drawback of using neurons as probabilistic estimators is that we have to separate the signal. Indeed, observing the absence of an attribute in a given vector is quite different from the absence of an observation of this attribute. However, if we map each attribute to only one unit, then the BCPNN model will interpret zero input to this unit as an absence of information on this attribute, and it will compute the a posteriori probabilities of the other units, discarding the input from this unit. To solve this problem, we need to separate the data, i.e. we need to create units for all possible values of an attribute. In the case of binary units, this corresponds to having two units a and ā for attribute A. When no observation is made on this attribute, the network will discard input from both of these units.
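This separation of the data into unit pairs can be sketched as follows (a hypothetical helper written for illustration; the function name and encoding convention are not from the thesis):

```python
def encode_attribute(value):
    """Complement-code one binary attribute A into a unit pair (a, a_bar).

    True  -> (1, 0): A observed present
    False -> (0, 1): A observed absent
    None  -> (0, 0): no observation, so the network discards both units
    """
    if value is None:
        return (0, 0)
    return (1, 0) if value else (0, 1)

# Three attributes: present, absent, and not observed at all.
pattern = [encode_attribute(v) for v in (True, False, None)]
```

The key point is that "observed absent" and "not observed" now produce different unit activations, which a single unit per attribute could not express.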
2.2.2 Derivation of Network Architecture
The Bayesian Confidence Propagation Neural Network (BCPNN) has been developed gradually (Lansner and Ekeberg 1989, Lansner and Holst 1996, Sandberg et al. 2002, Sandberg et al. 2003) [22, 23, 31, 32]. Starting from Bayes' theorem (equation 2.1), we derive a network architecture, meaning that we identify the terms in our mathematical formulae with weights $\omega_{ij}$, biases $\beta_j$, inputs $x_i$ and output unit activities $y_j$. The purpose of the learning phase will then be to update the weights and biases so that their values fit those in the mathematical derivation of the network. Depending on the complexity of the training set we use, the network architecture can be a single-layer network (see Naive Bayes Classifier), a multi-layer network (see Higher Order Model) or a fully-connected network (see Recurrent Network).
2.2.3 Bayesian-Hebbian Learning
The BCPNN learning rule derived in the next section uses Bayesian weights and biases (equation 2.4). It exploits the statistical properties of the attributes in the training set (frequencies of activation of one attribute $x_i$ and of co-activation of two attributes $x_i$ and $x_j$) in order to estimate the probabilities $P(x_i)$ and $P(x_i, x_j)$ used to update the weights and biases. It also shows Hebbian features, because it reinforces connections between simultaneously active units, weakens connections between units independent of one another, and makes connections between anti-correlated units inhibitory.

As we shall see later in this thesis, when applied to a recurrent attractor network, it gives a symmetric weight matrix and allows for fixed-point attractor dynamics. The update of the weights in the network resembles what has been proposed as rules for biological synaptic plasticity (Wahlgren and Lansner 2001) [33].
2.3 Gradual Development of the BCPNN model

2.3.1 Naive Bayes Classifier
The Naive Bayes Classifier (NBC) aims to calculate the probabilities of the attributes $y_j$ given a set $x = (x_1, x_2, \dots, x_i, \dots, x_n)$ of observed attributes. Both are assumed to be discrete (for now, we only consider binary inputs). The main assumption in this case is the Independence Assumption, which states that the attributes $x_i$ are independent, $P(x_1, \dots, x_n) = \prod_{i=1}^{n} P(x_i)$, and conditionally independent given $y_j$, $P(x_1, \dots, x_n \mid y_j) = \prod_{i=1}^{n} P(x_i \mid y_j)$.
CHAPTER 2. THE BCPNN MODEL
The Bayes Theorem is given by the following equation for two random variables $x$ and $y$:
$$P(y \mid x) = \frac{P(x \mid y)\, P(y)}{P(x)} \tag{2.1}$$
Using this and the Independence Assumption, we can calculate the conditional probability $\pi_j$ of the attribute $y_j$ given the observed attributes $x_i$:
$$\pi_j = P(y_j \mid x) = \frac{P(x \mid y_j)\, P(y_j)}{P(x)} = P(y_j) \prod_{i=1}^{n} \frac{P(x_i \mid y_j)}{P(x_i)} = P(y_j) \prod_{i=1}^{n} \frac{P(x_i, y_j)}{P(x_i)\, P(y_j)}$$
Now, we assume that we only have partial knowledge of the attributes xi . We
are given completely known observations xi when i ∈ A ⊆ {1, ... , n} and have no
information at all about the attributes xk when k ∈ {1, ... , n}\A. Then, we get
$$\pi_j = P(y_j \mid x_i, i \in A) = P(y_j) \prod_{i \in A} \frac{P(x_i, y_j)}{P(x_i)\, P(y_j)}$$
Then, taking the logarithm of the last expression, we obtain:
$$\log(\pi_j) = \log(P(y_j)) + \sum_{i \in A} \log \frac{P(x_i, y_j)}{P(x_i)\, P(y_j)} = \log(P(y_j)) + \sum_{i=1}^{n} o_i \log \frac{P(x_i, y_j)}{P(x_i)\, P(y_j)} \tag{2.2}$$
where the indicator variable $o_i$ equals 1 if $i \in A$ (which means that the $i$th attribute $x_i$ is known) and equals 0 otherwise.
We finally end up with the following equation:
$$\log(\pi_j) = \beta_j + \sum_{i=1}^{n} \omega_{ij}\, o_i \tag{2.3}$$
with
$$\omega_{ij} = \log \frac{P(y_j, x_i)}{P(y_j)\, P(x_i)}, \qquad \beta_j = \log(P(y_j)) \tag{2.4}$$
This can be implemented as a single-layer feedforward neural network, with input layer activations $o_i$, weights $\omega_{ij}$ and biases $\beta_j$. In this way, the single-layer feedforward neural network calculates the posterior probabilities $\pi_j$ given the input attributes, using an exponential transfer function.
The weights and biases given in equation 2.4 are called Bayesian weights. We can point out the Hebbian character of these weights: $\omega_{ij} \approx 0$ when $x_i$ and $y_j$ are independent (weak connection between independent units); $\omega_{ij} \approx \log(\frac{1}{p}) > 0$ when the units $x_i$ and $y_j$ are strongly correlated, since in this case $P(x_i, y_j) \approx P(x_i) \approx P(y_j) \approx p > 0$ (strong connection between simultaneously active units); and $\omega_{ij} \to -\infty$ when they are anti-correlated, because in this case $P(x_i, y_j) \to 0$ (strong inhibitory connection between anti-correlated units).
The bias term $\beta_i$ gives a measure of the intrinsic excitability of unit $x_i$, as we shall see later in detail. We observe that $\beta_i \to 0$ when $p_i \to 1$, so that the bias term
has no effect on computation when unit $x_i$ is strongly activated, and $\beta_i \to -\infty$ when $p_i \to 0$, thus muting the information carried by unit $x_i$ when it has seldom been activated. This process is democratic in the sense that it gives more importance to the units that have 'a lot to say' and shuts off the ones not taking part in pattern activation, considered irrelevant for learning and inference.
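As a concrete illustration, the single-layer computation of equations 2.3 and 2.4 can be sketched in a few lines of Python. The function names and the small probability tables below are our own illustrative choices, not taken from the thesis:

```python
import numpy as np

def bayesian_weights(p_x, p_y, p_xy, eps=1e-6):
    """Bayesian weights and biases (eq. 2.4) from probability estimates.

    p_x  : (n,)  estimates of P(x_i)
    p_y  : (m,)  estimates of P(y_j)
    p_xy : (n,m) estimates of P(x_i, y_j)
    """
    w = np.log(np.maximum(p_xy, eps) / np.maximum(np.outer(p_x, p_y), eps))
    beta = np.log(np.maximum(p_y, eps))
    return w, beta

def posterior(o, w, beta):
    """pi_j = exp(beta_j + sum_i w_ij o_i)  (eq. 2.3), with o_i in {0, 1}."""
    return np.exp(beta + o @ w)
```

For instance, with a single attribute that is independent of two equiprobable classes, the weights come out near zero and the posterior equals the prior, as the Hebbian reading of equation 2.4 predicts.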
2.3.2
Higher Order Bayesian Model
The problem encountered in practical applications is that the Independence Assumption is often violated, because it is too restrictive. The standard way to deal
with this, as when facing a non-linearly separable training set, is to introduce a
hidden layer with an internal representation in which classes are separable. Here,
we use a structure of the hidden layer consisting of feature detectors organized in
hypercolumns.
Starting from the previous model, we assume independence between all attributes and conditional independence given $y_j$:
$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i) \qquad \text{and} \qquad P(x_1, \ldots, x_n \mid y_j) = \prod_{i=1}^{n} P(x_i \mid y_j)$$
However, if two variables $x_i$ and $x_j$ are found not to be independent, they can be merged into a joint variable $x_{ij}$, giving:
$$P(x_1, \ldots, x_n) = P(x_1) \cdots P(x_{ij}) \cdots P(x_n)$$
and a similar method may be used for the conditional probabilities. This means
that in the network we get one unit for each combination of outcomes of the original
variables $x_i$ and $x_j$. For example, if two groups of units corresponding to primary features $A = \{a, \bar{a}\}$ and $B = \{b, \bar{b}\}$ are not independent, we insert in their place a group of complex units $AB = \{ab, a\bar{b}, \bar{a}b, \bar{a}\bar{b}\}$ making up a composite feature. The
hypercolumn structure formed produces a decorrelated representation, where the
Bayesian model is applicable.
We note that all formulae above are unchanged. We have just introduced a
hidden layer that increases internal computation but the external environment is
unchanged. The structure of our network now resembles the structure in figure 2.2.
This process relies on a measure of independence of the attributes xi of an input
pattern x. A partially heuristic method (Lansner and Holst 1996) [23] is to merge
two columns if the measure of correlation (like the mutual information) between
them is high:
$$I_{ij} = \sum_{x_i \in X_i,\, x_j \in X_j} P(x_i, x_j) \log \frac{P(x_i, x_j)}{P(x_i)\, P(x_j)} \tag{2.5}$$
A major drawback of this method is that the number of units increases exponentially with their order, i.e. how many input attributes they combine (Lansner
and Holst 1996, Holst 1997) [23, 15].
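The merging criterion of equation 2.5 can be sketched directly from a joint probability table; the function name and the merge threshold below are our own illustrative choices:

```python
import numpy as np

def mutual_information(p_joint):
    """I_ij (eq. 2.5) from a joint probability table over two attributes.

    p_joint : (Mi, Mj) array, p_joint[a, b] = P(x_i = a, x_j = b).
    """
    p_i = p_joint.sum(axis=1, keepdims=True)   # marginal P(x_i)
    p_j = p_joint.sum(axis=0, keepdims=True)   # marginal P(x_j)
    nz = p_joint > 0                           # skip zero cells (0 log 0 = 0)
    return float(np.sum(p_joint[nz] * np.log(p_joint[nz] / (p_i * p_j)[nz])))

def should_merge(p_joint, threshold=0.1):
    """Heuristic column-merge decision based on the mutual information."""
    return mutual_information(p_joint) > threshold
```

Independent attributes give $I_{ij} = 0$, while two perfectly correlated binary attributes give $I_{ij} = \log 2$, so a modest threshold separates the two cases.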
Figure 2.2: Architecture of the BCPNN with a hidden unit for internal decorrelated
representation
2.3.3
Graded units
Thus far, we have only treated binary inputs. However, it is valuable
too that the network handles graded input : for instance, if an attribute is unknown
or its value uncertain, graded input would then be a ‘confidence’ value between 0
(no) and 1 (yes). This cannot be coded directly as a graded input activity between
zero and one, because this would be interpreted as a probability in the BCPNN.
Thus we use a kind of soft interval coding to a set of graded values.
Suppose that each attribute $i$ can take $M_i$ different values; $x_{ii'}$ would then be a binary variable describing the probability for the $i$th attribute to take the $i'$th value ($x_{ii'} = 1 \Leftrightarrow x_i = i'$). Making the necessary relabellings in the previous formulae, we get
$$\pi_{jj'} = P(y_{jj'} \mid x_{ik}) = P(y_{jj'}) \prod_{i} \frac{P(x_{ik}, y_{jj'})}{P(x_{ik})\, P(y_{jj'})}$$
where for each attribute i ∈ {1, ..., n} a unique value xik is known, where k ∈
{1, ..., Mi }. Similarly it follows that
$$\pi_{jj'} = P(y_{jj'}) \prod_{i=1}^{n} \sum_{i'=1}^{M_i} o_{ii'} \frac{P(x_{ii'}, y_{jj'})}{P(x_{ii'})\, P(y_{jj'})}$$
with indicators $o_{ii'} = 1$ if $i' = k$ and zero otherwise. $o_{ii'}$ can be seen as a degenerate probability $o_{X_i}(x_{ii'}) = \delta_{x_{ik}}(x_{ii'}) = P_{X_i}(x_{ii'})$ of the stochastic variable $X_i$, which is zero for all $x_{ii'}$ except for the known value $x_{ik}$ (Sandberg et al. 2002)
[31].
Taking the logarithm of the previous expression leads to
$$\log(\pi_{jj'}) = \log(P(y_{jj'})) + \sum_{i=1}^{n} \log \sum_{i'=1}^{M_i} o_{ii'} \frac{P(x_{ii'}, y_{jj'})}{P(x_{ii'})\, P(y_{jj'})} \tag{2.6}$$
The corresponding network now has a modular structure. The units $ii'$ in the network, where $i' \in \{1, \ldots, M_i\}$, explicitly representing the values $x_{ii'}$ of $X_i$, may be viewed as a hypercolumn as discussed above. By definition, the units of a hypercolumn $i$ have a normalized total activity $\sum_{i'=1}^{M_i} o_{ii'} = 1$ (the variable $x_i$ can only have one value $k$ at a time).
Transforming these equations to the network setting yields
$$h_{jj'} = \beta_{jj'} + \sum_{i=1}^{n} \log \sum_{i'=1}^{M_i} \omega_{ii'jj'}\, o_{ii'} \tag{2.7}$$
with
$$\omega_{ii'jj'} = \frac{P(y_{jj'}, x_{ii'})}{P(y_{jj'})\, P(x_{ii'})}, \qquad \beta_{jj'} = \log(P(y_{jj'})) \tag{2.8}$$
where $h_{jj'}$ is the support of unit $jj'$, $\beta_{jj'}$ is the bias term and $\omega_{ii'jj'}$ is the weight. $\pi_{jj'} = f(h_{jj'}) = \exp(h_{jj'})$ can be identified as the output of unit $jj'$, representing the confidence (heuristic or approximate probability) that attribute $j$ has value $j'$ given the current context. We also need to normalize the output within each hypercolumn:
$$\hat{\pi}_{jj'} = f(h_{jj'}) = \frac{\exp(h_{jj'})}{\sum_{j'} \exp(h_{jj'})}.$$
Figure 2.3: Architecture of the BCPNN with a hidden unit and an additive summation layer for graded input handling
Figure 2.3 shows a 'pi-sigma network', able to handle graded input. The notion of a support unit is used to update the units simultaneously and not one by one: calculations are first stored in the support units for all units, and the transfer function is then used to update the units all at once.
2.3.4
Recurrent Network
Now, because both the input $o_{ii'}$ and the output $\hat{\pi}_{jj'}$ of the network represent probabilities, we can feed the output back into the network as input, creating a fully recurrent network architecture, which can work as an autoassociative memory. The currently observed probability $o_{ii'} = P_{X_i}(x_{ii'})$ is used as an initial approximation of the true probability of $X_{ii'}$ and used to calculate a posterior probability, using the learning parameters $\beta_{jj'}$ and $\omega_{ii'jj'}$, which tends to be a better approximation. This is then fed back and the process is iterated until a consistent state is reached, which is guaranteed because the weight matrix is symmetric. The reader should note that we have now incorporated the $y_{jj'}$ among the $x_{ii'}$, thus dropping the notions of input and output units.
In the recurrent network, activations can be updated either discretely or continuously. In the discrete case, $\hat{\pi}_{jj'}(t+1)$ is calculated from $\hat{\pi}_{ii'}(t)$, or equivalently, $h_{jj'}(t+1)$ from $h_{ii'}(t)$, using one iteration of the update rule
$$h_{jj'}(t+1) = \beta_{jj'} + \sum_{i=1}^{n} \log \sum_{i'=1}^{M_i} \omega_{ii'jj'}\, f(h_{ii'}(t)) \tag{2.9}$$
In the continuous case, $h_{jj'}(t)$ is updated according to a differential equation, making the approach towards an attractor state continuous:
$$\tau_c \frac{dh_{jj'}}{dt} = \beta_{jj'} + \sum_{i=1}^{n} \log \sum_{i'=1}^{M_i} \omega_{ii'jj'}\, f(h_{ii'}(t)) - h_{jj'}(t) \tag{2.10}$$
where τc is the ‘membrane time constant’ of each unit. Input to the network is
introduced by clamping the activation of the relevant units (representing known
events or attributes). As the network is updated the activation spreads, creating
the a posteriori beliefs of other attribute values.
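A minimal numerical sketch of the discrete relaxation (equation 2.9) follows, assuming probability-ratio weights as in equation 2.8 and softmax normalization within each hypercolumn; the function name, the clipping constant and the toy weight matrix in the test are our own illustrative choices (clamping of known units is omitted):

```python
import numpy as np

def relax(o, w, beta, hyper, n_iter=20, eps=1e-12):
    """Iterate the discrete update rule (eq. 2.9) until (near) convergence.

    o     : initial unit activities, roughly one-hot per hypercolumn
    w     : (N, N) symmetric probability-ratio weights
    beta  : (N,) biases log P(x_i)
    hyper : list of index arrays, one per hypercolumn
    """
    pi = o.astype(float).copy()
    for _ in range(n_iter):
        # support h_j = beta_j + sum over hypercolumns of log(sum_i w_ij pi_i)
        h = beta + sum(np.log(np.maximum(w[idx].T @ pi[idx], eps))
                       for idx in hyper)
        for idx in hyper:          # softmax normalization within a hypercolumn
            e = np.exp(h[idx] - h[idx].max())
            pi[idx] = e / e.sum()
    return pi
```

With a toy weight matrix encoding a single stored pattern over two hypercolumns, initializing one hypercolumn at the stored value and the other at ignorance makes the network complete the pattern in a few iterations.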
2.4
BCPNN Learning Implementations
2.4.1
Counter Model
This model has been developed and described (Lansner and Ekeberg 1989) [22].
The purpose is to collect statistics of unit activity and co-activity of pairs of units,
to be able to estimate the probabilities P (xi ) and joint probabilities P (xi , xj ) used
to calculate the $W$ and $\beta_j$ values. An input pattern consists of a stimulus strength in the range [0,1] for each unit in the network. Here, the network is entirely 'stimulus-driven' during learning; otherwise the network would first interpret the input and
then learn its own interpretation, which is to be avoided. This allows a reduction
in computing time during learning, because no time is used to infer from the data
(no internal computation).
The basic idea behind the counter model is to estimate the probabilities $P(x_i)$, $P(x_j)$ and $P(x_i, x_j)$ by counting occurrences and co-occurrences in the training set. With an estimate of $p = \frac{c}{Z}$, we obtain
$$\beta_i = \log(P(x_i)) = \log\left(\frac{c_i}{Z}\right), \qquad \omega_{ij} = \log\frac{P(x_i, x_j)}{P(x_i)\, P(x_j)} = \log\left(\frac{c_{ij}\, Z}{c_i\, c_j}\right) \tag{2.11}$$
where
$$Z = \sum_{\alpha} \kappa^{(\alpha)}, \qquad c_i = \sum_{\alpha} \kappa^{(\alpha)} \pi_i^{(\alpha)}, \qquad c_{ij} = \sum_{\alpha} \kappa^{(\alpha)} \pi_i^{(\alpha)} \pi_j^{(\alpha)} \tag{2.12}$$
Here, πi is the output of unit i, α is an index over the patterns in the training
set, and κ is the significance attributed to a certain learning event. It provides
a mechanism for over-representing subjectively important learning examples and
ignoring unimportant ones. This technique is similar to boosting used in classification, which is the over-representation of hard examples in order to increase accuracy
of the classifier. Special care has to be taken when counters come out as zero. In the case when $c_i$ or $c_j$ is zero, $\omega_{ij}$ is also set to zero. If $c_i$ and $c_j$ are both non-zero but $c_{ij}$ is zero, $\omega_{ij}$ is set to a large negative value, $\log(\frac{1}{Z})$. The same is done for $\beta_i$ when $c_i$ is zero.
The counter model provides a simple and fast implementation of BCPNN learning, but when the maximum capacity of the network is reached, catastrophic forgetting occurs (i.e. all memories are lost when the system is over-loaded).
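The counter model, including the zero-counter conventions above, can be sketched as follows; the vectorized formulation and the function name are our own:

```python
import numpy as np

def counter_model(patterns, kappa=None):
    """Counter-model estimates of beta and w (eqs. 2.11-2.12).

    patterns : (P, N) array of unit activities pi in [0, 1]
    kappa    : optional (P,) significance of each pattern (default 1)
    """
    patterns = np.asarray(patterns, dtype=float)
    P, N = patterns.shape
    kappa = np.ones(P) if kappa is None else np.asarray(kappa, dtype=float)
    Z = kappa.sum()
    c = kappa @ patterns                             # c_i
    cij = (patterns * kappa[:, None]).T @ patterns   # c_ij
    # beta_i = log(c_i / Z), with beta_i = log(1/Z) when c_i = 0
    beta = np.where(c > 0, np.log(np.where(c > 0, c, 1) / Z), np.log(1 / Z))
    nz = np.outer(c > 0, c > 0)
    safe_c = np.where(c > 0, c, 1)
    ratio = np.where(cij > 0, cij, 1) * Z / np.outer(safe_c, safe_c)
    w = np.where(nz & (cij > 0), np.log(ratio), 0.0)  # zero if c_i or c_j = 0
    w = np.where(nz & (cij == 0), np.log(1 / Z), w)   # large negative value
    return w, beta
```

On two disjoint patterns, co-active units get a positive weight $\log 2$ while units that never co-occur get the floor value $\log(1/Z)$, matching the conventions described above.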
2.4.2
Incremental Learning
In order to avoid catastrophic forgetting, incremental learning using exponentially weighted running averages has been implemented (Sandberg et al. 2002, Sandberg et al. 2003) [31, 32]. The idea is to introduce intrinsic weight decay (forgetting) in
the network, so that the system never becomes over-loaded. A time constant α is
used to control the time-scale of this weight decay, allowing for short-term working
memory behaviour as well as for long-term memory.
A continuously operating network will need to learn incrementally during operation. In order to achieve this, $P(x_{ii'})(t)$ and $P(x_{ii'}, x_{jj'})(t)$ need to be estimated given the information $\{x(t'), t' < t\}$. The estimate should include the following properties:
1. It should converge towards $P(x_{ii'})(t)$ and $P(x_{ii'}, x_{jj'})(t)$ in a stationary environment.
2. It should give more weight to recent than to remote information.
3. It should smooth or filter out noise and adapt to longer trends, in other words to lower-frequency components of a non-stationary environment.
(1) is the prime constraint. Our estimate needs to converge to these probabilities
because they are needed to compute the Bayesian weights and biases. (2) makes the
model operate as a 'palimpsest memory', meaning that recent memories constantly overwrite old ones. Thus a pattern has to be reviewed in order not to be forgotten. (3)
is a stability constraint in a non-stationary environment. The low-pass filtering
operation is to be investigated again in Chapter 3.
The incremental Bayesian learning rule proposed here achieves this by approximating $P(x_{ii'})(t)$ and $P(x_{ii'}, x_{jj'})(t)$ with the exponentially smoothed running averages $\Lambda_{ii'}$ of the activity $\hat{\pi}_{ii'}$ and $\Lambda_{ii'jj'}$ of the coincident activity $\hat{\pi}_{ii'} \hat{\pi}_{jj'}$. The continuous-time version of the update and learning rule takes the following form:
$$\tau_c \frac{dh_{ii'}(t)}{dt} = \beta_{ii'} + \sum_{j=1}^{n} \log \sum_{j'=1}^{M_j} \omega_{ii'jj'}(t)\, \hat{\pi}_{jj'}(t) - h_{ii'}(t) \tag{2.13}$$
$$\hat{\pi}_{ii'}(t) = \frac{\exp(h_{ii'})}{\sum_{i'} \exp(h_{ii'})} \tag{2.14}$$
$$\frac{d\Lambda_{ii'}(t)}{dt} = \alpha\big([(1 - \lambda_0)\hat{\pi}_{ii'}(t) + \lambda_0] - \Lambda_{ii'}(t)\big) \tag{2.15}$$
$$\frac{d\Lambda_{ii'jj'}(t)}{dt} = \alpha\big([(1 - \lambda_0^2)\hat{\pi}_{ii'}(t)\hat{\pi}_{jj'}(t) + \lambda_0^2] - \Lambda_{ii'jj'}(t)\big) \tag{2.16}$$
$$\omega_{ii'jj'}(t) = \frac{\Lambda_{ii'jj'}(t)}{\Lambda_{ii'}(t)\, \Lambda_{jj'}(t)} \tag{2.17}$$
$$\beta_{ii'}(t) = \log(\Lambda_{ii'}(t)) \tag{2.18}$$
The above probability estimates converge towards the correct values given stationary inputs for sufficiently large time constants. Since the weights of the network depend more on recent than on old data, it appears likely that a Hopfield-like network with the above learning rule would exhibit palimpsest properties.
Special care has to be taken to avoid logarithms of zero values (see Sandberg et al. 2002) [31]. In addition, the parameter $\alpha$ provides a means to control the temporal dynamics of the learning phase (from short-term working memory to long-term memory). It also allows us to switch off learning when the network needs to be used in retrieval mode, allowing for changes in the network activity without corresponding weight changes, because when $\alpha = 0$ the running averages 'freeze' to their current values.
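The running-average estimates of equations 2.15 to 2.18 can be sketched with a simple forward-Euler discretization; the parameter values, time step and function names below are our own illustrative assumptions:

```python
import numpy as np

def update_traces(pi, L1, L2, alpha=0.05, lam0=1e-4, dt=1.0):
    """One Euler step of the running-average estimates (eqs. 2.15-2.16).

    pi : (N,) current unit activities; L1 : (N,) and L2 : (N, N) averages.
    """
    L1 += dt * alpha * (((1 - lam0) * pi + lam0) - L1)
    outer = np.outer(pi, pi)                       # coincident activity
    L2 += dt * alpha * (((1 - lam0**2) * outer + lam0**2) - L2)
    return L1, L2

def weights_biases(L1, L2):
    """Bayesian parameters from the averages (eqs. 2.17-2.18)."""
    w = L2 / np.outer(L1, L1)      # probability-ratio weights
    beta = np.log(L1)
    return w, beta
```

Presenting a stationary pattern repeatedly drives the averages towards the (regularized) probabilities, and with $\alpha = 0$ the traces freeze, switching learning off, as stated above.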
2.5
Performance Evaluation and Applications
Performance Evaluation
There are many criteria available to evaluate the performance of a model. Of
course, no model is better than the others on every level, nor is it designed for every purpose. Nevertheless, in order to be accepted and developed in the future, a
model needs to exhibit some basic features: robustness, reasonable execution time and stability are required to consider a model efficient. Here, we present the main
criteria we use to evaluate the BCPNN model.
Frequency of correct retrieval This is the most commonly used criterion to evaluate the performance of the network. Feeding a list of input patterns to the network, we want to know how well the network learns them, by counting the occurrences of successfully completed patterns after learning. An important parameter is the age of the pattern, because recent patterns tend to be retrieved more accurately than old ones. The number of patterns, their complexity and their time of presentation are to be taken into account too.
Storage Capacity The storage capacity is the number of patterns that a network can store. Hopfield network capacity has been investigated (Hopfield 1982) [16]. In our counter model, the capacity is fixed, so it is susceptible to catastrophic forgetting, whereas the incremental learner has a capacity dependent on its spontaneous forgetting (short-term memories with fast weight decay dynamics are protected from catastrophic forgetting because capacity is hardly ever reached, whereas long-term memories are more exposed to it).
Noise Tolerance In reality, patterns fed to the network are always a little noisy
and it is important that the attractor dynamics of the network overcome
this. To test this, we feed distorted patterns to the network and count the
frequency of retrieving the original ones. A special case is the one of competing
interpretations when a mixture of two stored patterns is fed to the network.
Convergence speed The convergence speed for relaxation of the network is also
an important trait of our model. Inference has to be fast enough so that
testing patterns do not take too long and, on the other hand, it has to use
small enough timesteps for it not to skip any attractor state with a narrow
convergence domain. Convergence speed increases substantially for distorted
and ambigous patterns (because they are distant from stable attractors in the
attractor space) (Lansner and Ekeberg 1989 [22]).
Applications
The domain of applications of the Bayesian Confidence Propagation Neural
Network is wide. Because of its statistically-based method of unsupervised learning,
it can be implemented in a series of different contexts. We present some of its
applications here.
Classification The BCPNN is first designed to evaluate probabilities from a set
of observed features or attributes, so it is natural that the BCPNN is used
for classification tasks, which aim to label a pattern and assign it to a corresponding class. The architecture of these networks is single- or multi-layered, depending on the complexity of the data set. The input units
correspond to the attributes, and the output units to the class units. BCPNN
and classification have been investigated exhaustively (Holst 1997) [15].
Content-addressable memory When used in a recurrent network architecture,
the BCPNN model performs quite well as a content-addressable memory. It
takes into account the statistical properties of the data and performs better
with patterns for which attributes can be considered independent, like pixel
grey-levels in an image, letters in a list of words or digits in a list of numbers.
The capacity has to be large enough to avoid memory overloading.
Because of its associative character, BCPNN memory networks can perform
pattern completion (restoring a pattern from only a sample of it) and pattern
rivalry (deciding between ambiguous patterns or a mixture of two existing ones). Good examples of pattern rivalry are found in optical illusions and ambiguous images.
Pharmacovigilance and Data Mining The BCPNN has been used for highlighting drug-ADR pairs for clinical review in the WHO ADR database as part of the routine signal detection process (Bate et al. 1998, Lindquist et al. 2000). The recurrent BCPNN has also been implemented as a tool for unsupervised pattern recognition; it has been tested on theoretical data and shown effective in finding known syndromes in all reported data on haloperidol in the WHO database (Bate et al. 2001, Orre et al. 2003). More recently, Ahmed et al. revisited Bayesian pharmacovigilance signal detection methods in a multiple comparison setting (Ahmed et al. 2009).
Chapter 3
A spiking BCPNN Learning Rule
In this chapter, we introduce the new ‘spiking’ version of the BCPNN learning
rule. We give its mathematical formulation and discuss its specific features and how
they account for biologically observed phenomena.
In order to have a mapping from the original BCPNN learning rule to its spiking version, we need to match one descriptor of the activity of the biological neurons
to the input and output of the abstract units. The most natural choice seems to be
the frequency or rate of firing of one neuron. Thus the range [0, 1] of the units in
the non-spiking network will be mapped to a range [0, fmax ] where fmax represents
the maximum firing frequency of one neuron.
3.1
Formulation
The version of the learning rule that we are going to implement in a spiking neuron context has the following form:
$$z_i(0) = \frac{1}{M_i}, \qquad \frac{dz_i}{dt} = \frac{y_i - z_i}{\tau_i} \tag{3.1}$$
$$z_j(0) = \frac{1}{M_j}, \qquad \frac{dz_j}{dt} = \frac{y_j - z_j}{\tau_j} \tag{3.2}$$
(3.2)
In this first stage of processing (equations 3.1 and 3.2), we filter the presynaptic
and postsynaptic variables yi and yj , which exhibit a ‘spiking-binary’ behaiour most
of the time, with a low-pass filter of respective time constant τi and τj (note that
they can be different). The resulting variables zi and zj are called primary synaptic
traces. Mi and Mj are the number of units in the pre-hypercolumn and the post
hypercolumn respectively, and are only used in a network context. In single-synapse
learning, we set Mi = Mj = 10. The typical range of τi and τj is 5 to 20 ms.
$$e_i(0) = \frac{1}{M_i}, \qquad \frac{de_i}{dt} = \frac{z_i - e_i}{\tau_e} \tag{3.3}$$
$$e_j(0) = \frac{1}{M_j}, \qquad \frac{de_j}{dt} = \frac{z_j - e_j}{\tau_e} \tag{3.4}$$
$$e_{ij}(0) = \frac{1}{M_i M_j}, \qquad \frac{de_{ij}}{dt} = \frac{z_i z_j - e_{ij}}{\tau_e} \tag{3.5}$$
In the second stage of processing (equations 3.3, 3.4 and 3.5), we filter the primary traces $z_i$ and $z_j$ with a low-pass filter of time constant $\tau_e$ (note that it is the same for all three equations). The typical range of $\tau_e$ is 100 to 1,000 ms. The resulting variables $e_i$, $e_j$ and $e_{ij}$ are called the secondary synaptic traces. We note the introduction of a secondary mutual trace $e_{ij}$, which keeps a trace of the mutual activity of $y_i$ and $y_j$ and will later be used to compute $P(x_i, x_j)$. Note that a mutual trace is impossible to obtain at the first stage of processing, since the direct product $y_i y_j$ is zero most of the time. This is because $y_i$ and $y_j$ are 'spiking' variables and thus equal zero except on the occurrence of a spike, so $y_i y_j$ would be non-zero only when $y_i$ and $y_j$ spike at the exact same time, which almost never happens.
$$p_i(0) = \frac{1}{M_i}, \qquad \frac{dp_i}{dt} = \kappa\, \frac{e_i - p_i}{\tau_p} \tag{3.6}$$
$$p_j(0) = \frac{1}{M_j}, \qquad \frac{dp_j}{dt} = \kappa\, \frac{e_j - p_j}{\tau_p} \tag{3.7}$$
$$p_{ij}(0) = \frac{1}{M_i M_j}, \qquad \frac{dp_{ij}}{dt} = \kappa\, \frac{e_{ij} - p_{ij}}{\tau_p} \tag{3.8}$$
In the third and last stage of processing (equations 3.6, 3.7 and 3.8), we filter the secondary traces $e_i$, $e_j$ and $e_{ij}$ with a low-pass filter of time constant $\tau_p$ (note that it is the same for all three equations). The typical range of $\tau_p$ is 1,000 to 10,000 ms. The resulting variables $p_i$, $p_j$ and $p_{ij}$ are called the tertiary synaptic traces. We also note the presence of a mutual tertiary trace $p_{ij}$ that is a direct approximation of $P(x_i, x_j)$.
$$\beta_i = \begin{cases} \log(\varepsilon) & \text{if } p_i < \varepsilon \\ \log(p_i) & \text{otherwise} \end{cases} \tag{3.9}$$
$$\omega_{ij} = \begin{cases} \log(\varepsilon) & \text{if } \frac{p_{ij}}{p_i\, p_j} < \varepsilon \\ \log\left(\frac{p_{ij}}{p_i\, p_j}\right) & \text{otherwise} \end{cases} \tag{3.10}$$
The equations for updating the weights and biases (equations 3.9 and 3.10) are the classical Bayesian weight and bias equations. Note that these equations change a little in the case of 'pi-sigma' higher-order networks with graded input (equations 2.7 and 2.8). Because we deal only with binary input, we keep these equations unchanged. When $p_i$ takes a small value, it is set to a minimum value $\varepsilon$ in order to avoid a logarithm of zero. The same is done when $\frac{p_{ij}}{p_i\, p_j}$ becomes
too small. We note the presence of the parameter $\kappa$. It is a global 'print-now' signal that regulates the update of the tertiary traces, while leaving the internal structure of the network (primary and secondary traces) unchanged. We will explain its function in further detail later.
The spiking version of the BCPNN learning rule is the set of these 10 equations. It relies on 3 stages of processing that perform the same operation (low-pass filtering) with different temporal dynamics. The parameters that can be controlled are the time constants $\tau_i$, $\tau_j$, $\tau_e$ and $\tau_p$, the initial values of the traces and the print-now signal $\kappa$.
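The full set of ten equations can be sketched as a single forward-Euler simulation; the time step, parameter defaults and function name below are our own assumptions for illustration, not taken from the NEURON implementation:

```python
import numpy as np

def spiking_bcpnn(spikes_i, spikes_j, dt=1.0, tau_i=10.0, tau_j=10.0,
                  tau_e=300.0, tau_p=3000.0, kappa=1.0, eps=1e-4,
                  Mi=10, Mj=10):
    """Forward-Euler sketch of the spiking BCPNN rule (eqs. 3.1-3.10).

    spikes_i, spikes_j : binary arrays (one entry per ms), 1 on a spike.
    Returns the time courses of the weight and the presynaptic bias.
    """
    zi, zj = 1 / Mi, 1 / Mj                   # primary traces (eqs. 3.1-3.2)
    ei, ej, eij = 1 / Mi, 1 / Mj, 1 / (Mi * Mj)  # secondary (eqs. 3.3-3.5)
    pi, pj, pij = 1 / Mi, 1 / Mj, 1 / (Mi * Mj)  # tertiary (eqs. 3.6-3.8)
    w, beta = [], []
    for yi, yj in zip(spikes_i, spikes_j):
        zi += dt * (yi - zi) / tau_i
        zj += dt * (yj - zj) / tau_j
        ei += dt * (zi - ei) / tau_e
        ej += dt * (zj - ej) / tau_e
        eij += dt * (zi * zj - eij) / tau_e   # mutual secondary trace
        pi += dt * kappa * (ei - pi) / tau_p
        pj += dt * kappa * (ej - pj) / tau_p
        pij += dt * kappa * (eij - pij) / tau_p
        r = pij / (pi * pj)
        w.append(np.log(max(r, eps)))         # eq. 3.10
        beta.append(np.log(max(pi, eps)))     # eq. 3.9
    return np.array(w), np.array(beta)
```

Driving the sketch with coincident pre/post spike trains yields a positive weight, while maximally offset trains yield a smaller (negative) one, as the probabilistic interpretation of $p_{ij}/(p_i p_j)$ predicts.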
3.2
Features
3.2.1
Synaptic traces as local state variables
The implementation of local synaptic state variables such as synaptic traces in the above learning rule is a common approach in STDP learning rules [27, 25]. These variables are used to keep a trace or memory of presynaptic or postsynaptic events such as the occurrence of a spike. In addition, low-pass filtering enables us to manipulate continuous variables rather than 'spiking variables', which is problematic when we want to estimate, for example, a joint probability $P(x_i, x_j)$, since the direct product of two spiking variables is likely to be zero, due to the 'impulse' nature of a spike. Indeed, a spike has a very short duration and is often described as a discontinuous variable that is non-zero only on the occurrence of a spike.
Scaling these variables between 0 and 1 is very useful because it makes their quantitative use easier. One can deal with different types of synaptic traces.
Additive trace The additive trace updates the local state variable $x(t)$ by a constant value $A$. The particularity of this trace is that it can be greater than 1 when many events occur in a short time. It is implemented by the following equation
$$\frac{dx}{dt} = -\frac{x}{\tau} + \sum_{t_s} A\, \delta(t - t_s)$$
where $t_s$ denotes the time of occurrence of a spike.
Saturated trace The saturated trace updates the local state variable $x(t)$ to a constant value ($A = 1$ here). This trace is always in the range [0,1] and it keeps only the history of the most recent spike, because it is invariably reset to 1 on the occurrence of a spike. It is implemented by the following equation
$$\frac{dx}{dt} = -\frac{x}{\tau} + \sum_{t_s} (1 - x^-)\, \delta(t - t_s)$$
where $t_s$ denotes the time of occurrence of a spike and $x^-$ is the value of $x$ just before the occurrence of the spike.
Proportional trace Here, the local state variable $x(t)$ is updated by a value proportional to its deviation from 1. This trace is always in the range [0,1] and it realizes a synthesis of the effects of the two traces above. It keeps the value of $x(t)$ close to 1 when many spikes occur in a short time, and it is easy to evaluate the occurrence of the last spike by looking at the exponential decay at a time $t$. The proportional trace is the one we use later. It is implemented by the following equation
$$\frac{dx}{dt} = -\frac{x}{\tau} + \sum_{t_s} k(1 - x^-)\, \delta(t - t_s)$$
with $t_s$ and $x^-$ as described above, and $k$ the proportion of the update. Typically we use $k \in [0.5, 0.8]$. Figure 3.1 shows the dynamics of the 3 different synaptic trace types.
Figure 3.1: Different types of synaptic traces - The upper figure corresponds to a spike train and the lower figure displays the three different synaptic traces: the black, blue and red curves correspond respectively to the additive, saturated and proportional traces.
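The three trace types can be simulated in a few lines; the decay step uses forward Euler with a 1 ms step, and the parameter defaults are illustrative only:

```python
import numpy as np

def run_trace(spike_times, T, tau=20.0, kind="proportional",
              A=0.4, k=0.6, dt=1.0):
    """Simulate one synaptic trace: 'additive', 'saturated' or 'proportional'."""
    x = 0.0
    out = np.empty(T)
    spikes = set(spike_times)
    for t in range(T):
        x -= dt * x / tau                 # exponential decay
        if t in spikes:
            if kind == "additive":
                x += A                    # may exceed 1 for dense spike trains
            elif kind == "saturated":
                x = 1.0                   # reset to 1 (the k = 1 case)
            else:
                x += k * (1.0 - x)        # proportional update towards 1
        out[t] = x
    return out
```

A burst of closely spaced spikes pushes the additive trace above 1, while the saturated and proportional traces stay bounded in [0, 1], matching the descriptions above.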
3.2.2
Spike-timing Dependence
The first stage of the processing of our learning rule (equations 3.1 and 3.2) allows us to create the primary synaptic traces. These variables with very fast dynamics are used as recorders of the spikes: on the occurrence of a spike they are set to a certain value (since we use proportional traces, this value is proportional to the deviation between 1 and the value of the synaptic trace just before the spike, $x^-$) and decay exponentially until another spike occurs. Proportional traces convey two pieces of information: the history of the last spike, obtained by looking at the current decay (if the last spike occurred recently, the trace is steep and decays fast), and the global history of past events (when numerous spikes occur in a short period of time, the trace value comes close to 1).
The dynamics of the primary traces $z_i$ and $z_j$ are controlled by the time constants $\tau_i$ and $\tau_j$. Since these constants can be different, pre-post timing can be promoted over post-pre timing, or the other way around. For instance, if we set $\tau_i$ = 20 ms and $\tau_j$ = 1 ms, then $z_j$ will decay much faster than $z_i$. Then, if a postsynaptic spike occurs 10 ms after a presynaptic spike, the product $z_i z_j$ will be non-zero shortly after the occurrence of the postsynaptic spike. On the other hand, if a presynaptic spike occurs 10 ms after a postsynaptic spike, then the product $z_i z_j$ will still be close to zero because of the fast decay of $z_j$. By setting $\tau_j$ to a small value compared to $\tau_i$, we have given priority to pre-post timing (see figure 3.2).
The values of these two time constants define a spike-timing window (see Bi and Poo 1998 [6]); its width and symmetry can be controlled by manipulating these constants.
Figure 3.2: Different effects of pre-post and post-pre timing on the primary synaptic traces - The upper figure corresponds to a regular post-pre-post spike train. Since the primary traces have different time constants ($\tau_i$ = 50 ms and $\tau_j$ = 5 ms), pre-post timing is promoted over post-pre timing, because the resulting product $z_i z_j$ (not displayed here) is much bigger after pre-post timing than after post-pre timing.
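The asymmetry can be illustrated with a small sketch that evaluates the product of two traces immediately after the second spike of a pair, under the idealization that a spike resets its trace to 1 (the saturated trace of figure 3.1); the function name and spike delay are our own:

```python
import numpy as np

def z_product_after_pair(dt_pair, tau_i=50.0, tau_j=5.0):
    """Value of z_i * z_j just after the second spike of a spike pair.

    dt_pair > 0 means pre-then-post; dt_pair < 0 means post-then-pre.
    Idealized saturated traces: a spike sets its trace to 1, which then
    decays as exp(-t/tau).
    """
    delay = abs(dt_pair)
    if dt_pair > 0:                      # pre at t=0, post at t=delay
        zi = np.exp(-delay / tau_i)      # slow pre trace, partly decayed
        zj = 1.0                         # fast post trace, just reset
    else:                                # post at t=0, pre at t=delay
        zi = 1.0
        zj = np.exp(-delay / tau_j)      # fast post trace, mostly gone
    return zi * zj
```

With a 10 ms delay, pre-post timing yields $e^{-0.2} \approx 0.82$ while post-pre yields $e^{-2} \approx 0.14$, reproducing the asymmetric window described above.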
3.2.3
Delayed-Reward Learning
It can be a little puzzling to realize that our learning rule has three stages of processing while we always perform the same operation (low-pass filtering). However, these three filtering procedures perform three very specific and different tasks. As observed in previous models (Bi and Poo 1998, Rubin et al. 2005, Morrison et al. 2008, Mayr et al. 2009) [6, 30, 27, 25], the exact spike timing between presynaptic and postsynaptic spikes plays a crucial role in LTP. Moreover, a time window of 20 ms before and after a postsynaptic spike seems to exist, such that no long-lasting change occurs if delays between spikes are greater than 20 ms.
However, the activity in the network needs to be long-lasting and to reverberate on a much greater time scale. In the context of delayed reward learning [28] and reinforcement learning, the reward, which triggers the induction of LTP, occurs with a delay on a time scale of hundreds of milliseconds to seconds. Worse, this delay is not predictable, so that one cannot know when the reward and the actual learning will take place. In order to solve this problem, we include secondary traces that extend the reverberation of activity in the network.
Then, when a spike occurs, activity is recorded in the primary and secondary traces. After a few hundred milliseconds, the activity has disappeared from the primary traces, but is still reverberating in the secondary traces $e_i$, $e_j$ and $e_{ij}$ (equations 3.3, 3.4 and 3.5). Thus, if the print-now signal, representing the reward, is set to 1, the secondary traces convey the information and learning can still take place.
Figure 3.3: Temporal dynamics of the different synaptic traces - Thin curves correspond to the primary traces, thicker curves to the secondary ones and bold curves to the tertiary traces. Blue corresponds to presynaptic traces, red corresponds to postsynaptic variables and black corresponds to mutual traces. The temporal dynamics are slowest for the tertiary traces, which build up and decrease slowly. The combination of these three levels of processing enables us to achieve different goals.
It is important to stress that both of these traces are required if we want to account for the following phenomena: the existence of a spike-timing window on the order of tens of milliseconds (about 20 ms for spike delays) outside of which no significant weight change takes place, and the fact that the reward enhancing the learning process comes with a delay on a time scale of hundreds of milliseconds. As we will see later, there are biological equivalents to this print-now signal and delayed synaptic traces.
Figure 3.3 shows the temporal dynamics of the primary, secondary and tertiary
traces for a pattern stimulation followed by no activity.
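The gating role of the print-now signal can be sketched on a single tertiary trace; the input trace, parameter values and function name below are illustrative assumptions:

```python
import numpy as np

def tertiary_with_print_now(e_trace, kappa_signal, tau_p=2000.0,
                            p0=0.1, dt=1.0):
    """Tertiary trace gated by the print-now signal kappa (cf. eqs. 3.6-3.8)."""
    p = p0
    out = np.empty(len(e_trace))
    for t, (e, kap) in enumerate(zip(e_trace, kappa_signal)):
        p += dt * kap * (e - p) / tau_p   # kappa = 0 freezes the trace
        out[t] = p
    return out
```

With $\kappa = 0$ the tertiary trace stays frozen at its initial value even while the secondary trace is active, whereas with $\kappa = 1$ it slowly builds up towards the secondary activity, which is exactly the delayed-reward behaviour described above.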
3.2.4
Long-term Memory
Finally, the third stage of processing (equations 3.6, 3.7 and 3.8) computes synaptic state variables that have much slower dynamics. Typically, $p_i$, $p_j$ and $p_{ij}$ account for long-term memory, meaning that they store events that have been repeated over a set of actions and experiments.
We assume that our learning rule operates in the context of delayed reward learning, and we take the example of an animal, a rat for instance, being presented with several buttons that open doors, behind which some food (reward) is present or absent. The primary traces, with their fast dynamics, record the precise spike timing when activity spreads through the network as a consequence of taking actions (stimulus, button pressing). The secondary traces account for the delayed delivery of the reward, which comes as a delayed result of action-taking. If the rat obtains the reward, then the print-now signal is set to 1 and long-term memory is triggered. The tertiary traces are activated when the delayed reward has been obtained several times and the stimulus has been reinforced. This means that $p_i$, $p_j$ and $p_{ij}$ build up when the activities of the secondary traces have repeatedly been above a certain baseline. Then, reinforcement occurs and memories can be stored.
It may seem singular, however, that the print-now signal κ appears at this stage of processing. It could equally have appeared in equations 3.3, 3.4 and 3.5, but the biological equivalent of the print-now signal suggests that the metabolic changes occur even when it is not active, whereas the weights are overwritten only when the print-now signal is active. Thus, it makes more sense for it to appear right before the weight update.
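The cascade described above can be sketched compactly. The following sketch (written in Python rather than MATLAB, for brevity) integrates a three-stage trace cascade with the forward Euler method; since equations 3.1 to 3.10 are not reproduced here, the low-pass filtering form, the time constants, the ε floor and the logarithmic weight and bias updates are assumptions based on the description in this chapter.

```python
import math

def simulate_bcpnn_traces(spikes_i, spikes_j, dt=1.0, tau_z=10.0,
                          tau_e=100.0, tau_p=1000.0, kappa=1.0, eps=0.01):
    """Euler integration of a three-stage BCPNN trace cascade (a sketch).

    spikes_i, spikes_j: lists of 0/1 values per 1-ms bin (input activities).
    Returns the final weight and bias computed from the tertiary traces.
    """
    zi = zj = eps
    ei = ej = eij = pi = pj = pij = eps
    for yi, yj in zip(spikes_i, spikes_j):
        # primary traces: fast filtering of the spike trains
        zi += dt * (yi - zi) / tau_z
        zj += dt * (yj - zj) / tau_z
        # secondary traces: slower filtering; mutual trace from the product
        ei += dt * (zi - ei) / tau_e
        ej += dt * (zj - ej) / tau_e
        eij += dt * (zi * zj - eij) / tau_e
        # tertiary traces: long-term memory, gated by the print-now signal
        pi += dt * kappa * (ei - pi) / tau_p
        pj += dt * kappa * (ej - pj) / tau_p
        pij += dt * kappa * (eij - pij) / tau_p
    w = math.log(pij / (pi * pj))   # weight from the probability estimates
    beta = math.log(pj)             # bias from the postsynaptic estimate
    return w, beta
```

With this structure, synchronous pre- and postsynaptic firing drives the mutual trace, and hence the weight, above what uncorrelated firing would produce.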
3.2.5 Probabilistic features
It is important to keep in mind that our spiking version of the BCPNN learning rule is not just another implementation of a pair-based STDP learning rule. Indeed, the state variables that we calculate represent probabilities, and their values have an intrinsic meaning of their own. This is the main reason why feeding graded input to the network is not trivial: the network interprets activities as probabilities. As discussed previously, input to the units represents the confidence of feature detection and the output represents the posterior probability of the outcome.
In the original counter model, P (xi ) and P (xi , xj ) were quite easy to approximate by counting occurrences and co-occurrences of the features within the training set. Due to the spiking structure of the input variables yi and yj , it is a bit trickier to evaluate the probabilities P (xi ) and P (xi , xj ). The use of synaptic traces allows us to create mutual traces eij and pij that convey the information about correlation between spikes.
3.3 Biological relevance
This new version of the BCPNN learning rule shows biological relevance on several levels. The first is the use of synaptic traces, which are thought to have a direct biological meaning. For instance, when a presynaptic spike arrives at a synapse, there is a quantified release of neurotransmitters. Depending on the nature of the synapse, either the additive trace or the saturated trace might apply: the first when the amount of transmitter is small compared to the synapse size, so that the occurrence of a new spike has an additive effect because enough free receptors are available for synaptic transmission, and the second when the quantity of neurotransmitter released reaches the maximum capacity of the synapse, which means that the synapse saturates all of its available receptors on the occurrence of each presynaptic spike.
Another direct equivalent is the ‘print-now signal’, which can be seen as the concentration of a memory modulator like dopamine, thought to have a direct enhancing effect on learning and memory when present in high quantities. The delayed-reward mechanism has indeed a direct biological relevance and has been observed experimentally (Potjans et al. 2009) [28].
As explained before, the mixture of variables with slow and fast temporal dynamics makes sense and fits what has been observed. The concentration of calcium ions at the postsynaptic site is thought to play a key role in synaptic plasticity [30], with much faster dynamics than the protein synthesis governing the transition from early-LTP to late-LTP [9].
Clopath et al. [9] present a model to account for the transition from early to late-LTP, containing three different phases: Tag, Trigger and Consolidation. A synapse can be in one of three states: untagged, tagged for LTP (high state) or tagged for LTD (low state), depending on presynaptic and postsynaptic events. If the total number of tagged synapses exceeds a threshold, a trigger process occurs and opens the way for consolidation (long-lasting changes in synaptic efficacy). What is similar in our model is the three different temporal dynamics. The secondary mutual trace eij can be seen as an equivalent of the tagging procedure: if its value stays above a threshold for long enough, then metabolic changes, such as specific protein synthesis, occur, allowing for conversion from working memory to long-term permanent memory.
Chapter 4
Abstract Units Implementation
In the next two chapters, we present different implementations of the spiking version of the BCPNN learning rule introduced previously. The first implementation consists of abstract units in MATLAB and serves as a gateway towards spiking neuron models in NEURON. For each model, we explain how we present patterns to the cells, implement the learning rule and use the model in retrieval mode.
Due to its ability to handle vectors and matrices, MATLAB serves as a convenient computational tool for building artificial neural networks. Its built-in functions allow a great variety of 2D and 3D graphical display. One can also import data computed elsewhere into MATLAB and process it at will. But MATLAB loses much of its computational power when it has to process data procedurally, which is the case for our differential equations. In our learning rule, we have to update and compute multiple variables at each time step, because we deal with three sets of first-order linear differential equations (equations 3.1 to 3.8). Since these computations cannot be gathered in a matrix and all treated in batch fashion, MATLAB is structurally inefficient for our task.
However, we can use it for single-synapse learning (only two units: one presynaptic and one postsynaptic) on reasonable time-scales (between 1,000 ms and 10,000 ms) and exploit its graphical display facilities, which is why we first implemented our learning rule in MATLAB. The aim is qualitative: displaying weights and biases corresponding to different input patterns and giving an insight into the synapse’s internal dynamics (the time-courses of the primary, secondary and tertiary traces).
4.1 Pattern presentation
In this section, we explain how we presented patterns to the units; in other words, how input is fed to the network. We have three ways to present patterns: non-spiking, frequency-based spiking and Poisson-generated spiking. It is to be noted that throughout the following chapters we focus on single-synapse learning, meaning that we deal with two units (presynaptic and postsynaptic) connected by a single synapse.
4.1.1 Non-spiking Pattern Presentation
As a starting point for our investigations and a reference for our further results, we test our learning rule by feeding patterns in a similar way to what has been done before with non-spiking units (Sandberg et al. 2002) [31]. To achieve this, we clamp the input of the presynaptic and postsynaptic units yi and yj to the respective values ki and kj during a presentation time of some tens of milliseconds. The values ki and kj can take either binary values or a continuous value in the range [0, 1] (graded input). Patterns are fed to the network sequentially. For instance, if the set of input patterns we want to learn is {(1, 1), (0, 0), (1, 1), (0, 1), (1, 1)}, then yi will be clamped to the values (1, 0, 1, 0, 1) and yj to (1, 0, 1, 1, 1). The input variables yi and yj are ‘stepped’ and discontinuous (see Figure 4.1a). Hence, abstract units are artificial, because no biologically observed variable takes constant values or exhibits such a discontinuous time-course.
The time of presentation is important: it needs to be long enough for the primary traces to retain pattern activities (the longer the pattern is seen, the stronger the memory), but it is also valuable to impose some resting time between patterns. Indeed, during each pattern presentation, the network needs to adapt to it and rearrange its internal structure, and between patterns it needs to rest for a short while, so that the internal variables with fast dynamics return to their baseline. An analogy is that when we are learning different things, we always need some adaptation to jump from one thing to another. We will expand on this in the Discussion section.
On the other hand, when we want to teach a concept to our network through a temporal series of patterns, the time-scale of the learning phase needs to be shorter than the dynamics of the long-term memory traces pi , pj and pij ; otherwise the synapse forgets what has been fed to it in the past. If the long-term memory time-constant τp equals 1 second, then after 5 seconds past events will be discarded, so in this case it does not make sense to have a learning procedure that takes longer than 5 seconds. In a nutshell, learning procedures should not outlast the forgetting of our long-term memory.
In MATLAB, the function generate_pattern_nonspiking generates a driving input x(t) from a series of parameters: delay, the resting time between pattern presentations; dur, the duration of presentation of one pattern; T, the length of the output; and pattern, a vector containing the values for the driving input x(t). Figure 4.1a shows the input activity of an abstract unit fed with the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75].
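As an illustration, such a generator can be sketched as follows (a Python sketch, not the MATLAB function itself; the parameter names mirror the description above and the 1 ms time grid is an assumption):

```python
def generate_pattern_nonspiking(pattern, dur, delay, T, dt=1.0):
    """Build a stepped driving input x(t): each pattern value is held for
    `dur` ms, followed by `delay` ms of rest (x = 0); total length T ms."""
    n = int(T / dt)
    x = [0.0] * n
    t0 = 0
    for value in pattern:
        # clamp the driving input to the pattern value during presentation
        for k in range(t0, min(t0 + int(dur / dt), n)):
            x[k] = value
        t0 += int((dur + delay) / dt)  # skip over the resting period
    return x
```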
4.1.2 Spiking Frequency-based Pattern Presentation
Because of its biological irrelevance, the previous pattern presentation scheme is limited. This time, we try to mimic the ‘spiking behaviour’ of the membrane voltage observed in real experiments. Our spike generation in MATLAB is still artificial, but we are making progress in imitating spiking behaviour. We build up artificial spiking voltages by setting the input variable yi to 1 on the occurrence of a spike and to zero otherwise. If ts denotes the occurrence time of a spike for unit i, then our input variable yi can be rewritten

yi (t) = Σ_{ts} δ(t − ts )
Pattern presentation to the input units is now based on their firing frequency rather than on a fixed stepped value. The idea is to realise a linear mapping from a value of xi between 0 and 1 (representing the confidence of feature detection developed in previous chapters) to a frequency fi . To achieve this, the value 1 for xi is mapped to a maximum frequency fmax and other values between 0 and 1 to a directly proportional value in the range [0, fmax ] (i.e. 0.5 is mapped to fmax /2, 0.25 to fmax /4, and so on). By doing this, we have created an input filter that transcribes a graded input xi (t) between 0 and 1 into a spiking time-dependent variable yi (t). We will later refer to the stepped value xi (t) as the driving input and to yi (t) as the actual input activity, the first being used only for pattern presentation and the latter to compute the synaptic traces, hence the weights and biases.
An important feature of the frequency-based pattern presentation is that it allows us to easily control the timing between presynaptic and postsynaptic spikes. This offers an implementation possibility when we want to investigate the effects of exact spike timing on the weight modification in our learning rule.
In MATLAB, the function generate_frequency_spiking generates an input activity y(t) from a driving input x(t). The series of parameters is similar to that of the previous section and includes a value fmax, which corresponds to the maximum output frequency (when x(t) takes a value of 1). In order to generate spikes, we discretize the time-scale into intervals of 1 millisecond: when a spike occurs at a specific time t0 , the value y(t0 ) is simply set to 1. Figure 4.1b shows the input activity of an abstract unit fed with the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75].
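The frequency mapping can be sketched like this (again a Python sketch rather than the MATLAB function; the handling of the resting periods is an assumption):

```python
def generate_frequency_spiking(x, fmax, dt=1.0):
    """Map a driving input x(t) (values in [0, 1], one per 1-ms bin) to a
    regular spike train y(t): a value v fires at frequency v * fmax (Hz)."""
    y = [0] * len(x)
    next_spike = 0.0
    for t, v in enumerate(x):
        if v <= 0:
            next_spike = t + 1  # no activity: resume spiking when x > 0 again
            continue
        if t >= next_spike:
            y[t] = 1  # emit a spike in this 1-ms bin
            # inter-spike interval in ms for the current driving value
            next_spike = t + 1000.0 / (v * fmax * dt)
    return y
```

For x = 1 and fmax = 100 Hz this yields one spike every 10 ms, i.e. 100 spikes per second, and halving x doubles the interval, as described above.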
4.1.3 Spiking Poisson-generated Pattern Presentation
We take one more step in the direction of mimicking neural-like data by implementing Poisson spike trains to feed the input units. In the cortex, the timing of successive action potentials is highly irregular, and we can view the irregular interspike interval as a random process. This implies that an instantaneous estimate of the spike rate can be obtained by averaging the pooled responses of many individual neurons, but the precise timing of individual spikes conveys little information. The benefit of the Poisson process for spike generation is that it adds randomness and discards the determinism in our simulation (each random seed will give different spike trains). Thus, we focus on the parameters underlying this random process rather than modeling precise coincidences of presynaptic and postsynaptic events.
Figure 4.1: Abstract Units Pattern Presentations corresponding to the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75] - (a) Non-Spiking, (b) Spiking Frequency-based, (c) Spiking Poisson-generated.
We assume here that the generation of each spike depends only on an underlying signal r(t) that we will refer to as an instantaneous firing rate. It follows that the generation of each spike is independent of all the other spikes, which is called the spike-independence hypothesis. In addition, we make the assumption that the firing rate r(t) is constant over time (actually r(t) is updated in steps, but for one pattern we can suppose that r(t) = r). The Poisson process is then said to be homogeneous.
In a Poisson process, the probability that n events occur during ∆t at an instantaneous rate r is given by the formula:

P({n spikes during ∆t}) = e^{−r∆t} (r∆t)^n / n!   (4.1)

By setting n = 0 and ∆t = τ, we obtain P({next spike occurs after τ}) = e^{−rτ}, and it follows that

P({next spike occurs before τ}) = 1 − e^{−rτ}   (4.2)
One way to implement a Poisson spike train is to use equation 4.2: we generate a random number between 0 and 1, and the inter-spike interval is given by the value of τ that realizes the identity. The drawback of this method is that the spike train has to be created sequentially. Alternatively, we can create a whole Poisson spike train at once as follows.
The average spike count between t1 and t2 can be defined from the instantaneous firing rate by ⟨n⟩ = ∫_{t1}^{t2} r(t) dt, and for sufficiently small intervals, t1 = t − δt/2 and t2 = t + δt/2, the average spike count can be approximated by ⟨n⟩ = r(t)δt = rδt under the homogeneous Poisson process hypothesis. Furthermore, when δt is small enough, the average spike count equals the probability of firing a single spike:

P({one spike occurs during the interval (t − δt/2, t + δt/2)}) = rδt   (4.3)
Now, assuming δt is small enough (usually 1 ms), if we want to create a spike train of arbitrary length T using 4.3, we need to generate T/δt random numbers pi between 0 and 1. Then, if pi < rδt, we generate a spike at the time corresponding to the index of pi ; if not, no spike is generated.
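This recipe can be sketched directly (a Python sketch; the conversion of fmax in Hz to a per-millisecond rate is an assumption):

```python
import random

def generate_poisson_spiking(x, fmax, dt=1.0, seed=None):
    """Draw one uniform number per 1-ms bin; a spike occurs in bin t with
    probability r*dt, where the rate follows the driving input: r = x(t)*fmax."""
    rng = random.Random(seed)  # fixing the seed makes runs reproducible
    y = []
    for v in x:
        r = v * fmax / 1000.0  # rate in spikes per millisecond
        y.append(1 if rng.random() < r * dt else 0)
    return y
```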
The Poisson spike generation is an intermediate stage towards the NEURON implementations. It allows us to account for random rate-based spike generation. This is valuable because the process is easy to implement and gives us an idea of whether our model responds well to noisy or random data. Later, some noisy spike trains may be added to our data so that it resembles what is observed in vivo.
In MATLAB, the function generate_poisson_spiking generates an input activity y(t) from a driving input x(t). The series of parameters is similar to that of the previous section, and the rate r is set to the same value as the frequency fmax used before. We stress that the Poisson generation of spike trains is based on a random process: each seed gives a different input activity y(t) for the same driving input x(t). By setting the same seed in two runs, they become identical. Figure 4.1c shows the input activity of an abstract unit fed with the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75].
4.2 Learning Rule Implementation
In order to solve the differential equations in MATLAB, we used the solver ode45. Its use is quite straightforward, except that it works through function handles, which makes it tricky to control intrinsic equation parameters like the time-constant τi or the print-now signal κ. If the implementation of the learning rule follows the set of equations 3.1 to 3.10, a non-negligible phenomenon arises: spikes are modeled by a sum of unit impulse functions in MATLAB, and the solver is likely to miss them, because at each time-step it evaluates the derivative at a point using points in its neighbourhood. Not only are the spiking variables highly discontinuous, but they are also zero most of the time, which prevents the solver ode45 from detecting any activity.
A solution to this problem is to introduce a ‘duration’ δt for the spikes (typically δt equals 1 to 2 milliseconds), so that the mathematical model of a spike switches from an impulse function to a rectangular pulse of width δt centered on ts . But in that case, 1/τi is an upper bound for dzi /dt (see equation 3.1), which results in only a small increase of the primary trace zi (t). This propagates to the secondary and tertiary traces, which, as a result, hardly exceed 0.001, which is highly undesirable because they are supposed to represent probabilities of activation.
To bypass these problems, we decide to split the set of equations 3.1 to 3.10 into two phases. First, we update the primary traces with the help of an auxiliary function generate_primary_trace, which solves equation 4.4:

Zs = zi⁻(ts ) + r(1 − zi⁻(ts ))   if xi (ts ) = 1
zi (t) = Zs e^{−(t−ts )/τi }   (4.4)

where ts records the occurrence of the last spike, zi⁻ denotes the value of the trace just before that spike, and Zs is updated according to the proportional trace update.
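An event-driven version of this update can be sketched as follows (a Python sketch; the jump size r and the time constant are placeholder values, not the thesis parameters):

```python
import math

def generate_primary_trace(spike_times, T, tau=10.0, r=0.1, dt=1.0):
    """Closed-form primary trace (eq. 4.4): on each spike the trace jumps
    proportionally, Z <- z + r*(1 - z), then decays as Z*exp(-(t - ts)/tau)."""
    z = []
    Z, ts = 0.0, None
    spikes = set(spike_times)
    for step in range(int(T / dt)):
        t = step * dt
        if t in spikes:
            # proportional (saturating) update: the trace never exceeds 1
            current = Z * math.exp(-(t - ts) / tau) if ts is not None else 0.0
            Z = current + r * (1.0 - current)
            ts = t
        z.append(Z * math.exp(-(t - ts) / tau) if ts is not None else 0.0)
    return z
```

Because the decay between spikes is evaluated analytically, no solver time-step can miss a spike, which is exactly the problem this split avoids.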
The set of equations 3.3 to 3.8 is solved separately using the solver ode45. Special care has to be taken with the time-step increment in order to find a trade-off between computing time and accuracy. The weight update 3.10 and the bias update 3.9 are straightforward.
Finally, it is important to mention that we have implemented ‘off-line learning’, in the sense that weights and biases are updated independently of each other. Everything happens as if there were no connection at all between the cells. This is not really a constraint during learning; on the contrary, it is rather convenient. What remains to be investigated is when learning should occur and when inference should take over. In our model of abstract units, though, the boundary between learning and inference is clear, because they are governed by different formulae used in different contexts.
4.3 Retrieval
If the learning phase is central to our implementation, it is also crucial to check that the stored patterns can be retrieved correctly. The acquired knowledge should be easily retrievable, especially when we use the BCPNN as an auto- or hetero-associative memory.
Thus, in this section, we assume that a certain learning phase has already occurred and that the weight ωij and the bias βj are set. Our goal is to present an incomplete pattern and to check whether the network is able to complete it correctly. Since we only deal with one synapse, input is fed to the presynaptic unit and output is collected at the postsynaptic unit.
Because we have three different pattern presentation schemes in our abstract units model, inference is done in three different fashions. In all cases, however, the retrieval phase aims to realise an input-output mapping from unit i to unit j. Quantitative results are presented in the next chapter; here we focus on the method that enables us to achieve this.
Non-Spiking Inference
This case is the simplest, because the activity of a unit is constant over time (for the duration of one pattern presentation). In other words, because there is no difference between the driving input xi (t) and the input activity yi (t), the input-output mapping is straightforward. Assuming that unit i is fed an input pattern corresponding to the driving input xi = ki , we first compute the support value hj of unit j with hj = ωij xi + βj , and then update the output unit activity with xj = e^{hj }. Finally, the input-output mapping is realized by equation 4.5:

xj = e^{ωij xi + βj }   (4.5)
In order to produce the input-output relationship curve, we compute the output xj according to equation 4.5 for a set of input values xi regularly spaced between 0 and 1. We end up with an output vector y mapped to an input vector x. Note that the above equation is the same as the equation presented in Chapter 2 (section 2.3.4), with only two units. If the learning phase has been successful, xj is nothing but the a posteriori probability of unit j given unit i.
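As a worked example of equation 4.5 (the probability values below are purely illustrative, not taken from the thesis): with βj = log P(xj) and ωij = log(P(xj | xi)/P(xj)), feeding xi = 0 returns the prior and xi = 1 returns the posterior.

```python
import math

def nonspiking_inference(x_values, w_ij, beta_j):
    """Input-output mapping of eq. 4.5: x_j = exp(w_ij * x_i + beta_j)."""
    return [math.exp(w_ij * xi + beta_j) for xi in x_values]

# illustrative numbers: prior P(x_j) = 0.2, posterior P(x_j | x_i) = 0.8
beta = math.log(0.2)        # bias encodes the prior
w = math.log(0.8 / 0.2)     # weight encodes the log-likelihood ratio
out = nonspiking_inference([0.0, 1.0], w, beta)
```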
Spiking Inference
In the spiking frequency-based and Poisson-generated pattern presentation schemes, the input activity yi (t) is not constant over time. A value ki of the driving input xi (t) corresponds to a firing frequency fi in one case and to a firing rate in the other. Thus, the inference for one value of xi is not given by a direct calculation as in equation 4.5; instead, it depends on the time-course of the spiking input activity yi (t), governed by the driving input value xi of unit i. This input activity yi (t) needs to be processed to calculate a corresponding output value xj . In order to map an input value xi to a number xj between 0 and 1, we proceed as follows:
1. We generate a regular spiking input yi (t) with frequency fi (FS) or a Poisson spike train with rate fi (PS), during a time Tinf equal to 5 seconds. The firing frequency or rate obeys fi = xi · fmax with xi ∈ [0, 1].

2. We compute a support activity sj (t) according to the relation sj (t) = ωij yi (t) + βj .

3. The support activity sj (t) is then low-pass filtered by a filter with a high time-constant τf and a slow update value k:

dŝj /dt = (k(sj − βj ) − ŝj ) / τf

4. We take the exponential of the filtered support activity ŝj (t).

5. xj is finally set to the mean stationary value of the output activity ⟨yj (t∞ )⟩.
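The five steps above can be sketched as follows (a Python sketch; the exact scaling of the update value k and the averaging window are assumptions, so only the qualitative behaviour should be read from it):

```python
import math

def spiking_inference(y, w_ij, beta_j, fmax, tau_f=500.0, dt=1.0):
    """Steps 2-5: support activity, low-pass filtering with update rate
    k = 1/fmax, exponentiation, and the mean stationary output value."""
    k = 1.0 / fmax
    s_hat = 0.0
    trace = []
    for spike in y:
        s_j = w_ij * spike + beta_j                 # step 2: s_j(t)
        # step 3: filter driven only by the weight contribution k*(s_j - beta_j)
        s_hat += dt * (k * (s_j - beta_j) - s_hat) / tau_f
        trace.append(math.exp(s_hat))               # step 4: exponentiation
    tail = trace[len(trace) // 2:]                  # step 5: stationary mean
    return sum(tail) / len(tail)
```

With a negative weight, the filtered support stays below zero, so the output lies in (0, 1) and decreases as the input frequency grows.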
Figure 4.2: Spiking Inference with abstract units - Different stages of processing
Figure 4.2 shows these different stages of processing. This technique, despite its apparent complexity, gives a good fit to the previous non-spiking mapping. The key procedure occurs at step 3, when we filter the support activity sj (t). This variable is equal to the bias term βj when the input unit does not spike, and it is set to the value βj + ωij on the occurrence of a spike. When we filter with a specific low-pass filter (high time constant, small update rate), we can generate a filtered support activity ŝj (t), which works as an additive trace. Hence, the value of ŝj (t) at the end of the stimulation gives a measure of the firing frequency of the cell. The direction of update k(sj − βj ) is proportional to the weight value ωij , which allows negative or positive build-up according to the sign of ωij . Typically, we use τf = 500 ms and k = 1/fmax .
Step 4 is needed to keep the inference equation homogeneous. It is crucial that ŝj (t) stays in the range (−∞, 0], because we want to get a value of xj between 0 and 1. This can be controlled either by the value k or by modifying the filtering equation as in the case of saturated traces (see Chapter 3). The biological model is composed of steps 3 and 4, because we can draw an analogy between these processes and what occurs at the synapse level. In short, the filtering accounts for synaptic integration with low release of neurotransmitters and slow degradation. The exponentiation in step 4 is observed in the current-frequency mapping of a cell (the current-discharge relationship).
For the Poisson-generated spike trains, the underlying random process gives a different output at each run. Thus, we have to compute average values after repeating the same inference process over several runs (between 5 and 10). There is a trade-off between discarding the randomness by increasing the number of runs and the computing time for simulations. It is also important to keep the randomness introduced by the Poisson process, because it accounts for the irregular spiking observed in real neurons.
Chapter 5
Hodgkin-Huxley Spiking Implementation in NEURON
NEURON is a simulation environment for modeling individual neurons and
neural networks. It was primarily developed by Michael Hines, John W. Moore,
and Ted Carnevale at Yale and Duke. Documentation about NEURON and how to
implement models in NEURON is given in the NEURON book [8].
NEURON, together with the object-oriented NMODL language, offers an efficient means to run simulations of highly connected networks of neurons. Built on the C language paradigm, it does not suffer from procedural processing of data and uses efficient, fast algorithms to solve differential equations. The computing time of the abstract units model is thereby reduced by a factor of 10.
5.1 Cell Model
5.1.1 Hodgkin-Huxley Model
In 1952, Alan Lloyd Hodgkin and Andrew Huxley proposed a model to explain the ionic mechanisms underlying the initiation and propagation of action potentials in the squid giant axon [14]. They received the Nobel Prize in Physiology or Medicine in 1963 for this work, and the model has since been referred to as the Hodgkin-Huxley model. It describes how action potentials in neurons are initiated and propagated with the help of a set of nonlinear ordinary differential equations which approximate the electrical characteristics of excitable cells such as neurons and cardiac myocytes [2].
The main idea behind the Hodgkin-Huxley formalism is to give an electrical equivalent to each biological component of the cell that plays a role in the transmission of action potentials, which is the support of signaling within the cell. The components of a typical Hodgkin-Huxley model, shown in Figure 5.1, include:
Figure 5.1: Hodgkin-Huxley model of a cell

• A capacitance Cm , representing the lipid bilayer. A cell, considered as a whole, is electrically neutral, but the neighbourhood surrounding the cell membrane is not. Membrane voltage is the consequence of the accumulation of charged particles on both sides of that bilayer, which is impermeable to ions. A typical value for Cm is 1 nF.
• Nonlinear electrical conductances gn (Vm , t), representing voltage-gated ion channels. Their behaviour is described by gating variables for the open, closed and inactivated states (see Appendix for equations). These conductances are both voltage- and time-dependent: gn (Vm , t), where n denotes a specific ion species. In addition, they exhibit fast dynamics, because they account for the regenerative properties of the cell implied in the propagation of action potentials.
• A linear conductance gleak for passive leak channels: channels that are not ion-selective, are always open, and contribute to the resting membrane potential. A typical value for gleak is 20 µS·cm−2 .
• Generators En , describing the electrochemical gradients driving the flow of ions, the values of which are determined from the Nernst potentials of the ionic species of interest.
This model can be extended by modeling ion pumps with the help of current sources (the sodium-potassium pump is responsible for the equilibrium of concentrations inside and outside the cell). More elaborate models include chloride and calcium voltage-gated currents; however, we only deal here with two ionic currents, sodium and potassium, and one leakage channel.
Furthermore, our cell model will contain additional channels (see figure 5.3.2): a slow-dynamics voltage-gated potassium channel accounting for spike-frequency adaptation (see section 5.1.2), and an activity-dependent potassium channel modeling intrinsic excitability (see section 5.3.2).
As a convention, we will use I > 0 when ions flow from the outside to the inside of the cell, so that, in the normal cell dynamics, the sodium current takes positive values and the potassium current takes negative values. The voltage equation is given by the relation between the applied current Iapp , the capacitive current Ic and the sum of the ionic and leak currents Iion :

Iapp = Ic + Iion = Cm dVm /dt + INa + IK + Ileak   (5.1)

We see that when Iapp > 0, then dVm /dt > 0 and the membrane voltage becomes more positive (depolarization). The detailed dynamics of the voltage and gating variable equations are given in the Appendix.
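For reference, equation 5.1 can be sketched with the classic squid-axon parameters (a Python sketch, not our NEURON model; note that the ionic currents are written in the usual outward-positive form I = g(V − E), so they enter with a minus sign, the opposite of the inward-positive convention adopted above):

```python
import math

def hh_step(V, m, h, n, I_app, dt=0.01):
    """One Euler step of the Hodgkin-Huxley voltage equation (5.1), with
    Cm = 1 uF/cm2, gNa = 120, gK = 36, gL = 0.3 mS/cm2, ENa = 50 mV,
    EK = -77 mV, EL = -54.4 mV (classic squid-axon values)."""
    alpha_m = 0.1 * (V + 40.0) / (1.0 - math.exp(-(V + 40.0) / 10.0))
    beta_m = 4.0 * math.exp(-(V + 65.0) / 18.0)
    alpha_h = 0.07 * math.exp(-(V + 65.0) / 20.0)
    beta_h = 1.0 / (1.0 + math.exp(-(V + 35.0) / 10.0))
    alpha_n = 0.01 * (V + 55.0) / (1.0 - math.exp(-(V + 55.0) / 10.0))
    beta_n = 0.125 * math.exp(-(V + 65.0) / 80.0)
    I_Na = 120.0 * m ** 3 * h * (V - 50.0)   # fast sodium current
    I_K = 36.0 * n ** 4 * (V + 77.0)         # delayed-rectifier potassium current
    I_leak = 0.3 * (V + 54.4)                # passive leak current
    V += dt * (I_app - I_Na - I_K - I_leak)  # eq. 5.1 rearranged for dVm/dt (Cm = 1)
    m += dt * (alpha_m * (1 - m) - beta_m * m)
    h += dt * (alpha_h * (1 - h) - beta_h * h)
    n += dt * (alpha_n * (1 - n) - beta_n * n)
    return V, m, h, n
```

Starting from rest (V ≈ −65 mV), a sustained injected current of about 10 µA/cm² drives repetitive firing, while with no current the membrane stays near rest.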
5.1.2 Spike Frequency Adaptation
Spike-frequency adaptation is a type of neural adaptation that plays a key role in the regulation of neuronal firing frequency. It is characterized by an increase of the interspike interval when a neuron is current-clamped. Among other mechanisms, various ionic currents modulating spike generation cause this type of adaptation: voltage-gated potassium currents (M-type currents), the interplay of calcium currents and intracellular calcium dynamics with calcium-gated potassium channels (AHP-type currents), and the slow recovery from inactivation of the fast sodium current (Benda et al. 2003) [35]. Spike-frequency adaptation can also account for findings on burst firing (Azouz et al. 2000) [5].
Figure 5.2: Spike-frequency Adaptation: Membrane voltage and state variable p
In our model, spike-frequency adaptation is taken into account by adding a slow-dynamics voltage-gated potassium channel. The conductance of this channel is nonlinear and depends on the membrane voltage Vm . It is described by an activation variable p that works in a similar way to an additive synaptic trace (see Figure 5.2). The dynamics of the channel are given by the following equations:
gkim (Vm , t) = ḡkim · p   (5.2)

dp/dt = (p∞ (Vm , t) − p) / τ(Vm , t)   (5.3)
Figure 5.2 describes the build-up of the trace p and of the conductance gkim , which is responsible for the increase of the interspike interval during the stimulation. Because of its slow decay, the delay between stimulations must be much longer than the stimulation itself for the p variable to return to baseline. The slow dynamics of this channel suggest that repeated strong transient stimulation has a greater effect than long-lasting stimulation.
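To illustrate the effect in isolation, here is a deliberately simplified caricature (a leaky integrate-and-fire unit, not our Hodgkin-Huxley model; all parameter values are illustrative): an adaptation variable p is incremented at each spike and decays slowly, adding an outward current that progressively lengthens the interspike interval.

```python
def lif_with_adaptation(I, T=500.0, dt=0.1, tau_m=10.0, v_th=1.0,
                        tau_p=200.0, g_adapt=0.5, dp=0.2):
    """Leaky integrate-and-fire caricature of spike-frequency adaptation:
    the slow variable p plays the role of the channel activation in eq. 5.3."""
    v, p = 0.0, 0.0
    spike_times = []
    for i in range(int(T / dt)):
        t = i * dt
        v += dt * (I - v - g_adapt * p) / tau_m  # membrane with adaptation current
        p += dt * (-p / tau_p)                   # slow decay of the trace p
        if v >= v_th:
            spike_times.append(t)
            v = 0.0     # reset after a spike
            p += dp     # adaptation builds up with every spike
    return spike_times
```

Under constant drive, the interspike intervals grow from a few milliseconds towards a longer steady-state value as p accumulates, which is the signature shown in Figure 5.2.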
5.2 Pattern presentation
Now that our cells are no longer modeled as artificial units, but instead as complex spiking Hodgkin-Huxley units, input and output need to be matched to real variables. As presented above, the Hodgkin-Huxley model is based on the relation between the membrane potential Vm and the individual ionic currents Ii . Thus it is natural to feed input to a cell by injecting a certain amount of current Iapp into it, and to read the output as the firing frequency f of the membrane potential. To achieve this, we use current electrodes to present patterns to the network: current is injected inside the cell membrane, which depolarizes it and triggers action potentials, and the membrane voltage is recorded as the difference of potential between two electrodes, one inside and one outside of the cell (see Kandel 1995 [20] about the current-clamp technique).
The input-output relationship (which is similar to the activity function in artificial networks) is thus given by mapping the injected current Iapp to the membrane firing frequency f . The curve giving the firing frequency of a unit versus the injected input current is called the steady-state current discharge; it is presented for our units in the next chapter (see figure 6.6). For weak currents, no active firing is triggered (the depolarization induced by the current injection is too small for the membrane to reach the threshold, and no action potential is recorded). For currents which are too strong, the Hodgkin-Huxley voltage-gated potassium channels become unable to repolarize the cell and the membrane voltage stabilizes at a supra-threshold value. Thus, we must feed input currents in a range [0, Imax ] where the steady-state current discharge is approximately linear.
During learning, we feed input patterns sequentially. The presentation is entirely frequency-based, meaning that an input value between 0 and 1 is mapped to a firing frequency. The current-frequency relationship is then used to find the current-clamp value that yields the right frequency. Assume that the set of input patterns we want to learn is (1, 1), (0, 0), (1, 1), (0, 1), (1, 1); then unit i must fire with the frequency sequence (fmax, 0, fmax, 0, fmax) and unit j with (fmax, 0, fmax, fmax, fmax). Using the steady-state current discharge curve, we inject the corresponding currents in order to obtain the desired firing frequencies.
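As an illustration, the pattern-to-current conversion can be sketched as follows. The sampled f-I pairs below are placeholders (the thesis measures the real curve in figure 6.6), and the linear interpolation assumes we stay in the roughly linear part of the steady-state current discharge.

```python
def current_for_frequency(f_target, fi_curve):
    """Invert a sampled steady-state current-discharge curve by linear
    interpolation.  fi_curve is a list of (I, f) pairs, f non-decreasing."""
    for (i0, f0), (i1, f1) in zip(fi_curve, fi_curve[1:]):
        if f0 <= f_target <= f1:
            if f1 == f0:
                return i0
            return i0 + (i1 - i0) * (f_target - f0) / (f1 - f0)
    raise ValueError("target frequency outside the sampled curve")

def pattern_to_currents(pattern, fi_curve, f_max=55.0):
    """Map driving inputs in [0, 1] to clamp currents via f = x * f_max."""
    return [current_for_frequency(x * f_max, fi_curve) for x in pattern]

# Hypothetical sampled (I in nA, f in Hz) pairs, for illustration only.
curve = [(0.0, 0.0), (0.05, 20.0), (0.1, 30.0), (0.2, 45.0), (0.3, 55.0)]
currents = pattern_to_currents([1, 0, 1, 0, 1], curve)
```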
5.3 Learning Rule Implementation
In this section, we present how weights and biases are represented and updated during learning in the NEURON spiking environment. We use the object-oriented NMODL language to create new mechanisms for simulations in NEURON. The details of the code are given in the Appendix.
5.3.1 Synaptic Integration
Modeling the synapse
In the artificial context, the weight ωij between two units quantifies the strength of the connection between them. If ωij is high, the connection between the two units is strong and they influence one another significantly. Conversely, if ωij is close to zero, the connection is very weak and the corresponding units behave as if they were not connected. The simplest way to represent this in our spiking context is to map ωij to a synaptic conductance gij between units i and j. This conductance is time-dependent and closely related to the presynaptic and postsynaptic events.
We therefore create a model of a synapse whose intrinsic properties fulfill the weight update equation 3.10 of our spiking learning rule, and call it a BCPNN Synapse. It is defined as a point-process in NMODL, which means that one can create as many instances of this mechanism as needed, as long as one specifies a location (a section in NEURON) for each. All local variables associated with the section it has been attached to become available to the point-process (membrane voltage, ionic currents, etc.). As a convention, we always place a synapse on the soma of the postsynaptic cell.
Conductance Expression
In our model, the synaptic conductance gij(t) is a product of three quantities:

\[
g_{ij}(t) = g_{max} \cdot g_{comp}(p_i, p_j, p_{ij}, t) \cdot \alpha_i(y_i, t) \tag{5.4}
\]

gmax is the maximum conductance of the synapse: it regulates its strength (ability to conduct current) and can temporarily be set to zero if one wants to perform off-line learning.
gcomp(pi, pj, pij, t) is directly computed from the tertiary traces pi, pj and pij, similarly to equation 3.10:

\[
g_{comp}(p_i, p_j, p_{ij}, t) =
\begin{cases}
\log(\varepsilon) & \text{if } \dfrac{p_{ij}}{p_i p_j} < \varepsilon \\[4pt]
\log\left(\dfrac{p_{ij}}{p_i p_j}\right) & \text{otherwise}
\end{cases} \tag{5.5}
\]
αi(yi, t) is a gating variable that allows current to flow through the synapse only after the occurrence of a presynaptic spike, and during a period controlled by a time-constant τα. In our implementation we use αi(yi, t) = zi(t) and τα = τi, but these two variables should not be confused, because they do not represent the same biological processes, even though identifying them here is a reasonable approximation.
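A minimal Python sketch of equations 5.4 and 5.5 (the names are ours, not the NMODL variables):

```python
import math

EPS = 1e-4  # floor value for the tertiary traces

def g_comp(p_i, p_j, p_ij, eps=EPS):
    """Weight-like component of the synaptic conductance (equation 5.5)."""
    ratio = p_ij / (p_i * p_j)
    return math.log(eps) if ratio < eps else math.log(ratio)

def synaptic_conductance(g_max, p_i, p_j, p_ij, alpha):
    """Equation 5.4: g_ij = g_max * g_comp * alpha_i, where alpha gates
    the synapse after a presynaptic spike (alpha_i = z_i in this model)."""
    return g_max * g_comp(p_i, p_j, p_ij) * alpha
```

For statistically independent units p_ij = p_i·p_j, so g_comp = log 1 = 0 and the synapse is effectively silent; co-activation (p_ij > p_i·p_j) gives a positive value and anti-correlation a negative one, mirroring the weight of the abstract rule.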
Solving the differential equations
Computing gcomp(pi, pj, pij, t) requires solving the set of 8 differential equations 3.1 to 3.8. We use the cnexp method. The 8 traces are defined as STATE variables and the set of equations is implemented in a DERIVATIVE block. In order to avoid the problems we would encounter if we implemented the equations directly (see section Learning Implementation in the previous chapter), equations 3.1 and 3.2 are implemented by direct update (equations 5.6 and 5.7).
zi(t) is given by

\[
\begin{cases}
\dfrac{dz_i}{dt} = -\dfrac{z_i}{\tau_i} & \text{in the absence of presynaptic activity} \\[4pt]
z_i = z_i^- + k(1 - z_i^-) & \text{when a presynaptic spike is detected}
\end{cases} \tag{5.6}
\]
Another modification, which is not needed with abstract units, is introduced in the spiking context. Because we want to investigate spike-timing dependence, we have to give different values to the parameters τi and τj. However, when the primary traces have different time-constants, this difference propagates to the secondary and tertiary traces. Indeed, the secondary trace ei integrates (in terms of area under the curve) the primary trace zi. When zi(t) and zj(t) have different time-constants, one of the secondary traces becomes much greater than the other, which is equivalent to silencing one unit relative to the other and is undesirable.

To overcome this, we introduce the quotient τi/τj in equations 3.2 and 3.5. This heuristic method allows us to manipulate the time-constants independently without spoiling the activity of the secondary and tertiary traces. Equation 5.7 guarantees that, after the occurrence of a spike, the integrals of the two primary traces zi(t) and zj(t) are equal (so that the secondary traces ei(t) and ej(t) stay in the same range), and equation 5.8 corrects the mutual trace eij(t) for the change in time-constants, because its calculation is based on the product ei(t)ej(t).
zj(t) is given by

\[
\begin{cases}
\dfrac{dz_j}{dt} = -\dfrac{z_j}{\tau_j} & \text{in the absence of postsynaptic activity} \\[4pt]
z_j = z_j^- + k\left(\dfrac{\tau_i}{\tau_j} - z_j^-\right) & \text{when a postsynaptic spike is detected}
\end{cases} \tag{5.7}
\]
\[
\frac{de_{ij}}{dt} = \frac{\frac{\tau_i}{\tau_j} z_i z_j - e_{ij}}{\tau_e} \tag{5.8}
\]
Apart from these changes, the learning rule implementation follows the set of equations presented in Chapter 3. It is worth mentioning that the modification introduced in equation 5.8 makes all secondary traces decay with the same time-constant, so no change needs to be made to the tertiary trace equations.
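The effect of the τi/τj correction can be checked numerically. The sketch below (our own Euler discretisation, not the NEURON code) triggers one spike on each side per equations 5.6 and 5.7 and compares the areas under the two primary traces:

```python
def decay(z, tau, dt):
    """One Euler step of dz/dt = -z/tau."""
    return z - dt * z / tau

dt, t_stop = 0.1, 2000.0
tau_i, tau_j, k = 20.0, 40.0, 0.8  # deliberately different time-constants

# Spike-triggered updates (equations 5.6 and 5.7), starting from z = 0.
z_i = 0.0 + k * (1.0 - 0.0)
z_j = 0.0 + k * (tau_i / tau_j - 0.0)

area_i = area_j = 0.0
t = 0.0
while t < t_stop:
    area_i += z_i * dt
    area_j += z_j * dt
    z_i = decay(z_i, tau_i, dt)
    z_j = decay(z_j, tau_j, dt)
    t += dt
```

Both areas come out equal (≈ k·τi = 16 here): the smaller spike amplitude k·τi/τj of zj exactly compensates its slower decay, which is what keeps ei and ej in the same range.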
Connecting neurons to the synapse
(a) Single-synapse NEURON implementation  (b) Multi-synapse NEURON implementation

Figure 5.3: Schematic representation of BCPNN Synapses - The synapse is always placed on the postsynaptic cell and linked to the presynaptic cell with a certain delay, threshold and weight. The synapse has a virtual link to the postsynaptic cell accounting for backpropagating action potentials with a short delay of 1 ms.
The remaining difficulty is the method used to detect spikes. A BCPNN Synapse is a point-process and can only access the local variables of the section it has been attached to: the postsynaptic cell. Since we do not want to use pointers or import pre-calculated data into the synapse, we use NetCon objects to create a connection. A NetCon object is a built-in process that attaches a source of events s (usually a membrane voltage) to a target t. When this source of events crosses a threshold thresh in the positive direction, a weight w is sent to the NET_RECEIVE block of the target, with a delay del, and the code in this block is executed. In our case, the target is the synapse and the presynaptic membrane voltage is the source of events. When the voltage crosses a threshold (typically -20 mV), a positive weight w is sent to the synapse with a certain conduction delay. We did not give any value to this conduction delay, assuming that presynaptic and postsynaptic events arrive at the synapse at the same time. However, our model can account for different delays
depending on which cell originated the action potential. Typically, a value between 5 and 10 ms is realistic for a conduction delay along the presynaptic axon.
However, postsynaptic spikes also need to be detected in order to update zj(t), as in equation 5.7. This time the postsynaptic membrane voltage is available at the synapse level, but we would need to construct a function detect that checks at each time step whether a postsynaptic spike has occurred and updates the trace zj to the desired value. This is possible, but it would force us to use different methods to detect presynaptic and postsynaptic spiking.
The ‘trick’ used to bypass these problems, shown in figure 5.3, is to create a virtual link from the postsynaptic cell to all of its synapses. Each synapse receives another NetCon object whose source of events is the postsynaptic cell itself, with a short delay of about 1 ms, accounting for the backpropagating action potential delay, and a negative weight indicating that the sender is the cell itself. At the synapse level, events coming from the presynaptic and postsynaptic cells are treated similarly: the sign of the weight w is used to recognize the event sender and to update only the corresponding primary trace, zi(t) or zj(t), according to equations 5.6 and 5.7.
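The sign-based dispatch can be sketched in plain Python (an illustration of the NET_RECEIVE logic, not the NMODL code itself):

```python
class BCPNNSynapseEvents:
    """One event stream for both cells, disambiguated by the sign of the
    NetCon weight, mimicking the synapse's NET_RECEIVE block."""

    def __init__(self, k=0.8, tau_i=20.0, tau_j=20.0):
        self.k, self.tau_i, self.tau_j = k, tau_i, tau_j
        self.z_i = self.z_j = 0.0

    def net_receive(self, w):
        if w > 0:   # positive weight: event sent by the presynaptic cell
            self.z_i += self.k * (1.0 - self.z_i)                      # eq. 5.6
        else:       # negative weight: backpropagated postsynaptic spike
            self.z_j += self.k * (self.tau_i / self.tau_j - self.z_j)  # eq. 5.7

syn = BCPNNSynapseEvents()
syn.net_receive(+1.0)   # presynaptic spike arrives
syn.net_receive(-1.0)   # postsynaptic spike, delivered via the virtual link
```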
5.3.2 Bias term
In this section, we present how the bias is represented and updated during learning in the NEURON spiking environment. Again, the NMODL language is used to create a new mechanism for simulations in NEURON. The details of the code are given in the Appendix.
Ion Channel Modeling
In the artificial context, the bias term βi of a unit quantifies its excitability. If βi is close to zero, unit i is normally excitable. However, if βi takes a strongly negative value, the activity required to reach the threshold of the transfer function is much higher. The bias term quantifies the intrinsic plasticity of the cell, i.e. the persistent modification of a neuron's intrinsic electrical properties by neuronal or synaptic activity. A phenomenon called long-term potentiation of intrinsic excitability (LTP-IE) has been demonstrated: after strong, transient synaptic or somatic stimulation, the cell shows an increased ability to generate spikes. It is often accompanied by a decrease in the action potential threshold, a shift in the steady-state inactivation curve of activity-dependent potassium channels and a reduction of the after-hyperpolarization phase (AHP) (Xu et al. 2005) [34]. Moreover, Jung et al. report a biphasic downregulation of A-type potassium current: a transient shift in the inactivation curve and a long-lasting reduction of the peak A-type current amplitude (Jung et al. 2009) [19].
Our assumption here is to implement the bias as an activity-regulated potassium channel: we map the bias term βi to a real parameter in our spiking context, namely the conductance gki of an activity-dependent A-type potassium channel attached to unit i. This conductance is time-dependent and closely related to the past activity of the cell; unlike the synaptic conductance, it does not depend on synaptic events. In NMODL, the A-type potassium channel is defined as a distributed mechanism, intrinsic to a cell. One cell cannot have more than one A-type K+ channel, but we assume that all pyramidal cells include this A-type potassium channel.
Figure 5.4: Extended Hodgkin-Huxley cell model
Conductance Expression
In our model, the potassium conductance gki is a product of two quantities:

\[
g_{ki}(t) = g_{kmax} \cdot g_{kcomp}(p_i) \tag{5.9}
\]

gkmax is the maximum conductance of the A-type potassium channel: it regulates the permeability of the channel and can temporarily be set to zero if one wants to discard intrinsic plasticity effects during learning.
gkcomp(pi) is directly computed from the tertiary trace pi according to:

\[
g_{kcomp}(p_i) =
\begin{cases}
-\log(\varepsilon) & \text{if } p_i < \varepsilon \\
-\log(p_i) & \text{otherwise}
\end{cases} \tag{5.10}
\]
In fact, equations 3.9 and 5.10 are similar, the only difference being that gkcomp(pi) and βi have opposite values. The reason for this is that LTP-IE amounts to a downregulation of A-type potassium current: the cell is more excitable when less potassium current flows through the channel, because this reduces the after-hyperpolarization phase (AHP). This is consistent with our definition of gkcomp(pi),
which decreases as pi increases, meaning that enhanced intrinsic excitability occurs after strong activity in the cell.
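In code, equation 5.10 is a one-liner (a Python sketch with our own names; the sign convention follows the discussion above, with the floor applied to the trace):

```python
import math

EPS = 1e-4  # floor value for the tertiary traces

def gk_comp(p_i, eps=EPS):
    """Computed A-type K+ conductance (equation 5.10): the opposite of the
    bias beta_i = log(p_i), floored once p_i drops below eps."""
    return -math.log(eps) if p_i < eps else -math.log(p_i)
```

gk_comp is large for a unit that has been mostly silent (small pi, strong K+ current, low excitability) and tends to 0 as pi approaches 1, reproducing LTP-IE as a downregulation of the A-type current.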
Solving the differential equation
Now, in order to compute gkcomp(pi), we only need to solve three differential equations (equations 3.1, 3.3 and 3.6). Because we only need the intrinsic traces of cell i, namely zi(t), ei(t) and pi(t), we do not have to take different actions according to which neuron fires. We use the cnexp method. The 3 traces are defined as STATE variables and the set of equations is implemented in a DERIVATIVE block.
Equations 3.3 and 3.6 are implemented directly, but the primary trace zi(t) is, once again, implemented by a disjunction of cases: exponential decay when no activity is recorded, and direct update when a spike is detected (equation 5.6). The method used to detect membrane firing, however, is radically different this time (see next section).
Detecting spikes
This time, no NetCon object can be used, since only a point-process can serve as a target, whereas our A-type potassium current is a distributed mechanism. The solution used here is to build a function detect, called at each time step, which checks whether the membrane voltage yi has crossed a threshold. Using a set of flags, we make sure that a value firing turns to 1 on the occurrence of a spike and goes back to zero on the very next time step. We also add a counter so that the delay between spike detection and primary trace update can be controlled.
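The flag-and-counter mechanism can be sketched as follows (plain Python; variable names are ours, threshold -20 mV as above):

```python
class SpikeDetector:
    """Per-time-step spike detection with a one-step 'firing' flag and a
    countdown so that the primary-trace update can be delayed."""

    def __init__(self, threshold=-20.0, delay_steps=3):
        self.threshold = threshold
        self.delay_steps = delay_steps
        self.above = False     # were we above threshold on the last step?
        self.countdown = -1    # steps left until the delayed trace update
        self.firing = 0

    def step(self, v):
        """Call once per time step; returns True on the step at which the
        delayed primary-trace update must be applied."""
        self.firing = 0
        if not self.above and v >= self.threshold:
            self.firing = 1                # upward crossing: spike detected
            self.countdown = self.delay_steps
        self.above = v >= self.threshold
        update_now = self.countdown == 0
        if self.countdown >= 0:
            self.countdown -= 1
        return update_now

det = SpikeDetector(delay_steps=3)
vs = [-65.0, -65.0, -10.0, 0.0, -10.0, -65.0, -65.0, -65.0]
results = [(det.step(v), det.firing) for v in vs]
updates = [u for u, f in results]
flags = [f for u, f in results]
```

The single firing flag prevents one suprathreshold plateau from being counted as several spikes, and the counter reproduces the tunable detection-to-update delay discussed below.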
This solution with a detect function is somewhat more time-consuming, but since we want to model an intrinsic phenomenon, we avoid referencing variables computed elsewhere: all computation has to be done locally. The main reason for this choice is that we want the different learning mechanisms to be independent, because one is a point-process (BCPNNSynapse) and the other is a distributed mechanism (A-TypePotassium). A constant update from values computed in the first to values used in the second might deteriorate the simulations. Moreover, the delay counter enables us to match exactly the values pi(t) computed intrinsically in unit i to the values obtained by computing them at the synapse level using NetCon objects.

In fact, there is only a short delay (about half a millisecond), which can be corrected by changing the initial value of the counter. Our final cell model is shown in figure 5.4. It includes two more ion channels than the standard Hodgkin-Huxley model (figure 5.1.1), and can account for spike-frequency adaptation and intrinsic excitability.
5.4 Retrieval
Because NEURON is built to model real neurons, the connection between two units exists throughout the simulation, as long as we have connected them via a NetCon object. This is somewhat different from what we did with abstract units, where everything was done during the learning phase as if the neurons were not connected to each other. In this respect, inference consists of feeding some input stimulation to one or several units and recording, during the simulation, what we get at the output units. In the single-synapse context, a current clamp is attached to the presynaptic unit i with an input current proportional to the driving input xi. Inference is purely frequency-based, which means that the input xi is characterized by its firing frequency fi, and the output activity frequency gives the corresponding output value xj.
However, in the spiking context, inference is part of the simulation: we decide not to stop the simulation to infer after a learning phase, but to include inference in a continuously operating fashion, which is closer to what can be observed in real neurons. Thus, we need a parameter that allows us to switch from the learning mode to the inference mode: this parameter is the print-now signal κ. It can either be read from a file or updated at a predefined time by a specific procedure; when it falls below a threshold value, weights and biases are “frozen” and the retrieval phase takes place. In our implementation we achieve this by updating two parameters, glearn for the synaptic conductance and gklearn for the potassium channel conductance, which are set to the values of gcomp and gkcomp respectively when the print-now signal becomes small enough. The update of the conductances g and gk includes a test in the BREAKPOINT block on the value of the print-now signal: when κ is large enough (learning mode), the conductances are updated according to the traces; otherwise (inference mode), they are computed directly from the stored values of glearn and gklearn.
One thing we must pay attention to is the case where we want to exhibit the input-output mapping. In this case, it is important that the presynaptic stimulation is not current-clamp-driven: for low current values (when xi is close to zero), no spiking occurs, because the membrane voltage does not reach the action potential threshold. To overcome this problem, we use the built-in process NetStim, which enables us to control the spiking frequency of an input stimulation, even for very low input values. As a result, the driving input xi is directly mapped to a frequency value fi, according to fi = xi · fmax.
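A sketch of this frequency mapping (plain Python; NetStim itself also supports jittered intervals, which we do not model here):

```python
def netstim_times(x, f_max=55.0, t_stop=1000.0, start=0.0):
    """Regular spike times (ms) at frequency f_i = x * f_max.  Unlike a
    current clamp, this produces spikes for arbitrarily small x > 0."""
    f = x * f_max
    if f <= 0.0:
        return []
    interval = 1000.0 / f          # inter-spike interval in ms
    times, n = [], 0
    while start + n * interval < t_stop:
        times.append(start + n * interval)
        n += 1
    return times
```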
Chapter 6
Results
6.1 Abstract units

6.1.1 Learning
In this section we present the outcome of a learning phase for the three different pattern presentation schemes. We deal with only one presynaptic unit i and one postsynaptic unit j, connected by one synapse: this is called single-synapse learning. In order to be able to compare the different methods, the same temporal series of patterns for unit i, on the one hand, and unit j, on the other hand, is presented in all three pattern presentation schemes. This means that the driving inputs xi(t) and xj(t) are the same for the three pattern presentation schemes, but of course the input activities yi(t) and yj(t) differ according to the pattern presentation used (non-spiking, frequency-based spiking or Poisson-spiking).
We aim to show the Hebbian features of our learning rule, so we divide the learning phase (10 seconds) into five sequences (2 seconds each), during which the presynaptic and postsynaptic units exhibit a given statistical relation (correlation, anti-correlation, independence). Each 2-second sequence is divided into 10 pattern presentation intervals (200 ms), during which the driving inputs xi(t) and xj(t) take constant values ki and kj, corresponding to the pattern x = [ki, kj]. The learning is composed of the following sequences, and the results are presented in figures 6.1, 6.2 and 6.3:
1. Strong Correlation (between 0 ms and 2,000 ms)
Unit i and unit j are fed the same pattern: xi = xj = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1].
2. Independent Activation (between 2,000 ms and 4,000 ms)
Unit i and unit j are fed independent patterns: xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1].
3. Strong Anti-Correlation (between 4,000 ms and 6,000 ms)
Unit i and unit j are fed anti-correlated patterns: xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0].
4. Presynaptic and Postsynaptic Muting (between 6,000 ms and 8,000 ms)
No input is fed to either unit: xi = xj = 0.
5. Presynaptic Activation and Postsynaptic Muting (between 8,000 ms and 10,000 ms)
Unit i is fed a pattern and unit j is mute: xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = 0.
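For reference, the full 10-second driving-input schedule above can be written down explicitly (our notation; one value per 200 ms pattern slot):

```python
P  = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]   # recurring pattern for unit i
PJ = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]   # independent pattern for unit j
AJ = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0]   # anti-correlated pattern for unit j

# Five 2-second sequences of 10 slots each: correlation, independence,
# anti-correlation, both units mute, presynaptic activation only.
x_i = P + P + P + [0] * 10 + P
x_j = P + PJ + AJ + [0] * 10 + [0] * 10

def slot(t_ms):
    """Index of the 200 ms pattern slot containing time t_ms."""
    return int(t_ms // 200)
```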
Figure 6.1: Abstract Units Learning - Non-Spiking Presentation
In our simulation, τi = τj = 20 ms, τe = 200 ms and τp = 1,000 ms. The print-now signal κ is always equal to 1 and the floor value ε for the tertiary traces is set to 10−4. Each pattern is presented for 180 milliseconds and there is a relaxation lag of 20 milliseconds between patterns. The maximum step-size for the ode45 solver is set to 1, the maximum frequency fmax is 55 Hz and the proportional trace update value k is set to 0.8. Finally, a delay is introduced between the detection of a spike and the primary trace update. This delay can be used to account for a conduction delay in synaptic transmission, but in practice we use it to align the recording of action potentials in the primary traces between our Hodgkin-Huxley spiking implementation in NEURON and our spiking abstract units in MATLAB (this delay is present because input is fed to Hodgkin-Huxley units by current injection into the cell, which triggers action potentials, but not instantaneously). In practice, setting this delay to 25 ms gives a good match between our models. To discuss the results, we will refer to the three pattern presentation schemes with the following abbreviations: NS for non-spiking presentation, FS for frequency-based spiking and PS for Poisson-spiking.
Figure 6.2: Abstract Units Learning - Spiking Frequency based Presentation
Figure 6.3: Abstract Units Learning - Spiking Poisson-generated Presentation
During the correlation phase (from 0 ms to 2,000 ms), ωij increases rapidly to a high positive value (3.25 at 115 ms for NS, 2.80 at 140 ms for FS, 2.30 at 170 ms for PS) and decays to a lower positive value after 10 patterns (0.73 for NS, 0.82 for FS, 0.72 for PS). This rapid increase is due to the first mutually active pattern, during which the mutual trace pij equals both pi and pj; the later mutual inactivation causes the decay after this high positive peak. The weight stabilizes at a positive value at the end of the sequence, accounting for a strong correlation between the two units. The bias terms βi and βj give a measure of the units' intrinsic activity; they are equal during the whole sequence because xi(t) = xj(t). It is worth noting that the value of -0.77 for βi and βj (at 2,000 ms for NS) is very close to log(0.5), and gives a very good approximation of the logarithm of the units' overall activation (1,080 ms out of 2,000 ms).
During the independence phase (from 2,000 ms to 4,000 ms), ωij decays gradually to zero (0.14 for NS, 0.22 for FS, 0.08 for PS, at 4,000 ms), accounting for statistical independence between the two units. The bias term values βi and βj stay in the same range as in the previous sequence. Only for NS, however, can we exploit the quantitative value of the bias terms; for FS and PS, their time-courses exhibit the same dynamics but the value range is 2 to 3 times larger.

During the anti-correlation phase (from 4,000 ms to 6,000 ms), ωij decays linearly to a strongly negative value (-1.04 for NS, -1.05 for FS, -1.21 for PS, at 6,000 ms), accounting for the strong anti-correlation between the two units. The linear decay of the weight is characteristic of the logarithmic dependence on the mutual trace pij, which decays exponentially to zero, because the mutual secondary trace is zero when the units are never active together. The bias terms, however, are not affected by this, because they only account for the units' individual activation. βi is always higher than βj, because unit i is active more often than unit j.
When both units are mute (from 6,000 ms to 8,000 ms), ωij increases linearly and both βi and βj decrease linearly. The slope of the three curves depends only on the time-constant τp. This is due to the fact that, when both units are silent, the three tertiary traces pi, pj and pij decay exponentially to zero with the same time-constant. As a result, the linear decrease of the bias term exactly compensates the linear increase of the weight, which gives consistent results for inference when no unit has been active during learning. This goes on until one of the traces reaches its floor value ε. Finally, when muting one unit and activating the other (from 8,000 ms to 10,000 ms), ωij reproduces the anti-correlation results by reaching a strongly negative value, while the silent unit's bias term keeps decreasing linearly and the active unit's bias term goes up again to account for activation.
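The linear drifts during the mute phase follow directly from the trace dynamics: with the primary and secondary traces at zero, each tertiary trace decays exponentially, and the logarithms in the weight and bias turn this decay into straight lines. A short derivation, using the definitions of ωij and βi from Chapter 3:

```latex
% During silence, e ~ 0, so each tertiary trace decays as
p_x(t) = p_x(t_0)\, e^{-(t - t_0)/\tau_p}, \qquad x \in \{i,\, j,\, ij\}.

% The bias therefore decreases linearly,
\beta_i(t) = \log p_i(t) = \beta_i(t_0) - \frac{t - t_0}{\tau_p},

% while the weight gains one factor e^{+(t-t_0)/\tau_p} (one decay in the
% numerator against two in the denominator) and increases linearly:
\omega_{ij}(t) = \log \frac{p_{ij}(t)}{p_i(t)\, p_j(t)}
             = \omega_{ij}(t_0) + \frac{t - t_0}{\tau_p}.
```

The slopes ±1/τp match the observation that all three curves depend only on τp, and for xi = 1 the sum βj + ωij·xi is unchanged, which is exactly the compensation between bias and weight noted above.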
The learning outputs in these three cases will be compared in the Discussion chapter, together with the output of the spiking implementation. However, we can already note two things. First, NS is more accurate quantitatively than FS and PS, simply because the primary traces are strongly influenced by the driving inputs xi(t) and xj(t): on closer inspection, we can see that they stabilize at fixed step values. Thus activity reverberates more in the secondary and tertiary traces, giving more accurate values for the weight and biases. Secondly, the time-courses of
the weight and biases are extremely similar even across the different pattern presentation schemes, meaning that the qualitative output of the learning phase is essentially the same for the three implementations.
6.1.2 Retrieval
The retrieval phase occurs after the learning phase. Whereas there is no interruption between these two phases in NEURON, they are implemented as two separate procedures in MATLAB. We want to create an input-output mapping in order to understand what the outcome of the learning rule would be in a network context. We therefore assume that the learning phase described in the previous section has already occurred and that we have stored the weight ωij and the biases βi and βj in a separate file.
Now, we import the values of the weight ωij and the biases βi and βj at the end of each 2-second sequence of the learning phase (i.e. at times 2,000 ms, 4,000 ms, 6,000 ms, 8,000 ms and 10,000 ms), and we display the output values xj corresponding to a set of input values xi regularly spaced between 0 and 1. Figure 6.4 shows the pre-post mapping xj = f(xi) and the inverse post-pre mapping xi = f−1(xj) for the three different pattern presentation schemes, and figure 6.5 displays the time-courses of the filtered support unit activities yj(t) for FS and PS, for different input values of xi, after each of the first three sequences of the learning phase.
The characteristics of the mapping are common to NS, FS and PS. The left-most point of the mapping (corresponding to xi = 0) corresponds to the exponential of the bias term, i.e. to the tertiary trace of the output unit. The more active a unit has been during learning, the larger its tertiary trace: we can easily see on the left mappings xj = f(xi) that the longer unit j stays mute, the smaller the value of the left-most point. The slope of the curve depends only on the weight ωij: the larger the amplitude of ωij, the steeper the slope. Moreover, the sign of the weight determines the direction of the curve, and hence the fixed points of the mapping. After correlation, the slope is positive and the only fixed point is 1. Conversely, the slope is negative after anti-correlation and the only fixed point is 0. After independence, the mapping takes a constant value equal to the exponential of the bias term, which becomes the only fixed point. Fixed-point dynamics are crucial for the network implementation, because retrieval is a process that iterates inference until stability is reached.
6.2 Hodgkin-Huxley Spiking Units

6.2.1 Steady-State Current Discharge
Our pattern presentation in NEURON is somewhat more realistic than the generated spiking input we used before. As mentioned earlier, we use current clamps to feed input to a unit. But because we want to work with frequency-based learning and inference simulations, it is important to have a direct mapping between the amplitude of the current injected into a cell and its firing frequency.
(a) Non-Spiking Mapping xj = f(xi)  (b) Non-Spiking Mapping xi = f−1(xj)
(c) Spiking Frequency-based Mapping xj = f(xi)  (d) Spiking Frequency-based Mapping xi = f−1(xj)
(e) Spiking Poisson-generated Mapping xj = f(xi)  (f) Spiking Poisson-generated Mapping xi = f−1(xj)

Figure 6.4: Abstract Units Pre-post and Post-pre mappings for NS, FS and PS pattern presentations - Dark blue, green and red correspond respectively to inference after the correlation phase (T = 2,000 ms), the independence phase (T = 4,000 ms) and the anti-correlation phase (T = 6,000 ms). Inference after the muting of both units is displayed in light blue (T = 8,000 ms) and, finally, inference after sole activation of the presynaptic unit is shown in pink (T = 10,000 ms). For the spiking mappings, we chose fmax = 55 Hz and Tinf = 1,000 ms (duration of the inference simulation). Also, for the Poisson-spiking inference procedure, we take an average over 50 simulations.
(a) FS inference - Correlation phase (T = 2,000 ms)  (b) PS inference - Correlation phase (T = 2,000 ms)
(c) FS inference - Independence phase (T = 4,000 ms)  (d) PS inference - Independence phase (T = 4,000 ms)
(e) FS inference - Anti-correlation phase (T = 6,000 ms)  (f) PS inference - Anti-correlation phase (T = 6,000 ms)

Figure 6.5: Abstract Units Spiking Inference - Filtered support unit activities yj(t) for FS and PS, for different input values xi and different sequences of the learning phase. Blue, green and red curves correspond respectively to the input values xi = 1, 0.5 and 0.25.
This relationship is called the steady-state current discharge. In figure 6.6, we display the steady-state current discharge of our cell model. The input current is injected for 1,000 ms and its amplitude varies between 0 nA and 0.5 nA. We gathered the results for 100 cells, thus giving a precision of 0.005 nA.
Figure 6.6: Steady-state Current Discharge
However, we have to take into account the short-term adaptation features of our cell model. This phenomenon makes the inter-spike interval increase when the cell undergoes prolonged stimulation. We display three curves: the mean frequency (bold black curve), which is a count of the action potentials over 1 second; the unadapted frequency (red curve), which is the spiking frequency at the beginning of the spike train (typically the frequency between the first and second spikes); and the adapted frequency, which is the spiking frequency at the end of the spike train (calculated from the interspike interval between the last and second-to-last spikes). As expected, the mean frequency is always smaller than the unadapted frequency and larger than the adapted frequency. For high currents, an “edge effect” occurs in the count of action potentials, which makes the adapted and unadapted frequencies show irregularities.
The first thing we need to underline is the existence of a threshold for the input current, below which no action potential is recorded. It is thus somewhat difficult to create low-frequency spiking activity with a current clamp, and we will use a NetStim object to overcome this. Secondly, the curve is almost linear, or can at least be approximated linearly, for low currents. For strong currents, however, the curve becomes concave, meaning that strong currents do not produce proportionally strong frequencies. Finally, we will work with a reference frequency of 30 Hz, corresponding to an input current of 0.1 nA.
6.2.2 Learning
In order to compare the spiking implementation qualitatively and quantitatively
to the abstract units implementation, we will use the same learning procedure
as the one exposed above: a 10-second simulation, divided into five 2-second sequences: a strong correlation phase (between 0 ms and 2,000 ms), during
which units are fed the same pattern xi = xj = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]; an independent activation phase (between 2,000 ms and 4,000 ms), during which units are fed
independent patterns xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1];
a strong anti-correlation phase (between 4,000 ms and 6,000 ms), during which units
are fed anti-correlated patterns xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0];
a mute phase (between 6,000 ms and 8,000 ms), characterized by the absence of input
to both units (xi = xj = 0); and a phase of presynaptic activation and postsynaptic
muting (between 8,000 ms and 10,000 ms), during which unit i is fed a pattern and
unit j is mute (xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = 0).
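As a sketch (names are ours), the five-phase schedule above can be written down directly; the pattern vectors and phase boundaries are taken from the text:

```python
# Pattern vectors from the text
P    = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]   # shared / presynaptic pattern
Q    = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]   # independent postsynaptic pattern
AP   = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0]   # anti-correlated pattern
MUTE = [0] * 10

# (start_ms, end_ms, xi, xj) for the five 2-second sequences
schedule = [
    (0,     2000,  P,    P),     # strong correlation
    (2000,  4000,  P,    Q),     # independent activation
    (4000,  6000,  P,    AP),    # strong anti-correlation
    (6000,  8000,  MUTE, MUTE),  # mute phase
    (8000, 10000,  P,    MUTE),  # presynaptic activation, postsynaptic muting
]

def patterns_at(t_ms):
    """Return the (xi, xj) input patterns active at time t_ms."""
    for start, end, xi, xj in schedule:
        if start <= t_ms < end:
            return xi, xj
    return MUTE, MUTE
```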
Pattern presentations here consist of current clamps attached to the cells with
different delays according to the pattern they represent, with a duration of 180 ms,
a relaxation period of 20 ms, and a current amplitude of 0.1 nA. The synaptic
traces show the same behaviour as in the previous results, but
it should be noted that the range of values of the tertiary traces is smaller: they
do not exceed 0.2. Also, a delay is observed for the tertiary traces when
switching from one sequence to the next. This is due to the slow dynamics of the
tertiary traces, whose time-constant τp is set to 1,000 ms. Membrane voltages and
synaptic traces are shown in figure 6.7.
The synaptic conductance and A-type potassium channel conductance time
courses are shown in figure 6.8: here, we choose to display only the component
gcomp of the synaptic strength. The effective synaptic
conductance g is gcomp weighted by gmax (which will later be investigated for tuning) and
by the alpha-function αi, restricting synaptic activity to presynaptic stimulation. We
will refer to gcomp as the synaptic conductance in this section only. Similarly,
the potassium conductance presented here is the computed conductance gkcomp. It
needs to be multiplied by a parameter gkmax (whose value is set when tuning the
channel) to obtain the effective conductance gk.
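The relation between computed and effective conductances described above can be sketched as follows; the default values are the tuned ones reported in the parameter tuning subsections, and the function is ours, not from the thesis code:

```python
def effective_conductances(g_comp, gk_comp, alpha_i, g_max=0.5, gk_max=54.8):
    """Scale the computed conductances into effective ones.

    g_max in nS (500 pS from the synaptic tuning),
    gk_max in µS/cm^2 (54.8 from the potassium channel tuning).
    """
    g  = g_max * alpha_i * g_comp   # synaptic: gated by the presynaptic alpha-function
    gk = gk_max * gk_comp           # A-type potassium channel
    return g, gk
```

Note that with no presynaptic stimulation (αi = 0), the effective synaptic conductance vanishes, which is what restricts transmission to the pre-to-post direction.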
The synaptic conductance (the analogue of the weight ωij for abstract units) exhibits
the same qualitative behaviour: during the learning phase it shows a rapid increase
to a high value (2.78 at 130 ms) and stabilizes at a lower positive value (1.15 at 2,000
ms), accounting for strong correlation between the two units. During the independent activation phase, the synaptic conductance decays very slowly towards zero,
ending at a non-negligible positive value (0.48 at 4,000 ms). Anti-correlation of
the units leads to a linear decrease of the synaptic conductance towards a strong
negative value (-1.00 at 6,000 ms). A negative conductance can be
considered unbiological (we discuss this in the Conclusion), but it has the expected
effect when we switch to inference mode (an inhibitory synapse), which makes our implementation fit our needs if we discard biological resemblance. The silent phase for
CHAPTER 6. RESULTS
Figure 6.7: Spiking units pattern presentation. (a) Presynaptic and postsynaptic
membrane voltages. (b) Presynaptic and postsynaptic synaptic traces.
Figure 6.8: Spiking units learning. (a) Synaptic weight modification. (b) A-type
potassium channel conductance.
both units results in a linear increase of the synaptic strength, which is expected,
because all tertiary traces decay exponentially to zero with the same time-constant. The
last phase is analogous to the anti-correlation phase observed earlier.
The A-type potassium channel conductance computed here is the additive inverse of the bias term in abstract units. The reason is that increased
excitability results in a decrease of the A-type potassium current [19] (Jung et al., 2009).
It exhibits simple features: a gradual decrease under stimulation of the unit and a linear
increase when the unit is silent, balancing the linear increase of the synaptic strength.
Because unit i is more activated than unit j, its potassium channel conductance is
always smaller.
We have set the maximum synaptic conductance gmax to a very low value during learning (1 nS), so that our synapse operates almost in off-line learning conditions. This enables us to discard the inference effect during learning, preventing the
synapse from learning its own interpretation of the stimulus. It is also important to
consider the conduction delay of the presynaptic input: we assume that
the postsynaptic potential has a shorter distance to travel to the synapse (a backpropagating potential), whereas the presynaptic input needs to travel along the axon. In
practice, we impose a presynaptic delay of 1 ms and a postsynaptic delay of 1 ms,
assuming that both need the same amount of time to reach the postsynaptic soma.
In reality, we might have to account for a conduction delay of about 5-6 ms along
the presynaptic axon. The learning parameters are τi = τj = 20 ms, τe = 200 ms,
and τp = 1,000 ms. The synaptic traces are initialized to 0.01, except for the mutual
traces, which are initialized to 0.0001. The time-step of integration dt = 0.2 ms is significantly
smaller than in MATLAB, allowing for finer computation.
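A minimal sketch of one Euler step for the presynaptic trace chain zi → ei → pi, assuming simple first-order (exponential) dynamics for each stage; the exact update equations are given in Chapter 5, so this is only illustrative:

```python
DT = 0.2                                    # ms, integration time-step in NEURON
TAU_Z, TAU_E, TAU_P = 20.0, 200.0, 1000.0   # ms, trace time-constants

def trace_step(z, e, p, spike, kappa=1.0):
    """Advance the primary, secondary and tertiary traces by one Euler step.

    `spike` is 1.0 on a time-step where an action potential occurs, else 0.0.
    `kappa` is the print-now signal, gating the tertiary (long-term) update.
    """
    z += DT * (spike - z) / TAU_Z        # primary: fast spiking dynamics
    e += DT * (z - e) / TAU_E            # secondary: delayed-reward filter
    p += DT * kappa * (e - p) / TAU_P    # tertiary: long-term memory
    return z, e, p
```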
6.2.3 Parameter tuning
Potassium channel tuning
The dynamics of the A-type potassium channel need to be tuned. The idea is
to get the same quantitative output for retrieval as with abstract units. In the
spiking context, this potassium channel needs to replace the effect of the bias
term. Not only does it need to be updated at each time-step, taking a crucial place
in the learning rule implementation, but it also needs to account for spontaneous
activity, even in the absence of presynaptic activity.
In order to achieve this, we will modulate the resting membrane potential of
the cell by changing the leakage channel resting potential Eleak. The idea is that
when this parameter is set above the action potential threshold, which is around
-67 mV, the leak current (which is the only passive current in our model and thus
fully controls the passive properties of the cell) will always drive the cell to fire in the
absence of an input. If no input is fed to the unit, the membrane potential
cannot reach its resting value, because the leak current will constantly depolarize
the cell, thus opening the Hodgkin-Huxley voltage-gated channels and triggering
action potentials. However, if the A-type potassium channel is set to a high conductance
(the consequence of low activity of the cell), it will inhibit the tendency of the cell to fire
and normal behaviour will be shown.
In order to achieve this, we start with the hypothetical case where pi = 1
(maximum activity) and the cell should fire at its maximum frequency. We first
remove our potassium channel (we simply set gkmax to zero) and progressively increase
Eleak until we get the same count of action potentials as with a current clamp
of 0.1 nA for a 1-second stimulation. We obtain Eleak = -29.0 mV (against a standard
value of -70.3 mV).
Now we want our potassium channel to reach its maximum conductance when
the cell has not been active at all. A first important step is to set
gk = gkmax · log(pi)/log(ε),
so that gk = gkmax when pi reaches its floor value ε. Then the value assigned to
gkmax is chosen by fitting the input-output mapping for spiking units to the NS
mapping for abstract units. A gkmax of 54.8 µS.cm−2 gives fairly good results.
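The mapping above can be sketched directly (the helper name is ours, not from the thesis code):

```python
import math

EPS = 1e-6        # floor value of the synaptic traces
GK_MAX = 54.8     # µS/cm^2, value found by the fit described above

def gk_from_trace(p_i):
    """Map the tertiary trace p_i onto the A-type potassium conductance.

    p_i is clipped to [EPS, 1]; gk reaches GK_MAX at the floor value
    (a fully silent unit) and vanishes when p_i = 1 (maximal activity).
    """
    p_i = min(max(p_i, EPS), 1.0)
    return GK_MAX * math.log(p_i) / math.log(EPS)
```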
Synaptic conductance tuning
The second parameter to tune in order to get a quantitative fit from non-spiking
units to our implementation in NEURON is the synaptic conductance. The
parameter gmax controls the amount of current flowing from one cell to the other
during synaptic transmission. Of course, all other parameters involved in synaptic
transmission need to be set first, and no further changes can be applied to them. We set
τi = 20 ms, ε = 10−6 , imax = 0.1 nA and Tpres = 180 ms. The value assigned
to gmax is then chosen by fitting the input-output mapping for spiking units to the NS
mapping for abstract units. A gmax of 500 pS gives fairly good results.
6.2.4 Retrieval
Now that the model parameters have been tuned to mimic the results
obtained with abstract units, we can use the spiking implementation in retrieval
mode to create an input-output mapping from one cell to the other. As explained
above, this is done by attaching a current-clamp electrode to the presynaptic unit i,
to achieve a firing frequency fi proportional to the driving input xi. We extrapolate
the input current needed by looking at the steady-state current discharge presented
above. The duration of the stimulation Tstim is set to 1,000 ms.
As opposed to the abstract units implementation, we can only get a spiking
output (no anti-spikes or stepped values), and this simplifies our purpose: the output
xj is simply set to the firing frequency fj (divided by fmax) of the postsynaptic cell.
It should be noted that retrieval can only be performed in one direction: because of the
αi function, postsynaptic input alone will not trigger synaptic transmission. Thus, by
adding an alpha-function, we have given an orientation to the link between neurons
in our biological network. Of course, there can be backpropagating connections, but
at any one synapse, activity can only be triggered by the presynaptic cell.
Also, we have made the choice to perform learning and inference during the same
stimulation, because this seems closer to what can be observed in real neurons. The
print-now signal κ is used as a flag to switch from the learning mode to the inference
mode. Typically, 10−6 is chosen as the limit between these two cases. Figure 6.9 presents
the input-output mapping for our NEURON implementation.
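A sketch, under our own naming, of how the print-now signal gates plasticity: κ scales the tertiary-trace update, so a very small κ effectively freezes learning while the rest of the dynamics, and hence inference, keep operating:

```python
KAPPA_LIMIT = 1e-6   # typical threshold on the print-now signal

def p_update(p, e, kappa, dt=0.2, tau_p=1000.0):
    """One Euler step of the tertiary trace, gated by the print-now signal."""
    return p + dt * kappa * (e - p) / tau_p

def mode(kappa):
    """Interpret kappa as a learning/inference flag."""
    return "learning" if kappa > KAPPA_LIMIT else "inference"
```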
The curve exhibits the same qualitative behaviour as what has been observed
with abstract units: correlated units exhibit a positive slope, accounting for excitatory input from the presynaptic cell to its target. Anti-correlated units show
the opposite behaviour, and independent units show no significant synaptic transmission. When the postsynaptic cell gradually turns silent, its intrinsic excitability
decreases, resulting in a downward shift of the input-output relationship. As opposed to the abstract units, it is meaningless to show the inverse mapping, because our
synapse is oriented for pre-post transmission.
Figure 6.9: Spiking Units implementation Input-Output mapping - Dark
blue, green and red correspond respectively to inference after the correlation phase
(T = 2,000 ms), the independence phase (T = 4,000 ms) and the anti-correlation
phase (T = 6,000 ms). Inference after the muting of both units is displayed in light
blue and finally, inference after sole activation of the presynaptic unit is shown in
pink.
Some fine tuning remains to be investigated, in particular because the expressions for these
conductances are products of terms depending on the value of the tertiary traces. Also,
using current clamps is somewhat incomplete, because it misses some data points
for low firing frequencies. Indeed, the current discharge for these values shows a
threshold and an exponential relationship between injected current and resulting
firing frequency. An alternative is to use NetStim objects and control the exact spike train of the presynaptic cell during the simulation.
We are confident that further investigation could give a precise fit of the
curves for both abstract units and spiking units.
6.2.5 Spike Timing Dependence
In this section, we want to investigate the spike-timing dependence of the new
spiking version of the BCPNN learning rule. It is interesting to challenge the BCPNN
model, which was designed before spike-timing dependent plasticity rules were
investigated, with some classical STDP procedures. Our simulation reproduces the work of Bi and Poo [6] on the influence of pre-post and post-pre timing on the change in synaptic
strength (measured by the percentage change of EPSC amplitude) in cultured
hippocampal rat cells. In that work, they showed the existence of a narrow
spike-timing window with a width of 40 ms, where post-pre timing (within 20 ms)
triggered LTD and pre-post timing (within 20 ms) triggered LTP.
We apply the same procedure to our spiking units: a repeated low-frequency
(1 Hz) stimulation with transient couples of pre-post or post-pre spikes for 1 minute.
Because of limited computational resources, we modified this procedure as follows:
we feed short transient stimulations to the presynaptic and postsynaptic units at a
frequency of 1 Hz, for 30 seconds. This short-lived stimulation produces a single
spike in each of the cells, and the timing between these spikes is precisely controlled.
We measure the absolute weight change gcomp after the repeated stimulation.
As mentioned before, we want to be able to set τi and τj to different values, so
we use the updated version of the learning rule allowing us to do so (equations 5.7 and
5.8). Typically we will use τi = 10 ms and τj = 2 ms, to promote pre-post timing
over post-pre timing. It is important to note that the stimulation frequency must
be low enough to allow the secondary traces to decay to zero. We typically have
τe = 200 ms, which is adapted to a stimulation of one spike per second. Finally,
τp can be increased from 1,000 ms up to 10,000 ms to fully integrate the 30-second
procedure. Stimulation is given by strong short-lived (15 ms) current pulses (0.1
nA). We mention that for this task only, we have muted the adaptation current
responsible for spike-frequency adaptation and the A-type potassium channel current, because they
affect the interspike interval, whereas we want to investigate exact timing between
spikes.
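The modified pairing protocol can be sketched as a generator of spike-time pairs (a schematic, not the actual NEURON code):

```python
def pairing_protocol(delta_t_ms, n_pairs=30, period_ms=1000.0):
    """Yield (t_pre, t_post) spike times for the repeated pairing.

    30 pairs at 1 Hz reproduce the 30-second procedure described above.
    delta_t_ms > 0 means pre-before-post (expected to give LTP),
    delta_t_ms < 0 means post-before-pre.
    """
    for k in range(n_pairs):
        t_pre = k * period_ms
        yield t_pre, t_pre + delta_t_ms
```

Sweeping delta_t_ms over the window of interest and measuring the resulting gcomp change after each run gives the spike-timing dependence curve of figure 6.10.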
Figure 6.10: Spike-timing dependence window, τi = 10 ms and τj = 2 ms
Figure 6.10 shows results for a spike-timing window width of 200 ms. We will
later compare our results to the experimental data of Bi and Poo. As expected, our
simulations show a strong increase in synaptic strength (LTP) for correlated spike
timing (the curve takes a strong positive value of 3.09 when ∆t = 0 ms) and a strong
decrease in synaptic strength (LTD) for uncorrelated spike timing (the curve takes
a strong negative value when ∆t = 100 ms). The curve is not symmetric, however,
because we have decided to promote pre-post timing over post-pre timing. Thus the
time-window for LTP on the post-pre side is smaller (−6 < ∆t < 0 ms), whereas the
time-window for LTP on the pre-post side is 5 times bigger (0 < ∆t < 30 ms).
The curve also exhibits a linear decay of the synaptic strength, which can be explained by
looking at the time courses of the primary traces: the mutual trace eij depends
on the product zi zj, and thus on the area common to the zi and zj curves. It can
be shown that this area decreases exponentially with the spike timing between
presynaptic and postsynaptic units. Because the individual traces pi and pj are
independent of the spike timing, the weight change depends only on the logarithm
of the mutual trace pij, which decays exponentially with the spike timing, thus
resulting in a linear decrease on both sides of the curve.
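The argument can be checked numerically with illustrative constants: if the mutual trace decays exponentially with the spike timing, its logarithm, and hence the weight, is linear in ∆t:

```python
import math

TAU = 10.0   # ms, assumed decay constant of the trace overlap (illustrative)
P0  = 0.5    # assumed overlap at dt = 0 (illustrative)

def weight(dt_ms):
    """Weight as the log of an exponentially decaying mutual trace."""
    p_ij = P0 * math.exp(-abs(dt_ms) / TAU)
    return math.log(p_ij)
```

Successive differences of the weight over equal steps of ∆t are constant, confirming the linear decrease on each side of the curve.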
No synaptic change occurs when the timing between spikes is so long that the
product zi zj is constantly zero (which is what happens at the edges of the curve).
Thus, LTD is triggered in the absence of correlated timing. It is a specificity of
our learning rule that the synaptic weight decreases under repeated uncorrelated stimulation (whereas when no input is presented, a spontaneous weight increase
occurs). The LTD level at the edges of the curves can be controlled by adding
some additive noise to the mutual traces (either at the level of eij or pij). This enables us to control the spontaneous level of weight decrease when the spike timing
is uncorrelated.
Chapter 7
Discussion
7.1 Model Dependencies
In this section, we investigate the parameters which influence our model. We
do not present a quantitative investigation, but we stress the aspects which deserve
the investigator's special care.
7.1.1 Learning Rule Parameters
Time-constants
The time-constants play a crucial role in our learning rule. Not only do they
individually control the synaptic trace dynamics, allowing us to promote one specific phenomenon over another, but they must also be considered as a whole,
because they exhibit a strong interdependence.
Depending on the maximum stimulation frequency (fmax = 50 Hz here) used in
our model, the time-constants τi and τj must be updated. They are
designed not to exceed 20 ms, because they account for fast spiking dynamics. As
we have shown, we can promote pre-post timing over post-pre timing by increasing
one time-constant over the other. We can also control the spike-timing window,
thus controlling the triggering of LTP or LTD. Recent investigations on STDP [27]
include all these kinds of local synaptic variables. Depending on their number and
complexity, we can try to reproduce results in triplet or quadruplet models.
The time-constants τe and τp control the long-term dynamics of our
model. As explained before, the secondary traces account for delayed-reward mechanisms and the tertiary traces act as a long-term memory. By adjusting these two
time-constants we can switch from a fast-operating working memory to a slow-dynamics long-term memory. The range of τp is adaptable to the specificity of the
learning task (from 1 second in fast learning sequences to 10 seconds or more for
long-lasting spike-timing dependence investigation).
Floor value and Initialization
The ε value is a lower bound for all synaptic traces: they cannot get smaller
than this value. In our model we set ε = 10−6. It is important that this value be
tuned together with the initialization of the synaptic traces (0.01 for the individual
traces and 0.0001 for the mutual traces), because the traces must not reach the floor value
too fast. Decreasing the floor value allows us to limit the spontaneous linear weight
increase and bias decrease observed in the absence of input. By setting ε, we
can model a standard level of additive noise or baseline, accounting for irregular
activity from other neurons connected to the cell.
Primary synaptic trace type
The primary traces zi(t) and zj(t), controlled by the time-constants τi and τj,
operate as a fast-dynamics memory, keeping a trace of the spiking activity of the
cell. According to the type of synaptic trace we choose (additive, saturated or proportional),
we can decide to give priority to the timing of recent spikes or to the
frequency over a certain time-scale.
The additive trace, presented in Chapter 3, is used to model synapses where
synaptic integration exhibits additive behaviour, meaning that on the occurrence of
every new spike, a fixed quantity of neurotransmitter is released into the synaptic
cleft and adds up to the synaptic resources already present on the postsynaptic site.
In our model, we implement the additive trace with a small time-constant when we
want a measure of the number of spikes occurring in a given time period
(see the Inference section in Chapter 5).
The saturated trace, on the other hand, discards the past history of the cell
when a new spike occurs. It is always updated to the same value, so that we
can always retrieve the occurrence of the last spike from the value of the trace,
at any time. The saturated trace accounts for synapses with a small
number of receptors, which get saturated on the occurrence of each spike. The neurotransmitter concentration then decays gradually at a certain speed, which corresponds
to the time-constant of the trace in our model.
We have used proportional traces for zi(t) and zj(t), which provide a compromise
between these two methods. We mention, however, that switching to saturated traces would
promote the spike-timing features of the learning rule, whereas additive traces would
stress the amount of spiking over a given period of time.
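One common parameterisation of the three trace types is what happens to z on the arrival of a spike; this is a sketch with illustrative constants, not the exact equations from Chapter 3:

```python
A_INC = 0.2    # illustrative increment per spike
Z_SAT = 1.0    # illustrative saturation value

def on_spike(z, kind):
    """Update the primary trace z on the arrival of a spike."""
    if kind == "additive":
        # a fixed quantity adds to what is already there
        return z + A_INC
    if kind == "saturated":
        # past history discarded: always reset to the same value
        return Z_SAT
    if kind == "proportional":
        # increment scaled by the remaining headroom (compromise)
        return z + A_INC * (1.0 - z)
    raise ValueError(kind)
```

Between spikes, all three variants decay exponentially with the trace time-constant; only the spike-time update differs.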
Time of presentation
Another important parameter is the time of presentation of each pattern, combined with the resting time between patterns. The longer the pattern is seen, the
stronger the memory. But our learner needs some adaptation before
jumping to another pattern, because the synaptic traces need to relax for some time
between events.
Ideally, we would like to impose a lag corresponding to a decay of the secondary
traces (4τe) between each learning sequence, to ensure that the network activity is not polluted by its previous knowledge. A good example of what we want
to avoid appears in the NEURON implementation: because the independence sequence occurs right after the strong-correlation sequence, the synaptic conductance
does not decay to its baseline. To solve this, we can impose a resting period, or the
learning sequence itself can be extended.
The resting time between patterns allows us to present multiple patterns without
introducing undesirable behaviour. But this repeated resting time acts on the global
activity of the cell, and therefore on the values of the tertiary traces pi, pj and pij. This
needs to be taken into account when computing the activity of a cell or comparing
it to the value of the tertiary traces.
7.1.2 Pattern Variability
Testing our implementation on noisy data is something that could be investigated.
It seems likely that the learning rule is robust to noisy input, for two reasons. The
first is that we have implemented Poisson-generated pattern presentation, based on a random
process, and this gives fairly good results. Thus, it is not the exact spike timing between
presynaptic and postsynaptic units that is crucial during learning, but rather the
probability of firing of each cell, which is based on the firing rate in our Poisson
generation. The second reason is that we are dealing with spiking units, whose
synaptic traces will not be affected by noise. Indeed, as long as the membrane
does not fire action potentials, additive noise influencing the resting membrane
potential will be discarded in our learning implementation. However, if the noise is
strong enough to trigger action potentials, stability is not guaranteed; but if we run
a simulation with reasonable stimulation frequencies and small enough values for
the primary time-constants τi and τj for a long time, we should be able to discard
random noise affecting the traces.
The fact that input activities represent the confidence of feature detection allows
us to feed graded input patterns to the units, representing relative confidence.
If an attribute value has been observed for sure, the corresponding unit is fed with
input 1. However, if the attribute value of a pattern is less reliable (because its
observation or recognition is not guaranteed), we can feed the corresponding unit
with a weak input, giving more weight to the bias term, which represents the a
priori probability of a certain attribute. This allows us to use the spiking learning
rule as a classifier for ambiguous patterns. It can also be used in a recurrent network
to perform pattern completion or pattern reconstruction.
7.1.3 Learning-Inference Paradigm
Where to draw the boundary between when our brain learns and integrates stimuli from its
environment, and when it infers from its acquired knowledge and gives its own interpretation of the data, is a difficult question. There is a fine line between inference
and learning in a continuously operating network, whereas the distinction is completely clear in
an off-line learner.
The main question we have to address is: do we learn and infer in a sequential
fashion, as when we learn a set of words or numbers? Or do we copy the knowledge
acquired after a specific task onto another region of the brain where we process
retrieval? In our work, we make the first assumption, and we decide to switch
from learning mode to inference mode in a sequential manner, but within the same
simulation. It would be valuable to investigate the difference between these two
paradigms. The print-now signal κ allows us to bridge the gap between inference
and learning. Indeed, when we set it to a very low value, we can assume
that the network is still learning, but the dynamics of the learning are slowed so tremendously
that inference can operate in the meantime.
7.2 Comparison to other learning rules
In this section, we compare our learning rule to other existing
learning rules. An important motivation for the development of this
spiking BCPNN learning rule was to be able to compare it not only to the
non-spiking BCPNN version but also to some spike-timing dependent plasticity
learning rules. We expand on three comparisons: the evolution of weights and biases
during learning, in order to compare the two versions of the BCPNN learning rule;
the spike-timing dependence window, to compare with real LTP data from Bi and Poo
[6]; and finally a discussion of the analogy with the BCM rule [7].
7.2.1 Spiking vs Non-spiking Learning Rule
Figure 7.1 displays the time courses of the weight ωij and presynaptic bias βi
for abstract units learning (green: NS, blue: FS, red: PS), and the time courses
of the synaptic conductance gij and the presynaptic A-type potassium channel
conductance gki for the NEURON implementation (light blue).
First, we want to compare quantitatively the evolution of the weight ωij in the
abstract units implementation. As expected, the three curves exhibit the very same
dynamics, each one on top of the other. The curve corresponding to the non-spiking
pattern presentation (NS) is above the spiking frequency-based pattern presentation
(FS), which is itself always above the curve corresponding to the Poisson-generated
pattern presentation (PS). This can be explained by the pattern presentation scheme:
the longer the input activities yi(t) and yj(t) are set to a certain value, the more it
reverberates into the traces. So for our non-spiking presentation, the mutual trace
eij has time to be updated to a strong value, which enables a strong update of
the tertiary traces, hence of the weights. For FS, the pattern is not printed as
strongly as for NS, but the exact timing of the spikes between two units with the
same input makes it fit closely to the NS results.
It is striking, however, how closely these three curves reproduce the same behaviour,
meaning that the options we chose for pattern presentation were relevant. The PS
Figure 7.1: Spiking/non-spiking learning comparison. (a) Weight ωij and synaptic
conductance gij time courses. (b) Presynaptic bias βi and potassium channel
conductance gki time courses.
implementation displayed here corresponds to a single run. It would be valuable
to show an average over several runs, but it already gives a very good
quantitative fit to the data, meaning that our spiking learning rule is robust and
does not suffer from noisy input (at least, the exact spike timing does not play a key
role in the learning paradigm).
The synaptic conductance time course resulting from the NEURON implementation is a little different from the abstract units' curves. During the early part of
the correlation phase corresponding to the presentation of the first pattern (between
0 and 200 ms), it sticks to the FS curve, which is explained by the fact that they are
both frequency-based implementations firing at the same frequency. However, the
curve later seems to exhibit slower dynamics than the three others. This is due to
the slow potassium current accounting for spike-frequency adaptation: in the early
part of the learning phase the cell fires at 55 Hz, but later, spike-frequency adaptation
results in a firing frequency of 30 Hz for the rest of the stimulation. We note the
very same linear increase when learning mute inputs (between 6,000 ms and 8,000
ms), whose slope is in all cases set by the value of the tertiary time-constant τp.
The bias term and potassium conductance time courses are easier to interpret,
because they are simply a display of the presynaptic tertiary trace pi(t) on a logarithmic scale. We mention, however, that we display here the additive inverse of the
potassium channel conductance to make the comparison easier. Once again, the
NEURON curve fits tightly to the FS curve and gradually separates from it.
The effect of spike-frequency adaptation is even more noticeable here, because the
value of pi(t) is directly dependent on the firing frequency. For mute inputs (between
6,000 ms and 8,000 ms) we note the same slope for the four curves, corresponding to
the time-constant τp. This negative slope is the exact counterpart of the positive
one for the weight, and this is what guarantees stability during inference after such
a phase.
7.2.2 Spike-timing dependence and real data
We present here the comparison between our spike-timing procedure simulation
results and the real neural data on cultured hippocampal neurons obtained by Bi
and Poo [6]. Figure 7.2 shows the results obtained in both cases.
Our curve shows qualitative similarities with the real data: the existence of a
spike-timing window triggering long-term potentiation, centered on zero spike timing;
the promotion of pre-post timing over post-pre timing in triggering an
increase in synaptic strength; and long-term depression at the edges of the curve,
corresponding to uncorrelated timing between spikes. Even if the slopes of the two
curves cannot be compared quantitatively, because we have a percentage change
in current on the one hand and an absolute synaptic conductance change on the other,
we stress that the two curves show the same qualitative behaviour,
at least for spike timings close to zero.
The BCPNN curve exhibits LTP for strongly correlated spike timings and LTD,
Figure 7.2: Spike-timing dependence comparison. (a) Original data from Bi and
Poo, 1998. (b) BCPNN spike-timing dependence window.
as soon as the spike-timing amplitude becomes too large. This is somewhat different from STDP rules, which exhibit LTP, LTD, and no significant synaptic
change when no exact pre-post or post-pre timing is recorded. In the data from Bi
and Poo, a decrease of about 40% in EPSC amplitude, accounting for LTD, occurs
for negative spike timings (−20 < ∆t < 0 ms).
We have tried, first by introducing noise into the traces and second by updating the
floor value of the traces, to obtain LTD only for small negative spike timings and no
significant change at the far edges of the curves, but our attempts were unsuccessful.
We conclude that the BCPNN learning rule needs to include an active process to
prevent LTD for strongly negative spike timings, because the interplay of the
time-constants τi and τj alone is structurally insufficient.
7.2.3 Sliding threshold and BCM Rule
The BCM rule [7] refers to the theory of synaptic modification first proposed by
Elie Bienenstock, Leon Cooper, and Paul Munro in 1982 to account for experiments
measuring the selectivity of neurons in primary sensory cortex and its dependency
on neuronal input. It is characterized by a rule expressing synaptic change as a
Hebb-like product of the presynaptic activity and a nonlinear function φ(yj, θM) of
the postsynaptic activity yj(t). For low values of the postsynaptic activity (yj < θM), φ
is negative, and for yj > θM, φ is positive.
The rule is stabilized by allowing the modification threshold θM to vary as a
super-linear function of the previous activity of the cell. Unlike traditional methods
of stabilizing Hebbian learning, this "sliding threshold" provides a mechanism for
incoming patterns, as opposed to converging afferents, to compete. The BCM rule
is noted for its biological relevance and was proposed to account for the
development of neuron selectivity in the visual cortex. Several improvements of
this learning rule have been proposed, by Intrator and Cooper in 1992 [17] and by Law and
Cooper in 1994 [3]. A detailed exploration can be found in the book Theory of
Cortical Plasticity [10].
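For comparison, the classical form of the BCM rule with its sliding threshold can be sketched as follows; the constants are illustrative and the common choice of tracking the running average of the squared postsynaptic activity is assumed:

```python
ETA = 0.01         # learning rate (illustrative)
TAU_THETA = 100.0  # time-constant of the sliding threshold (illustrative)

def bcm_step(w, theta, x, y, dt=1.0):
    """One step of the BCM rule: dw = eta * x * phi(y, theta).

    phi(y, theta) = y * (y - theta): negative below the threshold (LTD),
    positive above it (LTP). The threshold slides toward y^2, a
    super-linear function of the recent postsynaptic activity.
    """
    phi = y * (y - theta)
    w += dt * ETA * x * phi
    theta += dt * (y * y - theta) / TAU_THETA
    return w, theta
```

The sliding threshold θ plays the stabilizing role that the bias term plays in the BCPNN rule: both track the cell's past activity and set the boundary between potentiation and depression.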
There is a strong analogy between the bias term in the BCPNN learning rule and
the "sliding threshold" in the BCM rule: both depend on the past activity
of the cell, and both define a threshold between long-term potentiation and long-term depression. It would be extremely valuable to implement the BCM rule for
abstract units and compare the evolution of the bias term in the BCPNN model
to the evolution of the threshold θM in the BCM context. Since this rule accounts
for biologically observed phenomena, we could substantially improve our model by
modifying the bias term to mimic a BCM-like threshold adaptation.
7.3
Further Developments and Limitations
In this section, we propose a series of possible developments for the spiking
BCPNN learning rule. Most of them stem from the need to overcome intrinsic
limitations of our model.
7.3.1
Network implementation
All the work presented in this project deals only with two units and single-synapse learning, which limits the generality of our results. This is especially
true in inference mode, which is somewhat artificial with a single synapse
between two units, because a retrieval mapping is meant to receive input
from a set of units, not from one alone. Only then can we observe the
strengthening of connections between correlated attributes or groups of attributes.
Restricting the setup to a pair of units therefore gives our inference
results limited scope, and a crucial development would be to introduce the learning rule in
a network context.
The architecture of the network is also a matter of concern. Since biological neural networks are very sparsely connected, a trade-off must be found between
the size of the network (which makes the computational cost grow rapidly)
and the percentage of connections between neurons. When implemented in a recurrent architecture, the learning rule produces lateral inhibition
between units that are not active together, so a silent unit receives
inhibitory input from almost every unit in the network. This must be handled
in further development, and introducing the new learning rule into a fully connected network remains a thorny task.
7.3.2
RSNP cells and inhibitory input
For the sake of biological plausibility, we need to investigate further what
happens when the synaptic conductance g(pi , pj , pij ) in our model becomes
negative. This arises when the function gcomp (pi , pj , pij ) takes negative values, due
to a strong anti-correlation between two units. In our model this does not create any major implementation issue: if the synaptic conductance becomes negative,
everything happens as if current were flowing through the synaptic cleft in the opposite
direction. The resulting effect on the membrane voltage is similar to that
of an inhibitory input.
However, this is biologically unrealistic, because our synapses
have an orientation: information passes from the presynaptic cell to the postsynaptic cell. To overcome this problem and stay close to what is observed in real neurons,
we propose an implementation of BCPNN synapses with RSNP cells.
Each unit is now composed of a pair of cells: an RSNP cell, and a pyramidal
cell receiving an inhibitory connection from it. The behaviour of the RSNP
cell and its corresponding pyramidal cell is complementary: if the first is active,
the other is silent, and vice versa.

(a) Single-synapse - RSNP cells
(b) Multi-synapse - RSNP cells
Figure 7.3: BCPNN synapses implementing RSNP cells and inhibitory connections

Both the postsynaptic cell and its corresponding
RSNP cell take input from other presynaptic pyramidal cells, but when the computed conductance becomes negative, the pyramidal-pre/pyramidal-post synaptic
conductance is set to zero and the pyramidal-pre/RSNP-post connection becomes
active, triggering IPSPs (Inhibitory Post-Synaptic Potentials) in the postsynaptic
pyramidal cell. Figure 7.3 shows a proposal for this implementation in single- and
multi-synapse contexts.
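The sign dispatch described above can be sketched as follows. This is only a schematic of our proposal; the function name and return convention are illustrative, not part of the NEURON implementation.

```python
def route_conductance(g_comp, g_max):
    """Dispatch a signed BCPNN conductance onto the two proposed pathways.

    Returns (g_exc, g_inh): the pyramidal-pre/pyramidal-post conductance,
    and the pyramidal-pre/RSNP-post conductance that drives inhibition
    of the postsynaptic pyramidal cell.
    """
    if g_comp >= 0:
        return g_max * g_comp, 0.0    # direct excitatory path active
    # negative weight: silence the direct path, activate the RSNP path
    return 0.0, g_max * (-g_comp)
```

Either way, exactly one of the two pathways is active, so the postsynaptic membrane always sees a non-negative conductance.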
7.3.3
Hypercolumns, basket cells and lateral inhibition
In order to account for the hypercolumnar structure, it would be valuable to
introduce the learning rule in a network composed of several minicolumns grouped
into a hypercolumn. This modular structure has been observed in the cat visual
cortex, and its attractor dynamics were implemented by Lundqvist et al. in
2006 [24]. The main idea is that only one neuron is active within a minicolumn
(which imposes lateral inhibition between units in a minicolumn). Connections
between pyramidal cells are thus very long, and they are the only ones that enter or
leave the hypercolumn.
A new type of cell, the basket cell, is needed to implement lateral inhibition
within a hypercolumn. Indeed, in each minicolumn one unit is sensitive to a specific
value of one attribute or feature (shape orientation, colour); thus, as mentioned
earlier, it is important that this attribute or feature takes only one value. In order to
achieve this, we include basket cells in each hypercolumn (the number of basket cells
equaling the number of pyramidal cells per minicolumn). Each basket cell receives
excitatory connections from the pyramidal cell in each minicolumn corresponding to
the specific feature, and gives inhibitory input to the RSNP cells corresponding
to those pyramidal cells. Such an implementation guarantees stability for graded
input, but also dramatically increases the computing time.
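Functionally, the basket cells enforce a winner-take-all normalization over the units of a hypercolumn, which can be caricatured as follows. This is a toy sketch that ignores the actual cellular dynamics; the function name is ours.

```python
import numpy as np

def hypercolumn_wta(activations):
    """Toy lateral inhibition within one hypercolumn: keep only the most
    active unit (one value of the attribute), suppress all the others."""
    out = np.zeros(len(activations))
    out[int(np.argmax(activations))] = 1.0
    return out
```

In the proposed circuit the suppression is carried by the basket-cell-to-RSNP inhibitory connections rather than computed explicitly.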
7.3.4
Parallel computing
Dealing with complex units such as minicolumns and implementing a large
number of auxiliary cells for each unit may increase the computational time dramatically. To be able to run long simulations with complex units and networks,
parallel computing is a precious tool. The NEURON simulation environment makes it easy to distribute network models and complex models of single
neurons over multiple processors, achieving nearly linear speedup [26]. The
speedup afforded by a parallel implementation of the presented learning rule would
make it possible to include it in a large-scale network of biologically detailed neurons and investigate the effects emerging at the network level. Some modification
of the code given here might be needed, however.
Chapter 8
Conclusion
In this Master's thesis project, we have presented and implemented an adaptation of the BCPNN learning rule for spiking units. The BCPNN model has been
developed thoroughly over the last thirty years and has proved relevant in many
domains, such as classification tasks (Holst 1997) [15], a Hebbian working memory
model (Sandberg 2003) [32], and pharmacovigilance and data mining (Lindquist et
al. 2000). In all of these works there has been a strong motivation to have a version
of this learning rule operating with spiking units. We propose here an implementation in the NEURON language, based on a mapping from the Bayesian weights to
a synaptic conductance, and from the bias term to an activity-dependent A-type
potassium channel conductance.
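For reference, the trace chain underlying this mapping can be sketched in plain Python. This is our own condensation of the NMODL code in Appendix A: it assumes equal pre- and postsynaptic time constants, omits the gain K, and uses an arbitrary binary spike-train encoding.

```python
import numpy as np

def bcpnn_weight(pre, post, dt=1.0, tau_z=20.0, tau_e=200.0,
                 tau_p=1000.0, r=0.8, eps=1e-6):
    """Exponential trace chain z -> e -> p driving the BCPNN weight.

    `pre` and `post` are binary spike trains (1 = spike in that bin).
    Returns the weight trajectory w = log(pij / (pi * pj)), i.e. the
    quantity mapped onto a synaptic conductance in Appendix A.
    """
    zi = zj = ei = ej = 0.01
    pi = pj = 0.01
    eij = pij = 1e-4
    w = []
    for si, sj in zip(pre, post):
        if si:
            zi += r * (1.0 - zi)      # presynaptic spike bumps its trace
        if sj:
            zj += r * (1.0 - zj)      # postsynaptic spike bumps its trace
        zi -= dt * zi / tau_z         # fast traces decay exponentially
        zj -= dt * zj / tau_z
        ei += dt * (zi - ei) / tau_e  # intermediate (eligibility) traces
        ej += dt * (zj - ej) / tau_e
        eij += dt * (zi * zj - eij) / tau_e
        pi += dt * (ei - pi) / tau_p  # slow probability estimates
        pj += dt * (ej - pj) / tau_p
        pij += dt * (eij - pij) / tau_p
        w.append(np.log(max(pij, eps * eps) / (max(pi, eps) * max(pj, eps))))
    return np.array(w)
```

Units that fire together drive pij above the product pi · pj, so the weight becomes positive; units that fire apart drive it negative.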
Our work presents results in single-synapse learning and inference. Instead of
testing our learning rule in a network context, we have focused on fine-tuning
the cell parameters in order to lay the foundations for the further development and
testing of this learning rule in specific tasks. This work can be extended in many
directions, but we believe the results presented here are already of
interest. Not only had no spiking version of the BCPNN learning rule ever been
implemented before, but the comparison with experimental STDP data enlarges the scope of our
work. This opens the door to hybrid learning rules that reconcile spike-timing-dependent and probabilistic features in a single rule operating with
spiking units.
We are confident that continuing this work will prove rewarding in the years
to come. Though it builds on long-established mathematical theory,
Bayesian-Hebbian networks still prove valuable for many tasks. To
bridge the gap between a phenomenological approach and the theoretical study of
the brain, one must seriously consider adapting existing algorithms
to modern simulation languages. Such an approach
may allow us to unify our knowledge, and to address what we do not yet know in
a more cooperative and efficient manner.
Bibliography
[1] http://en.wikipedia.org/wiki/Hebbian_theory
[2] http://en.wikipedia.org/wiki/Hodgkin_Huxley_model
[3] http://www.scholarpedia.org/article/BCM_rule
[4] Stemmler M, Notes on Information Maximization in Single Neurons, http://www.klab.caltech.edu/stemmler/
[5] Azouz R, Gray CM (2000) Dynamic spike threshold reveals a mechanism for synaptic coincidence detection in cortical neurons in vivo, Proc Natl Acad Sci USA 97:8110-8115
[6] Bi G-Q, Poo M-M (1998) Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength and Postsynaptic Cell Type, J. Neurosci. 18(24): 10464-72.
[7] Bienenstock E, Cooper L, Munro P (1982) Theory for the Development of Neuron Selectivity: Orientation Specificity and Binocular Interaction in Visual Cortex, The Journal of Neuroscience Vol. 2, No. 1 (January, 1982) 32-48
[8] Carnevale N, Hines M (2006) The NEURON Book, New York, Cambridge University Press.
[9] Clopath C, Ziegler L, Vasilaki E, Busing L, Gerstner W (2008) Tag-Trigger-Consolidation: A Model of Early and Late Long-Term Potentiation and Depression, PLoS Comput Biol 4(12): e1000248. doi:10.1371/journal.pcbi.1000248
[10] Cooper L, Intrator N, Blais N, Shouval H (2004) Theory of cortical plasticity
World Scientific, New Jersey.
[11] Fuster J. M. (1995) Memory in the Cerebral Cortex, Cambridge, Massachusetts, The MIT Press.
[12] Gerstner W, Kistler W (2002) Spiking Neuron Models : Single Neurons, Population, Plasticity New York, Cambridge University Press.
[13] Hebb D.O. (1949) The Organization of Behavior, New York, Wiley
[14] Hodgkin A, Huxley A (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve, J. Physiol.
1952;117;500-544
[15] Holst A (1997) The Use of a Bayesian Neural Network Model for Classification
Tasks, Disseration, Department of Numerical Analysis and Computer Science,
Royal Institute of Technology, Stockholm, Sweden
[16] Hopfield J. J. (1982) Neural networks and physical systems with emergent collective computational properties, Proc. Nat. Acad. Sci. (USA) 79, 2554-2558.
[17] Intrator N, Cooper L (1992) Objective Function Formulation of the BCM Theory, Neural Networks (5) 3-17
[18] Kononenko I. (1989) Bayesian Neural Networks, Biological Cybernetics Journal
Vol. 61, pp. 361-370.
[19] Jung S-C, Hoffman D (2009) Biphasic Somatic A-Type K+ Channel Downregulation Mediates Intrinsic Plasticity in Hippocampal CA1 Pyramidal Neurons, PLoS ONE 4(8): e6549. doi:10.1371/journal.pone.0006549
[20] Kandel E, Schwartz J, Jessel T (1995) Essentials of NeuroScience and Behaviour, Appleton and Lange, Norwalk, Connecticut
[21] Antonov I, Antonova I, Kandel E, Hawkins R (2003) Activity-Dependent Presynaptic Facilitation and Hebbian LTP Are Both Required and Interact during
Classical Conditioning in Aplysia, Neuron 37 (1): 135-147, doi:10.1016/S0896-6273(02)01129-7
[22] Lansner A, Ekeberg O (1989) A One-Layer Feedback Artificial Neural Network
with a Bayesian Learning Rule, International Journal of Neural Systems Vol.
1, No. 1 (1989) 77-87
[23] Lansner A, Holst A (1996) A Higher Order Bayesian Neural Network with
Spiking Units, International Journal of Neural Systems Vol. 7, No. 2 (May,1996)
115-128
[24] Lundqvist M, Rehn M, Djurfeldt M, Lansner A (2006) Attractor dynamics
in a modular network model of neocortex, Network: Computation in Neural
Systems, Volume 17, Issue 3 September 2006, 253-276.
[25] Mayr C, Partzsch J, Schuffny R (2009) Rate and Pulse-Based Plasticity Governed by Local Synaptic State Variables
[26] Migliore M, Cannia C, Lytton W, Markram H and Hines M (2006) Parallel
network simulations with NEURON, Journal of Computational Neuroscience
21:110-119.
[27] Morrison A, Diesmann M, Gerstner W (2008) Phenomenological Models of
Synaptic Plasticity based on Spike Timing, Biol Cybern (2008) 98:459-478, DOI
10.1007/s00422-008-0233-1
[28] Potjans W, Morrison A, Diesmann M (2009) A Spiking Neural Network Model
of an Actor-Critic Learning Agent Neural Computation 21, 301-339 (2009)
[29] Ramón y Cajal S (1894) The Croonian Lecture: La Fine Structure des Centres Nerveux, Proceedings of the Royal Society of London 55: 444-468. doi:10.1098/rspl.1894.0063
[30] Rubin J, Gerkin C, Bi G-Q, Chow C (2005) Calcium Time Course as a Signal
for Spike-Timing-Dependent Plasticity, J. Neurophysiol. 93:2600-2613.
[31] Sandberg A, Lansner A, Petersson K-M, Ekeberg O (2002) Bayesian attractor
networks with incremental learning, Network: Computation in Neural Systems
13(2): 179-194.
[32] Sandberg A, Lansner A, Tegner J (2003) A working memory model based on
fast Hebbian learning, Network: Computation in Neural Systems 14: 789-802.
[33] Wahlgren N, Lansner A (2001) Biological evaluation of a Hebbian-Bayesian
learning rule, Neurocomputing 38-40: 433-438.
[34] Xu J, Kang N, Jiang L, Nedergaard M, Kang J (2005) Activity-Dependent Long-Term Potentiation of Intrinsic Excitability in Hippocampal CA1 Pyramidal Neurons, The Journal of Neuroscience Vol. 25, No. 7 (February 16, 2005) 1750-1760, doi:10.1523/JNEUROSCI.4217-04.2005
[35] Benda J, Herz A V-M (2003) A Universal Model for Spike-Frequency Adaptation, Neural Computation Vol. 15, No. 11 (November 2003), 2523-2564
Appendix A
NMODL files
A.1
Synapse Model
File BCPNNSynapse.mod
Synapse Implementation for the BCPNN Learning Rule
NEURON {
POINT_PROCESS BCPNNSyn
RANGE e, i, g, gmax, gcomp, glearn
RANGE Tau_i, Tau_j, Te, Tp, eps, float, r, K
RANGE zi, zj, ei, ej, eij, pi, pj, pij
NONSPECIFIC_CURRENT i
}
UNITS {
(S) = (siemens)
(pS) = (picosiemens)
(mV)= (millivolt)
(mA)= (milliamp)
}
PARAMETER {
Tau_i = 20.0 (ms) <1e-9,1e9>
Tau_j = 20.0 (ms) <1e-9,1e9>
Te = 200.0 (ms) <1e-9,1e9>
Tp = 1000.0 (ms) <1e-9,1e9>
eps = 1e-6
r = 0.8
e = 0 (mV)
gmax = 500 (pS)
glearn = 0
K = 1 <1e-9,1e9>
}
ASSIGNED {
g (pS)
gcomp
v (mV)
i (nA)
}
STATE {
zi
zj
ei
ej
eij
pi
pj
pij
}
INITIAL {
zi = 0.01
zj = 0.01
ei = 0.01
ej = 0.01
eij = 0.0001
pi = 0.01
pj = 0.01
pij = 0.0001
}
BREAKPOINT {
SOLVE state METHOD cnexp
gcomp = g_comp(pi,pj,pij)
if (K<eps) { g = gmax * glearn * zi }
else { g = gmax * gcomp * zi }
i = 1e-6*g*(v - e)
}
DERIVATIVE state {
zi’ = -zi/Tau_i
zj’ = -zj/Tau_j
ei’ = (zi-ei)/Te
ej’ = (zj-ej)/Te
eij’ = ((Tau_j/Tau_i)*zi*zj-eij)/Te
pi’ = K*((ei-pi)/Tp)
pj’ = K*((ej-pj)/Tp)
pij’ = K*((eij-pij)/Tp)
}
NET_RECEIVE (weight) {
if (weight >= 0) { zi = zi + r*(1-zi) }
else { zj = zj + r*((Tau_i/Tau_j)-zj) }
}
FUNCTION g_comp(pi, pj, pij) {
if (pi < eps) { pi = eps }
if (pj < eps) { pj = eps }
if (pij < eps*eps) { pij = eps*eps }
if ((pij/(pi*pj)) < eps) { g_comp = log(eps) }
else { g_comp = log(pij/(pi*pj)) }
}
A.2
A-Type Potassium Channel
File ATypePotassium.mod
A-Type Potassium current for intrinsic excitability
NEURON {
SUFFIX ka
USEION k READ ek WRITE ik
RANGE gk, gkbar, gcomp, glearn, i
RANGE Tau, Te, Tp, eps, float, r, K
RANGE z, e, p, thresh, delay
}
UNITS {
(S) = (siemens)
(uS) = (microsiemens)
(mV)= (millivolt)
(mA)= (milliamp)
}
PARAMETER {
Tau = 20.0 (ms) <1e-9,1e9>
Te = 200.0 (ms) <1e-9,1e9>
Tp = 1000.0 (ms) <1e-9,1e9>
eps = 1e-6
r = 0.8
gkbar = 54.8 (uS/cm2)
thresh = -20 (mV)
delay = 7
glearn = 0
K = 1 <1e-9,1e9>
}
ASSIGNED {
v (mV)
ek (mV)
ik (mA)
i (mA)
gk (S/cm2)
gcomp
firing
up
time
counter
ready
}
STATE {
z
e
p
}
BREAKPOINT {
SOLVE states METHOD cnexp
gcomp = g_comp(p)
if (K<eps) { gk = gkbar * glearn }
else { gk = gkbar * gcomp }
i = 1e-6*gk*(v-ek)
ik = i
}
DERIVATIVE states {
detect(v)
if (firing == 1) { z = z + r*(1-z) }
z’ = -z/Tau
e’ = (z-e)/Te
p’ = (e-p)/Tp
}
INITIAL {
z = 0.01
e = 0.01
p = 0.01
up = 0
firing = 0
time = 0
counter = 0
ready = 0
}
FUNCTION g_comp(p) {
if (p < eps) { g_comp = 1 }
else { g_comp = log(p)/log(eps) }
}
PROCEDURE detect(v (mV)) {
if ( v>thresh && up==0 ) {
counter = delay
up = 1
ready = 1
}
if (ready==1 && counter>0) { counter = counter-1 }
if( ready==1 && counter<=0) {
firing = 1
ready = 0
time = t
}
if ( t>time ) { firing = 0 }
if ( v<thresh ) { up = 0 }
}
Appendix B
Hodgkin-Huxley Delayed Rectifier
Model
B.1
Voltage Equations
Each ionic current is given by $V_m = \frac{1}{g_i} I_i + E_i$, so that $I_i = g_i (V_m - E_i)$, and it follows that
$$I_{ion} = I_{leak} + I_{Na} + I_{K} = g_{leak}(V_m - E_{leak}) + g_{Na}\, m^3 h\, (V_m - E_{Na}) + g_{K}\, n^4 (V_m - E_{K})$$
with $E_{leak} = -70.3$ mV, $E_{Na} = +55$ mV, $E_{K} = -75$ mV, $g_{leak} = 20.5\ \mu\mathrm{S\,cm^{-2}}$, $g_{Na} = 60.0\ \mathrm{mS\,cm^{-2}}$ and $g_{K} = 5.1\ \mathrm{mS\,cm^{-2}}$. The final voltage equation is given by
$$I_{app} = C_m \frac{dV_m}{dt} + g_{leak}(V_m - E_{leak}) + g_{Na}\, m^3 h\, (V_m - E_{Na}) + g_{K}\, n^4 (V_m - E_{K}) \qquad \mathrm{(B.1)}$$

B.2
Equations for Gating Variables
This presentation of the Hodgkin-Huxley formalism follows [4]. The gating variables $m$, $h$, $n$, $a$ and $b$ that control the flow of current through the voltage-dependent conductances obey the equations
$$\frac{dm}{dt} = \phi\, [\alpha_m(V_m)(1 - m) - \beta_m(V_m)\, m]$$
$$\frac{dh}{dt} = \phi\, [\alpha_h(V_m)(1 - h) - \beta_h(V_m)\, h]$$
$$\frac{dn}{dt} = \frac{\phi}{2}\, [\alpha_n(V_m)(1 - n) - \beta_n(V_m)\, n]$$
$$\tau_a(V_m)\, \frac{da}{dt} = a_\infty(V_m) - a \qquad \text{and} \qquad \tau_b(V_m)\, \frac{db}{dt} = b_\infty(V_m) - b$$
where $\phi = 3.8$ is a temperature factor reflecting the difference between the 6.3 °C of the original Hodgkin-Huxley experiments and the 18.5 °C of the Connor and Stevens crustacean experiments.
$$\alpha_m(V_m) = \frac{0.1\,(V_m + 29.7)}{1 - \exp[-(V_m + 29.7)/10]} \qquad\qquad \beta_m(V_m) = 4 \exp[-(V_m + 54.7)/18]$$
$$\alpha_h(V_m) = 0.07 \exp[-(V_m + 48)/20] \qquad\qquad \beta_h(V_m) = \frac{1}{1 + \exp[-(V_m + 18)/10]}$$
$$\alpha_n(V_m) = \frac{0.1\,(V_m + 45.7)}{1 - \exp[-(V_m + 45.7)/10]} \qquad\qquad \beta_n(V_m) = 0.125 \exp[-(V_m + 55.7)/80]$$
$$a_\infty(V_m) = \left[ \frac{0.0761\, \exp[(V_m + 94.22)/31.84]}{1 + \exp[(V_m + 1.17)/28.93]} \right]^{1/3} \qquad b_\infty(V_m) = \left[ 1 + \exp[(V_m + 53.3)/14.54] \right]^{-4}$$
$$\tau_a(V_m) = 0.3632 + \frac{1.158}{1 + \exp[(V_m + 55.96)/20.12]} \qquad \tau_b(V_m) = 1.24 + \frac{2.678}{1 + \exp[(V_m + 50)/16.027]}$$
This choice of somatic spiking conductances allows spiking to occur at arbitrarily
low firing rates, as is typically observed in cortical cells.
Appendix C
NEURON simulation parameters
Pyramidal cell layer 2/3
  nseg       1
  diameter   61.4     µm
  L          61.4     µm
  Ra         150.0    Ωcm

HH sodium channel
  ḡNa        0.06     S/cm2
  ENa        50       mV

HH potassium channel
  ḡK         0.0051   S/cm2
  EK         -90      mV

Leak channel
  gleak      0.0205   mS/cm2
  Eleak      -70.3    mV

Slow dynamics K channel
  ḡKim       0.07     mS/cm2
  EKim       -90      mV
  τmax,im    2269     ms

BCPNN synapse
  τi         20       ms
  τj         20       ms
  τe         200      ms
  τp         1000     ms
  ε          10−6
  r          0.8
  gmax       500      pS

A-type K channel
  τ          20       ms
  τe         200      ms
  τp         2000     ms
  ε          10−6
  r          0.8
  gk         54.8     µS/cm2
  delay      7        ms
Table C.1: Parameters describing the one-compartmental cell modeling a layer 2/3
pyramidal neuron, the included ion channels and BCPNN synapse. Ra stands for
the axial resistivity and L for the length of the section. The cell has a leakage conductance, a voltage-gated sodium channel, a voltage-gated potassium channel, an
activity dependent potassium channel with slow dynamics and an A-type potassium
channel.