What to make of:
• distributed representations
• summation of inputs
• Hebbian plasticity
?
Competitive nets
$\{r_1, \ldots, r_i, \ldots, r_N\}$
$r_i = f\left(\sum_j w_{ij} r_j\right)$
$\Delta w_{ij} \propto r_i \left(r_j - \langle r_j \rangle\right)$
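A minimal sketch of one competitive learning step, to make the two formulas above concrete. It assumes that f is a hard winner-take-all and that the proportionality constant is a learning rate `lr`; both choices, and the names `competitive_step` and `r_in_mean`, are illustrative and not taken from the slides.

```python
import numpy as np

def competitive_step(w, r_in, r_in_mean, lr=0.1):
    """One competitive update: r_i = f(sum_j w_ij r_j), then
    dw_ij = lr * r_i * (r_j - <r_j>).

    Assumption: f is a hard winner-take-all (only the most activated
    output unit fires); softer competition would also fit the slide.
    """
    h = w @ r_in                      # summed input to each output unit
    r_out = np.zeros(w.shape[0])
    r_out[np.argmax(h)] = 1.0         # winner-take-all output
    # Hebbian term gated by the output, with the mean input subtracted
    w_new = w + lr * np.outer(r_out, r_in - r_in_mean)
    return w_new, r_out
```

Only the weights of the winning unit move, towards the mean-subtracted input, which is what lets different output units specialize on different clusters of inputs.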
Pattern associators
Autoassociators
Competitive Networks
• “discover” structures in input
space
• may remove redundancy
• may orthogonalize
• may categorize
• may sparsify representations
• can separate out even linear
combinations
• can easily be generalized to self-organizing topographic maps
..use multiple charts
to code environments...
..use the sheet
to code position...
context
position
identity
Object 1 in position 1
Object 2 in position 1
Object 1 in position 2
Discrete attractors, with units
arranged in a cortical network
Leah Krubitzer, Neuron, 2007
The dorsal cortex takes over
hedgehog
cat
monkey
Isocortex is laminated
Arealization
and
Memory in the
Cortex
monkey
Main perspectives:
a) Hierarchical
b) Modular
c) Content-based
The hierarchical perspective
ET Rolls, Proc Roy Soc 1992
The hierarchical perspective
The Elizabeth Gardner approach
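..instead of neural activity
(as in the Hopfield model)..
$r_i = g\left[\sum_j w_{ij}^{\mathrm{HEBB}} r_j - \Theta\right]^+$, with $w_{ij}^{\mathrm{HEBB}} \approx \sum_\mu r_i^\mu r_j^\mu$
..do thermodynamics over
connection weights, i.e.
consider whether, among
all their possible values,
there are any which satisfy
$r_i^\mu = g\left[\sum_j w_{ij} r_j^\mu - \Theta\right]^+$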
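The two conditions above can be sketched in a few lines: build Hebbian weights from stored patterns, then test whether a given weight matrix makes every pattern a fixed point of the threshold-linear update. This is only an illustration; the 1/N normalization, the zeroed self-connections, and the function names are assumptions, and the Gardner calculation itself (a volume in weight space) is not reproduced here.

```python
import numpy as np

def hebbian_weights(patterns):
    """w_ij ~ sum_mu r_i^mu r_j^mu for patterns of shape (p, N).

    Assumptions: weights are divided by N and self-connections are
    removed; the slide only gives the proportionality.
    """
    n_units = patterns.shape[1]
    w = patterns.T @ patterns / n_units
    np.fill_diagonal(w, 0.0)
    return w

def update(w, r, theta=0.0, gain=1.0):
    """Threshold-linear update r_i = g [sum_j w_ij r_j - theta]^+."""
    return gain * np.maximum(w @ r - theta, 0.0)

def patterns_are_fixed_points(w, patterns, theta=0.0, gain=1.0):
    """Check r^mu = g [w r^mu - theta]^+ for every stored pattern."""
    return all(np.allclose(update(w, r, theta, gain), r) for r in patterns)
```

The same check applies to any candidate weight matrix, Hebbian or not, which is the sense in which the weights, rather than the activities, become the variables of interest.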
The hierarchical perspective
The Elizabeth Gardner approach
Backpropagation and E-M algorithms
Network activation
Forward Step: Δr
Error propagation
Backward Step: Δw
Expectation – sampling the world
Maximization – of the match between the world
and our internal model of the world
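As a rough illustration of the forward and backward steps named above, here is a sketch of backpropagation for a tiny two-layer network. The architecture (tanh hidden layer, linear output), the squared-error loss, and all names are assumptions made for the example; the E-M side of the slide is not covered.

```python
import numpy as np

def forward(w1, w2, x):
    """Forward step: propagate activity r through the network."""
    hidden = np.tanh(w1 @ x)
    output = w2 @ hidden
    return hidden, output

def backward(w1, w2, x, target, lr=0.05):
    """Backward step: propagate the error and return updated weights.

    Assumption: loss is 0.5 * ||output - target||^2.
    """
    hidden, output = forward(w1, w2, x)
    err = output - target
    grad_w2 = np.outer(err, hidden)
    # error sent back through w2 and through the tanh nonlinearity
    delta_hidden = (w2.T @ err) * (1.0 - hidden ** 2)
    grad_w1 = np.outer(delta_hidden, x)
    loss = 0.5 * float(err @ err)
    return w1 - lr * grad_w1, w2 - lr * grad_w2, loss
```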
The modular perspective
The Braitenberg model
N pyramidal cells
√N compartments
√N cells each
Apical synapses
Basal synapses
Granulating the dorsal wall leads to the mammalian isocortex:
the brand new ‘neocortex’ has laminated, i.e. inserted a granular layer IV
in between two pyramidal cell layers.
Layer IV granules are now (excitatory) interneurons
what does
this other
granulation
buy us?
Isocortical lamination
• emerges together with fine topographic mapping
• does not apply to the non-topographic olfactory system
• is underdeveloped in cetaceans
It might be a computational solution to the
need to relay precise information about
both ‘where’ and ‘what’ sensory stimuli are.
the model
[Figure: an input station projects through feedforward connections (spread Sff) to a patch of cortex with recurrent collaterals (spread Src). The input activity combines a spatial focus of radius R with a detailed pattern.]
The activation of units in the previous station is the product of a spatial
‘focus’, say, a Gaussian of radius R (which presumably would be picked up by
optical imaging, or by multi-unit recording) and a detailed unit-by-unit pattern
of activity (which would require single unit recording to be revealed). p patterns of
activity (e.g. 2-12) are established at the beginning, drawn at random
from a given distribution, and used repeatedly in one simulation.
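A minimal sketch of how such an input could be built, as the product of a Gaussian focus and a unit-by-unit pattern. It assumes, for simplicity, a one-dimensional row of input units; the function name and arguments are hypothetical.

```python
import numpy as np

def focused_input(pattern, focus, radius):
    """Input activity = Gaussian spatial focus x detailed pattern.

    Assumption: units sit on a 1-D line at positions 0..N-1; the
    model may well use a 2-D sheet, which only changes the distance.
    """
    positions = np.arange(pattern.size)
    envelope = np.exp(-0.5 * ((positions - focus) / radius) ** 2)
    return envelope * pattern
```

Moving `focus` shifts the ‘where’ of the activation while leaving the ‘what’ (the pattern) unchanged.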
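The activation of units in the cortical patch is compared with the
activations resulting from the application of each input pattern at each
spatial focus, to decode the pattern $\mu$ and focus $x$ of the current
activation. This allows measuring

$I_{\mathrm{pos}} = \sum_{x_{\mathrm{real}},\, x_{\mathrm{decoded}}} p(x_{\mathrm{real}}, x_{\mathrm{decoded}}) \log_2 \frac{p(x_{\mathrm{real}}, x_{\mathrm{decoded}})}{p(x_{\mathrm{real}})\, p(x_{\mathrm{decoded}})}$

as well as

$I_{\mathrm{ident}} = \sum_{\mu_{\mathrm{real}},\, \mu_{\mathrm{decoded}}} p(\mu_{\mathrm{real}}, \mu_{\mathrm{decoded}}) \log_2 \frac{p(\mu_{\mathrm{real}}, \mu_{\mathrm{decoded}})}{p(\mu_{\mathrm{real}})\, p(\mu_{\mathrm{decoded}})}$

both population measures, reflecting activity in the whole patch.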
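Both measures are mutual informations between the real and the decoded variable, so they can be estimated from a table counting how often each real value was decoded as each value. A minimal sketch, assuming such a confusion table is available (the name `joint_counts` is illustrative):

```python
import numpy as np

def mutual_information(joint_counts):
    """Mutual information in bits from a (real x decoded) count table.

    Rows index the real value, columns the decoded one. Cells with
    zero count contribute nothing to the sum.
    """
    joint = np.asarray(joint_counts, dtype=float)
    joint = joint / joint.sum()
    p_real = joint.sum(axis=1, keepdims=True)      # shape (n, 1)
    p_decoded = joint.sum(axis=0, keepdims=True)   # shape (1, m)
    independent = p_real @ p_decoded               # p(real) p(decoded)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / independent[mask])))
```

With perfect decoding of 4 equiprobable patterns the table is diagonal and the function returns 2 bits; with decoding at chance it returns 0.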
Both recurrent and feedforward weights are modified according to a
simple ‘Hebbian’ associative rule, over the course of several training
epochs. Each training epoch involves presenting, in random order,
each input pattern at each activation focus. The map is thus pre-wired
at a coarse, statistical level, and self-organized at a finer scale.
After a training epoch, noisy versions, again of each pattern at each
activation focus, are presented for testing, with no weight change. The
full information about position and identity cannot be decoded from
the activation in the patch, because the activation in the input is noisy
(in practice, e.g. 40% of the input units follow the prescribed pattern, and 60% are
randomly activated with the same distribution)
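The noise procedure described above can be sketched as follows. The pattern distribution used here (binary units, active with a fixed probability) is a placeholder assumption, since the slides do not specify it; `draw_pattern` and `noisy_version` are hypothetical names.

```python
import numpy as np

def draw_pattern(rng, n_units, sparsity=0.2):
    """Hypothetical pattern distribution: each unit is active (1.0)
    with probability `sparsity`, otherwise silent (0.0)."""
    return (rng.random(n_units) < sparsity).astype(float)

def noisy_version(pattern, rng, fraction_kept=0.4):
    """Keep roughly `fraction_kept` of the units on the prescribed
    pattern; redraw the others from the same distribution."""
    keep = rng.random(pattern.size) < fraction_kept
    noise = draw_pattern(rng, pattern.size)
    return np.where(keep, pattern, noise)
```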
If R << Src, it is rather intuitive to predict how much information can be
relayed by feedforward projections of spread Sff:
$I_{\mathrm{pos}} \sim \log(1/S_{ff})$
$I_{\mathrm{ident}} \sim S_{ff}$
• Iident is small initially
• grows with learning
• no difference between layers
Results for p=4
• Ipos is less affected by learning
• decreases with more diffuse feedforward connections
• again, no difference between layers
These data, plotted as Ipos vs. Iident, demonstrate the what/where conflict as a boundary
• using more patterns merely shifts the same boundary upwards
Differentiating a granular layer (IV)
in which units receive focused FF connections, also more restricted RC
connections, and follow a specific dynamics
• may nail down the focus of activation within the cortical
map (preserving detailed positional information)
• without interfering with the retrieval of the identity of the
specific activation pattern (achieved mainly by the
collaterals of the pyramidal layers)
the model
[Figure, repeated: input station, feedforward connections (spread Sff), patch of cortex with recurrent collaterals (spread Src); input activity with spatial focus of radius R and detailed pattern.]
Indeed it happens!
Laminated cortex can
relay more combined
what and where
information than if it
were not laminated
• The advantage is
somewhat more
evident for larger p
• it is small, but should
scale up in a network
of realistic size
The granular layer
may nail down the focus of activation within
the cortical map (preserving detailed
positional information)
without interfering with attractor-mediated
retrieval of the identity of the specific
activation pattern (achieved mainly by the
collaterals of the pyramidal layers)
A differentiation between supra- and infragranular layers may be usefully coupled to
their different extrinsic connectivity, if:
• the supragranular layers preserve both
positional and identity information, and
transmit it onward for further analysis
• the infragranular layers relay backwards and
downwards identity information freshly
squeezed from the attractors, without
bothering to replicate positional information
Lamination+directional
connectivity make
each layer convey a
better mix of
information, beyond
the capability of any
unlaminated patch,
whatever its Sff
• they also slow down
learning, though, so the
advantage would be
greater if more learning
epochs had been allowed
(here they are set to 3)
A functional hypothesis
A common mode of operation of the primordial sensory
neocortex of mammals may have been
autoassociative attractor dynamics.
Attractors may be formed by self-organizing weight
changes on FF and RC connections, and may
dominate the dynamics of both SG and IG layers,
although the former can be kept in tighter positional
register by layer IV.
Thanks to Hamish Meffin, with whom I discussed such ideas, with
divergent conclusions (see his Ph.D. thesis, U. of Sydney)
2 suggestions
• Understanding specific mammalian mechanisms of
information representation and retrieval may require
quantitative (information theoretical) analyses at the
level of populations of individual neurones
• Only notions of sufficient abstraction and generality as
to apply to each sensory cortex can help explain the
appearance, in evolution, of this universal neocortical
microchip.