Neuromorphic winner-take-all circuits for selective attention systems

Ph.D. Thesis in Electronic and Computer Engineering (XVI Cycle)
University of Genova
Biophysical and Electronic Engineering Department (DIBE)
Department of Communication, Computer and System Sciences (DIST)
Giacomo Indiveri
Advisor: Prof. Sergio Martinoia
Contents

1 Introduction
  1.1 Selective attention systems
  1.2 Neuromorphic Engineering

2 Basic Neuromorphic Circuits
  2.1 The subthreshold domain
  2.2 The MOS field-effect transistor
  2.3 The differential pair
  2.4 The current normalizer
  2.5 Resistive Networks
  2.6 Design principles

3 Winner-take-all network models
  3.1 Neural network models
  3.2 Non-linear Programming Formulation

4 Current mode Winner-Take-All circuits
  4.1 The original current-mode WTA circuit
  4.2 The hysteretic WTA circuit
  4.3 Lateral coupling
  4.4 Applications

5 Neuromorphic vision sensors as single chip selective attention systems
  5.1 A one-dimensional tracking chip
  5.2 Stand Alone Visual Tracking Device
  5.3 Active Tracking System
  5.4 Roving Robots
  5.5 Extensions of 1-D tracking sensors
  5.6 A 2-D tracking sensor

6 Multi-chip models of selective attention systems
  6.1 The Address-Event Representation
  6.2 A 1-D AER selective attention chip
  6.3 A 2-D AER selective attention chip
  6.4 Selective attention applications
  6.5 An active AER selective attention system

7 Silicon neural models of winner-take-all networks
  7.1 The Integrate-and-Fire Neuron Circuit
  7.2 Networks of Integrate and Fire neurons
  7.3 A competitive ring-of-neurons network

8 Conclusions
  8.1 Emulating Neural Circuits
  8.2 Commercial Application Scenarios
  8.3 Summary
List of Figures

1.1 Schematic diagram of a saliency-based model of selective attention (adapted from Itti, Koch and Niebur (1998)).

2.1 Subthreshold and above-threshold current response of a MOS transistor, as a function of gate-to-source voltage difference.
2.2 (a) Circuit diagram of the differential pair. The differential output current I1 − I2 is controlled by the differential input voltage V1 − V2 and scaled by a constant factor set by the bias voltage Vb. (b) Experimental data obtained from a differential transconductance amplifier with the bias voltage set to Vb = 0.6 V.
2.3 (a) Circuit diagram of the transconductance amplifier. The output current Iout = I1 − I2 is proportional to a hyperbolic tangent of the differential input V1 − V2. (b) Schematic symbol used to represent the transconductance amplifier circuit.
2.4 Two-input current normalizer circuit.
2.5 Current diffusor circuit. The current I3, proportional to (I2 − I1), diffuses from the source to the drain of M3.
2.6 Similarities between (a) current-mode diffusor network and (b) resistive network.

3.1 Network of N excitatory neurons (empty circles) projecting to one common inhibitory neuron (filled circle), which provides feedback inhibition. Small filled circles indicate inhibitory synapses and small empty circles indicate excitatory synapses. x1 ... xN are external inputs; ye1 ... yeN are the outputs of the excitatory neurons; yi is the output of the inhibitory neuron; we1 ... weN are the excitatory synaptic weights of the external inputs; wl1 ... wlN are the excitatory weights onto the global inhibitory neuron; and wi1 ... wiN are the inhibitory weights from the inhibitory neuron onto the excitatory neurons.
3.2 Simulations of a WTA network comprising 100 linear-threshold units ordered along one spatial dimension. The input (solid line) is composed of 3 Gaussians. The outputs are shown for two cases: wej = 1, wij = 1 and wlj = 0.0250 ∀j (dashed line); wej = 1, wij = 1 and wlj = 0.0325 ∀j (dotted line).
3.3 Numerical simulation of the same WTA network shown in Fig. 3.2, now with weight values wej = 1, wij = 1 and wlj = 0.0275. (a) Input distributions of increasing amplitude. (b) Network responses to the three inputs shown in (a).

4.1 Two cells of a current-mode WTA circuit.
4.2 Responses of the two-cell WTA circuit shown in Fig. 4.1. (a) Voltage outputs (Vd1 and Vd2) versus the differential input voltage. (b) Current outputs (Iout1 and Iout2). The bias voltage Vb = 0.7 V. The small difference in the maximum output currents is due to device-mismatch effects in the read-out transistors of the two cells.
4.3 Hysteretic WTA cell, with local excitatory feedback, lateral excitatory coupling, lateral inhibitory coupling and diode-source degeneration.
4.4 Response of the hWTA circuit (outer hysteresis plot) superimposed on the response of the classical WTA circuit (inner central plot). The output of the classical WTA circuit was shifted vertically by a few nanoamperes for the sake of clarity.
4.5 Diode-source degenerated WTA network output and classical WTA network output.
4.6 Simplified WTA circuit, used to analyze the excitatory diffusor network.
4.7 Effect of lateral excitatory coupling on the hWTA network. (a) Output currents Iall (see Fig. 4.3) measured at each cell of the network for four increasing values of Vex. The inset shows a fit of the data from cells 2 to 20 with an exponential function. (b) Output currents Iall measured for three increasing values of Iin. Each data set is normalized to the maximum measured current.
4.8 Scanned output currents of the hWTA network state (top solid line), of the hWTA output (bottom solid line) and of the classical WTA output (bottom dotted line). (a) Input currents are applied to cell 1 (Vgs,1 = 1.1 V), cell 12 (Vgs,12 = 1.0 V) and cell 13 (Vgs,13 = 1.0 V); lateral excitation is turned off (Vex = 0 V) and inhibition is global (Vinh = 5 V). Both the basic WTA network and the hWTA network select cell 1 as the winner. (b) Input signals and network bias settings are the same as in (a), but lateral excitation is turned on (Vex = 1.825 V). The basic WTA network keeps selecting the strongest absolute input as the winner (cell 1), but the hWTA network selects the region with two neighboring cells on, because it has a stronger mean activation. (c) Input currents are applied to cells 5, 12 and 16 (Vgs,5 = 1.2 V, Vgs,12 = 1.1 V, Vgs,16 = 1.0 V); lateral excitation is turned off and inhibition is global (Vex = 0 V, Vinh = 5 V). Both the basic WTA network and the hWTA network select cell 5 as the winner. (d) Input signals and network bias settings are the same as in (c), but inhibition is local (Vinh = 3.35 V). If inhibition is not global, the hWTA network allows multiple winners to be selected, as long as they are spatially distant (cell 16 is selected as a local winner, even though cell 12 receives a stronger input current).
4.9 Response of the hWTA network to a single-cell input (cell 13, with Vgs,13 = 1.1 V) for a fixed value of Vex = 1.825 V. (a) Current output for 4 different values of Vinh. (b) Relative difference between the output of the network with global inhibition (Vinh = 5 V) and the output of the network with 3 different values of Vinh.

5.1 Block diagram of the single-chip tracking system. Spatial edges are detected at the first computational stages by adaptive photoreceptors connected to transconductance amplifiers. The edge with strongest contrast is selected by a winner-take-all network and its position is encoded with a single continuous analog voltage by a position-to-voltage circuit (see Section 5.1.6).
5.2 Portion of the layout of the 1.2 µm chip containing 7 processing columns. The size of each computational stage is indicated on the right.
5.3 (a) Response of the array of adaptive photoreceptors to a black bar on a white background (upper trace) and output traces of the edge-polarity detector circuit (lower traces). (b) Output characteristic of the position-to-voltage circuit. The figure's inset contains snapshots of many output traces of the WTA network superimposed, as a stimulus was moving from left to right. The data points in the main figure represent the output of the circuit corresponding to the pixel position of the winner in the inset data.
5.4 (a) Response of the array of photoreceptors, with a very slow adaptation rate, to a dark bar on a white background moving from right to left with an on-chip speed of 31 mm/s. The DC value of the response has been subtracted. (b) Response of the array of photoreceptors with a fast adaptation rate to the same bar moving at the same speed (left-pointing triangles) and at a slightly slower speed (upward-pointing triangles).
5.5 Circuit diagram of the current polarity detector. Positive Idiff currents are conveyed to the n-type current mirror M4, M5. Negative Idiff currents are conveyed to M6 through the p-type current mirror M1, M6. Depending on the values of the control voltage signals VCTRL and VREF, the output current Iedg represents a copy of only one of the two polarities of Idiff, or of both polarities of Idiff (see text for details).
5.6 Response of the WTA network to the ON-edge of a bar moving from left to right at an on-chip speed of 31 mm/s. The top trace represents the currents Isum of the WTA array while the bottom trace represents the voltage outputs of the array of adaptive photoreceptors.
5.7 Schematic diagram of the position-to-voltage circuit. Example of three neighboring cells connected together.
5.8 Picture of the stand-alone tracker board. The neuromorphic sensor is on the chip beneath the lens. On the left part of the board there is an array of potentiometers used to bias the chip's control voltages. On the top there is an LED display, comprising three display bar lines with their corresponding drivers. The scale on the left part of the figure is in millimeters.
5.9 (a) Output of the system in response to a finger moving back and forth in front of the chip. (b) Output of the system in response to a pen moving at approximately 8000 pixels/s on a stationary light background. Note the different time scales on the abscissae.
5.10 Picture of the tracker chip mounted on a DC motor. The output of the chip is sent to a dual-rail power amplifier which directly drives the motor.
5.11 (a) Setup of the active tracking system as seen from above. The angle θ represents the angular displacement produced by the DC motor, x represents the target's position in visual space, and y represents the distance of the target's projection on the retina from its center. The angular velocity θ̇ is proportional to y. (b) Chip data measured as the system was engaged in tracking a swinging bar. The bar's position (circles) was measured using a separate (fixed) tracking board, while its velocity (solid line) was computed off-line from the discretized position data. The crosses represent the output of the active sensor used to drive the system's DC motor.
5.12 Tracker chip mounted on a LEGO robot performing a "target exploration task". Using very little CPU power, this robot is able to simultaneously explore (make random body/head movements), attend (orient the sensor toward high-contrast moving edges) and pursue (drive towards the target).
5.13 (a) Koala robot with the neuromorphic sensor mounted on its front. (b) Positions of Koala following a line, sampled at intervals of 0.25 seconds over a period of 37.5 seconds, in which the robot completed 4 loops. The features (white squares) were obtained by tracking a dark cross drawn on the white top of Koala.
5.14 (a) Koala robot with the neuromorphic sensor mounted on its front and a white sheet of paper with crosses attached on its top, seen from above. (b) Positions of Koala following a white line on a light-blue carpet floor, sampled at intervals of one second over a period of approximately 3 minutes. The features (white squares) were obtained by tracking the bars appearing on the top part of Koala (see text for explanation).
5.15 Two-dimensional tracker chip architecture.
5.16 Differentiating adaptive photoreceptor circuit.
5.17 Hysteretic WTA circuit with spatial coupling.
5.18 Two-input pass-transistor demultiplexer. The voltage on Vc is routed either to VP2V (if Vsel is high) or to VENC (if Vsel is low).
5.19 Output of the analog P2V circuits in response to a target moving from the top right corner to the bottom central part of the sensor's field of view. The bottom trace (Vx) reports the x position of the target. The top trace (Vy), offset in the plot by 5 V for the sake of clarity, reports the y position of the target. The inset shows Vy versus Vx.
5.20 Output of the analog P2V circuits in response to a target moving from the bottom left corner to the top right one, on to the top left, to the bottom right, and back to the bottom left corner.
5.21 Output of the least significant bit (bottom trace) and second-least significant bit (top trace, displaced by 6 V) of the 'X' address in response to a target moving from right to left.
5.22 Histogram of the addresses measured from the sensor's address encoders in response to a target moving on a circular trajectory.

6.1 Schematic diagram of an AER chip-to-chip communication example. As soon as a sending node on the source chip generates an event, its address is written on the Address-Event Bus. The destination chip decodes the address-events as they arrive and routes them to the corresponding receiving nodes.
6.2 Image captured from a 168×132 silicon imager designed by Jörg Kramer (at the Institute of Neuroinformatics, Zurich), while the subject was moving.
6.3 Biologically equivalent architecture of the selective attention model. Input spike trains arrive from the bottom onto excitatory synapses. The populations of cells in the middle part of the figure are modeled by a hysteretic WTA network with local lateral connectivity. Inhibitory neurons, in the top part of the figure, locally inhibit the populations of excitatory cells by projecting their activity to the inhibitory synapses in the bottom part of the figure.
6.4 (a) Excitatory synapse circuit. Input spikes are applied to M1, and transistor M4 outputs the integrated excitatory current Iex. (b) Inhibitory synapse circuit. Spikes from the local output neurons are integrated into an inhibitory current Iinh.
6.5 (a) Response of an excitatory synapse to single spikes, for different values of the synaptic strength Vw (with Ve = 4.60 V). (b) Normalized response to single spikes for different time constant settings Ve (with Vw = 1.150 V). (c) Response of an excitatory synapse to a 50 Hz spike train for increasing values of Vw (0.6 V, 0.625 V, 0.65 V and 0.7 V from bottom to top trace, respectively). (d) Response of an excitatory synapse to spike trains of increasing rate for Vw = 0.65 V and Ve = 4.6 V (12 Hz, 25 Hz, 50 Hz and 100 Hz from bottom to top trace, respectively).
6.6 Schematic diagram of the WTA network. Example of three neighboring cells connected together.
6.7 Net WTA input current Inet values at each pixel location for a static control input. Pixels 5 through 13 have input currents slightly lower than pixel 21. All other pixels receive weaker input stimuli. (a) In the absence of lateral coupling (Vex = 0 V) the network selects pixel 21 as the winner. (b) In the presence of lateral coupling (Vex = 1.5 V) the network spatially smooths the input distribution and selects pixel 9 as the winner.
6.8 Circuit diagram of the local inhibitory integrate-and-fire neuron.
6.9 Integrate-and-fire neuron characteristics. (a) Membrane voltage for two different DC injection current values (set by the control voltage Vinj). (b) Membrane voltage for two different refractory period settings. (c) Firing rates of the neuron as a function of the current-injection control voltage Vinj plotted on a linear scale. (d) Firing rates of the neuron as a function of Vinj plotted on a log scale (the injection current increases exponentially with Vinj).
6.10 Scanned net input currents to the WTA network Inet (top traces) and inhibitory currents Iinh (bottom traces) measured, by means of an off-chip current sense-amplifier, at every pixel location. (a) Response of the system to the onset of the stimulation, with a display persistence setting of 3 s. (b) Response of the system after a few seconds of stimulation, with a display persistence setting of 250 ms.
6.11 (a) Raster plots of neuron 10 in response to the control stimulus (see text for explanation). (b) Raster plots of neuron 22. (c) Peri-stimulus time histogram of neuron 10 (solid line) and of neuron 22 (dashed line). (d) Inter-spike interval distribution of neurons 10 (front bars) and 22 (rear bars).
6.12 Test image with salient features. (a) Original color figure. (b) Corresponding saliency map. (c) Input spike frequencies obtained from the injective mapping described in the text (upper trace) and distribution of the output neurons' spike counts recorded over a period of 3 seconds (lower histogram). (d) Position of the attended pixel recorded over time.
6.13 Mapping of the 1D data of Fig. 6.12(d) onto the re-sampled 2D saliency map data of Fig. 6.12(b). Shifts along the horizontal axis are due to the selective attention chip's response. Shifts along the vertical axis are introduced artificially via the injective mapping described in the text.
6.14 Block diagram of a basic cell of the 8 × 8 selective attention architecture.
6.15 Synaptic circuits. (a) Input excitatory synapse. Address-events are converted into pulses by the circuit in the dashed box. Pulses are integrated into the excitatory current Iex by the p-type current-mirror integrator. The integrator's gain and time constant are modulated by the control voltages Vw and Vτe. (b) Inhibitory synapse. On-chip pulses (Vior) are integrated into the inhibitory current Iior by the n-type current-mirror integrator. The time constant and gain of this integrator are modulated by the voltages Vq and Vτi.
6.16 Hysteretic WTA cell. Input currents are sourced into node Vin and 3 copies of the output current are sent to the two P2V circuits and to the I&F neuron.
6.17 Local output integrate-and-fire neuron. When the membrane voltage Vmem increases above Vthr the output voltage Vout is driven to Vdd and an address-event is generated. The transistors in the dashed box are part of the output AER circuitry.
6.18 (a) Output of the P2V circuits of the selective attention architecture measured over a period of 300 ms, in response to a test stimulus exciting four corners of the input array at a rate of 30 Hz and a central cell at a rate of 50 Hz. (b) Histogram of the chip's output address-events, captured over a period of 13.42 s in response to the same input stimulus.
6.19 Event histograms of addresses generated by the workstation and sent to the chip (a), and output addresses generated by the selective attention chip (b), (c), and (d). All chip parameters are kept constant throughout the plots except for the bias parameter Vτi. The histogram in (b) was obtained with Vτi = 227 mV, the one in (c) with Vτi = 207 mV, and the one in (d) with Vτi = 193 mV.
6.20 Output address-events of the selective attention chip biased with Vτi = 207 mV. The 2D address space of the chip's architecture is mapped into the plot's 1D ordinate vector by labeling each address successively, row by row.
6.21 Image representations of saliency maps. (a) Saliency map corresponding to the input stimulus used for the experiment of Fig. 6.18. (b) Saliency map used for the experiment of Fig. 6.19. (c) Fictitious example resembling a realistic saliency map.
6.22 (a) Block diagram of the sensory-motor selective attention model. The figure shows the basic computational blocks used, as well as the corresponding biological analogues and their function. (b) Schematic diagram of the active vision setup: the neuromorphic imager, mounted on a pan-tilt unit, transmits its output to the selective attention chip. The latter sends the results of its computations to a host computer, which uses this data to drive the pan-tilt unit's motors.
6.23 Selective attention active vision system. The selective attention chip processes sensory data coming from an AER imaging sensor and transmits its output to a workstation that drives the pan-tilt unit on which the sensor is mounted. A standard CCD camera is mounted next to the AER sensor to visualize the sensor's field of view.
6.24 Block diagram of the irradiance transient detector with event-based communication interface.
6.25 Image captured from the CCD camera mounted next to the transient imager. The outer frame shown in the image corresponds to the field of view of the transient imager, whereas the inner frame is drawn to indicate the transient imager's central region. The cross to the bottom right of the image center represents the location of the focus of attention currently computed by the selective attention chip.
6.26 (a) Histogram of events generated by the transient imager pixels in response to two diffused flashing LEDs. The LED stimulating the region around pixel (5,9) has higher contrast than the other LED. (b) Histogram of events generated by the selective attention chip in response to the events generated by the transient imager chip.
6.27 Raster plot of the activity of the neurons of both the transient imager chip (dots) and the selective attention chip (circles) in response to the flashing LEDs. To plot the data from both chips using an address space with the same resolution, we sub-sampled the addresses of the transient imager chip. The LEDs flashed at approximately 0.25 s, 1.25 s and 2.25 s.
6.28 Sequence of images showing the selection of a salient stimulus prior to and after a saccadic eye movement. (a) The system is attending the top LED, already centered on the central part of the imaging array. (b) The system selects the bottom LED, outside the central region of the imager. (c) The system has performed a saccade toward the bottom LED, and is currently attending it.
6.29 Raster plot of the activity of the neurons of the transient imager chip (dots) and of the selective attention chip (circles) in response to two flashing LEDs. The focus of attention shifts from a central region of the imaging array to a peripheral one (see circles at 2 s ≤ t < 6 s). Consequently, the system makes a camera movement, at the time indicated by the vertical arrow, and re-centers the attended location.
6.30 Output of the P2V circuits of the selective attention chip (see Fig. 6.14), representing the scanpath of the focus of attention, switching back and forth between the fluttering fingers of both of the experimenter's hands. The scanpath data is superimposed onto a snapshot taken from the CCD camera during the experiment.
6.31 Saccadic eye movements in response to moving fingers. (a) CCD camera snapshot taken before the saccadic eye movement (the focus of attention has just switched from one hand to the other). (b) CCD camera snapshot taken just after the saccadic eye movement (the focus of attention and the salient stimulus are now in the center of the imaging array).

7.1 Circuit diagram of the I&F neuron.
7.2 (a) Measured data (circles) representing an action potential generated for a constant input current Iinj with spike-frequency adaptation and refractory period mechanisms activated. The data is fitted with the analytical model of Eq. (7.5) (solid line). (b) Circuit's f-I curves (firing rate versus input current Iinj) for different refractory period settings.
7.3 (a) Raster plots showing the activity of an AER array of 32 I&F neurons in response to a constant input current, for four decreasing values of the refractory period (clockwise from the top left quadrant). (b) Mean response of all neurons in the array to increasing values of a global input current, for the same refractory period settings. The error bars represent the standard deviation of the responses throughout the array.
7.4 Architecture of the integrate-and-fire ring-of-neurons chip. Empty circles represent excitatory neurons. The filled circle represents the global inhibitory neuron. The gray line symbolizes inhibitory connections, from the inhibitory neuron to all excitatory neurons. Black arrows denote excitatory connections.
7.5 (a) Raster plot of input spike trains (small dots) superimposed onto the output spike trains (empty circles), with global inhibitory feedback turned off (the inhibitory-to-excitatory synaptic weights are set to zero). (b) Histograms of the input spike distribution (top trace), the output spike distribution of the competitive network with global inhibition but no lateral excitation (middle trace), and the output spike distribution of the competitive network with global inhibition and lateral excitation (bottom trace).
7.6 (a) Arrangement of input signals used to stimulate a set of neurons of the network. Each box represents a Poisson-distributed spike train source. (b) Raster plots representing input spikes (small dots), output spikes (empty circles), and coincident (within a 1 ms time window) output spikes (filled circles) for the three network configurations: without global inhibition (top raster plot), with global inhibition (middle raster plot), and with global inhibition and local excitation (bottom raster plot).
7.7 Pairwise cross-correlations averaged over neuron pairs 9-10, 9-11 and 10-11. The data of the top trace were computed from the response of the network in the absence of global inhibition. The middle trace corresponds to the case with global inhibition and the bottom trace corresponds to the case with both global inhibition and local excitation turned on.
Chapter 1
Introduction
Biological organisms perform complex selection operations continuously and effortlessly. These operations allow them to quickly determine, for example, the motor actions to take in response to combinations of external stimuli and internal states; to pay attention to subsets of sensory inputs, suppressing non-salient ones; or to plan complex action sequences, serially choosing elementary behaviors among different alternatives. In essence, these selection operations allow biological organisms to survive. One of the main computational expedients used by nature to perform these selection operations is implemented by winner-take-all (WTA) networks. These are networks of competing elements (cells, neurons, populations of neurons or neural circuits) that sequentially select the elements receiving the strongest input signals and suppress the remaining ones.
In this thesis we will argue that neuromorphic circuits are an optimal medium for constructing WTA networks and for implementing efficient hardware models of selective attention systems. To validate our argument, we will describe the properties of neuromorphic circuits and analyze in detail the characteristics of current-mode WTA circuits; we will then show examples of single-chip vision systems that use WTA networks to select and track the position of salient features, and of multi-chip systems that implement more elaborate models of selective attention mechanisms and that are not restricted to the visual sensory modality alone. Some of these examples will show how the biological inspiration and the neuromorphic technology used can lead to the design of devices with high potential for commercial exploitation. Other examples will show how the synthetic approach followed, and the constraints imposed by the analog VLSI circuits, can aid basic research, e.g. by limiting the space of possible models and providing possible explanations of why biological organisms implement selective attention mechanisms with specific architectures.
1.1 Selective attention systems
Processing detailed sensory information is a computationally demanding task for
both biological and artificial systems. If the amount of information provided by
the sensors exceeds the parallel processing capabilities of the system, as is usually
the case with both biological and artificial vision systems, an effective strategy is
to select subregions of the input and process them, shifting from one subregion to
another, in a serial fashion [24, 89]. In biology this strategy, commonly referred to
as selective attention, is used by a wide variety of “systems”, from insects [5, 98]
to humans [18, 61]. In primates selective attention plays a major role in determining where to center the high-resolution central foveal region of the retina [84],
by biasing the planning and production of saccadic eye movements [2, 42]. In general, though, the regions covered by the focus of attention do not always correspond to the regions being analyzed by the fovea. Recent findings even suggest that attention can be used to keep track of multiple targets of interest simultaneously, if the visual task requires a low attentional cost [14, 18].
Psychophysical evidence indicates that visual attention mechanisms have two main types of dynamics: a transient, rapid, bottom-up, task-independent one, and
a slower, sustained one, which acts under voluntary control [92]. In this thesis
we will focus on implementations of bottom-up models of selective attention. We
will show how it is possible to implement these models using VLSI technology,
and analog neuromorphic circuits, such as winner-take-all networks and silicon
integrate-and-fire neurons.
1.1.1 Saliency-based Model of Selective Attention
Several computational models of selective attention have been proposed [2, 89,
95, 97, 113]. Some of these models are based on the concept of “dynamic routing” [97], by which salient regions are selected by dynamic modification of network parameters (such as neural connection patterns) under both top-down and
bottom-up influences. Some other models, based on similar ideas, promote the
concept of “selective tuning” [113]. In these models, attention optimizes the selection procedure by selectively tuning the properties of a top-down hierarchy of
winner-take-all processes embedded within the visual processing pyramid.
The models we seek to implement in hardware are the ones based on
the concept of the “saliency map”, originally put forth by Koch and Ullman [64].
These biologically plausible types of models account for many of the observed
behaviors in neurophysiological and psychophysical experiments and have led to
several software implementations applied to machine vision and robotic tasks [1,
12, 58, 112]. They are especially appealing to us because they lend themselves
nicely to hardware implementations.
18
1.2. NEUROMORPHIC ENGINEERING
Attended location
Inhibition
of return
WTA network
Saliency map
Feature combination
Feature
maps
Center-surround differences and normalization
orientations
intensity
colors
Linear filtering
Input image
Figure 1.1: Schematic diagram of a saliency based model of selective attention (adapted
from Itti, Koch and Niebur (1998)).
A diagram describing the main processing stages of this type of model is
shown in Fig. 1.1. A set of topographic feature maps is extracted from the visual input. All feature maps are normalized and combined into a master saliency
map, which topographically codes for local saliency over the entire visual scene.
Different spatial locations then compete for largest saliency, based on how much
they stand out from their surroundings. A winner-take-all (WTA) circuit selects
this most salient location as the focus of attention. The WTA circuit is endowed
with internal dynamics, which generate the shifts in attention based on a mechanism named inhibition of return (IOR) (a key feature of many selective attention
systems) [35].
As saliency-based selective attention models are highly modular, multi-chip
neuromorphic systems that implement them can scale up to arbitrarily complex
selective attention systems.
1.2 Neuromorphic Engineering
Neural network theories, used as an additional methodology for solving pattern
recognition and constraint minimization problems, have emerged in recent years
as a practical technology, and they now represent a well-established research field. Neural
network algorithms, the type of non-linearities present in the transfer functions of
their computational elements and the architectures that implement them are often
loosely inspired by biological systems.
An emerging technology that tries to establish even closer links to biology, capitalizing on the advantages of interdisciplinary research, is neuromorphic engineering. Specifically, neuromorphic engineering applies the
computational principles discovered in biological organisms to those tasks that biological systems perform easily, but which have proved difficult to do using traditional engineering techniques. For example, biological neural systems for sensory
perception and motor control are compact, energy efficient and robust to noise
both in the input data and in the internal state variables. They typically have a
relatively simple organization, consisting of arrays of similar processing elements
that interact in nonlinear ways, mainly with their nearest neighbors. Neuromorphic systems, rather than implementing abstract neural networks only remotely related to these types of systems, are hardware devices, containing analog circuits, that attempt to model in detail (down to the device-physics level) their properties and the physical processes embedded in them that underlie neural computation [30]. The closest
medium, widely accessible to the research community, that allows researchers to
implement detailed hardware models of neural systems is silicon. Using analog,
continuous time circuits implemented with a standard CMOS VLSI technology
it is possible to build low-cost, compact implementations of such models. The
greatest successes of neuromorphic analog VLSI (aVLSI) to date have been in the
emulation of peripheral sensory transduction: Silicon retinas and silicon cochleas
have been successfully implemented and used in a wide variety of applications
[10, 21, 33, 66, 75, 78]. In these analog devices, as in their biological counterparts, it is the structure of the architecture, the morphology of the system, that
determines their functionality. This constraint adds to the ones that come from the fact that neuromorphic systems have to cope with issues such as minimizing power consumption, maximizing robustness to noise and optimizing the reliability of their performance, while interacting in real time with the environment. It is by
trying to satisfy these very constraints that researchers are hoping to obtain more
insight into the workings of biological neural systems. One could suggest using software simulations to validate models of biological neural systems. However, besides the fact that traditional digital technology cannot match neuromorphic systems in implementing real-time, compact, cheap and low-power devices, the computational load of digital simulators must also be taken into account. Detailed simulations of neural processes are among the most computationally intensive tasks, and realistic simulations of large populations of neurons remain prohibitive, despite the continuous improvements of digital technology. Furthermore, to obtain realistic simulations, one should also attempt to model in software the dynamics of the system with which the neural model interacts, the
noise present in the environment, and the constraints that might arise from power consumption minimization. The systems of equations arising from these
additional constraints would increase the computational load of the digital system
even more.
Neuromorphic engineering is thus mainly concerned with hardware correlates
of biological systems. Yet, the nature of the research carried out by neuromorphic
engineers is twofold: on one side, there is the desire to learn more about the computational properties of the brain by tackling the same problems that nature and evolution solved in the course of 600 million years; on the other, there is the desire to design and develop efficient neuromorphic engineered systems that can be used to solve real-world problems and that can eventually lead to successful industrial
applications.
Chapter 2
Basic Neuromorphic Circuits
In this chapter we will introduce some basic concepts of analog circuit design
necessary for understanding the circuits and systems presented in the subsequent
parts of the thesis. A more thorough description of analog VLSI circuits and
principles can be found in the textbook that we recently published [74].
2.1 The subthreshold domain
Perhaps the most elementary computational element of a biological neural structure is the neural cell’s membrane. The nerve membrane electrically separates
the neuron’s interior from the extracellular fluid. It is a very stable structure that
behaves as a perfect insulator. Current flow through the membrane is mediated
by special ion channels (conductances) which can behave as passive or active devices. In the passive case, ion channels selectively allow ions to flow through the
membrane by the process of diffusion. In electronics, it is possible to implement
the same physical process by using MOS field-effect transistor devices, operated
in the subthreshold region (also referred to as weak inversion) [74, 80, 117].
2.2 The MOS field-effect transistor
One of the most common devices used in today’s integrated circuit technology is
the Metal-Oxide-Silicon Field-Effect Transistor (MOSFET)¹. The currents in this device comprise either positively charged holes or negatively charged electrons.

¹ The field-effect transistor structure was first described in a series of patents by J. Lilienfeld that were granted in the early 1930s. The MOSFET is the field-effect transistor type that is almost exclusively used today. Historically, other field-effect transistor types were invented, including the junction field-effect transistor (JFET) and the metal-semiconductor field-effect transistor (MESFET).
Figure 2.1: Subthreshold and above-threshold current response of a MOS transistor, as a function of the gate-to-source voltage difference (Ids in A, on a logarithmic scale, versus Vgs in V).
MOSFETs are typically used as digital elements (switches that are either fully open or closed). Only a small percentage of VLSI devices use them in the analog domain, and there are even fewer cases in which MOSFETs are used in the subthreshold domain. As neuromorphic circuits are among those few examples, here we concentrate on the current-voltage characteristics of MOSFETs in the subthreshold domain.

MOS transistors operate in the subthreshold region when their gate-to-source voltage is below the transistor threshold voltage. This mode of operation has been largely ignored by the analog/digital circuit design community, mainly because the currents that flow through the source-drain terminals of the device under these conditions are extremely low (typically of the order of nanoamperes). In subthreshold, the drain current of the transistor is related to the gate-to-source voltage by an exponential relationship (see Fig. 2.1).
Specifically, for an n-type MOS transistor, the subthreshold current is given by:
I_{out} = I_0 \frac{W}{L} \, e^{\kappa \frac{V_{GS}}{U_T}} \, e^{(1-\kappa)\frac{V_{BS}}{U_T}} \cdot \left( 1 - e^{-\frac{V_{DS}}{U_T}} + \frac{V_{DS}}{V_0} \right)        (2.1)
where W and L are the width and length of the transistor, I0 is the zero-bias current, κ is the subthreshold slope coefficient, UT is the thermal voltage, V0 is the Early voltage, and VGS, VDS and VBS are the gate-to-source, drain-to-source and bulk-to-source voltages, respectively. Typical values for devices with W = L = 4 µm fabricated in a standard 2 µm technology are: I0 = 0.72 · 10^{-18} A, κ = 0.65, V0 = 15.0 V.
If the transistor operates in the saturation region (i.e. if VDS ≥ 4UT) and if |V0| ≫ |VDS|, the above equation can be simplified to yield:

I_{out} = I_0 \frac{W}{L} \, e^{\kappa \frac{V_G - V_S}{U_T}}        (2.2)
The diffusion of electrons through the transistor channel is mediated by the gate-to-source voltage difference. As the input/output characteristic of a subthreshold transistor is an exponential function, circuits containing these devices
can implement the “base functions” required to model biological processes: logarithms and exponentials.
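To make the exponential dependence concrete, the following Python sketch (our addition, not part of the original thesis) evaluates the simplified subthreshold relation of Eq. (2.2) using the typical device parameters quoted above; the function name and the example bias values are illustrative assumptions.

```python
import numpy as np

# Subthreshold (weak-inversion) drain current of an nFET in saturation,
# following Eq. (2.2): Iout = I0 * (W/L) * exp(kappa * Vgs / UT).
# Parameter values are the typical ones quoted in the text for W = L = 4 um
# devices in a standard 2 um process.
I0 = 0.72e-18      # zero-bias current (A)
kappa = 0.65       # subthreshold slope coefficient
UT = 0.0258        # thermal voltage at room temperature (V)
W_over_L = 1.0     # W = L, so the geometry factor is unity

def i_subthreshold(vgs):
    """Saturation-region subthreshold current for a given Vgs (source grounded)."""
    return I0 * W_over_L * np.exp(kappa * vgs / UT)

for vgs in (0.3, 0.4, 0.5, 0.6, 0.7):
    print(f"Vgs = {vgs:.1f} V  ->  Ids = {i_subthreshold(vgs):.3e} A")
```

The printed currents grow by several orders of magnitude over a few hundred millivolts, which is the exponential behavior the text exploits to implement logarithms and exponentials.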
2.3 The differential pair
One of the most common tricks used by both biological and engineered devices for computing measurements that are insensitive to absolute reference values and robust to noise is to use differential signals. The differential pair is a compact circuit comprising only three transistors that is widely used in many neuromorphic systems (see Fig. 2.2). It has the desirable property of accepting a differential voltage as input and providing as output a differential current with extremely useful
characteristics: if the bias transistor is operated in the subthreshold domain and if
we assume that all the transistors are in saturation (so that equation 2.2 holds), the
transfer function of the circuit is:
I_1 - I_2 = I_b \tanh\left( \frac{\kappa (V_1 - V_2)}{2 U_T} \right)        (2.3)
The beauty of this transfer function lies in the properties of the hyperbolic
tangent present in it: it passes through the origin with unity slope, it behaves in
a linear fashion for small differential inputs and it saturates smoothly for large
differential inputs.
To provide the differential term I1 − I2 at a single output terminal, one simply needs to connect a current mirror of complementary type to the differential pair's output terminals (e.g. a current mirror of p-type MOS transistors in the case of Fig. 2.2). The circuit thus obtained is the well-known differential transconductance amplifier [74, 80] (see Fig. 2.3).
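As an illustration of Eq. (2.3), the following Python sketch (our own addition, with an assumed 1 nA bias current) evaluates the differential-pair transfer function and shows its linear-then-saturating behavior.

```python
import numpy as np

kappa, UT = 0.65, 0.0258   # subthreshold slope factor and thermal voltage (V)
Ib = 1e-9                  # assumed bias current set by Vb (1 nA)

def diff_pair_output(dv):
    """Differential output current I1 - I2 of the subthreshold differential
    pair as a function of the differential input V1 - V2 (Eq. 2.3):
    linear for small dv, smoothly saturating at +/- Ib for large dv."""
    return Ib * np.tanh(kappa * dv / (2 * UT))

dv = np.linspace(-0.3, 0.3, 7)                    # differential inputs (V)
print(np.round(diff_pair_output(dv) / 1e-9, 3))   # outputs in nA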
2.4 The current normalizer
During the last 40 years, the vast majority of analog circuits have used voltages
to represent and process relevant signals. However, recently, current-mode signal
processing circuits, in which signals and state variables are represented by currents rather than voltages [110], have shown advantages over their voltage-mode counterparts. These advantages include higher bandwidth and higher dynamic range, and such circuits are more amenable to low power-supply voltages.

Figure 2.2: (a) Circuit diagram of the differential pair. The differential output current I1 − I2 is controlled by the differential input voltage V1 − V2 and scaled by a constant factor set by the bias voltage Vb. (b) Experimental data obtained from a differential transconductance amplifier with the bias voltage set to Vb = 0.6 V.
A current-mode circuit that will form the basis of the more complex circuits
described throughout this thesis is the current normalizer (see Fig. 2.4). This
circuit, based on the Gilbert normalizer, receives analog continuous time input
currents and provides normalized output currents. It is a modular circuit that can
be extended to an arbitrary number of cells by simply connecting additional current mirrors to the common node Vc . If the input currents are subthreshold, the
circuit is characterized by the equations
I_{in_i} = I_0 \, e^{\kappa \frac{V_{d_i}}{U_T}}, \qquad I_{out_i} = I_0 \, e^{\kappa \frac{V_{d_i}}{U_T} - \frac{V_c}{U_T}}        (2.4)
where i is the index of the ith cell of the circuit, UT is the thermal voltage and κ
is the subthreshold slope coefficient. By applying Kirchhoff’s current law to the
common node Vc we obtain
\sum_{i=1}^{N} I_{out_i} = I_b        (2.5)
where Ib is a constant current set by the control voltage Vb. We use this constraint to solve Eq. 2.4 for Vc, and to derive the dependence of the output currents on the input currents:

I_{out_i} = I_b \, \frac{I_{in_i}}{\sum_j I_{in_j}} .        (2.6)

The output current of each cell, Iout_i, is directly proportional to its input current (with a proportionality constant Ib), but scaled by the sum of all the input currents Σ_j Iin_j.

Figure 2.3: (a) Circuit diagram of the transconductance amplifier. The output current Iout = I1 − I2 is proportional to a hyperbolic tangent of the differential input V1 − V2. (b) Schematic symbol used to represent the transconductance amplifier circuit.
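The normalizing behavior of Eq. (2.6) can be sketched in a few lines of Python (our own illustration; the input current values are arbitrary): the outputs preserve the relative strengths of the inputs while their sum is clamped to Ib.

```python
import numpy as np

def current_normalizer(i_in, Ib=1e-9):
    """Steady-state outputs of the Gilbert-type current normalizer,
    Eq. (2.6): Iout_i = Ib * Iin_i / sum_j(Iin_j)."""
    i_in = np.asarray(i_in, dtype=float)
    return Ib * i_in / i_in.sum()

i_in = np.array([0.2e-9, 0.5e-9, 1.0e-9, 0.3e-9])   # example input currents (A)
i_out = current_normalizer(i_in)
print(i_out)          # individual normalized output currents
print(i_out.sum())    # equals Ib, independent of the overall input scale
```

Scaling all inputs by a common factor leaves the outputs unchanged, which is the normalization property exploited by the WTA circuits described later.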
2.5 Resistive Networks
Conventional methods of implementing resistors in VLSI technology include using complex circuits such as the transconductance amplifier of Section 2.3. These
methods have the disadvantage of emulating linear resistors for only a very limited
range of voltages, and of resistance values. If we consider currents, and not voltages, to represent input and output signals of MOSFETs, then we can implement
resistive networks using single transistors instead of resistors. In this configuration, the transistor is linear for a wide range of current values. Furthermore, if the
transistor is operated in the subthreshold regime, then the resistance (or conductance) can be varied by changing its gate voltage.

Figure 2.4: Two-input current normalizer circuit.
A conventional conductance, G, is defined by the relationship
Iab = G (Va − Vb )
where Iab is the current flowing from terminal a to terminal b, and Va and Vb are the voltages at the corresponding terminals. If the two terminals a and b are the source
and the drain of a subthreshold nFET, the current Iab can be expressed by the usual
transistor relationship:
I_{ab} = I_0 \, e^{\frac{\kappa V_g - V_a}{U_T}} - I_0 \, e^{\frac{\kappa V_g - V_b}{U_T}}        (2.7)

where Vg is the transistor's gate voltage. If we define the pseudo-voltage [118] V* = V0 e^{-V/UT} (where V0 is an arbitrary scaling voltage) and the pseudo-conductance G* = (I0/V0) e^{κ Vg/UT}, then we can write

I_{ab} = G^* \left( V_a^* - V_b^* \right)        (2.8)
where the value of pseudo-conductance G∗ depends exponentially on the transistor’s gate voltage Vg . Using Eq. (2.8) we can map any resistive network into an
equivalent transistor network: each resistor Ri of the resistive network can be replaced by a single transistor Mi, provided that all the transistors share the same substrate (that is, they are all either nFETs or pFETs). If the gate voltages Vgi of all the transistors are equal, then the transistor network is linear with respect to current [117]. This linear behavior holds over the entire range of weak inversion, which may span as much as six orders of magnitude in transistor current. Because all Vgi must be the same, the values of the individual conductances can only be adjusted by changing the W/L ratio (which modulates I0) of each transistor.

Figure 2.5: Current diffusor circuit. The current I3, proportional to (I2 − I1), diffuses from the source to the drain of M3.
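As a quick numerical check of this mapping (our own sketch, using the typical device parameters quoted earlier and an arbitrary scaling voltage V0), one can verify that Eq. (2.7) and the pseudo-Ohmic form of Eq. (2.8) give identical currents:

```python
import numpy as np

I0, kappa, UT = 0.72e-18, 0.65, 0.0258   # typical device parameters
V0 = 1.0                                  # arbitrary pseudo-voltage scaling (V)

def i_ab_transistor(Vg, Va, Vb):
    """Source-to-drain current of a subthreshold nFET, Eq. (2.7)."""
    return I0 * (np.exp((kappa * Vg - Va) / UT) - np.exp((kappa * Vg - Vb) / UT))

def i_ab_pseudo(Vg, Va, Vb):
    """The same current written as a pseudo-Ohmic law, Eq. (2.8)."""
    G_star = (I0 / V0) * np.exp(kappa * Vg / UT)
    Va_star, Vb_star = V0 * np.exp(-Va / UT), V0 * np.exp(-Vb / UT)
    return G_star * (Va_star - Vb_star)

print(i_ab_transistor(0.7, 0.10, 0.15))
print(i_ab_pseudo(0.7, 0.10, 0.15))      # identical to the line above
```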
An alternative interpretation of the mapping between resistive and transistor
networks uses the concept of a current diffusor [8] illustrated in Figure 2.5.
The currents Iin1 and Iin2 are inputs to the circuit. Assuming that the three
nFETs are identical (that is, their I0 and κ parameters are equal), and solving the
circuit equations, we obtain:
I_3 = I_0 \, e^{\kappa \frac{V_3}{U_T}} \left( \frac{I_2}{I_0 \, e^{\kappa \frac{V_2}{U_T}}} - \frac{I_1}{I_0 \, e^{\kappa \frac{V_1}{U_T}}} \right) .        (2.9)

If V1 = V2 = Vref, this relationship can be simplified to yield

I_3 = e^{\frac{\kappa}{U_T}(V_3 - V_{ref})} \, (I_2 - I_1) .        (2.10)
The diffusion current I3 through M3 is proportional to (I2 − I1). The proportionality factor can be modulated by either Vref or V3. The current-mode diffusor network (Fig. 2.6(a)) is composed of multiple instances of the circuit of Fig. 2.5. In this network, current injected at a node j diffuses laterally and decays with distance [80]. Consequently the network acts as a spatial low-pass filter; and because
the network is linear, the effects of currents injected at different nodes superimpose.
The diffusor network (Fig. 2.6(a)) has the same network response as the resistive network in Fig. 2.6(b). This equivalence can be demonstrated by comparing the transfer functions of the two circuits.

Figure 2.6: Similarities between (a) current-mode diffusor network and (b) resistive network.

Applying Kirchhoff's current law at node Vj of Fig. 2.6(a):

I_{out_j} - (I_j - I_{j-1}) = I_{in_j} .        (2.11)
Using Eq. (2.10), we can express Ij and Ij−1 in terms of the output currents:

I_{j-1} = e^{\frac{\kappa}{U_T}(V_G - V_R)} \left( I_{out_j} - I_{out_{j-1}} \right)        (2.12)

I_j = e^{\frac{\kappa}{U_T}(V_G - V_R)} \left( I_{out_{j+1}} - I_{out_j} \right) .        (2.13)

Substituting these two relationships in Eq. 2.11 yields

I_{out_j} - I_{in_j} = e^{\frac{\kappa}{U_T}(V_G - V_R)} \left( I_{out_{j+1}} - 2 I_{out_j} + I_{out_{j-1}} \right)        (2.14)
Similarly, we can apply Kirchhoff's current law at node Vj of Fig. 2.6(b):

I_{out_j} - (I_{j-1} - I_j) = I_{in_j} .        (2.15)

Because Ij = G(Vj − Vj+1) and Iout_j = Vj/R, Ij can be expressed as a function of Iout_j and Iout_{j+1}:

I_j = \frac{1}{RG} \left( I_{out_j} - I_{out_{j+1}} \right) .        (2.16)

Combining this equation with Eq. 2.15 yields

I_{out_j} - I_{in_j} = \frac{1}{RG} \left( I_{out_{j+1}} - 2 I_{out_j} + I_{out_{j-1}} \right) .        (2.17)
The term (Iout_{j+1} − 2 Iout_j + Iout_{j−1}) in Eq. 2.17 is the discrete approximation of the d²/dx² operator. Both circuits of Fig. 2.6 approximate the diffusion equation that characterizes the properties of a continuous resistive sheet [80]:

\lambda^2 \frac{d^2}{dx^2} V_{out}(x) = V_{out}(x) - V_{in}(x)        (2.18)

where λ is the diffusion length. In the discrete resistive network of Fig. 2.6(b) the diffusion length is λ = 1/√(RG), while in the diffusor network of Fig. 2.6(a) the diffusion length is λ = e^{κ(V_G − V_R)/(2U_T)}.
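The spatial low-pass behavior described by Eqs. (2.14) and (2.18) can be illustrated numerically; the Python sketch below (ours, with zero-flux boundaries assumed at the array ends) solves the discrete diffusion equation for a current injected at a single node and shows the smooth, distance-decaying profile.

```python
import numpy as np

def diffusor_response(i_in, gamma):
    """Steady-state outputs of the 1-D diffusor network of Eq. (2.14):
    Iout_j - gamma*(Iout_{j+1} - 2*Iout_j + Iout_{j-1}) = Iin_j,
    where gamma = exp(kappa*(VG - VR)/UT) sets the diffusion length.
    Zero-flux (reflecting) boundaries are assumed at the two array ends."""
    n = len(i_in)
    D2 = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D2[0, 0] = D2[-1, -1] = -1.0          # reflecting boundary condition
    A = np.eye(n) - gamma * D2
    return np.linalg.solve(A, i_in)

i_in = np.zeros(21)
i_in[10] = 1.0                            # unit current injected at the center node
out = diffusor_response(i_in, gamma=4.0)
print(np.round(out, 4))                   # smooth bump decaying with distance
```

Increasing gamma (i.e. increasing VG − VR) spreads the injected current over more neighbors, exactly as increasing the diffusion length λ would in the continuous resistive sheet.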
2.6 Design principles
Many additional subthreshold circuit building blocks can be designed using single transistors, differential pairs and current mirrors, and by exploiting the physics of silicon. Some examples are described in the now classical textbook Analog VLSI and Neural Systems [80] and in the more recent book Analog VLSI: Circuits and Principles [74]. But what should be stressed is the importance of the design principles used by neuromorphic engineers: complex systems can be built by locally interconnecting elementary computational elements and exploiting the nonlinear, recursive characteristics of the bio-inspired architectures thus designed. The physical constraints imposed by the hardware medium help designers keep these nonlinear systems from diverging, even in the (many) cases in which
positive-feedback loops are present. Furthermore, the advantages offered by VLSI
technology allow them to faithfully reproduce the properties of high parallelism,
redundancy and collective computation present in biological systems.
In the next chapters we will apply these design principles to models of selective attention systems implemented both on single-chip systems (at a high level of
abstraction) and on multi-chip systems (at a lower level of abstraction).
But first we analyze WTA networks from a theoretical perspective and show how to map the equations arising from our analysis onto subthreshold circuits.
Chapter 3
Winner-take-all network models
A winner-take-all (WTA) circuit is a network of competing cells (neural, software,
or hardware) that reports only the response of the cell that has the strongest activation while suppressing the responses of all other cells. These circuits are typically
used to implement and model competitive mechanisms among populations of neurons. For example, they are used to select specific regions of an input space [123].
Many WTA networks have been implemented both in software [37, 60, 93, 119]
and in hardware [17, 23, 39, 50, 69, 72, 104, 106].
In this chapter we analyze a class of WTA networks that emulate biological networks, consisting of a cluster of excitatory neurons that innervate a global
feedback inhibitory neuron. These networks have been implemented in aVLSI
and applied to a wide variety of tasks, including selective attention [13, 49, 120],
auditory localization [71], visual stereopsis [77], smooth pursuit/tracking [32, 44],
and detection of heading direction [54, 90].
3.1 Neural network models
We shall focus on a particularly simple yet powerful model that describes a population of N homogeneous excitatory units exciting a single global inhibitory unit, which feeds back to inhibit all the excitatory units (Fig. 3.1). For the sake of simplicity, we neglect the dynamics of the system and examine only the steady-state solutions. Dynamic properties of these networks and of other physiological
models of competitive mechanisms are described in detail in Ben-Yishai et al.
[3], Grossberg [37], Kaski and Kohonen [60], Yuille and Geiger [123].
Consider a network (Fig. 3.1), in which the external input to the j th excitatory
neuron is x j , the response of the jth excitatory neuron is ye j , the response of the
inhibitory neuron is yi ; and in which the weights of the synapses from the external
inputs to the excitatory neurons, from the inhibitory neuron to excitatory ones and
Figure 3.1: Network of N excitatory neurons (empty circles) projecting to one common inhibitory neuron (filled circle), which provides feedback inhibition. Small filled circles indicate inhibitory synapses and small empty circles indicate excitatory synapses. x_1 ... x_N are external inputs; ye_1 ... ye_N are the outputs of the excitatory neurons; yi is the output of the inhibitory neuron; we_1 ... we_N are the excitatory synaptic weights of the external inputs; wl_1 ... wl_N are the excitatory weights onto the global inhibitory neuron; and wi_1 ... wi_N are the inhibitory weights from the inhibitory neuron onto the excitatory neurons.
from the excitatory neuron to the inhibitory one are we j , wi j and wl j respectively.
We can write this network as
ye_j = f( we_j x_j − wi_j yi )
yi = f( ∑_{j=1}^{N} wl_j ye_j )        (3.1)
where f (·) denotes the transfer function of both excitatory and inhibitory neurons.
This system of coupled equations describes the recurrent interactions between
excitatory neurons and the inhibitory neuron. We explore the behavior of the
system by considering three special cases:
1. The case in which all neurons have a linear transfer function ( f (x) = x).
2. The case in which the neurons are linear-threshold ( f (x) = max(0, x)), and
all external inputs are identical.
3. The case in which the neurons are linear-threshold, and one external input
is much larger than all others.
More general cases using non-linear transfer functions are difficult to solve
analytically; however, they can be studied using numerical simulations.
Linear Units If the neurons are fully linear ( f (x) = x) we can solve the system
analytically:
ye_j = we_j x_j − wi_j yi
yi = ∑_j wl_j ( we_j x_j − wi_j yi )        (3.2)

which implies that

ye_j = we_j x_j − ( wi_j ∑_k wl_k we_k x_k ) / ( 1 + ∑_k wl_k wi_k )
yi = ( ∑_k wl_k we_k x_k ) / ( 1 + ∑_k wl_k wi_k ).        (3.3)
In the simplified case, we assume that all the weights of each kind are the same:
we_j = we   ∀j,     wi_j = wi   ∀j,     wl_j = w0   ∀j

and so

ye_j = we x_j − ( we ∑_k x_k ) / ( 1/(wi w0) + N ).        (3.4)
The output of each neuron is proportional to its input, but has a normalizing term
subtracted. Equation (3.4) shows that the response ye j of a linear excitatory neuron
can have both positive and negative values, depending on the inputs x k , on its
connection weights we , wi , w0 and on the total number of excitatory neurons N.
Linear Threshold Units with Uniform External Inputs The half-wave rectification function ( f (x) = max(0, x)) is a more biologically realistic function than
the linear one of the previous case. Neurons with this transfer function have a
response consisting of only positive values.

In this case, the system of equations (Eqs. 3.1) becomes a system of non-linear coupled equations, and it is no longer possible to obtain a general closed-form solution. However, if all external inputs are identical (x_j = x0 ∀j), we can
reduce the system to
ye_j = max( 0, we_j x0 − wi_j yi )
yi = max( 0, ∑_j wl_j ye_j )        (3.5)
and if we make the working hypothesis that (we j x0 − wi j yi ) > 0 ∀ j, then we
obtain the linear system:
ye_j = we_j x0 − wi_j yi
yi = ∑_j wl_j ( we_j x0 − wi_j yi )        (3.6)

which yields

ye_j = x0 [ we_j ( 1 + ∑_k wl_k wi_k ) − wi_j ∑_k we_k wl_k ] / ( 1 + ∑_k wl_k wi_k )
yi = x0 ( ∑_j we_j wl_j ) / ( 1 + ∑_j wl_j wi_j ).        (3.7)
If the synapses from external inputs and those from the inhibitory neuron have
equal strength (we j = wi j = w0 ∀ j), then
ye_j = x0 / ( 1/w0 + ∑_k wl_k )
yi = ( x0 ∑_j wl_j ) / ( 1/w0 + ∑_k wl_k ).        (3.8)
The hypothesis used to obtain Eq. (3.6) is satisfied for all values of x 0 > 0, w0 > 0,
and wl j > 0 ∀ j. In summary, if all inputs are equal, then all excitatory linear
threshold units have identical outputs which are equal to the input normalized by
a term that is directly proportional to the weights wl j and inversely proportional to
w0 .
Linear Threshold Units with One Input Much Greater than All Others   Now consider the case in which one input (say the external input to unit j0, x_{j0}) is much greater than all other external inputs (x_{j0} ≫ x_j ∀ j ≠ j0) and the synaptic weights are as described above. Again, we assume a priori that the weighted external excitatory input to unit j0 exceeds the inhibitory input to the same unit (we_{j0} x_{j0} − wi_{j0} yi > 0) and that the weighted external inputs to all other excitatory units don't (we_j x_j − wi_j yi < 0 ∀ j ≠ j0). Under these assumptions, Eq. 3.5 can
be rewritten as

ye_{j0} = we_{j0} x_{j0} − wi_{j0} yi
ye_j = 0   ∀ j ≠ j0
yi = wl_{j0} ( we_{j0} x_{j0} − wi_{j0} yi )        (3.9)
Figure 3.2: Simulations of a WTA network comprising 100 linear-threshold units ordered
along one spatial dimension. The input (solid line) is composed of 3 Gaussians. The
outputs are shown for two cases: we j = 1, wi j = 1 and wl j = 0.0250 ∀ j (dashed line);
we j = 1, wi j = 1 and wl j = 0.0325 ∀ j (dotted line).
which can be simplified to yield
ye_{j0} = ( we_{j0} x_{j0} ) / ( 1 + wl_{j0} wi_{j0} )
ye_j = 0   ∀ j ≠ j0
yi = ( we_{j0} wl_{j0} x_{j0} ) / ( 1 + wl_{j0} wi_{j0} ).        (3.10)
This solution satisfies the assumption that we j0 x j0 > wi j0 yi for all values of we j0 ,
x j0 , and wi j0 . It also satisfies the a priori assumption that we j x j < wi j yi as long as
the external input x j0 is sufficiently large with respect to all other x j inputs.
Summarizing: if one external input is much greater than the other inputs, then
all excitatory linear threshold units, except the one receiving the strongest input,
are suppressed. The output of the winning unit is a normalized version of the
input, and the normalizing factor is directly proportional to the connection weights
wi j0 , wl j0 , and inversely proportional to we j0 .
Numerical Simulations   It is not possible to obtain a closed-form solution for networks with linear-threshold units and an arbitrary input distribution, or for networks with arbitrary transfer functions; however, numerical simulations are useful
for providing insight into the general computational properties of the network.
Figure 3.3: Numerical simulation of the same WTA network shown in Fig. 3.2, now with weight values we_j = 1, wi_j = 1 and wl_j = 0.0275. (a) Input distributions of increasing amplitude. (b) Network responses to the three inputs shown in (a).
For example, the simulations shown in Figs. 3.2 and 3.3 explore the response
of a network with f (x) = max(0, x) and N=100 to a more complicated input distribution, consisting of three Gaussians centered at unit positions 20, 50, and
80, and having maximum values of 0.75, 0.5, and 0.35 respectively (see solid
line of Fig. 3.2). The simulations of Fig. 3.2 show the effect of modifying the
excitatory to inhibitory weights wl j (with all other weights set to one). When
wl j = 0.0250 ∀ j, the output is a thresholded version of the input, consisting of 3
peaks of activity. However, when wl j is increased to 0.0325 ∀ j, only the strongest
input peak is reflected in the output.
In the simulations of Fig. 3.3(a), the excitatory-to-inhibitory weights wl_j are set to an intermediate value of 0.0275 ∀ j, and the network responds to the two
strongest peaks in the input. The form of the response is invariant to the input
strength (or alternatively, the strength of the we j weights) as shown in Fig. 3.3.
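The kind of simulation shown in Figs. 3.2 and 3.3 is straightforward to reproduce. The sketch below is a minimal Python version that relaxes Eqs. 3.1 to a steady state; the Gaussian width (σ = 5 units), the relaxation scheme and its step size are assumptions, so the numbers will not match the figures exactly, but the qualitative effect of increasing wl (a larger inhibitory activity yi and therefore a stronger suppression of the weaker peaks) is readily visible:

    import numpy as np

    def wta_steady_state(x, we=1.0, wi=1.0, wl=0.025, n_steps=20000, dt=0.02):
        # Relax Eqs. 3.1 with linear-threshold units f(a) = max(0, a) until settled.
        f = lambda a: np.maximum(0.0, a)
        ye = np.zeros_like(x)
        yi = 0.0
        for _ in range(n_steps):
            ye += dt * (f(we * x - wi * yi) - ye)   # excitatory units
            yi += dt * (f(wl * ye.sum()) - yi)      # global inhibitory unit
        return ye, yi

    # Three Gaussian inputs centred at units 20, 50 and 80 with peak values
    # 0.75, 0.5 and 0.35; the width (sigma = 5 units) is an assumption.
    pos = np.arange(100.0)
    sigma = 5.0
    x = sum(a * np.exp(-(pos - c) ** 2 / (2 * sigma ** 2))
            for a, c in [(0.75, 20), (0.50, 50), (0.35, 80)])

    for wl in (0.0250, 0.0275, 0.0325):
        ye, yi = wta_steady_state(x, wl=wl)
        print(f"wl={wl:.4f}  yi={yi:.3f}  outputs at units 20/50/80:",
              np.round(ye[[20, 50, 80]], 3))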
3.2 Non-linear Programming Formulation
The competitive mechanism that emerges from the neural architecture of Fig. 3.1
can also be described mathematically. The following set of non-linear equations
selects the largest number among N real numbers by multiplying the neuron output signals ye_j by binary-valued constants α_j (either 0 or 1):
min( − ∑_{j=1}^{N} α_j ye_j )   with constraint:   ∑_{j=1}^{N} α_j = 1   (α_j ∈ {0, 1}).        (3.11)
Systems of non-linear equations with discrete constraints are difficult to solve. We
can simplify the system if we extend the domain of α j to the continuous interval
[0, 1] and include an additional constraint that forces the continuous values of α j
to tend toward either zero or one:
min( − ∑_j α_j ye_j )   with constraints:   ∑_j α_j = 1,   ∑_j α_j ln α_j = 0.        (3.12)
These types of systems can be solved using the Lagrange multipliers method [6].
Solving Eq. 3.12 is equivalent to finding min_{α_j}(L) and max_{λ1,λ2}(L), where λ1 and λ2 are parameters called Lagrange multipliers, and L is the cost function
L = − ∑_{j=1}^{N} α_j ye_j + λ1 ( ∑_{j=1}^{N} α_j − 1 ) + λ2 ∑_{j=1}^{N} α_j ln α_j.        (3.13)
If we set λ2 to a constant, then we can find min_{α_j}(L) and max_{λ1}(L) by solving:

∂L/∂α_j = −ye_j + λ1 + λ2 ( ln α_j + 1 ) = 0
∂L/∂λ1 = ∑_j α_j − 1 = 0        (3.14)
which implies

α_j = e^{( ye_j − λ1 − λ2 ) / λ2},     e^{( λ1 + λ2 ) / λ2} = ∑_j e^{ ye_j / λ2 }        (3.15)
and

α_j = e^{ ye_j / λ2 } / ∑_k e^{ ye_k / λ2 }.        (3.16)
This equation approaches the solution of Eq. 3.12 when λ2 is sufficiently small.
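The limiting behavior of Eq. 3.16 is easy to verify numerically: as λ2 decreases, the coefficients α_j concentrate on the largest ye_j. A minimal Python sketch (the ye_j values are arbitrary):

    import numpy as np

    def alpha(ye, lam2):
        # Eq. 3.16: alpha_j = exp(ye_j / lam2) / sum_k exp(ye_k / lam2)
        z = np.exp((ye - ye.max()) / lam2)   # subtract the max for numerical stability
        return z / z.sum()

    ye = np.array([0.30, 0.45, 0.50, 0.20])  # arbitrary neuron output values

    for lam2 in (1.0, 0.1, 0.01):
        print(f"lambda2 = {lam2:<4}:", np.round(alpha(ye, lam2), 3))
    # As lambda2 -> 0 the coefficients tend to (0, 0, 1, 0): the unit with the
    # largest ye_j is selected, which is the desired winner-take-all behavior.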
This system of constrained non-linear equations can be implemented using
MOSFETs in the subthreshold domain. The circuit that solves this system of equations is the current-mode WTA circuit described in the next Chapter (see Fig. 4.1).
If we assume that the MOSFETs of the circuit of Fig. 4.1 are in saturation, so
that eq. (2.2) holds, we can write:
Iout_j = I0 e^{( κ Vd_j − Vc ) / UT}.

If we then apply Kirchhoff's current law at the common node Vc (Ib = ∑_j Iout_j), and we observe that the circuit's output currents Iout_j can be expressed as a fraction α_j of the total bias current,

Iout_j = α_j Ib = α_j ∑_{k=1}^{N} Iout_k,

then we can prove the equivalence between the circuit's response and the system of equations (Eqs. 3.15):

Iout_j = e^{ ye_j / λ2 },     Ib = e^{( λ1 + λ2 ) / λ2}

with

ye_j = λ2 ( κ Vd_j / UT − Vc / UT + ln(I0) ),     λ1 = λ2 ( ln(Ib) − 1 ).
Chapter 4
Current mode Winner-Take-All
circuits
CMOS implementations of WTA networks are an important class of circuits widely
used in neural networks and pattern-recognition systems. They implement architectures that select one node, out of many, through a competition mechanism that
depends on the amplitude of the architecture’s input signals. Several types of
WTA circuits have been proposed in the literature [17, 23, 31, 41, 69, 72, 83, 91,
104, 122]. Each WTA circuit was designed with specific optimization constraints
in mind. For example, the circuits proposed in [91] and in [104] are optimal for
high-speed, high-precision applications, whereas the circuits of [83] and [31] are
optimal for pulse-coded neural networks. The WTA circuit proposed by Lazzaro
et al. [72] optimizes power consumption and silicon area usage. It is ideal for
applications that do not require high precision or high speed computation, such as
sensory perception tasks [25, 43, 71]. This circuit, proposed more than ten years
ago, still remains one of the most compact and elegant designs of analog current-mode WTA circuits. It is asynchronous; it responds in real-time; and it processes
all its input currents in parallel, using only two transistors per node, if the output signal is a voltage, or four transistors if the output signal is a current (see
Fig. 4.3(b)). Recently, some extensions to the basic design described in [72] have
been proposed [26, 47, 106]. They endow the WTA circuit with local excitatory
feedback [106] and with distributed hysteresis [26, 47]. Local excitatory feedback
enhances the resolution and speed performance of the circuit, providing a hysteretic mechanism that prevents the selection of other potential winners unless they are stronger than the currently selected one by a set hysteretic current. Distributed hysteresis allows the winning input to shift between adjacent locations while maintaining its winning status, without having to reset the network. These enhanced types of WTA
networks are able to select and lock onto the input with strongest amplitude, and
to track it as it shifts smoothly from one pixel to its neighbor [46, 50, 87].
In this Chapter we first analyze the original WTA circuit proposed in [72], and then describe a new version of the current-mode WTA circuit that
contains local excitatory feedback and lateral excitatory coupling (to implement
distributed hysteresis) but that also implements lateral inhibitory coupling and
diode-source degeneration. The interactions between the non-linearities of the
WTA network and the lateral coupling networks produce center-surround spatial
response properties that differ from the ones obtained using conventional spatial
diffusion networks [10, 118]. To make an accurate comparison between the performance of the new WTA network and the performance of the classical WTA
network described in [72], we implemented both circuits on the same chip, using transistors of the same size, common bias pads and the same input sources.
In the next two sections we describe the circuits, present experimental data from
both networks, derive analytically the hysteretic WTA network’s lateral coupling
properties as a function of the circuit parameters, point out the differences to conventional diffusor networks and show the response properties of the circuit when
both lateral excitatory and lateral inhibitory couplings are enabled.
4.1 The original current-mode WTA circuit
The circuit of Fig. 4.1 is a continuous time, analog circuit that implements a WTA
network. It was originally designed by Lazzaro et al. [72] and is extensively used
in a wide variety of applications. The circuit is extremely compact and elegant:
It processes all the (continuous-time) input signals in parallel, using only two
transistors per input cell, and one global transistor that is common to all cells.
Collective computation and global connectivity are obtained using a single node
common to all cells.
An example of a WTA circuit containing only 2 cells is shown in Fig. 4.1. Each
cell comprises a current-controlled conveyor and is connected to a global node Vc .
The WTA network is modular and can be extended to N cells, by connecting
additional cells to the node Vc . Input currents are applied to the network through
current sources which are implemented for example using subthreshold pFETs.
The output signals are encoded both by the Iout1 and Iout2 currents, and the Vd1
and Vd2 voltages. The voltage Vb sets the bias current Ib . Transistors M1 and M2
discharge nodes Vd and so implement inhibitory feedback. Transistors M3 and
M4 implement an excitatory feedforward path by charging node Vc . The overall
circuit selects the largest input current Iin_j: cell j provides Iout_j ≈ Ib, and so suppresses all other output voltages and currents (Vd_i ≈ 0, Iout_i ≈ 0 for all i ≠ j). Cell
j wins the competition because its voltage Vd j determines Vc by the exponential
characteristics of the transistor that sinks the output current Iout j (for example, M3
or M4 ).
Figure 4.1: Two cells of a current mode WTA circuit.
We will analyze the behavior of the circuit in the steady-state case using the
methods that we applied for the network model: By providing constant input signals and measuring the outputs after the circuit has settled. We consider three
cases: Both inputs are equal; one input much larger than the other; and two inputs
that differ by a very small amount (small-signal regime).
Both Inputs Equal If the two input currents are equal (Iin1 = Iin2 = Im ) then the
currents flowing into transistors M1 and M2 of Fig. 4.1 are also equal. In this case,
because the gates of M1 and M2 are tied to the same common node Vc , the drain
voltages of M1 and M2 must take the same value (Vd1 = Vd2 = Vm ). As a result, the
output transistors M3 and M4 will have the same gate-to-source voltage difference
(Vgs3 = Vgs4 = Vm −Vc ). If both output transistors are in saturation then the output
currents must be identical. Moreover, Kirchhoff’s current law requires that, at the
common node Vc , Iout1 = Iout2 = Ib /2 (Eq. 2.5).
One Input Much Greater than The Other From eq. (2.1) we can observe that
the subthreshold current flowing through a transistor can be divided into a forward
component, I f , and a reverse component, Ir :
Iout = If − Ir = I0 ( e^{( κ VG − VS ) / UT} − e^{( κ VG − VD ) / UT} )        (4.1)
When the transistor’s source voltage Vs is approximately equal to its drain
voltage Vd , Ir becomes comparable to I f .
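The following short Python sketch evaluates Eq. 4.1 for a fixed gate voltage and a sweep of drain voltages (the values of I0, κ, UT and VG are assumed, typical textbook numbers rather than measured parameters of this chip), showing that the reverse component only becomes comparable to the forward one when VD drops to within a few UT of VS:

    import numpy as np

    I0, kappa, UT = 1e-15, 0.7, 0.025   # assumed typical subthreshold parameters
    VG, VS = 0.7, 0.0                   # fixed gate and source voltages

    def forward_reverse(VD):
        If = I0 * np.exp((kappa * VG - VS) / UT)   # forward component
        Ir = I0 * np.exp((kappa * VG - VD) / UT)   # reverse component
        return If, Ir

    for VD in (0.4, 0.2, 0.1, 0.05, 0.01):
        If, Ir = forward_reverse(VD)
        print(f"VD = {VD:4.2f} V   Ir/If = {Ir / If:.3f}   Ids = {If - Ir:.3e} A")
    # For VD > 4*UT (about 100 mV) the ratio Ir/If is negligible and Ids ~ If
    # (saturation); as VD approaches VS the two components cancel (ohmic region).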
With this property in mind, we can consider the case in which Iin1 ≫ Iin2. In this case, the drain voltage of M1 (Vd1) will be greater than the drain voltage of M2 (Vd2). If the transistor M1 is in saturation (Vd1 > 4UT), the dominant component of its drain current will be in the forward direction and its gate voltage Vc will increase such that Id1 = If1 = I0 e^{κ Vc / UT} = Iin1. Although the two input currents Iin1
and Iin2 are different, the forward component of the drain currents of M1 and M2
are equal (I f1 = I f2 ) because the two transistors have a common gate voltage Vc ,
and both their sources are tied to ground. The drain current Id2 of transistor M2
can only be equal to the input current Iin2 under the following conditions:
If2 − Ir2 = Iin2

which implies that

Ir2 = If2 − Iin2

so

Ir2 = Iin1 − Iin2 ≫ 0.
The reverse component of Id2 becomes significant only if Vd2 decreases enough for
M2 to operate in its ohmic region (Vd2 ≤ 4UT ). In this case, the output transistor
M4 is effectively switched off, and Iout2 = 0. Consequently, M3 sources all the
bias current (Iout1 = Ib), with Vd1 satisfying the equation I0 e^{( κ Vd1 − Vc ) / UT} = Ib.
The experimental data of Fig. 4.2 shows the output voltages (Vd,1 and Vd,2 ) and
output currents (Iout,1 and Iout,2 ) of the circuit, in response to the differential input
voltage ∆V which encodes the ratio of the input currents. In this experiment, the
input currents were provided by pFETs operating in the subthreshold regime: The
gate voltage Vin1 of the pFET sourcing current into the first cell was set to 4.3V,
while the gate voltage Vin2 of the pFET sourcing current into the second cell was
set to Vin2 = Vin1 + ∆V . The two traces in each plot show the responses of the two
cells as ∆V was swept from -8mV to +8mV. When ∆V is zero (the input currents
are identical), the output signals of both cells are also identical. When ∆V is large
(one input current dominates), a single cell is selected.
If ∆V is small, the above description is not adequate. Instead, we can compute
the output signals of the cells using small-signal analysis [70].
Figure 4.2: Responses of the two-cell WTA circuit shown in Fig. 4.1. (a) Voltage output
(Vd1 and Vd2 ) versus the differential input voltage. (b) Current output (I out1 and Iout2 ). The
bias voltage Vb = 0.7V. The small difference in the maximum output currents is due to
device mismatch effects in the read-out transistors of the two cells.
Two Inputs Differ by a Small Amount To analyze the circuit in this regime, we
must consider the Early effect of the transistor operating in the saturation region
(Eq 2.1):
Ids = Isat ( 1 + Vds / Ve )        (4.2)
where Ve is the Early voltage.
Assume that the two input currents Iin1 and Iin2 are initially equal. In this case,
the transistors M1 and M2 operate in the saturation region: The output voltages
Vd1 and Vd2 will settle to a common value, and the output currents Iout1 and Iout2
will both be equal to Ib /2. If we now increase the input current Iin1 by a small
amount δI and apply Eq. (2.1) to transistor M1 of Fig. 4.1, then its drain voltage
Vd1 will increase by

δV = ( δI / Isat ) Ve.        (4.3)

As Vd1 is also the gate voltage of transistor M3, Iout,1 will be amplified by an amount proportional to e^{δV}. The constraint of Eq. (2.5) requires that Iout2 decrease by the same amount in steady state. This reduction means the gate voltage Vd2 of M4 must decrease by δV.
The gain of the competition mechanism (δV/δI) in the small-signal regime is directly proportional to the Early voltage Ve and inversely proportional to Isat. The Early voltage depends on the geometry of the transistors and is fixed at design time. On the other hand, Isat depends on Vc, which changes with the amplitude of the input currents.
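As a rough worked example (with hypothetical numbers chosen only for illustration, not measured on this chip): taking an Early voltage Ve = 20V and an input current Isat ≈ Iin = 100nA, an increment δI = 1nA (a 1% change) gives δV = (1nA/100nA) · 20V = 200mV at the drain of M1, which is more than enough to steer essentially all of the bias current toward the corresponding output transistor. This is consistent with the sharp transitions of Fig. 4.2, where a differential input voltage of a few millivolts is sufficient to switch the winner.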
4.2 The hysteretic WTA circuit
The circuit diagram of one cell of the hysteretic WTA (hWTA) network is shown
in Fig. 4.3. The data shown in this Section was taken from a hWTA network implemented using a 2µm CMOS technology, as linear arrays of 25 cells. The cell size of the hWTA network is 60µm × 100µm.
The current source in Fig. 4.3 that generates the bias current Ib can be implemented using a single n-type MOS transistor. If the transistor operates in weak inversion, it is in saturation as long as Vc ≥ 4UT (i.e. Vc ≥ 100mV), and its output current is Ib = I0 e^{κ Vb / UT}. The term UT represents the thermal voltage, I0 the zero-bias current, and κ the subthreshold slope coefficient [74]. In practical applications Ib can be set by providing an external bias current into a single
diode-connected transistor that has its gate connected to all the network’s bias
transistors (thus implementing a series of current-mirrors). Similarly, the input
current source that generates Iin can be implemented using a p-type transistor operating in the subthreshold regime. Although the WTA circuit can operate both
in the weak and strong inversion regimes, it is typically operated in the weak inversion/subthreshold regime. In this regime the circuit is particularly sensitive to
device mismatch and noise. In the present implementation, when operated in subthreshold, the circuit selects a single winner if its input currents differ by at least
10% from one another, and one input is greater than the others. The low currents
provided by the subthreshold input transistors and by the bias transistor (typically
ranging from fractions of pico-Amperes to hundreds of nano-Amperes) also limit
the circuit’s dynamic response properties. As with the original WTA circuit, the
network’s time constant is dominated by the maximum input current and ranges
from fractions of milliseconds up to fractions of seconds. The detailed, quantitative analysis of the WTA’s dynamic response properties discussed in [72] is valid
also for the circuit proposed here. Like the original WTA circuit, this circuit is ideal for tasks that do not rely on high precision and do not require time constants shorter than a few milliseconds. Fortunately, most applications involving perception and processing of sensory signals fall into this category.

Figure 4.3: Hysteretic WTA cell, with local excitatory feedback, lateral excitatory coupling, lateral inhibitory coupling and diode-source degeneration.
The main differences between the original WTA design and the one described
here are implemented by transistors M5 through M9, as shown in Fig. 4.3. Specifically, transistor M5, together with M3, implements local excitatory feedback. Transistor M6 implements diode-source degeneration, and transistors M8
and M9 implement inhibitory and excitatory lateral coupling respectively.
4.2.1 Local Excitatory Feedback
The main effect of local excitatory feedback is to introduce a hysteretic behavior
into the WTA network. Once a cell is selected as the winner, a current proportional
to the network’s bias current Ib is sourced back into the cell’s input node through
the current-mirror formed by M3 and M5 (see Fig. 4.3). If the bias current Ib is
a subthreshold current, the proportionality factor of the local excitatory feedback
current is modulated exponentially by the voltage difference (Vdd − Vgain ). Hysteresis is evident because, after a cell has been selected as the winner, to lose its
winning status the cell’s input current has to decrease by an additional amount
equal to the local excitatory feedback current. Figure 4.4 shows the output of a
cell of the hWTA network, superimposed on the output of the corresponding cell
belonging to the classical WTA network, in response to the same input signals.
For both types of WTA networks, input currents were applied only to two neighboring cells, while all other cells received no input. The common mode input
current of the stimulated cells was set by biasing the input p-type transistors with
a constant voltage Vin = 4.2V . The bias current of both WTA networks was generated using a bias voltage Vb = 0.67V . The local excitatory feedback loop of the
hWTA circuit was fully activated (Vgain = Vdd ). The width of the hysteresis curve
can be modulated by changing either the WTA network’s bias current Ib , or the
control voltage Vgain.

Figure 4.4: Response of the hWTA circuit (outer hysteresis plot) superimposed on the response of the classical WTA circuit (inner central plot). The output of the classical WTA circuit was shifted vertically by a few nano-amperes for the sake of clarity.
The stability properties of the hWTA network are the same as those of conventional winner-take-all circuits with positive feedback, and have been analyzed
in detail in [106]. Similarly, the dynamic response properties of the hWTA network are the same as those of the classical current-mode WTA network described
in [72] and depend mainly on the values of Ib and of the total current entering the
input nodes of the WTA cells.
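A minimal behavioral sketch of this hysteretic selection mechanism is given below (Python); the hysteresis is abstracted as a fixed current bonus added to the current winner before the comparison, standing in for the feedback through the current mirror formed by M3 and M5 rather than modeling it at the transistor level:

    import numpy as np

    def hysteretic_wta(frames, i_hyst):
        # Pick a winner per frame; the current winner gets a bonus of i_hyst.
        winner, selections = None, []
        for iin in frames:                    # iin: array of input currents (nA)
            eff = iin.copy()
            if winner is not None:
                eff[winner] += i_hyst         # local positive feedback of the winner
            winner = int(np.argmax(eff))
            selections.append(winner)
        return selections

    # Two cells; cell 1 exceeds cell 0 in frames 1 and 3, but only in frame 3
    # does it exceed the winner by more than the hysteretic current.
    frames = np.array([[100.0, 90.0], [100.0, 105.0], [100.0, 95.0], [100.0, 112.0]])
    print("with hysteresis (i_hyst = 10 nA):", hysteretic_wta(frames, 10.0))
    print("without hysteresis:              ", hysteretic_wta(frames, 0.0))
    # With hysteresis the selection is [0, 0, 0, 1]; without it the winner
    # flickers between the two cells ([0, 1, 0, 1]).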
Figure 4.5: Diode-source degenerated WTA network output and classical WTA network
output.
4.2.2 Diode-source degeneration
Source degeneration, also referred to as emitter degeneration for bipolar transistors, is a classical technique in analog design [36]. It consists of converting the
current flowing through a transistor into a voltage, by dropping it across a resistor
or a diode, and feeding this voltage back to the source of the transistor, to increase
its gate voltage accordingly. At the WTA network level, source degeneration of
the input transistor has the effect of increasing the circuit’s winner selectivity gain.
This is evident in Fig. 4.5, where the output of the diode-source degenerated network is superimposed on the output of the classical WTA network, in response to
the same input signals. This figure shows the output of four cells (two neighboring
cells per type of WTA network) as they change their state from winning to losing
and vice-versa. Small differences in the amplitude of the winning signals are due
to mismatches of the readout transistors (M4 of Fig. 4.3). The data was taken
using the same input stimulus arrangement described for Fig. 4.4. The bias current of both WTA networks was generated using a bias voltage Vb = 0.7V . Local
excitatory feedback (and the hysteretic behavior associated with it) was disabled
by setting the control voltage Vgain to 3V .
By adding just one transistor and connecting its gate to the diode-source degeneration transistor of each WTA cell it is possible to read out a copy of the cell’s
net input current Iall (see M7 of Fig. 4.3). As Iall represents the sum of all of the
currents converging into the WTA cell (namely, the input current Iin , the current
being spread to or from the left and right nearest neighbors and the local excitatory feedback current coming from the top p-type current mirror), it is a useful
measure for visualizing the state of the WTA network.

Figure 4.6: Simplified WTA circuit, used to analyze the excitatory diffusor network.
4.3 Lateral coupling
Lateral coupling is implemented in the hWTA network proposed here by means of
“diffusor” (or “pseudo-conductance”) networks [10, 118]. Diffusor networks are
extensively used in silicon retinas and other types of neuromorphic circuits. In the
circuit proposed here the current diffusors are implemented by transistors
M8 and M9 of Fig. 4.3, operated in the subthreshold regime. Specifically, transistor M9 implements lateral excitatory coupling and transistor M8 lateral inhibitory
coupling. Functionally, the inhibitory diffusor network can be used to spatially
decouple the WTA cells, while the excitatory diffusor network can be used to
smooth the input signals, combined with the local excitatory feedback current of
the winning cell (see Section 4.2.1).
4.3.1 Lateral excitation
To study analytically the principle of operation of the excitatory diffusor network
let us neglect, for the time being, the inhibitory diffusor network (i.e. let us set
the inhibition to be global with Vinh = 5V ). Furthermore let us neglect, for the
sake of simplicity, transistors M5, M6, and M7 of Fig. 4.3 and apply a constant
subthreshold input current Iin only to the first node of the network. In this case the
hWTA network reduces to the circuit shown in Fig. 4.6.
As pointed out by the figure, the (subthreshold) currents flowing through the diffusors can be separated into forward and reverse components, Id,i = If,i − Ir,i, where

Ir,i = I0 e^{( κ Vex − Vi ) / UT}        (4.4)
If,i = I0 e^{( κ Vex − Vi+1 ) / UT}        (4.5)

From these equations the following relationship holds:

If,i = Ir,i+1        (4.6)
By writing Kirchhoff's current law at each node i we have:

Ia,i = ( If,i−1 − Ir,i−1 ) − ( If,i − Ir,i )        (4.7)

which, using eq. (4.6), turns into:

Ia,i = 2 Ir,i − Ir,i−1 − Ir,i+1        (4.8)

but, if Ia,i is a subthreshold current, we can also write:

Ia,i = I0 e^{κ Vc / UT} ( 1 − e^{−Vi / UT} )        (4.9)

and, by expressing Vi in terms of Ir,i (using eq. (4.4)),

Ia,i = I0 e^{κ Vc / UT} − e^{κ ( Vc − Vex ) / UT} Ir,i        (4.10)

which yields

Ir,i = λ I0 e^{κ Vc / UT} − λ Ia,i        (4.11)

where λ = e^{−κ ( Vc − Vex ) / UT}. Substituting eq. (4.11) into eq. (4.8) we obtain the discrete approximation of a Laplacian:

Ia,i = λ ( Ia,i−1 − 2 Ia,i + Ia,i+1 )        (4.12)

It follows that

Ia,i = ( λ / (1 + 2λ) ) Ia,i−1 + ( λ / (1 + 2λ) ) Ia,i+1        (4.13)

By using this equation recursively we can write

Ia,i = ( λ / (1 + 2λ) ) Ia,i−1 + ( λ² / (1 + 2λ)² ) ( Ia,i + Ia,i+2 )        (4.14)

If λ ≪ 1, eq. (4.14) reduces to

Ia,i ≈ λ Ia,i−1        (4.15)
If we want to estimate the current flowing to ground through the n-th transistor of the network, Ia,n, we can use eq. (4.15) recursively until we reach the first cell of the network (node 0):

Ia,n = Ia,0 λ^n        (4.16)

but, as Ia,0 ≈ Iin (if λ ≪ 1), we can write:

Ia,n = Iin e^{−n κ ( Vc − Vex ) / UT}.        (4.17)
The term λ is defined as the network’s space constant. The space constant
(and with it, the network’s spatial coupling) is modulated exponentially by the
term −(Vc −Vex ). While Vex is a directly accessible circuit parameter, independent
of other circuit parameters, the voltage Vc depends logarithmically on the input
current. Specifically, for the circuit of Fig. 4.6:
Ia,0 = I0 e^{κ Vc / UT} ≈ Iin        (4.18)

With this relationship in mind, we can rewrite λ as a function of Vex and Iin, and eq. (4.16) reduces to:

Ia,n = Iin ( I0 e^{κ Vex / UT} / Iin )^n        (4.19)
According to this finding, an increase in Vex will increase (exponentially) the
amount of spreading and allow more current to flow through the diffusors. Conversely, increases in the amplitude of Iin will narrow the spreading width of the
network and diminish the amount of current flowing through the diffusors. In this
respect this excitatory diffusor network differs from the diffusor networks previously proposed [10] which have the undesirable property of increasing lateral
spreading with increasing amplitude of input signals. In typical applications of
diffusor networks, increasing the range over which spatial averaging takes place
can be an effective strategy if the signal-to-noise ratio of the input signals is not
too high. On the other hand, if input signals are strong, smoothing over large
regions not only might not be useful, but could even be counterproductive.
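Equation 4.19 can be evaluated directly to visualize both trends. In the Python sketch below, I0, κ and UT are assumed typical values, and the Vex and Iin values are chosen only so that the condition λ ≪ 1 holds for those assumptions; they are not the operating voltages of the actual chip:

    import numpy as np

    I0, kappa, UT = 1e-15, 0.7, 0.025     # assumed typical subthreshold parameters

    def diffusor_profile(Iin, Vex, n_cells=8):
        # Eq. 4.19: Ia_n = Iin * (I0 * exp(kappa*Vex/UT) / Iin)**n
        lam = I0 * np.exp(kappa * Vex / UT) / Iin   # space constant, must be << 1
        return Iin * lam ** np.arange(n_cells)

    for Vex in (0.35, 0.45):
        for Iin in (1e-9, 10e-9):
            prof = diffusor_profile(Iin, Vex)
            print(f"Vex = {Vex:.2f} V, Iin = {Iin * 1e9:4.0f} nA: "
                  f"lambda = {prof[1] / prof[0]:.3f}, current at cell 3 = {prof[3]:.2e} A")
    # Increasing Vex widens the spread (larger lambda), while increasing Iin
    # narrows it, as discussed above.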
The experimental data of Fig. 4.7 confirms the theoretical predictions of eq. (4.19).
In Fig. 4.7(a) we stimulated the first cell of the hWTA network with a constant current and measured its response for different values of Vex . As for the theoretical
analysis, lateral inhibition is set to be global (Vinh = 5V ); the effects of the diodesource degeneration transistors can be neglected, as the currents flowing through
the diode-connected transistors (M6 of Fig. 4.3) are of the order of a few nanoamperes. The discontinuity present in the response profile between the first cell of
the network and the second is due to the non-linear nature of the WTA competitive
mechanism. From the second cell on, the measured current decays exponentially with distance, as predicted by eq. (4.19) (see inset of Fig. 4.7(a)).

Figure 4.7: Effect of lateral excitatory coupling on the hWTA network. (a) Output currents Iall (see Fig. 4.3) measured at each cell of the network for four increasing values of Vex. The inset shows a fit of the data from cells 2 to 20 with an exponential function. (b) Output currents Iall measured for three increasing values of Iin. Each data set is normalized to the maximum measured current.

In Fig. 4.7(b) we stimulated the first cell of the network with currents of increasing amplitude
(for a fixed value of Vex ), measured the network’s response and plotted the data on
a normalized scale. As predicted by eq. (4.19), and as shown in Fig. 4.7(b), lateral
spreading decreases with increasing amplitude of the input current.
4.3.2 Local inhibition
The local inhibitory diffusor network is equivalent in all respects to the local excitatory network. It can be shown, using the same methodology used to analyze the
excitatory diffusor network, that the inhibitory diffusor network’s space constant
depends exponentially on Iin and on Vinh . We can see intuitively how Vinh allows
us to modulate the spatial extent over which the WTA cells compete. In one extreme case inhibition is global (i.e. Vinh = 5V ), and the WTA network allows only
one winner to be active at a time. In the other extreme case, the cells of the WTA
network are completely decoupled from each other (Vinh = 0), and all cells are
allowed to be simultaneously selected as winners. For intermediate values of Vinh
the network can be biased to allow multiple winners to be active simultaneously,
as long as they are sufficiently distant from each other.
Combining the effects of both excitatory and inhibitory networks, we can bias
the hWTA network to exhibit different functional behaviors. Figure 4.8 shows a
comparison between the behavior of the classical WTA network and the behavior
of the hWTA network for different input distributions and different settings of Vex
and Vinh .
Fig. 4.9 shows perhaps the most interesting response profile that can be obtained by combining lateral excitation and local inhibition in this WTA network: a center-surround response profile. It was measured with lateral excitation enabled (Vex = 1.825V), after stimulating the central cell of the network with a constant input current, for different values of Vinh.
4.4 Applications
Besides being a practical, compact, low-power circuit for generic applications
that require a winner-take-all type of computation, the hWTA circuit is particularly useful in all those applications that involve the processing of sensory signals
and the selection of one or more inputs (e.g. for determining motor actions in
a sensory-motor system). The center-surround response profile of the network
shown in Fig. 4.9 is a rough approximation of a difference of two Gaussians (DOG), which in turn closely approximates a Laplacian of a Gaussian ∇²G. It
has been argued that this type of operator is ideal for detecting intensity changes
in sensory stimuli [79] and resembles closely the response profile of many types
of neurons, ranging from simple cells in the visual cortex of mammals [59] to
cells in the somatosensory cortex of rats [85], to neurons in the midbrain of barn
owls [62]. Examples of applications that exploit the local excitatory feedback and
distributed hysteresis circuits are presented in the next Chapter.
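The claim that a DOG closely approximates a Laplacian of a Gaussian can be checked with a few lines of code. The sketch below (Python; the σ value and the 1.6 ratio between the two Gaussian widths are the usual textbook choice, not parameters extracted from the chip) compares the two profiles on a 1-D grid:

    import numpy as np

    x = np.linspace(-10, 10, 401)
    sigma = 2.0

    def gauss(x, s):
        return np.exp(-x ** 2 / (2 * s ** 2)) / (np.sqrt(2 * np.pi) * s)

    # Difference of Gaussians with the classic 1.6 width ratio ...
    dog = gauss(x, sigma) - gauss(x, 1.6 * sigma)
    # ... versus the negated second derivative of a Gaussian, the 1-D analogue
    # of the Laplacian of a Gaussian.
    log = -(x ** 2 / sigma ** 4 - 1.0 / sigma ** 2) * gauss(x, sigma)

    # Compare the two center-surround profiles after normalizing their peaks.
    dog_n, log_n = dog / dog.max(), log / log.max()
    print("max absolute difference of the normalized profiles:",
          np.abs(dog_n - log_n).max())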
Figure 4.8: Scanned output currents of hWTA network state (top solid-line), of hWTA
output (bottom solid-line) and of classical WTA output (bottom dotted line). (a) Input
currents are applied to cell 1 (Vgs,1 = 1.1V ), cell 12 (Vgs,12 = 1.0V ) and cell 13 (Vgs,13 =
1.0V ), lateral excitation is turned off (Vex = 0V ) and inhibition is global (Vinh = 5V ). Both
the basic WTA network and the hWTA network select cell 1 as the winner. (b) Input
signals and network bias settings are the same as in (a), but lateral excitation is turned on
(Vex = 1.825V). The basic WTA network keeps selecting the strongest absolute input as the winner (cell 1), but the hWTA network selects the region with the two active neighboring cells, because it has a stronger mean activation. (c) Input currents are applied to cells 5, 12
and 16 (Vgs,5 = 1.2V , Vgs,12 = 1.1V , Vgs,16 = 1.0V ), lateral excitation is turned off and
inhibition is global (Vex = 0V , Vinh = 5V ). Both the basic WTA network and the hWTA
network select cell 5 as the winner. (d) Input signals and network bias settings are the
same as in (c), but inhibition is local (Vinh = 3.35V ). If inhibition is not global, the hWTA
network allows multiple winners to be selected, as long as they are spatially distant (cell 16 is selected as a local winner, even though cell 12 receives a stronger input current).
Figure 4.9: Response of the hWTA network to a single cell input (cell 13, with Vgs,13 =
1.1V ) for a fixed value of Vex = 1.825V . (a) Current output for 4 different values of Vinh .
(b) Relative difference between output of the network with global inhibition (Vinh = 5V )
and output of the network with 3 different values of Vinh .
Chapter 5
Neuromorphic vision sensors as
single chip selective attention
systems
Neuromorphic vision sensors are typically analog VLSI devices that implement
hardware models of biological visual systems and that can be used for machine
vision tasks [10, 78]. It is only recently that these hardware models have become
elaborate enough for use in a variety of engineering applications [63]. These types
of devices and systems offer an attractive, low cost alternative to special purpose
DSPs for machine vision tasks. They can be used either for reducing the computational load on the digital system in which they are embedded or, ideally, for
carrying out all of the necessary computation without the need of any additional
hardware. They process images directly at the focal plane level. Typically each
pixel contains local circuitry that performs in real time different types of spatio-temporal computations on the continuous analog brightness signal. In contrast, CCD cameras or conventional CMOS imagers merely measure the brightness at the pixel level, at most adjusting their gain to the average brightness level of
the whole scene. In neuromorphic vision chips, photoreceptors, memory elements
and computational nodes share the same physical space on the silicon surface.
The specific computational function of a neuromorphic sensor is determined by
the structure of its architecture and by the way its pixels are interconnected. Since
each pixel processes information based on locally sensed signals and on data arriving from its neighbors, the type of computation being performed is fully parallel
and distributed. Another important feature is the asynchronous operation of neuromorphic sensors, which is preferable to clocked operation for sensory processing, given the continuous nature of sensory signals. Clocked systems introduce
temporal aliasing artifacts that can significantly compromise the time-dependent
computations performed in real-time sensory processing systems.
Several neuromorphic sensors based on models of visual attention have been
presented [13, 44, 88, 120]. These systems typically contain photo-sensing elements and processing elements on the same focal plane, apply the competitive
selection process to visual stimuli sensed and processed by the focal plane processor itself and perform visual tracking operations.
Tracking features of interest as they move in the environment is a computationally demanding task for machine vision systems. The control loop of active
vision systems, comprising motors that steer the visual sensor, relies on the speed
of the specific computation carried out. The stability of the system depends on the
latency of the sensory-motor control loop itself. Neuromorphic tracking sensors
can reduce this latency and improve the performance of the active vision system.
Here we describe a tracking architecture that reduces the computational cost of
the processing stages interfaced to it by carrying out an extensive amount of computation at the focal plane itself, and transmitting only the result of this computation, rather than large amounts of data representing the raw input image. Although the approach followed here is very similar to that of previously published work, the tracking architecture we implemented differs from previously
proposed ones in two key features: it selects high-contrast edges independent of
the absolute brightness of the scene (as opposed to simply selecting the scene’s
brightest region [13, 32, 87]); and it uses a hysteretic WTA network, with positive
feedback and lateral coupling, to lock-onto and smoothly track the selected targets
(different from WTA networks used in other tracking devices [13, 45, 87]).
These features allow systems that use the architecture proposed here to reliably
track natural stimuli in a wide variety of illumination conditions.
5.1 A one-dimensional tracking chip
The tracking architecture proposed here is structured in a hierarchical way and can be implemented on a single-chip device. As the architecture is one-dimensional, we can design thin, long processing columns so as to optimize the area used and increase the number of pixels on the device. Two chips of approximately 2mm × 2mm were fabricated using a standard 2µm and 1.2µm CMOS technology respectively. The processing columns of each chip are 60λ wide, where λ is the scalable CMOS design rule parameter, corresponding to 1µm for the 2µm process and to 0.6µm for the 1.2µm process. As the circuits are analog and some circuit elements (such as capacitors) don't scale with λ, the layouts of the two chips are slightly different (although their schematic diagrams are identical). The 2µm chip
has a pixel pitch of 60µ m and contains 25 processing columns, while the 1.2µ m
has a pixel pitch of 36µ m and contains 40 processing columns.
Figure 5.1: Block diagram of single-chip tracking system. Spatial edges are detected at
the first computational stages by adaptive photoreceptors connected to transconductance
amplifiers. The edge with strongest contrast is selected by a winner-take-all network and
its position is encoded with a single continuous analog voltage by a position-to-voltage
circuit (see Section 5.1.6).
5.1.1 System Architecture
Image brightness data is processed in parallel through five main computational
stages. A block diagram of the device’s architecture is depicted in Fig. 5.1. The
first stage is an array of adaptive photoreceptors [22] that map logarithmically
image intensity into their output voltages. The second stage is composed of an
array of simple transconductance amplifiers, operated in the subthreshold regime,
which receive input voltages from neighboring photoreceptors [80]. The amplitude of their output currents encodes the contrast intensity of edges and the sign encodes
their polarity. At the third computational stage the polarity of each edge is gated
so that the sensor selectively responds either to ON edges (dark to bright transitions), or to OFF edges (bright to dark transitions) or to both. The fourth stage
uses a hWTA network (see Section 4.2) which selects and locks onto the feature
with strongest spatial contrast moving at the speed that best matches the photoreceptor’s velocity tuning. Finally in the last stage there is a position-to-voltage
circuit [27], that allows the system to encode the spatial position of the WTA network’s output with a single analog value. The 1.2µ m chip layout of these circuits
is shown in Fig. 5.2.
Figure 5.2: Portion of the layout of the 1.2µm chip containing 7 processing columns. The size of each computational stage is indicated on the right.

Fig. 5.3 summarizes the general response properties of the 2µm chip by showing the outputs of the different computational stages described above. The top trace of Fig. 5.3(a) shows the responses of the array of adaptive photoreceptors to
a black bar on a white background, imaged onto the chip’s surface using a standard
CS-mount 4mm lens with an f-number of 1.2. The two lower traces of the figure show the response of the edge-polarity detector circuits, representing the spatial derivative of the input stimulus. Fig. 5.3(b) shows the response of the position-to-voltage circuit to 11 different winning pixel positions. The figure's inset displays
11 snapshots of the WTA response to the 11 corresponding spatial positions of the
input stimulus.
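To make the flow of signals through the five stages explicit, the following purely behavioral Python sketch strings them together; the gains, thresholds, hysteretic current and sign conventions are invented placeholders, and effects such as photoreceptor temporal adaptation and lateral coupling in the hWTA are ignored:

    import numpy as np

    def tracking_pipeline(brightness, prev_winner=None, polarity="ON",
                          gm_sat=1.0, i_hyst=0.2):
        # 1) adaptive photoreceptors: logarithmic encoding of image brightness
        v_phot = np.log(brightness)
        # 2) spatial derivative: tanh of neighboring photoreceptor differences
        i_diff = gm_sat * np.tanh(np.diff(v_phot) / 0.1)
        # 3) edge-polarity gating (the sign convention for ON edges is an assumption)
        if polarity == "ON":
            i_edge = np.maximum(i_diff, 0.0)
        elif polarity == "OFF":
            i_edge = np.maximum(-i_diff, 0.0)
        else:
            i_edge = np.abs(i_diff)
        # 4) hysteretic WTA: the previous winner gets a bonus before the comparison
        eff = i_edge.copy()
        if prev_winner is not None:
            eff[prev_winner] += i_hyst
        winner = int(np.argmax(eff))
        # 5) position-to-voltage encoding: map the winner index to one analog value
        v_out = winner / (len(i_edge) - 1)
        return winner, v_out

    # Dark bar on a bright background imaged onto a 40-pixel array.
    scene = np.full(40, 100.0)
    scene[18:24] = 10.0
    print(tracking_pipeline(scene))   # reports the dark-to-bright edge of the bar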
5.1.2 Adaptive Photoreceptor Circuit
This photoreceptor circuit, originally designed by Tobi Delbrück [22], has been
used extensively in many neuromorphic sensors. The response of the circuit is
invariant to absolute light intensity (changing logarithmically with image brightness).
Figure 5.3: (a) Response of the array of adaptive photoreceptors to a black bar on a
white background (upper trace) and output traces of the edge-polarity detector circuit
(lower traces); (b) Output characteristic of the position-to-voltage circuit. The figure’s
inset contains snapshots of many output traces of the WTA network superimposed, as a
stimulus was moving from left to right. The data points in the main figure represent the
output of the circuit corresponding to the pixel position of the winner in the inset data.
The adaptive photoreceptor exhibits the characteristics of a temporal band-pass filter, with adjustable high- and low-frequency cut-off values. Fig. 5.4 shows
the response of the array of photoreceptors to a moving bar, for two different
adaptation settings. In Fig. 5.4(a) the adaptation rate was low with adaptation
time constants in the order of hundreds of milliseconds. In Fig. 5.4(b) the adaptation rate was very high such that the photoreceptors adapt quickly to brightness
transients. Because of its adaptation property, the photoreceptor biased in this way has a response that depends on both the contrast and the speed of the stimulus.
5.1.3 Spatial Derivative Circuit
The spatial derivative is implemented using simple transconductance amplifiers operated in the subthreshold regime. The amplifiers receive input voltages from
neighboring photoreceptors and provide a bidirectional output current that is proportional to the hyperbolic tangent of their differential input [80]. The output
current saturates smoothly as the differential voltage increases (in absolute value)
beyond 200 − 300mV. The possibility of electronically smoothing the input image
(at the adaptive-photoreceptors stage) allows the user to operate the spatial derivative circuit always in its linear range, for a stimulus with fixed spatial frequencies.
Furthermore, the presence of multiple stimuli with contrast high enough to saturate the transconductance amplifiers' currents does not compromise the sensor's tracking performance, as the WTA network is able to lock onto the feature
selected (see Section 5.1.5).
5.1.4 Edge-Polarity Detector Circuit
The polarity of edges in the visual scene is encoded by the sign of the transconductance amplifiers’ currents. Each of these currents is fed into a circuit of the
type shown in Fig. 5.5. The amplifier in the left part of Fig. 5.5, together with transistors M1 through M6, implements a current conveyor [111]. This circuit is used to separate the positive component of the input current Idiff from the negative one, and to decouple the spatial derivative stage from the current-polarity selection stage. Negative input currents are conveyed to transistor M6, while positive ones are flipped through the current mirror M4,M5 and conveyed to M8. Transistors M6 and M8 source their currents to the polarity selection circuit (transistors M9-M12) [45]. The output current of the polarity selection circuit Iedg represents OFF edges (the positive component of Idiff), ON edges (the negative component of Idiff) or either type of edge (the absolute value of Idiff), depending on the settings of the control voltages VCTRL and VREF. The voltage VBIAS on the positive node of
the amplifier is a constant used to bring the circuit into its correct operating point
and (in typical operating conditions) assumes values ranging from 1V to 2.5V.
The output currents Iedg of all edge-polarity detector circuits are sourced, in parallel, to the elements of the next processing stage: the hysteretic winner-take-all
network.
Figure 5.4: (a) Response of the array of photoreceptors, with a very slow adaptation rate,
to a dark bar on a white background moving from right to left with an on-chip speed of
31mm/s. The DC value of the response has been subtracted. (b) Response of array of
photoreceptors with a fast adaptation rate to the same bar moving at the same speed (left
pointing triangles) and at a slightly slower speed (upward pointing triangles).
5.1.5 Hysteretic WTA Network
This circuit is the one described in Section 4.2. The hysteretic WTA network implemented on these chips contains an additional cell connected to an external bias. This additional cell can be used to set a threshold for the spatio-temporal contrast of edges present in the scene: if the input from the external bias is higher than all other inputs, the WTA will signal the absence of high-contrast edges in the visual scene.
Figure 5.5: Circuit diagram of the current polarity detector. Positive Idiff currents are conveyed to the n-type current mirror M4,M5. Negative Idiff currents are conveyed to M6 through the p-type current mirror M1,M6. Depending on the values of the control voltage signals VCTRL and VREF, the output current Iedg represents a copy of only one of the two polarities of Idiff, or of both polarities of Idiff (see text for details).
Figure 5.6: Response of the WTA network to the ON-edge of a bar moving from left
to right at an on-chip speed of 31mm/s. The top trace represents the currents Isum of the
WTA array while the bottom trace represents the voltage outputs of the array of adaptive
photoreceptors.
The option of introducing hysteresis in the WTA network might cause problems
in dynamic environments in which it is necessary to update the winning pixel position continuously (e.g. in the domain of tracking applications). One solution would be to reset the WTA network manually any time it needs to be updated [106]. A more elegant solution is to use lateral coupling between cells, as described in Section 4.3. Cells adjacent to the winning pixel are thus facilitated in the winner computation process, whereas cells in the periphery are inhibited. This solution relies on the assumption that the features being selected move continuously in space, and ensures that once the WTA network has selected a target and is engaged in visual tracking, it locks onto it and is not distracted by other stimuli in the periphery. Fig. 5.6
shows an example of the response of the WTA network on the 2µ m tracking chip
to a moving high-contrast bar. The top trace of the figure represents the net input
current to the WTA network, and shows the effect of spatial smoothing of the sum
of input currents with the hysteretic current from the winner’s positive feedback
loop. It is clear from this figure that the active winning cell is the one corresponding to pixel 26. The bottom trace shows the istantaneous response of the adaptive
photoreceptor array. The input stimulus was the same one used for the previous
figures: a 1cm-wide black bar on a white background positioned at approximately
17cm away from the focal plane and imaged onto the chip through a 4mm lens
moving from left to right with an on-chip speed of 31mm/s.

Figure 5.7: Schematic diagram of the position-to-voltage circuit. Example of three neighboring cells connected together.
5.1.6 Spatial Position Encoding Circuit
This circuit consists of a series of voltage followers, using a common global current mirror which receive inputs from a linear resistive network [27] (see Fig. 5.7).
The currents Iout generated by the WTA network at the previous stage are used as bias currents for the followers. As only one Iout_i is non-zero at any given
67
CHAPTER 5. NEUROMORPHIC VISION SENSORS AS SINGLE CHIP
SELECTIVE ATTENTION SYSTEMS
Figure 5.8: Picture of the stand-alone tracker board. The neuromorphic sensor is on the
chip beneath the lens. On the left part of the board there is an array of potentiometers used
to bias the chip’s control voltages. On the top there is an LED display, comprising three
display bar lines with their corresponding drivers. The scale in the left part of the figures
is in millimeters.
time, all followers are switched off except for the one connected to the winning
WTA cell. The output of the spatial position encoding circuit Vout thus represents
the position of the winning cell in the array.
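The following sketch illustrates, at a purely behavioral level, the position encoding performed by this circuit: the index of the single active follower selects a tap voltage on a linear divider. The endpoint voltages are assumed values, not measured chip parameters.

    import numpy as np

    def position_to_voltage(i_out, v_low=1.0, v_high=4.0):
        """Behavioral sketch of the position-to-voltage readout.

        i_out : array of WTA output currents; only the winner's entry is non-zero.
        Returns the voltage tapped from a linear resistive divider at the position
        of the single active follower (endpoint voltages are illustrative).
        """
        i_out = np.asarray(i_out, dtype=float)
        winner = int(np.argmax(i_out))       # index of the only active follower
        n = len(i_out)
        return v_low + (v_high - v_low) * winner / (n - 1)

    currents = np.zeros(40)
    currents[26] = 1e-9                      # pixel 26 wins (cf. Fig. 5.6)
    print(position_to_voltage(currents))     # -> 3.0 V for the pixel-26 winner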
5.2 Stand Alone Visual Tracking Device
We attached a 4mm lens to the 2 µm chip and mounted it on a board with external potentiometers, used to set its bias voltages. The board also has a one-dimensional LED display with its driver (see Fig. 5.8). The LED display is used to provide visual feedback on the position of the feature selected by the chip. The power supply to the whole board is provided by a 9V battery (attached to the back of the board) and a voltage regulator IC.

Figure 5.8: Picture of the stand-alone tracker board. The neuromorphic sensor is on the chip beneath the lens. On the left part of the board there is an array of potentiometers used to bias the chip's control voltages. On the top there is an LED display, comprising three display bar lines with their corresponding drivers. The scale in the left part of the figure is in millimeters.
The system is able to detect and report in real time the position of realistic
types of stimuli moving within its field of view. It performs reliably in a wide
variety of illumination conditions, ranging from dim artificial room illumination
to bright sunlight, thanks to the adaptive properties of the photoreceptors at the input stage. For this application the bias settings of the photoreceptor stage are
those of fast adaptation rates, as described in Section 5.1.2. Lateral coupling
between neighboring cells was turned off at the photoreceptor stage but turned on
at the WTA level (Vex of Fig. 6.6 was set to 1.2V). Smoothing at the WTA level
was useful to reduce the offsets introduced by the spatial-derivative and edge-polarity detector circuits. The hysteretic current of the WTA network (summed
back into the input nodes through the positive-feedback path) was set to be a small
fraction of the maximum possible feed-forward input current (controlled by the
bias voltage of the spatial-derivative transconductance amplifier). All other bias
parameters on the chip were not critical and were set to reasonable subthreshold
voltages (i.e. [0.5V − 0.8V ] for n-type transistors and [4.4V − 4.1V ] for p-type
transistors). Biased in this way, the system adapts out the background of a stationary scene and selects high-contrast moving targets present in its field of
view, tracking them as they move smoothly in space. Fig. 5.9(a) shows the output
of the chip in response to a finger moving back and forth in front of the lens in a
laboratory environment with cluttered background. Fig. 5.9(b) shows the output
of the chip in response to a black pen moving at a speed of almost 8000 pixels/s on
a uniform background. As mentioned in Section 5.1, each pixel of the 2 µm chip is 60 µm wide, and thus the velocity of the target on the focal plane corresponds to
approximately 0.5m/s. The output of the chip is continuous in time, but discrete in space: the discrete jumps present in Fig. 5.9 represent the shifting of the winning position from one pixel to the next.

Figure 5.9: (a) Output of the system in response to a finger moving back and forth in front of the chip; (b) Output of the system in response to a pen moving at approximately 8000 pixels/s on a stationary light background. Note the different time scales on the abscissae.
5.3 Active Tracking System
We implemented a fully analog active tracking system, by mounting a board with
the 1.2 µm tracker chip and a 4mm lens onto a DC motor (see Fig. 5.10). The
bias settings of the chip were the same used in Section 5.2, except for the value
of the hysteretic current in the positive-feedback path of the WTA network, which
was set to be greater than the feed-forward current Iedg . Specifically, the WTA
bias voltage Vb was set to a value slightly higher than the bias voltage of the
spatial-derivative transconductance amplifier, and the source voltage of the p-type
transistor of the positive-feedback current mirror (Vgain in Fig. 6.6) was set to 5V .
In this way the WTA network locks onto the selected target and allows only the
nearest-neighbor units to win if the selected stimulus moves (see also Fig. 5.6 in
Section 5.1.5). The position-to-voltage circuits were biased to encode the position
of the winner with voltages ranging from 1 to 4 Volts. The analog output of the
chip was rescaled and amplified (via an ST L272 power amplifier), such that the
selection of features in the right part of the visual field produces positive voltages
and the selection of features in the left part of the visual field produces negative
voltages. The output voltage, with an amplitude directly proportional to the distance of the target’s position from the center of the retina, is used to drive the
DC motor. The sensory-motor loop so designed implements a negative feedback
system which attempts to zero the motion of the target on the retina: if a target
appears in the periphery of the visual scene, the sensor will drive the DC motor so
as to orient the sensor’s gaze toward the target. As the projection of the target on
the retina approaches the center of the pixel array, the output of the system (i.e. the motor's power supply) decreases towards zero, bringing the motor to a stop.

Figure 5.10: Picture of the tracker chip mounted on a DC motor. The output of the chip is sent to a dual-rail power amplifier which directly drives the motor.
In terms of equations we can write, to a first order approximation:

    y(t) = F x(t) − θ(t)
    θ̇(t) = A y(t)                                            (5.1)

where x(t) represents the position of the target in the visual space, y(t) represents its corresponding projection on the retina, θ the rotation angle produced by the DC motor around its axis and F the optical magnifying factor (see Fig. 5.11(a)). The term θ̇(t) corresponds to the motor's angular velocity, and A to the open-loop gain of the feedback system. Solving for ẏ(t) we obtain:

    ẏ(t) = F ẋ(t) − A y(t)                                    (5.2)
If the system is successful in zeroing the motion of the target on the retina
(ẏ(t) = 0) we should measure a retinal slip y(t) directly proportional to the velocity
of the target in the visual space. Fig. 5.11(b) shows traces obtained from the
system, while it was engaged in tracking a swinging target. The target stimulus
was a black bar on a white background, similar to the one used to characterize
the adaptive photoreceptor circuit, in Section 5.1.2. The position of the target in
visual space was measured optically by the stand-alone tracker board described
in Section 5.2. The target’s velocity was computed off-line by differentiating the
discretized position signal (hence the jitters in the figure). As shown, the measured
response matches, to a first order approximation, the theoretical prediction.
The task performed by the system described here is that of smooth pursuit [101]. This model does not take into account the velocity of the target, but only its position. More elaborate models of smooth-pursuit tracking have been proposed [32, 45], but none using fewer components (namely a neuromorphic CMOS sensor, a DC motor, a power amplifier and a dual power supply). The system presented here can be considered the minimal, lowest-cost and most compact solution to 1-D visual tracking of natural stimuli.
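A minimal numerical simulation of the loop described by Eqs. (5.1) and (5.2) is sketched below; the magnification F and the open-loop gain A are arbitrary illustrative values.

    import numpy as np

    # Minimal simulation of the first-order smooth-pursuit loop of Eqs. (5.1)-(5.2).
    F, A = 1.0, 20.0
    dt, T = 1e-3, 2.0
    t = np.arange(0.0, T, dt)
    x = 0.1 * np.sin(2 * np.pi * 1.0 * t)   # swinging target position in visual space
    theta = 0.0                             # motor angle
    y = np.zeros_like(t)                    # retinal projection of the target

    for k in range(len(t)):
        y[k] = F * x[k] - theta             # Eq. (5.1): retinal slip
        theta += A * y[k] * dt              # Eq. (5.1): the motor integrates the slip

    # Once the loop has settled, the retinal slip is roughly proportional to the
    # target velocity (y ~ F*x_dot/A), as observed in Fig. 5.11(b).
    x_dot = np.gradient(x, dt)
    print(np.corrcoef(y[500:], x_dot[500:])[0, 1])   # high correlation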
5.4 Roving Robots
An application domain that is well suited for the visual tracking chip is that of
vehicle-guidance and autonomous navigation. These types of tasks in fact require
compact and power-efficient computing devices which should be robust to noise,
tolerant to adverse conditions induced by the motion of the system (e.g. to jitter
and camera calibration problems) and possibly able to adapt to the highly variable
properties of the world. To test our tracking sensor within this framework, we
successfully interfaced it to several types of robotic platforms, ranging from Koala
(K-Team, Switzerland) rovers to LEGO toys (see Fig. 5.12).
In these applications the computationally expensive part of the processing (involving visual preprocessing and target selection) is done in real time by the neuromorphic sensor. Using simple control algorithms, in conjunction with these types of sensors, roving robots are able to reliably track lines randomly laid out on the floor, for a wide variety of conditions (e.g. floors with different textures, cables of different colors and sizes, extreme illumination conditions, etc.).
Quantitative measurements were carried out using the Koala (K-Team, Lausanne) mobile robot, measuring the performance of the overall system in a
line-following task. The Koala robot measures 32cm in length, 31cm in width and
is 11cm high. It has an on-board Motorola 68331 processor, 12 digital I/O ports
and 6 analog inputs (with 10bit A/D converters), 1 MByte of RAM, and two to
three hours of autonomous operation from its battery. The tracking sensor was
mounted onto a wire-wrap board together with a 4mm lens with an f-number of 1.2, and it was attached to the front of Koala with the lens tilted towards the ground at an angle of approximately 60°, so as to image onto the retinal plane the features present on the floor approximately 10cm ahead (see Fig. 5.13(a) and Fig. 5.14(a)).
Figure 5.11: (a) Setup of the active tracking system as seen from above. The angle θ
represents the angular displacement produced by the DC motor, x represents the target’s
position in the visual space, y represents the distance of the target’s projection on the retina
from its center. The angular velocity θ̇ is proportional to y. (b) Chip data measured as the
system was engaged in tracking a swinging bar. The bar’s position (circles) was measured
using a separate (fixed) tracking board, while its velocity (solid line) was computed offline from the discretized position data. The crosses represent the output of the active
sensor used to drive the system’s DC motor.
The bias settings of the chip were the same ones used in the analog active tracking
system, described in Section 5.3. For this specific application example we made
use of the additional node of the WTA network with its input current set by an
external potentiometer. This allowed us to set a threshold value against which we
could compare the contrast of edges present in the visual scene. In the case of
Figure 5.12: Tracker chip mounted on a LEGO robot performing a “target exploration
task”. Using very little CPU power, this robot is able to simultaneously explore (make
random body/head movements), attend (orient the sensor toward high-contrast moving
edges) and pursue (drive towards the target).
absence of lines to follow, the WTA network selects the external input and the
sensor outputs a unique voltage different from the set of voltages generated by
visual stimuli. The output voltage of the tracking chip is directly applied to one of
the analog input ports of the robot and digitized. To implement the line-following
task Koala uses a very simple control algorithm which reads the tracking chip’s
output Vout and backs up in a random direction if no edge is found. If on the other
hand the tracker chip detects an edge and outputs a valid voltage, the algorithm
shifts and re-scales Vout so that the variable encoding the edge position pos is
zero when the target is in the center of the chip’s visual field; it sets the forward
component of the velocity fwd to a value weighted by a Gaussian function of
pos (fwd is maximum when pos=0 and it decays as |pos| increases); it sets
the rotational component of the velocity rot to a value proportional to pos; and
finally it executes motor commands sending fwd and rot directly to the robot’s
motors. Scaling the forward component of the velocity fwd by a Gaussian function of the line's eccentricity allows the robot to slow down in curves. If the line goes out of the field of view of the sensor (e.g. in the presence of steep curves), the algorithm forces the robot to stop and back up until it again finds a line to follow. The line-tracking algorithm makes very little use of the on-board CPU's processing power (leaving it free for other CPU-time demanding processes). The computationally expensive part of the processing (involving visual preprocessing and target selection) is done in real time by the neuromorphic sensor. Using this simple control algorithm, in conjunction with these types of sensors, the robot is able to reliably track lines randomly laid out on the floor, for a wide variety of
conditions (e.g. floors with different texture, cables of different colors and sizes,
extreme illumination conditions, etc.) [57]. Depending on the bias settings of the
edge-polarity detector circuit, the line-following robot will always make left turns
at road-forks (e.g. if the circuit is selective to OFF edges and the line is darker
than the background) or right-turns. The bias settings can be changed at run-time
by the robot using one of its digital I/O ports.
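The control loop described above can be summarized by the following sketch. The robot interface and all numerical constants (the no-edge voltage, the Gaussian width, the speed gains) are hypothetical placeholders; only the structure of the algorithm follows the text.

    import math, random

    NO_EDGE_V = 4.5        # assumed voltage output when the WTA selects the external bias
    FWD_MAX, K_ROT, SIGMA = 20.0, 10.0, 0.3   # illustrative gains

    def control_step(v_out):
        if abs(v_out - NO_EDGE_V) < 0.1:
            # no edge found: back up in a random direction
            return -FWD_MAX, random.uniform(-K_ROT, K_ROT)
        pos = (v_out - 2.5) / 1.5                            # shift and rescale: 0 = center
        fwd = FWD_MAX * math.exp(-pos**2 / (2 * SIGMA**2))   # slow down in curves
        rot = K_ROT * pos                                    # steer toward the line
        return fwd, rot

    print(control_step(2.5))   # line centered: full speed, no rotation
    print(control_step(3.4))   # line off to one side: slower and turning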
Fig. 5.13 shows the robot in the process of tracking a line. The line (a high-contrast bar laid onto the floor) is approximately 323cm long and forms a closed elliptical loop with a major axis of roughly 110cm and a minor axis of 90cm. The robot followed the line with an average speed of 5 loops/min (corresponding roughly to 27cm/s). To measure the robot's performance quantitatively,
we stored a sequence of images (sampled at a rate of 4 frames/s) and applied them
in input to the Kanade-Lucas-Tomasi Feature Tracker [105]. The data was taken
in dim natural light conditions (typical of a cloudy rainy day in Zurich, Switzerland). Fig. 5.13(b) shows the features tracked by the algorithm for a sequence of
150 frames (in which the robot completed 4 loops). The features selected by the
algorithm correspond to a (moving) black cross drawn on the robot’s white top.
Closely grouped features indicate the re-visitation of nearby positions over time.
Features are denser in the steep parts of the curve because of the slower speeds that the robot uses there, as determined by its control algorithm.
Fig. 5.14 shows an experiment similar to the one described in Fig. 5.13, but run in a different, less controlled environment. The robot was following a line of white paper adhesive tape laid on a light-blue carpet, forming a figure-eight in an area of approximately 1.3 × 2.5 meters. The illumination conditions were of bright natural sunlight (typical of sunny summer days in Telluride, Colorado).
The robot was partially covered with a sheet of paper containing bars and crosses
(see Fig. 5.14(a)). The Kanade-Lucas-Tomasi tracking algorithm selects different
corners of the crosses as the robot changes its orientation. Fig. 5.14(b) shows
the output of the tracking algorithm for a sequence of 200 images, sampled at
intervals of approximately 1s, in which the robot makes two full loops around the figure-eight. As in Fig. 5.13(b), the white squares are denser in the steeper parts of the curve because the robot slows down at those points. The robot is able to follow the line reliably in both directions, always passing through the intersection of the figure-eight, for a wide selection of (maximum) speeds. At high speeds the robot occasionally loses the line (in the steep parts of the curve), comes to a stop, backs up and starts following the line again until it reaches the shallow parts of the curve, where it speeds up again to the maximum speed.
Figure 5.13: (a) Koala robot with neuromorphic sensor mounted on its front. (b) Positions of Koala following a line, sampled at intervals of 0.25 seconds for a period of 37.5 seconds, in which the robot completed 4 loops. The features (white squares) were obtained by tracking a dark cross drawn on the white top of Koala.

Figure 5.14: (a) Koala robot with neuromorphic sensor mounted on its front and a white sheet of paper with crosses attached on its top, seen from above. (b) Positions of Koala following a white line on a light-blue carpet floor, sampled at intervals of one second over a period of approximately 3 minutes. The features (white squares) were obtained by tracking the bars appearing on the top part of Koala (see text for explanation).
5.5 Extensions of 1-D tracking sensors
As the visual processing circuits operate in a fully parallel way, and the hysteretic
WTA circuit relies on a global competition mechanism that requires one single
node for the whole array, tracking architectures of the type described above can
easily be extended to two dimensions [13, 32, 56].
Figure 5.15: Two-dimensional tracker chip architecture.
5.6 A 2-D tracking sensor
The 2-D tracking sensor we present here is an extension of the 1-D devices described in the previous sections. It comprises a core array of 26 × 26 pixels arranged on a hexagonal grid, and peripheral analog and digital input/output (I/O)
circuits (see Fig. 5.15). Each pixel contains a photosensing stage, a hysteretic
WTA circuit, and interfacing I/O circuits. The photosensing stage used in this
sensor differs slightly from the one used in the 1D sensors, in that the adaptive
photoreceptor circuits respond to contrast transients (rather than to absolute contrast). At the output stage, the chip comprises digital output circuits, next to the
analog P2V circuits, to encode the position of the winner. The chip also has
on-chip scanners and address decoders to report the DC response of the adaptive photoreceptor array serially (e.g. for displaying images on monitors) or in a
random-access mode (e.g. for reading out sub-regions of the image). The input
address decoders can be directly connected to the chip’s digital outputs (encoding
the position of the winning pixel) for selectively reading the photoreceptor output
of just that pixel and displaying only the part of the image that is of interest. Regions of interest can be selectively accessed by addressing small windows around
the winning pixel’s address.
Figure 5.16: Differentiating adaptive photoreceptor circuit.
5.6.1 The differentiating adaptive photoreceptor
The photoreceptor circuit with its readout circuitry is shown in Fig. 5.16. The photoreceptor consists of a photodiode D in series with a transistor Mfb in source-follower configuration, and a negative feedback loop from the source to the gate of Mfb [22]. The feedback loop consists of a high-gain inverting amplifier in common-source configuration (Mn, Mp) [22] and a thresholding and rectifying temporal differentiator stage (Mon, Moff, C) [68]. A sufficiently large positive irradiance change activates a transient current Ion onto capacitor C, which is converted into a voltage Vdt by the diode-connected transistor Mdt. The photoreceptor voltage Vout can be read out by the address decoder as Vprd, if the address decoder select lines Vdx and Vdy are high. The voltage Vout can also be read out by the on-chip scanner circuit, via Vprs, to display the sensor output on monitors. The photosensing sub-circuit, developed by Kramer, has been analyzed in detail in [66]. The voltage Vdt is used to provide input to the locally connected WTA cell.
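At a behavioral level, the thresholded and rectifying temporal differentiation performed by this stage can be sketched as follows; the threshold and gain are illustrative values, and the model only captures the fact that a transient ON output is produced for sufficiently large positive irradiance changes.

    import numpy as np

    def on_transient(log_irradiance, dt=1e-3, threshold=0.5, gain=1.0):
        """Transient ON response: threshold and rectify the temporal derivative
        of the log irradiance (parameter values are illustrative)."""
        d = np.gradient(log_irradiance, dt)           # temporal derivative of log(I)
        return gain * np.maximum(d - threshold, 0.0)  # threshold + rectify (ON only)

    t = np.arange(0.0, 1.0, 1e-3)
    log_i = np.where(t < 0.5, 0.0, 1.0)               # positive irradiance step at 0.5 s
    i_on = on_transient(log_i)
    print(i_on.max() > 0, i_on[:400].max() == 0)      # a transient appears only at the step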
5.6.2 The 2-D hysteretic winner-take-all circuit
The basic cell of the 2D hysteretic WTA network is shown in Fig. 5.17. It is the
2D extension of the circuit described in Section 4.2. The output current Ion of the
photoreceptor stage of Fig. 5.16 is mirrored by Min into node Vex . If the input
Figure 5.17: Hysteretic WTA circuit with spatial coupling.
current to the considered pixel is the strongest, the cell “wins” and transistors Mcx
and Mcy source an output current proportional to the circuit’s bias current, set by
Vwtab , bringing the output voltages Vcx and Vcy high. Voltages Vcx of all pixels
belonging to common columns are tied together, and voltages Vcy of all pixels
belonging to a common row are tied together. A copy of the WTA bias current,
attenuated exponentially by the bias voltage Vgain is fed back into the input node,
via Mwfb . Transistors Mht , Mhb , and Mhr diffuse the currents coming from Min
and Mwfb to the Vex nodes of the three (top, bottom, and right) neighboring cells.
The bias voltage Vh is used to tune the diffusion space constant and to control
the amount of lateral excitatory coupling. Conversely, transistors Mlt, Mlb, and
Mlr implement the inhibitory coupling among neighboring cells. The bias voltage
Vl is used to control the spatial extent of lateral inhibition. If Vl is set to Vdd ,
inhibition is global, and only one pixel in the whole array can win.
The current flowing through Mnet represents the net current that the WTA cell
is receiving, corresponding to the sum of the input current from the photoreceptor
circuit, the positive-feedback current and the diffused excitatory currents. The
voltage Vnet , logarithmically proportional to this net current, can be scanned out
to image the overall network activity and view the relative effects of positive feedback current modulation (Vgain ), and excitatory and inhibitory coupling modulations (Vh and Vl respectively).
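The row/column readout described above can be summarized by the following sketch: the winning pixel drives its shared Vcx and Vcy lines, and the periphery simply reports which column and row lines are active. The stimulus values used here are arbitrary.

    import numpy as np

    def winner_row_col(input_map):
        """Return the (x, y) coordinates of the winner as read from the shared
        column (Vcx) and row (Vcy) lines; a behavioral sketch only."""
        row, col = np.unravel_index(np.argmax(input_map), input_map.shape)
        vcy = np.zeros(input_map.shape[0], dtype=bool)  # one shared line per row
        vcx = np.zeros(input_map.shape[1], dtype=bool)  # one shared line per column
        vcy[row], vcx[col] = True, True                 # only the winner drives them
        return int(np.argmax(vcx)), int(np.argmax(vcy)) # (x, y) read at the periphery

    stimulus = np.random.rand(26, 26) * 0.1
    stimulus[7, 19] = 1.0                               # strong transient at (y=7, x=19)
    print(winner_row_col(stimulus))                     # -> (19, 7)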
Figure 5.18: Two-input pass-transistor demultiplexer. The voltage on Vc is routed either
to VP2V (if Vsel is high) or to VENC (if Vsel is low).
5.6.3 Peripheral I/O circuits
This device has analog position-to-voltage (P2V) circuits and digital position encoding circuits for reading out the output of the WTA network; furthermore, there
is an on-chip scanner circuit [82], for displaying on monitors the outputs of all
photoreceptors, and/or the state of the WTA network activity (see Vnet described
above); and there are input address decoders for accessing the analog output voltage of individual photoreceptors.
WTA output: The voltages Vcx and Vcy of Fig. 5.17 are routed to the periphery
of the architecture core, and fed into a two-input pass-transistor demultiplexer
(see Fig. 5.18). Depending on the value of Vsel (see figure), Vcx and Vcy are routed
either to the analog P2V circuits, or to the position (address) encoders. In this way
only one of the two (analog or digital) modes can be used at one time, but wiring
and possible sources of cross-talk noise are minimized.
Scanner circuits: The scanner reads the output voltages V prs and Vnet of the
array in the sequence used for standard electronic cameras. Each output voltage
of each pixel is buffered via a source follower consisting of an input transistor
(Msp for Vprs and Mnetsf for Vnet ) and a current source that is common to each
column and signal. A vertical shift register sequentially addresses the rows with
the binary voltage signal Vscan via switching transistors (Msnp for Vprs and Msnw
for Vnet ), such that each source follower is driven by the signal of a single pixel
at a time. The output voltages of the column source followers are transferred to
a common output line for each signal via complementary pass transistors that are
sequentially opened, column by column, by a horizontal shift register. The clocks
of the two shift registers are synchronized, such that the output voltages of the
entire array are sequentially read out, row by row. The voltages on the common
lines are buffered to be sensed off chip.
Figure 5.19: Output of the analog P2V circuits in response to a target moving from the
right top corner to the bottom central part of the sensor’s field of view. The bottom trace
(Vx) reports the x position of the target. The top trace (Vy), offset in the plot by 5V for the sake
of clarity, reports the y position of the target. The inset shows Vy versus Vx .
Address decoders: When properly driven, the chip’s input address decoders
activate the select lines Vdx and Vdy of the addressed pixel (see Fig. 5.16) and
route the voltage Vprd of that pixel to a unity-gain follower connected to an analog output
pad.
5.6.4 Experimental results
In Fig. 5.19 we show experimental results obtained by enabling the analog P2V
circuits (by setting Vsel of Fig. 5.18 high) and measuring their output voltages Vx
and Vy encoding the x and y position of the winning pixel. The WTA network
was biased in a way to have local excitation (Vh of Fig. 5.17 was set to 0.8V)
and global inhibition (Vl was set to Vdd ). The measurement shows the sensor’s
response to a target appearing in the upper right corner of the sensor’s field of
view and quickly moving downward and to the right. Before the target appeared,
the sensor’s output was sitting around Vx ≈ 0V and Vy ≈ 0V. This is because the
bottom-left pixel (0, 0) receives an additional input current, set by an external bias
voltage Vthr , that sets a global threshold: if no visual stimulus is strong enough
to overcome this threshold, the output is always “zero”. As soon as the target
appeared in the sensor’s field of view, the WTA network switched winner, and
the P2V circuits modified Vx and Vy accordingly. The response time of the WTA
and P2V circuits combined, at the onset of the stimulation, is about 200 µs. The
switching time, required to report a change of winner from one pixel to its nearest
neighbor, is around 15 µs.

Figure 5.20: Output of the analog P2V circuits in response to a target moving from the bottom left corner to the top right one, on to the top left, to the bottom right, and back to the bottom left corner.
In Fig. 5.20 we show the response of the sensor to a target appearing in
the bottom left corner of the field of view, slowly moving to the top right corner and then completing a figure-eight pattern. Note the different time scales in
Figs. 5.19 and 5.20.
In both experiments the target was the light spot of a laser-pointer shone on a
flat surface 30cm from the chip’s focal plane. Images were focused onto the focal
plane using an 8mm lens with an f-number of 1.2. The sensor's response does
not depend on the background onto which the target is overlaid, nor does it change
with absolute background illumination.
By switching the state of the demultiplexer connected to the WTA outputs we
disabled the analog P2V circuits and enabled the asynchronous address encoders.
Figure 5.21 shows the response of two address lines (the least significant and
second-least significant bits of the X address) in response to the same stimulus
of Fig. 5.20 moving from right to left. The non-uniform pulse widths are due to
the asynchronous response of the circuit to the variable speed of the stimulus. In
a second experiment, we placed the sensor in front of a CRT monitor, showed a
white box performing a circular motion on a black background, and sampled the
chip’s address encoder outputs every 25ms over a period of 40s. In this period the
target made 16 full revolutions. The histogram of the sampled addresses is shown
in Fig. 5.22. As the global threshold was set relatively high, address (1, 1) was
selected most often (193 samples, off-scale in the figure).
Figure 5.21: Output of the least significant bit (bottom trace) and second-least significant bit (top trace, displaced by 6V) of the X address in response to a target moving from right to left.

    Fabrication technology                  0.8 µm CMOS 2P 2M
    Resolution                              26×26
    Fill factor                             1.2%
    Pixel size                              84.8 µm × 62.6 µm
    Die size                                3.22 mm × 2.56 mm
    Power supply voltage                    single 5 V
    Power consumption
        scanned output                      18.6 mW
        digital output (scanners off)       1.1 mW
        analog output (scanners off)        600 µW

Table 5.1: Characteristics of the visual tracking sensor.

The response time of the sensor to the sudden appearance of a target is 1.2 µs when the digital outputs are enabled, and can be as long as 6 µs when the analog
outputs are enabled. Power consumption is also dependent on the output mode
selected (see Table 5.1).
In this device images are sensed and processed fully in parallel. The pixel
reporting the strongest positive illuminance transient (e.g. induced by a high-contrast moving target) is selected by the WTA network. Its position can be read
out using either analog P2V circuits or digital address encoders. The sustained
response of each photoreceptor and net input current to each WTA can be read out
serially, using on-chip scanners, and displayed on monitors. Additionally, photoreceptor voltages can be individually sensed using the input address decoders.

Figure 5.22: Histogram of the addresses measured from the sensor's address encoders in response to a target moving on a circular trajectory.

The
WTA analog outputs can be used to drive motors and actuators, for example on
small autonomous robots. The WTA digital outputs can be used to drive the input address decoders and read the photoreceptor output of only the winning pixel.
This mechanism could be exploited (e.g. using a microcontroller) to selectively
read out just the regions of the image around the position of the target, rather than
reading out all the raw image data.
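The following sketch outlines this kind of windowed readout; the host-side helper and the window size are hypothetical, and only illustrate how the winner's digital address could be turned into a small set of decoder addresses.

    def roi_addresses(x_win, y_win, half=2, size=26):
        """Addresses of a (2*half+1)^2 window around the winning pixel, clipped
        to the array boundaries (hypothetical host-side helper)."""
        window = []
        for y in range(max(0, y_win - half), min(size, y_win + half + 1)):
            for x in range(max(0, x_win - half), min(size, x_win + half + 1)):
                window.append((x, y))       # addresses to apply to the input decoders
        return window

    addrs = roi_addresses(19, 7)
    print(len(addrs), addrs[0], addrs[-1])  # 25 pixels, from (17, 5) to (21, 9)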
Chapter 6
Multi-chip models of selective attention systems
The single-chip neuromorphic systems of the type described in the previous sections have great advantages, such as small size, low fabrication cost, low power consumption, and extraordinary computational capabilities. However, to design systems with greater computational power and higher flexibility one needs to resort to multi-chip systems. Neuromorphic multi-chip systems generally consist of one or more sensory devices, such as silicon retinas, silicon cochleas or vision sensors, interfaced to one or more chips containing networks of spiking neuron circuits. These chips can process the sensory signals (e.g. detecting salient regions of the sensory space [51], learning correlations [15], etc.) and eventually transmit the processed signals to actuators, thus implementing complete neuromorphic sensory-motor systems. Specifically, using multi-chip systems it is possible to implement more elaborate models of selective attention, of the type described in Section 1.1.1 (see also Fig. 1.1).
6.1 The Address-Event Representation
Consistent with the neuromorphic engineering approach, the strategy used by
neuromorphic devices to communicate analog signals across chip boundaries is
inspired by the nervous system. Analog signals are converted into streams of
stereotyped non-clocked digital pulses (spikes) and encoded using pulse-frequency
modulation (spike rates). These digital pulses are transmitted using an asynchronous communication protocol based on the Address-Event Representation (AER) [9,
20, 73].
Figure 6.1: Schematic diagram of an AER chip to chip communication example. As
soon as a sending node on the source chip generates an event its address is written on the
Address-Event Bus. The destination chip decodes the address-events as they arrive and
routes them to the corresponding receiving nodes.
6.1.1 The Address-Event I/O Interface
In AER, each analog element on a sending device is assigned an address. When a
spiking element generates a pulse its address is encoded and instantaneously put
on a digital bus, using asynchronous logic (see Fig. 6.1). In this asynchronous
representation time represents itself, and analog signals are encoded by the interspike intervals between the addresses of their sending nodes. Address-events are
the digital pulses written on the bus. By converting analog signals into a digital
representation, we can take advantage of the considerable understanding and development of high-speed digital communications, emulating the parallel, but slow,
connectivity of neurons using axons with fast, but serial, connectivity through
digital busses. We basically trade off “space” (the number of pins and wires that would be required to transmit spikes from each individual neuron on a chip) for
“time”, exploiting the fact that our neuromorphic circuits have typical time constants of the order of milliseconds and digital busses have bandwidths of the order
of MHz.
To manage collisions (cases in which two or more neurons attempt to access
the AER bus simultaneously) we use on-chip digital, asynchronous arbitration
circuits. As the channel only sends the addresses of active units, the system’s
bandwidth is devoted to those units that are spiking. Redundancy reduction in the
signal (e.g. spatial and temporal adaptation) before the channel can dramatically
reduce the bandwidth needed for a given population of cells.
An important consequence of using a digital chip-interconnect scheme is the
relative ease with which these chips are able to interface to existing digital hardware. From the simulation of input spike trains to quickly re-configuring a network’s connectivity via address routers, the flexibility of software can be used to
produce a more powerful modeling tool. From the engineering perspective, the
translation of our analog signals into a stream of asynchronous spikes not only
facilitates communication, it opens up new possibilities for the efficient implementation of both computation and memory in the spike domain.
In the case of single-sender/single-receiver communication, a simple handshaking mechanism ensures that all events generated at the sender side arrive at
the receiver. The address of the sending element is conveyed as a parallel word
of sufficient length, while the handshaking control signals require only two lines.
Systems containing more than two AER chips (e.g. with AER sensors at the input stages, AER networks of neurons for doing the computation, and AER read-out modules to drive possible actuators) are constructed by implementing special-purpose off-chip arbitration schemes [19, 20].
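The following software sketch illustrates the principle of point-to-point AER communication (it is not a model of the actual arbiter or handshaking circuits): each spike travels as the address of its source, events are serialized on a single bus, and the receiver routes them back to the corresponding nodes.

    def encode(spike_trains):
        """spike_trains: dict mapping neuron address -> list of spike times."""
        events = [(t, addr) for addr, times in spike_trains.items() for t in times]
        return sorted(events)                 # the arbiter serializes the events in time

    def decode(events, n_neurons):
        """Route each address-event back to the corresponding receiving node."""
        received = {addr: [] for addr in range(n_neurons)}
        for t, addr in events:
            received[addr].append(t)
        return received

    source = {0: [0.010, 0.030], 1: [0.012], 2: [0.011, 0.013, 0.040]}
    bus = encode(source)              # [(0.010, 0), (0.011, 2), (0.012, 1), ...]
    print(decode(bus, 3) == source)   # True: the spike trains are reconstructed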
6.1.2 Address-Event Neuromorphic Sensors
The two most successful types of neuromorphic sensors developed in previous
years are silicon cochleas [33, 103] and silicon retinas [10, 66, 78]. The former
implement detailed models of the human cochlea, producing outputs that could
be useful for artificial speech recognizers, or for hearing aids. The silicon retinas
on the other hand implement models of the retina’s early processing stages and
typically produce images that represent local changes in contrast (see Fig. 6.2 for
an example of a silicon retina image).
Until recently these sensory devices transmitted their information off-chip using conventional techniques, such as multiplexers or scanners. With the advent
of the Address-Event Representation we now also have AER silicon retinas and
cochleas that produce streams of address-events representing the activity of each
individual pixel. With these AER sensors the bandwidth used for signal transmission is allocated optimally only for those pixels that are active (as opposed
for example to scanning techniques, that allocate the same bandwidth for all the
pixels, independent of their activity). The address-events (spikes) generated by
these sensors can then be processed by synapses and networks of spiking neurons
implemented on one or more receiving AER chips.
Figure 6.2: Image captured from a 168×132 silicon retina designed by Jörg Kramer (at the Institute of Neuroinformatics, Zurich), while the subject was moving.
6.2 A 1-D AER selective attention chip
Here we present a 1-D AER chip that contains circuits useful for emulating in
real time saliency-based selective attention systems of the type described in Section 1.1.1.
Several VLSI systems for implementing visual selective attention mechanisms
have been presented in the past [13, 44, 88, 120]. These systems (as the ones described in Chapter 5) contain photo-sensing elements and processing elements on
the same focal plane, and typically apply the competitive selection process to visual stimuli sensed and processed by the focal plane processor itself. Unlike these
systems, the device proposed here is able to receive input signals from any type of
AER device. Therefore input signals need not arrive only from visual sensors, but
could represent a wide variety of sensory stimuli obtained from different sources.
The selective attention chip proposed is also one of the first of its kind able not
only to receive AER signals, but also to transmit the result of its computation using
the Address-Event Representation. With both input and output AER interfacing
circuits the chip can be thought of as a VLSI “cortical” module able to receive and
transmit spike trains.
In general, decoupling the sensing stage from the processing stage and using the Address-Event Representation to transmit and receive signals has several advantages: a multi-chip AER attention system could use multiple sensors to
construct a saliency map; visual input sensors could be relatively high-resolution
silicon retinas and would not have the small fill factors that single-chip 2D attention systems suffer from; top-down modulating signals could be fused with
the bottom-up generated saliency map to bias the selection process; multiple instances of the same selective attention chip could be used to construct hierarchical
selective attention architectures; and sensors could be distributed across different
peripheral regions of the neuromorphic system, as is the case for real biological
systems.
6.2.1 System Overview
The 1-D selective attention chip contains a one-dimensional architecture of 32
locally coupled elements that compete globally for saliency. Global competition
is achieved using a hysteretic WTA network. Each element comprises, next to
the hysteretic WTA cell, synaptic circuits and integrate and fire neurons. The
synapses receive off-chip address-events and integrate them into analog current
signals that are sourced into the WTA network. The integrate and fire neurons are
used to transmit address-events off chip, and to implement the dynamics of the
selective attention model.
It has been argued that neural circuits with these types of connectivity patterns
are valuable models of cortical processing and can account for many response
properties of cortical neurons [38, 102]. The analog circuits of the WTA network
implement a simplified abstract model of these types of neural networks in which
each element of the WTA network can be regarded as a local population of excitatory neurons interconnected among each other with lateral nearest-neighbor
connections. From this point of view the architecture of the selective attention
chip is equivalent to the neural network diagram depicted in Fig. 6.3.

Figure 6.3: Biologically equivalent architecture of selective attention model. Input spike trains arrive from the bottom onto excitatory synapses. The populations of cells in the middle part of the figure are modeled by a hysteretic WTA network with local lateral connectivity. Inhibitory neurons, in the top part of the figure, locally inhibit the populations of excitatory cells by projecting their activity to the inhibitory synapses in the bottom part of the figure.
The input excitatory synapses shown in the bottom part of the figure receive
spike trains from external devices and provide an excitatory current to the local
populations of neurons. These populations compete among each other by means
of recurrent interactions with a global inhibitory cell (not shown in the figure) and
reach a steady state in which typically all populations except the one receiving
the strongest net excitation are silent. Each local population projects to one of
the output inhibitory neurons shown in the top row of Fig. 6.3. For typical operating conditions, only the inhibitory neuron connected to the winning population
of cells will be active at any given time. The output neuron projects its spikes
both to AER interfacing circuits, for transmitting the result of the computation to
further processing stages, and to local on-chip inhibitory synapses (equivalent to
those shown in the lower part of Fig. 6.3). The resulting inhibitory current is subtracted from its corresponding input excitatory current. This negative feedback
loop implements the so called inhibition of return (IOR) mechanism [35, 109]:
after selecting a salient stimulus, the WTA network stimulates the output neuron
connected to the winning population of cells. The spikes that the output neuron generates are integrated by the corresponding inhibitory synapse. As the in91
CHAPTER 6. MULTI-CHIP MODELS OF SELECTIVE ATTENTION SYSTEMS
Output spike train
Input spike trains
Figure 6.3: Biologically equivalent architecture of selective attention model. Input spike
trains arrive from the bottom onto excitatory synapses. The populations of cells in the
middle part of the figure are modeled by a hysteretic WTA network with local lateral connectivity. Inhibitory neurons, in the top part of the figure, locally inhibit the populations
of excitatory cells by projecting their activity to the inhibitory synapses in the bottom part
of the figure.
hibitory current increases in amplitude, the effect of the input excitatory current
is diminished and eventually the WTA network switches stable state, selecting a
different cell as the winner. Note how the integrate and fire neurons, necessary for
the Address-Event I/O interface, allowed us to implement the IOR mechanism by
simply including an additional inhibitory synaptic circuit. This solution is quite
elegant and compact in comparison with previously proposed alternatives [86].
Depending on the dynamics of the IOR mechanism, the WTA network will
continuously switch the selection of the winner between the strongest input and
the second-strongest, or among the strongest and further inputs of successively decreasing strength, thus generating focus-of-attention scan-paths, analogous to eye-movement scan-paths [121]. The dynamics of the IOR mechanism depend on the
time constants of the excitatory and inhibitory synapses, on their relative synaptic strengths, on the input stimuli and on the frequency of the output inhibitory
neuron.
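The following sketch illustrates these IOR dynamics at an abstract level; the saliency values and the inhibition build-up and recovery rates are illustrative, not measured chip parameters.

    import numpy as np

    inputs = np.array([1.0, 0.8, 0.6, 0.3])      # saliencies of four stimuli
    inhibition = np.zeros_like(inputs)
    build, decay = 0.15, 0.02                    # IOR build-up and recovery per step

    scan_path = []
    for t in range(60):
        winner = int(np.argmax(inputs - inhibition))
        scan_path.append(winner)
        inhibition[winner] += build              # output neuron drives its inhibitory synapse
        inhibition *= (1.0 - decay)              # the inhibitory current slowly decays
    print(scan_path)                             # attention cycles among the strongest inputs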
6.2.2 The Excitatory and Inhibitory Synapses
The input pulses received by the chip reach, at each cell location, a current-mirror integrator [9] which models an excitatory synapse. This circuit, shown in Fig. 6.4(a), uses only 4 transistors and one capacitor. The input pulse is applied to transistor M1, which acts as a digital switch. Transistor M2 is biased by the analog voltage Vw to set the synaptic strength (weight). Similarly, the voltage Ve
Figure 6.4: (a) Excitatory synapse circuit. Input spikes are applied to M1, and transistor
M4 outputs the integrated excitatory current Iex . (b) Inhibitory synapse circuit. Spikes
from the local output neurons are integrated into an inhibitory current Iinh.
on the source of transistor M3 can be used to set the time constant of the synapse.
With each input pulse, a fixed amount of charge is stored on the capacitor and
the amplitude of the output current Iex is increased. If no input is applied (i.e. no
current is allowed to flow through M3), the output current Iex decays with a 1/t profile. Similarly, the inhibitory synapse integrates the spikes generated by the output
neurons. As with the excitatory synapse, we implemented the inhibitory synapse
using a current-mirror integrator circuit (see Fig. 6.4(b)). The principle of operation of this circuit is very similar to the one described for the excitatory synapse,
with the difference that the output current of the circuit is of opposite polarity.
Every time the local output neuron projecting to this synapse generates a spike,
its output voltage Vout rises to the positive power supply rail (see Section 6.2.4)
allowing the transistor M1 to charge the capacitor connected to it. The amount of
charge passed can be controlled by Vq (which thus determines the strength of the
synaptic weight). As for the circuit of Fig. 6.4(a), the voltage at the source node of
the diode connected transistor (Vi on the source of M2 in Fig. 6.4(b)) controls the
time constant and the gain of the synapse. Transistor M4 is a decoupling (cascode)
element, used to decrease second order effects. Specifically, it is used to minimize
the effect of the Miller capacitance of transistor M3. The voltage Vca on the capacitor is a measure of the spiking history of the neuron projecting to the inhibitory
synapse. It determines (with an exponential relationship) the amplitude of the output current Iinh . Inhibitory currents Iinh are subtracted from excitatory currents Iex
coming from the input synapses to provide the net input current Iin to the WTA network (see Fig. 6.6).

Figure 6.5: (a) Response of an excitatory synapse to single spikes, for different values of the synaptic strength Vw (with Ve = 4.60V). (b) Normalized response to single spikes for different time constant settings Ve (with Vw = 1.150V). (c) Response of an excitatory synapse to a 50Hz spike train for increasing values of Vw (0.6V, 0.625V, 0.65V and 0.7V from bottom to top trace respectively). (d) Response of an excitatory synapse to spike trains of increasing rate for Vw = 0.65V and Ve = 4.6V (12Hz, 25Hz, 50Hz and 100Hz from bottom to top trace respectively).

We characterized the excitatory synapse of Fig. 6.4(a)
by applying single pulses (see Fig. 6.5(a,b)) and by applying sequences of pulses
(spikes) at constant rates (see Fig. 6.5(c,d)). Figure 6.5(a) shows the response
of the excitatory synapse to a single spike for different values of Vw . Similarly,
Fig. 6.5(b) shows the response of the excitatory synapse for different values of Ve .
Changes in Ve modify both the gain and the time constant of the synapse. To better
visualize the effects of Ve on the time evolution of the circuit's response, we normalized the different traces, neglecting the circuit's gain variations. Figure 6.5(c)
shows the response of the excitatory synapse to a constant 50Hz spike train for
different synaptic strength values. As shown, the circuit integrates the spikes up
to a point in which the output current reaches a mean steady-state analog value,
the amplitude of which depends on the frequency of the input spike train, on the
synaptic strength value Vw and on Ve . Figure 6.5(d) shows the response of the
circuit to spike train sequences of four different rates for a fixed synaptic strength
value.
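A simplified behavioral model of the current-mirror integrator is sketched below: each input spike adds a fixed packet of charge, and between spikes the output current decays with the characteristic 1/t profile. The charge parameters are illustrative and are not extracted from the circuit.

    import numpy as np

    def cmi_response(spike_times, t, di_per_spike=20e-9, q_decay=50e-9):
        """Simplified current-mirror integrator: dI/dt = -I^2/Q between spikes
        (1/t decay), plus a fixed current increment per input spike."""
        i_out = np.zeros_like(t)
        i, last = 0.0, t[0]
        spikes = iter(sorted(spike_times) + [np.inf])
        next_spike = next(spikes)
        for k, tk in enumerate(t):
            dt = tk - last
            i = i / (1.0 + i * dt / q_decay)     # closed-form solution of the decay
            last = tk
            while tk >= next_spike:
                i += di_per_spike                # fixed packet of charge per spike
                next_spike = next(spikes)
            i_out[k] = i
        return i_out

    t = np.arange(0.0, 1.0, 1e-3)
    i_ex = cmi_response(list(np.arange(0.1, 0.5, 0.02)), t)   # 50 Hz burst, then decay
    print(i_ex.max(), i_ex[-1])   # builds up to a plateau, then decays slowly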
6.2.3 The Hysteretic Winner-Take-All Network
The basic cell of the hysteretic WTA network is based on the circuits described in
Section 4.2. Figure 6.6 shows the circuit schematic of three WTA cells connected
together.

Figure 6.6: Schematic diagram of the WTA network. Examples of three neighboring cells connected together.
Input is applied to each node of the network through the currents Iin, corresponding to the sum of the excitatory synaptic current Iex (see Fig. 6.4(a)) with the inhibitory synaptic current Iinh (see Fig. 6.4(b)). If (and only if) the cell considered
is the winning one, the p-type current-mirror in the top part of the circuit produces,
at the same time, the output current Iin j and a hysteretic current, summed through
a positive feedback loop (indicated by a dashed arrow) back into the input node.
The hysteretic current is a copy of the WTA bias current (set by Vb on the bias
transistors in the lower part of the circuit). The amplitude of the output current
Iin j is independent of the input current Iin and can be modulated by the control
voltage Vin j at the source of the output transistor.
The n-type current mirrors in the lower half of the figure are used both to produce an output current Inet and to enhance the response of the WTA network (by
producing source degeneration of the input transistor). The source degeneration
technique consists of converting the current flowing through the input transistor
into a voltage, by dropping it across a diode, and feeding this voltage back to the
source of the input transistor, to increase its gate voltage. At the network level,
source degeneration of the input transistor has the effect of increasing the circuit’s
winner selectivity. As the current Inet represents the sum of all of the currents converging into the same cell (namely, the input current Iin , the current being spread
to or from the left and right nearest neighbors and the hysteretic current coming
from the top p-type current mirror), it is a useful measure for visualizing the state
of the WTA network.
Each WTA cell is connected to its immediate neighbors through pass transistors controlled by Vex (in the upper half of the figure) and by Vinh (in the lower part of the figure). In the two extreme cases they either completely decouple the network, allowing each individual cell to be a winner (Vinh = 0V), or they globally connect all the cells, forcing the network to choose only one winner (Vinh = Vdd).
In intermediate cases, modulation of Vinh determines the spatial extent of the local
regions over which competition takes place, thus allowing the network to select
multiple winners.
In Fig. 6.7 we show examples of scanned Inet measurements, having applied
constant input currents to the WTA nodes. We generated a spatial input stimulus such that pixel 21 received an input current of approximately 350nA, pixels
9 through 13 received input currents of approximately 300nA and the remaining
pixels received currents ranging from 150nA to 250nA. Small variations across
neighboring pixels are mainly due to device mismatch effects. Figure 6.7(a) shows
the state of the network in the case in which no lateral excitatory coupling is
present (Vex = 0V ). The network selects pixel 21 as the winner and supplies the
hysteretic current to it (which has an amplitude of approximately 300nA, with the
current bias settings). Figure 6.7(b) shows the status of the network in the case in
which lateral excitatory coupling is applied (Vex = 1.5V ). Lateral coupling effectively smooths spatially the input currents, thus decreasing the net input current
to pixel 21. In this condition the WTA network selects pixel 9 as the winner (and
sums the hysteretic current to it). This example points out the two main characteristic features of the spatially coupled hysteretic WTA network: spatial smoothing
and positive feedback. Spatial smoothing is implemented by modulating the gate
voltage Vex of the excitatory pass-transistors. It can be used to reduce the effect of
noise and offsets in the input transistors, and to favor the selection of spatial regions with high average activity (as opposed to strongly activated isolated pixels).
By adding a copy of the WTA bias current to the winning cell’s input node, the
positive feedback loop effectively produces a hysteretic behavior. Hysteresis is
used to enforce the selection of the winner and is induced by summing a constant
hysteretic current to the input of the winning pixel, through the positive feedback loop implemented with the p-type current mirrors in the top part of Fig. 6.6.

Figure 6.7: Net WTA input current Inet values at each pixel location for a static control input. Pixels 5 through 13 have input currents slightly lower than pixel 21. All other pixels receive weaker input stimuli. (a) In the absence of lateral coupling (Vex = 0V) the network selects pixel 21 as the winner. (b) In the presence of lateral coupling (Vex = 1.5V) the network smooths spatially the input distribution and selects pixel 9 as the winner.
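The effect of lateral excitatory coupling can be illustrated with the following sketch, which qualitatively reproduces the situation of Fig. 6.7: without smoothing the isolated strong pixel wins, while diffusive smoothing favors the region of high average activity. The current values and the diffusion strength are illustrative, and the boundaries are treated as periodic for simplicity.

    import numpy as np

    i_in = np.full(32, 200e-9)
    i_in[9:14] = 300e-9          # a group of strongly activated pixels (9 to 13)
    i_in[21] = 350e-9            # a single, slightly stronger pixel

    def smooth(i, coupling, n_iter=50):
        i = i.copy()
        for _ in range(n_iter):                        # nearest-neighbor diffusion
            i += coupling * (np.roll(i, 1) + np.roll(i, -1) - 2 * i)
        return i

    print(int(np.argmax(smooth(i_in, 0.0))))   # no coupling: pixel 21 wins
    print(int(np.argmax(smooth(i_in, 0.3))))   # with coupling: a pixel inside the group wins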
6.2.4 The Output Inhibitory Integrate-and-Fire Neuron
The output inhibitory neurons implemented on this chip are circuits of the type
shown in Fig. 6.8. They are non-leaky integrate-and-fire neurons based on circuits
proposed by Mead [80] and by van Schaik [115].
Input is applied to this circuit by injecting a constant DC current Iin j , sourced
from the WTA network (see Fig. 6.6), into the membrane capacitance Cm . A comparator circuit compares the membrane voltage Vmem (which increases linearly
with time if the injection current is applied) with a fixed threshold voltage Vthr .
As long as Vmem is below Vthr , the output of the comparator is low and the neuron’s output voltage Vout sits at 0V. As Vmem increases above threshold though,
the comparator output voltage rises to the positive power supply rail and, via the
two inverters, also brings Vout to the rail. A positive feedback loop, implemented
with the capacitive divider formed by Cfb and Cm, ensures that as soon as the membrane voltage Vmem reaches Vthr, it is increased by an amount proportional to Vdd·Cfb/(Cm + Cfb) [80].
In this way we avoid the problems that could arise with small fluctuations of Vmem
around Vthr . When Vout is high, the reset transistor at the bottom-left of Fig. 6.8 is
switched on and the capacitor Cm is discharged at a rate controlled by Vpw , which
effectively sets the output pulse width (the width of the spike). The membrane
voltage thus decreases linearly with time and as soon as it falls below Vthr the
comparator brings its output voltage to zero. As a consequence the first inverter
sets its output high and switches on the n-type transistor of the second inverter,
allowing the capacitor Cr to be discharged at a rate controlled by Vrfr. This bias
voltage controls the length of the neuron’s refractory period: the current flowing
into the node Vmem is discharged to ground and the membrane voltage does not
increase, for as long as the voltage on Cr (Vout ) is high enough.
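To a first-order approximation, the neuron's transfer function can be estimated as sketched below: the membrane charges linearly up to threshold, and the pulse width and refractory period add a fixed dead time that makes the FI curve saturate. The component values are illustrative.

    def firing_rate(i_inj, c_mem=1e-12, v_thr=2.0, t_pulse=1e-3, t_refr=5e-3):
        """Estimated firing rate of the non-leaky integrate-and-fire neuron."""
        t_charge = c_mem * v_thr / i_inj          # linear ramp from reset to threshold
        return 1.0 / (t_charge + t_pulse + t_refr)

    for i_inj in (1e-12, 1e-11, 1e-10):
        print(i_inj, firing_rate(i_inj))
    # The rate is linear in I_inj for small currents and saturates near
    # 1 / (t_pulse + t_refr) for large currents, as in the measured FI curves.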
Figs. 6.9(a) and (b) show traces of Vmem for different amplitudes of the input injection current Iinj and for different settings of the refractory period control voltage Vrfr. The threshold voltage Vthr was set at 2V and the bias voltage Vpw was
set at 0.5V, such that the width of a spike was approximately 1ms. Figs. 6.9(c)
and (d) show how the firing rate of the neuron depends on the injection current
amplitude. These plots are typically referred to as FI-curves. We can control the
saturation properties of the FI-curves by changing the length of the neuron’s refractory period. The error bars show how reliable the neuron is, when stimulated
with the same injection current. We changed the injection current amplitude by
modulating the control voltage Vin j (see Fig. 6.6). As the injection current changes
exponentially with the control voltage Vin j , the firing rate of the neuron follows the
same relationship. To verify that the firing rate is linear with the injection current
we can view the same data using a log-scale on the ordinate axis (Fig. 6.9(d)).
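Under the simplifying assumptions of the sketch above (purely linear integration of Iinj on Cm, a fixed pulse width tpw set by Vpw, and an absolute refractory period trfr set by Vrfr), the FI-curve takes the approximate form

f(Iinj) ≈ 1 / ( Cm Vthr / Iinj + tpw + trfr ),

i.e. the rate is linear in Iinj for small currents and saturates near 1/(tpw + trfr) for large currents; this is the sense in which lengthening the refractory period lowers the saturation level of the measured curves.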
The 1-D AER selective attention chip can receive, at its input interface, signals analogous to spike trains, which can represent sensory information in their temporal structure.
Figure 6.8: Circuit diagram of the local inhibitory integrate-and-fire neuron.
In the first instance, to stimulate and test the selective attention model
with well controlled input signals, we interfaced the chip to a workstation. In a
more general scenario, the chip can be interfaced to analog VLSI neuromorphic
sensors that use the same AER interfacing circuitry to construct more elaborate
multi-chip systems [55].
6.2.5 Testing the 1-D selective attention chip
To provide inputs to all the synapses of the chip we developed a program on a
workstation that continuously addressed all the pixels in a serial fashion, exciting them at the times specified by a look-up table. The rate used to sequentially
address the pixels was fast compared to the typical firing rates of input signals
(chosen in a range between 10Hz and 80Hz). The input synapses, which have
time constants on the order of milliseconds, thus appeared to be receiving spikes
in parallel. The I/O card, in conjunction with the software we used, was able to
cycle through all 32 pixels of the network at a rate of 500Hz.
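As a rough illustration of this stimulation scheme, the following Python sketch shows how a look-up table of per-pixel firing rates can be turned into a serial addressing schedule: on every 2ms cycle (500Hz) the program visits each of the 32 pixels in turn and decides, from the pixel's nominal rate, whether to send it a spike. The function send_event is a hypothetical stand-in for the actual I/O-card driver call, and the Bernoulli approximation of the spike trains is an assumption made for brevity.

import random

N_PIXELS = 32
CYCLE_RATE = 500.0            # Hz: rate at which the full array is scanned
DT = 1.0 / CYCLE_RATE         # one scan of all pixels every 2 ms

def send_event(pixel):
    # Hypothetical placeholder for the I/O-card call that excites one input synapse.
    pass

def stimulate(rates_hz, duration_s):
    """Serially address all pixels; pixel i receives, on average,
    rates_hz[i] spikes per second (Bernoulli approximation per cycle)."""
    n_cycles = int(duration_s * CYCLE_RATE)
    for _ in range(n_cycles):
        for pixel in range(N_PIXELS):
            if random.random() < rates_hz[pixel] * DT:
                send_event(pixel)

# Example: 10 Hz background with stronger inputs at pixels 10 and 22
# (cf. the control experiment described below; 0-based indices 9 and 21)
rates = [10.0] * N_PIXELS
rates[9], rates[21] = 50.0, 80.0
stimulate(rates, duration_s=3.0)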
Control inputs In our control experiment we stimulated input synapses 1 through
9, 11 through 19 and 21 through 32, with spike trains at a constant rate of 10Hz.
Synapses at pixels 10 and 22 were stimulated with spike trains at rates of 50Hz
and 80Hz respectively. In Fig. 6.10 we show oscilloscope traces representing
Figure 6.9: Integrate-and-fire neuron characteristics. (a) Membrane voltage for two different DC injection current values (set by the control voltage Vinj). (b) Membrane voltage for two different refractory period settings. (c) Firing rates of the neuron as a function of the current-injection control voltage Vinj plotted on a linear scale. (d) Firing rates of the neuron as a function of Vinj plotted on a log scale (the injection current increases exponentially with Vinj).
the scanned net input currents to the WTA network Inet of all 32 pixels, in the
top traces, and the inhibitory currents Iinh (see Section 6.2.2) of all 32 inhibitory
synapses, in the bottom traces. To illustrate the circuit’s dynamics, we increased
the persistence of the oscilloscope’s display. Figure 6.10(a) shows the response
of the system to the onset of the stimulation. As the input spike trains start to
arrive, and the excitatory synapses integrate them, the net input current of each
pixel increases from zero (corresponding approximately to the level at the central
axis of the display) to a mean maximum steady state value (see also Fig. 6.5).
The net input currents at pixels 10 and 22 increase more rapidly, compared to all
other pixels, in accordance with the rate of the spike trains arriving at their input
synapses (top trace). At the onset of the stimulation, all output inhibitory neurons
are silent and the inhibitory synapses receiving spikes from the output neurons do
not generate any increase in the amplitude of the inhibitory synaptic currents Iinh
(bottom trace). Ideally all synapses receiving the same inputs should integrate the
spike trains to the same mean steady-state value. As shown in the figure, this is
not the case; the traces in Fig. 6.10 show the amount of variability present in the
synaptic currents due to device mismatches created in the chip’s fabrication process. The offsets introduced by device mismatches are different from chip to chip,
but do not change over time. Figure 6.10(b) shows the response of the system
after a few seconds of stimulation. As expected, all excitatory synapses reached a
mean steady-state value, but the network keeps on switching from selecting pixel
10 as the winner, to selecting pixel 22 as the winner and back again. Specifically,
Fig. 6.10(b) shows the situation in which pixel 22 has just been de-selected and
pixel 10 selected. At the pixel position 22 the inhibitory synaptic current is in the
process of decreasing back to zero (bottom trace), while at the tenth pixel position the WTA hysteretic current has just been added to the net input (top trace),
the output neuron has been activated and the current of the inhibitory synapse is
increasing with every output neuron’s spike (bottom trace).
In a second experiment we measured the membrane potential of single neurons, rather than using time-multiplexing to scan all of the pixels’ outputs. The
input stimulus was the same one used in the first experiment: all pixels were excited with 10Hz spike trains except for pixels 10 and 22 which were receiving
spikes at rates of 50Hz and 80Hz respectively. Given that our system behaves in
real-time, we were able to make multiple recordings without having to wait for
the long simulation/computation times, which are typical of software algorithms
modeling networks of spiking neurons. To measure statistical properties of the
system and to observe the variability of the system’s response to the same stimulus, we repeated the experiment 100 times. In the first 50 trials we measured the
membrane voltage of output neuron 10 and in the remaining 50 trials we measured
the membrane voltage of output neuron 22. The input stimulus was applied for 3
seconds per trial and there was a delay of 30 seconds between each trial to allow
the system to return to its initial “resting” state. Figure 6.11 shows raster plots
describing the responses of these two neurons to the two consecutive sessions of
50 trials each (Fig. 6.11(a) represents the activity of neuron 10 and Fig. 6.11(b)
represents the activity of neuron 22). In the first 500ms to 800ms of stimulation, the input synapses have not reached their mean steady-state value (see also
Fig. 6.5). As the excitatory synaptic currents reach their steady-state value, the
WTA network selects either pixel 10 or pixel 22 as the winner, and excites the
corresponding output neuron, continuously switching between the two.
As mentioned previously, the offsets introduced in the circuits are constant
over time, and the single circuit elements of the system, such as the synapses and the neurons, are highly reliable (see error bars of Fig. 6.9(c)). Yet the traces of Fig. 6.11 show a significant amount of inter-trial variability for the same input stimulus and the same output neuron. Small timing variations in the stimulation software, executed in a multiplexing environment, and thermal noise in the circuit elements are not sufficient to explain such a large amount of variability. This variability is therefore due to network effects. A possible (and probable) explanation for this phenomenon could lie in the recurrent nature of the competition mechanism that takes place in the WTA network. Computer simulations of completely deterministic models have already demonstrated that recurrent networks of reliable (software) spiking neurons can produce highly irregular firing patterns [40, 100, 114, 116]. Carrying out a detailed analysis of the dynamics of the VLSI system would prove to be extremely difficult and would go beyond the scope of this thesis. But this experiment demonstrates, in real time, that even reliable VLSI spiking neurons, such as the integrate-and-fire neurons used here, can produce highly irregular spike trains if embedded in a WTA network containing recurrent excitatory and inhibitory pathways.
Despite the highly irregular firing patterns produced by the chip’s output neurons, the overall response of the system is consistent with its input: on average, as
expected from the input stimulus distribution, the network selects pixel 22 more
often than pixel 10. This can also be seen from the peri-stimulus time histogram in Fig. 6.11(c). Figure 6.11(d) shows the inter-spike interval histograms for the two neurons. Both histograms show the same type of bimodal distribution. This distribution can be approximated by a superposition of two Gaussians, one centered
around 7ms and the other around 12ms. The bimodal distribution arises from the
fact that the output neuron is either being driven by a constant injection current
Iinj (in the case in which that pixel is the winner) or it is not receiving any input (in the case in which the WTA network selects a different pixel as the winner). If the pixel considered is the winner, the inter-spike interval (which is inversely proportional to the amplitude of the injection current Iinj) is constant and approximately equal to 7ms. This explains why the Gaussian centered around 7ms has
a small standard deviation. On the other hand, if the neuron is not receiving any
input, the inter-spike interval is not constant and depends on the time the network
takes to switch from the other winning pixel back to the one considered. This
time is not just a function of one parameter (as is the case for Iin j ), but depends
on many factors, ranging from the values of the bias voltages in the circuits, to
the frequencies of the input spike trains, to the frequency of the output neuron.
Therefore the Gaussian centered around 12ms has a larger standard deviation.
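One way to make this two-Gaussian description quantitative is to fit the measured ISI histogram with a mixture of two Gaussians. The Python sketch below illustrates such a fit; the isi_ms array is only a synthetic stand-in for the recorded inter-spike intervals, and the initial guesses follow the 7ms and 12ms values quoted above.

import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(t, a1, mu1, s1, a2, mu2, s2):
    """Superposition of two Gaussians used to model the bimodal ISI histogram."""
    g1 = a1 * np.exp(-0.5 * ((t - mu1) / s1) ** 2)
    g2 = a2 * np.exp(-0.5 * ((t - mu2) / s2) ** 2)
    return g1 + g2

# isi_ms: synthetic stand-in for the measured inter-spike intervals (in ms)
isi_ms = np.random.normal(7, 0.3, 3000).tolist() + np.random.normal(12, 2.0, 1500).tolist()
counts, edges = np.histogram(isi_ms, bins=np.arange(0, 20.5, 0.5))
centers = 0.5 * (edges[:-1] + edges[1:])

# Initial guesses: narrow peak near 7 ms (winning pixel), broad peak near 12 ms
p0 = [counts.max(), 7.0, 0.5, counts.max() / 3, 12.0, 2.0]
params, _ = curve_fit(two_gaussians, centers, counts, p0=p0)
print("fitted means (ms):", params[1], params[4])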
The control experiments were useful to verify the expected behavior of the
circuits, at the system level. To test the behavior of the system in the more general context of selective visual attention, we used data obtained from real-world
images, processed on the workstation and sent to the chip using the same method
described in Section 6.3.1.
Saliency-Map Inputs We processed static color images with an algorithm which
generates a saliency map, i.e. a feature map which topographically codes for local
conspicuousness over the entire scene [58]. Specifically, given the digitized input image, the algorithm computes a set of multi-scale feature maps, responding
to orientation, color and intensity contrast and, after appropriately normalizing
them, combines them in a bottom-up fashion. Figure 6.12(b) shows an example
of a saliency map generated from the image shown in Fig. 6.12(a). The algorithm
has no a priori knowledge of what is salient and what is not, so the fact that the four neurons illustrated in the image appear to be salient (and the “Neural Systems” title does not appear salient at all) is simply due to the color, spatial scale and
contrast properties of those local regions in the image. As the VLSI architecture
used in this work is one dimensional and the saliency map is a 2D data array, we
had to choose an appropriate operator to map the 2D saliency map space onto a 1D
input vector. We applied the max() operator to the pixels of each column of the
saliency map, and mapped the location of the selected pixel to the 1D input vector.
In case of multiple maxima with the same value, we selected the pixel associated
with the first occurrence of the maximum, while scanning the column from the top
part of the image to its bottom part. This type of mapping is injective: any value
of the 1D input vector is associated with only one pixel of the saliency map. The top trace of Fig. 6.12(c) shows the values of the 1D vector obtained by applying the mapping described above to the saliency map of Fig. 6.12(b). Each component
of the vector corresponds to the brightest pixel of the column in the saliency map
with the matching index, and determines the frequency of the input spike train
(e.g. the excitatory synapse at pixel 1 receives approximately 19 spikes per second, the one at pixel 2 receives approximately 12 spikes per second, the one at
pixel 9 approximately 85 spikes per second, etc.). We applied this stimulus to the
chip, using the method described in Section 6.3.1, for a period of 3 seconds and
recorded spike trains from the chip’s output neurons. The histogram in the lower
part of Fig. 6.12(c) represents the activity of these output neurons. As shown, on
average the system attends to pixels 9 and 10 most of the time, shifting its focus of
attention to regions centered around pixels 19 and 28 quite frequently and to other
regions of the image less frequently. To show the dynamical aspects of the focus
of attention we plotted, in Fig. 6.12(d), the address of the pixel attended, over
time.
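The column-wise mapping and the conversion to input spike rates can be summarized by the following Python sketch. It is a simplified reconstruction of the procedure described above: the grouping of image columns onto the 32 chip pixels and the linear rescaling of saliency values to an assumed 0-100Hz range are illustrative assumptions, since the exact resampling and rate conversion used are not specified here.

import numpy as np

def saliency_to_1d(saliency, n_pixels=32, max_rate_hz=100.0):
    """Map a 2D saliency map to a 1D rate vector by taking the maximum of each
    column group, remembering the row of the first maximum for the inverse mapping."""
    h, w = saliency.shape
    col_groups = np.array_split(np.arange(w), n_pixels)   # group columns onto the chip pixels
    rates, rows = [], []
    for group in col_groups:
        block = saliency[:, group]
        r, c = np.unravel_index(np.argmax(block), block.shape)  # first (topmost) maximum
        rates.append(block[r, c])
        rows.append(r)                       # row kept so chip outputs can be re-plotted in 2D
    rates = np.array(rates, dtype=float)
    rates = max_rate_hz * rates / rates.max()               # assumed linear rate conversion
    return rates, rows

# Example with a random array standing in for the saliency map of Fig. 6.12(b)
saliency = np.random.rand(480, 640)
rates, rows = saliency_to_1d(saliency)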
As the mapping performed from the 2D data of Fig. 6.12(b) to the 1D vector
of Fig. 6.12(c) is injective, we can re-plot the 1D data of Fig. 6.12(d) onto the
2D saliency map. Figure 6.13 shows such a plot: the white dots superimposed on
the saliency map represent the locations attended by the chip, and the white solid
lines join successive attended locations. Figure 6.13 resembles the visual scan-paths recorded from human subjects [96, 121]; yet, the figure represents the movements of the focus of attention, which do not necessarily match
on a one-to-one basis the saccadic eye movements measured in scan-paths. The
focus of attention in our VLSI system tends to shift more frequently between locations which are spatially close to each other. This property, which appears to be
characteristic also of human subjects [95], has been explicitly engineered into our
system by implementing excitatory lateral connections between winner-take-all
cells (see Section 6.2.3). Furthermore, the average time spent by the system at
each location is approximately 50ms. This measurement, also in accordance with
data measured from psychophysical experiments performed on human subjects, is
an emergent property of the system and has not been explicitly engineered.
Mapping the 2D saliency map onto a 1D vector and the 1D output of the chip
back onto the 2D saliency map introduces some artifacts which do not allow us
to make a fair comparison between the scan-paths obtained from the chip and the scan-paths recorded from human observers. For example, the region around pixel coordinates (260, 475) of Fig. 6.13 is never selected by the system, precisely due to the
injective mapping described above. The example of focus of attention scan-paths
shown in Fig. 6.13 is only illustrative. Although the 1D selective attention chip
can be used in several application domains [50], 2D saliency maps containing
both horizontal and vertical salient features are best processed using 2D selective
attention chips [48, 55].
Figure 6.10: Scanned net input currents to the WTA network Inet (top traces) and inhibitory currents Iinh (bottom traces) measured, by means of an off-chip current sense-amplifier, at every pixel location. (a) Response of the system to the onset of the stimulation, with a display persistence setting of 3s. (b) Response of the system after a few seconds of stimulation, with a display persistence setting of 250ms.
Figure 6.11: (a) Raster plots of neuron 10 in response to the control stimulus (see text for
explanation). (b) Raster plots of neuron 22. (c) Peri-stimulus time histogram of neuron
10 (solid line) and of neuron 22 (dashed line). (d) Inter-spike interval distribution of
neurons 10 (front bars) and 22 (rear bars).
Figure 6.12: Test image with salient features. (a) Original color figure. (b) Corresponding saliency map. (c) Input spike frequencies obtained from the injective mapping
described in the text (upper trace) and distribution of the output neurons' spike counts
recorded over a period of 3 seconds (lower histogram). (d) Position of the attended pixel
recorded over time.
Figure 6.13: Mapping of the 1D data of Fig. 6.12(d) onto the re-sampled 2D saliency
map data of Fig. 6.12(b). Shifts along the horizontal axis are due to the selective attention
chip’s response. Shifts along the vertical axis are introduced artificially via the injective
mapping described in the text.
Figure 6.14: Block diagram of a basic cell of the 8 × 8 selective attention architecture.
6.3 A 2-D AER selective attention chip
We extended the model presented in the previous section to two dimensions and
implemented a 2-D selective attention chip. The 2-D selective attention chip was
fabricated using a standard 2µm CMOS technology. Its size is approximately 2mm×2mm and it contains an array of 8×8 cells. The chip's architecture,
easily expandable to arrays of arbitrary size, is laid out on a square grid, with
input and output AER interfacing circuits.
In a system containing AER sensors interfaced to the selective attention chip, incoming address events reach, at the input stage of each cell of the 8 × 8 array, excitatory synaptic circuits that convert the digital voltage pulse streams into analog input currents. Figure 6.14 shows the block diagram of one of the architecture's
cells. The input current integrated by the excitatory synapse (see Iex in Fig. 6.14)
is sourced into the hysteretic WTA network. The output current of each WTA cell
is used to activate both an integrate and fire (I&F) neuron and two position to voltage (P2V) circuits [27]. The two P2V circuits encode both x and y coordinates of
the winning WTA cell with two analog voltages, while the I&F neurons generate
pulses that are used by the AER interfacing circuits to encode the address of the
winning WTA cell. The neuron’s spikes are also integrated by the local inhibitory
synapse connected to it, to generate a current Iior that is subtracted from the current Iex (see Fig. 6.14). Figure 6.15 shows the circuit diagram of both excitatory
and inhibitory synapses. As for the 1-D selective attention chip, the synaptic circuits use compact, non-linear current-mirror integrators to integrate their input
spikes. The transistors in the dashed box of Fig. 6.15(a) implement the AER input
Figure 6.15: Synaptic circuits. (a) Input excitatory synapse. Address events are converted into pulses by the circuit in the dashed box. Pulses are integrated into the excitatory current Iex by the p-type current-mirror integrator. The integrator's gain and time constant are modulated by the control voltages Vw and Vτe. (b) Inhibitory synapse. On-chip pulses (Vior) are integrated into the inhibitory current Iior by the n-type current-mirror integrator. The time constant and gain of this integrator are modulated by the voltages Vq and Vτi.
interfacing circuits, and can operate correctly over a wide range of input pulse
widths, ranging from a few hundred nanoseconds to milliseconds. The gain and
time constants of the two current-mirror integrators are set by two pairs of control
voltages (Vw and Vτe for the excitatory synapse and Vq and Vτi for the inhibitory
synapse).
The net current (Iex − Iior) is sourced into the input node of the hysteretic WTA cell (node Vin in Fig. 6.16). Each cell is connected to its four nearest
neighbors, both with lateral excitatory connections and lateral inhibitory connections (see Fig. 6.16). The inhibitory connections are modulated by the bias voltage
Vinh , and control the spatial extent over which competition takes place. If lateral
inhibition is maximally turned on (Vinh = Vdd ), all WTA cells of the architecture
are connected together and only one winner can be selected at a time (global inhibition). If Vinh is low, the WTA network allows multiple winners to be selected,
as long as they are sufficiently distant from each other (local inhibition). Similarly, lateral excitatory connections, modulated by the bias voltage Vex , control
the amount of lateral facilitatory coupling between cells. If lateral coupling is enabled, the system tends to select new winners in the immediate neighborhood of the currently selected cell.
Figure 6.16: Hysteretic WTA cell. Input currents are sourced into node Vin and 3 copies
of the output current are sent to the two P2V circuits and to the I&F neuron.
When a WTA cell is selected as a winner, its output transistors source DC
currents into the two P2V row and column circuits. The winning WTA cell also
sources a DC current Iinj into the input node Vmem of the local inhibitory neuron connected to it (see Fig. 6.17). The amplitude of the injection current Iinj is independent of the input current (Iex − Iior), but depends on the bias voltage Vwta and on the control voltage Vinj. This current, integrated onto the neuron's capacitor Cm of Fig. 6.17, allows the neuron's membrane voltage Vmem to increase linearly with time. As soon as Vmem reaches the threshold voltage Vthr, the neuron generates an action potential: the comparator and the inverters of Fig. 6.17 drive Vout to the positive power supply rail. This activates the AER row and column request signals (Rx and Ry), which produce an address event. The output AER circuit's acknowledge signals (Ax and Ay) reset the pulse by allowing the neuron's membrane capacitance to discharge at a rate controlled by Vpw.
Also in this case, in addition to transmitting their address events off chip, the output
neurons, together with the local inhibitory synapse connected to them, implement
the inhibition of return (IOR) mechanism. The spikes generated by the winning
cell’s output neuron are integrated by its corresponding inhibitory synapse, and
gradually increase the cell’s inhibitory post-synaptic current Iior . As the neuron
keeps on firing, the net input current to that cell (Iex − Iior ) decreases until a different cell is eventually selected as the winner. When the previous winning cell
is de-selected, its corresponding local output neuron stops firing and its inhibitory
Figure 6.17: Local output integrate and fire neuron. When the membrane voltage Vmem
increases above Vthr the output voltage Vout is driven to Vdd and an address event is generated. The transistors in the dashed box are part of the output AER circuitry.
synapse recovers, decreasing the inhibitory current Iior back to zero.
6.3.1 Experimental Results
To characterize the behavior of the selective attention chip with well controlled input signals we interfaced it to a workstation, via a National Lab-PC+ I/O card, and stimulated it using the AER communication protocol. With this setup we were able to stimulate all 64 pixels of the network with voltage pulses (i.e. address-events) at a maximal rate of 500Hz. As the input synapses were set to have time
constants of the order of milliseconds, each cell appeared to receive input spikes
virtually in parallel. The handshaking between the chip and the PC was carried
out at run time by the hardware present in the National I/O card. The chip’s input
stimuli consisted of patterns of address-events being generated by the workstation
at uniform rates of different frequencies. We performed two sets of experiments,
to demonstrate the chip’s response properties using both the analog P2V outputs
and the digital AER output.
Analog P2V outputs In the first set of experiments, we used a test stimulus that
excited cells (2,2), (2,7), (7,2) and (7,7) of the selective attention chip with 30Hz
pulses, and cell (5,5) with 50Hz pulses. Figure 6.18(a) shows the analog output of
the P2V circuits in response to 300ms of stimulation with the input “saliency map”
Figure 6.18: (a) Output of the P2V circuits of the selective attention architecture measured over a period of 300ms, in response to a test stimulus exciting four corners of the
input array at a rate of 30Hz and a central cell at a rate of 50Hz; (b) Histogram of the
chip’s output address-events, captured over a period of 13.42s in response to the same
input stimulus.
described above. The system initially selects the central cell (5,5). But, as the
IOR mechanism forces the WTA network to switch the selection of the winner, the
system cycles through all other excited cells as well. The P2V circuits are actively
driven when the WTA network is selecting a winner (i.e. when the output p-type
transistors of Fig. 6.16(a) are sourcing current into the nodes P2VX and P2VY).
At the times in which no cell is winning (i.e. when all cells are inhibited), there
is no active device driving the P2V circuits, and their outputs tend to drift toward
zero. This is evident in Fig. 6.18(a), for example, at the position corresponding
to cell (7,2) in the lower right corner of the figure. When the network selects it
as its eighth target, the horizontal P2V circuit outputs approximately 4.4V and
the vertical one outputs approximately 1.3V. When the IOR mechanism forces the
network to de-select the winner, the outputs of the P2V circuits slowly drift toward
zero. As soon as inhibition decreases, the network selects the cell (7,7) as the new
(ninth) winner, the position to voltage circuits are actively driven again, and their
output quickly changes from approximately 3.6V and 1.2V to 4.2V and 3.5V (for
the horizontal and vertical circuits respectively).
Digital AER outputs To verify that the AER outputs are consistent with the
analog P2V outputs, we stimulated the chip with the same pattern used for collecting the data of Fig. 6.18(a). We measured the address-events generated by the
selective attention chip in response to this input stimulus using a logic analyzer,
and plotted in Fig. 6.18(b) the histogram of such events. As shown, the chip’s
output address-events reflect, on average, the input stimulus, and are consistent
with the analog outputs of the P2V circuits.
The data of both Fig. 6.18(a) and (b) demonstrate how the IOR mechanism
forces the network to switch the selection of the winner from one input to a different one, cycling through all sufficiently strong inputs. To also demonstrate how different IOR dynamics settings (modified, for example, by changing the bias voltage Vτi of Fig. 6.15(b)) affect the system's behavior, we performed a second experiment with a different input stimulus. The stimulation pattern used in this experiment excited cells (2,2), (5,5) and (7,2) with pulses at a uniform frequency of 50Hz, cell (7,7) with 100Hz pulses and cell (2,7) with 150Hz pulses (see
Fig. 6.19(a) for a histogram of the input address-events). Figures 6.19(b), (c) and
(d) show histograms of the chip’s response for three different values of the bias
voltage Vτ i . The data of Fig. 6.19(b) was obtained by setting the time constant of
the inhibitory synapse to a relatively high value (Vτi = 227mV). In this case, once a cell is inhibited (after being selected as the winner), its input is suppressed for an extensive period of time and the WTA network is forced to select all other (non-suppressed) inputs. Conversely, the data of Fig. 6.19(d) was obtained by setting the synapse time constant to a relatively low value (Vτi = 193mV). In this case the
WTA network switches from selecting the cell receiving the strongest input to the
cell receiving the second-strongest input, and back. As the selected cells are not
suppressed for sufficiently long periods of time, the remaining inputs never win
the WTA competition. The histogram in Fig. 6.19(c) shows the data obtained for
the intermediate case of Vτi = 207mV.
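The dependence of the switching behavior on the inhibitory time constant can be reproduced with a very reduced behavioral model of the WTA/IOR loop: each cell low-pass filters its input rate into an excitatory current, the winner is the cell with the largest net current (plus a small hysteretic bonus), and the winner's inhibition-of-return current charges up while all such currents decay with a time constant tau_ior that plays the role of Vτi. The Python sketch below is only a qualitative caricature of the analog circuits, with arbitrary units and made-up parameters.

import numpy as np

def wta_ior(rates_hz, tau_ior, tau_ex=0.02, gain=3.0, hyst=0.02,
            t_sim=3.0, dt=1e-3):
    """Qualitative rate-based caricature of the hysteretic WTA with
    inhibition of return; all quantities are in arbitrary units."""
    rates = np.asarray(rates_hz, dtype=float) / max(rates_hz)
    i_ex = np.zeros(len(rates))        # low-pass filtered excitatory inputs
    i_ior = np.zeros(len(rates))       # inhibition-of-return currents
    winner, history = None, []
    for _ in range(int(t_sim / dt)):
        i_ex += dt / tau_ex * (rates - i_ex)
        net = i_ex - i_ior
        if winner is not None:
            net[winner] += hyst        # hysteresis favors the current winner
        winner = int(np.argmax(net))
        i_ior[winner] += gain * dt     # the winner's IOR synapse charges up
        i_ior -= dt / tau_ior * i_ior  # all IOR currents decay (role of Vτi)
        history.append(winner)
    return history

# Inputs analogous to the five stimulated cells (50, 50, 50, 100 and 150 Hz)
for tau in (0.15, 0.4, 2.0):           # short, intermediate and long IOR decay
    winners = wta_ior([50, 50, 50, 100, 150], tau_ior=tau)
    print(tau, sorted(set(winners)))   # inspect which inputs are ever selected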
The same data used to compute the address-event histograms of Fig. 6.19 can
be displayed using a different representation, to show the dynamics of the WTA
competition stage. In Fig. 6.20 we plotted the address-events measured for the
intermediate case of Fig. 6.19(c) over time. The addresses of the 8 × 8 cells are
labeled successively row by row, such that labels 0 through 7 correspond to the
addresses of the cells in the first row, labels 8 through 15 correspond to the addresses of cells in the second row, and so on. Consistent with the histogram of
Fig. 6.19(c), this plot shows how the system selects the cell (2,7) (labeled as 15
in Fig. 6.20) most frequently, switching occasionally to cell (5,5) (labeled as 37),
and more often to cells (7,2), and (7,7) (labeled as 50 and 55). As mentioned in
Section 6.2.1, the details of the switching dynamics can be controlled by setting
appropriately the bias voltages of the excitatory and inhibitory synaptic circuits
(see (Vw, Vτe) and (Vq, Vτi) in Fig. 6.15) and the neuron's firing rate (controlled by
Figure 6.19: Event histograms of the addresses generated by the workstation and sent to the chip (a), and of the output addresses generated by the selective attention chip (b), (c), and (d). All chip parameters are kept constant throughout the plots except for the bias parameter Vτi. The histogram in (b) was obtained with Vτi = 227mV, the one in (c) with Vτi = 207mV, and the one in (d) with Vτi = 193mV.
Vinj of Fig. 6.16(b)). These bias voltages, together with the other ones controlling the hysteretic WTA network's behavior (namely Vwta, Vex, and Vinh of Fig. 6.16(a)), endow the system with sufficient flexibility to use the same chip in different types of selective attention tasks.
Figure 6.20: Output address events of the selective attention chip biased with Vτi = 207mV. The 2D address space of the chip's architecture is mapped onto the plot's 1D ordinate by labeling each address successively, row by row.
6.4 Selective attention applications
The test stimuli used in the experiments of Section 6.3.1 were simple examples designed to demonstrate the expected behavior of the selective attention chip. They do not resemble realistic saliency maps (see Fig. 6.12 and Fig. 6.21(a,b)). In practical applications saliency maps would more likely resemble the one shown in Fig. 6.21(c), or the one shown in Fig. 6.13. More elaborate saliency maps could be processed by 2D selective attention networks of greater size. The 8×8 architecture proposed in this thesis can scale up to networks of arbitrary size: the
performance of the hysteretic WTA circuits, which operate collectively in a massively parallel way, is not affected by the network’s size. Similarly, given that in
the selective attention system there is always one or a few winners at a time, the
performance of the AER circuitry does not degrade with size (performance is affected only in architectures in which too many cells are trying to access the AER
bus simultaneously).
As demonstrated in Section 6.2, these types of selective attention chips can
operate reliably also on elaborate saliency maps, generated from high-resolution
digitized images. In practical applications, the images could come, for example, from a camera connected to the workstation, and the selective attention chip could be used to allocate, in real time, CPU (image-processing) resources only to the n most salient regions of the image, or to scan the whole image in an intelligent way, ordering the scanning process by region saliency. Depending on the chip's bias settings, the system could also be tuned to visit each region only once, switching slowly from one region to the other, or to revisit each region over and over again, switching quickly from one region to the other. Systems of this
Figure 6.21: Image representations of saliency maps. (a) Saliency map corresponding
to the input stimulus used for the experiment of Fig. 6.18; (b) Saliency map used for the
experiment of Fig. 6.19; (c) Fictitious example resembling a realistic saliency map.
type would already benefit from the real-time response properties of the selective
attention chip. But the most effective way of exploiting the computational properties of this chip would be to use it in conjunction with neuromorphic sensors
that employ the AER communication protocol, such as silicon retinas or silicon
cochleas [7, 33, 67]. These types of systems could be used as research tools for
testing, in real-time, with real stimuli, different hypotheses on biological selective
attention mechanisms [24, 89, 95]. Or they could be used as low-cost alternatives
to implement visual/auditory tracking or monitoring systems. For example, rather
than using several fixed high-resolution (high-cost) cameras to monitor an environment, one could use a single, motorized, high-resolution camera driven by a
selective attention system, comprising an AER silicon retina with a wide field-of-
view lens interfaced to the selective attention chip. In the next section we describe
a first attempt at making a system of this type.
6.5 An active AER selective attention system
We constructed an active vision system using an AER image sensor mounted on
a motorized pan-tilt unit, and the 2-D AER selective attention chip interfaced to a
workstation.
The selective attention chip receives input from an AER imaging sensor [66],
and transmits the address of the winning pixel to the workstation, which uses it to drive the pan-tilt unit on which the sensor is mounted. A standard CCD camera
is mounted next to the sensor, to visualize the sensor’s field of view. The AER
sensor responds to contrast transients and its address events report the position
of moving objects. The selective attention chip selects the locations with the highest-
contrast moving objects and cycles through them, while the workstation drives the
The correspondence illustrated in Fig. 6.22(a) is the following: the retina, LGN and V1 (modeled by the transient imager) provide image input and feature-map calculation; the eye muscles (modeled by the pan-tilt unit) implement eye movements; the pulvinar, primary visual cortex and superior colliculus (modeled by the selective attention chip) carry out saliency map processing and focus of attention computation; and the superior colliculus (modeled by the software algorithm) provides motor control for eye movements.
Figure 6.22: (a) Block diagram of the sensory-motor selective attention model. The
figure shows the basic computational blocks used, as well as the corresponding biological
analogues and their function. (b) Schematic diagram of the active vision setup: The
neuromorphic imager, mounted on a pan-tilt unit, transmits its output to the selective
attention chip. The latter sends the results of its computations to a host computer which
uses this data to drive the pan-tilt unit’s motors.
pan-tilt unit, centering the selected locations on the sensor's imaging array.
A block diagram of the selective attention sensory-motor system and the correspondence between the system’s computational blocks and their biological counterparts is shown in Fig. 6.22(a). A schematic diagram of the system’s setup
illustrating how the individual components are connected together is shown in
Fig. 6.22(b). At the input stage we use a neuromorphic imager that is sensitive
to temporal changes in illumination (transients) and extracts motion or flicker as
features. Since our system in its current state extracts only one feature map, the
saliency map is identical to the extracted feature map. In this case no feature combination stage is necessary. The transient imager chip transmits its output data
directly to the selective attention chip.
Based on its inputs, the selective attention chip computes the location of the
focus of attention and sends address events encoding this location to the host computer. In addition to managing the communication with the selective attention
chip, using the AER communication protocol, the host computer is used for data
logging and, more importantly, for driving the motors of a commercial pan-tilt
unit (fabricated by Directed Perception, Inc.) on which the transient imager is mounted (see Fig. 6.23). The pan-tilt unit is
Figure 6.23: Selective attention active vision system. The selective attention chip processes sensory data coming from an AER imaging sensor and transmits its output to a
workstation that drives the pan-tilt unit on which the sensor is mounted. A standard CCD
camera is mounted next to the AER sensor to visualize the sensor's field of view.
used to orient the imager chip such that the location of the focus of attention lies
in its central region.
The system proposed here uses a single-sender/single-receiver point-to-point
AER protocol. The sender chip is a transient imager that contains a two-dimensional
array of 16×16 adaptive photoreceptors with an AER arbiter circuit that serially
processes the requests from the different pixels in the order of their activation,
latches their addresses onto the AER bus in the same order, and sends acknowledge pulses to the corresponding pixels [9]. As soon as a new address is ready on
the bus, the handshaking cycle with the receiver chip is initiated, in the course of
which the address of the sending pixel is transmitted. The transient imager transmits its address events to the selective attention chip using a topographic mapping.
As the sender has 16×16 pixels and the receiver only 8×8, we map the addresses
of 2×2 neighboring pixels on the sender to the same pixel on the receiver. This
mapping was accomplished by simply discarding the least significant bit of the
sender address, for each dimension.
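In terms of the address bits, this subsampling amounts to a one-bit shift per coordinate. The following Python sketch illustrates the idea, assuming the 4-bit row and 4-bit column addresses described in Section 6.5.1; it is an illustration of the mapping, not the actual interfacing code.

def map_sender_to_receiver(x, y):
    """Map a 16x16 transient-imager pixel address onto the 8x8 selective
    attention chip by discarding the least significant bit of each coordinate,
    so that each 2x2 block of imager pixels targets the same receiver pixel."""
    assert 0 <= x < 16 and 0 <= y < 16
    return x >> 1, y >> 1

# All four pixels of the imager block {(6,10), (7,10), (6,11), (7,11)}
# map to the same receiver pixel (3, 5).
for xs, ys in [(6, 10), (7, 10), (6, 11), (7, 11)]:
    assert map_sender_to_receiver(xs, ys) == (3, 5)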
6.5.1 The Transient imager chip
The transient imager is a 16×16 pixel array of irradiance transient detectors that
is used to generate the events that drive the system. Each pixel responds with
binary pulses in real time to a local change of a brightness distribution projected
Figure 6.24: Block diagram of irradiance transient detector with event-based communication interface.
through a lens onto its surface. These pulses are used as the request signals to the
AER communication interface. Figure 6.24 shows a block diagram of the pixel
circuitry. The transient detector comprises an adaptive photo-receptor [22] with
a rectifying temporal differentiator [66] in the feedback loop. Positive irradiance
transients, corresponding to dark-to-bright or ON transitions, and negative irradiance transients, corresponding to bright-to-dark or OFF transitions, appear at
different output terminals. The ON and OFF responses are separately amplified
with tunable gains, each generating a request pulse to the on-chip arbiter if it exceeds a chosen threshold. By appropriately setting the threshold and the respective
gain factors, the circuit can be made to respond only to ON transients or only to
OFF transients or to both types of transients. Each acknowledge pulse from the
arbiter triggers a reset pulse at the requesting terminal, whose duration determines
a refractory period for the succeeding request from the same terminal. Depending
on the chosen refractory period and the magnitude and duration of the irradiance
transient, the pixel responds with a single spike or a burst of spikes. In the present
application, a short refractory period of 140µs was chosen to obtain bursts, and
only the OFF response was used to stimulate the selective attention chip.
The pixels are arranged on a square grid. The position of a pixel along a row
is encoded with a 4-bit column address and its position along a column with a
4-bit row address. An additional address bit is used to distinguish between ON
and OFF transients.
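Assuming the bit layout just described (a 4-bit row address, a 4-bit column address and one polarity bit), an address event can be packed and unpacked as in the following Python sketch; the exact ordering of the bits on the bus is an assumption made purely for illustration.

def pack_event(row, col, is_on):
    """Pack a transient-imager event into a 9-bit word: one ON/OFF polarity
    bit, a 4-bit row address and a 4-bit column address (ordering assumed)."""
    assert 0 <= row < 16 and 0 <= col < 16
    return (int(is_on) << 8) | (row << 4) | col

def unpack_event(word):
    return (word >> 4) & 0xF, word & 0xF, bool(word >> 8)

row, col, on = unpack_event(pack_event(12, 3, False))
assert (row, col, on) == (12, 3, False)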
6.5.2 The Motor Control Algorithm
The control algorithm that the host computer executes is responsible for driving
the motors of the pan-tilt unit in such a way as to center the location picked by the
selective attention chip within the central region of the transient imager chip. This
algorithm represents a first attempt at modeling the bottom-up, stimulus-driven
neural mechanism that generates saccadic eye movements which center the fovea
with respect to the location of the focus of attention.
Figure 6.25: Image captured from the CCD camera mounted next to the transient imager.
The outer frame shown in the image corresponds to the field of view of the transient
imager, whereas the inner frame is drawn to highlight the transient imager's central region.
The cross to the bottom right of the image center represents the location of the focus of
attention currently computed by the selective attention chip.
To evaluate quantitatively the response properties of the system and test the
motor control algorithm, we mounted a standard CCD camera next to the transient
imager chip and captured images on the host computer (see also Fig. 6.22(b)).
This allowed us to see in real-time the images projected onto the focal plane of
the transient imager chip, as shown in Fig. 6.25. We calibrated the system so that
the image projected onto the transient imager array corresponds to the central part
of the image captured by the CCD camera, shown as the outer square in the center
of Fig. 6.25. The inner square drawn in the center of Fig. 6.25 represents the part
of the scene being projected on the central 4 by 4 region of the transient imager
array. The location selected by the selective attention chip is represented by a
small cross, superimposed onto the CCD image.
The control algorithm produces motor commands that depend on the current
position of the selected location and its recent history: if the cross lies within
the inner square of the image, no camera movements are triggered (the camera is
already “foveating” the salient feature). If the cross shifts to a location outside
the inner frame, the algorithm records the address of the location and increases
a counter associated with that address. As soon as the counter for a particular
address reaches a threshold n (i.e. when the cross revisits the same location n
times), the algorithm generates a camera movement that centers the selected location within the central region of the transient imager array (the camera “saccades”
to the persistent salient stimulus). In this way camera movements are generated
only if a salient location is visited more than once. The revisiting constraint ensures that the system does not saccade to all locations picked by the selective
attention chip, but orients its gaze only toward persistent salient stimuli. In the examples shown in Section 6.5.4, n was set to 5. The value of n was chosen to reproduce the characteristics of biological selective attention systems, as reported in
the neuroscience literature [95]: while the focus of attention shifts 15 to 20 times
per second, saccadic eye movements are made only 3 to 5 times per second [95].
Another important function implemented by the motor control algorithm is
that of saccadic suppression. During a camera movement the images projected on
the focal plane of the transient imager array generate a large amount of address
events. These events are not relevant for the analysis of the scene once the camera
stops moving. In biology this problem is solved by suppressing all inputs arriving
from the retinas during saccadic eye movements (indeed, we are effectively blind
during a saccade). In the current version of our system, the addresses generated
by the transient imager chip are hardwired into the selective attention chip (see
Fig. 6.22(b)). There is no way of suppressing these events at the source. During a
camera movement the selective attention chip receives and processes all spurious
events from the imager and the addresses generated by the selective attention chip
are transmitted to the host computer. The control algorithm ignores the effect of
these events, by resetting all address counters to zero after each camera movement.
In this way, the recent history of all selected positions is canceled and normal
operation of the control algorithm can be resumed.
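The counter-based saccade rule and the saccadic-suppression reset described above can be summarized in a few lines of Python. The functions is_in_central_region and move_pan_tilt_to are hypothetical placeholders for the calibration check and the pan-tilt driver routine; the structure is only a sketch of the control loop, not the actual host-computer code.

from collections import defaultdict

N_VISITS_FOR_SACCADE = 5              # the threshold n (set to 5 in the examples of Section 6.5.4)

def is_in_central_region(address):
    # Hypothetical placeholder: should return True when the attended pixel
    # falls within the calibrated central (foveal) region of the imager.
    return False

def move_pan_tilt_to(address):
    # Hypothetical placeholder for the pan-tilt command that centers
    # the given address on the imaging array.
    pass

def control_loop(attended_addresses):
    """Process the stream of focus-of-attention addresses sent by the chip."""
    visit_counts = defaultdict(int)
    for address in attended_addresses:
        if is_in_central_region(address):
            continue                              # already "foveating": no movement
        visit_counts[address] += 1
        if visit_counts[address] >= N_VISITS_FOR_SACCADE:
            move_pan_tilt_to(address)             # saccade to the persistent stimulus
            visit_counts.clear()                  # saccadic suppression: discard the
                                                  # history gathered around the movement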
6.5.3 System response in absence of camera movements
Initially, we tested the system with the motors of the pan-tilt unit turned off. The
input images consisted of a laboratory scene with two flashing LEDs in the foreground. The two LEDs were blinking in phase, with a frequency of 1Hz and a
duty-cycle of 50%. As the transient imager responds only to local changes in illumination, the blinking LEDs proved to be a reliable and well controlled stimulus.
The static background did not contribute to the generation of address events. We
placed a diffusion glass in front of the transient imager’s lens, to diffuse the projection of the two LEDs on the imager’s focal plane. In this way we were able to
stimulate several pixels of the imaging array with each LED. Fig. 6.26(a) shows
the histogram of the address events generated by the transient imager array in response to the flashing LEDs, captured over a period of 2s. The two regions with
the highest occurrence of events (around pixels (5,9) and (11,11)) correspond to
the locations of the LEDs. Fig. 6.26(b) shows the histogram of address events
generated by the selective attention chip. As shown, on average, the selective
attention chip visited pixels (3,5), (3,4) and (6,6), (6,5) most often.
While the event histogram shows that the selective attention chip acts on average like a threshold filter, picking only inputs with a high mean frequency, it
does not show the more interesting aspect of the computation carried out by the
chip: its dynamics. To show the dynamical aspect of the selective attention chip’s
response, we plotted in Fig. 6.27 a raster plot. This plot shows the activity of
Figure 6.26: (a) Histogram of events generated by the transient imager pixels in response
to two diffused flashing LEDs. The LED stimulating the region around pixel (5,9) has
higher contrast than the other LED. (b) Histogram of events generated by the selective
attention chip in response to the events generated by the transient imager chip.
Figure 6.27: Raster plot of the activity of the neurons of both transient imager chip (dots)
and selective attention chip (circles) in response to the flashing LEDs. To plot the data
from both chips using an address space with the same resolution, we sub-sampled the
addresses of the transient imager chip. The LEDs flashed approximately at 0.25s, 1.25s
and 2.25s.
the transient imager and of the selective attention chip neurons over time, in response to the flashing LEDs. The 8 by 8 neurons of the selective attention chip
are labeled successively, row by row (1 through 64), and the events that they generated are plotted with circles. To show the events of the transient imager pixels
on the same scale, we sub-sampled their addresses taking into consideration only
their three most significant bits (in the same way we implemented the mapping
of addresses from the transient imager pixels to the selective attention ones, as
described in Section 6.1.1). The high density of events around time instants 0.5s,
1.5s and 2.5s is due to the flashing of the LEDs. Within a single flash, the focus of attention shifts approximately four times, moving from one region of high
saliency to another. The proportion between events generated by the two chips is
consistent with the data of Fig. 6.26. By looking at the selective attention chip
data of Fig. 6.27 one can extrapolate the focus of attention’s scanpaths. Note how
these scanpaths tend to repeat themselves over time. This characteristic will be
even more evident in Section 6.5.5, when we analyze the response of the system
to natural stimuli.
6.5.4 System response in presence of camera movements
To allow the system to make camera movements we activated the motors of the
pan-tilt unit on which the imager was mounted. The input stimulus consisted again
of two flashing LEDs, but this time not in phase. Furthermore we removed the diffusion filter from the transient imager’s lens, so that the two LEDs stimulated only
a few pixels of the imaging array. As described in Section 6.5.2, the selective attention chip was driving the pan-tilt unit to orient the imager towards the attended
location. Figure 6.28 shows a sequence of images captured by the CCD camera
mounted on the pan-tilt unit, while the system was engaged in selecting and tracking the LEDs. Initially only the top LED was flashing, and the system selected it
and oriented the central region of the imager to that location (see Fig. 6.28(a)). As
we turned on the bottom LED, the system changed the focus of attention location
(see Fig. 6.28(b)) and made a camera movement centering the attended stimulus
on the central region of the imager (see Fig. 6.28(c)).
The raster plot of Fig. 6.29 shows in detail the sequence of events that led
to the camera movement. The arrangement of the neuron addresses on the figure
axis is the same as in Fig. 6.27. Initially the selective attention chip was attending to
the region of transient imager pixels that project to its 35th pixel. As the second
LED flashed, the imager pixels excited also the 20th selective attention chip pixel.
After approximately 1s, the WTA network of the selective attention chip switched
and selected the second LED as the winner. After having attended to that location
for approximately 2.5s, the system made an abrupt camera movement (saccade),
and centered the attended stimulus on the imaging array.
6.5.5 System response to natural stimuli
In this section we show how the system is able to select and attend natural stimuli,
that were not explicitly engineered to optimally drive the imaging array. As we
Figure 6.28: Sequence of images showing the selection of a salient stimulus prior to and
after a saccadic eye movement. (a) The system is attending to the top LED, already centered on the central part of the imaging array. (b) The system selects the bottom LED, outside the central region of the imager. (c) The system performed a saccade toward the bottom LED, and is currently attending to it.
Figure 6.29: Raster plot of the activity of the neurons of the transient imager chip (dots)
and of the selective attention chip (circles) in response to two flashing LEDs. The focus of
attention shifts from a central region of the imaging array to a peripheral one (see circles
at 2s ≤ t < 6s). Consequently, the system makes a camera movement, at the time indicated
by the vertical arrow, and re-centers the attended location.
did in Sections 6.5.3 and 6.5.4, we initially tested the system in the absence of
camera movements and subsequently tested it with the motor output activated.
Figure 6.30 shows the location of the focus of attention, as measured by the
P2V circuits of the selective attention chip (see Fig. 6.14), in response to the fluttering fingers of the experimenter, over a period of 500 ms. The x-component and
y-component of the focus of attention are plotted against each other, and superimposed onto an image taken by the CCD camera during the experiment.
Figure 6.30: Output of the P2V circuits of the selective attention chip (see Fig. 6.14),
representing the scanpath of the focus of attention, switching back and forth between the
fluttering fingers of both of the experimenter’s hands. The scanpath data is superimposed
onto a snapshot taken from the CCD camera during the experiment.
Although
the resolution of the selective attention chip is 8 × 8 pixels, the data of Fig. 6.30
seems to belong to a much higher resolution architecture. This is due to the fact
that the output of the P2V circuits is analog and is affected by noise [48]. These
analog output signals might not be appropriate for precise quantitative measurements, but could be used to drive, via buffers or power-amplifiers, motors and
actuators to implement (negative feedback) sensory-motor loops [48].
Figure 6.31 shows the response of the system to the same stimulus as Fig. 6.30,
with the motors engaged. Fig. 6.31(a) shows the beginning of the experiment: the
motors had just been activated, the imager was still in its initial position and the
selective attention chip chose a pixel in the top left region of the transient imager
array as the focus of attention. After the selective attention chip transmitted the
same pixel address to the host computer for a set number of times, specified by
the motor control algorithm (see Section 6.5.2), the control algorithm generated
a camera movement and centered the focus of attention with respect to the transient imager array (see Fig. 6.31(b)). If the salient stimuli were persistent (e.g.
if the fingers kept on moving) and remained in the field of view of the imager,
the system continuously shifted its gaze from one salient stimulus to the other.
This behavior has proven to be extremely reliable and robust. The system’s response is largely invariant to illumination conditions, stimulus speed and (static)
background conditions.
Figure 6.31: Saccadic eye movements in response to moving fingers. (a) CCD camera
snapshot taken before the saccadic eye movement (the focus of attention has just switched
from one hand to the other). (b) CCD camera snapshot taken just after the saccadic
eye movement (the focus of attention and the salient stimulus are now in the center of the
imaging array).
Chapter 7
Silicon neural models of
winner-take-all networks
The single-chip and multi-chip selective attention systems that we described in
previous Chapters all share a common feature: they use the current-mode WTA
circuit described in Chapter 4 to perform competition. While this compact and
elegant circuit captures many of the features found in biological competitive networks that use mean firing rates to represent signals, it is not able to reproduce
those results that arise in natural neural systems due to spike-timing effects. To investigate the effect of spike timing on the computational properties of competitive networks one should implement networks of spiking neurons. These types
of networks have been extensively studied using both analytical and numerical
simulation tools [76]. An increasing number of researchers use software implementations of these networks as tools for investigating the effect of spike timing
and synchrony on the network’s computational properties [4, 28, 65, 94]. The
importance of these approaches is underscored by recent experimental results that
support the idea that the temporal fine structure of neural spike trains, in particular
the correlation between firing times of different neurons, is used for the representation of sensory input [108] or internal states, in particular, selective attention
[34, 107].
While software simulations are effective for analyzing networks of neurons
that use rate codes as the principal representation, detailed simulation of networks of spiking neurons is a CPU-intensive process and can still require a significant amount of simulation time. Here we present a set of hybrid analog/digital
VLSI neural circuits, that allow us to overcome these problems, by exploiting
the advantages of both highly parallel analog computation and high-speed asynchronous digital VLSI techniques. Specifically, we present a VLSI device that
contains a competitive network of spiking neurons that uses the AER communication protocol to transmit and receive spikes to and from other AER chips.
Figure 7.1: Circuit diagram of the I&F neuron.
7.1 The Integrate-and-Fire Neuron Circuit
A spiking neuron model that allows us to implement large, massively parallel
networks of neurons is the Integrate-and-Fire (I&F) model. Networks of I&F
neurons have been shown to exhibit a wide range of useful computational properties, including feature binding, segmentation, pattern recognition, onset detection,
input prediction, etc. [76]. The recent and growing interest in pulse-based neural networks has led to the design and fabrication of an increasing number of
VLSI networks of Integrate–and–Fire (I&F) neurons. These types of devices have
great potential, allowing researchers to implement simulations of large networks
of spiking neurons with complex dynamics in real time, possibly solving computationally demanding tasks. This is especially true as continuous improvements in
VLSI technology allow for the fabrication of AER devices containing thousands
of elements operating in parallel. For these devices to be practically realizable,
it is crucial to have pulse generating elements with minimal power consumption
(locally) and with pulse-frequency saturation and adaptation mechanisms to limit
and reduce the power consumption globally and to optimize communication bandwidth for the transmission of address-events. The I&F neuron circuit depicted in
Fig. 7.1 implements these saturation and adaptation mechanisms, and has been
shown to be one of the lowest-power circuits of its kind [53].
The I&F neuron circuit is shown in Fig. 7.1. The circuit comprises a source
follower M1-M2, used to control the spiking threshold voltage; an inverter with
positive feedback M3-M7, for reducing the circuit's power consumption; an inverter with controllable slew-rate M8-M11, for setting arbitrary refractory periods; a digital inverter M13-M14, for generating digital pulses; a current-mirror integrator M15-M19, for spike-frequency adaptation; and a minimum-size transistor M20 for setting a leak current.
7.1.1 Circuit operation
The input current Iinj is integrated linearly by Cmem onto Vmem. The source follower M1-M2 produces Vin = κ(Vmem − Vsf), where Vsf is a constant subthreshold bias voltage and κ is the subthreshold slope coefficient [74]. As Vmem increases and Vin approaches the threshold voltage of the first inverter, the feedback current Ifb starts to flow, increasing Vmem and Vin more rapidly. This positive feedback makes the inverter M3-M5 switch very rapidly, dramatically reducing its power dissipation.
A spike is emitted when Vmem is sufficiently high to make the first inverter
switch, driving Vspk and Vo2 to Vdd . During the spike emission period (for as
long as Vspk is high), a current with amplitude set by Vadap is sourced into the
gate-to-source parasitic capacitance of M19 on node Vca . Thus, the voltage Vca
increases with every spike, and slowly leaks to zero through leakage currents when
there is no spiking activity. As Vca increases, a negative adaptation current Iadap
exponentially proportional to Vca is subtracted from the input, and the spiking
frequency of the neuron is reduced over time.
Simultaneously, during the spike emission period, Vo2 is high, the reset transistor M12 is fully open, and Cmem is discharged, bringing Vmem rapidly to Gnd.
As Vmem (and Vin ) go to ground, Vo1 goes back to Vdd turning M10 fully on. The
voltage Vo2 is then discharged through the path M10-M11, at a rate set by Vrfr (and by the parasitic capacitance on node Vo2). As long as Vo2 is sufficiently high, Vmem is clamped to ground. During this "refractory" period, the neuron cannot spike, as all the input current Iinj is absorbed by M12.
Figure 7.2(a) shows an action potential generated by injecting a constant current Iinj into the circuit and activating both the spike-frequency adaptation and refractory period mechanisms. Figure 7.2(b) shows how different refractory period settings (Vrfr) saturate the maximum firing rate of the circuit at different levels.
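To first order, and neglecting the leak, the adaptation and the positive feedback, this saturation can be sketched analytically: the firing rate is f = 1/(t_int + t_refr), where t_int = Cmem·Vthr/Iinj is the time needed to integrate up to an effective spiking threshold Vthr and t_refr is the refractory period set by Vrfr, so the rate grows roughly linearly with Iinj and saturates at 1/t_refr. The short Python sketch below evaluates this idealized f-I curve; the threshold and the refractory-period values are hypothetical and chosen only for illustration.

# Idealized f-I curve of an I&F neuron with a refractory period (first-order
# sketch: leak, positive feedback and adaptation are ignored).
C_MEM = 0.66e-12   # membrane capacitance (F), value taken from Table 7.1
V_THR = 1.0        # hypothetical effective spiking threshold (V)

def firing_rate(i_inj, t_refr):
    """Firing rate (Hz) for an input current i_inj (A) and refractory period t_refr (s)."""
    if i_inj <= 0:
        return 0.0
    t_int = C_MEM * V_THR / i_inj     # time to integrate from reset up to threshold
    return 1.0 / (t_int + t_refr)     # saturates at 1/t_refr for large i_inj

# Three hypothetical refractory periods, standing in for different Vrfr settings.
for t_refr in (0.8e-3, 1.2e-3, 2.0e-3):
    print([round(firing_rate(i * 1e-7, t_refr)) for i in (1, 2, 3, 4, 5)])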
The circuit of Fig. 7.1 does not implement a simple linear I&F model. Rather, its positive feedback and spike-frequency adaptation mechanisms represent additional features that increase the model's complexity (and hopefully its computational capabilities). The overall current that the circuit receives is Iin + Ifb − Iadap, where Iin is the circuit's input current Iinj minus the leak current Ileak, Ifb is the positive feedback current and Iadap is the adaptation current generated by the spike-frequency adaptation mechanism. We can use the transistor's
[Figure 7.2 plots: (a) membrane potential (V) versus time (s); (b) firing rate (Hz) versus input current (µA), for Vrfr = 300 mV, 350 mV and 450 mV.]
Figure 7.2: (a) Measured data (circles) representing an action potential generated for a constant input current Iinj, with the spike-frequency adaptation and refractory period mechanisms activated. The data is fitted with the analytical model of eq. (7.5) (solid line). (b) The circuit's f-I curves (firing rate versus input current Iinj) for different refractory period settings.
Cm = 0.66 pF    Iin = 177 pA    Vsf = 0.5 V
Ca = 0.12 pF    I1 = 2.29 pA    Va0 = 50 mV
Cp = 500 fF     I0 = 100 fA     κ = 0.6

Table 7.1: Parameters used to fit the data of Fig. 7.2(a)
weak-inversion equations [74] to compute the adaptation current:
I_{adap} = I_0 \, e^{\kappa V_{ca}/U_T} \qquad (7.1)
where I0 is the transistor’s dark current [74] and UT is the thermal voltage.
If we denote with Ca the parasitic gate-to-source capacitance on node Vca of
M19, and with C p the parasitic gate-to-drain capacitance on M19, then:
V_{ca} = V_{ca0} + \gamma V_{mem} \qquad (7.2)

where γ = Cp/(Cp + Ca) and Vca0 is the steady-state voltage stored on Ca, updated with each spike.
To model the effect of the positive feedback we can assume, to first order
approximation, that the current mirrored by M3,M7 is:
I_{fb} = I_1 \, e^{\kappa V_{in}/U_T} \qquad (7.3)

where I1 is a constant current flowing in the first inverter when both M4 and M5 conduct, and Vin = κ(Vmem − Vsf) is the output of the source follower M1-M2.
The equation modeling the subthreshold behavior of the neuron is:
C_0 \frac{d}{dt} V_{mem} = I_{in} + I_{fb} - I_{adap} \qquad (7.4)
where C0 = Cm + γCa. Substituting Iadap and Ifb with the equations derived above, we obtain:
C_0 \frac{d}{dt} V_{mem} = I_{in} + I_1 \, e^{-\kappa^2 V_{sf}/U_T} \, e^{\kappa^2 V_{mem}/U_T} \left( 1 - e^{-V_{mem}/U_T} \right) - I_0 \, e^{\kappa V_{a0}/U_T} \, e^{\kappa \gamma V_{mem}/U_T} \qquad (7.5)
We fitted the experimental data by integrating eq. (7.5) numerically and using
the parameters shown in Table 7.1 (see solid line of Fig. 7.2(a)). The initial part
of the fit (for low values of Vmem) is not ideal because the equations used to model the source follower M1-M2 are valid only for sufficiently high values of Vmem.
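As a minimal sketch of this procedure, the following Python fragment integrates the subthreshold dynamics of eq. (7.5) with a forward-Euler step, using the parameters of Table 7.1. The time step, the duration, the initial condition and the value assumed for the thermal voltage (UT ≈ 25 mV) are illustrative choices, not necessarily those used to produce the fit of Fig. 7.2(a).

import math

# Forward-Euler integration of the subthreshold dynamics of eq. (7.5), with the
# parameters of Table 7.1; step size and initial condition are assumptions.
UT, KAPPA = 0.025, 0.6                               # thermal voltage (V), slope factor
GAMMA = 0.5 / (0.5 + 0.12)                           # gamma = Cp / (Cp + Ca), eq. (7.2)
C0 = 0.66e-12 + GAMMA * 0.12e-12                     # C0 = Cm + gamma * Ca (F)
IIN, I1, I0 = 177e-12, 2.29e-12, 100e-15             # currents (A)
VSF, VA0 = 0.5, 0.05                                 # bias voltages (V)

def dvmem_dt(v):
    i_fb = I1 * math.exp(-KAPPA**2 * VSF / UT) * math.exp(KAPPA**2 * v / UT) \
           * (1.0 - math.exp(-v / UT))               # positive-feedback term
    i_adap = I0 * math.exp(KAPPA * VA0 / UT) * math.exp(KAPPA * GAMMA * v / UT)
    return (IIN + i_fb - i_adap) / C0                # right-hand side of eq. (7.5)

v, dt, trace = 0.0, 1e-6, []
for _ in range(50000):                               # 50 ms of membrane trajectory
    v += dt * dvmem_dt(v)
    trace.append(v)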
7.2 Networks of Integrate and Fire neurons
Networks of I&F neurons consist of arrays of these types of neurons connected
to synaptic circuits that generate currents with biologically plausible dynamics.
The synaptic circuits used in neuromorphic devices can exhibit simple non-linear
integration properties [9], short-term depression properties [16, 99], or plasticity/learning properties [15, 52].
Researchers in the neuromorphic engineering community are just now starting to put all these components together to form AER neural network devices. In Fig. 7.3 we show the activity of one of these devices, containing a network of 32 I&F neurons and 32×8 plastic synapses, in response to constant
currents being injected into each neuron. Each dot in Fig. 7.3(a) represents an
address-event. The address of the spiking neuron is on the ordinates, while time
is on the abscissae. In Fig. 7.3(b) we plotted the neurons’ mean firing rates as a
function of their input current (set by changing the Vgs of Fig. 7.1), for different
refractory period settings (Vrfr of Fig. 7.1), on a semi-logarithmic scale. Given
the exponential relationship between Vgs and the current of a MOSFET working
in weak-inversion [74], Fig. 7.3(b) shows how the firing rate is linear with the
input current, saturating at higher asymptotic values, for increasing values of Vrfr
(decreasing refractory period duration).
These AER networks of I&F neurons act as transceivers: they receive address-events as input and generate address-events as output. The topology of the network, together with the weights of the synapses interconnecting the neurons, determines the network's functionality. Address-event systems allow us to arbitrarily configure network topologies by re-mapping the (digital) address-events (e.g. using lookup tables, microcontrollers, or dedicated PCI boards [19]). There are currently different approaches for controlling synaptic weights. These include the
use of Floating-Gate devices [29], binary synapses [16], or spike-timing based
weight update rules [11, 52].
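As a toy illustration of the re-mapping idea mentioned above (and not of the actual firmware of any specific board), the sketch below routes each incoming source address to a programmable list of destination addresses through a lookup table; re-programming the table is all that is needed to redefine the network topology.

from collections import defaultdict

# Toy AER re-mapper: a lookup table from source addresses to destination
# addresses; the class and method names are illustrative only.
class AERMapper:
    def __init__(self):
        self.table = defaultdict(list)     # source address -> list of destinations

    def connect(self, src, dst):
        self.table[src].append(dst)

    def route(self, events):
        """events: iterable of (timestamp, source_address); yields re-mapped events."""
        for t, src in events:
            for dst in self.table[src]:
                yield (t, dst)

# Example: a ring of 4 neurons in which each neuron excites its right neighbor.
mapper = AERMapper()
for n in range(4):
    mapper.connect(src=n, dst=((n + 1) % 4, "exc"))
print(list(mapper.route([(0.001, 0), (0.002, 3)])))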
7.3 A competitive ring-of-neurons network
A network topology that can exhibit WTA behavior is shown in Fig. 7.4. We
implemented this network using a ring of 32 excitatory I&F neurons that project
to a global inhibitory neuron which in turn projects its output spikes back to all
the excitatory neurons. In our implementation the excitatory neurons are mutually
connected to their nearest neighbors. Each neuron can also be excited or inhibited
via additional synaptic circuits that can be accessed by AER spikes arriving from
outside the chip.
Architectures of this type have been shown to exhibit a broad range of interesting computational properties [39, 102].
Figure 7.3: (a) Raster plots showing the activity of an AER array of 32 I&F neurons in
response to a constant input current, for four decreasing values of the refractory period
(clockwise from the top left quadrant). (b) Mean response of all neurons in the array to
increasing values of a global input current, for the same refractory period settings. The
error bars represent the standard deviation of the responses across the array.
Hahnloser et al. (2000) present a similar network architecture, but one in which the neurons are modeled with simple linear
threshold units. In the mean-rate interpretation of neural firing, the network we
propose can exhibit a winner-take-all behavior and reproduce other results presented by Hahnloser et al. (2000). In the more realistic spiking interpretation of the
network, the system can also produce behaviors such as adaptive thresholding,
single-neuron winner-take-all behavior, and synchrony filtering. The temporal
dependence of the inhibitory interactions produces a variable selectivity to synchronous events as well as a locking behavior that shifts output spikes closely in
time to the input spikes.
The VLSI device that implements this network is a 2 × 2 mm² chip, fabricated using standard 1.2 µm CMOS technology. It contains analog circuits implementing the neurons and synapses present in the competitive network, and hybrid
analog/digital circuits that implement the AER input/output communication infrastructure.
A schematic diagram of the network architecture is depicted in Fig. 7.4. The
excitatory neurons (empty circles), as well as the inhibitory neuron (filled circle)
are implemented using leaky integrate-and-fire neurons, while the synapses are
implemented via non-linear integrators.
The network consists of an array of 32 excitatory neurons that project to a
global inhibitory neuron, which in turn projects back onto all 32 excitatory neurons. Each excitatory neuron receives its input current from two nearest neighbor
excitatory neurons, from the global inhibitory neuron and from both excitatory
and inhibitory synapses that can be stimulated from external inputs, through the
AER protocol.
The coupling strength of each type of synapse (external excitatory, external
inhibitory, internal excitatory-to-excitatory, internal excitatory-to-inhibitory and
inhibitory-to-excitatory) and the duration of the synaptic currents can be set by
external bias voltages.
As can be seen from Figure 7.4, we use closed (periodic) boundary conditions for the network: the leftmost neuron receives spikes from the output of the
chip’s rightmost neuron. Similarly, the rightmost neuron receives spikes from the
leftmost neuron on the chip.
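To make the wiring described above concrete, the sketch below builds the connection list of such a ring: each of the 32 excitatory neurons excites its two nearest neighbors (with wrap-around), every excitatory neuron drives the global inhibitory neuron, and the inhibitory neuron inhibits all excitatory neurons. The numerical weights are placeholders standing in for the externally set bias voltages; the external AER synapses are omitted.

# Connectivity of the ring-of-neurons network (weights are placeholder numbers
# standing in for the externally set bias voltages).
N_EXC = 32
INH = "inh"                               # the single global inhibitory neuron

def build_ring_connections(w_lat=0.2, w_exc_to_inh=1.0, w_inh_to_exc=-1.0):
    conns = []                            # list of (pre, post, weight) tuples
    for i in range(N_EXC):
        # nearest-neighbor lateral excitation with periodic (ring) boundaries
        conns.append((i, (i - 1) % N_EXC, w_lat))
        conns.append((i, (i + 1) % N_EXC, w_lat))
        conns.append((i, INH, w_exc_to_inh))   # excitatory-to-inhibitory
        conns.append((INH, i, w_inh_to_exc))   # inhibitory feedback to all neurons
    return conns

connections = build_ring_connections()
assert len(connections) == 4 * N_EXC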
We characterized the behavior of the network by stimulating groups of neurons with Poisson-distributed spike trains generated from a workstation and sent
to the chip using the Address-Event Representation. We performed two sets of experiments. The first one was aimed at characterizing the network by analyzing the
neurons' mean firing rates, in response to different types of input patterns and as a function of different connectivity conditions. The second set of experiments was
aimed at characterizing the detailed timing properties of the network, for example
arising from synchronization of output spike patterns or correlations among them.
Figure 7.4: Architecture of the integrate-and-fire ring-of-neurons chip. Empty circles
represent excitatory neurons. The filled circle represents the global inhibitory neuron. The
gray line symbolizes inhibitory connections, from the inhibitory neuron to all excitatory
neurons. Black arrows denote excitatory connections.
Mean-Rate Winner-Take-All
To characterize the behavior of the network using the mean-rate representation, we stimulated different neurons on the chip with Poisson-distributed spike trains, using long EPSPs and no inherent leak, and read out the output spikes via the AER output bus. In order to evaluate the effect of lateral coupling among neurons, we stimulated both isolated neurons and groups of neighboring neurons. Furthermore, we used different spike rates for different groups of neurons, to bias the competition. Specifically, we stimulated neuron #5 with spike trains with a mean rate of 5 Hz; neurons #9, #10, #11 and #25 with synchronous spikes at a rate of 14 Hz; neurons #19, #20 and #21 with synchronous spikes at a rate of 25 Hz; and neuron #29 with spikes at a mean rate of 37 Hz.
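A minimal way to generate such input stimuli in software is sketched below: one Poisson spike train is drawn per stimulus group (so that neurons in the same group receive synchronous spikes, as in the experiment) and the resulting events are merged into a time-sorted list of (time, address) pairs of the kind that would be sent over the AER bus. The duration and the random seed are arbitrary choices.

import random

def poisson_spike_train(rate_hz, duration_s, rng):
    """Spike times of a homogeneous Poisson process (exponential inter-spike intervals)."""
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(rate_hz)
        if t >= duration_s:
            return spikes
        spikes.append(t)

# Stimulus groups of the experiment described above: neurons in the same group
# share one spike train, i.e. they receive synchronous spikes.
groups = [([5], 5.0), ([9, 10, 11, 25], 14.0), ([19, 20, 21], 25.0), ([29], 37.0)]

rng, duration = random.Random(0), 2.0
events = []
for neurons, rate in groups:
    train = poisson_spike_train(rate, duration, rng)     # one train per group
    events += [(t, n) for t in train for n in neurons]   # copied to each member
events.sort()                                            # (time, address) pairs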
In Fig. 7.5(a) we show a raster plot displaying both the occurrences of input
spikes (small dots) and output spikes (red empty circles) in the case in which
global inhibition is effectively switched off (by setting the strength of the local
inhibitory synapses of each neuron to zero). If the strength of the local inhibitory
synapses is increased to a relatively high value, the network exhibits a winner-take-all type of behavior: the neuron receiving the strongest excitation has the
highest output spike rate and inhibits all other neurons. If we also increase the
weight of the local (lateral) excitatory synapses, the network’s winner-take-all behavior is even more pronounced due to the recruitment of neighboring neurons.
This is seen in Fig. 7.5(b) where the top panel shows the firing frequency of input spikes for all 32 neurons. The middle trace shows the output frequencies of
each of those neurons in the absence of local excitation, and the lower trace the
same in the presence of local excitation. Note that the neuron with the highest input (#29 in this example) generates the most output spikes, and that the output
of neurons with low firing rates may be completely suppressed. Note, further, that
this effect is more pronounced in the case with local excitatory connections and
that it spreads to the nearest neighbors (#28 and #30) even though they do not receive
any input spikes. In this parameter range, the details of spike timing do not play a
significant role in determining the winning output.
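In this mean-rate analysis the recorded output address-events are reduced to one firing rate per neuron, and the winner is simply the neuron with the highest rate; a short sketch of this readout (with hypothetical variable names) is given below.

from collections import Counter

def mean_rates(events, n_neurons, duration_s):
    """events: list of (time, address) pairs; returns the firing rate (Hz) of each neuron."""
    counts = Counter(addr for _, addr in events)
    return [counts.get(n, 0) / duration_s for n in range(n_neurons)]

def winner(events, n_neurons, duration_s):
    rates = mean_rates(events, n_neurons, duration_s)
    return max(range(n_neurons), key=lambda n: rates[n])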
Producing Synchrony
While the amplifying effects of synchronous inputs on leaky integrate-and-fire
neurons are well known, good techniques for controlling (inserting) synchrony
are less well understood. In the following experiment we show the ability of this
network to produce correlated firing combined with the winner-take-all function.
To investigate this effect, we simultaneously1 stimulated three neighboring neurons (neurons #9, #10 and #11, in the example of Fig. 7.6) with Poisson-distributed spike trains at a mean rate of 30 Hz. We also stimulated neurons #8, #11, #12 and #13 with independent Poisson input spikes at a mean rate of 10 Hz, and neuron #9 with an additional independent Poisson spike train with a mean rate of 40 Hz
(see Fig. 7.6a). We applied this input stimulus pattern to the network for three
different sets of synaptic weights. In the first case we applied the stimuli without
global inhibition (i.e. with the inhibitory synapses set to zero). In the second case we applied the input stimuli with global inhibition activated, and in the final case
we stimulated the network with both global inhibition and local excitation. Figure 7.6(b) shows raster plots of both input and output spikes for the three different
cases. In the absence of global inhibition (top trace of Fig. 7.6b), the outputs of
neurons #9, #10, and #11, which are partially stimulated by the same input, are
not well synchronized. As shown by the middle plot of Fig. 7.6(b), global inhibition partly increases the amount of synchronization between these neurons, and
the subsequent addition of local excitation (bottom trace, Fig. 7.6(b)) increases
the synchronization even further.
To quantify these increases in synchronization, we computed the pairwise
cross-correlation functions between neuron pairs 9-10, 9-11 and 10-11. Figure 7.7
shows the average of these three cross-correlation functions for the three cases described above. As shown in the top plot of Fig. 7.7, the correlation peak around
zero lag is weak. Global inhibition, which has a temporal modulatory effect that
shortens the time-window of integration, has the effect of narrowing the peak and
increasing its amplitude. The combination of global inhibition and local excitation has the effect of further increasing the correlation peak, as predicted by
theory [76].
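The pairwise cross-correlation measure used here can be sketched as a correlogram of spike-time lags; in the code below the bin width, the lag range and the normalization by the number of reference spikes are illustrative assumptions, not necessarily the settings used to produce Fig. 7.7.

def cross_correlogram(spikes_a, spikes_b, max_lag=0.5, bin_width=0.01):
    """Histogram of the lags (b - a) between all spike pairs of two spike trains."""
    n_bins = int(2 * max_lag / bin_width)
    counts = [0] * n_bins
    for ta in spikes_a:
        for tb in spikes_b:
            lag = tb - ta
            if -max_lag <= lag < max_lag:
                counts[int((lag + max_lag) / bin_width)] += 1
    norm = max(len(spikes_a), 1)                    # normalize by reference spike count
    lags = [-max_lag + (i + 0.5) * bin_width for i in range(n_bins)]
    return lags, [c / norm for c in counts]

# The averaged correlogram of the text would be the mean over the pairs
# (9, 10), (9, 11) and (10, 11) of the recorded output spike trains.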
1 "Simultaneous stimulation" in our experimental setup means transmitting address-events with delays of approximately 30 µs between each other.
Figure 7.5: (a) Raster plot of input spike trains (small dots) superimposed onto the output spike trains (empty circles), with global inhibitory feedback turned off (the inhibitory-to-excitatory synaptic weights are set to zero). (b) Histograms of the input spike distribution (top trace), the output spike distribution of the competitive network with global inhibition but no lateral excitation (middle trace), and the output spike distribution of the competitive network with global inhibition and with lateral excitation (bottom trace).
Figure 7.6: (a) Arrangement of input signals used to stimulate a set of neurons of the network. Each box represents a Poisson-distributed spike train source. (b) Raster plots representing input spikes (small dots), output spikes (empty circles), and coincident (within a 1 ms time window) output spikes (filled circles) for the three network configurations: without global inhibition (top raster plot), with global inhibition (middle raster plot), and with global inhibition and local excitation (bottom raster plot).
Figure 7.7: Pairwise cross correlations averaged over neuron pairs 9-10, 9-11 and 10-11.
The data of the top trace were computed from the response of the network in the absence
of global inhibition. The middle trace corresponds to the case with global inhibition and
the bottom trace corresponds to the case with both global inhibition and local excitation
turned on.
Chapter 8
Conclusions
In the preceding chapters we described examples of simple systems that make
use of the technology developed over the last few years within the field of neuromorphic engineering. These examples are representative of what can be achieved
using the present state-of-the-art. Up to now neuromorphic engineers have mastered, generally speaking, the art of building single-chip systems. We are starting to consolidate the framework for designing and successfully implementing
multi-chip systems, and systems containing conventional analog/digital electronics interfaced to neuromorphic devices. We have reached the point at which the technology is standardized and mature enough for building complex systems, in which sensory devices are interfaced to chips carrying out different types of computation, which in turn are interfaced to actuators interacting in real time with the environment. All the small (neuromorphic) bits and pieces are starting to come together. The possibility of building complex neuromorphic systems that sense and interact with
the environment will hopefully contribute to advancements both in basic research
and in commercial applications. This technology is likely to become instrumental
both for research on computational neuroscience, and for practical applications
that involve sensory signal processing, adaptation to changes in the input signals,
recognition, etc.
8.1 Emulating Neural Circuits
During the last decade CMOS aVLSI has been used to construct a wide range of
neural analogs, from single synapses to sensory arrays, and simple systems. These
circuits are not general processors. They simply exploit the inherent physics of
analog transistors to produce an efficient computation of a particular task.
The analog circuits have the advantage of emulating biological systems in real
time. To the extent that the physics of the transistors matches well the computation to be performed, and digital communication between chips is small, the
aVLSI circuits use less power and silicon area than would an equivalent digital
system. This is an important advantage, because any serious attempt to replicate
the computational power of brains must use resources as effectively as possible.
The brain performs about 10^16 operations per second. Using the best digital technology that we can envisage, this performance would dissipate over 10 MW [81], compared with the brain's consumption of only a few watts. Subthreshold aVLSI circuits are also no match for neuronal circuits, but they are a factor of 10^4 more power-efficient than their digital counterparts.
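Taking these figures at face value, a rough back-of-the-envelope comparison is possible: a digital system dissipating 10 MW for 10^16 operations per second spends about 1 nJ per operation, the brain's few watts correspond to a fraction of a femtojoule per operation, and subthreshold aVLSI, at roughly 10^4 times the digital efficiency, lands around 0.1 pJ per operation, still orders of magnitude short of biology.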
The specialized but efficient nature of neuromorphic systems encourages a different role for analog emulation (as opposed to digital simulation) in the investigation of biological systems. Analog emulation is particularly useful for relating the physical properties of the system to its computational function because
both levels of abstraction are combined in the same system. In many cases, these
neuromorphic analogs make direct use of device physics to emulate the computational processes of neurons, so that the base level of the analysis is inherent in
the machine itself. Because the computation is cast as a physical process, it is
relatively easy to move from emulation to physiological prediction.
8.2 Commercial Application Scenarios
Complete systems containing solely custom analog devices and neuromorphic
sensors would offer the greatest benefits to industrial applications, in terms of
compactness, power consumption and cost, but the design, development and testing of these systems still proves too lengthy and demanding a process for typical industrial applications with stringent constraints such as fault tolerance, reliability and time-to-market. Nonetheless, neuromorphic sensors are starting to be
used commercially as intelligent preprocessors for implementing the first stages of
computation in artificial systems. In these systems neuromorphic sensors are interfaced to conventional electronics either through analog circuits such as buffers
and power amplifiers or through small/cheap digital signal processors such as PIC
micro-controllers. Typical application domains include:
1. automotive,
2. electronic toys and gadgets,
3. autonomous mobile systems,
4. space exploration.
8.2.1 Automotive Applications
In the automotive field there is an increasing demand for artificial vision systems used to monitor both the inside environment of the vehicle and
the outside. For example, to improve the reliability of current air-bag deployment
systems, information on the presence, height, position and size of the passenger is
necessary. Similarly, driver assistance systems require “intelligent” visual sensors
to alert the driver if he or she is slowly drifting from the center of the road, as if the vehicle were side-slipping, or if an object is approaching the vehicle at a dangerous speed (either from the front or from the back of the vehicle). The lighting
conditions in these environments vary drastically, so even if the cost, power and
size constraints would allow it, artificial vision systems using conventional CCD
or CMOS imagers would not be able to function properly, due to their limited
dynamic range in contrast sensitivity. As mentioned in Chapter 5, the local adaptation properties of neuromorphic vision chips allow these devices to operate in
a wide range of illumination conditions, ranging from bright moonlight to dim
sunlight. Thanks to these features, and to the possibility that this technology offers of processing images directly at the focal-plane level, even large automotive companies, such as Daimler-Benz, are starting to investigate the use of neuromorphic sensors [124].
8.2.2 Toys and “Sensory” Gadgets
In the second application domain, that of electronic toys and gadgets, the advantages offered by neuromorphic systems lie mainly in their low power consumption (which results in long battery life) and in their compactness. In most
neuromorphic systems used in this application domain all the computation would
be carried out on a single chip. Thus the packaging components would simply include a (cheap plastic) lens, the chip package, a battery holder and some additional
small electronic components, such as resistors and/or capacitors. These systems
do not need to be absolutely precise or reliable, given the public and tasks they
are targeted to. The tracking chip described in Section 5.1 is a typical example of
a device that could be used with toy cars or other small motorized toys that use visual tracking as a means of entertainment. The designer is not required to guarantee that the system will track edges over a wide range of stimulus contrasts or illumination conditions with high reliability. It is sufficient that the system fulfills its requirements 90%, or even only 80%, of the time. Similarly, gesture, voice or
pitch recognition systems (using either silicon retinas or silicon cochleas or both)
do not need to have an extremely high recognition rate for these tasks (as opposed
to other application domains, such as character recognition for postal services).
The typical user of the system would be willing and even happy to repeat the
gesture or command while interacting with the system.
8.2.3 Autonomous Mobile Systems
In Section 5.4 we showed examples of a mobile robot that used simple one-dimensional visual sensors to implement a line-following task. Similar types of sensors could be, and have been, successfully used to implement other types of navigation tasks [51]. The use of neuromorphic sensors as intelligent preprocessors for autonomous vehicles drastically reduces the computational load of the vehicle's central processing unit. By integrating multiple sensors on a single mobile
system and by connecting directly, or by means of elementary transformations,
the output of the sensors to the motor-command centers, one would be able to
provide the mobile system with elementary reflex-like behaviors that would allow
it to navigate safely, avoiding obstacles, following lines or performing other types
of elementary navigational tasks. Meanwhile the vehicle’s CPU could be used to
learn or plan paths or to carry out other types of high-level processing tasks, thus
improving the overall performance of the system.
We are currently integrating multiple sensors on the Koala robot (see Figure 5.13) and investigating its behavioral responses to its interactions with the
environment, in order to develop proper sensor-fusion and control algorithms for
complex navigation tasks.
8.2.4 Space Exploration
This section is not included here to stand as a provocation or to put forth visionary hypotheses. Space exploration is actually the domain that will most likely use the technology that the field of neuromorphic engineering is developing. New space exploration missions will mainly use miniature spacecraft containing small rovers, similar to the Sojourner of the Mars Pathfinder mission, launched in December 1996. As NASA's "Space Science Enterprise Strategic Plan" states, under the section Enabling Technologies, some of the key capabilities for new space science missions are:
Lightweight, low-power, robust electronics systems
Highly integrated, lightweight instruments compatible with microspacecraft
Advanced miniaturization of electronic and mechanical components
New, highly autonomous and survivable spacecraft and computer architectures
In the previous section we already pointed out the advantages that neuromorphic engineering can bring to autonomous mobile systems. Having a rover able
to navigate autonomously from one point of an unknown region to a nearby one
will dramatically reduce the time-scales of exploration. The Mars Pathfinder mission is a good example of what can be achieved with the current state-of-the-art technology: even if the Sojourner rover has some basic hazard avoidance mechanism (as
defined by Brian Wilcox, supervisor of the robotic vehicles group at JPL), it basically runs in “remote-control” mode. It takes snapshots of the environment in its
immediate vicinity, transmits the data to a control center on Earth, waits approximately 22 minutes for motor commands (due to the 11 minute light-time delay in
transmitting signals from Mars to Earth and vice-versa), moves a few centimeters
and repeats the whole sequence of operations again.
It is most likely also by using neuromorphic sensors that "the next rovers to explore the red planet will have at least 1,000 times the processing speed of Sojourner and will be able to travel much farther much faster", as Brian Wilcox
claims. Similarly, ”whereas Sojourner can only plan 20 centimeters ahead, our
next rover will be able to plan 10 meters or more ahead. And while Sojourner has
no concept of what rocks look like 20 meters away, her successor will be able to
recognize rocks that far away very easily.”
Apart from features such as the inherent ability to adapt to new, unknown situations, the greatest advantage that neuromorphic systems offer in this specific application domain is the amount of computation they provide per unit of power consumed. As mentioned in Section 8.1, the types of circuits used
in neuromorphic systems have the desirable property of carrying out the most
amount of computation for the least amount of power dissipation. This property
becomes especially relevant in scenarios where power consumption and power
dissipation have to be reduced to a minimum.
8.3 Summary
In this report we introduced the reader to the field of neuromorphic engineering
following a learn-by-example approach. In the first chapter we presented the basic concepts underlying this field and briefly described the basic circuit elements used to design neuromorphic chips. In the second chapter we presented some of the circuits that make up the building blocks for designing neuromorphic vision chips. In Chapter 5 we proposed examples of single-chip systems. The first example is a chip that implements a winner-take-all (WTA) network. In its simplest mode of operation, the chip selects the strongest of its inputs. Inputs are provided through the chip's pads. The second example is a chip that can be used to measure its heading direction from moving visual scenes (e.g. obtained by mounting the chip on some moving platform). The third example is another
chip with visual input that determines the horizontal position of the stimulus with
highest contrast present in its field of view, using a WTA network similar to the
one present in the first example. This last example is a one-dimensional device,
but its extension to two dimensions is straightforward: the WTA network would
be global, receiving inputs from all the pixels of a 2D array and providing outputs
to two independent 1D position-to-voltage circuits. The 2D position of a target
would then be obtained by simply looking at the outputs of the two position-to-voltage circuits. Other 2D tracking sensors have already been proposed [13, 32], but these are intensity-based, i.e. they select the brightest feature in the visual
scene. They are thus not suited for real-world applications in which the feature
that needs to be tracked is not necessarily the brightest one present in the visual
scene.
In Chapter 6 we described how it is possible to interface many neuromorphic chips with each other in a flexible way, using a communication protocol that is
biologically inspired and optimal for the types of circuits used in neuromorphic
chips. We presented an example in which we connected a 2D silicon retina to an
array of 1D velocity sensors and showed how it is possible to change the functionality of the overall system by simply reprogramming lookup tables that connect
elements of one chip to elements of the other chip. In Section 6.4 we showed
how it is possible to interface neuromorphic sensors to conventional analog/digital
systems. Specifically, we demonstrated three simple examples of tracking applications that make use of neuromorphic vision sensors in real-world scenarios.
They are examples of neuromorphic systems able to perform complex visual tasks
using analog VLSI chips as front-end preprocessors.
This report is intended to show part of the work (constantly “in progress”) that
is being carried out within the neuromorphic engineering community, but most
importantly, it is intended to introduce the reader to the field of neuromorphic
engineering, to demonstrate its potential and hopefully to attract researchers
to this field, helping the small worldwide community of neuromorphic engineers
to grow and improve.
Acknowledgments:
I would like to thank the people I worked most closely with at the Institute of Neuroinformatics, Zürich: in particular the late Jörg Kramer, whose circuits and ideas I used so much; Tobi Delbrück and Shih-Chii Liu, for the fruitful discussions and comments; Paul F.M.J. Verschure, who introduced me to the world of mobile robots, Koalas and Kheperas; and Elisabetta Chicca and Adrian Whatley, for
their invaluable help in developing, debugging and setting up the AER multi-chip
communication infrastructure.
I would also like to give credit to the series of workshops on Neuromorphic Engineering held yearly in Telluride, Colorado, where many of the ideas, circuit
designs and system examples that define the field of neuromorphic engineering
are brought to life. The work described in this report was supported by the Swiss
National Science Foundation SPP Grant, by the U.S. Office of Naval Research
and by the EU-FET grant ALAVLSI (IST–2001–38099).
Bibliography
[1] Baluja, S. and Pomerleau, D. (1997). Expectation-based selective attention for
the visual monitoring and control of a robot vehicle. Robotics and Autonomous
Systems Journal, 22:329–344.
[2] Behrmann, M. and Haimson, C. (1999). The cognitive neuroscience of visual
attention. Current opinion in neurobiology, 9:158–163.
[3] Ben-Yishai, R., Lev Bar-Or, R., and Sompolinsky, H. (1995). Theory of orientation tuning in visual cortex. Proceedings of the National Academy of Sciences
of the USA, 92(9):3844–3848.
[4] Bernander, Ö., Koch, C., and Usher, M. (1994). The effect of synchronized
inputs at the single neuron level. Neural Computation, 6:622–641.
[5] Bernays, E. (1996). Selective attention and host-plant specialization. Entomologia Experimentalis et Applicata, 80(1):125–131.
[6] Bertsekas, D. P. (1982). Constrained optimization and Lagrange multiplier
methods. Academic Press, New York.
[7] Boahen, K. (1996). A retinomorphic vision system. IEEE Micro, 16(5):30–
39.
[8] Boahen, K. (1997). The retinomorphic approach: Pixel-parallel adaptive amplification, filtering, and quantization. Jour. of Analog Integrated Circuits and
Signal Processing, 13(1/2):53–68.
[9] Boahen, K. (1998). Communicating neuronal ensembles between neuromorphic chips. In Lande, T. S., editor, Neuromorphic Systems Engineering, pages
229–259. Kluwer Academic, Norwell, MA.
[10] Boahen, K. and Andreou, A. (1992). A contrast sensitive silicon retina with
reciprocal synapses. In Touretzky, D., Mozer, M., and Hasselmo, M., editors, Advances in neural information processing systems, volume 4. IEEE, MIT
Press.
[11] Bofill, A. and Murray, A. (2001). Circuits for VLSI implementation of temporally asymmetric Hebbian learning. In Dietterich, T. G., Becker, S., and
Ghahramani, Z., editors, Advances in Neural Information processing systems,
volume 14. MIT Press, Cambridge, MA.
[12] Bosch, H., Milanese, R., and Labbi, A. (1998). Object segmentation by
attention-induced oscillations. In Proc. IEEE Int. Joint Conf. Neural Networks,
volume 2, pages 1167–1171.
[13] Brajovic, V. and Kanade, T. (1998). Computational sensor for visual tracking
with attention. IEEE Journal of Solid State Circuits, 33(8):1199–1207.
[14] Braun, J. and Julesz, B. (1998). Dividing attention at little cost: detection
and discrimination tasks. Perception and Psychophysics, 60:1–23.
[15] Chicca, E., Badoni, D., Dante, V., D’Andreagiovanni, M., Salina, G., Fusi,
S., and Del Giudice, P. (2003a). A VLSI recurrent network of integrate–and–
fire neurons connected by plastic synapses with long term memory. IEEE
Trans. Neural Net., 14(5):1297–1307.
[16] Chicca, E., Indiveri, G., and Douglas, R. (2003b). An adaptive silicon
synapse. In Proc. IEEE International Symposium on Circuits and Systems.
IEEE.
[17] Choi, J. and Sheu, B. (1993). A high-precision VLSI winner-take-all circuit
for self-organizing neural networks. IEEE J. Solid-State Circuits, 28(5):576–
584.
[18] Culham, J., Brandt, S., Cavanagh, P., Kanwisher, N., Dale, A., and Tootell,
R. (1999). Cortical fMRI activation produced by attentive tracking of moving
targets. J Neurophysiol., 81:388–393.
[19] Dante, V. and Del Giudice, P. (2001). The PCI-AER interface board. In Cohen, A., Douglas, R., Horiuchi, T., Indiveri, G., Koch, C., Sejnowski, T., and Shamma, S., editors, 2001 Telluride Workshop on Neuromorphic Engineering Report, pages 99–103. http://www.ini.unizh.ch/telluride/previous/report01.pdf.
[20] Deiss, S. R., Douglas, R. J., and Whatley, A. M. (1998). A pulse-coded
communications infrastructure for neuromorphic systems. In Maass, W. and
Bishop, C. M., editors, Pulsed Neural Networks, chapter 6, pages 157–178.
MIT Press.
[21] Delbrück, T. (1999). 3 silicon retinas for simple consumer applications. In
Intelligent Vision Systems meeting, Santa Clara, CA.
[22] Delbrück, T. and Mead, C. (1995). Analog VLSI phototransduction by
continuous-time, adaptive, logarithmic photoreceptor circuits. In Koch, C. and
Li, H., editors, Vision Chips: Implementing vision algorithms with analog VLSI
circuits, pages 139–161. IEEE Computer Society Press.
[23] Demosthenous, A., Smedley, S., and Taylor, J. (1998). A CMOS analog
winner-take-all network for large-scale applications. IEEE Trans. on Circuits
and Systems I, 45(3):300–304.
[24] Desimone, R. and Duncan, J. (1995). Neural mechanisms of selective visual
attention. Annu. Rev. Neurosci., 18:193–222.
[25] DeWeerth, S. and Morris, T. (1994). Analog VLSI circuits for primitive
sensory attention. In Proc. IEEE Int. Symp. Circuits and Systems, volume 6,
pages 507–510. IEEE.
[26] DeWeerth, S. and Morris, T. (1995). CMOS current mode winner-take-all
circuit with distributed hysteresis. Electronics Letters, 31(13):1051–1053.
[27] DeWeerth, S. P. (1992). Analog VLSI circuits for stimulus localization and
centroid computation. Int. J. of Comp. Vision, 8(3):191–202.
[28] Diesmann, M., Gewaltig, M.-O., and Aertsen, A. (1999). Stable propagation
of synchronous spiking in cortical neural networks. Nature, 402:529–533.
[29] Diorio, C., Hasler, P., Minch, B., and Mead, C. (1996). A single-transistor
silicon synapse. IEEE Trans. Electron Devices, 43(11):1972–1980.
[30] Douglas, R., Mahowald, M., and Mead, C. (1995). Neuromorphic analogue
VLSI. Annu. Rev. Neurosci., 18:255–281.
[31] ElMasry, E., Yang, H., and Yakout, M. (1997). Implementations of artificial
neural networks using current-mode pulse width modulation technique. IEEE
Trans. Neural Netw., 8(3):532–548.
[32] Etienne-Cummings, R., Van der Spiegel, J., and Mueller, P. (1996). A visual
smooth pursuit tracking chip. In Touretzky, D. S., Mozer, M. C., and Hasselmo, M. E.,
editors, Advances in Neural Information Processing Systems, volume 8. MIT
Press.
[33] Fragnière, E., van Schaik, A., and Vittoz, E. (1997). Design of an analogue
VLSI model of an active cochlea. Jour. of Analog Integrated Circuits and
Signal Processing, 13(1/2):19–35.
[34] Fries, P., Reynolds, J., Rorie, A., and Desimone, R. (2001). Modulation
of oscillatory neuronal synchronization by selective visual attention. Science,
291:1560–1563.
[35] Gibson, B. and Egeth, H. (1994). Inhibition of return to object-based and
environment-based locations. Percept. Psychopys., 55:323–339.
[36] Gray, P. M. and Meyer, R. G. (1984). Analysis and Design of Analog Integrated Circuits. Wiley, second edition.
[37] Grossberg, S. (1978). Competition, decision, and consensus. Journal of
Mathematical Analysis and Applications, 66:470–493.
[38] Hahnloser, R., Douglas, R. J., Mahowald, M., and Hepp, K. (1999). Feedback interactions between neuronal pointers and maps for attentional processing. Nature Neuroscience, 2:746–752.
[39] Hahnloser, R., Sarpeshkar, R., Mahowald, M., Douglas, R. J., and Seung,
S. (2000). Digital selection and analog amplification co-exist in an electronic
circuit inspired by neocortex. Nature, 405:947–951.
[40] Hansel, D. and Sompolinsky, H. (1996). Chaos and synchrony in a model of
a hypercolumn in visual cortex. J. Computat. Neurosci., 3(1):7–34.
[41] He, Y. and Sanchez-Sinencio, E. (1993). Min-net winner-take-all CMOS implementation. Electron. Lett., 29(14):3.
[42] Hoffman, J. and Subramaniam, B. (1995). The role of visual attention in
saccadic eye movements. Perception and Psychophysics, 57(6):787–795.
[43] Horiuchi, T., Bair, W., Bishofberger, B., Lazzaro, J., and Koch, C. (1992).
Computing motion using analog VLSI chips: an experimental comparison
among different approaches. International Journal of Computer Vision, 8:203–
216.
[44] Horiuchi, T. and Koch, C. (1999). Analog VLSI-based modeling of the
primate oculomotor system. Neural Computation, 11:243–265.
[45] Horiuchi, T., Morris, T., Koch, C., and DeWeerth, S. (1997). Analog VLSI
circuits for attention-based, visual tracking. In Mozer, M. C., Jordan, M. I.,
and Petsche, T., editors, Advances in Neural Information Processing Systems,
volume 9, pages 706–712. MIT Press.
[46] Horiuchi, T. and Niebur, E. (1999). Conjunction search using a 1-D, analog
VLSI-based attentional search/tracking chip. In Wills, D. S. and DeWeerth,
S. P., editors, Proc. Conf. Advanced Research in VLSI, pages 276–290. IEEE
Computer Society.
[47] Indiveri, G. (1997). Winner-take-all networks with lateral excitation. Analog
Integrated Circuits and Signal Processing, 13(1/2):185–193.
[48] Indiveri, G. (2000a). A 2D neuromorphic VLSI architecture for modeling
selective attention. In Amari, S.-I., Giles, C. L., Gori, M., and Piuri, V., editors, Proceedings of the IEEE-INNS-ENNS International Joint Conference on
Neural Networks; IJCNN2000, volume IV, pages 208–213. IEEE Computer
Society.
[49] Indiveri, G. (2000b). Modeling selective attention using a neuromorphic
analog VLSI device. Neural Computation, 12(12):2857–2880.
[50] Indiveri, G. (2001a). A current-mode hysteretic winner-take-all network,
with excitatory and inhibitory coupling. Analog Integrated Circuits and Signal
Processing, 28(3):279–291.
[51] Indiveri, G. (2001b). A neuromorphic VLSI device for implementing 2-D
selective attention systems. IEEE Trans. on Neural Networks, 12(6):1455–
1463.
[52] Indiveri, G. (2002). Neuromorphic bistable VLSI synapses with spiketiming-dependent plasticity. In Advances in Neural Information Processing
Systems, volume 15, Cambridge, MA. MIT Press.
[53] Indiveri, G. (2003). Neuromorphic selective attention systems. In Proc.
IEEE International Symposium on Circuits and Systems. IEEE.
[54] Indiveri, G., Kramer, J., and Koch, C. (1996). System implementations of
analog VLSI velocity sensors. IEEE Micro, 16(5):40–49.
[55] Indiveri, G., Mürer, R., and Kramer, J. (2001). Active vision using an analog
VLSI model of selective attention. IEEE Trans. on Circuits and Systems II,
48(5):492–500.
[56] Indiveri, G., Oswald, P., and Kramer, J. (2002). An adaptive visual tracking
sensor with a hysteretic winner-take-all network. In Proc. IEEE International
Symposium on Circuits and Systems, pages 324–327. IEEE.
[57] Indiveri, G. and Verschure, P. F. M. J. (1997). Autonomous vehicle guidance
using analog VLSI neuromorphic sensors. In W. Gerstner, A. Germond, M. H.
and Nicoud, J.-D., editors, Proceedings Artificial Neural Networks-ICANN97:
Lausanne, Switzerland, volume 1327 of Lecture Notes in Computer Science.
Berlin: Springer, pages 811–816. Springer Verlag.
[58] Itti, L., Koch, C., and Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. on Pattern Analysis and Machine
Intelligence, 20(11):1254–1259.
[59] Jones, J. and Palmer, L. (1987). The two-dimensional spatial structure of
simple receptive fields in cat striate cortex. J. Neurosci., 58:1187–1211.
[60] Kaski, S. and Kohonen, T. (1994). Winner-take-all networks for physiological models of competitive learning. Neural Networks, 7(6/7):973–984.
[61] Kastner, S., De Weerd, P., Desimone, R., and Ungerleider, L. G. (1998).
Mechanisms of directed attention in the human extrastriate cortex as revealed
by functional MRI. Science, 282(2):108–111.
[62] Knudsen, E. and Konishi, M. (1978). Center-surround organization of auditory receptive fields in the owl. Science, 202(4369):778–780.
[63] Koch, C. and Mathur, B. (1996). Neuromorphic vision chips. IEEE Spectrum, 33(5):38–46.
[64] Koch, C. and Ullman, S. (1985). Shifts in selective visual-attention – towards
the underlying neural circuitry. Human Neurobiology, 4(4):219–227.
[65] König, P., Engel, A., Roelfsema, P., and Singer, W. (1995). How precise is
neuronal synchronization? Neural Comput., 7(3):469–485.
[66] Kramer, J. (2002a). An integrated optical transient sensor. IEEE Trans. on
Circuits and Systems II, 49(9):612–628.
[67] Kramer, J. (2002b). An ON/OFF transient imager with event-driven, asynchronous readout. In Proc. IEEE International Symposium on Circuits and
Systems.
[68] Kramer, J., Sarpeshkar, R., and Koch, C. (1997). Pulse-based analog VLSI
velocity sensors. IEEE Trans. on Circuits and Systems II, 44(2):86–101.
[69] Lau, K. and Lee, S. (1998). A CMOS winner-takes-all circuit for self-organizing neural networks. Int. J. Electron., 84(2):131–136.
[70] Lazzaro, J. (1990). Silicon Models of Early Audition. Ph.D. thesis, California
Institute of Technology, Pasadena, CA.
[71] Lazzaro, J. and Mead, C. (1989). A silicon model of auditory localization.
Neural Computation, 1:41–70.
[72] Lazzaro, J., Ryckebusch, S., Mahowald, M., and Mead, C. (1989). Winner-take-all networks of O(n) complexity. In Touretzky, D., editor, Advances in
neural information processing systems, volume 2, pages 703–711, San Mateo
- CA. Morgan Kaufmann.
[73] Lazzaro, J., Wawrzynek, J., Mahowald, M., Sivilotti, M., and Gillespie, D.
(1993). Silicon auditory processors as computer peripherals. IEEE Trans. on
Neural Networks, 4:523–528.
[74] Liu, S.-C., Kramer, J., Indiveri, G., Delbrück, T., and Douglas, R. (2002).
Analog VLSI: Circuits and Principles. MIT Press.
[75] Lu, T.-T., Baker, M., Salthouse, C., Sit, J.-J., Zhak, S., and Sarpeshkar, R.
(2003). A micropower analog VLSI processing channel for bionic ears and
speech-recognition front ends. In Proc. IEEE Int. Symp. on Circuits and Systems, volume 5, pages 41–44. IEEE.
[76] Maass, W. and Bishop, C. M. (1998). Pulsed Neural Networks. MIT Press.
[77] Mahowald, M. (1994). An Analog VLSI System for Stereoscopic Vision.
Kluwer, Boston.
[78] Mahowald, M. and Mead, C. (1989). Analog VLSI and Neural Systems,
chapter Silicon Retina, pages 257–278. Addison-Wesley, Reading, MA.
[79] Marr, D. (1982). Vision, a Computational Investigation into the Human
Representation & Processing of Visual Information. Freeman, San Francisco.
[80] Mead, C. (1989). Analog VLSI and Neural Systems. Addison-Wesley, Reading, MA.
[81] Mead, C. (1990). Neuromorphic electronic systems. Proceedings of the
IEEE, 78(10):1629–1636.
[82] Mead, C. and Delbrück, T. (1991). Scanners for visualizing activity of analog VLSI circuitry. Analog Integrated Circuits and Signal Processing, 1:93–
106.
[83] Meador, J. L. and Hylander, P. D. (1994). Pulse coded winner-take-all networks. In Zaghloul, M. E., Meador, J. L., and Newcomb, R. W., editors, Silicon Implementation of Pulse Coded Neural Networks, chapter 5, pages 79–99.
Kluwer.
[84] Miller, M. J. and Bockisch, C. (1997). Where are the things we see? Nature,
386(10):550–551.
[85] Moore, C., Nelson, S., and Sur, M. (1999). Dynamics of neuronal processing
in rat somatosensory cortex. Trends Neurosci., 22(11):513–520.
[86] Morris, T. and DeWeerth, S. (1996). Analog VLSI circuits for covert attentional shifts. In Proceedings of the Fifth International Conference on Microelectronics for Neural, Fuzzy and Bio-inspired Systems; Microneuro’96, pages
30–37, Los Alamitos CA. IEEE Computer Society Press.
[87] Morris, T. G. and DeWeerth, S. P. (1997). Analog VLSI excitatory feedback circuits for attentional shifts and tracking. Analog Integrated Circuits and
Signal Processing, 13(1/2):79–92.
[88] Morris, T. G., Horiuchi, T. K., and DeWeerth, S. P. (1998). Object-based selection within an analog VLSI visual attention system. IEEE Trans. on Circuits
and Systems II, 45(12):1564–1572.
[89] Mozer, M. and Sitton, M. (1998). Computational modeling of spatial attention. In Pashler, H., editor, Attention, pages 341–395. Psychology Press, East
Sussex.
[90] Mudra, R. and Indiveri, G. (1999). A modular neuromorphic navigation system applied to line following and obstacle avoidance tasks. In Löffler, A., Mondada, F., and Rückert, U., editors, Experiments with the Mini-Robot Khepera,
volume 64 of Proceedings of the 1st International Khepera Workshop, pages
99–108, Paderborn.
[91] Nakada, A., Konda, M., Morimoto, T., Yonezawa, T., Shibata, T., and Ohmi,
T. (1999). Fully-parallel VLSI implementation of vector quantization processor
using neuron-MOS technology. IEICE Trans. on Electron., E82C(9):1730–
1738.
[92] Nakayama, K. and Mackeben, M. (1989). Sustained and transient components of focal visual attention. Vision Research, 29:1631–1647.
[93] Nass, M. M. and Cooper, L. N. (1975). A theory for the development of
feature detecting cells in visual cortex. Biological Cybernetics, 19:1–18.
[94] Niebur, E. and Koch, C. (1996). Control of selective visual attention: modeling the “where” pathway. In Touretzky, D., Mozer, M., and Hasselmo, M.,
editors, Advances in neural information processing systems, volume 8, Cambridge, MA. MIT Press.
[95] Niebur, E. and Koch, C. (1998). Computational architectures for attention.
In Parasuraman, R., editor, The Attentive Brain, pages 163–186. MIT Press.
[96] Noton, D. and Stark, L. (1971). Scanpaths in eye movements during pattern
perception. Science, 171:308–311.
[97] Olshausen, B. A., Anderson, C. H., and Van Essen, D. C. (1993). A neurobiological model of visual attention and invariant pattern recognition based on
routing of information. J. Neurosci., 13(11):4700–4719.
[98] Pollack, G. (1988). Selective attention in an insect auditory neuron. J. Neurosci., 8:2635–2639.
[99] Rasche, C. and Hahnloser, R. (2001). Silicon synaptic depression. Biological
Cybernetics, 84(1):57–62.
[100] Ritz, R., Gerstner, W., Gaudoin, R., and van Hemmen, J. L. (1997).
Poisson-like neuronal firing due to multiple synfire chains in simultaneous action. In Bower, editor, Computational Neuroscience: Trends in Research 1997,
pages 801–806. Plenum Press, New York.
[101] Robinson, D. (1965). The mechanism of human smooth pursuit eye movement. Journal of Physiology, 180:569–591.
[102] Salinas, E. and Abbott, L. (1996). A model of multiplicative neural responses in parietal cortex. Proc. Natl. Acad. Sci., 93:11956–11961.
[103] Sarpeshkar, R., Lyon, R., and Mead, C. (1996). An analog VLSI cochlea
with new transconductance amplifiers and nonlinear gain control. In Proc.
IEEE Int. Symp. on Circuits and Systems, volume 3, pages 292–296. IEEE.
[104] Serrano, T. and Linares-Barranco, B. (1995). A modular current-mode
high-precision winner-take-all circuit. IEEE Trans. on Circuits and Systems
II, 42(2):132–134.
[105] Shi, J. and Tomasi, C. (1994). Good features to track. In Proc. IEEE Conf.
Computer Vision and Pattern Recognition, pages 593–600.
[106] Starzyk, J. A. and Fang, X. (1993). CMOS current mode winner-take-all circuit with both excitatory and inhibitory feedback. Electronics Letters,
29(10):908–910.
[107] Steinmetz, P., Roy, A., Fitzgerald, P., Hsiao, S., Johnson, K., and Niebur,
E. (2000). Attention modulates synchronized neuronal firing in primate somatosensory cortex. Nature, 404:187–190.
[108] Stopfer, M., Bhagavan, S., Smith, B., and Laurent, G. (1997). Impaired
odour discrimination on desynchronization of odour-encoding neural assemblies. Nature, 390:70–74.
[109] Tanaka, Y. and Shimojo, S. (1996). Location vs feature: Reaction
time reveals dissociation between two visual functions. Vision Research,
36(14):2125–2140.
[110] Toumazou, C., Lidgey, F. J., and Haigh, D. G., editors (1990). Analogue IC
design: the current-mode approach. Peter Peregrinus Ltd.
[111] Toumazou, C., Ngarmnil, J., and Lande, T. (1994). Micropower log-domain
filter for electronic cochlea. Electronics Letters, 30(22):1839–1841.
[112] Trahanias, P., Velissaris, S., and Garavelos, T. (1997). Visual landmark
extraction and recognition for autonomous robot navigation. In Proc. IEEE Int.
Conf. Intelligent Robots and Systems IROS ’97, volume 2, pages 1036–1043.
[113] Tsotsos, J. K., Culhane, S. M., Wai, W. Y. K., Lai, Y., Davis, N., and Nuflo,
F. J. (1995). Modeling visual attention via selective tuning. Artificial Intelligence, 78(1-2):507–545.
[114] Usher, M., Stemmler, M., Koch, C., and Olami, Z. (1994). Network amplification of local fluctuations causes high spike rate variability, fractal firing
patterns and oscillatory local-field potentials. Neural Computation, 6(5):795–
836.
[115] van Schaik, A. (2001). Building blocks for electronic spiking neural networks. Neural Networks, 14(6–7):617–628.
[116] van Vreeswijk, C. and Sompolinsky, H. (1996).
Chaos in neuronal networks with balanced excitatory and inhibitory activity. Science,
274(5293):1724–1726.
[117] Vittoz, E. (1994). Micropower techniques. In Franca, J. and Tsidivis, Y.,
editors, Design of VLSI Circuits for Telecommunications and Signal Processing. Prentice Hall.
[118] Vittoz, E. and Arreguit, X. (1993). Linear networks based on transistors.
Electronics Letters, 29(3):297–298.
[119] von der Malsburg, C. (1973). Self-organization of orientation sensitive cells
in the striate cortex. Kybernetik, 14:85–100.
[120] Wilson, C., Morris, T., and DeWeerth, S. (1999). A two-dimensional,
object-based analog VLSI visual attention system. In DeWeerth, S. P., Wills,
S., and Ishii, A., editors, Proc. Conf. Advanced Research in VLSI, volume 20,
Los Alamitos, CA. IEEE Computer Society Press.
[121] Yarbus, A. L. (1967). Eye movements and vision. Plenum Press.
[122] Yu, H., Miyaoka, R., and Lewellen, T. (1998). A high-speed and high-precision winner-select-output (WSO) ASIC. IEEE Trans. on Nucl. Sci.,
45(3):772–776.
[123] Yuille, A. L. and Geiger, D. (1995). Winner-take-all mechanisms. In Arbib,
M. A., editor, The Handbook of Brain Theory and Neural Networks, pages
1056–1060. MIT Press, Cambridge, MA.
[124] Zinner, H. and Nothaft, P. (1996). Analogue image processing for driver assistant systems. In Proc. Advanced Microsystems for Automotive Applications,
Berlin, D.