LINEARITY, GAIN CONTROL AND SPIKE
ENCODING IN THE PRIMARY VISUAL CORTEX
by
Matteo Carandini
A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
Center for Neural Science
New York University
September, 1996
Approved
J. Anthony Movshon
To my parents
Acknowledgments
Thanks to Tony Movshon for being a terrific advisor. Whenever I have needed
guidance he has understood my problems in a snap and given great advice. I learnt
a lot from him, half by direct transfer of information and half by osmosis.
I thank David Heeger for teaching me how to create and evaluate models, and
how to explain to people what I have done. David has a clarity of thought that I
will always try to emulate.
Thanks to Chris Leonard for teaching me about the intracellular world and for
letting me play in his lab, and to my dear friend Ferenc Mechler for playing with
me and bearing with my bad moods.
Thanks to Larry O'Keefe for teaching me about physiology and for helping me
out on countless occasions. A lot of the data that I present in this Thesis come
from experiments in which he had the leading role. Allen Poirson and Chao Tang
also collected sizeable portions of data, and I thank them very much for their help.
George Sperling, Charlie Peskin, Mike Hawken, Dan Tranchina, Jonathan Victor, Larry Maloney, Dario Ringach and James Cavanaugh provided me with crucial
bits of advice during this research, and I thank them for their help. Thanks also
to Bob Shapley, with whom I have had many great conversations.
Finally, thanks to everybody in the Center for Neural Science. I have never
seen such a collection of intelligent, friendly and helpful people. They made five
years of graduate school a great experience.
Preface
This Thesis was defended in December, 1995, and revised in August, 1996. Its
core chapters originate from three individual documents:
Chapter 2, "Linearity", is adapted from the first half of a book chapter,
"Linearity and Gain Control in V1 Simple Cells" (Carandini et al., 1996a),
whose additional authors are David Heeger and Tony Movshon.
Chapter 3, "Gain Control", comes from a paper called "Linearity and normalization in the responses of simple cells of the macaque visual cortex", which
is in preparation. It deals with work I have done mainly with David Heeger
and Tony Movshon. Some of its contents are already available as conference
abstracts (Carandini and Heeger, 1993; Carandini et al., 1993a,b, 1994a),
and as short publications (Carandini and Heeger, 1994, 1995).
Chapter 4, "Spike Encoding", is essentially identical to a paper, "Spike train
encoding in regular-spiking cells of the visual cortex in vitro" (Carandini
et al., 1996b), which I wrote with Ferenc Mechler, Chris Leonard and Tony
Movshon. Additional references for this work are two conference abstracts
(Carandini et al., 1994b, 1995).
While I have devoted some effort to linking these Chapters in a cohesive manner,
my goal is to keep them as self-sufficient as possible. This implies a bit of repetition
between Chapters 2 and 3, which I hope the reader of the whole Thesis will excuse.
Contents
Acknowledgments
Preface
List of Figures
List of Tables
1 Introduction
2 Linearity
  2.1 The Linear Model of Simple Cells
    2.1.1 Visual Stimuli In Space-Time
    2.1.2 Spatiotemporal Weighting Functions
    2.1.3 A Nonlinearity: Light Adaptation
    2.1.4 Another Nonlinearity: Rectification
  2.2 Some Linear Properties of Simple Cells
    2.2.1 Responses to Impulses
    2.2.2 Responses to Drifting Gratings
    2.2.3 Responses to Contrast-Modulated Gratings
    2.2.4 Responses to Compound Stimuli
  2.3 Biophysics of the Linear Model
    2.3.1 Linearity of the LGN
    2.3.2 Building Simple Cell Receptive Fields
    2.3.3 Push-Pull Arrangement of Inputs
    2.3.4 Linearity of Excitation and Inhibition
    2.3.5 Simplified Model of a Cortical Cell
    2.3.6 Linear Integration of the Synaptic Inputs
    2.3.7 Spike Rate Encoding
  2.4 Some Nonlinear Properties of Simple Cells
    2.4.1 Contrast Responses
    2.4.2 Nonspecific Suppression
    2.4.3 Temporal Nonlinearities
  2.5 Conclusions
3 Gain Control
  3.1 Methods
    3.1.1 Preparation and Maintenance
    3.1.2 Stimuli
    3.1.3 Data Analysis
  3.2 Results
    3.2.1 The Normalization Model
    3.2.2 Responses to Gratings
    3.2.3 Responses to Plaids
    3.2.4 Responses to Gratings and Noise
    3.2.5 Comparison with Other Models
    3.2.6 Cell Population
  3.3 Discussion
    3.3.1 Comparison with Geniculate Cells
    3.3.2 Composition of the Normalization Pool
    3.3.3 Shunting Inhibition
    3.3.4 Feedback Models of the Visual Cortex
    3.3.5 Conclusions
  3.4 Appendix: Proposed Biophysics of the Model
  3.5 Appendix: Predicted Responses to Gratings
  3.6 Appendix: Predicted Responses to Plaids
4 Spike Encoding
  4.1 Methods
    4.1.1 Preparation and Maintenance
    4.1.2 Stimuli
    4.1.3 Data Analysis
  4.2 Results
    4.2.1 Spike Train Responses
    4.2.2 Membrane Potential Responses
    4.2.3 The Sandwich Model
  4.3 Discussion
5 Conclusions
Bibliography
List of Figures
Figure 1.1
Figures 2.1 to 2.14
Figures 3.1 to 3.21
Figures 4.1 to 4.15
Figures 5.1 and 5.2
List of Tables
Tables 4.1 and 4.2
Chapter 1
Introduction
The primary visual cortex (V1) is arguably the most studied area in the mammalian
cortex, and one of the very few for which we can say something sensible about the
computations that it performs. V1 cells are selective for the position, shape, size,
velocity, color, and eye of presentation of a visual stimulus. The mechanism of this
selectivity, as well as its rationale, have recently begun to be understood, although
some aspects still constitute an area of intense debate.
The receptive fields of V1 cells were first mapped by Hubel and Wiesel (1962)
using flashing bars. They termed simple cells those cells for which they could find
regions that responded either to the onset or to the offset of a bright bar, but
not to both. Simple cells constitute around 50% of V1 neurons (De Valois et al.,
1982; Schiller et al., 1976). This Thesis is devoted to simple cells, but it also
includes ideas that can be useful in understanding the other major V1 cell type,
the complex cells.
Figure 1.1: Schemata of the four models discussed in this Thesis. Panel titles: A, the linear model of simple cells; B, the normalization model of simple cells; C, RC circuit implementation; D, the sandwich model of spike encoding. Diagram labels: retinal image, other cortical cells, current, firing rate, gain, frequency.
A: According to the linear model, simple cells perform a weighted average over
local space and recent time of the light intensities in the retinal image. Their
weighting function determines their selectivity. A rectification stage converts
the output of the linear stage into firing rate. B: The normalization model
extends the linear model by adding a divisive stage. The output of the linear
stage is divided by the pooled output of a large number of other cells. C: In
the RC implementation of the normalization model the linear stage injects
current into a circuit composed of a resistor and a capacitor in parallel. The
conductance of the resistor grows with the pooled output of a large number
of other cells. D: The sandwich model is a model of spike rate encoding. It
extends the rectification model by adding a high-pass linear filter after the
rectification stage.
Linearity
A longstanding view of simple cells is that they perform a weighted sum of the
intensity values in the visual stimuli (Movshon et al., 1978b). This linear model is
essentially the simplest possible model of the relation between visual stimuli and
cell responses. It is attractive because if it were correct it would be possible to
predict the responses of a simple cell to any visual stimulus, based on a limited
number of measurements. For example, any image can be approximated by a
number of small pixels. Measuring the cell response by lighting each pixel one by
one would enable us to predict the response to any visual stimulus.
To account for the fact that firing rates cannot be negative, the linear model
includes a rectification stage that follows the linear stage. The model is depicted
in Figure 1.1A. The linear stage scales the stimulus intensity at each location in
local space and recent time by the cell's sensitivity at that location and time, and
algebraically sums the results. The rectification stage models the transformation
of membrane potentials into firing rates; it embodies a threshold below which
no spikes are generated, and above which the firing rate grows linearly with the
membrane potential. Rectification is a static (or memoryless) nonlinearity, i.e. one
that depends only on the present value of its input and not on its past history.
As such, it does not profoundly alter the linearity of the output of the first stage
(Heeger, 1992a).
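The two stages just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-pixel weighting function, the threshold of zero and the unit gain are hypothetical values chosen for the example, not parameters taken from this Thesis.

```python
def linear_stage(weights, stimulus):
    """Weighted sum of stimulus intensities (one weight per pixel or time sample)."""
    return sum(w * s for w, s in zip(weights, stimulus))


def rectify(potential, threshold=0.0, gain=1.0):
    """Static nonlinearity: no output below threshold, linear growth above it."""
    return gain * max(0.0, potential - threshold)


# Hypothetical weighting function: excitatory center, inhibitory flanks.
weights = [-1.0, 2.0, -1.0]

# Intensities are expressed as deviations from the mean luminance.
bright_center = [0.0, 1.0, 0.0]
dark_center = [0.0, -1.0, 0.0]

rate_bright = rectify(linear_stage(weights, bright_center))
rate_dark = rectify(linear_stage(weights, dark_center))
```

In this sketch the linear stage gives equal and opposite outputs for the two stimuli, and the rectification stage passes only the positive one, so the dark bar produces no firing at all.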
In Chapter 2 we give the full definition of the linear model, we explain its
basic properties, and we provide a brief review of the vast number of studies that
were devoted to testing it. In these studies the linear model was found to be
largely successful in explaining the selectivity of simple cells for stimulus shape,
size, position, orientation, and direction of motion.
We then propose a biophysical implementation of the linear model, and we
discuss its plausibility. According to this implementation, simple cells receive both
excitation and inhibition arranged in push-pull, so that when one increases the
other decreases, and vice versa. The importance of this arrangement is that it
makes it possible for the visual stimuli to result in perfect current injection in the
cell, without any conductance increase. The main role of inhibition in this model
is to ensure linearity. Other arrangements, for example the extreme case in which
inhibition is absent, would result in nonlinear behavior.
We conclude Chapter 2 by pointing out some limitations of the linear model.
There are indeed many situations in which the linear model was found to fail. For
example, scaling the contrast of a stimulus would identically scale the responses of
a linear cell. At high contrasts, however, the responses of simple cells show clear
saturation (Maffei and Fiorentini, 1973). Moreover, simple cells are subject to
cross-orientation inhibition: the responses to an optimally-oriented stimulus can
be diminished by superimposing an orthogonal stimulus, which would be ineffective in driving the cell when presented alone (Morrone et al., 1982). While these
nonlinearities may be partially explained by a gain control mechanism operating
as early as in the retina (Shapley and Victor, 1978; Benardete et al., 1992; Lee
et al., 1994), there is evidence suggesting that they have an important cortical
component.
Gain Control
According to a view that has emerged in recent years, the nonlinearities of simple cells could
be explained by extending the linear model to include a gain-control stage (Albrecht
and Geisler, 1991; Heeger, 1991; DeAngelis et al., 1992). In particular, Heeger
(1991, 1992b) proposed a normalization model (Figure 1.1B), in which the
linear response of every cell is divided by the same number ("normalized"). This
number grows with the activity of a large number of cortical cells, the normalization
pool. If the pool is sufficiently varied in its composition, the normalization signal
can be shown to grow with the local stimulus energy, which is the variance of the
intensity values of the stimulus, measured over local space and recent time and
over a band of spatial and temporal frequencies.
The normalization model was shown through computer simulations to be very
successful in predicting the nonlinear aspects of simple cell responses while retaining the essential properties of the linear model (Heeger, 1992b, 1993). The model
attributes a cell's selectivity to the initial linear stage and its nonlinear behavior
to the normalization stage. For example, the model predicts response saturation
because increasing the contrast of a stimulus increases its energy and thus increases
the divisive suppression. Similarly, it predicts cross-orientation inhibition because
adding an orthogonal grating increases the stimulus energy.
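The qualitative argument above can be illustrated with a small Python sketch. The specific divisor used here, the square root of a constant plus the stimulus energy, is an assumption made for illustration and is not the equation fitted in Chapter 3; the constant `sigma` and the contrast values are likewise hypothetical.

```python
import math


def normalized_response(drive, energy, sigma=0.1):
    """Linear drive divided by a signal that grows with stimulus energy.

    The divisor sqrt(sigma**2 + energy) is an illustrative assumption.
    """
    return drive / math.sqrt(sigma ** 2 + energy)


def grating_response(contrast, mask_contrast=0.0, sigma=0.1):
    """Response to an optimal grating, optionally with an orthogonal mask.

    Assumption: the optimal grating drives the linear stage in proportion to
    its contrast, while the orthogonal mask adds energy but no linear drive.
    """
    energy = contrast ** 2 + mask_contrast ** 2
    return normalized_response(contrast, energy, sigma)


# Contrast response: a tenfold increase in contrast gives far less than a
# tenfold increase in response (saturation).
low, mid, high = (grating_response(c) for c in (0.1, 0.5, 1.0))

# Cross-orientation inhibition: the mask reduces the response to the grating.
masked = grating_response(0.5, mask_contrast=0.5)
```

With these assumed values the response rises by well under a factor of two as contrast rises tenfold, and superimposing the mask lowers the response to the 0.5-contrast grating, which are the two behaviors the text attributes to divisive suppression.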
Chapter 3 is devoted to testing the normalization model. We propose a biophysical implementation of the model, in which the cell membrane is a simple RC
circuit, i.e. one composed of a resistor and a capacitor in parallel (Figure 1.1C).
The linear stage injects synaptic current into the circuit, which outputs a membrane potential response. A rectification stage converts the latter into firing rate.
Normalization operates by shunting inhibition: the cells in the normalization pool
inhibit each other by increasing each other's membrane conductance. The conductance controls the gain of the transformation of input currents into output
membrane potentials.
To test the RC implementation of the normalization model we recorded from
simple cells in the primary visual cortex of paralyzed, anesthetized macaques, while
presenting very large sets of visual stimuli. We derived closed-form equations for
the model responses to such stimuli, and we found that these equations provide
good fits to the neural responses.
In addition to providing a biophysical substrate for the normalization model,
the RC implementation expands the range of phenomena predicted by the model.
In particular, in the RC implementation the conductance controls both the gain
and the time constant of the membrane. As a consequence the latency of the model
responses and the temporal filtering properties of the model cells depend on the
stimulus energy. This property allows the RC implementation to capture a number
of temporal nonlinearities in the responses of V1 cells that were not explained by
the original formulation of the normalization model (Figure 1.1B). These include
the decreased response latency observed with increased stimulus energy (Dean
and Tolhurst, 1986; Reid et al., 1992), and the dependence of temporal frequency
tuning on stimulus contrast (Holub and Morton-Gibson, 1981; Hawken et al., 1992;
Albrecht, 1995).
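The link between conductance, gain and time constant follows from the standard equation of a parallel RC circuit, C dV/dt = I - gV, whose steady-state response is I/g and whose time constant is C/g. The sketch below integrates this equation numerically for a current step; the unit capacitance, the two conductance values and the integration step are arbitrary choices made for illustration.

```python
import math


def rc_step_response(current, conductance, capacitance=1.0, dt=0.001, duration=20.0):
    """Euler integration of C dV/dt = I - g V for a current step starting at t = 0."""
    v, trace = 0.0, []
    for _ in range(int(duration / dt)):
        v += dt * (current - conductance * v) / capacitance
        trace.append(v)
    return trace


def time_to_fraction(trace, fraction, dt=0.001):
    """First time at which the trace reaches the given fraction of its final value."""
    target = fraction * trace[-1]
    for i, v in enumerate(trace):
        if v >= target:
            return (i + 1) * dt
    return None


low_g = rc_step_response(current=1.0, conductance=1.0)
high_g = rc_step_response(current=1.0, conductance=4.0)

# Gain: steady-state potential per unit of injected current.
gain_low, gain_high = low_g[-1], high_g[-1]

# Time constant: time needed to reach 1 - 1/e of the final value.
tau_low = time_to_fraction(low_g, 1.0 - math.exp(-1.0))
tau_high = time_to_fraction(high_g, 1.0 - math.exp(-1.0))
```

Quadrupling the conductance divides both the steady-state response and the time constant by about four, so a single parameter makes the model cell both less sensitive and faster.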
Is shunting inhibition really the mechanism underlying gain control in the cortex? We tested the model using extracellular data, so we have no direct proof that
the overall conductance grows with stimulus energy. There are actually reasons to
doubt that this is the case: intracellular in vivo measurements have consistently
failed to demonstrate large conductance increases related to visual stimulation
(Berman et al., 1991; Ferster and Jagadeesh, 1992). As a result, the true biophysical substrate of gain control is uncertain. The main advantage of the RC model is
that it constitutes the simplest possible way for the normalization pool to control
both the gain and the dynamics of a cell's response. Until further data is available
the RC model should be considered something between a phenomenological model
and a true biophysical description.
Spike Encoding
There are other ways in which gain control could operate. For example, the increases in conductance could be localized to the axon hillock, and thus be reflected
only in the spike train outputs of the cell, rather than in its membrane potential
responses. This would be consistent with recent evidence that the membrane potential responses do not show the nonlinearities that are found in the spike train
responses (Jagadeesh et al., 1993) and which the normalization model attributes
to gain control (Heeger, 1993). This brings us to the topic of spike encoding.
Both the linear and the normalization models assume that the transformation
of intracellular signals into spike trains by cortical cells is well approximated by
rectification, and that it is independent of the visual stimulus.
The first assumption is acceptable only if one confines it to the steady-state
responses, but it is overly simplistic if applied to the time-varying responses. There
is indeed a large body of literature pointing to a linear or bilinear steady-state relation between injected current and firing rate, once the current is above a threshold
level (Stafstrom et al., 1984b). Rectification, however, fails to predict the time-varying responses of cortical cells because it is a static nonlinearity. This property
turns out to be an oversimplification. For example, when the stimuli are current
steps, the firing rate of some cortical cells displays prominent adaptation (Connors
et al., 1982). Firing rate thus depends not only on the injected current, but also
on time.
Can the rectification model be extended so as to correctly predict the time-varying responses of cortical cells? Chapter 4 is devoted to answering this question.
We performed a series of intracellular in vitro experiments on slices of guinea pig
visual cortex. We injected currents of various waveforms, and analyzed the cells'
membrane potential and spike train responses. We found that the properties of
the spike train responses to single sinusoids are very different from those of the
responses to noise. In the first case the cells act as band-pass filters, and their
responses are very nonlinear. In the latter case, the cells are more responsive, and
encode all the frequencies between 0.1 and 130 Hz equally well. In addition, the
responses to noise are much more linear, a phenomenon known as "linearization
by noise" (Spekreijse and Oosting, 1970; French et al., 1972).
To account quantitatively for our results we propose an extension of the rectification model (Figure 1.1D). We call it a sandwich model, since it essentially consists
of a static nonlinearity (the rectification stage) sandwiched between two linear
filters (Victor et al., 1977; Korenberg et al., 1989). The linear filter that precedes
the rectification stage is the RC model of the membrane, which is low-pass. The
linear filter that follows the rectification stage is high-pass. The model includes a
final rectification stage (not shown in the Figure) which passes only the positive
responses, ensuring that the predicted firing rates are positive. A similar model
was proposed by French and Korenberg (1989) to describe the transformation of
injected currents into spike trains by cockroach mechanoreceptors.
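The cascade described above (low-pass filter, rectification, high-pass filter, final rectification) can be sketched as follows. This is a rough illustration and not the model fitted in Chapter 4: the first-order filters, the time constants, the threshold and the `dc_gain` parameter, which lets a fraction of the steady input through the second filter, are all assumptions made for the example.

```python
def low_pass(signal, tau, dt):
    """First-order low-pass filter (Euler discretization of an RC membrane)."""
    out, y = [], 0.0
    for x in signal:
        y += dt * (x - y) / tau
        out.append(y)
    return out


def high_pass(signal, tau, dt, dc_gain=0.3):
    """Signal minus part of its slow running average.

    Transients pass almost unchanged, steady levels are attenuated to dc_gain.
    """
    slow = low_pass(signal, tau, dt)
    return [x - (1.0 - dc_gain) * s for x, s in zip(signal, slow)]


def rectify(signal, threshold=0.0):
    """Static nonlinearity applied sample by sample."""
    return [max(0.0, x - threshold) for x in signal]


def sandwich_model(current, dt=0.001, tau_membrane=0.01, tau_adapt=0.1, threshold=0.2):
    """Low-pass filter, rectification, high-pass filter, final rectification."""
    potential = low_pass(current, tau_membrane, dt)
    drive = rectify(potential, threshold)
    return rectify(high_pass(drive, tau_adapt, dt))


# Current step: 100 ms of silence followed by 900 ms of constant current.
step = [0.0] * 100 + [1.0] * 900
rate = sandwich_model(step)
```

In response to the step, the sketched firing rate jumps at onset and then relaxes to a lower sustained level. A static rectifier alone cannot produce this kind of adaptation, which is why a filter is needed after the nonlinearity.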
Based on our results, we suggest that spike encoding is likely to play a role
in numerous nonlinearities present in the responses of V1 cells to visual contrast
modulation, nonlinearities that in Chapter 3 we ascribe entirely to the normalization mechanism. These include the dependence of temporal frequency tuning
on stimulus contrast (Holub and Morton-Gibson, 1981; Hawken et al., 1992) and
on stimulus bandwidth (Reid et al., 1992), the dependence of response latency on
stimulus contrast (Dean and Tolhurst, 1986) and stimulus bandwidth (Reid et al.,
1992), and other nonlinearities of temporal summation (Tolhurst et al., 1980).
In the Conclusions (Chapter 5) we speculate on the possibility that rather than
acting on the first linear filter (by increasing the conductance of the RC circuit),
normalization may operate on the second linear filter. The spike encoding mechanism occupies a strategic position as a bottleneck for the outflow of information
from the cells. The results of Chapter 4 indicate that this mechanism has important
dynamical properties. These properties make this mechanism the ideal candidate
for the site of normalization, which we have shown in Chapter 3 to affect both the
gain and the temporal properties of the responses.
Chapter 2
Linearity
This Chapter provides a brief review of the linear model of simple cell responses.
We give the full definition of the linear model (Section 2.1). We explain its basic
properties, and we summarize the vast number of studies that were devoted to
testing it (Section 2.2). In these studies the model was found to be largely successful in explaining the selectivity of simple cells for stimulus shape, size, position,
orientation and direction of motion.
We then propose a biophysical implementation of the linear model, and we
discuss its plausibility (Section 2.3). In this implementation, simple cells receive
both excitation and inhibition arranged in push-pull, so that when one increases
the other decreases, and vice versa. The importance of this arrangement is that
it makes it possible for the visual stimuli to result in perfect current injection in
the cell, without any conductance increase. Conductance increases would result in
nonlinear behavior.
In our view, the main role of inhibition is to ensure linearity. By contrast, most
other models of the visual cortex invoke inhibition to enhance the orientation or
direction selectivity of the cells (see e.g. reviews by Ferster and Koch, 1987, and
by Bonds, 1992).
At various points in this Chapter we encounter situations in which the linear
model was found to fail, and we suggest that these nonlinearities would be explained by the presence of a gain control mechanism. Section 2.4, in particular,
lists a number of these nonlinearities, and illustrates them (Figures 2.11-2.14) with
data borrowed from Chapter 3, which is an experimental study of the gain control
mechanism.
2.1 The Linear Model of Simple Cells
2.1.1 Visual Stimuli In Space-Time
As a visual stimulus is projected on the retina it can be described by its intensity
distribution I(x, y, t), which varies in the two spatial dimensions x, y and in time
t. This representation ignores the color of the stimulus and assumes monocular
viewing, but is in all other respects complete. Consider for example a stimulus
consisting of a dark vertical bar drifting from left to right, on a white background.
Figure 2.1A shows the bar at a particular instant in time. Panel B shows that as
the bar drifts from left to right, it can be considered as a solid in the x-y-t space.
Panel C shows a snapshot of the volume taken from above, a space-time (x-t) plot
which ignores the y dimension.
Different velocities result in different orientations in space-time (Fahle and Poggio, 1981; Adelson and Bergen, 1985; van Santen and Sperling, 1985; Watson and
Ahumada, 1985). For example, if the bar in Figure 2.1A were going faster, its
space-time (x-t) representation (Panel C) would be more tilted towards
the horizontal. Had the bar been motionless, its x-t representation would have
been vertical. Had the bar been going from right to left, the orientation of its x-t
representation would have been opposite to the one in Figure 2.1C.
Figure 2.1: A stimulus and a receptive field in space-time. A: A vertical
bar translating to the right. B: The space-time volume of stimulus intensities corresponding to motion of the vertical bar. C: An x-t slice through the
space-time volume. Orientation in the x-t slice is the horizontal component
of velocity. D: An x-y section of a spatiotemporal weighting function. Dark
areas represent locations where the weighting function is negative, bright
areas represent locations where it is positive. E: The same x-y section together with a projection of the weighting function on the x-t plane. F: The
x-t projection, which ignores the y dimension. The receptive field travels
in time t, from past to future. Panels A-C are based on an illustration by
Adelson and Bergen (1985).
2.1.2 Spatiotemporal Weighting Functions
A defining property of linearity is that of superposition: if L1 is the response
to stimulus I1, and L2 is the response to stimulus I2, then the response of a
linear system to the sum of the stimuli I1 + I2 is just the sum of the responses,
L1 + L2. While the property of superposition may sound a little abstract, there is
an equivalent statement that will make it concrete: simple cells are linear if and
only if their responses are a weighted sum of the light intensities falling on their
receptive fields.
Figure 2.1D-F shows a schematic of a spatiotemporal weighting function. Panel D
shows a space-space (x-y) section of the weighting function. Panel F shows the
space-time (x-t) projection of the weighting function, i.e. a snapshot of the weighting function taken from above. The relation between the two is shown in Panel E.
As we will see, real simple cell weighting functions do not look too different from
this idealization (Figure 2.6).
The response of a linear cell is simply obtained by weighting the stimulus
intensity I at each location and time by the value of the cell's weighting function
W at that location and at that time, and by summing the results:
\[
L(t) = \iiint W(x, y, T)\, I(x, y, t - T)\, dx\, dy\, dT. \tag{2.1}
\]
The cell travels in time t from past to future (as we all do), while its retinal
(x-y) location remains fixed. The weighting function is zero for any time that lies
in the future, because the responses of the cell cannot depend on future events.
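A discrete version of Equation 2.1 makes the weighted sum and the causality constraint concrete. The sketch below uses a single spatial dimension and replaces the integrals with sums; the weights and stimuli are small hypothetical arrays, and stimulus samples before the first time step are assumed to be zero. It also checks superposition numerically.

```python
def linear_response(weights, stimulus, t):
    """Discrete form of Equation 2.1 in one spatial dimension.

    weights[x][T] is the weight at position x and delay T >= 0, so the
    response never depends on future stimulus values.
    stimulus[x][t] is the intensity at position x and time step t.
    """
    total = 0.0
    for x, row in enumerate(weights):
        for delay, w in enumerate(row):
            if t - delay >= 0:  # samples before the first time step count as zero
                total += w * stimulus[x][t - delay]
    return total


# Hypothetical weighting function: 3 positions by 2 delays.
weights = [[0.5, -0.25], [1.0, 0.5], [0.5, -0.25]]

# Two hypothetical stimuli: 3 positions by 4 time steps.
stim_a = [[0, 1, 0, 0], [1, 0, 0, 2], [0, 0, 1, 0]]
stim_b = [[2, 0, 0, 1], [0, 0, 1, 0], [1, 1, 0, 0]]
stim_sum = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(stim_a, stim_b)]

times = range(4)
resp_a = [linear_response(weights, stim_a, t) for t in times]
resp_b = [linear_response(weights, stim_b, t) for t in times]
resp_sum = [linear_response(weights, stim_sum, t) for t in times]

# Superposition: the response to the summed stimulus is the sum of the responses.
superposition_holds = all(
    abs(s - (a + b)) < 1e-12 for s, a, b in zip(resp_sum, resp_a, resp_b)
)
```

Because every term is a product of a fixed weight and a stimulus value, superposition holds for any choice of weights, which is the equivalence between linearity and weighted summation stated earlier in this section.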
The spatiotemporal weighting function W of a linear cell determines its selectivity (e.g., for orientation or direction of motion). In particular, several researchers
have pointed out that a linear cell is direction selective if and only if the subregions
of its weighting function are tilted along an oblique axis in space-time (Fahle and
Poggio, 1981; Adelson and Bergen, 1985; van Santen and Sperling, 1985; Watson
and Ahumada, 1985).
Figure 2.2 illustrates how this selectivity arises, by showing the responses of
a linear cell with a space-time oriented weighting function to a drifting grating
stimulus. In Panels A-D the grating drifts from left to right, and the resulting
space-time orientation is very similar to the space-time orientation of the weighting
function. This results in strong responses (Panel A). In Panels E-H the grating
drifts in the opposite direction, and the resulting space-time orientation is almost
orthogonal to that of the weighting function. This results in very small responses
(Panel E) because the weighting function averages out the variations in intensity
present in the stimulus.
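The argument above can be checked numerically. The following sketch assumes a hypothetical Gabor-like weighting function that is oriented in space-time (one spatial dimension, Gaussian envelopes, arbitrary parameter values) and compares the peak responses to gratings drifting in the two directions. None of the parameter values come from measured cells.

```python
import math

def weight(x, T, fx=1.0, ft=10.0, sx=0.5, T0=0.2, sT=0.05):
    """Hypothetical space-time oriented weighting function W(x, T)."""
    envelope = (math.exp(-x ** 2 / (2 * sx ** 2))
                * math.exp(-(T - T0) ** 2 / (2 * sT ** 2)))
    return envelope * math.cos(2 * math.pi * (fx * x + ft * T))

def grating(x, t, direction, fx=1.0, ft=10.0):
    """Drifting grating; direction = +1 drifts toward +x, -1 toward -x."""
    return math.cos(2 * math.pi * (fx * x - direction * ft * t))

XS = [-2.0 + 0.1 * i for i in range(41)]   # positions (deg)
TS = [0.01 * j for j in range(41)]         # delays T >= 0 (s)

def response(t, direction):
    """Discrete version of Equation 2.1 (one spatial dimension)."""
    return sum(weight(x, T) * grating(x, t - T, direction)
               for x in XS for T in TS)

def amplitude(direction):
    """Peak absolute response over one temporal period of the grating."""
    return max(abs(response(0.005 * k, direction)) for k in range(20))

pref = amplitude(+1)   # grating whose x-t orientation matches the weights
null = amplitude(-1)   # grating drifting in the opposite direction
```

In this sketch `pref` is much larger than `null`, because in the nonpreferred direction the oscillations of the stimulus are averaged out under the envelope of the weighting function.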
If a linear cell's weighting function is not tilted along an oblique axis in space-time, then the cell will not have a preference for direction of motion. Figure 2.3, for
example, shows the space-time projections of three different weighting functions.
The weighting function in Panel A prefers stationary objects, since it is vertical
in the x-t plane. The weighting function in Panel B prefers moving or flickering
objects but has no preference for the direction of motion. These two weighting
functions cannot be direction selective because they are space-time separable, i.e.
their weighting functions W(x, y, t) can be expressed as the product of a function
Figure 2.2: Direction selectivity in a linear cell. A: Responses to gratings drifting in the preferred direction. The stimulus elicits large responses.
B-D: Relative space-time positions of weighting function and stimulus at
three instants in time. When the central excitatory region of the weighting
function is aligned with a dark bar (B), the response is negative. When
it is aligned with a bright bar, the response is positive (C), and so on. E:
Responses to gratings drifting in the opposite direction. The stimulus elicits
small responses. F-H: Relative space-time positions of weighting function
and stimulus at three instants in time. At any given time, each bar of the
grating is covering both excitatory and inhibitory subregions of the weighting function, whose outputs are averaged out by the cell. The filled areas in
Panels A and E show the parts of the responses that would be visible after
rectification. (Axes: response against time in Panels A and E; x against t in the other panels.)
Figure 2.3: Contour plots of three spatiotemporal weighting functions, averaged across one spatial dimension. Continuous curves: positive contours.
Dashed curves: negative contours. A: A space-time separable weighting
function, the product of a center-surround spatial weighting function and
of a monophasic temporal weighting function. B: Another space-time separable weighting function, spatially displaced with respect to A and with
a biphasic temporal weighting function that is less delayed than the one in
A. C: Non-separable weighting function obtained by summing the ones in
A and B. The result is a space-time oriented weighting function resembling
that of the direction selective cell in Figure 2.6. (Axes: space and time.)
of space x, y and a function of time t. The weighting function in Panel C is clearly
tilted along an oblique axis in space-time, and is direction selective. Note that it
is not space-time separable.
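Separability can be tested mechanically on a sampled weighting function. The sketch below uses the fact that a matrix W[T][x] is the product of a function of T and a function of x exactly when all of its 2x2 minors vanish. The spatial and temporal profiles are hypothetical toy values chosen only to mimic the description of Panels A-C of Figure 2.3; they are not fitted to any data.

```python
def outer(h, g):
    """Separable weighting function W[T][x] = h[T] * g[x]."""
    return [[ht * gx for gx in g] for ht in h]

def is_separable(W, tol=1e-9):
    """True if W is (numerically) a product of a function of T and one of x.

    A matrix has this rank-one form exactly when every 2x2 minor vanishes.
    """
    rows, cols = len(W), len(W[0])
    for i in range(rows):
        for j in range(i + 1, rows):
            for a in range(cols):
                for b in range(a + 1, cols):
                    minor = W[i][a] * W[j][b] - W[i][b] * W[j][a]
                    if abs(minor) > tol:
                        return False
    return True

# Hypothetical toy profiles (illustrative values only).
g_A = [-1.0, 2.0, -1.0, 0.0]     # center-surround spatial profile
h_A = [0.0, 1.0, 2.0, 1.0]       # monophasic, more delayed temporal profile
g_B = [0.0, -1.0, 2.0, -1.0]     # same spatial profile, displaced
h_B = [1.0, 2.0, -1.0, -0.5]     # biphasic, less delayed temporal profile

W_A = outer(h_A, g_A)
W_B = outer(h_B, g_B)
# Summing the two separable functions gives a non-separable one.
W_C = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(W_A, W_B)]
```

Here `is_separable(W_A)` and `is_separable(W_B)` hold while `is_separable(W_C)` does not, which mirrors the relation between Panels A, B and C.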
2.1.3 A Nonlinearity: Light Adaptation
In characterizing simple cells as spatiotemporal linear neurons, we have neglected
an important (retinal) nonlinearity: light adaptation (Shapley and Enroth-Cugell,
1984). We can, however, safely ignore light adaptation by restricting our choice
of visual stimuli to luminance distributions l(x, y, t) that modulate (transiently)
about a fixed mean/background luminance \bar{l}. Examples are drifting grating patterns and drifting or briefly flashed bars that are either brighter or darker than the
mean. In these conditions the retina can be considered to be in a fixed state of
adaptation, and its output is proportional to the "local contrast" I(x, y, t) = [l(x, y, t) - \bar{l}] / \bar{l}
of the stimulus (Shapley and Enroth-Cugell, 1984).
To avoid confusion we (improperly) use the term intensity to refer to the local
contrast, and we reserve the term contrast for the maximum absolute value of the
local contrast of a grating stimulus. The maximum contrast of a grating is 1, which
is attained when the lowest intensity is zero and the highest intensity is twice the
mean. Finally, we use the term local energy to denote the variance of the stimulus
over local space and recent time, and within a band of spatiotemporal frequencies.
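The definitions of intensity (local contrast) and contrast given above translate directly into code. This is a minimal sketch; the function names and the luminance values are hypothetical, and the luminance units are arbitrary as long as they are the same for the stimulus and the mean.

```python
def local_contrast(luminance, mean_luminance):
    """Intensity I = (l - mean) / mean for each luminance sample l."""
    return [(l - mean_luminance) / mean_luminance for l in luminance]

def grating_contrast(luminance, mean_luminance):
    """Contrast: maximum absolute value of the local contrast."""
    return max(abs(i) for i in local_contrast(luminance, mean_luminance))

# A grating whose lowest luminance is zero and whose highest luminance is
# twice the mean reaches the maximum contrast of 1.
samples = [0.0, 50.0, 100.0]
intensities = local_contrast(samples, mean_luminance=50.0)
contrast = grating_contrast(samples, mean_luminance=50.0)
```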
2.1.4 Another Nonlinearity: Rectification
Spatiotemporal linear weighting functions are intended to be models of the intracellular responses of simple cells. Most of the data discussed in this Chapter, however,
were obtained extracellularly. To model extracellular data, one is forced to consider also the
transformation of membrane potentials into firing rates.
This transformation is bound to introduce a nonlinearity. The responses of a
linear cell would assume both positive and negative values. Likewise, the membrane
potential fluctuates above and below a cell's resting potential. Firing rates, on the
other hand, are by definition nonnegative. A linear cell with a high maintained firing
rate could encode the positive and negative values by responding either more or
less than the maintained rate. This is for example typical of retinal ganglion
cells (Enroth-Cugell and Robson, 1966). Simple cells, however, have very little
maintained discharge. Since their negative responses cannot be encoded in their
firing rate, simple cells cannot act truly linearly.
As we discuss in Section 2.1.4, the transformation of membrane potential into
spike rate can be approximated by rectification, that is by a function that is zero
for membrane potentials below a threshold, and that grows linearly from there on.
Figure 2.4 shows some examples of rectication. The three solid lines depict cases
Figure 2.4: Three possible transformations of membrane potential into firing rate. For simplicity the resting potential is assigned the value Vrest = 0.
The continuous lines represent rectification with threshold Vthresh. The
thick, intermediate and thin lines represent respectively the cases in which
the threshold is Vthresh = 0, 5 and 10 mV. The dashed lines represent approximations to rectification. These approximations are useful to simplify
the mathematics of the normalization model (Appendix A). They are power
functions of the positive deviation from resting potential, for different exponents n. The exponent n is 2 for the thicker dashed curve, 3 for the thinner
dashed curve. (Axes: firing rate in spikes/sec, from 0 to 80, against membrane potential in mV, from -5 to 20.)
in which the firing threshold is respectively 0, 5 and 10 mV away from the resting
potential. Technically the first example of rectification, with a threshold at Vrest,
is called "half-rectification". The other two are called "over-rectification", since
their threshold is above the resting potential. We use the term "rectification" to
include all these cases.
Rectification is a static nonlinearity, that is one that depends only on the
instantaneous value of its input and not on its past history. Adding rectification
after a spatiotemporal linear weighting function does not substantially alter the
selectivity or other basic properties of the responses (Heeger, 1992a).
In the following, whenever we refer to the linear model of simple cells, we tacitly
assume it to be a spatiotemporal linear weighting function followed by rectification,
as shown in Figure 1.1A (page 2).
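The rectification stage and its power-function approximations (Figure 2.4) can be sketched as follows. The function names and the `gain` parameter are assumptions made for the example; the text does not specify the slope of the rectifier or the scale of the power functions. The resting potential is taken to be 0 mV, as in Figure 2.4.

```python
def rectify(v, v_thresh=0.0, gain=1.0):
    """Rectification: zero below threshold, linear growth above it.

    v and v_thresh are membrane potentials in mV relative to rest.
    v_thresh = 0 gives half-rectification; v_thresh > 0 gives
    over-rectification. `gain` (spikes/sec per mV) is a hypothetical slope.
    """
    return gain * max(0.0, v - v_thresh)

def power_law(v, n=2, gain=1.0, v_rest=0.0):
    """Power function of the positive deviation from resting potential."""
    return gain * max(0.0, v - v_rest) ** n

# Being static, both functions depend only on the instantaneous input.
potentials = [-5.0, 0.0, 5.0, 10.0]
half = [rectify(v) for v in potentials]
over = [rectify(v, v_thresh=5.0) for v in potentials]
squared = [power_law(v, n=2) for v in potentials]
```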
2.2 Some Linear Properties of Simple Cells
This Section describes some experimental results that provide strong evidence
in favor of the linear model of simple cells (see Heeger, 1992a, 1993, for a more
thorough review). Most of the nonlinearities that are mentioned in this Section
are explained by the rectification stage that transforms intracellular responses into
firing rates.
2.2.1 Responses to Impulses
When Hubel and Wiesel (1962) first mapped the receptive fields of V1 cells, their
stimuli were bright flashing bars. Depending on whether a region responded positively to the onset or to the offset of a bright bar, they termed that region an ON
or an OFF subregion.
Figure 2.5: Flashed bar stimuli and ON- or OFF- subregions of a linear
cell's receptive field. A: Spatiotemporal structure of a flashed bar stimulus.
The x-t projection of a flashed bar is a vertical rectangle. B: Response of
a linear cell when the bar is flashed on an OFF subregion of the receptive
field. The response is negative when the bar is turned on, positive when
it is turned off. The left panel shows the responses as a function of time,
before rectification. The other three panels represent receptive field and
stimulus at three instants in time. C: as in B, except that the bar is flashed
on an ON subregion of its receptive field. The filled areas in the leftmost
panel in B and C show the parts of the responses that would be visible after
rectification.
The linear model predicts the existence of these subregions (Emerson, 1988;
Heeger, 1992a). This can be understood by considering the full space-time representation of a flashed bar, which is shown in Figure 2.5A. Since the bar does not
change position in time, its space-time (x-t) projection is vertical. The top and
bottom ends of the rectangle are respectively the times at which the bar is turned
on and off. The responses of a linear weighting function to such a stimulus are
depicted in Figure 2.5B-C. Figure 2.5B shows the case in which the bar is flashed
in an OFF subregion. As the weighting function travels down in time, the first
subregion of the weighting function that hits the stimulus is inhibitory; this gives a
negative response. Later the stimulus overlaps both the excitatory and inhibitory
subregions, so the response is about zero. Finally, when the stimulus overlaps only
the excitatory subregion (right panel of Figure 2.5B) the weighting function gives
a positive response. In sum, the response is negative just after the bar is turned
on and positive just after it is turned off. The opposite will happen when the bar
is flashed in an ON subregion (Figure 2.5C): the response is positive just after
the bar is turned on and negative just after it is turned off. After the responses in
the left panels of Figure 2.5B and C are passed through a rectification stage that
shows only their positive parts (shaded areas), they closely resemble the spike rate
responses of a real simple cell.
Since Hubel and Wiesel's original work, the method for mapping a receptive
field has been made more quantitative by having a computer show sequences of bars
in many different positions and recording the correlation between firing rate and
light intensity. Such a correlation depends on space and time: for each location x, y
and delay time T, one can measure the correlation between the spike train R(t)
Figure 2.6: The full space-time receptive field of a simple cell, as obtained
with the reverse correlation method. The four upper panels represent x-y
snapshots of the receptive field measured at different times T in the past (T = 50, 100, 150 and 200 ms).
Gray levels indicate the correlation between the appearance of a bar and the
firing rate T ms later. Zero correlation is indicated by mid-gray. Lighter
grays indicate points of positive correlation with the appearance of a bright
bar. Darker grays indicate points of positive correlation with the appearance
of a dark bar. A large number of snapshots like these are stacked to build a
full space-time receptive field, whose space-time projection is shown at the
bottom. This previously unpublished Figure is courtesy of Greg DeAngelis.
Cell is part of sample published in (DeAngelis et al., 1993a). (Axes: X and Y in degrees, from 0 to 4; T in msec, from 0 to 300.)
and the sequence of stimulus intensities that occurred T seconds before at that
x, y location, I(x, y, t - T). The value of the correlation, which can be positive
or negative, is taken as the strength of the weighting function at that position
and time, W(x, y, T). This method is called reverse correlation, and allows the
measurement of full space-time (x-y-t) weighting functions (deBoer and Kuyper,
1968; McLean and Palmer, 1989; Shapley et al., 1991; DeAngelis et al., 1993a,b;
McLean et al., 1994). Figure 2.6 shows the full space-time weighting function of a
simple cell, measured with the reverse correlation technique. The four upper panels
represent x-y snapshots of the weighting function measured at different times T
in the past. A large number of snapshots like these are stacked to build a full
space-time weighting function, whose x-t structure (averaged over the y axis) is
shown at the bottom of the figure.
The reverse correlation method can be applied to any visual cell, linear or
nonlinear, and it will always give a result, i.e. a full space-time weighting function.
For a linear cell, however, such a weighting function could then be used to predict
the cell's responses to any visual stimulus (using Equation 2.1).
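For a purely linear cell the logic of reverse correlation can be sketched in a few lines. The example below makes several simplifying assumptions that are not taken from the studies cited above: space is one-dimensional, the stimulus is binary white noise (each position is independently bright, +1, or dark, -1, at every time step), the response is the unrectified output of Equation 2.1 rather than a spike train, and the ground-truth weighting function `W_true` is a hypothetical toy example.

```python
import random

def simulate_linear_cell(W, stimulus):
    """Response L(t) = sum over T, x of W[T][x] * stimulus[t - T][x]."""
    n_delays = len(W)
    responses = []
    for t in range(len(stimulus)):
        total = 0.0
        # Only delays that point at existing stimulus frames are used.
        for T in range(min(n_delays, t + 1)):
            total += sum(w * s for w, s in zip(W[T], stimulus[t - T]))
        responses.append(total)
    return responses

def reverse_correlate(responses, stimulus, n_delays):
    """Estimate W[T][x] as the average of L(t) * I(x, t - T) over time."""
    n_t = len(stimulus)
    n_x = len(stimulus[0])
    estimate = []
    for T in range(n_delays):
        row = []
        for x in range(n_x):
            acc = sum(responses[t] * stimulus[t - T][x]
                      for t in range(T, n_t))
            row.append(acc / (n_t - T))
        estimate.append(row)
    return estimate

rng = random.Random(0)
# Hypothetical ground-truth weighting function: 3 delays x 3 positions.
W_true = [[0.0, 1.0, -1.0],
          [1.0, -1.0, 0.0],
          [-0.5, 0.0, 0.5]]
stimulus = [[1.0 if rng.random() < 0.5 else -1.0 for _ in range(3)]
            for _ in range(20000)]
responses = simulate_linear_cell(W_true, stimulus)
W_est = reverse_correlate(responses, stimulus, n_delays=3)
```

Because the noise has unit variance and is uncorrelated across positions and time steps, `W_est` approaches `W_true` as the number of frames grows. For a nonlinear cell the same procedure still returns an estimate, but that estimate need not predict the responses to other stimuli.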
This property of linear systems can be used to test whether simple cells are
linear. For example, one can ask whether direction selectivity in simple cells is fully
explained by an underlying linear stage. The linear model predicts that simple cells
are direction selective only if their weighting functions are oriented in space-time,
and thus nonseparable (Figure 2.3).
This prediction was tested by McLean and Palmer (1989, 1994), Shapley, Reid
and Soodak (1991), Emerson and Citron (1992) and DeAngelis et al. (1993b).
Their findings are mostly consistent with the linear model. They found simple
cells with weighting functions tilted along an oblique axis (inseparable) in space-time,
like the one in Figure 2.6. These cells were all direction selective, and the
preferred direction of motion was always correctly predicted from the orientation
of the weighting function. For example, the cell of Figure 2.6 was highly direction
selective and preferred stimuli moving from right to left in the x-y plane. A number
of simple cells were found to have space-time separable weighting functions, like
the ones depicted in Figure 2.3A and B. Consistent with the linear model, most of
these cells were not direction selective (see footnote 1).
2.2.2 Responses to Drifting Gratings
The stimulus of choice for linear systems analysis of the visual system is a stimulus
whose luminance varies sinusoidally in space and time, the sine grating. There are
many advantages to using sine gratings (reviewed in Enroth-Cugell and Robson,
1984), the most important being that linear systems are guaranteed to respond
to sinusoidal modulation with a sinusoid. For example, had the modulation in
luminance of the grating in Figure 2.2 been sinusoidal, the responses in Panels A
and E would have been perfect sinusoids. The deviation of the responses from
pure sinusoids can provide a quantitative measure of nonlinearity (Hochstein and
Shapley, 1976). Sine gratings were first used to study the neurophysiology of the
visual system by Enroth-Cugell and Robson (1966), who demonstrated the linearity
of cat retinal X ganglion cells. One of their tests of linearity involved comparing
the responses to gratings with the responses to luminance edges. The logic of this
experiment is straightforward: since an edge is composed of the sum of a number
of gratings, the responses of a linear cell to an edge would be predictable from its
response to gratings.
Footnote 1: Some cells, however, are direction selective even though they have space-time separable
weighting functions (McLean and Palmer, 1989; Emerson and Citron, 1992; McLean et al., 1994).
The behavior of these cells cannot be accounted for by models like those advocated in this
Chapter, in which direction selectivity is due to an underlying spatiotemporal linear stage.
A similar experimental paradigm was applied to the study of simple cells by
Movshon et al. (1978a). They measured the sensitivity of the cells to drifting
gratings of different spatial frequencies, and the sensitivity of different receptive
field regions to flashing bars. Since a grating is composed of the sum of a number of bars, the response of a linear cell to a grating is predictable (via Fourier
transform) from its response to bars. Likewise, since a bar can be thought of as
the sum of a number of gratings, the response to a bar is predictable (via inverse
Fourier transform) from the response to gratings. Movshon et al. found good
agreement between the weighting function predicted by inverse Fourier transform
of the grating sensitivity and the weighting function obtained from the flashing
bar data (Figure 2.7). This supports the linear model of simple cell responses.
Many other studies have compared grating responses to impulse responses (Maffei et al., 1979; Andrews and Pollen, 1979; Glezer et al., 1980; Kulikowski and
Bishop, 1981b,a; Dean and Tolhurst, 1983; Field and Tolhurst, 1986; Jones and
Palmer, 1987b; Jones et al., 1987; Jones and Palmer, 1987a; Tadmor and Tolhurst,
1989; Shapley et al., 1991; DeAngelis et al., 1993b). In many cases, the inverse
transform of the response to gratings gives a weighting function with additional
side bands beyond those measured directly. In addition, the measured response to
gratings is often more narrowly tuned for spatial frequency than predicted from
the Fourier transform of the response to impulses. This discrepancy between the
grating responses and the impulse responses can be explained by over-rectication,
which conceals the impulse responses of the weaker receptive field regions, so it is
consistent with the linear model (Tadmor and Tolhurst, 1989; Heeger, 1992a; Tolhurst and Heeger, 1996b).
Figure 2.7: Linearity of spatial summation in four cat V1 simple cells.
Spatial weighting functions as measured with flashing bars (histograms) and
as predicted by inverse Fourier transformation of the spatial frequency tuning curves (continuous curves). For a linear cell the two would be identical.
Plots show one spatial dimension (e.g. x), and collapse all information about
the other two dimensions (y and t). Both the observed and predicted weighting functions were independently rescaled. Positive values in each weighting
function represent incremental responses to the introduction of a bright bar;
negative values represent incremental responses to the introduction of a dark
bar. Insets: The spatial frequency tuning curves used to compute each predicted weighting function. The abscissa of these insets is spatial frequency
(in cycles/degree) and the ordinate is contrast sensitivity, the inverse of the
threshold contrast value for each spatial frequency. Reprinted with permission from (Movshon et al., 1978a).
Some of the above mentioned studies, on the other hand, unveiled a serious
failure of linearity: a discrepancy between the predicted and actual sizes of the
responses. For example, when Movshon et al. (1978a) compared the observed and
predicted weighting functions (Figure 2.7), they did so only up to an arbitrary
amplitude scaling factor. This scaling factor should not be necessary according to
the linear model. We will see in Chapter 3 that the normalization model predicts
this failure of linearity. In particular, it predicts that the cell's gain is different
when it is stimulated with flashed bars from when it is stimulated with drifting
gratings.
2.2.3 Responses to Contrast-Modulated Gratings
A contrast-modulated grating is a standing sine grating whose intensity is modulated sinusoidally over time. Simple cell responses to drifting and contrast-modulated gratings are quite similar to rectified sinusoids (see e.g. Figure 2.11A).
This is obviously consistent with the linear model: the spatiotemporal linear
weighting function responds with a sinusoid and the rectification hides everything
that is below threshold.
A number of researchers (e.g. Maffei and Fiorentini, 1973; Movshon et al.,
1978; Kulikowski and Bishop, 1981b; Tolhurst and Dean, 1991; Reid et al. 1987,
1991) measured the responses of simple cells while varying the spatial phase of
contrast-modulated gratings. Since these responses can be reasonably fit by a
sinusoid, they can be described by just two numbers, the amplitude and phase of
the sinusoid. A useful way to display both response amplitude and response phase
at the same time is given by a polar plot like the one in Figure 2.8. Every point in
the polar plot corresponds to a sinusoid, whose amplitude is given by the distance
Figure 2.8: Linearity of spatiotemporal summation in a cat V1 simple
cell. The polar plot shows the cell's responses to standing gratings whose
contrasts were modulated sinusoidally in time at different spatial phases.
The amplitude of the first harmonic sinusoid of each response is represented
radially, while the angular coordinate indicates the temporal phase of each
response. The filled symbols represent the unaltered data from the experiment; the open symbols and the ellipse fitted to them represent the same
data corrected for a resting "firing rate" of -8 spikes/sec. Reprinted with
permission from (Movshon et al., 1978a).
from the origin, and whose phase is given by the angle with the horizontal axis.
The linear model predicts that as the spatial phase of the contrast-modulated
grating varies between 0 and 180 degrees, the responses should describe a "wasp-waisted"
ellipse in the polar plot (Movshon et al., 1978a). In particular, for a linear cell a
polar plot of the responses would be elliptical in shape. Over-rectification distorts
the ellipse, producing a wasp-waist: if the neuron has to reach a certain level of
excitation before any activity is seen, there will be a disproportionate decrease
in small responses (Albrecht and Geisler, 1991; DeAngelis et al., 1993b; Heeger,
1993; Tolhurst and Heeger, 1996a). The physiological results are in line with
this prediction. Figure 2.8, for example, shows the responses of a simple cell to
a contrast-modulated grating positioned at 8 different phases over the receptive
field, spanning the range from 0 to 180 degrees. The raw data (filled symbols) describe
a wasp-waisted ellipse since the amplitudes near the minor axes are smaller than
they should be to fit an ellipse. When the distortion introduced by rectification is
removed (in this case assuming a resting "firing rate" of minus 8 spikes per second),
the data fall on an ellipse (open symbols).
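The disproportionate decrease in small responses produced by over-rectification can be checked numerically. The sketch below computes the first harmonic amplitude of a rectified sinusoid for two input amplitudes; the amplitudes (10 and 20 mV) and the threshold (5 mV) are arbitrary illustrative values, and the unit gain of the rectifier is an assumption.

```python
import math

def first_harmonic_amplitude(amplitude, threshold, n=3600):
    """First harmonic amplitude of a sinusoidal input after rectification.

    The input is amplitude * sin(theta); the output is zero below
    `threshold` and grows linearly above it (unit gain assumed). The
    first harmonic is obtained by projecting the output onto sin(theta);
    the cosine component vanishes by symmetry.
    """
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * k / n
        rate = max(0.0, amplitude * math.sin(theta) - threshold)
        total += rate * math.sin(theta)
    return 2 * total / n

# Half-rectification (threshold at rest) preserves amplitude ratios.
ratio_half = first_harmonic_amplitude(20.0, 0.0) / first_harmonic_amplitude(10.0, 0.0)
# Over-rectification shrinks the smaller response disproportionately.
ratio_over = first_harmonic_amplitude(20.0, 5.0) / first_harmonic_amplitude(10.0, 5.0)
```

With half-rectification, doubling the input doubles the first harmonic, so the shape of the polar plot is preserved. With a threshold above rest the ratio exceeds 2: the small responses near the minor axis of the ellipse are reduced more than the large ones, which produces the wasp-waist.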
Because of superposition, the responses of a linear cell to contrast-modulated
gratings would be easily predictable from the responses to drifting gratings, and
vice versa. Several researchers tested whether this was the case for simple cells. Ferster and collaborators performed intracellular in vivo recordings and found that the
membrane potential responses were consistent with the output of a spatiotemporal
linear weighting function (Jagadeesh et al., 1993). Other researchers performed extracellular recordings (Reid et al., 1987; Albrecht and Geisler, 1991; Tolhurst and
Dean, 1991; Reid et al., 1991). These studies are generally consistent with the
linear model in that a cell's preferred direction of motion for drifting gratings can
be correctly predicted from its responses to contrast-modulated gratings. They
however uncovered two nonlinearities in simple cell responses. First, the linear
prediction from contrast-modulated grating responses underestimates the degree
of directional selectivity observed with drifting gratings. Second, the linear prediction overestimates the responses to gratings drifting in the nonpreferred direction.
Albrecht and Geisler (1991) and Heeger (1991, 1993) showed that the first phenomenon can be explained by the rectification stage (which acts as an expansive
nonlinearity), so it is consistent with the linear model of simple cells. The second
phenomenon instead is not consistent with the linear model, but it can in most
cases be explained by a gain control mechanism like that postulated by the normalization model. The normalization model predicts that the cells are less responsive
in the presence of drifting gratings than in the presence of contrast-modulated
gratings of equal contrast. This difference in gain can be shown to yield the observed discrepancy in the predicted and actual responses to gratings drifting in the
nonpreferred direction (Heeger, 1993; Tolhurst and Heeger, 1996a).
2.2.4 Responses to Compound Stimuli
According to the linear model of simple cells, knowing a cell's responses to gratings
would make it possible to predict the responses to any visual stimulus. This is
because any visual stimulus can be expressed as the sum of many different gratings.
An important test of this prediction was performed by DeValois et al. (1979).
They measured the orientation tuning of cat simple cells using individual gratings
as well as checkerboard stimuli. The motivation for their use of checkerboards is
very interesting. At the end of the 1970s the issue of whether simple cells were
better modeled as linear weighting functions or as all-or-none edge detectors was
the object of heated debate (see e.g. Marr, 1982, Maffei and Fiorentini, 1973 and
Schumer and Movshon, 1984). DeValois et al. (1979) reasoned that the two models
made very different predictions of a cell's response to checkerboards. In checkerboards the strongest sine grating components are oriented along the diagonals,
whereas the sharp edges are oriented along the rows and columns. According to
the linear model the cells will respond best when one of the diagonals is oriented in
the cell's preferred orientation for gratings. According to the edge-detector model,
on the other hand, a cell will respond best when either the rows or the columns
are oriented in the cell's preferred orientation for gratings. The results of DeValois
et al. (1979) were consistent with the linear model, and falsified the edge-detector
model. The responses of the cells could be predicted by having knowledge of the
location and orientation of the main sine gratings that compose the checkerboard.
The precise location and orientation of the sharp edges was not relevant in predicting the cells' responses.
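The claim that the strongest sine grating components of a checkerboard lie along the diagonals can be verified with a small discrete Fourier transform. This sketch assumes a 16 x 16 pixel image with 4-pixel checks and values of +1 and -1; these sizes are arbitrary choices for the illustration, not the stimuli used by DeValois et al. (1979).

```python
import cmath

N = 16        # image size in pixels (assumed)
CHECK = 4     # check size in pixels (assumed)

def checkerboard(x, y):
    """+1 / -1 checkerboard with edges along the rows and columns."""
    return 1.0 if ((x // CHECK) + (y // CHECK)) % 2 == 0 else -1.0

def dft_magnitude(kx, ky):
    """Magnitude of the 2-D discrete Fourier component at (kx, ky)."""
    total = 0j
    for x in range(N):
        for y in range(N):
            phase = -2j * cmath.pi * (kx * x + ky * y) / N
            total += checkerboard(x, y) * cmath.exp(phase)
    return abs(total)

# Search the non-negative frequencies up to the Nyquist limit. The mirror
# component on the other diagonal, at (kx, -ky), is left out of the search.
spectrum = {(kx, ky): dft_magnitude(kx, ky)
            for kx in range(N // 2 + 1) for ky in range(N // 2 + 1)}
strongest = max(spectrum, key=spectrum.get)
```

The strongest component found is at equal horizontal and vertical frequency, that is, a grating oriented along a diagonal, while the components along the rows and columns (`spectrum[(2, 0)]` and `spectrum[(0, 2)]`) vanish up to rounding error.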
A similar approach was followed by other researchers (e.g. Gizzi et al., 1990,
Pollen et al., 1988, DeValois and Tootell, 1983; Pollen et al., 1982; Maffei et al.
1979), who tested linearity by comparing responses to single sine gratings with
responses to sums of sine gratings of different spatial frequencies or orientations.
All of these results are qualitatively explained by the linear model (Heeger, 1992a).
The quantitative predictions of the linear model, however, are not always correct, and once again the discrepancy points to the existence of a gain control
mechanism. For example, Gizzi et al. (1990) found that simple cell responses to
plaids composed of two sine gratings with different orientations were on average
only 2/3 of the linear predictions based on the responses to the individual gratings. We will see in Chapter 3 that the normalization model explains this behavior
because it predicts that the gain of the cells is lower in the presence of plaids than
in the presence of single gratings.
2.3 Biophysics of the Linear Model
We have been considering the linear model as little more than a mathematical
abstraction. This Section describes how the model might be implemented physiologically.
2.3.1 Linearity of the LGN
Unless some complicated linearization mechanism is invoked, simple cells can only
be as linear as the inputs they get. Since the input to the visual cortex is constituted by the activity of LGN cells, we must begin our task of modeling simple cell
linearity by assuming that the responses of LGN neurons are linear functions of
the stimulus intensity distribution.
This assumption is a better approximation in the monkey than in the cat. In
the cat there are no LGN cells that are perfectly linear: the X cells are spatially
and temporally linear, but they have a (retinal) contrast gain control mechanism,
which violates linearity (Enroth-Cugell and Robson, 1966; Enroth-Cugell et al.,
1983; Victor, 1987). The Y cells are extremely nonlinear (Hochstein and Shapley,
1976; Troy, 1983; Victor, 1988), but may not contribute any input to the primary
visual cortex (Ferster, 1990b,a).
In the monkey, on the other hand, there is a geniculocortical channel, the
P pathway, which is substantially linear. The other channel, the M pathway,
is instead quite nonlinear, and its nonlinearity might be due to a gain-control
mechanism. The substantial linearity of the P pathway and the nonlinearity of the
M pathway have been observed in the responses of retinal ganglion cells (Benardete
et al., 1992; Lee et al., 1994; Benardete and Kaplan, 1995), and are reflected in
the properties of LGN cells (Derrington and Lennie, 1984; Sherman et al., 1984;
Carandini et al., 1993a; Movshon et al., 1994). Even though P cells constitute
around 90% of the monkey LGN (Dreher et al., 1976), many simple cells also receive
M inputs (Malpeli et al., 1981). Indeed, while the two streams are segregated in
layer 4C (Hubel and Wiesel, 1972; Hendrickson et al., 1978; Blasdel and Lund,
1983), they are not segregated at all in the upper layers (Lachica et al., 1992;
Yoshioka et al., 1994; Nealey and Maunsell, 1994). In particular, for those neurons
that do receive M input, the first 7-10 ms of activation are due exclusively to the
M signal (Maunsell and Gibson, 1992).
Besides the fact that monkey simple cells may receive some M input, there
is another phenomenon that makes our assumption of linearity of the geniculate
input an imperfect approximation: at high contrasts the responses of LGN cells
show evidence of rectification. Resting firing rates in the LGN have been reported
to be around 18 sp/s in monkey, and respectively around 6 sp/s and 16 sp/s in cat
X and Y LGN cells (Kaplan et al., 1987). An analysis of the 136 monkey LGN cells
recorded by Sherman et al. (1984) reveals that their average resting firing rate was
only around 7 spikes/s. On average the modulation in spike rate due to a 50%
contrast grating was twice as large as the resting firing rate for P cells (median:
2.1), and three times as large for M cells (median: 3.3). Rectification in the input
will be ignored in the following but should be considered in more detailed models
of the visual cortex.
2.3.2 Building Simple Cell Receptive Fields
In their 1962 paper, Hubel and Wiesel hypothesized that the receptive fields of
simple cells are the result of an orderly arrangement of LGN inputs. Geniculate
cells have center-surround receptive fields. When stimulated with a spot stimulus
in their center they respond either to the onset (ON-center cells) or to the offset
(OFF-center cells) of the stimulus. According to the scheme proposed by Hubel
and Wiesel an ON subregion of a simple cell receptive field would result from the
sum of an aligned series of LGN ON-center inputs. Similarly, an OFF subregion
would result from the sum of an aligned series of OFF-center inputs. This arrangement has been recently confirmed by Reid and Alonso (1995), who recorded
simultaneously from simple cells and from LGN cells.
If some of the LGN inputs reach a simple cell before some others, the resulting
receptive field can also display direction selectivity. In the cat LGN, for instance,
there is evidence for the existence of two classes of cells, "lagged" and "nonlagged",
whose responses have different latencies (Mastronarde, 1987; Saul and Humphrey,
1990, 1992). Figure 2.3 illustrates how the outputs of these two classes of cells
could be summed to yield a direction selective simple cell. The weighting functions in Panels A and B were drawn to resemble respectively those of idealized
lagged and nonlagged LGN cells. These functions are space-time separable and
have the same center-surround spatial structure, but slightly different spatial positions. They have different temporal structures, which result in different response
latencies. Panel C depicts the weighting function obtained by summing the two
LGN weighting functions. This weighting function is oriented in space-time, so it
is direction selective. It is similar to that of a direction selective V1 simple cell.
In practice, V1 simple cell weighting functions would be the result of many, not
just two, LGN inputs. In general, a nonseparable (direction selective) weighting
function can be obtained by simple addition of separable (not direction selective)
weighting functions (Fahle and Poggio, 1981; Adelson and Bergen, 1985; Watson
and Ahumada, 1985), as long as the latter dier in their temporal structure.
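The claim that adding separable weighting functions can yield a nonseparable one can be checked numerically. The sketch below is only an illustration: the spatial and temporal profiles, their widths, positions and time constants are hypothetical choices, not the functions drawn in Figure 2.3. A space-time weighting function is separable exactly when, sampled as a matrix, it has rank one, so the fraction of energy carried by the first singular value is used here as a separability measure.

```python
import numpy as np

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def spatial_profile(x, center):
    """Center-surround profile as a difference of Gaussians (hypothetical widths)."""
    return gaussian(x, center, 0.3) - 0.5 * gaussian(x, center, 0.6)

def temporal_profile(t, tau):
    """Simple temporal kernel (hypothetical); a larger tau gives a later peak."""
    return (t / tau) * np.exp(-t / tau)

def rank_one_fraction(w):
    """Fraction of energy in the first singular component (1.0 means separable)."""
    s = np.linalg.svd(w, compute_uv=False)
    return s[0] ** 2 / np.sum(s ** 2)

x = np.linspace(-2.0, 2.0, 201)   # space (deg)
t = np.linspace(0.0, 0.3, 151)    # time (s)

# Each input is separable: an outer product of a temporal and a spatial profile.
w_nonlagged = np.outer(temporal_profile(t, 0.02), spatial_profile(x, -0.25))
w_lagged = np.outer(temporal_profile(t, 0.04), spatial_profile(x, +0.25))

# The sum differs in both spatial position and time course, so it is not rank one.
w_sum = w_nonlagged + w_lagged

for name, w in [("nonlagged", w_nonlagged), ("lagged", w_lagged), ("sum", w_sum)]:
    print(f"{name:10s} rank-one fraction = {rank_one_fraction(w):.3f}")
```

With these assumed profiles the two inputs are separable to numerical precision, while their sum is not. If the two inputs shared the same time course, the sum would remain separable, which is the condition stated in the text.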
2.3.3 Push-Pull Arrangement of Inputs
The linear combination of LGN inputs involves both sums and subtractions. Indeed, besides excitatory responses, simple cell receptive fields also exhibit inhibitory responses, elicited by the onset of a light on an OFF region, or the offset
of a light on an ON region. Hubel and Wiesel (1962) pointed out that these inhibitory responses could originate either from the withdrawal of excitation or from
actual inhibition.
There is now evidence that both mechanisms are at work. Inhibitory postsynaptic potentials (IPSPs) do appear in intracellular recordings of simple cells
(Creutzfeldt and Ito, 1968), and their interaction with the excitatory postsynaptic
potentials (EPSPs) is subtractive (Berman et al., 1991; Ferster and Jagadeesh,
1992). Ferster (1986) measured the selectivity of the IPSPs and found it to be
identical to that of the EPSPs. Moreover, he found that in ON regions a light increase results in EPSPs and a light decrease results in IPSPs, while in OFF regions
a light decrease results in EPSPs and a light increase results in IPSPs (Ferster,
1988). In other words, EPSPs and IPSPs are spatially overlapping. The inhibitory
responses result from both withdrawal of excitation and actual inhibition, just
as the excitatory responses result from both withdrawal of inhibition and actual
excitation.
It is thus plausible that the inputs to a simple cell are arranged in push-pull,
i.e. they come from pairs of cells with opposite-signed receptive fields, one of which
provides excitation and the other inhibition. For example, an ON subregion would
be the result of excitatory ON-center inputs as well as of inhibitory OFF-center
inputs.
This complementary arrangement of excitation and inhibition is also consistent
with extracellular recording studies in cat V1 (Glezer et al., 1980; Heggelund,
1981; Palmer and Davis, 1981; Glezer et al., 1982; Heggelund, 1986; Tolhurst and
Dean, 1987, 1990). A similar push-pull arrangement might also be used by ganglion
cells to integrate bipolar signals (Gaudiano, 1992).
For reasons of simplicity we assume that both the excitation and the inhibition
are contributed by feed-forward connections. In this we differ from a number
of recent models that consider intracortical feedback crucial in sharpening the
selectivity conferred by the inputs from the lateral geniculate nucleus (Ben-Yishai
et al., 1995; Somers et al., 1995; Suarez et al., 1995). While the feed-forward view
is supported by recent evidence (Reid and Alonso, 1995; Ferster et al., 1996), the
linear model should not necessarily be identified with a feed-forward arrangement
of inputs. A linear receptive field could, in principle, be constructed with pure feed-forward connections, pure feedback connections, or a combination of feed-forward
and feedback.
2.3.4 Linearity of Excitation and Inhibition
We now make the linking assumption that simple cell synaptic conductances depend
linearly on the responses of LGN neurons. This assumption is quite realistic as
far as excitatory conductances are concerned, because there is evidence for direct
excitatory inputs from the LGN to simple cells (Ferster and Lindstrom, 1983), and
direct synaptic transmission can be well approximated by a linear transformation
of the presynaptic firing rate into the postsynaptic conductance [footnote 2].

Footnote 2: Indeed, it is widely held that in the absence of synaptic plasticity and for reasonable
presynaptic firing rates, at a given synapse each presynaptic spike results in a stereotyped postsynaptic
conductance increase (Jack et al., 1975; Koch and Poggio, 1987). Synaptic transmission can then
be considered to be a linear transformation whose impulse response is given by the shape of an
isolated postsynaptic conductance increase.

There is, however, conflicting anatomical evidence of direct geniculocortical inhibition (Garey and Powell, 1971; Einstein et al., 1987), and none of the physiological evidence supports its existence (Watanabe et al., 1966; Toyama and
Takeda, 1974; Toyama et al., 1974, 1977a,b; Ferster and Lindstrom, 1983; Tanaka,
1983; Reid and Alonso, 1995). Since most inhibitory inputs from the LGN to simple cells are disynaptic (Ferster and Lindstrom, 1983), the linearity of inhibition
would seem to require an inhibitory cortical interneuron that performs a linear
integration of LGN inputs and encodes them linearly into firing rate.
2.3.5 Simplified Model of a Cortical Cell
We adopt a very simplified model of a cortical cell (Figure 2.9): a single-compartment circuit with only passive conductances. In particular we consider a leak
conductance $g_{leak}$ and two synaptic conductances, one excitatory ($g_e$) and one inhibitory ($g_i$). The membrane potential of a model cell then obeys
\[ -C \frac{dV}{dt} = g_e (V - V_e) + g_i (V - V_i) + g_{leak} (V - V_{leak}), \tag{2.2} \]
where $C$ is the membrane capacitance, and $V_{leak}$, $V_e$ and $V_i$ are respectively the
equilibrium potentials of the leak, excitatory and inhibitory channels.

[Figure 2.9 schematic not reproduced. Labels: Retinal image; V, C, V_leak, g_leak, V_e, g_e, V_i, g_i; Firing rate.]
Figure 2.9: Simplified model of a cortical cell, and possible biophysical
implementation of the linear model. The cell membrane is modeled as a single compartment with passive properties and two classes of synaptic inputs,
excitatory and inhibitory. In the central excitatory subregion of the receptive field the excitation is provided by ON-center cells and the inhibition by
OFF-center cells with superimposed receptive fields. The flanking inhibitory
subregions are obtained by the opposite arrangement of excitation and inhibition (not shown). This push-pull arrangement of excitation and inhibition
ensures the linearity of the membrane potential V. The membrane potential
is encoded into firing rate by a rectifier. See text for explanation of symbols.

[Figure 2.10 plots not reproduced. Panel A: Membrane potential (mV) vs. Time (sec). Panel B: Firing rate (Spikes/sec, left axis) and Membrane potential (mV, right axis) vs. Frequency (Hz). Cell 005s01u1.p9.]
Figure 2.10: Encoding of input current in a visual cortical cell in vitro.
Responses of an intracellularly recorded regular spiking neuron in a slice of
guinea pig cortex to sinusoidal current injection (0.8 nA). A: Time course
of the responses for five different frequencies of stimulation (1, 2, 4, 8 and
16 Hz). The traces are dominated by their first harmonic. This means that
apart from the presence of the spikes the membrane is acting linearly. B:
Temporal frequency tuning of spike rate and membrane potential. Filled
symbols: Amplitude of the first harmonic of the membrane potential obtained by fitting a sinusoid to the raw membrane potential traces in A.
Scale is on right axis. The dashed line is the prediction of a single compartment with only passive conductances. Open symbols: Amplitude of the
first harmonic of the firing rate, obtained by fitting a sinusoid to the spike
times.

This view of the cellular physiology deliberately ignores many known aspects of
neuronal biophysics, such as voltage- and calcium-dependent channels, the possible
nonlinear interactions between inputs caused by the dendritic structure, and the
possible effects of electrotonic distance from the soma (Koch and Segev, 1989). Our
model of a cortical cell is, however, in many respects a reasonable approximation.
For example, Figure 2.10A shows the responses of an intracellularly recorded cortical neuron to sinusoidal current injection at different temporal frequencies. The
membrane potential responses are dominated by their sinusoidal (first harmonic)
component. This means that if the generation of spikes is ignored, the membrane
of cortical neurons can be reasonably modeled by passive conductances, which endow it with a linear behavior. In particular, when the first harmonic responses
are plotted against the temporal frequency of the stimulus (filled circles in Figure 2.10B), they are well fit by the predictions of a single-compartment model of
the cell (dashed curve in Figure 2.10B).
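The kind of prediction a passive single compartment makes for sinusoidal current injection can be sketched as follows. For an injected current of amplitude I and a fixed total conductance g, the membrane equation predicts a sinusoidal voltage of amplitude I / sqrt(g^2 + (2 pi f C)^2), i.e. a low-pass characteristic. Only the 0.8 nA amplitude comes from the caption of Figure 2.10; the conductance and capacitance below are hypothetical values chosen to give a 10 ms time constant, not parameters fitted to the recorded cell.

```python
import numpy as np

def passive_amplitude(freq_hz, i_amp, g_total, c_m):
    """Amplitude (V) of the sinusoidal membrane potential response of a passive
    single compartment driven by a sinusoidal current of amplitude i_amp (A)."""
    omega = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    return i_amp / np.sqrt(g_total ** 2 + (omega * c_m) ** 2)

I_AMP = 0.8e-9    # A, injected amplitude quoted for Figure 2.10
G_TOTAL = 40e-9   # S, hypothetical total conductance
C_M = 0.4e-9      # F, hypothetical capacitance (time constant C/g = 10 ms)

freqs = np.array([1, 2, 4, 8, 16, 32, 64], dtype=float)
amps_mv = 1e3 * passive_amplitude(freqs, I_AMP, G_TOTAL, C_M)

# Frequency at which the amplitude has dropped by a factor sqrt(2).
corner_hz = G_TOTAL / (2.0 * np.pi * C_M)

for f, a in zip(freqs, amps_mv):
    print(f"{f:5.0f} Hz -> {a:6.2f} mV")
print(f"corner frequency = {corner_hz:.1f} Hz")
```

The amplitude is nearly flat at low frequencies and falls off above the corner frequency g / (2 pi C), which is the qualitative shape expected of a curve like the dashed one in Figure 2.10B.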
2.3.6 Linear Integration of the Synaptic Inputs
The push-pull arrangement of the LGN inputs to a simple cell can lead (through a
balance of excitation and inhibition) to a perfectly linear integration of the synaptic
conductances by the cell membrane (Carandini and Heeger, 1994).
For the sake of simplicity, consider the steady-state behavior of the membrane
($dV/dt = 0$). At steady state Equation 2.2 can be rewritten as
\[ V = \frac{g_e V_e + g_i V_i + g_{leak} V_{leak}}{g_e + g_i + g_{leak}}. \tag{2.3} \]
The push-pull arrangement (Section 2.3.3) guarantees that every increase in
excitation will correspond to a decrease in inhibition, and vice-versa. In particular,
we assume that $g_e$ and $g_i$ are balanced so that the total conductance of the cell is
constant:
\[ g_e(t) + g_i(t) + g_{leak} = g_0. \tag{2.4} \]
Equation 2.3 can then be rewritten as $V = [g_e V_e + g_i V_i + g_{leak} V_{leak}] / g_0$, which
is a linear function of $g_e$ and $g_i$. In words, the membrane potential V is a linear
function of the synaptic conductances.
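The effect of the balance condition can be illustrated numerically. The sketch below evaluates the steady-state potential of Equation 2.3 twice: once with the inhibitory conductance adjusted so that the total conductance stays constant (Equation 2.4), and once with the inhibitory conductance held fixed. All reversal potentials and conductance values are hypothetical and serve only to make the contrast visible.

```python
import numpy as np

# Hypothetical parameters (mV and nS), chosen for illustration only.
V_E, V_I, V_LEAK = 0.0, -80.0, -70.0
G_LEAK, G_0 = 10.0, 50.0

def steady_state_v(g_e, g_i):
    """Steady-state membrane potential of Equation 2.3."""
    return (g_e * V_E + g_i * V_I + G_LEAK * V_LEAK) / (g_e + g_i + G_LEAK)

g_e = np.linspace(0.0, 40.0, 9)

# Push-pull: inhibition decreases as excitation increases (Equation 2.4).
g_i_balanced = G_0 - G_LEAK - g_e
v_balanced = steady_state_v(g_e, g_i_balanced)

# No balance: inhibition stays fixed while excitation grows.
v_unbalanced = steady_state_v(g_e, np.full_like(g_e, 20.0))

# Second differences vanish for a linear function of g_e.
print("balanced   second differences:", np.round(np.diff(v_balanced, 2), 6))
print("unbalanced second differences:", np.round(np.diff(v_unbalanced, 2), 3))
```

With the balance in place the second differences are zero, so V is a straight line in the excitatory conductance. Without it the denominator of Equation 2.3 changes with the input and the potential grows sublinearly.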
If our model of the cell membrane as a single compartment is a good approximation, the exact balance of excitation and inhibition expressed in Equation 2.4
is an essential condition for the linear integration of the synaptic conductances.
If, on the other hand, there is substantial electrotonic distance between synaptic
sites on the membrane, there are other conditions in which linear integration of
the synaptic inputs is possible. For example, Blomfield (1974) showed that if
inhibition is located on the soma, and excitation is electrotonically remote from
it, there is a range of synaptic activations in which the membrane potential will
be approximately a linear combination of the excitatory and inhibitory synaptic
conductances. This approach would not require the strict balance of excitation
and inhibition (Equation 2.4), but it would require additional assumptions about
the dendritic structure of the cell, the sites of the inputs, and the range of the
synaptic conductances.
2.3.7 Spike Rate Encoding
If simple cells integrate their synaptic inputs linearly, if those inputs depend linearly on LGN activity, and if LGN activity is a linear function of the stimulus
intensity distribution, then simple cells will integrate the stimulus intensity distribution linearly. This Section discusses the final, nonlinear stage of the model,
which is responsible for the encoding of the input-driven membrane potential responses into spike trains.
Many characteristics of firing rate encoding are consistent with the view that
the firing rate responses are a rectified copy of the membrane potential responses.
An example of this can be seen in Figure 2.10A. The spike responses closely mirror
the membrane potential responses, and there is a clear threshold below which
no spikes are generated. Once above threshold, the firing rate grows with the
amplitude of the membrane potential modulation. There is in fact a large literature
pointing to a linear or bilinear relation between injected current and firing rate,
once the current is above a threshold level (see Stafstrom et al., 1984, and references
therein).
There is however an additional experimental result that is not consistent with
the view of firing rate encoding as rectification: the spike rate encoder has notable
dynamic properties (Chapter 4). For example, cortical neurons typically exhibit
spike frequency adaptation, meaning that the firing rate response to steady depolarization decreases with time (Stafstrom et al., 1984c). Dynamic properties of
spike encoding are also evident in Figure 2.10. Figure 2.10B plots the temporal
frequency tuning of the first harmonic of the spike train taken from records like
the ones in Panel A. It is clear that spike rate encoding is not at all independent of
the temporal frequency of the stimulus, as would be the case for rectification. The
middle temporal frequencies are transmitted much better than the low temporal
frequencies, and the very high temporal frequencies are completely cut off.
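The statement that pure rectification would make spike rate encoding independent of temporal frequency can be verified with a short calculation. The sketch below half-wave rectifies a sinusoidal "membrane potential" of fixed amplitude at several frequencies and measures the first harmonic of the result. The amplitude of 10 and the zero threshold are arbitrary illustrative choices.

```python
import numpy as np

def rectify(v, threshold=0.0):
    """Static half-wave rectifier: depends only on the instantaneous input."""
    return np.maximum(v - threshold, 0.0)

def first_harmonic_amplitude(signal, t, freq_hz):
    """Amplitude of the component at freq_hz; t must span whole cycles."""
    c = 2.0 * np.mean(signal * np.cos(2.0 * np.pi * freq_hz * t))
    s = 2.0 * np.mean(signal * np.sin(2.0 * np.pi * freq_hz * t))
    return np.hypot(c, s)

# One second sampled without the endpoint, so integer frequencies fit whole cycles.
t = np.linspace(0.0, 1.0, 10000, endpoint=False)
freqs = [1, 2, 4, 8, 16]

amps = np.array([
    first_harmonic_amplitude(rectify(10.0 * np.sin(2.0 * np.pi * f * t)), t, f)
    for f in freqs
])

for f, a in zip(freqs, amps):
    print(f"{f:3d} Hz -> first harmonic amplitude {a:.3f}")
```

The first harmonic comes out at half the input amplitude for every frequency, so a static rectifier has a flat temporal frequency tuning. Any frequency dependence such as that of the open symbols in Figure 2.10B must therefore come from a dynamic stage.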
Rectification cannot account for these phenomena because it is a static nonlinearity, i.e., it depends only on the instantaneous value of its argument. Strictly
speaking, then, rectification is incorrect, because it would predict that the spike
encoding properties would not depend on the past history of stimulation. Hence,
we are forced to adopt a slightly more complicated model of spike rate encoding.
In particular, we model the spike rate encoder as a band-pass filter followed by
rectification [footnote 3].

Footnote 3: An even better model for the spike rate encoder of visual cortical cells is given by the
opposite arrangement, in which rectification is followed by a band-pass linear filter (Chapter 4).
This arrangement however would make the model very hard to deal with analytically. The two
arrangements give identical first harmonic responses to sinusoidal stimulation at a fixed temporal
frequency.

The behavior of this spike encoder model is actually quite simple when incorporated into the linear model of simple cells. In the full model, the spike rate
encoder comes after a (synaptic) linear spatiotemporal weighting function. Since a
chain of linear systems is itself a linear system, we can treat the band-pass (spike
encoder) linear filter and the (synaptic) spatiotemporal linear weighting function
as a single linear system. The weighting function of this final system is partly due
to the synaptic inputs and partly to the band-pass properties of the firing rate
encoder.
2.4 Some Nonlinear Properties of Simple Cells
Having described the linear model of simple cells, and having discussed its numerous successes and its possible biophysical implementation, it is now time to
discuss its failures. We have already encountered a number of occasions in which
the linear model fails to yield precise quantitative predictions. Indeed, the behavior of simple cells is in many ways nonlinear. This Section describes some of these
nonlinearities, which will be discussed more quantitatively in Chapter 3, where I
introduce the normalization model.
2.4.1 Contrast Responses
Presented with a change in contrast, a linear neuron would scale its response by the
same amount. The responses of a simple cell, instead, are often not proportional
to stimulus contrast. An example of this is illustrated in the central column of
Figure 2.11A. The central column shows the spike histograms of a simple cell in
response to a drifting grating. The different rows correspond to different contrasts.
As the contrast goes from 50% to 100%, the response does not double. Instead it
grows very little. This phenomenon is known as response saturation (Maffei and
Fiorentini, 1973; Dean, 1981; Albrecht and Hamilton, 1982; Ohzawa et al., 1982;
Sclar et al., 1990). Cells can even exhibit "supersaturation", in which increasing the
contrast of the stimulus reduces the amplitude of the responses (Li and Creutzfeldt,
1984; Bonds, 1991).

[Figure 2.11 plots not reproduced. Panel A: Firing rate (Spikes/sec) vs. Time (ms), for spatial frequencies of 1.0, 0.6 and 0.4 cpd and contrasts of 25, 50 and 100%. Panel B: Response Amplitude (Spikes/sec) vs. Contrast (%). Cell 392l016.p15.]
Figure 2.11: Responses of a monkey V1 simple cell to a drifting sine grating for three different spatial frequencies and three different contrasts. The
curves are fits of the normalization model. The fits were performed on a
larger data set, which included the responses to 45 different drifting gratings, with 5 different contrasts, 3 different spatial frequencies, and 3
different temporal frequencies. These stimuli were randomly interleaved to
minimize the effect of visual adaptation. A: Spike histograms of one period of the responses, averaged over many presentations. The three columns
show the responses to drifting gratings with a spatial frequency of 1, 0.6
and 0.4 cycles/degree (cpd). Each row corresponds to one of three different
contrasts: 25, 50 and 100%. B: Amplitude of the responses as a function
of contrast. The ordinate plots the amplitude of the first harmonic of the
same responses as Panel A. Error bars indicate the standard error of the
mean (N=3). The number next to each curve specifies the spatial frequency
of the stimulus. Experimental methods for this and for the following figures
are outlined in Chapter 3.
Response saturation is not due to the high firing rates. This can also be seen
in Figure 2.11. The three columns in Panel A show the responses of a cell to
three gratings of different spatial frequency. Even though the left column and the
right column stimuli elicit fewer spikes than the central column stimulus, there
clearly is response saturation. This phenomenon thus depends on the contrast of
the stimulus per se, not on the amplitude of the responses it elicits in the cell.
This property of the contrast responses can be more precisely observed in Panel
B, which plots the amplitude of the responses shown in Panel A, as a function
of contrast. In spite of the amplitude saturation, the three contrast responses are
vertical shifts of each other. Since the vertical axis is logarithmic, a vertical shift
means that the ratio of the responses to any two different spatial frequencies is
constant, irrespective of the stimulus contrast.
Another way to express this property is to say that the shape of the spatial
frequency tuning curve is independent of the contrast at which it is measured.
Changing the contrast of the stimuli just scales the tuning curve. This has been
observed for both spatial frequency tuning and orientation tuning (Movshon et al.,
1978c; Albrecht and Hamilton, 1982; Sclar and Freeman, 1982; Li and Creutzfeldt,
1984; Skottun et al., 1987). An example of the contrast invariance of the orientation
tuning is shown in Figure 2.12.

[Figure 2.12 plots not reproduced. Panel A: Response Amplitude (Spikes/sec) vs. Contrast (%), for orientations of 40 and 80 deg. Panel B: Response Amplitude (Spikes/sec) vs. Orientation (deg), for contrasts of 3, 6, 12, 19, 50 and 100%. Cell 392l008.p04, 6.6 Hz, N=3.]
Figure 2.12: Responses of a monkey V1 simple cell to a drifting sine grating
for different stimulus orientations and contrasts. The continuous curves are
fits of the normalization model. The fits were performed on a larger data
set, which included the responses to additional temporal frequencies. Error
bars indicate the standard error of the mean (N=3). A: Contrast responses
for two different stimulus orientations. Changing the orientation of a grating
shifts the contrast responses up and down on a logarithmic scale. B: Effect
of contrast on the orientation tuning. Data for 40° and 80° are the same as
those in panel A. The orientation tuning is invariant with contrast.
The contrast independence of the tuning curve shapes would be easy to explain
if the contrast responses of simple cells were linear. The responses of a linear cell to
two stimuli S1, S2 with the same contrast c could be written as c L(S1) and c L(S2).
Their ratio would be L(S1)/L(S2), independent of the contrast c. We have seen,
however, that the contrast responses of real simple cells are nonlinear, since they
often saturate at high contrasts. The contrast independence of the tuning curves
is thus by no means a trivial property.
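The argument can be made concrete with a small numerical sketch. The first case is the linear cell of the text, whose responses c L(S1) and c L(S2) have a contrast-independent ratio. The second case replaces the factor c with a saturating function of contrast that multiplies the same linear responses. That function, its parameters and the values of L(S1) and L(S2) are assumptions made purely for illustration; they are not the model fitted in this Thesis.

```python
import numpy as np

contrasts = np.array([0.06, 0.12, 0.25, 0.5, 1.0])
L_S1, L_S2 = 40.0, 10.0   # hypothetical linear responses at unit contrast

def saturating(c, c50=0.2, n=2.0):
    """Hypothetical saturating contrast function (an assumption for illustration)."""
    return c ** n / (c50 ** n + c ** n)

# Linear cell: responses are c * L(S).
ratio_linear = (contrasts * L_S1) / (contrasts * L_S2)

# Saturating cell whose contrast dependence is the same for both stimuli.
sat_1 = saturating(contrasts) * L_S1
sat_2 = saturating(contrasts) * L_S2
ratio_sat = sat_1 / sat_2

# On a logarithmic axis a constant ratio is a constant vertical shift.
log_shift = np.log10(sat_1) - np.log10(sat_2)

print("linear ratio:    ", ratio_linear)
print("saturating ratio:", ratio_sat)
print("log10 shift:     ", log_shift)
print("growth from 50% to 100% contrast:", sat_1[-1] / sat_1[-2])
```

In this sketch the saturating responses grow by much less than a factor of two between 50% and 100% contrast, yet the ratio between the two stimuli stays fixed. This holds only because contrast enters as a common multiplicative factor, and that is exactly the nontrivial property the data exhibit.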
2.4.2 Nonspecific Suppression
The response to a preferred stimulus can be suppressed by superimposing an additional stimulus that would not elicit any response when presented alone. This
phenomenon is a violation of superposition, a defining property of linearity. We
call it nonspecific suppression, as it has been found to be independent of direction
of motion, largely independent of orientation and broadly tuned for spatial and
temporal frequency (Bishop et al., 1973; Dean et al., 1980; Burr et al., 1981; Hammond and MacKay, 1981; Morrone et al., 1982; De Valois and Tootell, 1983; Li
and Creutzfeldt, 1984; De Valois et al., 1985; Kaji and Kawabata, 1985; Gulyas
et al., 1987; Bonds, 1989; Nelson, 1991; DeAngelis et al., 1992; Geisler and Albrecht, 1992). After some debate, there is now consensus that cross-orientation
inhibition can be driven dichoptically (with one grating in each eye), although
monoptic suppression (with both gratings in the same eye) is typically stronger
(Ferster, 1981; Ohzawa and Freeman, 1986a,b; Freeman et al., 1987; DeAngelis
et al., 1992; Sengpiel and Blakemore, 1994; Sengpiel et al., 1995; Walker et al.,
1996).
[Figure 2.13 plots not reproduced. Panel A: Firing rate (Spikes/sec) vs. Time (ms), for test contrasts of 0, 6, 12, 25 and 50% and mask contrasts of 6, 25 and 50%. Panel B: Response Amplitude (Spikes/sec) vs. Test Contrast (%). Cell 392l024.p09.]
Figure 2.13: Responses of a monkey V1 simple cell to a plaid composed of
two drifting gratings ("test" and "mask"), for different contrasts of the two
gratings. The orientations of test and mask differ by 90°. The curves are
fits of the normalization model. A: Spike histograms of one period of the
responses, averaged over many presentations. Rows correspond to a fixed
test contrast, columns to a fixed mask contrast. The mask does not elicit
any overt response when presented alone (top row) and strongly inhibits
the responses to the test (second and third columns). B: Amplitude of the
responses as a function of contrast. The ordinate plots the amplitude of the
first harmonic of the same responses as Panel A. Error bars indicate the
standard error of the mean (N=3). The white, gray and black circles refer
respectively to mask contrasts of 6, 25 and 50%.
The origins of suppression are most likely cortical, as it is completely absent
in monkey P LGN and cat LGN (Bonds, 1989; Movshon et al., 1994). Moreover,
the temporal properties of nonspecific suppression are consistent with the view
that it originates from complex cells or from a large pool of simple cells. Indeed,
the suppression elicited by a drifting grating is not modulated in time (Morrone
et al., 1982; Bonds, 1989), and the suppression elicited by a contrast-modulated
grating modulates at twice the frequency of the stimulus (Morrone et al., 1982).
Figure 2.13 shows an example of nonspecific suppression. The stimulus was a
plaid made of two gratings. One (the "test") drifted in the cell's preferred direction
and evoked a large response when presented on its own. The other grating (the
"mask") drifted at right angles to the test grating, and was ineffective in driving
the cell. Its presence, however, clearly suppressed the responses. For example,
when the mask contrast was 50% the cell responded only when the test grating
had high contrast.
From Figure 2.13B one can see that the presence of the mask shifts the contrast
response to the right (Bonds, 1989). This corresponds to a scaling of contrast
(Heeger, 1992b). We will see in Chapter 3 that the contrast responses shift to the
right only when the cell is completely unresponsive to the mask. If the cell gives
even a minimal response to the mask, the effect is more complicated than just a
rightward shift.
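Why a purely suppressive mask amounts to a scaling of contrast can be sketched with a divisive form of contrast response. The formula below, in which the mask contrast enters only the denominator, is an assumed illustrative form and not an equation taken from this Chapter. The parameters `r_max`, `sigma` and `n` are likewise hypothetical.

```python
import numpy as np

def response(c_test, c_mask, r_max=100.0, sigma=0.1, n=2.0):
    """Assumed divisive contrast response; the mask only adds to the denominator."""
    return r_max * c_test ** n / (sigma ** n + c_test ** n + c_mask ** n)

SIGMA, N = 0.1, 2.0
c_test = np.array([0.06, 0.12, 0.25, 0.5, 1.0])
c_mask = 0.5

# With the mask on, the curve equals the no-mask curve evaluated at a scaled
# contrast. On a logarithmic contrast axis a scaling is a rightward shift.
scale = SIGMA / (SIGMA ** N + c_mask ** N) ** (1.0 / N)

with_mask = response(c_test, c_mask, sigma=SIGMA, n=N)
no_mask_scaled = response(c_test * scale, 0.0, sigma=SIGMA, n=N)

print("contrast scale factor:", round(scale, 4))
print("with mask:            ", np.round(with_mask, 3))
print("no mask, scaled input:", np.round(no_mask_scaled, 3))
```

Under this assumption the two rows printed at the end coincide, so the mask acts exactly like a reduction of the test contrast by a fixed factor. If the mask also contributed to the numerator, as it would for a cell that responds to it even weakly, this equivalence would no longer hold.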
2.4.3 Temporal Nonlinearities
Simple cells display prominent temporal nonlinearities. Figure 2.14A provides a
good example of this. As the stimulus contrast increases, the responses occur
earlier in time. This is called phase advance (Dean and Tolhurst, 1986; Carandini
and Heeger, 1994; Albrecht, 1995). It is a nonlinearity because for a linear cell
scaling the input would just scale the output, not change its timing. Phase advance
is not entirely cortical in origin, but has a strong cortical component. Phase
advance in cat LGN (measured at 2-5 Hz) is on average less than 20 ms (Carandini
et al., 1993a), whereas phase advance in cat V1 (measured at 2 Hz) is on average
around 47 ms (Dean and Tolhurst, 1986). In the monkey LGN, phase advance is
present in M cells, but it is completely absent in P cells (Sherman et al., 1984).
[Figure 2.14 plots not reproduced. Panel A: Firing Rate (Spikes/sec) vs. Time (ms), one panel per stimulus contrast (cell 382l021.p05, ori = 120, tf = 6.54). Panel B: Response Amplitude (Spikes/sec) vs. Temporal Frequency (Hz) and Contrast (%) (cell 382l021.p05, ori = 120).]
Figure 2.14: Responses of a monkey V1 simple cell to a drifting sine grating for different contrasts and temporal frequencies. The curves are fits
of the normalization model. The fits were performed on a larger data set,
which included the responses to an additional orientation. A: Spike histograms of one period of the responses, averaged over many presentations.
Each panel corresponds to a different contrast of the stimulus. Note the
prominent phase advance with increasing contrast. B: Amplitude of the
responses as a function of contrast and temporal frequency. The ordinate
plots the amplitude of the first harmonic responses such as those in Panel
A. Data points are joined by dashed lines. The temporal frequency tuning
is strongly dependent on contrast: at low contrasts the high frequencies are
more attenuated than at high contrasts.
Another temporal nonlinearity of simple cells was uncovered by Reid et al.
(1992). They measured the responses of cat simple cells to eight different stimuli
and to the compound stimulus obtained by summing the eight stimuli. They found
that the responses to the compound stimulus occur earlier in time than the linear
prediction obtained from the responses to the individual stimuli. This decrease
in "integration time" is quite prominent, in the range of 5-60 ms, and there is
evidence that its origin is cortical (Reid et al., 1992).
Finally, a third temporal nonlinearity of simple cell responses is given by
the contrast dependence of their temporal frequency tuning (Holub and Morton-Gibson, 1981). In particular, increasing stimulus contrast increases the cell's responsivity to the high temporal frequencies (Hawken et al., 1992). An example of
this is shown in Figure 2.14B. According to the linear model, increasing the contrast should just scale the responses, with no effect on their temporal frequency
tuning. The origins of this nonlinear behavior are partially subcortical, since it
was observed in the cat retina (Shapley and Victor, 1978) and in the monkey M
LGN (Benardete et al., 1992). There is however evidence that in the monkey this
behavior is much stronger in V1 than in the LGN. Preliminary results by M. J.
Hawken et al. (personal communication) indicate that on average the high-cutoff
frequency of V1 cells changes from around 10 Hz at 8-16% contrast to around 30
Hz at 64% contrast. By contrast, the average change in high-cutoff frequency of
LGN cells is negligible.
2.5 Conclusions
The linear model is quite successful in explaining the selectivity of simple cells
for a variety of stimulus attributes, such as shape, size, position, orientation, and
direction of motion.
The model however fails by a scale factor when it is used to relate responses
obtained with stimuli that differ substantially in contrast or energy. We have seen
this property when comparing the responses to drifting gratings with those to
flashing bars (Section 2.2.2), or with those to contrast-modulated gratings (Section 2.2.3), or with those to plaids (Sections 2.2.4 and 2.4.2). This nonlinearity
can be observed directly by changing the contrast of the stimuli (Section 2.4.1).
Given its numerous successes, however, it would not be wise to dispense with
the linear model altogether. The next Chapter shows how the gain control nonlinearity, as well as the temporal nonlinearities mentioned in Section 2.4.3, can
be explained by an extension of the linear model, the normalization model. This
model postulates a mechanism that decreases both the gain and the latency of the
responses when the contrast of a stimulus is increased, or when another stimulus
is superimposed.
Chapter 3
Gain Control
According to a view that has emerged in recent years, the nonlinearities of simple cells could
be explained by extending the linear model to include a gain-control stage (Albrecht
and Geisler, 1991; Heeger, 1991; DeAngelis et al., 1992). In particular, Heeger
(1991, 1992b) proposed a normalization model (Figure 1.1B, Page 2), in which the
linear response of every cell is divided by the same number ("normalized"). This
number grows with the activity of a large number of cortical cells, the normalization
pool. If the pool is sufficiently varied in its composition, the normalization signal
can be shown to grow with the local stimulus energy, which is the variance of the
intensity values of the stimulus, measured over local space and recent time and
over a band of spatial and temporal frequencies.
The normalization model attributes a cell's selectivity to the initial linear stage
and its nonlinear behavior to the normalization stage. For example, the model
predicts response saturation because increasing the contrast of a stimulus increases
its energy and thus increases the divisive suppression. Similarly, it predicts cross-orientation inhibition because adding an orthogonal grating increases the stimulus
energy.
In this Chapter we propose a biophysical implementation of the normalization
model, and we test it with large data sets obtained from the primary visual cortex
of the monkey. Indeed, while the model responses have been shown by computer
simulation to resemble those of real neurons (Heeger, 1992b, 1993; Carandini and
Heeger, 1994), a more decisive test of the model must investigate its ability to
quantitatively fit neural data.
Our implementation of the model is depicted in Figure 1.1C (Page 2). The cell
membrane is modeled as a simple RC circuit, composed of a resistor and a capacitor
in parallel. The linear stage injects synaptic current into the cell, which outputs a
membrane potential response. A rectification stage converts the latter into a firing
rate. Normalization operates by shunting inhibition: the cells in the normalization
pool inhibit each other by increasing each other's membrane conductance. The
conductance controls the gain of the transformation of input currents into output
membrane potentials.
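How a conductance change controls gain in an RC circuit can be sketched from the circuit's transfer function. For a sinusoidal input current at frequency f, a resistor of conductance g in parallel with a capacitor C gives a voltage response of 1 / (g + i 2 pi f C) per unit current. The capacitance, the test frequency and the three conductance values below are hypothetical numbers chosen only to illustrate the trend.

```python
import numpy as np

def rc_gain_and_phase(freq_hz, g, c_m):
    """Gain (V/A) and phase (rad) of a parallel RC membrane at one frequency."""
    h = 1.0 / (g + 1j * 2.0 * np.pi * freq_hz * c_m)
    return np.abs(h), np.angle(h)

C_M = 0.4e-9                                   # F, hypothetical capacitance
FREQ_HZ = 6.5                                  # Hz, hypothetical drift rate
conductances = np.array([20e-9, 40e-9, 80e-9]) # S, increasing shunting

gains, phases = rc_gain_and_phase(FREQ_HZ, conductances, C_M)

# A phase lag at a known frequency corresponds to a delay in time.
delays_ms = -phases / (2.0 * np.pi * FREQ_HZ) * 1e3

for g, gain, d in zip(conductances, gains, delays_ms):
    print(f"g = {g * 1e9:4.0f} nS: gain = {gain / 1e6:5.1f} mV/nA, delay = {d:5.2f} ms")
```

Raising the conductance lowers the gain and also shortens the time constant C/g, so the response lags the input less. In this sketch a single mechanism therefore produces both a weaker and an earlier response, which is the combination of effects that shunting inhibition is meant to provide.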
To test this model we recorded from simple cells in the primary visual cortex of paralyzed, anesthetized macaques, while presenting very large sets of visual
stimuli. These stimuli included drifting gratings, plaids composed of two drifting
gratings, and sums of a drifting grating and spatiotemporal white noise. The gratings assumed a wide variety of contrasts, temporal frequencies, spatial frequencies
and orientations. We derived closed-form equations for the model responses to
such stimuli, and we found that these equations provide good fits to the neural
responses.
3.1 Methods
Experiments were performed on 5 cynomolgus macaque monkeys (Macaca fascicularis) and 4 pigtail macaque monkeys (Macaca nemestrina) ranging in weight
from 1.5 to 4 kg.
3.1.1 Preparation and Maintenance
Animals were initially anesthetized with Ketamine HCL (10 mg/kg) and premedicated with Atropine sulfate (0.05 mg/kg) and Acepromazine maleate (0.1 mg/kg).
Anesthesia continued on 1.5–2.0% halothane in a 98% O2–2% CO2 mixture
while the initial surgery was performed. Indwelling catheters were introduced into
the saphenous veins of each hindlimb and a tracheotomy was performed.
The animal was then mounted in a stereotaxic instrument, and halothane anesthesia was replaced by a continuous infusion of sufentanil citrate (typically 4–6
µg/kg/hr, beginning with a loading dose of 4 µg/kg). EEG, EKG and arterial
blood pressure were monitored continuously, and any signs of arousal were corrected by modifying the rate of anesthetic infusion. The monkey was artificially
respirated with a mixture of O2, N2O and CO2 adjusted so that end-tidal CO2 was
maintained at 3.8–4.0%. Rectal temperature was kept near 37 °C with a heating
pad.
A small craniotomy was performed, usually 9–10 mm lateral to the midline and
3–4 mm posterior to the lunate sulcus. This location often yielded two encounters with the primary visual cortex, with eccentricities first around 2–5° and then
around 8–15°. A small slit in the dura was made, and a vertical hydraulic microdrive containing a glass-coated tungsten microelectrode (Merrill and Ainsworth,
1972) in a guide tube was positioned. The craniotomy was covered with a chamber containing 4% agar in sterile saline solution.
Upon completion of surgery, animals were paralyzed to minimize eye movements. Paralysis was maintained with an infusion of vecuronium bromide (Norcuron: 0.1 mg/kg/hr) in lactated Ringer's solution with dextrose (5.4 ml/hr).
The pupils were dilated and accommodation paralyzed with topical atropine. The
corneas were protected with zero power gas-permeable contact lenses; supplementary lenses were chosen to focus the eyes on a tangent screen plotting table set
up at a distance of 57 in. To maintain the animal in good physiological condition
during experiments (typically 72-96 hr), intravenous supplementation of 2.5% dextrose/lactated Ringer's was given at 5-15 ml/hr. Animals received daily injections
of a broad-spectrum antibiotic (Bicillin) as well as an anti-inflammatory agent
(Dexamethasone) to prevent cerebral edema.
3.1.2 Stimuli
Stimuli were generated by a Truevision ATVista board operating at a resolution
of 582 x 752 and a frame rate of 106 Hz whose output was directed to a Nanao
T560i (mean luminance 72 cd/m2). Nonlinearities in the relation between applied
voltage and phosphor luminance were compensated by appropriate nonlinearities
in three color look-up tables.
Drifting luminance-modulated sinusoidal gratings were presented alone or superimposed on another grating or on a noise background. When two gratings were
presented together they had the same temporal frequency and differed in orientation and/or spatial frequency. Their contrast could be varied independently. The
noise background was composed of square pixels whose size was chosen for each cell
to be approximately 1/4 the spatial period of the optimal grating. Occasionally
we used one-dimensional noise (bars rather than squares). The intensity of each
square was randomly refreshed at 13.4 or 26.8 Hz and could assume two possible
values.
All the different stimuli had the same mean luminance. The stimuli were vignetted by a square window. We generally set the luminance of the surrounding
uniform field to be equal to the mean stimulus luminance. Occasionally, however,
we recorded from cells that could be driven only if the surround luminance was
significantly lower than the mean stimulus luminance, a phenomenon similar to
that observed by Kaplan and Shapley (1989) in ON-center P retinal ganglion cells.
Stimulus strength is measured in units of contrast, defined as the difference
between the highest and lowest intensities, divided by the sum of the two. The
maximum contrast is 100%, which is attained when the lowest intensity is zero and
the highest intensity is twice the mean.
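The contrast definition above can be sketched in a few lines. This is only an illustration: the function name `michelson_contrast` and the example intensities are ours, not part of the original stimulus software.

```python
def michelson_contrast(i_max, i_min):
    """Contrast in percent: (highest - lowest) / (highest + lowest)."""
    if i_max + i_min <= 0:
        raise ValueError("intensities must have a positive sum")
    return 100.0 * (i_max - i_min) / (i_max + i_min)


mean_luminance = 72.0  # cd/m^2, the mean luminance quoted for the monitor

# A grating swinging between zero and twice the mean reaches maximum contrast.
full = michelson_contrast(2 * mean_luminance, 0.0)

# A uniform field has no modulation, hence zero contrast.
blank = michelson_contrast(mean_luminance, mean_luminance)
```

Because all stimuli share the same mean luminance, the sum in the denominator is always twice the mean, so contrast is simply the modulation depth relative to the mean.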
Experiments. Experiments consisted of 2-9 consecutive blocks of stimuli. Each
block consisted of a random permutation of 5-90 stimuli. Randomization was
adopted to minimize the effects of adaptation and other nonstationarities. The
stimuli had equal duration (generally 5-10 s) and were separated by uniform field
presentations lasting about 4 s.
Experimental protocol. Receptive fields were initially mapped by hand on
a tangent screen. When a single neuron's activity was isolated, we established
the neuron's dominant eye, and occluded the other eye. We then positioned the
receptive field on the face of the monitor, and quantitative experiments proceeded
under computer control.
To characterize each cell we performed the following sequence of measurements
using single gratings: (1) orientation/direction tuning; (2) spatial frequency tuning;
(3) temporal frequency tuning; (4) stimulus size tuning. Each of these measurements was performed at the optimal values of the parameters as obtained from
the previous measurements. To classify the unit as simple or complex we then
considered the cell's response to an "optimal" drifting grating. If the component
of the response at the temporal frequency of the stimulus was stronger than the
change in mean firing rate, the cell was classified as simple (Skottun et al., 1991),
and we proceeded to the core experiments in this study. These were of three types:
Grating matrix experiments, consisting of drifting sinusoidal stimuli having
5 to 10 different contrasts, 2-4 different temporal frequencies and 2-4 different orientations or spatial frequencies. A typical experiment would involve
3 orientations/spatial frequencies, 3 temporal frequencies and 5 contrasts,
yielding a total of 45 stimuli.
Plaid experiments, consisting of sums of two gratings whose contrasts were independently varied. Often the two directions were opposite, and the "plaid"
was a counterphase flickering grating. A typical experiment would involve
two orthogonal gratings whose contrasts assumed 5 possible values, yielding
a total of 25 different stimuli.
Noise masking experiments, in which the contrast response to drifting gratings was measured in the presence of noise at different contrasts. A typical
experiment would involve 9 grating contrasts and two noise contrasts (0 and
50%) yielding a total of 18 different stimuli.
Cell population. We report here on 149 experiments performed on a total of 54
cells that were clearly identified as simple cells and were held long enough to be
tested with at least two blocks of one of the core experiments in our paradigm.
In particular, we report on 51 grating matrix experiments from 34 cells, 76 plaid
experiments from 27 cells, and 22 noise masking experiments from 17 cells.
The cells in the sample exhibited a broad spectrum of tuning properties. The
orientation tuning of the cells ranged from 14° to 124° half-width, with a third
of the cells showing a tuning sharper than 24° and another third broader than
51°. The directional index of the cells (DI; Reid et al., 1987) ranged over the whole
spectrum from 0 to 1. Direction selectivity was scarce in a third of the cells
(DI < 0.2), and prominent in another third of them (DI > 0.6).
3.1.3 Data Analysis
Amplified and band-pass filtered signals from the microelectrode were fed into
a hardware window discriminator. A computer interface (Cambridge Electronic
Design 1401 Plus) collected the pulses triggered by each action potential and the
synchronization signals from the video graphics board.
Response measure. Our main measure of cell response is the first harmonic
r of the spike trains, a complex number indicating the amplitude and phase of
the best-fitting sinusoid having the same temporal frequency as the stimulus. The
first harmonic is obtained from the spike train by computing
$$ r = \sum_k \left[ \cos(2\pi\omega t_k) + i \sin(2\pi\omega t_k) \right], $$
where $\omega$ is the temporal frequency of the stimulus and the $t_k$ are the
times of the individual spikes. In the following the responses r will often be written
as a matrix $r = \{r_{s,b}\}$, where the subscripts indicate the s-th stimulus presented
in the b-th stimulus block. We will denote the mean across blocks of the responses
as the vector $\bar{r} = \{\bar{r}_s\}$. For example, in an experiment in which three blocks of 25
different stimuli were run, the matrix $r$ would contain 75 elements, and the vector
$\bar{r}$ would contain 25 elements.
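A minimal sketch of the first-harmonic computation follows. The function name and the optional `duration_s` scaling are assumptions of this sketch; the text specifies only the sum of cosine and sine terms over spike times.

```python
import cmath
import math


def first_harmonic(spike_times, freq_hz, duration_s=None):
    """Sum of unit phasors exp(i*2*pi*f*t_k) over the spike times t_k (seconds).

    If duration_s is given, the sum is scaled by 2/duration_s, so that a firing
    rate a*(1 + cos(2*pi*f*t)) gives an amplitude close to a, in spikes/s.
    This scaling is an assumption of the sketch, not stated in the text.
    """
    total = sum(cmath.exp(2j * math.pi * freq_hz * t) for t in spike_times)
    if duration_s is not None:
        total *= 2.0 / duration_s
    return total


# Spikes locked to one phase of a 4 Hz stimulus add coherently...
locked = first_harmonic([k / 4.0 for k in range(8)], 4.0)

# ...whereas spikes spread evenly over the cycle cancel out.
spread = first_harmonic([k / 16.0 for k in range(16)], 4.0)
```

The modulus of the result (`abs(...)`) gives the response amplitude and its argument (`cmath.phase(...)`) gives the response phase.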
Correction for eye movements. Inspection of the spike rasters obtained across
blocks often revealed the presence of a few discrete eye movements. For drifting
grating stimuli the sole effect of these eye movements would be a shift in response
timing. We reduced this effect by shifting in time all the responses in each block by
an amount chosen to minimize the variance across blocks of the first harmonics of
the responses. As all the responses in a block are translated by the same amount,
this method would completely remove the effect of the movements only if they
occurred exactly between blocks. In all other cases it is just an approximation that
reduces the variance across blocks of the data. No attempt was made to correct
the effect of possible eye movements on the responses to plaids or to gratings in
the presence of noise.
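One way such a block-wise time shift could be found is sketched below. The text does not describe the search procedure, so everything here is an assumption: the grid search, the alignment of each block to a reference set of responses, the search range, and the names `shift_block` and `best_shift`. The sketch relies only on the fact that shifting a response in time by delta rotates its first harmonic at temporal frequency f by the phase 2*pi*f*delta.

```python
import cmath
import math


def shift_block(responses, freqs_hz, delta_s):
    """Rotate each first harmonic by the phase that a time shift delta_s produces."""
    return [r * cmath.exp(2j * math.pi * f * delta_s)
            for r, f in zip(responses, freqs_hz)]


def best_shift(block, reference, freqs_hz, max_shift_s=0.05, step_s=0.001):
    """Grid-search the time shift that brings `block` closest to `reference`."""
    def cost(delta):
        shifted = shift_block(block, freqs_hz, delta)
        return sum(abs(a - b) ** 2 for a, b in zip(shifted, reference))

    n = int(round(max_shift_s / step_s))
    candidates = [k * step_s for k in range(-n, n + 1)]
    return min(candidates, key=cost)


# Hypothetical data: three stimuli at 2, 4 and 8 Hz, one block displaced in time.
reference = [10 + 0j, 5j, -3 + 4j]
freqs = [2.0, 4.0, 8.0]
displaced = shift_block(reference, freqs, -0.012)
recovered = best_shift(displaced, reference, freqs)
```

Minimizing the summed squared distance to a common reference is one way to reduce the variance across blocks; an implementation closer to the text might instead minimize the across-block variance jointly over all shifts.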
Variability across experiments. V1 cells are known to adapt, i.e. their responses depend on the history of stimulation (Maffei et al., 1973; Movshon and
Lennie, 1979; Ohzawa et al., 1985; Sclar et al., 1989). Indeed, we found that the responses of a cell to the same stimulus presented in two different experiments could
be quite different. This is illustrated in Figure 3.1A, which shows the responses
of a simple cell as a function of contrast of a drifting grating, as obtained in two
consecutive experiments. The two experiments were started 58 minutes apart, and
involved two sets of stimuli that shared only the stimuli that elicited the responses
shown in the Figure. The difference in the responses is presumably due to the cell
being in different adaptation states. The adaptation behavior of some cells in our
sample was explicitly measured, and is the subject of a separate study (Poirson
et al., 1995). To average out the effects of adaptation we randomized the order of
the stimuli within each block.
Variability within experiments. The number of blocks in our experiments (2-9)
was not sufficient to obtain reliable estimates of the variance $\sigma_s^2$ of the responses
to each stimulus s. For this reason we estimated the dependence of $\sigma_s^2$ on the
amplitude of the mean responses. As a functional form for this dependence we
chose the simple relation $\sigma_s^2 = \alpha |\bar{r}_s|^{\beta}$, where $\alpha$ and $\beta$ are free parameters. An
example of the results of such a fit is shown in Figure 3.1B. In the fits the scale
factor $\alpha$ was on average 2.25 ± 0.2, and the exponent $\beta$ was on average 1.18 ±
0.03, consistent with the notion that the variance of the responses of V1 neurons
is proportional to their mean (Tolhurst et al., 1983).
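The power-law relation between variance and mean amplitude (variance = alpha * |mean|^beta) becomes a straight line in log-log coordinates, so one simple way to estimate the two parameters is ordinary least squares on the logarithms. The text does not state which fitting procedure was used, so this choice and the function name `fit_power_law` are assumptions of the sketch.

```python
import math


def fit_power_law(mean_amplitudes, variances):
    """Fit variance = alpha * amplitude**beta by a least-squares line in log-log space.

    Requires strictly positive amplitudes and variances.
    """
    xs = [math.log(a) for a in mean_amplitudes]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx                       # slope = exponent
    alpha = math.exp(y_bar - beta * x_bar)  # intercept = log of scale factor
    return alpha, beta


# Noise-free synthetic data generated with the parameters quoted for Figure 3.1B.
amps = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
varis = [1.95 * a ** 0.98 for a in amps]
alpha_hat, beta_hat = fit_power_law(amps, varis)
```

An exponent near 1 corresponds to a variance proportional to the mean, which is the case the text compares against.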
Model fits. The models discussed in the Results section were fit to the responses
to all stimuli in an experiment. Different experiments were fitted independently
and thus yielded different sets of parameters. To fit the predictions of a model
$m = \{m_s\}$ to the data we performed a weighted least squares fit, i.e. we searched
for the parameters a that minimized the error function
$$ \mathrm{Error}(a) = \sum_s |m_s(a) - \bar{r}_s|^2 / \sigma_s^2, $$
where the $\sigma_s^2$ are the estimated variances. To avoid giving too much importance
to data points of low amplitude, when fitting the models of the visual responses
we took all the $\sigma_s^2 < 1$ to be exactly equal to 1. When fitting the responses to
gratings in the presence of visual noise the estimated variances $\sigma_s^2$ were ignored
(because they include a visually-driven component), and were all set to one.
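The error function can be sketched as follows, including the rule that estimated variances below 1 are raised to 1. The function name and the `floor` argument are ours; responses and predictions are complex numbers, so `abs` returns the distance in the complex plane.

```python
def weighted_error(model, mean_responses, variances, floor=1.0):
    """Sum over stimuli of |m_s - r_s|^2 / sigma_s^2, with variances floored at `floor`."""
    total = 0.0
    for m, r, var in zip(model, mean_responses, variances):
        total += abs(m - r) ** 2 / max(var, floor)
    return total


# Two hypothetical stimuli: the first variance (0.25) is raised to 1 by the floor.
err = weighted_error([1 + 1j, 2.0], [1.0, 4.0], [0.25, 4.0])

# For gratings in noise the variances are ignored, i.e. all set to one.
err_noise = weighted_error([1 + 1j, 2.0], [1.0, 4.0], [1.0, 1.0])
```

In a full fit this function would be handed to a numerical optimizer that searches over the model parameters; the optimizer itself is outside the scope of this sketch.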
Figure 3.1: Response variability across and within experiments. A: Variability across experiments (abscissa: Contrast, %; ordinate: Mean response, sp/s). White and black dots represent responses to the
same stimuli in two different experiments performed on the same cell, as a
function of stimulus contrast. Error bars represent one standard error of the
mean (black: N=6, white: N=5). Stimuli were gratings drifting at 3.3 Hz.
The first experiment (black) involved a block of 40 stimuli (10 contrasts,
2 temporal frequencies, 2 spatial frequencies) which was presented 6 times.
The second experiment (white) was initiated 58 minutes after the first and
involved a block of 90 stimuli (10 contrasts, 3 orientations, 3 temporal frequencies) which was presented 5 times. B: Variability within an experiment (abscissa: Mean response, sp/s; ordinate: Variance, (sp/s)^2).
Each datum represents the responses to one of the 90 stimuli in the second experiment. The abscissa plots the amplitude of the mean responses,
$|\bar{r}_s|$. The ordinate plots the variance $\sigma_s^2$ of the responses. The line shows
the fit by the model of the variance described in the text, with parameters
$\alpha = 1.95$, $\beta = 0.98$. Unit 389l019, Exps 5,6.
Percentage of the variance. To gain an intuitive assessment of the quality of
the fits provided by a model, we computed the percentage of the variance across
stimuli that it accounted for. To define this measure it is useful to consider the
(mean square) distance between two sets of responses $x = \{x_s\}$ and $y = \{y_s\}$,
$$ d(x, y) = \frac{1}{N} \sum_s |x_s - y_s|^2, $$
where the sum is over the stimuli s, and N is the number of stimuli. The percentage
of the variance accounted for by the model m in the responses r is then simply
$$ \%\mathrm{Variance} = 100 \times \left[ 1 - d(m, \bar{r}) / d(\bar{r}, \bar{\bar{r}}) \right], $$
where $\bar{\bar{r}}$ is the response mean computed across stimuli and across blocks. In this
expression, the numerator is the distance between the model predictions and the
mean cell responses; the denominator is the variance across stimuli of the mean cell
responses. For example, if the model predicted the mean responses exactly then
it would account for 100% of the variance. More realistically, if the mean square error
between the model prediction and the responses was 10 (sp/s)^2, and the responses in
the data set had very different amplitudes and/or phases, so that their variance
was large, say 100 (sp/s)^2, then the model accounted for 90% of the variance in the
data.
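The percentage of the variance can be sketched directly from the mean square distance. Function names are ours. The sketch assumes every stimulus was shown in the same number of blocks, so that the mean of the block-averaged responses equals the grand mean across stimuli and blocks.

```python
def mean_square_distance(x, y):
    """d(x, y) = (1/N) * sum over stimuli of |x_s - y_s|^2."""
    return sum(abs(a - b) ** 2 for a, b in zip(x, y)) / len(x)


def percent_variance(model, mean_responses):
    """100 * [1 - d(model, mean responses) / variance across stimuli]."""
    n = len(mean_responses)
    grand_mean = sum(mean_responses) / n
    variance = mean_square_distance(mean_responses, [grand_mean] * n)
    return 100.0 * (1.0 - mean_square_distance(model, mean_responses) / variance)


# Hypothetical mean responses (sp/s); their variance across stimuli is 125 (sp/s)^2.
responses = [0.0, 10.0, 20.0, 30.0]
perfect = percent_variance(responses, responses)              # exact prediction
offset = percent_variance([r + 5.0 for r in responses], responses)  # error of 25 (sp/s)^2
```

Note that the measure can become negative when a model predicts the data worse than the grand mean does.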
Bootstrap test. While the percentage of the variance is an intuitive measure
of the quality of the fits, it has the disadvantage of taking into account only
the variability across stimuli, and not the variability across blocks. A very noisy
experiment could yield bad estimates of the mean response to each stimulus, in
which case the model would account for a small percentage of the variance in the
data even if it reflected the exact physical reality underlying the responses. This
suggests a need for a test that takes into account the statistical properties of the
data.
To test the equality of the model predictions to the means of the probability
distribution underlying the neural responses, we performed a bootstrap hypothesis
test (Efron and Tibshirani, 1991). The advantage of such a test is that it does not
require that the distribution underlying the responses is normal. We tested whether
we could reject the null hypothesis that the mean of the probability distribution
underlying the neural responses was identical to the predictions of the model.
Let $r_b$ be the vector of responses obtained in the b-th block of stimuli. If for
example an experiment consisted of 25 different stimuli, and was repeated 4 times,
there would be 4 vectors of responses, $r_1$, $r_2$, $r_3$, and $r_4$, and each would contain
25 (complex) numbers. Let m be the model's prediction, obtained by fitting all
the $r_b$. The null hypothesis states that the mean r of the probability distribution
from which the $r_b$ are drawn is identical to the model's prediction:
$$ H_0 : r = m. $$
As a test statistic we chose the distance between the model predictions and the
empirical average of the responses:
$$ t = d(m, \bar{r}). $$
Having observed a value $t_{obs}$ by evaluating the test statistic on the actual experimental data, we set out to measure what the probability would be of observing at least
that large a value if the null hypothesis were true. This probability is the achieved
significance level of the test,
$$ \mathrm{ASL} = \mathrm{Prob}\{ t \geq t_{obs} \mid H_0 \}. $$
The smaller the ASL, the stronger the evidence against $H_0$.
To compute the ASL with the bootstrap method, we converted our data set r
into one whose empirical distribution function obeyed $H_0$. This was simply done
by considering $\tilde{r} = r - \bar{r} + m$. Shifting the empirical average of the distribution in
such a way is justified if the probability distribution of r is shift-invariant (Efron
and Tibshirani, 1993). We then computed the bootstrap estimate of the ASL by
repeating a thousand times:
1. Draw a sample data set $r^*$ with replacement from $\tilde{r}$. For example, if the
experiment was repeated 4 times, a possible draw would be $r^* = \{\tilde{r}_4, \tilde{r}_1, \tilde{r}_2, \tilde{r}_2\}$,
another one could be $r^* = \{\tilde{r}_2, \tilde{r}_1, \tilde{r}_2, \tilde{r}_3\}$, and so on.
2. Compute the test statistic on the sample, $t^* = d(m, \bar{r}^*)$.
The bootstrap estimate of the achieved significance level of the test is equal to
the percentage of samples for which the $t^*$ are larger than the observed value $t_{obs}$.
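The two bootstrap steps can be sketched as below. The function names, the fixed random seed and the use of ">=" in the comparison (matching the definition of the ASL as the probability of a value at least as large as the observed one) are choices of this sketch. Each block is a list with one response per stimulus.

```python
import random


def mean_across_blocks(blocks):
    """Average the responses to each stimulus over blocks."""
    return [sum(col) / len(blocks) for col in zip(*blocks)]


def distance(x, y):
    """Mean square distance between two sets of responses."""
    return sum(abs(a - b) ** 2 for a, b in zip(x, y)) / len(x)


def bootstrap_asl(blocks, model, n_boot=1000, seed=0):
    """Bootstrap estimate of the achieved significance level, as a fraction."""
    rng = random.Random(seed)
    r_bar = mean_across_blocks(blocks)
    t_obs = distance(model, r_bar)
    # Shift every block so that the empirical mean equals the model (H0 holds).
    shifted = [[r - rb + m for r, rb, m in zip(block, r_bar, model)]
               for block in blocks]
    exceed = 0
    for _ in range(n_boot):
        # Step 1: resample whole blocks with replacement.
        sample = [rng.choice(shifted) for _ in blocks]
        # Step 2: test statistic on the resampled data set.
        t_star = distance(model, mean_across_blocks(sample))
        if t_star >= t_obs:
            exceed += 1
    return exceed / n_boot


# Hypothetical data: 4 blocks of 2 stimuli, block-averaged responses [4, 5].
blocks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
asl_good = bootstrap_asl(blocks, [4.0, 5.0])     # model equals the empirical mean
asl_bad = bootstrap_asl(blocks, [100.0, 100.0])  # model far from the data
```

A model that matches the block averages exactly cannot be rejected (ASL of 1), whereas a model far outside the spread of the blocks yields an ASL of 0.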
3.2 Results
Our goal is to test a model of simple cells. We thus start by describing the model
and then we compare its predictions with the responses of simple cells to gratings,
plaids and gratings masked by noise.
3.2.1 The Normalization Model
The normalization model is depicted in Figure 1.1C (Page 2). To keep the model
simple and mathematically tractable, we adopt a number of simplifications. First,
we consider the input to simple cells to be the driving current, which we define as
the current that would be measured by clamping the voltage of the cell at rest.
Then, we assume that (1) the relation between the visual stimuli and the driving
current is linear; (2) the cell membrane is a single passive compartment; (3) the
firing rate is a rectified copy of the membrane potential; (4) cells inhibit each other
by increasing each other's membrane conductance; and (5) the pool of cells that
inhibit each other contains cells tuned to a wide variety of stimulus attributes.
The linear stage. As a visual stimulus is projected on the retina it can be
described by its light distribution $l(x, y, t)$, that varies in the two spatial dimensions
x, y and in time t. This representation ignores the color of the stimulus and assumes
monocular viewing, but is in all other respects complete.
In order to ignore the effects of light adaptation in the retina, we restricted
our choice of visual stimuli to intensity distributions that modulate about a fixed
mean $\bar{l}$. In these conditions the stimulus can be characterized by its local contrast
$c(x, y, t) = [l(x, y, t) - \bar{l}] / \bar{l}$. This characterization is particularly relevant because
the retina produces a "neural image" of the local contrast (Shapley and Enroth-Cugell, 1984).
To relate the local contrast to the driving current in simple cells we adopt a
linear model. The driving current $I_d(t)$ is obtained by weighting the local stimulus
contrast $c(x, y, t)$ at each location and time by the value of the cell's weighting
function W at that location and at that time, and by algebraically summing the
results:
$$ I_d(t) = \iiint W(x, y, T)\, c(x, y, t - T)\, dx\, dy\, dT. $$    (3.1)
This linear equation is essentially the simplest possible relation between the visual
stimuli and the input to simple cells. It is intended to be at best an approximation.
We describe in Appendix Section 3.4 some biophysical conditions that would lead
to it being exact.
In the following we will use the term contrast and the symbol c to denote the
maximal value of the local contrast $c(x, y, t)$. A uniform field has a contrast of 0%,
while a grating oscillating between zero and twice its mean intensity has contrast
100%.
RC circuit. We adopt an extremely simplified biophysical model of a cell membrane: a circuit composed of a resistor and a capacitor arranged in parallel (RC
circuit). According to this model, the membrane potential V(t) obeys the following
equation:
$$ C\, dV/dt + g V = I_d, $$    (3.2)
where C is the membrane capacitance, g(t) is the total membrane conductance,
and $I_d(t)$ is the driving current. In the absence of visual stimuli, the driving current
is zero, and the membrane potential is driven to its resting value, which we have
taken for simplicity to be zero.
Figure 3.2: Interrelations and effects of the principal variables in the normalization model. A: relation between membrane potential V (abscissa, mV) and firing
rate R (ordinate, sp/s). The thick, intermediate and thin lines respectively depict rectification with thresholds $V_{thresh}$ = 0, 6 and 12 mV. The dashed curves indicate
approximations to rectification obtained with power functions (see text),
with exponents n = 2 (thick dashes) and n = 3 (thin dashes). B: relation
between pool activity and membrane conductance. The abscissa plots the
overall response of the pool, $k \sum R$ (Total firing rate, %); the ordinate plots the increase in membrane conductance $g/g_0 - 1$ (Conductance increase, %; Equation 3.4). C: Effects of conductance on the
size and time course of the membrane potential responses (abscissa: Time, ms; ordinate: Membrane potential, mV). The curves are
the membrane potential responses to a current step with onset at time zero,
for three different values of the conductance g. As the conductance increases
(thin to thick curves), it reduces both the gain and the time constant of the
cell.
Rectification. As a first approximation, the transformation by V1 cells of membrane potentials V into spike rate R can be modeled by rectification, that is by
a function that is zero for membrane potentials below a threshold, and grows linearly from there on (Movshon et al., 1978b; Jagadeesh et al., 1992; Carandini et al.,
1996b). Rectification is given by $R(t) \propto \max(V(t) - V_{thresh}, 0)$. This function is
depicted for three different values of the threshold $V_{thresh}$ by the straight lines in
Figure 3.2A.
Rectification is however not very easily handled in mathematical derivations.
We thus approximate rectification ($V_{thresh} > 0$) with half-rectification ($V_{thresh} = 0$)
followed by elevation to the power n:
$$ R \propto \max(0, V)^n. $$    (3.3)
The quality of this approximation is shown in Figure 3.2A. The value of the exponent n grows with the distance of the threshold $V_{thresh}$ from the resting potential
$V_{rest}$, which is here considered to be 0 mV for simplicity. If the threshold is very
close to rest, then $n \approx 1$ ("half-rectification"). If the threshold is a bit above rest,
e.g. 6 mV higher, then $n \approx 2$ ("half-squaring"). If the threshold is much above
rest then $n \approx 3$ or more.
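The two encoders can be written side by side to make the approximation concrete. Since the text specifies both relations only up to a proportionality constant, the `gain` arguments below are arbitrary, and the function names are ours.

```python
def rectify(v_mv, v_thresh_mv=0.0, gain=1.0):
    """Thresholded rectification: zero below threshold, linear above it."""
    return gain * max(v_mv - v_thresh_mv, 0.0)


def power_law(v_mv, n, gain=1.0):
    """Half-rectification (threshold at rest, 0 mV) followed by elevation to the power n."""
    return gain * max(0.0, v_mv) ** n


# Below rest both encoders are silent.
silent = (rectify(-3.0, 6.0), power_law(-3.0, 2))

# Above threshold, a 6 mV threshold removes 6 mV from the drive...
thresholded = rectify(10.0, 6.0)

# ...whereas the power law has no threshold and instead accelerates with V.
half_squared = power_law(3.0, 2)
```

Matching the two curves over a range of potentials, as in Figure 3.2A, would additionally require choosing the gain of the power law; that choice is not specified here.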
Conductance and cortical activity. We now make the central assumption
that gain control operates by shunting inhibition. We assume that the membrane
conductance g of each cell grows with the overall activity $\sum R$ of a pool of cortical
cells, the normalization pool. Some possible biophysical mechanisms for shunting
inhibition are discussed in Appendix Section 3.4. The particular function that
we choose to relate the conductance and the activity of the pool is illustrated in
Figure 3.2B. Its mathematical expression is
$$ g = g_0 / \sqrt{1 - k \sum R}, $$    (3.4)
where the parameter k measures the effectiveness of the normalization pool. This
function is completely ad hoc, and is currently not supported by any physiological
evidence. Our reasons for choosing it will become evident when we derive closed
form equations for the responses of the model (Appendix Section 3.5).
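As an illustration, the sketch below evaluates the conductance under the reading of Equation 3.4 as g = g0 / sqrt(1 - k * sum(R)). That reading, the function name and the example values are assumptions of the sketch. The expression is defined only while k times the pool activity stays below 1.

```python
import math


def conductance(pool_rate, k, g0=1.0):
    """Membrane conductance for a given summed firing rate of the normalization pool.

    Assumes g = g0 / sqrt(1 - k * pool_rate); valid only for k * pool_rate < 1.
    """
    x = 1.0 - k * pool_rate
    if x <= 0.0:
        raise ValueError("k * pool_rate must be smaller than 1")
    return g0 / math.sqrt(x)


# With a silent pool the conductance sits at its resting value g0.
g_rest = conductance(0.0, k=0.5)

# Hypothetical pool activity for which k * sum(R) = 0.75: conductance doubles.
g_active = conductance(1.5, k=0.5)
```

The quantity plotted in Figure 3.2B, the relative conductance increase, would be `conductance(...) / g0 - 1`.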
The membrane conductance g affects both the size and the time course of the
responses. The conductance affects response size because it controls the gain of the
transformation of input currents I into output potentials V; at steady state the gain
is V/I = 1/g, inversely proportional to the conductance. The conductance affects
the time course of the response because the membrane takes time to charge and
discharge, and this time is proportional to the membrane time constant $\tau = C/g$,
which is also inversely proportional to the conductance.
Figure 3.2C illustrates these concepts. It shows the responses of the membrane
to a current step, for three values of the conductance g. If the conductance is very
small, the response is slow and there is high gain (that is, the voltage response to
a given current is high). If the conductance g is very large (the membrane is very
leaky), it has small gain and is fast in charging and discharging the capacitor.
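For a constant conductance, the membrane equation with a current step has the closed-form solution V(t) = (I/g) * (1 - exp(-t/tau)) with tau = C/g, starting from rest. The sketch below evaluates it; the function name and the unit-free example values are ours.

```python
import math


def step_response(t, current, conductance, capacitance):
    """Membrane potential at time t after a current step at t = 0, starting from rest."""
    if t < 0:
        return 0.0
    tau = capacitance / conductance       # membrane time constant
    gain = 1.0 / conductance              # steady-state V / I
    return current * gain * (1.0 - math.exp(-t / tau))


# Arbitrary consistent units: C = 1, unit current step, two conductances.
low_g, high_g = 0.5, 1.0

# Steady state (long after the step): halving g doubles the response.
v_inf_low = step_response(1000.0, 1.0, low_g, 1.0)
v_inf_high = step_response(1000.0, 1.0, high_g, 1.0)

# After one time constant the response has covered 1 - 1/e of its final value.
frac_high = step_response(1.0, 1.0, high_g, 1.0) / v_inf_high
```

The low-conductance membrane ends up at twice the potential but needs twice as long to get there, which is the gain and speed trade-off shown in Figure 3.2C.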
When the cells in the normalization pool are silent (e.g. in the absence
of any visual stimulus) their conductance is minimal, $g = g_0$, so they are most
responsive. Also, their time constant is long, so the cells cannot follow the fine
temporal changes of a stimulus nearly as well as its slower variations. The more a
visual stimulus is effective in driving the cells in the pool, the more the cells inhibit
each other by increasing their conductance. This decreases their responsivity, and
it decreases their time constant, so the cells can better follow the fine temporal
changes of the stimulus.
The normalization pool. We now make a final assumption, about the composition of the normalization pool. We assume that the cells in the pool are tuned
to the full range of stimulus orientations and directions and to a broad range of
spatial and temporal frequencies. Most visual stimuli will therefore elicit spikes in
some cells of the pool. These stimuli will increase the conductance of (and thus
inhibit) all the cells in the pool.
Solution of the model. The variables in the model depend on each other in
a circular way: (1) each cell's firing rate R depends on its membrane potential
V (Equation 3.3, Figure 3.2A); (2) each cell's membrane potential V depends on
its driving current $I_d$ and on its conductance g (Equation 3.2); (3) each cell's
conductance g depends on the total firing rate of the cells in the normalization
pool, $\sum R$ (Equation 3.4 and Figure 3.2B). The model is a nonlinear neural network
(Grossberg, 1988), and is in general quite complicated because both the driving
current and the conductance vary over time.
Nevertheless, the model was designed so that for the visual stimuli employed
in this study (drifting sine gratings, plaids and noise) we can derive approximate closed-form equations for its responses. These equations, together with their
derivation, are detailed in the Appendix Sections 3.5 and 3.6.
3.2.2 Responses to Gratings
Figure 3.3A shows the period histograms of the responses of a typical simple cell
to drifting gratings with four different stimulus contrasts. The responses look like
rectified sinusoids, which is consistent with the prediction of the linear model.
Indeed, the response of a linear cell to a drifting sinusoidal grating would be a
sinusoid modulating at the same temporal frequency as the stimulus. The rectification introduced by the firing rate encoder would hide the negative part of the
responses.
Contrast responses. There are subtler aspects of the responses that are not
consistent with the linear model. Presented with a scaling in contrast, a linear
neuron would scale its response by the same amount. The responses of the cell in
Figure 3.3, instead, increase only marginally as the contrast doubles from 50% to
100%. This phenomenon has been extensively studied, and is known as response
saturation (Maffei and Fiorentini, 1973; Dean, 1981; Ohzawa et al., 1982; Albrecht
and Hamilton, 1982; Li and Creutzfeldt, 1984; Sclar et al., 1990; Bonds, 1991;
Carandini and Heeger, 1994).
Figure 3.3: Responses to drifting sine gratings of different contrasts. The
curves are fits of the normalization model. The fits were performed on a
larger data set, which included the responses to 72 different drifting gratings, that had 8 different contrasts, 3 different orientations, and 3 different
temporal frequencies. These stimuli were randomly interleaved to minimize
the effect of visual adaptation. A: Period histograms (abscissa: Time, ms; ordinate: Response, sp/s). Each row corresponds
to a different stimulus contrast, indicated to the right of each histogram. B:
Amplitude of the responses as a function of contrast (abscissa: Contrast, %; ordinate: Amplitude, sp/s). The ordinate plots the
amplitude of the first harmonic of responses such as those in A. C: Phase of
the responses as a function of contrast (ordinate: Relative phase, deg). D: Polar plot of the responses in B
and C (abscissa: Cos*Response, sp/s; ordinate: Sin*Response, sp/s). Every point in the plot corresponds to a sinusoid whose amplitude is
given by the distance from the origin, and whose phase is given by the angle
with the horizontal axis. As the contrast increases the responses get larger
(far from the origin), and their phases advance (they turn counter-clockwise).
Circles have radius one standard error of the mean (N=3) computed from
the estimated variance (Figure 3.1A). Error bars in B and C are ± one standard error of the mean, computed from circles in D. Cell 392l008, exp. 4.
Parameters: $\tau_0$ = 37 ms; $\tau_1$ = 9 ms; n = 1.34 (semisaturation contrast: 30%
for this temporal frequency of 6.5 Hz).
Another nonlinearity displayed by simple cells is reflected in the latency of
their responses. For a linear cell response latency would be unaffected by stimulus
contrast. Simple cells, instead, respond sooner to high-contrast stimuli than to low-contrast ones. For example, the cell in Figure 3.3 responds to the 100% contrast
grating around 20 ms sooner than to the 12.5% contrast one. This phenomenon
is called phase advance (Dean and Tolhurst, 1986; Carandini and Heeger, 1994;
Albrecht, 1995).
The curves fitted to the histograms are the predictions of the normalization
model. According to the model, the sinusoidal currents L(t) produced by the linear
stage in response to sinusoidal gratings are subject to filtering by the membrane
before being rectified into spike rates. The conductance of the cells grows with
the stimulus energy/contrast. Increasing the conductance decreases the gain of
the membrane, resulting in response saturation. Increasing the conductance also
decreases the time constant, so at high contrast the membrane introduces shorter
delays than at low contrast, resulting in phase advance.
A well-established way to characterize the responses of simple cells to gratings
is to consider their first harmonic, i.e. to fit them with a sinusoid having the same
temporal frequency as the stimulus. This yields two numbers, the amplitude and
phase of the sinusoid, which indicate the size and timing of the responses. The
dependence of these two measures on stimulus contrast is illustrated in Figure 3.3B
and C. For contrasts below 20% the amplitudes (B) grow roughly linearly with
contrast, and the phases (C) stay substantially constant. A contrast of 30% here
corresponds to the semisaturation contrast for this cell, i.e. the contrast at which
response amplitude is 1/2 of the response to a 100% contrast. As the contrast
increases, the amplitudes saturate and the phases advance. The curves fitted to
the data are the predictions of the normalization model, which clearly captures
these phenomena. By contrast, the linear model would have predicted that the
data in B lie on a diagonal line, and those in C on a horizontal line.
The equations for response amplitude and phase predicted by the model are
derived in Appendix Section 3.5. We here present the equation for response amplitude because it helps illustrate the behavior of the model. According to the
model, the amplitude of the responses of a simple cell to a grating of contrast c
and temporal frequency $\omega$ is
$$ \mathrm{amplitude}(R) = \left[ \mathrm{amplitude}(L)\, \frac{c}{\sqrt{\sigma(\omega)^2 + c^2}} \right]^n. $$    (3.5)
The role of the quantities L, $\sigma(\omega)$, and n is easy to understand if one keeps in mind
the structure of the model (Figure 1.1C, Page 2). The output of the cell's linear
weighting function (Equation 3.1) is a sinusoid of amplitude [amplitude(L) c]. The
normalization stage divides that by a quantity that depends on the activity of a
large number of neurons. Appendix Section 3.5 shows that for drifting grating
stimuli this quantity is $\sqrt{\sigma(\omega)^2 + c^2}$, where $\sigma$ depends on the temporal frequency
$\omega$ of the stimuli, and is related to the low-pass properties of the cell membrane.
Finally, the exponent n is a constant and is related to the rectification stage that
encodes the membrane potentials into firing rates (Figure 3.2A).
Equation 3.5 is similar to a hyperbolic ratio, which was empirically found to
provide good fits to the amplitude of the contrast responses of V1 cells (Albrecht
and Hamilton, 1982; Sclar et al., 1990). Indeed, our ad hoc choice of the dependence
of conductance on the activity of the normalization pool (Equation 3.4) was made
with this expression in mind. The dependence of response amplitude on stimulus
contrast is quite simple: at low contrasts one has $c \ll \sigma(\omega)$, and the responses
grow approximately linearly with the contrast c. At high contrasts, instead, the
denominator has a strong effect, and the responses saturate.
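The two regimes can be checked numerically with a small sketch of Equation 3.5, read here as amplitude(R) = [amplitude(L) * c / sqrt(sigma^2 + c^2)]^n, with sigma the frequency-dependent term in the denominator. The function name, the symbol `sigma` and the example values (contrasts as fractions, sigma = 0.3, n = 1) are assumptions of the sketch.

```python
import math


def response_amplitude(c, sigma, lin_amplitude, n):
    """Predicted first-harmonic amplitude for a grating of contrast c (0 to 1)."""
    return (lin_amplitude * c / math.sqrt(sigma ** 2 + c ** 2)) ** n


sigma, lin_amp, n = 0.3, 50.0, 1.0

# Low contrast (c much smaller than sigma): doubling c nearly doubles the response.
low_ratio = (response_amplitude(0.02, sigma, lin_amp, n)
             / response_amplitude(0.01, sigma, lin_amp, n))

# High contrast: doubling c from 50% to 100% adds little (saturation).
high_ratio = (response_amplitude(1.0, sigma, lin_amp, n)
              / response_amplitude(0.5, sigma, lin_amp, n))
```

With an exponent n different from 1 the low-contrast growth follows the power c^n rather than a strictly linear law, so the low-contrast ratio becomes about 2^n.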
Amplitude and phase were fit at the same time by performing the fit in the polar
plane illustrated in Figure 3.3D. In a polar plot response amplitude is represented
as distance from the origin, and response phase is represented as the angle with
the horizontal axis. The data and the curve predicted by the model shown in D are
the same as those in B and C. As the contrast increases the data points get further
from the origin (response amplitude increases), and they turn counter-clockwise
(response phase advances).
The curves in B and C are determined by a total of five parameters. The
first two specify the amplitude and phase of the output of the linear weighting
function to the grating at full contrast; these determine the vertical positions of
the two curves. Two other parameters are the time constants of the membrane at
rest and at full contrast. These determine the saturation point of the amplitude
curve and the steepness of the phase curve. The last parameter, the exponent,
describes the rectification stage. It determines the steepness of the amplitude
curve below saturation. The latter three parameters do not depend on the visual
stimulus, and were obtained by fitting simultaneously the contrast responses to all
the drifting grating stimuli in an experiment. In particular, the cell in Figure 3.3
was tested with 72 different drifting gratings, that had 8 different contrasts, 3
different orientations, and 3 different temporal frequencies. The time constant at
rest that was estimated from fitting this large data set was $\tau_0$ = 37 ms; the time
constant at full contrast was $\tau_1$ = 9 ms. For this data set, then, the fits of the
model estimate a fourfold decrease in time constant, i.e. a fourfold increase in
conductance. The exponent was n = 1.34, indicating that the estimated resting
potential of the cell was close to its spike threshold (Figure 3.2A).

Figure 3.4: Responses to drifting sine gratings at two different orientations,
-15° (gray) and -45° (white). Fits of the normalization model (curves) were
performed on a larger data set than shown, which included 72 stimulus
conditions (8 contrasts, 3 orientations, 3 temporal frequencies). A: Period
histograms. Rows correspond to different contrasts, columns to orientations.
B: Response amplitude as a function of contrast. C: Response phase as a
function of contrast. To facilitate comparison the responses to each grating
were shifted vertically so that their values predicted by the normalization
model would overlap. D: Polar plot of the responses in B and C. Circles
and error bars are standard errors of the mean (N=3), as in Figure 3.3. Cell
392l009, exp. 8. Parameters: $\tau_0$ = 28 ms; $\tau_1$ = 3 ms; n = 1.6 (semisaturation
contrast: 15% for this temporal frequency of 6.5 Hz).
Different orientations. Figure 3.4 shows the contrast responses of a simple
cell to two drifting gratings differing in their orientation. As shown in A, the
responses elicited by the grating drifting at -15° (left column) were around 40%
larger than those elicited by the grating drifting at -45°. This proportion remained
substantially constant in the face of prominent saturation above 25% contrast.
This property can be better observed in B, which shows the amplitude of the
responses as a function of contrast. The contrast response to the -15° grating (gray)
is larger than that to the other grating (white). In spite of the amplitude saturation
the contrast responses are vertical shifts of each other. On the logarithmic axis
this vertical shift means that the ratio of the responses to different orientations
is constant, irrespective of the stimulus contrast. Another way to express this
property is to say that the orientation tuning curve scales with contrast. Hence, the
ability of the cell to discriminate orientations does not deteriorate in the presence of
saturation. This property has been repeatedly observed for both orientation tuning
and spatial frequency tuning (Movshon et al., 1978c; Albrecht and Hamilton, 1982;
Sclar and Freeman, 1982; Li and Creutzfeldt, 1984; Skottun et al., 1987).
Even though the absolute phases of the responses to the two gratings differed by
about 180° (D), their dependence on contrast was very similar. This is illustrated
in C, which plots the relative phases as a function of stimulus contrast. The phases
of the responses to each grating were shifted vertically so that the fits provided
by the normalization model would overlap. This transformation highlights the
similarity of the dependence of response phase on stimulus contrast for the two
gratings. As with response saturation, phase advance is controlled by the contrast
of the stimulus per se, rather than by the cell's firing rate. The relative timing of
the responses (difference in response phase) to different orientations is independent
of stimulus contrast.
The normalization model captures the orientation invariances in the contrast
responses, both in amplitude and in phase. This is because saturation and phase
advance arise from the conductance increase caused by the increase in response
of the entire pool of neurons, which reflects the increased stimulus energy,
independent of stimulus orientation. More precisely, in the expression for the response
amplitude (Equation 3.5), stimulus contrast and stimulus orientation are separable.
Indeed, the expression can be seen as the product of two factors,
$[\mathrm{amplitude}(L)]^n$ and $(c/\sqrt{\sigma(\omega)^2 + c^2})^n$. The first factor
depends on $L$, the response of the cell's linear receptive field to the grating at
unit contrast, so it depends on orientation but not on contrast. The second factor
depends only on the contrast $c$ and on the temporal frequency $\omega$ of the grating.
For a fixed temporal frequency the shape of the contrast responses is entirely
controlled by this second factor, which is independent of stimulus orientation. A
similar argument of separability can be made for the phase responses predicted by
the model. The expression for response phase that is derived in Appendix Section 3.5
(Equation 3.19) is the sum of two terms, one which depends on the stimulus
orientation but not on its contrast, and one that depends on the stimulus contrast
but not on its orientation.
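The separability argument for amplitude can be checked numerically. The sketch below assumes the form of Equation 3.5, with `sigma` standing for the frequency-dependent denominator term at a fixed temporal frequency; the two linear-stage amplitudes and the other parameter values are hypothetical.

```python
import math

def response_amplitude(c, lin_amp, sigma, n):
    """Equation 3.5 at a fixed temporal frequency."""
    return (lin_amp * c / math.sqrt(sigma ** 2 + c ** 2)) ** n

sigma, n = 0.15, 1.6          # hypothetical model parameters
lin_a, lin_b = 10.0, 8.0      # hypothetical amplitude(L) at two orientations
contrasts = [0.03, 0.06, 0.125, 0.25, 0.5, 1.0]

# Ratio of the responses to the two orientations at each contrast.
ratios = [response_amplitude(c, lin_a, sigma, n) /
          response_amplitude(c, lin_b, sigma, n) for c in contrasts]

# The contrast-dependent factor cancels, leaving only the orientation factor.
expected_ratio = (lin_a / lin_b) ** n
```

Every entry of `ratios` equals `expected_ratio`, which corresponds to the constant vertical shift between the two curves on a logarithmic axis.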
Different spatial frequencies. Changing the spatial frequency of a grating
has the same effect on the contrast responses as changing orientation: response
amplitude is shifted vertically on a logarithmic scale, and response phase is shifted
vertically on a linear scale. An example of this can be observed in Figure 3.5, which
shows the contrast responses of a simple cell to two drifting gratings differing in
their spatial frequency. The responses elicited by the 1.4 cycles/degree grating (A,
left column) were about 70% larger than those elicited by the 1.1 cycles/degree
grating (right column). This proportion held substantially constant in the face of
response saturation. Indeed, the amplitude vs. contrast curves (B) were similar
except for a vertical shift on a logarithmic scale. In addition, the phase vs. contrast
curves were similar except for a vertical shift on a linear scale (C).

Figure 3.5: Contrast responses for gratings with two different spatial
frequencies: 1.4 cpd (gray) and 1.1 cpd (white). Fits of the normalization model
(curves) were performed on a larger data set than shown, which included
40 stimulus conditions (10 contrasts, 2 spatial frequencies, 2 temporal
frequencies). Stimuli of 3%, 6% and 12% contrast elicited less than 1 sp/s.
A: Period histograms. Rows correspond to different contrasts, columns to
spatial frequencies. B: Response amplitude as a function of contrast. C:
Response phase as a function of contrast. Responses to each grating were
shifted vertically so that their values predicted by the normalization model
would overlap. D: Polar plot of the responses in B and C. Circles and error
bars are standard errors of the mean (N=6), as in Figure 3.3. Cell 382l019,
exp. 5. Parameters: $\tau_0$ = 18 ms; $\tau_1$ = 8 ms; n = 4 (semisaturation contrast:
44% for this temporal frequency of 6.5 Hz).

The fits of the normalization model (continuous curves) capture all these
properties of the responses. Indeed, the very same argument about separability in the
model responses of contrast and orientation can be made for contrast and spatial
frequency. The model prediction for response amplitude can be expressed as the
product of two terms, one depending on spatial frequency but not on contrast, and
one depending on contrast but not on spatial frequency. The latter is the result
of activity in the normalization pool, which is assumed to contain cells tuned to a
broad range of spatial frequencies. Over this range its total response is independent
of spatial frequency.
Different temporal frequencies. Changes in the stimulus temporal frequency
had very different effects from changes in orientation or spatial frequency. In
particular the above-mentioned invariances of the contrast responses did not hold
for stimuli differing in temporal frequency. Rather, we found that increasing the
temporal frequency increased the contrast at which the responses saturated and
decreased the total phase advance. Similar results (for the amplitude of the
responses) were obtained in the cat by Holub and Morton-Gibson (1981) and in the
monkey by Hawken and collaborators (1992; see also appendix in Albrecht, 1995).
Figure 3.6 illustrates these phenomena. While at low temporal frequencies the
responses saturated at low contrast (A, left columns), at high temporal frequencies
the responses did not show much saturation (right columns). At 50% contrast
the responses to the 1.6 Hz stimulus had already reached saturation, while the
responses to the 13 Hz stimulus were just coming out of the noise. Yet, at 100%
contrast these stimuli were roughly equally effective. This behavior can be better
observed in an amplitude plot (B): the contrast responses differ in their horizontal
position, so they could not be superimposed by a vertical shift, as was the case
with the contrast responses to different orientations or spatial frequencies.

Figure 3.6: Dependence of the contrast responses on temporal frequency.
Continuous curves are predictions of normalization model. A: Period
histograms. Rows correspond to different contrasts, columns to temporal
frequencies. B: Response amplitude as a function of contrast. The 3.3 Hz data
were very close to the 1.6 Hz data, and were omitted to avoid clutter. C:
Response phase as a function of contrast. Error bars in B and C are ± one
standard error of the mean (N=3). Gray levels indicate the temporal
frequency as in A. D: Response amplitude as a function of temporal frequency
and contrast. Dashed lines connect actual data (dots). Fits were performed
on a larger data set than shown, which included 64 stimulus conditions (8
contrasts, 4 temporal frequencies, 2 orientations). Cell 382l021, exp. 5.
Parameters: $\tau_0$ = 66 ms; $\tau_1$ = 8 ms; n = 4 (semisaturation contrasts: 22% at
1.6 Hz, 29% at 3.3 Hz, 45% at 6.5 Hz, 65% at 13 Hz).

Figure 3.7: Dependence of the estimated transfer function of the cell
membrane on contrast. Gain (%) and phase (deg) are plotted against temporal
frequency (Hz). Continuous curves show the transfer function at rest,
dashed lines show the transfer function at 100% contrast. Model parameters
are obtained from Figure 3.6. Arrows indicate decrease in gain (top) and
phase advance (bottom) at the four temporal frequencies in that experiment
(1.6, 3.3, 6.5 and 13 Hz).

The effect of temporal frequency on the contrast responses can be rephrased
in terms of the effect of contrast on the temporal frequency tuning. Increasing
stimulus contrast increased the cells' responsivity to the high temporal frequencies.
This phenomenon is most visible in D, which can be seen as a set of temporal
frequency curves measured at different contrasts. At low contrast the cell was
essentially low-pass. At high contrast the cell was mildly band-pass, with the 6.5
Hz stimulus eliciting 46% stronger responses than the 1.6 Hz stimulus. From the
quality of the fits it is clear that the normalization model captures this behavior.
By contrast, according to the linear model increasing the contrast should just scale
the responses, with no effect on the temporal frequency tuning.
The model ascribes the band-pass tuning observed at high contrasts to the
linear spatiotemporal filter stage. At low contrasts this tuning is counteracted by
the low-pass filtering operated by the membrane. This can be seen directly in
Equation 3.5. Appendix Section 3.5 shows that, as a consequence of the low-pass
properties of the membrane, the quantity $\sigma(\omega)$ grows with the temporal
frequency $\omega$ of the stimulus. At low contrasts $c$, $\sigma(\omega)$ has a strong
effect, considerably scaling down the responses and giving them a low-pass
characteristic. At high contrasts, when $c \gg \sigma(\omega)$, the effect is weaker, so
the temporal frequency tuning of the linear responses is not much affected by the
membrane's low-pass filtering.
The contrast-dependence of temporal frequency tuning predicted by the model
can be understood by observing how an increase in conductance modifies the
frequency response of an RC circuit (Figure 3.7). Increasing stimulus contrast
increases the membrane conductance, decreasing its gain more at low frequencies
than at high frequencies.
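The effect of a conductance increase on an RC circuit can be sketched with the textbook first-order low-pass transfer function. This is an illustration under stated assumptions, not the procedure used to draw Figure 3.7: the capacitance is assumed fixed (so that the time constant is inversely proportional to conductance), its value is arbitrary, and the time constants are the rounded estimates quoted for the cell of Figure 3.6.

```python
import math

def membrane_gain(freq_hz, tau, capacitance=1.0):
    """Gain (injected current to voltage) of an RC membrane.

    With fixed capacitance the resistance is tau / capacitance, and the
    low-pass gain is resistance / sqrt(1 + (2 pi f tau)^2).
    """
    resistance = tau / capacitance
    return resistance / math.sqrt(1.0 + (2 * math.pi * freq_hz * tau) ** 2)

def membrane_phase_deg(freq_hz, tau):
    """Phase lag of the RC membrane, in degrees (negative values are lags)."""
    return -math.degrees(math.atan(2 * math.pi * freq_hz * tau))

tau_rest, tau_full = 0.066, 0.008      # seconds; rounded values from Figure 3.6
freqs = [1.6, 3.3, 6.5, 13.0]          # Hz

# Gain at full contrast relative to gain at rest, per temporal frequency.
gain_ratio = [membrane_gain(f, tau_full) / membrane_gain(f, tau_rest)
              for f in freqs]
```

The relative gain is smallest at the lowest frequency and approaches one as frequency grows, which is the sense in which the conductance increase attenuates low frequencies more than high frequencies.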
Another prediction of the model which is illustrated in Figure 3.7 is related
to phase advance. According to the model the phase advance should depend on
the temporal frequency of the stimulus, being minimal at low and high temporal
frequencies, and maximal at intermediate frequencies. The vertical arrows in the
bottom panel indicate the total phase advance predicted by the model at the four
temporal frequencies tested in the experiment of Figure 3.6. The model predicts
that phase advance would be largest for the 6.5 Hz stimulus (51.9°), marginally
smaller for the 3.3 and 13 Hz stimuli (44.4° and 46.9°), and quite small for the 1.6
Hz stimulus (29.5°). The expression for the total phase advance predicted by the
model is particularly simple:
\[
\text{Phase Advance} = \arctan(2\pi\omega\tau_0) - \arctan(2\pi\omega\tau_1), \tag{3.6}
\]
where $\omega$ is the stimulus temporal frequency, and $\tau_0$ and $\tau_1$ are
respectively the time constant of the membrane at 0 and at 100% contrast. The maximal
phase advance is achieved at a frequency equal to $1/(2\pi\sqrt{\tau_0\tau_1})$.

Figure 3.8: Phase advance and temporal frequency. Continuous curves are
predictions of normalization model. A: Period histograms. Rows correspond
to different contrasts, columns to temporal frequencies. B: Response phase
as a function of contrast. Error bars are ± one standard error of the mean
(N=3). Gray levels indicate the temporal frequency as in A. Fits were
performed on a larger data set than shown, which included 60 stimulus
conditions (5 contrasts, 4 temporal frequencies, 3 spatial frequencies). Cell
392l008, exp. 7. Parameters: $\tau_0$ = 27 ms; $\tau_1$ = 7 ms; n = 1.2 (semisaturation
contrasts: 18% at 1.6 Hz, 20% at 3.3 Hz, 25% at 6.5 Hz, 36% at 13 Hz).

A data set in which the dependence of phase advance on temporal frequency
is evident is illustrated in Figure 3.8. For this cell the model predicts that the
phase advance should be minimal (11.3°) at 1.6 Hz, and increase with temporal
frequency: 20.77° at 3.3 Hz, 31.8° at 6.5 Hz, and 35.7° at 13 Hz. The data clearly
confirm this trend, which was typical of our sample. Indeed, most of the Figures
in this Chapter display data acquired with temporal frequencies around 6 Hz. We
wanted to provide examples of contrast responses showing clear saturation and
clear phase advance. Just as predicted by the model, we found that temporal
frequencies below 3 Hz yielded stronger saturation but little phase advance, while
temporal frequencies much above 6 Hz showed larger phase advances but little
saturation.
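The predicted phase advances quoted for the two cells can be approximately reproduced from the expression for the total phase advance (Equation 3.6), taken here as the difference of two arctangents of 2π times frequency times time constant. The sketch below uses the rounded time constants given in the captions of Figures 3.6 and 3.8, so small differences from the quoted values (under one degree) are to be expected; the function names are hypothetical.

```python
import math

def phase_advance_deg(freq_hz, tau0, tau1):
    """Total phase advance (degrees) between 0% and 100% contrast."""
    w = 2 * math.pi * freq_hz
    return math.degrees(math.atan(w * tau0) - math.atan(w * tau1))

def peak_frequency_hz(tau0, tau1):
    """Frequency at which the phase advance is maximal."""
    return 1.0 / (2 * math.pi * math.sqrt(tau0 * tau1))

freqs = [1.6, 3.3, 6.5, 13.0]  # Hz

# Time constants in seconds, rounded values from the Figure captions.
advance_fig36 = [phase_advance_deg(f, 0.066, 0.008) for f in freqs]
advance_fig38 = [phase_advance_deg(f, 0.027, 0.007) for f in freqs]

peak_fig36 = peak_frequency_hz(0.066, 0.008)
peak_fig38 = peak_frequency_hz(0.027, 0.007)
```

For the first cell the maximum falls close to the 6.5 Hz stimulus, so the advance is largest there; for the second cell the maximum lies near the highest frequency tested, so the advance increases across the tested range.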
The increase in phase advance with increasing temporal frequency can also be
seen as a decrease in integration time, the slope of a line fitted to a phase vs.
temporal frequency plot of the data. A similar phenomenon (together with
dramatic changes in the temporal frequency tuning of the cells) was observed in
cat by Reid et al. (1992) using broadband high-energy stimuli. The authors of
that study pointed out that these behaviors could be explained by an increase in
membrane conductance in cortical cells. The normalization mechanism that we
propose works exactly that way, and indeed we have shown with computer
simulations that it predicts effects similar to those observed by Reid and collaborators
(Carandini and Heeger, 1993).
Quality of the fits. A useful measure of the quality of the fits is the "percentage
of the variance" (see Methods). For our 51 grating matrix data sets this percentage
was below 80% in only 4 sets, between 80% and 90% for another 13 sets, and above
90% for the remaining 34 sets (median: 92.9% of the variance).
The quality of the fits to an entire data set can be evaluated in Figure 3.9. This
data set illustrates the principal properties of the contrast responses: changing
orientation shifts the amplitude responses vertically on a logarithmic scale, and
the phase responses vertically on a linear scale. Amplitude saturation is more
prominent at low temporal frequencies, phase advance is more prominent at high
temporal frequencies. The model accounted for 95.7% of the variance of this data
set; it provided worse fits to 37/51 data sets and better fits to the remaining 13/51
data sets.

Figure 3.9: An example of an entire grating matrix data set. The cell was
tested with three different temporal frequencies (A: 3.27 Hz; B: 6.54 Hz; C:
13 Hz), three different orientations (white: 120°; gray: 80°; black: 40°), and
nine different contrasts. Error bars indicate ± one standard error of the
mean (N=3). Estimates for the variance of this same data set are shown
in Figure 3.1A. Some period histograms for these responses are shown in
Figure 3.3A. The fit accounted for 95.7% of the variance. The shape of all
the 18 curves in the Figure is determined by only 3 parameters: $\tau_0$ = 37 ms;
$\tau_1$ = 9 ms; n = 1.34 (semisaturation contrasts: 22% at 3.3 Hz, 30% at 6.5
Hz, 42% at 13 Hz). Eighteen additional parameters determine the vertical
position of each curve. Cell 392l008, exp. 4.

The vertical positions of the nine amplitude curves and of the nine phase curves
are free parameters of the model, and correspond respectively to the amplitudes
and phases of the responses of the linear stage to each grating at full contrast.
The shape of all the eighteen curves shown in the Figure is determined by
only three parameters. Two of them determine the normalization stage: the time
constant of the membrane at rest $\tau_0$, and the time constant of the membrane
at full contrast $\tau_1$. These parameters determine the saturation of the amplitude
curves and the rising of the phase curves. The remaining parameter describes
the rectification stage: it is the exponent $n$, which grows with the distance of the
resting potential from the firing threshold. This parameter controls the steepness
of the contrast responses before saturation.
The two time constants are the only parameters of the normalization model
which are not parameters of the linear model. We found that they contribute a
substantial improvement in the quality of the fits, as the linear model accounted
for only 84.2% of the variance of our grating matrix data sets, compared to 92.9%
for the normalization model (median values). The reason for this improvement is
easy to see if one considers that the linear model would predict straight lines in the
amplitude plots and horizontal lines in the phase plots, missing the phenomena of
contrast saturation and phase advance.
To take into account the variability of the responses in our evaluation of the
model we performed bootstrap tests of the null hypothesis that the mean of the
probability distribution underlying the neural responses was identical to the
predictions of the model (see Methods). Having observed a difference between the
actual experimental data and the predictions of the model, we evaluated the
probability of observing at least that large a difference if the null hypothesis were true.
This probability is the achieved significance level (ASL) of the test. Based on the
null hypothesis, we would expect that 2-3 out of 51 grating matrix data sets would
achieve a significance level of 5% or less. In fact, 4 of the data sets had an ASL
below 5%. For two data sets the ASL was below 2.5%; for another two it was
between 2.5 and 5%; for nine it was between 5 and 10%; for the remaining 38 it
was above 10%.
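The expected count of 2-3 data sets can be made explicit with a simple binomial calculation. This is only a sketch and not part of the bootstrap procedure described in Methods: it assumes that the 51 data sets are statistically independent and that, under the null hypothesis, each has a 5% chance of reaching an ASL of 5% or less.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return 1.0 - sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))

n_sets, alpha = 51, 0.05

# Expected number of data sets with ASL below 5% under the null hypothesis.
expected_count = n_sets * alpha

# Chance of seeing 4 or more such data sets if the null hypothesis holds.
p_four_or_more = prob_at_least(4, n_sets, alpha)
```

Under these assumptions the probability of observing four or more data sets below 5% is roughly one in four, so the observed count is not unusual.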
3.2.3 Responses to Plaids
We now consider the responses to a wider set of visual stimuli: plaids composed of
two drifting gratings having the same temporal frequency. The gratings differed
in orientation and/or in spatial frequency, and their contrasts $c_1$ and $c_2$ assumed
a variety of different values.
Cells in the cat primary visual cortex display a phenomenon known as
"cross-orientation inhibition" (Morrone et al., 1982; Bonds, 1989; Gizzi et al., 1990), in
which the responses to optimal stimuli are inhibited by the presence of stimuli of
suboptimal orientation, that would elicit negligible responses if presented alone.
More generally, there are numerous reports of conditions in which cells in the cat
visual cortex are inhibited by the presence of a stimulus that is ineffective in driving
them. This inhibition has been found to be independent of direction of motion,
largely independent of orientation and broadly tuned for spatial and temporal
frequency (Bishop et al., 1973; Dean et al., 1980; Burr et al., 1981; Hammond
and MacKay, 1981; Morrone et al., 1982; De Valois and Tootell, 1983; Li and
Creutzfeldt, 1984; De Valois et al., 1985; Kaji and Kawabata, 1985; Gulyas et al.,
1987; Bonds, 1989; Nelson, 1991; DeAngelis et al., 1992; Geisler and Albrecht,
1992). After some debate, there is now consensus that cross-orientation inhibition
can be driven dichoptically (with one grating in each eye), although monoptic
suppression (with both gratings in the same eye) is typically stronger (Ferster,
1981; Ohzawa and Freeman, 1986a,b; Freeman et al., 1987; DeAngelis et al., 1992;
Sengpiel and Blakemore, 1994; Sengpiel et al., 1995; Walker et al., 1996).

Figure 3.10: Masking by an orthogonal grating. Responses to a plaid
experiment in which one component was nearly optimally oriented, and the
other was orthogonal and ineffective in driving the cell when presented alone.
Curves are fits of the normalization model. A: period histograms for different
contrasts of the components. Rows: different contrasts of "grating 1" ($c_1$).
Columns: different contrasts of "grating 2" ($c_2$). When presented alone,
grating 1 elicits strong responses (left column), grating 2 none (top row). As
$c_2$ is increased the responses decrease in size (cross-orientation inhibition).
B: Response amplitude as a function of $c_1$, for different values of $c_2$ (white to
black: 6.25, 12.5, 25, and 50%). Error bars indicate ± one standard error of
the mean (N=3). As $c_2$ increases, the contrast responses shift to the right:
more and more contrast of grating 1 is needed to maintain a set level of
firing. C: Same data, plotted as a function of $c_2$, for different values of $c_1$
(white to black: 0, 6.25, 25, and 50%). Cell 392l024, exp. 9. Parameters:
$\tau_0$ = 158 ms; $\tau_1$ = 5 ms; n = 2.3.

Our results indicate that cross-orientation inhibition is present in most cells of
the monkey primary visual cortex. An example of this is shown in Figure 3.10,
which shows the responses of a simple cell to a plaid whose components drifted in
orthogonal directions. While one of the gratings ("grating 1") was quite effective
in driving the cell (A, left column), the other ("grating 2") elicited almost no
spikes when presented alone (top row). Its presence however clearly suppressed
the responses to the first grating. The inhibitory effect of the second grating can
be observed more precisely in B, which shows the contrast responses of the cell
for four different contrasts of grating 2. As observed by Bonds (1989) in the cat,
the presence of the second grating shifts the contrast response to the right. This
corresponds to a scaling of contrast (Heeger, 1992b).
The fits in Figure 3.10 and in subsequent Figures show that the normalization
model provides a good fit to the plaid responses. Appendix Section 3.6 sketches
the derivation of approximate equations for the amplitude and phase of the first
harmonic response to plaids. In particular the expression for the amplitude is
\[
\mathrm{amplitude}(R) \propto \left[
\frac{\mathrm{amplitude}\left( c_1 L_1(t) + c_2 L_2(t) \right)}
{\sqrt{\sigma(\omega)^2 + c_1^2 + c_2^2}} \right]^n , \tag{3.7}
\]
where $c_1$ and $c_2$ are the contrasts of the two gratings, $L_1(t)$ and $L_2(t)$ are the
sinusoidal responses of the linear weighting function to the individual gratings at unit
contrast, and the remaining symbols have the same meaning as in the expression
for the response to individual gratings (Equation 3.5). Since the spatiotemporal
receptive field is linear, its response to the plaid is just a linear combination
of its responses to the individual gratings, $c_1 L_1(t) + c_2 L_2(t)$. The normalization
stage divides that by a quantity that depends on the activity of a large number
of neurons. For plaids composed of two gratings this quantity is approximately
$\sqrt{\sigma(\omega)^2 + c_1^2 + c_2^2}$ (Appendix Section 3.6). Finally, the rectification
stage is responsible for the exponent $n$ (Figure 3.2A).

Figure 3.11: Responses to the sum of two effective stimuli. Data come
from a plaid experiment in which the component gratings differed in their
spatial frequency. The continuous curves are fits of the normalization model.
A: Period histograms for four different contrasts. Dark gray: responses
to grating 1 (1.2 cpd). White: responses to grating 2 (0.6 cpd). Light
gray: responses to the sums of the stimuli in the top and bottom row. B:
Polar plot of the first harmonic responses, for a variety of grating and plaid
contrasts. See Figure 3.3 for explanation of polar plots. Circles have radius
one standard error of the mean (N=4). Asterisks joined by dashed lines
indicate the vectorial sums of the responses to the individual gratings (dark
gray + white data points). The actual responses to the plaid were smaller
(closer to the origin) and occurred slightly sooner (counter-clockwise) than
this linear prediction. Cell 385r037, exp. 05. Parameters: $\tau_0$ = 45 ms; $\tau_1$ =
2 ms; n = 2.44.

The effect of a mask on the responses of a model cell can be seen directly in
Equation 3.7. If, as in Figure 3.10, grating 2 alone does not elicit any response
($L_2 \approx 0$), then the suppressive effect of the mask is due to its contrast $c_2$
appearing only in the denominator. The effect of an increase of $c_2$ in the denominator
is to shift the contrast response to the right (Heeger, 1992b).
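The rightward shift can be made explicit by setting $L_2 = 0$ in Equation 3.7. In the sketch below $\sigma(\omega)$ denotes the frequency-dependent term in the denominator, and $\sigma_{\mathrm{eff}}$ is a symbol introduced here only for illustration:

\[
\mathrm{amplitude}(R) \propto \left[
\frac{c_1\, \mathrm{amplitude}(L_1)}{\sqrt{\sigma_{\mathrm{eff}}^2 + c_1^2}} \right]^n ,
\qquad
\sigma_{\mathrm{eff}} = \sqrt{\sigma(\omega)^2 + c_2^2} .
\]

The bracketed term depends on $c_1$ only through the ratio $c_1/\sigma_{\mathrm{eff}}$, so raising $c_2$ rescales the contrast axis by the factor $\sigma_{\mathrm{eff}}/\sigma(\omega)$. On a logarithmic contrast axis this is a pure horizontal shift of $\log(\sigma_{\mathrm{eff}}/\sigma(\omega))$, with no change in the shape of the curve.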
The pure rightward shift of the contrast responses occurs only when the cell
is completely unresponsive to the masking grating. When both gratings in the
plaid elicit (even minimal) responses when presented alone, their mutual effect is
more complicated. In that case the sinusoidal responses of the linear stage to the
individual gratings are added together before the normalization stage. Depending
on their relative phase they can add constructively or destructively. An example
of this is shown in Figure 3.11. The top and bottom rows in A show the period
histograms of the responses of a cell to two gratings of different spatial frequency.
Both gratings elicited strong responses, with phases differing by approximately
90°. The responses to the "plaids" obtained by summing the gratings are shown
in the middle row.
The sum of sinusoids is best understood in a polar plot (B), where every sinusoid
corresponds to a vector, and the sum of sinusoids is just a sum of vectors. The
dark gray data points are the responses to grating 1; the white data points are
the responses to grating 2. The light gray data points are the responses to the
"plaid" obtained by superimposing the two gratings. The asterisks joined by the
dashed lines indicate the linear predictions for the plaid responses obtained by
summing (vectorially) the responses to the individual gratings. The actual plaid
responses show more saturation (they remain closer to the origin) than these linear
predictions. They also occur earlier (their angle with the horizontal axis is larger)
than the linear predictions. While far from absolute perfection, the fits of the
normalization model (continuous curves) capture both phenomena. This is because
the local stimulus energy of the plaid is greater than that of the individual gratings.
In the model this results in higher membrane conductance, which causes a decrease
in gain and time constant.
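The correspondence between sinusoids and vectors that underlies the linear prediction can be sketched with complex numbers. The amplitudes and phases below are hypothetical, chosen to resemble two responses about 90° apart; the helper names are not taken from the analysis code.

```python
import cmath
import math

def phasor(amplitude, phase_deg):
    """Represent a sinusoid of given amplitude and phase as a complex vector."""
    return cmath.rect(amplitude, math.radians(phase_deg))

def sinusoid(z, freq_hz, t):
    """Value at time t (s) of the sinusoid represented by the phasor z."""
    return abs(z) * math.cos(2 * math.pi * freq_hz * t + cmath.phase(z))

# Hypothetical first-harmonic responses to the two gratings (sp/s, deg).
r1 = phasor(30.0, 0.0)
r2 = phasor(20.0, 90.0)

# Linear prediction for the plaid: the vector sum of the two responses.
linear_prediction = r1 + r2
prediction_amplitude = abs(linear_prediction)
prediction_phase_deg = math.degrees(cmath.phase(linear_prediction))
```

Adding the two sinusoids point by point in time gives the same waveform as the single sinusoid represented by `linear_prediction`, which is why the asterisks in a polar plot can be obtained by adding vectors.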
Figure 3.12: Masking with a grating that is effective in driving the cell.
Responses to a plaid experiment in which one component was nearly optimally
oriented, and the other was orthogonal but still elicited some response when
presented alone. A: period histograms for different contrasts of the
components. Rows: different contrasts of "grating 1" ($c_1$). Columns: different
contrasts of "grating 2" ($c_2$). When presented alone, grating 1 elicits strong
responses (left column), grating 2 weak responses (top row). B: Response
amplitude as a function of $c_2$, for different values of $c_1$ (white to black: 0,
6.25, 12.5, and 50%). Error bars indicate ± one standard error of the mean
(N=3). Increasing $c_2$ increases the size of the responses when grating 1 is
absent; it inhibits the responses for intermediate contrasts of grating 1, and
it has little effect for high contrasts of grating 1. Cell 392r013, exp. 12.
Parameters: $\tau_0$ = 136 ms; $\tau_1$ = 1.4 ms; n = 2.22.

Figure 3.12 illustrates another example of plaid responses. In this case two
orthogonal gratings were able to drive the cell. Grating 2 was not as effective as
grating 1, but it did elicit some spikes when presented alone. The dependence of the
responses on the contrasts of the gratings is not trivial: depending on the contrast
of Grating 1, increasing the contrast of Grating 2 can either enhance or suppress
the responses. This behavior is predicted by the normalization model, as shown
by the continuous curves fit to the responses. The contrasts of the two gratings,
$c_1$ and $c_2$, appear both in the numerator and in the denominator of Equation 3.7.
Increasing one of the two can result either in an enhancement or in a reduction in
the response, depending on the amplitude and phases of $L_1$ and $L_2$, the responses
of the linear receptive field to the individual gratings.
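The way a second grating can either enhance or suppress the response can be illustrated by evaluating the plaid amplitude expression (Equation 3.7) up to its constant of proportionality. The sketch below treats the linear responses as complex phasors; the phasor values, the denominator term `sigma` and the exponent are hypothetical and are not fits to the cell of Figure 3.12.

```python
import cmath
import math

def plaid_amplitude(c1, c2, lin1, lin2, sigma, n):
    """Plaid response amplitude of Equation 3.7, up to a constant factor.

    lin1, lin2 -- complex phasors for the linear responses to each grating
                  at unit contrast
    sigma      -- denominator term at the stimulus temporal frequency
    """
    linear = abs(c1 * lin1 + c2 * lin2)
    return (linear / math.sqrt(sigma ** 2 + c1 ** 2 + c2 ** 2)) ** n

# Hypothetical values: grating 2 is five times less effective than grating 1.
lin1 = cmath.rect(10.0, 0.0)
lin2 = cmath.rect(2.0, math.radians(90.0))
sigma, n = 0.1, 2.0

# Grating 1 absent: raising c2 increases the response.
alone_low = plaid_amplitude(0.0, 0.0625, lin1, lin2, sigma, n)
alone_high = plaid_amplitude(0.0, 0.5, lin1, lin2, sigma, n)

# Grating 1 at intermediate contrast: adding grating 2 reduces the response.
unmasked = plaid_amplitude(0.125, 0.0, lin1, lin2, sigma, n)
masked = plaid_amplitude(0.125, 0.5, lin1, lin2, sigma, n)
```

With these values `alone_high` exceeds `alone_low` while `masked` is smaller than `unmasked`, reproducing qualitatively the two regimes described above.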
Figure 3.13 illustrates the responses of the same cell to dierent plaids. Panel
A replots the data and ts of Figure 3.12B. Panel B shows the corresponding phase
data and ts, and illustrates how increasing the contrast of either grating resulted
in phase advance. In A and B grating 2 drifted at 90 with respect to grating
1, and it elicited responses that were smaller by about a factor of ve. In C and
D grating 2 was replaced by one drifting at 30 with respect to grating 1, thus
eliciting responses which were only marginally smaller than those to grating 1.
The phases of the responses to the two individual gratings were almost opposite
(D), around 0 for grating 1 and around 135 for grating 2. As a result the two
stimuli interacted destructively, as witnessed by the dip in the diagonal region of
C. In that region increasing the contrast of any of the two gratings reduced the
amplitude of the responses. The model clearly captures this phenomenon, which
is principally due to its linear stage. Indeed, when the phase of grating 2 was
changed by 90 (E,F), this phenomenon disappeared. Now increasing the contrast
of either grating increased the size of the responses.
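The destructive interaction described above can be illustrated with the linear
stage alone. The following is a minimal sketch, not the fitting code used in
this study: it treats L1 and L2 as complex numbers (phasors) and computes the
amplitude of their contrast-weighted sum. The unit amplitudes, the function
names, and the 45° phase assumed for the shifted grating are illustrative
assumptions; the divisive stage of the model is deliberately left out.

```python
import cmath
import math


def phasor(amplitude, phase_deg):
    """Complex number with the given amplitude and phase (in degrees)."""
    return cmath.rect(amplitude, math.radians(phase_deg))


def linear_stage_amplitude(c1, c2, L1, L2):
    """First-harmonic amplitude of a linear stage driven by a plaid.

    c1, c2: contrasts of the two gratings (fractions between 0 and 1).
    L1, L2: complex linear responses to each grating at unit contrast.
    """
    return abs(c1 * L1 + c2 * L2)


# Hypothetical responses: equal amplitudes, phases of 0 and 135 deg as in the
# text. The 45 deg value stands in for grating 2 after a 90 deg phase change
# (an assumption about the direction of the shift).
L1 = phasor(1.0, 0.0)
L2_opposed = phasor(1.0, 135.0)
L2_shifted = phasor(1.0, 45.0)

c1 = 0.5
for c2 in (0.0, 0.125, 0.25, 0.5):
    opposed = linear_stage_amplitude(c1, c2, L1, L2_opposed)
    shifted = linear_stage_amplitude(c1, c2, L1, L2_shifted)
    print(f"c2 = {c2:5.3f}   opposed: {opposed:.3f}   shifted: {shifted:.3f}")
```

With nearly opposite phases the summed amplitude first falls as c2 grows, which
corresponds to the dip along the diagonal; with the shifted phase it grows
monotonically.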
Quality of the fits. As with the grating matrix data sets, the model provided
good fits to our plaid data sets. For only 13 out of 76 data sets did the model
account for less than 70% of the variance. The percentage of the variance accounted
Figure 3.13: Amplitude and phase of the responses of a cell to three different
plaids, for different contrasts c1 and c2 of the two components. Grating
1 was the same in all three experiments. Its orientation was close to optimal.
Spheres connected by dashed lines are actual responses, continuous curves
are fits of the normalization model. Cell 392r013. A,B: Grating 2 orthogonal
to grating 1. Parameters: τ0 = 40 ms; τ1 = 1.3 ms; n = 2.1. C,D: Grating
2 drifted 30° away from grating 1. Parameters: τ0 = 35 ms; τ1 = 1.4 ms;
n = 1. E,F: Same stimuli as C,D, but phase of grating 2 is delayed by 90°.
Parameters: τ0 = 52 ms; τ1 = 1.7 ms; n = 1.1. Cell 392r013, Exps. 12, 9
and 8.
[Panels A, C, E: amplitude (sp/s); panels B, D, F: phase (deg); horizontal
axes: c1 (%) and c2 (%).]
for by the model was between 70% and 80% for 11 sets, between 80% and 90% for
33 sets, and above 90% for the remaining 19 sets (median: 85.5% of the variance).
The model passed the bootstrap statistical test at the 5% significance level for 61
out of 76 plaid data sets. For nine data sets the achieved significance level (ASL)
was below 2.5%; for six it was between 2.5% and 5%; for 11 it was between 5%
and 10%; for the remaining 50 it was above 10%. No systematic difference in the
quality of the fits was found among experiments in which the two components
differed in orientation (35 data sets), those in which they differed in spatial
frequency (28 data sets), and those in which they differed in both attributes (13
data sets).
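For readers who want to reproduce a figure of merit of this kind, the sketch
below computes a percentage of the variance accounted for. It uses the
conventional definition, 100 × (1 − residual sum of squares / total sum of
squares about the mean), which is an assumption here: the measure actually used
in this study is the one defined in the Methods, and it may weight data points
differently. The function name is hypothetical.

```python
def percent_variance_accounted_for(observed, predicted):
    """Conventional percentage of variance explained by a model prediction.

    observed, predicted: equal-length sequences of real or complex responses
    (complex values allow amplitude and phase to be scored together).
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean = sum(observed) / len(observed)
    total = sum(abs(r - mean) ** 2 for r in observed)
    if total == 0:
        raise ValueError("observed responses have zero variance")
    residual = sum(abs(r - p) ** 2 for r, p in zip(observed, predicted))
    return 100.0 * (1.0 - residual / total)


# Made-up numbers, for illustration only.
observed = [2.0, 5.0, 10.0, 20.0]
predicted = [3.0, 5.0, 9.0, 19.0]
print(f"{percent_variance_accounted_for(observed, predicted):.1f}% of the variance")
```

Under this definition a model that predicts only the mean response scores 0%,
and a model worse than the mean scores below zero.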
3.2.4 Responses to Gratings and Noise
We now consider the responses to gratings in the presence of noise. Consistent with
results obtained with static and drifting noise stimuli in the cat (Hammond and
MacKay, 1977; Burr et al., 1981), we found that flickering binary noise elicited few
spikes in simple cells. When presented together with an effective grating stimulus,
however, it provided strong inhibition. This is consistent with the predictions
of the normalization model, since the presence of the noise mask increases the
stimulus energy.
An example of our results is shown in Figure 3.14. In the absence of a grating
stimulus, the noise elicited very few spikes (A, top row). By contrast, the grating
was very effective in driving the cell (left column). Increasing noise contrast
decreased the size of the responses. As illustrated in B, the other major effect of the
noise masks was to reduce response latency. Indeed, increasing the noise contrast
advanced response phase to the point that at high noise contrast (black points),
Figure 3.14: Masking with spatiotemporal white noise. An "optimal"
drifting grating was presented together with two-dimensional flickering binary
noise composed of squares whose size was one fourth of the period of
the grating. Fits of the normalization model (curves) were performed on a
larger data set than shown, which included 72 stimulus conditions (9 grating
contrasts and 8 mask contrasts). A: period histograms (response in sp/s vs.
time in ms) for different grating contrasts (rows) and noise contrasts (columns).
When presented alone, the grating elicited strong responses (left column), the
noise very weak responses (top row). B: Polar plot of the contrast responses
(Cos*Response vs. Sin*Response, sp/s) for three different noise contrasts
(white to black: 0, 18.8, and 50%). Increasing noise contrast decreased
response amplitude and advanced response phase. C: Response amplitude (sp/s)
as a function of grating contrast, for different noise contrasts. Gray
levels as in B. Increasing the noise contrast shifted the contrast responses to
the right. D: Response amplitude (sp/s) as a function of noise contrast. Grating
contrasts (white to black): 3.12, 6.25, 12.5, and 25%. Cell 394l015, exp. 7.
Parameters: τ0 = 101 ms; τ1 = 3.2 ms; n = 1.54.
the phase had reached its maximum, so that grating contrast could have no further
effect on it. The effect of noise on response amplitude was to shift the contrast
responses to the right (C).
As testified by the continuous curves in the Figure, the normalization model
provided good quantitative accounts of these phenomena. The fits were performed
somewhat differently from the fits to the grating and plaid data sets. First, we did
not include the variance of the data in the function to be minimized (see Methods).
This is because the noise greatly increased response variance, and we did not want
high noise contrast data points to be fit less accurately than low noise contrast
points. Second, these fits were performed by using the same equations as with
plaid stimuli, but imposing that the first harmonic of the linear response to the
noise alone be zero. This reduced the number of free parameters in the fits to five:
the amplitude and phase of the first harmonic response of the linear stage to the
grating at full contrast, the time constants τ0 and τ1 of the cell at rest and at full
contrast, and the exponent n of the spike encoding stage.
Quality of the fits. As with the grating and plaid data sets, we found that
the normalization model provided good fits to our noise masking data. For only
2 out of 22 data sets did the model account for less than 70% of the variance
(see Methods). The model accounted for 70%-80% of the variance for 3 sets, for
80%-90% of the variance for 8 sets, and for more than 90% for the remaining 9
sets (median: 89.3% of the variance). The results of the bootstrap statistical tests
were equally encouraging. The model passed the test at the 5% significance level
for 21 out of 22 data sets. The achieved significance level (ASL) was below 10%
for only 3 data sets.
Figure 3.15: Performance of four different models, measured by the percentage
of the variance accounted for in the data. White dots represent
plaid data sets, black dots represent noise masking data sets. The abscissae
plot the performance of the normalization model, the ordinates that of
three other models. A: Linear model. B: "Feed-forward inhibition" model.
C: "Anisotropic model".
[Scatter plots; all axes run from 0 to 98% of the variance. The ordinate of
panel B is labeled "compressive nonlinearity model".]
3.2.5 Comparison with Other Models
Here we compare the quality of the fits obtained with the normalization model with
that of three different models: the linear model, a "relaxed" normalization model,
and an alternative model in which saturation is brought about by feed-forward
inhibition or by a compressive nonlinearity.
This analysis is illustrated in Figure 3.15. The abscissae in the Figure plot the
percentage of the variance accounted for by the normalization model in our plaid
and noise masking data sets. For a plaid experiment, the normalization model is
specified by a total of seven free parameters: four specify the responses of the linear
stage to the individual full-contrast gratings (amplitudes and phases of the first
harmonic responses L1 and L2), two specify the parameters of the RC circuit (the
membrane time constants at rest and at full contrast, τ0 and τ1), and one specifies
the spike encoding stage (the exponent n). For a noise masking experiment the
total number of parameters is five because there is only one grating, and the
amplitude and phase of L2 are set to zero.
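The parameter bookkeeping described above can be made explicit with a small
data structure. This is only a sketch: the class, field and function names are
hypothetical and are not taken from the software used in this study.

```python
from dataclasses import dataclass, fields


@dataclass
class PlaidFitParameters:
    """The seven free parameters listed in the text for a plaid experiment."""
    L1_amplitude: float   # linear response to grating 1 at full contrast
    L1_phase_deg: float
    L2_amplitude: float   # linear response to grating 2 at full contrast
    L2_phase_deg: float
    tau0_ms: float        # membrane time constant at rest
    tau1_ms: float        # membrane time constant at full contrast
    n: float              # exponent of the spike encoding stage


# For noise masking, the first-harmonic linear response to the noise is fixed
# at zero, so these two parameters are no longer free.
FIXED_FOR_NOISE_MASK = {"L2_amplitude": 0.0, "L2_phase_deg": 0.0}


def free_parameter_names(fixed=None):
    """Names of the parameters left free once `fixed` ones are held constant."""
    fixed = fixed or {}
    return [f.name for f in fields(PlaidFitParameters) if f.name not in fixed]


print(len(free_parameter_names()))                      # plaid experiment
print(len(free_parameter_names(FIXED_FOR_NOISE_MASK)))  # noise masking
```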
We first examined the quality of the fits of the linear model. The linear model
is obtained from the normalization model by setting both the time constant at
rest and the time constant at full contrast to zero. In the diagram of Figure 1.1C
(Page 2) this corresponds to setting the membrane capacitance to zero and the
membrane conductance to an arbitrary constant value. The model thus has two
fewer parameters than the normalization model, so it is bound to provide worse
fits. Indeed, we already know the failures of the linear model: it does not predict
amplitude saturation, nor phase advance, nor noise masking, nor any of the other
nonlinearities that we have mentioned in this Chapter. The extent of the difference
in quality of the fits can be taken as a quantitative measure of the importance
of the two extra parameters postulated by the normalization model. As shown
in Figure 3.15A, in most cases the normalization model provided a substantial
improvement over the linear model. For plaid experiments (white), the median
value for the percentage of the variance accounted for by the linear model was
70.5%, as opposed to 85.5% for the normalization model. Similar results were
obtained with the noise masking data sets (black, medians: 61.9% and 89.3%) and
with the grating matrix data sets, which are not included in the Figure (medians:
84.2% of the variance for the linear model; 93.0% for the normalization model).
We then considered an alternative to the normalization model, in which the
linear stage contributes both to the driving current and to the conductance
increase. This would for example be the case if the output of the linear stage was
reflected in a single type of synaptic conductance, rather than in the push-pull
arrangement discussed in Appendix Section 3.4. To visualize this alternative
feed-forward inhibition model, one can imagine modifying Figure 1.1C so that the arrow
denoting the variable conductance originates in the linear stage rather than in the
pooled activity of other cells in the cortex. More precisely, the model is defined
by the same Equations 3.1, 3.2, and 3.3 that define the normalization model, with
Equation 3.4 replaced by g = g0 + k · amplitude(L). Intuitively, this model
postulates that gain control is proportional to the efficacy of a stimulus in driving the
cell. It is equivalent to a model constituted by a linear stage followed by a static
compressive nonlinearity.
This feed-forward inhibition model can be compared on an equal footing with
the normalization model because it has the same number of free parameters. In
the case of the feed-forward inhibition model the time constant at full contrast τ1
of the normalization model has a different meaning: it is the time constant for
the optimal grating at full contrast. Figure 3.15B illustrates how the feed-forward
inhibition model fared in comparison to the normalization model in accounting
for the variance in the plaid data sets. In many cases the normalization model
provided substantially better fits than the feed-forward inhibition model. For plaid
experiments (white) the median value for the percentage of the variance accounted
for by the feed-forward inhibition model was 80.8%, as opposed to 85.5% for the
normalization model. The difference in performance between the two models
is most impressive, however, in the noise masking data sets (black): for these
data sets the median value for the percentage of the variance accounted for by the
feed-forward inhibition model was 67.4% as opposed to 89.3% for the normalization
model. The origin of this difference in performance is quite simple: the feed-forward
inhibition model does not predict that noise would mask the responses of simple
cells. It is clear that the grating matrix data sets would yield similar results: the
feed-forward inhibition model would not account for key aspects of those data sets,
such as the fact that amplitude saturation and phase advance are contrast-driven
and not response-driven.
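Why the feed-forward inhibition model cannot produce noise masking can be seen
directly from its conductance rule, g = g0 + k · amplitude(L). The sketch below
evaluates that rule for a grating alone and for the same grating with a noise
mask whose first-harmonic linear response is zero. All numerical values and
function names are hypothetical, chosen only for illustration.

```python
def linear_response_amplitude(c1, L1, c2=0.0, L2=0j):
    """Amplitude of the linear-stage response to a two-component stimulus."""
    return abs(c1 * L1 + c2 * L2)


def feedforward_conductance(g0, k, linear_amplitude):
    """Conductance rule of the feed-forward inhibition model."""
    return g0 + k * linear_amplitude


g0, k = 1.0, 4.0      # hypothetical resting conductance and gain (arbitrary units)
L_grating = 2.0 + 0j  # hypothetical linear response to the grating at unit contrast
L_noise = 0j          # the noise elicits no first-harmonic linear response

g_alone = feedforward_conductance(
    g0, k, linear_response_amplitude(0.25, L_grating))
g_masked = feedforward_conductance(
    g0, k, linear_response_amplitude(0.25, L_grating, 0.5, L_noise))

print(g_alone, g_masked)  # identical: the mask leaves the conductance unchanged
```

Because the mask does not drive the linear stage, it changes neither the
driving current nor the conductance in this model, so the predicted response to
the grating is the same with and without the mask.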
Finally, we considered an extension of the normalization model, which we may
term an anisotropic normalization model. This model is equivalent to the
normalization model except that it relaxes one of its most stringent constraints, i.e.
that the normalization pool be equally responsive to a broad range of visual
stimuli. The anisotropic normalization model does not require the normalization pool
to respond equally strongly to all orientations and spatial frequencies. Fitting the
plaid responses with this model involves an additional free parameter, which
differentiates the responses of the pool to the two component gratings. This parameter
scales the contrast c2 of the second grating (or of the noise) in the denominator
of Equation 3.7 and in the equation for response phase provided in Appendix
Section 3.6. As illustrated in Figure 3.15C, the anisotropic model provided only a
marginal improvement over the normalization model in the quality of the fits. In
particular, the median value for the percentage of the variance accounted for by
the anisotropic model was 86.9%, only 1.3% better than the normalization model.
This hardly justifies the use of its additional parameter to account for our data.
Figure 3.16: Sensitivity and phase advance for all the data sets in this
study, estimated (at 6.6 Hz) from the fits of the normalization model. White:
grating matrix data sets. Black: noise masking data sets. Gray: plaid
data sets. The latter have been plotted separately (B) to avoid clutter.
"Sensitivity" is one over the contrast needed to obtain half the response to
a 50% contrast grating. "Phase advance" is the difference in the phases of
the responses to 0% contrast and to 50% contrast.
[Abscissae: sensitivity (1/contrast); ordinates: phase advance (deg).]
3.2.6 Cell Population
Given that the model provides a good fit to our data, it can be used to
summarize some properties of the cell responses. Moreover, since its parameters have a
biophysical interpretation, we can use them to gauge the plausibility of the
mechanisms that we have postulated, rectification and shunting inhibition.
Phase advance and sensitivity. Figure 3.16 illustrates the relation between
two characteristics of the contrast responses. The first is the total phase advance
between 0 and 50% contrast. The second is sensitivity, which we define as one over
the contrast needed to obtain a criterion response. We set this criterion to be half
of the response elicited by a 50% contrast grating. Both measures, phase advance
and sensitivity, were derived from the estimated (and when necessary extrapolated)
responses to single gratings drifting at 6.6 Hz.
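The definition of sensitivity given above can be turned into a short
computation. The sketch below is illustrative only: it finds, by bisection, the
contrast that yields half of the response to a 50% contrast grating, for any
nondecreasing contrast response function supplied by the caller. Contrast is
expressed as a fraction, and the two example response functions (a response
proportional to contrast, and a saturating hyperbolic ratio with made-up
parameters) are assumptions, not fits from this study.

```python
def sensitivity(response, c_ref=0.5, tol=1e-9):
    """One over the contrast that yields half of response(c_ref).

    response: nondecreasing function of contrast (fraction between 0 and 1)
              with response(0) below half of response(c_ref).
    """
    target = 0.5 * response(c_ref)
    lo, hi = 0.0, c_ref
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if response(mid) < target:
            lo = mid   # criterion contrast lies above mid
        else:
            hi = mid   # criterion contrast lies at or below mid
    return 1.0 / hi


def proportional(c):
    """Response proportional to contrast (no saturation)."""
    return c


def saturating(c, c50=0.1, n=2.0):
    """Hypothetical hyperbolic-ratio contrast response (made-up parameters)."""
    return c ** n / (c ** n + c50 ** n)


print(f"proportional: {sensitivity(proportional):.2f}")
print(f"saturating:   {sensitivity(saturating):.2f}")
```

A response proportional to contrast reaches the criterion at 25% contrast, so
its sensitivity is 4; a response that saturates early reaches it at a much
lower contrast and therefore has a higher sensitivity.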
The Figure shows that for all three types of data sets in this study sensitivity
and phase advance were positively correlated. This is to be expected from previous
discussions on the behavior of the model. Sensitivity grows with the effectiveness of
the normalization stage, because it depends on the amount of saturation present
in the contrast responses. A cell whose contrast response starts rising at low
contrasts, and saturates early, has high sensitivity. If the contrast response only
rises at high contrasts the cell has low sensitivity. For a linear cell saturation is
absent, and the contrast needed to obtain half the 50% contrast response is at least
25%, yielding sensitivities of 4 or less. Phase advance also grows with the effectiveness
of the normalization stage: the more the time constant is decreased by stimulus
contrast, the more response phase advances. As a result, the position of a data
point in the panels of Figure 3.16 is related to the linearity of the responses. Very
linear responses are on the lower left, and very nonlinear (strongly normalized)
responses are on the upper right. In particular, the ordinate depends on the time
constants at rest τ0 and at full contrast τ1 (and is described approximately by
Equation 3.6), whereas the abscissa depends also on the exponent n.
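The link between time constants and phase advance can be sketched by treating
the membrane as a first-order RC low-pass filter, whose phase lag at temporal
frequency f is arctan(2πfτ). This is an assumption made for illustration: it is
not a transcription of Equation 3.6. Note also that the sketch compares rest
with full contrast, whereas the phase advance plotted in Figure 3.16 is
measured between 0 and 50% contrast, so the value computed here is an upper
bound on that quantity. The time constants used are hypothetical.

```python
import math


def rc_phase_lag_deg(tau_ms, freq_hz):
    """Phase lag (deg) of a first-order low-pass filter with time constant tau."""
    return math.degrees(math.atan(2.0 * math.pi * freq_hz * tau_ms / 1000.0))


def phase_advance_deg(tau0_ms, tau1_ms, freq_hz=6.6):
    """Reduction in phase lag when the time constant falls from tau0 to tau1."""
    return rc_phase_lag_deg(tau0_ms, freq_hz) - rc_phase_lag_deg(tau1_ms, freq_hz)


# Hypothetical time constants: 25 ms at rest, 5 ms at full contrast.
print(f"{phase_advance_deg(25.0, 5.0):.1f} deg at 6.6 Hz")
```

The larger the drop in time constant, the larger the advance; when the two
time constants are equal (the linear model) the advance is zero.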
The contrast responses recorded during noise masking experiments (black)
differed from those recorded in the two other types of experiment in that they tended
to have larger phase advances and lower sensitivities. This phenomenon may be
an effect of adaptation, as discussed in the Methods and in Figure 3.1. Indeed, the
data in the Figure refer to the same physical stimulus, a single drifting grating, and
differ only in the history of stimulation. Adaptation is known to depend both on
the amount of contrast (Sclar et al., 1989) and on the type of stimulus (Movshon
and Lennie, 1979) that were presented in the recent past. It reduces the sensitivity
of the cells, mostly by increasing the contrast at which their contrast responses
saturate (Ohzawa et al., 1985; Sclar et al., 1989).
Figure 3.17: Time constants of V1 simple cells estimated by the normalization
model from grating matrix data sets (A,B; N=50), from plaid data sets (C,D; N=76),
and from noise masking data sets (E,F; N=22). Left: scatter plots of the time
constant at rest τ0 (abscissa, ms) vs. the time constant at full contrast τ1
(ordinate, ms). Time constants below 1 ms are omitted. Dashed line indicates the
identity τ0 = τ1. Continuous line in lower right indicates bound in fitting procedure.
Right: Histograms (number of data sets) of ratios τ1/τ0. These include data sets
missing from left column because both time constants were below 1 ms. Dashed line
indicates bound in the fitting procedure.
Time constants and exponent. The three parameters that define the properties
of model cells are the time constants at rest and at full contrast, τ0 and τ1,
and the exponent n of the spike rate encoding stage.
Figure 3.17 illustrates the range of time constants that we obtained by fitting
all our data sets. The time constant at rest τ0 (abscissae) was constrained to be
between 1 and 1000 ms for grating matrix data sets (A), and between 1 and 250
ms for plaid (C) and noise masking (E) data sets. For grating matrix data sets
its estimated values lie mostly between 10 and 50 ms, with a median value of 25
ms. For plaid data sets the median value was 51 ms. Noise masking experiments
yielded even higher values: if one excludes the 6 (out of 22) data sets for which the
estimated time constant at rest was 0 ms (which we attribute to noisy
measurements), the median value of the time constant at rest was 146 ms. The ratio τ1/τ0
between the time constant at full contrast τ1 (ordinates) and the time constant at
rest τ0 was constrained to be between 0.01 and 1 for grating matrix data sets, and
between 0.03 and 1 for plaid and noise masking data sets. The estimated values
of τ1 are substantially lower than those of τ0, with a median of 4.9 ms for grating
matrix data sets, 5.4 ms for plaid data sets, and 3.8 ms for noise masking data
sets.
For grating matrix data sets the ratio τ1/τ0 was mostly above 0.1 (B), and had
a median value of 0.23, which corresponds to a roughly fourfold increase in conductance.
A value of 1 would correspond to no conductance increase, i.e. to the linear model.
Plaid data sets yielded substantially smaller values for τ1/τ0 (D). The median value
of this ratio in plaid data sets was 0.11, suggesting a roughly tenfold increase in model
conductance. Noise masking data sets (F) yielded even smaller values: excluding
the 6 data sets for which τ0 was 0, the median ratio τ1/τ0 was 0.043, corresponding
to an increase in estimated conductance by a factor of 23. A conductance increase
Figure 3.18: Dependence of the fit quality on the values of the time
constants. Gray levels indicate the percentage of the variance accounted for
by the model for each value of the time constant at rest τ0 (abscissa, ms) and
at full contrast τ1 (ordinate, ms). White circles indicate optimal values.
A: Grating matrix experiment in Figure 3.9. B: Plaid experiment on the same
cell, with orthogonal components. The fits of the model constrain the ratio of
the time constants better than their individual values. Cell 392l008, Exps. 4 and 5.
of this extent is unlikely to be possible in real cells (see Discussion).
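The conductance factors quoted for the three types of data set follow from the
median ratios of time constants. For an RC circuit with fixed capacitance C the
time constant is τ = C/g, so the conductance at full contrast exceeds the
conductance at rest by a factor τ0/τ1. The short check below redoes that
arithmetic; the function name is hypothetical.

```python
def conductance_increase(tau_ratio):
    """Factor g1/g0 implied by a ratio tau1/tau0, assuming tau = C/g, C fixed."""
    if not 0.0 < tau_ratio <= 1.0:
        raise ValueError("tau1/tau0 must lie in (0, 1]")
    return 1.0 / tau_ratio


median_ratios = {"grating matrix": 0.23, "plaid": 0.11, "noise masking": 0.043}
for label, ratio in median_ratios.items():
    print(f"{label}: {conductance_increase(ratio):.1f}-fold increase")
# grating matrix: 4.3-fold increase
# plaid: 9.1-fold increase
# noise masking: 23.3-fold increase
```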
On selected cells we performed an analysis of the dependence of the fit quality
on the time constants. We concluded that the fits constrain the ratio of the time
constants better than their individual values. This analysis is illustrated for two
data sets in Figure 3.18. The percentage of the variance accounted for by the
model is maximal along a diagonal region in the plots, which is what one expects
if the ratio τ1/τ0 is better constrained than the individual time constants.
The regions of high fit quality for the grating experiments tended to be closer to
the diagonal than those for the plaid experiments, a difference that we ascribe to
the cells being in different states of adaptation.
The estimated values of the exponent n were spread between 1 and 4, which
was the region in which they were allowed to vary. Approximately a quarter of
the data sets yielded an n of 1 and another quarter yielded an n of 4. Examples
of spike encoders with exponents equal to 1, 2 and 3 are shown in Figure 3.2A.
The median estimated value was n = 2.37 for grating experiments, n = 2.38 for
plaid experiments and n = 2.03 for noise masking experiments. These values for
the exponent are consistent with the results of Albrecht and Hamilton (1982) and
of Sclar et al. (1990), who fitted the amplitude of the responses with an equation
similar to our Equation 3.5.
3.3 Discussion
To summarize our results with grating matrices, we have found that the
normalization model faithfully predicted the saturation in response size and the decrease
in response latency observed with increasing contrasts. More importantly, it
predicted that the ratio of the sizes and the difference in latency of the responses to
two different spatial frequencies (or orientations) were unaffected by changes in
stimulus contrast. Similarly, the ratio of the sizes and the difference in latency
of the responses to two different contrasts were unaffected by changes in stimulus
spatial frequency (or orientation). The model also predicted the dependence of the
temporal frequency tuning on the contrast of the stimuli. Increasing stimulus
contrast increased the cells' ability to follow high temporal frequencies, and increasing
temporal frequency increased the contrast at which the responses saturated.
From plaid experiments we found that the model correctly predicted that the
effect of adding a "masking" grating (one that would not elicit responses when
presented alone) would be to increase the stimulus contrast needed to obtain a
given response. The model also predicted the more complicated effects of adding
two gratings that were both effective in driving a cell when presented alone.
Because of mutual suppression, the responses to the plaid were smaller and had
shorter latencies than the sum of the responses to the individual gratings.
The model also performed well in fitting data from noise masking experiments.
It predicted that spatiotemporal white noise stimuli that did not elicit many spikes
would act as potent inhibitors of the responses of the cell to other stimuli,
increasing the contrast needed to achieve a criterion response amplitude. Noise masks also
affected the latencies of the responses, with higher noise contrasts yielding
considerably shorter latencies than low noise contrasts. All of these phenomena were
quantitatively accounted for by the model, which ascribes them to the increase in
local stimulus energy produced by a noise mask.
Comparing the quality of the fits with those provided by alternative models
(Section 3.2.5) also supported the normalization model. We first considered the
linear model, and we found that in addition to having qualitative defects it is
also insufficient in quantitative terms. We then considered a model in which the
output of the linear stage results in a synaptic conductance increase rather than
pure current injection in the cells. This model is equivalent to having the linear
stage being followed by a static compressive nonlinearity. We found that it would
not be a valid alternative to gain control. We also considered a model which is
more general than normalization, since it relaxes the assumption that all gratings
elicit the same overall response from the normalization pool. This model is bound
to fit better than the normalization model, because it has one more free parameter.
It is also more realistic because certainly not all gratings are visible to the cortex.
With the stimuli used in this study, however, this extended model did not much
improve the quality of the fits.
3.3.1 Comparison with Geniculate Cells
We have assumed that the responses of LGN neurons are linear functions of the
stimulus contrast distribution. This assumption is a better approximation for the
Parvocellular (P) layers of the LGN than for the Magnocellular (M) layers.
Evidence that the P pathway is substantially linear and that the M pathway is quite
nonlinear is available from studies of the responses of retinal ganglion cells (Benardete
et al., 1992; Lee et al., 1994; Benardete and Kaplan, 1995) and of LGN cells
(Derrington and Lennie, 1984; Sherman et al., 1984; Carandini et al., 1993a; Movshon
et al., 1994). Even though P cells constitute around 90% of the monkey LGN
(Dreher et al., 1976), many simple cells also receive M inputs (Malpeli et al.,
1981). Indeed, while the two streams are segregated in layer 4C (Hubel and Wiesel,
1972; Hendrickson et al., 1978; Blasdel and Lund, 1983), they are not segregated
at all in the upper layers (Lachica et al., 1992; Yoshioka et al., 1994; Nealey and
Maunsell, 1994). In particular, for those neurons that do receive M input, the first
7-10 ms of activation are due exclusively to the M signal (Maunsell and Gibson,
1992).
To assess the importance of LGN nonlinearity, we analyzed data obtained in the
LGN by Movshon, Hawken, Kiorpes, Skoczenski, Tang and O'Keefe (1994), who
recorded from 83 P cells and 27 M cells. They measured the contrast responses
with a drifting grating in the presence and in the absence of a spatiotemporal
white noise mask. We fitted these data with the same model that we used for the
V1 noise masking experiments. We considered sensitivity and phase advance, and
looked at how they were affected by the presence of a mask. Figure 3.19 illustrates
our results. M cells had high sensitivity (A), and displayed prominent phase
advance (B). In the presence of a noise mask their responses were strongly reduced:
sensitivity dropped by a factor of two (C), and phase advance virtually disappeared
(D). P cells had much lower sensitivities than M cells (E), and generally showed
very little phase advance (F). Their responses were not nearly as affected by the
presence of a noise mask as the responses of M cells; sensitivity was unaffected
(G), and phase advance was reduced only in the small portion of cells that did
show some in the first place (H).
These results are consistent with known properties of M and P cells: the
difference in sensitivity between the two types of cell is well established (Kaplan
and Shapley, 1982; Shapley and Perry, 1986), their different phase advance
behavior was mentioned by Derrington and Lennie (1984), and measured by Sherman,
Schumer and Movshon (1984), and the different effect of a noise mask on their
responses has been reported by Movshon et al. (1994).
Figure 3.19: Sensitivity and phase advance in the LGN and effects of noise
masking. Data were recorded by Movshon et al. (1994) and fitted with the
normalization model. "Sensitivity" (left) is one over the contrast needed
to obtain half the response to a 50% contrast grating. "Phase advance"
(right) is the difference in the phases of the responses to 0% contrast and
to 50% contrast. Ordinates plot the number of cells. A,B: Sensitivity and
phase advance in M cells. C,D: same, in the presence of a noise mask. E,F:
Sensitivity and phase advance in P cells. G,H: same, in the presence of a
noise mask. Gratings drifted at 6.6 Hz.
Figure 3.20: Sensitivity and phase advance in the geniculocortical pathway.
Black: V1 simple cells. Gray: LGN M cells. White: LGN P cells. All
cells were tested with noise masking experiments. A: sensitivity (1/contrast)
vs. phase advance (deg). B: same, in the presence of a white noise mask. LGN
data are same as Figure 3.19, V1 data are same as Figure 3.16A. V1 simple cells
are typically less sensitive than M cells, and have sensitivities similar to
those of P cells. They typically exhibit phase advances similar to those of M
cells, and larger than those of P cells.
All together, our observations reinforce the view that P cells are substantially
linear, and that M cells are nonlinear. Many aspects of M cell responses (phase
113
advance, masking, eect of masking on sensitivity and phase advance) suggest that
their nonlinearity might be due to a gain control mechanism. Indeed, based on
similar evidence it has been proposed (Benardete et al., 1992) that M cells may
have a gain control mechanism similar to that observed by Shapley and Victor in
cat retinal X ganglion cells (Shapley and Victor, 1978; Victor, 1987)
In some respects the behavior of M LGN cells is very similar to that of V1
simple cells, with both cell groups showing dramatically decreased sensitivity and
phase advance in the presence of a mask. Figure 3.20 illustrates this more directly.
Panel A shows the distribution of sensitivity vs. total phase advance for LGN
P and M cells, and for V1 simple cells. The vast majority of P cells occupy the
lower left corner, corresponding to substantially linear responses, with no clear
saturation and with little phase advance (cf. Figure 3.19E-H). M cells rather tend
to occupy the upper right region, corresponding very nonlinear contrast responses,
with strong phase advance and strong saturation. Simple cells displayed a wide
range of nonlinearity, with some being as nonlinear as the most nonlinear M cells,
and some being as linear as the most linear P cells. Simple cells were typically less
sensitive than LGN cells exhibiting the same phase advance. For completeness, in
B we show the eect of a noise mask on sensitivity and phase advance.
If some portion of the input to simple cells comes from the M pathway, then
some portion of the nonlinearity that we have ascribed to normalization in V1
must rather originate in the LGN. Could all of the nonlinearities that are present
in simple cells originate from their receiving a preponderant M input? The results
of Figure 3.20 do not rule out this possibility. Other results, however, provide
evidence in favor of a strong gain control mechanism in monkey V1. These results
were provided by Hawken et al. (1992), who measured temporal frequency tunings
at different stimulus contrasts both in the LGN and in V1. The results in V1
were consistent with those described in the present study (Figure 3.6C): increasing
stimulus contrast increased the sensitivity to the high temporal frequencies, with
the average high-cutoff frequency changing from 17 Hz at 8-16% contrast to 27 Hz
at 64% contrast. By contrast, the average increase in high-cutoff frequency of LGN
cells (both M and P) was negligible, suggesting that the origin of this phenomenon
is entirely cortical.
The normalization model is consistent with this change in the high temporal
frequency cut-off in V1 cells. According to the model, increasing contrast decreases
the sensitivity of the membrane more at low temporal frequencies than at high
temporal frequencies (Figure 3.7). As the contrast increases, high temporal frequency responses grow above threshold, becoming visible in the spike responses.
Hawken et al. (1994) reported that there is a significant low-pass filter between
LGN and V1. The normalization model predicts how the gain and time constant
of that filter change with the stimulus contrast.
In the cat, there is abundant evidence that the nonlinearities described in this
Chapter are cortical in origin. Bonds (1989) reported that geniculate cells do not show
any evidence of cross-orientation inhibition, and Morrone et al. (1982) found that
an orthogonal contrast-modulated grating elicits frequency-doubled suppression,
indicating that suppression originates in complex cells or in pools of simple cells.
In addition, Reid et al. (1992) found that high-energy broadband stimulation
decreased the latency of the cortical responses to a much larger degree than would
be possible for geniculate responses.
3.3.2 Composition of the Normalization Pool
We have mentioned numerous studies conducted in cats that provide evidence for
inhibition between visual cortical cells tuned to different stimuli. In the present
study we have shown that this is the case also in simple cells of the macaque
visual cortex, that this inhibition is divisive, and that it obeys the equations of
the normalization model. A point that remains largely unanswered is the precise
composition of the pool providing this inhibition.
First, we have no way to tell whether the pool contains simple cells, complex
cells or both. We found that the inhibition provided by drifting gratings is not
modulated in time, so it could originate either in complex cells or in a number of
simple cells with different spatial phases. Similarly, in a study of the cat visual cortex Burr et al. (1981) found that contrast-reversing standing gratings orthogonal
to the preferred orientation of simple cells provide inhibition with a strong second
harmonic component. This second harmonic component is characteristic of the
responses of complex cells, but could also originate from the summed responses
of many simple cells. Burr, Morrone and Maffei (1981; Maffei, 1985) proposed
that inhibition originates in complex cells because they found that these cells responded to two-dimensional drifting noise patterns that did not elicit spikes in
simple cells. Their argument could in principle be extended to our data: flickering
noise elicited little response in simple cells but was able to provide potent inhibition. We believe, however, that the feedback signal is the sum of the activities
of a very large number of cells. Even if each cell in the pool were to respond to
the noise mask with a mere 0.1 sp/s, if there are 10,000 of them feeding back to
each simple cell, the inhibitory effect of the noise mask would be similar to that
of a stimulus that elicited 100 sp/s in 10 cells. The resulting 1,000 sp/s produced
by the normalization pool would not be far from what one would expect with a
drifting grating.
Second, we do not know the precise overall tuning of the inhibitory pool. The
normalization model proposed by Heeger postulates that the pool is a very large
population of cells, tuned to a broad range of stimulus parameters, so that its total
output is independent of stimulus orientation, and independent of spatial frequency
and temporal frequency over a broad range of frequencies. As we demonstrated in
Section 3.2.5, our data do not allow us to reject this "isotropic" model in favor of
an "anisotropic" one with substantial tuning in the normalization pool, because
doing so would yield only a marginal improvement in the fits.
This isotropy of the normalization pool is consistent with measurements in the
cat visual cortex by DeAngelis et al. (1992), who found that suppression was
essentially independent of orientation. The opposite view has been proposed by
Bonds, who measured inhibition in cats, both in the orientation domain (Bonds,
1989) and in the spatial frequency domain (Bauman and Bonds, 1991). Bonds
argued that his results called for a degree of tuning in the pool of cells providing
inhibition. Modeling studies by Heeger (1992b) and by Nestares and Heeger (1996),
on the other hand, suggest that Bonds's results are consistent with an isotropic
normalization pool, so the precise tuning of the pool remains an open question.
3.3.3 Shunting Inhibition
Shunting inhibition is a widely cited proposal for how neurons might perform
division (Fatt and Katz, 1953; Coombs et al., 1955; Koch and Poggio, 1987). Its
defining property is that it does not introduce any current when the cell is at rest,
thus affecting only the cell's overall conductance. Shunting inhibition is usually
thought to operate through GABA_A synaptic channels, permeable to chloride
ions, because the equilibrium potential of chloride is close to the resting potential
of a typical cell. The idea that there are strong inhibitory circuits in the cortex,
and that these circuits operate through GABA-mediated shunting inhibition, arose
first as a result of a seminal study by Krnjevic and his colleagues (Dreifuss et al.,
1969). They showed that electrical stimulation of the cortical surface produced
very large (up to 300%) increases in membrane conductance, and that similar
effects were obtained by iontophoretic application of GABA. These results were
extended by Rose (1977), who observed that iontophoresing GABA over V1 cells
yields divisive effects on their visual responses.
Is shunting inhibition really the mechanism for normalization? Our results
(Figure 3.17) indicate that this would call for very large conductance increases associated with visual stimulation. The conductance increases estimated from grating
matrix data sets (4-500%) are large but not inconceivable: computer simulations
of a pyramidal cell (Bernander et al., 1991) have demonstrated that membrane
conductance can increase by a factor of 10, due to changes in synaptic inputs.
Conductance increases estimated from plaid and noise masking experiments were
however larger and thus not realistic. Even though our estimates for conductance
increase in the cortex are inflated by the assumption of LGN linearity, it is a
fact that no large conductance increases have been reported in intracellular in
vivo studies (Berman et al., 1991; Ferster and Jagadeesh, 1992). Berman et al.
(1991) measured the conductance of simple cells in the presence of drifting bar
stimuli of different orientations. They did not find any evidence for orientation-specific shunting inhibition. This is consistent with the normalization model, which
postulates that shunting inhibition should not be orientation specific. The normalization model, however, predicts that presenting a stimulus of any orientation would
increase the conductance with respect to rest. Berman et al. reported conductance
increases of less than 20%, less than what the normalization model would have
predicted.
It may thus be that our proposed mechanism for normalization, shunting inhibition, is too simplistic. In future studies we plan to explore an alternative model
that is slightly more complex but perhaps more plausible. According to this alternative model, normalization would act on the spike encoding mechanism rather
than on the whole membrane, causing conductance increases that are localized to
the axon hillock, or other modifications that affect the mechanism that converts
intracellular signals into spike trains. The next Chapter is devoted to a study of
this mechanism. Our main finding is that the simple rectification mechanism that
we have used to describe spike encoding is by itself a bad approximation of that
process, but can become a good model if it is followed by a high-pass filter. In
the Conclusions (Chapter 5), we speculate that this high-pass stage could be a
candidate for the site of normalization.
3.3.4 Feedback Models of the Visual Cortex
For reasons of simplicity we have assumed that the selectivity of our model cells
is determined by feed-forward excitation and inhibition, and that the only role of
intracortical feedback is to provide shunting inhibition. In this we differ from a
number of recent models that consider intracortical feedback crucial in sharpening
the selectivity conferred by the inputs from the lateral geniculate nucleus (Ben-Yishai et al., 1995; Somers et al., 1995; Suarez et al., 1995). While the feed-forward view is supported by recent evidence (Reid and Alonso, 1995; Ferster et al.,
1996), the linear model should not necessarily be identified with a feed-forward
arrangement of inputs. A linear receptive field could, in principle, be constructed
with pure feed-forward connections, pure feedback connections, or a combination
of feed-forward and feedback.
Somers et al. (1995) and Suarez et al. (1995) have proposed nonlinear recurrent models of simple cell responses. According to these models, simple cells
receive a broadly tuned excitatory input from the LGN, which is substantially
sharpened by intracortical excitation from similarly tuned cells and by broadly
tuned intracortical inhibition. A computational analysis of these models, however,
reveals that they would not account for many of the phenomena described in this
Chapter (Carandini and Ringach, 1997). In particular, the recurrent model by
Somers and collaborators ascribes contrast saturation and phase advance entirely
to the LGN input; the relation between the output of their model and its LGN
input is constant, with no saturation nor phase advance. In addition, their model
does not account for masking by gratings and by noise, nor does it predict the
associated phase advances or decreases in integration time. Finally, the recurrent
models make the unlikely prediction that the orientation tuning of the responses
to plaids and to stimuli containing three different gratings should be strikingly different from that of the responses to gratings (Carandini and Ringach, 1997). The
recurrent models may however be more correct than ours in the relative importance they ascribe to geniculocortical excitation versus corticocortical excitation,
and are successful in predicting specific experimental results such as the linearity
of directional selectivity (Suarez et al., 1995), and the effects of pharmacological
manipulations of the strength of inhibition (Somers et al., 1995). A future goal for
our research is to integrate the best aspects of the normalization model and of the
recurrent models, perhaps by postulating a substantial role for cortical feedback
in determining the linear receptive field of V1 simple cells.
3.3.5 Conclusions
While its biophysical foundations are unsure, the normalization model in its RC
circuit implementation is very successful in explaining an extremely wide range of
linear and nonlinear phenomena that have filled the visual neuroscience literature
for more than 30 years.
A limitation of the model is that it is local in space. It was not designed to
account for the strong surround inhibition displayed by many cortical cells (Born
and Tootell, 1991; DeAngelis et al., 1994; and references therein). While surround
suppression could in principle be due to the same mechanism that explains masking, it is not clear that its nature is divisive. Indeed, there is evidence that divisive
gain control is highly spatially selective (DeAngelis et al., 1992). In addition, some
V1 neurons exhibit center-surround phenomena that are significantly more complicated than divisive normalization: for some very specific stimulus configurations,
introducing a stimulus in the surrounding field can facilitate a neuron's response
(Maffei and Fiorentini, 1976; Nelson and Frost, 1985; van Essen et al., 1989; Gilbert
and Wiesel, 1990; Kapadia et al., 1995; Sillito et al., 1995; Gilbert et al., 1996).
Another limitation of the model is that it is local in time: it does not take into
consideration the phenomenon of adaptation. The data sets that we have fitted
were all obtained by randomizing the order of presentations, in hopes of achieving
an average level of adaptation. To some extent, adaptation can be framed within
the context of the normalization model: it can be treated as masking by assuming
that gain control has a long memory (Heeger, 1992a; Poirson et al., 1995). It
is however not clear that the effects of adaptation are entirely similar to those
of masking, and it may very well be that this phenomenon is due to additional
mechanisms beyond those discussed in this Chapter.
Simple cells in V1 have a limited dynamic range, a limit to how strong an
output signal they can generate and, hence, a limit to the range of inputs over
which they can respond differentially. As we have seen (Figures 3.4B and 3.5A),
the ratio of the responses to any two different stimuli is constant, irrespective of
the stimulus contrast, even in the face of response saturation. In addition, gain
control keeps their relative timing constant. These invariances, that we attribute
to normalization, are critical for encoding visual information (e.g., about motion,
orientation, binocular disparity, etc.) independent of contrast.
The issues of gain control and limited dynamic range are, of course, not restricted to V1 neurons. Gain control has been measured and modeled in a variety
of other neural systems, including: turtle photoreceptors (Baylor and Hodgkin,
1974), retinal ganglion cells (Shapley and Victor, 1978), movement detectors in
the fly visual system (Reichardt et al., 1983), the vestibulo-ocular reflex (Lisberger
and Sejnowski, 1992), and velocity-selective neurons in area MT of the primate
cortex (Simoncelli and Heeger, 1996). In particular, Reichardt et al. (1983) addressed the same specific issue of retaining linearity in the presence of gain control
that we encountered in this Chapter, and proposed a recurrent shunting inhibition
scheme not too different from the one we have proposed.
The normalization model of simple cell responses is also analogous to models of
retinal adaptation/normalization (Sperling and Sondhi, 1968; Shapley and Enroth-Cugell, 1984; Grossberg and Todorovic, 1988), in which the stimulus intensity at
a particular point is normalized with respect to the mean stimulus intensity. This
makes the retinal response largely independent of the overall level of illumination,
and allows the brain to proceed to process visual information without having to
attend to the overall light level. Similarly, the V1 normalization mechanism allows the brain to process visual information without having to attend further to
contrast; the perceived orientation or direction of motion of a stimulus is indeed
largely invariant with respect to contrast.
3.4 Appendix: Proposed Biophysics of the Model
When describing the normalization model, we have assumed that the driving current $I_d$ that is input to simple cells is the output of a spatiotemporal linear filter
(Equation 3.1), and that the conductance $g$ of the cell is controlled by the activity
of many other cortical cells (Equation 3.4). This leaves the question of the model's
biophysical substrate open. In particular, given that synaptic currents originate
from synaptic conductance increases, how can the output of such a filter result
in pure current injection, with no effect on the membrane conductance? Conversely, how can the normalization pool affect the membrane conductance without
injecting any current? These are the issues addressed in this appendix.
The spatiotemporal receptive field. We consider a simple feedforward model,
in which the spatiotemporal receptive field of a simple cell is the result of an orderly
arrangement of LGN inputs. An ON subregion results from the sum of an aligned
series of LGN ON-center inputs, while an OFF subregion results from the sum of
an aligned series of OFF-center inputs. Summing appropriately positioned non-oriented LGN receptive fields results in an oriented simple cell receptive field. In
the spatial domain this scheme is the classical one proposed by Hubel and Wiesel
(1962) to explain how orientation selectivity can arise from the integration of inputs
from LGN cells, which are not orientation selective. This scheme can also explain
the direction selectivity of a simple cell. For a linear cell, direction selectivity is
the result of a receptive field which is oriented in space-time. Such a receptive
field can be obtained by summing non-oriented receptive fields that are appropriately
shifted in space and delayed in time.
Push-pull arrangement of inputs. Every region of the receptive field provides
both excitation and inhibition. In ON regions a light increase results in EPSPs and
a light decrease results in IPSPs, while in OFF regions a light decrease results in
EPSPs and a light increase results in IPSPs (Ferster, 1988; Reid and Alonso, 1995).
Since the centers of LGN receptive fields are much stronger than the surrounds, it
is likely that simple cell receptive field subregions originate from the appropriate
arrangement of LGN receptive field centers. For example, an ON subregion of
a simple cell would be the result of excitation from ON-center LGN cells and
inhibition from OFF-center LGN cells.
If the inputs to a simple cell are arranged in push-pull (i.e. the gains of excitation and inhibition are identical) then an increase in excitatory synaptic conductance $g_e$ is accompanied by an equal decrease in inhibitory synaptic conductance $g_i$,
and vice versa. In that case the total conductance is always constant, independent
of the visual stimulus:
$$ g_e(t) + g_i(t) + g_{leak} = g_0, \tag{3.8} $$
where $g_{leak}$ is the leak conductance of the cell and $g_0$ is the conductance of the cell
at rest.
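The constancy expressed by Equation 3.8 can be checked with a minimal numerical sketch. It assumes, purely for illustration, that both conductances share a baseline value and are modulated with equal and opposite gain by a signed drive. The names `push_pull_conductances`, `g_base` and `gain`, and all numerical values, are hypothetical and are not taken from our data.

```python
import math


def push_pull_conductances(drive, g_base=1.0, gain=0.5):
    """Return (g_e, g_i) for a signed drive in [-1, 1].

    An increase in g_e is matched by an equal decrease in g_i (push-pull).
    """
    g_e = g_base + gain * drive
    g_i = g_base - gain * drive
    return g_e, g_i


G_LEAK = 0.2  # illustrative leak conductance, arbitrary units

# Total conductance sampled over one second of a 4 Hz sinusoidal drive.
totals = []
for step in range(200):
    t = step / 200.0
    drive = math.sin(2.0 * math.pi * 4.0 * t)
    g_e, g_i = push_pull_conductances(drive)
    totals.append(g_e + g_i + G_LEAK)  # left-hand side of Equation 3.8
```

Under these assumptions every entry of `totals` equals the same resting value (here 2 * g_base + G_LEAK), whatever the drive.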
While there is evidence for the complementary arrangement of excitation and
inhibition expressed in Equation 3.8 (Heggelund, 1981; Palmer and Davis, 1981;
Ferster, 1988; Ferster and Jagadeesh, 1992; Douglas et al., 1991), there is no evidence for direct geniculocortical inhibition (Watanabe et al., 1966; Garey and
Powell, 1971; Toyama and Takeda, 1974; Toyama et al., 1974, 1977b,a; Ferster
and Lindstrom, 1983; Tanaka, 1983). As a result the IPSPs most likely originate
in cortical interneurons, a fact that we here ignore but that certainly complicates
matters.
[Figure 3.21: circuit diagram. Labels: Retinal image; Other cortical cells; V; C; Vleak, gleak; Ve, ge; Vi, gi; Vshunt, gshunt; Firing rate.]
Figure 3.21: Possible biophysics underlying linearity and normalization in
V1 simple cells. The cell membrane is modeled as a single compartment with
passive properties and three classes of synaptic inputs: excitatory, inhibitory
and shunting. In the central excitatory subregion of the receptive field the
excitation is provided by ON-center cells and the inhibition by OFF-center
cells with superimposed receptive fields. The flanking inhibitory subregions
are obtained by the opposite arrangement of excitation and inhibition (not
shown). This push-pull arrangement of excitation and inhibition ensures the
linearity of the membrane potential $V$. The shunting conductance $g_{shunt}$
grows with the activity of a large number of cortical cells, the normalization
pool. The equilibrium potential for the shunt is $V_{shunt} = V_{rest}$, meaning that
at rest these synapses increase the conductance without injecting current.
The membrane potential is encoded into firing rate by a rectifier. See text
for explanation of symbols.
Integration of the synaptic inputs. We consider a simple single-compartment
model of the simple cell (Figure 3.21) with a leak conductance $g_{leak}$ and three types
of synaptic conductances: excitatory ($g_e$), inhibitory ($g_i$), and shunting ($g_{shunt}$).
The membrane potential of a model cell obeys
$$ -C \frac{dV}{dt} = g_i (V - V_i) + g_e (V - V_e) + g_{shunt} (V - V_{shunt}) + g_{leak} (V - V_{leak}), \tag{3.9} $$
where $C$ is the membrane capacitance, $V_e$, $V_i$ and $V_{shunt}$ are the equilibrium potentials of the synaptic conductances, and $V_{leak}$ determines the leak current.
The above can be rewritten as Equation 3.2 by defining
$$ g = g_e + g_i + g_{shunt} + g_{leak} \tag{3.10} $$
to be the total conductance, and
$$ I_d = g_e V_e + g_i V_i + g_{shunt} V_{shunt} + g_{leak} V_{leak} \tag{3.11} $$
to be the driving current. The driving current depends on the cell's synaptic
inputs, but it is independent of V , the cell's membrane potential. The driving
current could only be measured by voltage-clamping the cell. It is not the actual
synaptic current, which depends on the membrane potential V .
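The equivalence between Equation 3.9 and the form written in terms of total conductance and driving current can be illustrated with a short sketch. Potentials are expressed relative to rest, units are arbitrary, and every numerical value and function name below is an assumption chosen for illustration.

```python
def total_conductance(g_e, g_i, g_shunt, g_leak):
    """Equation 3.10: sum of all conductances."""
    return g_e + g_i + g_shunt + g_leak


def driving_current(g_e, g_i, g_shunt, g_leak, v_e, v_i, v_shunt, v_leak):
    """Equation 3.11: conductances weighted by their equilibrium potentials."""
    return g_e * v_e + g_i * v_i + g_shunt * v_shunt + g_leak * v_leak


def synaptic_sum(v, g_e, g_i, g_shunt, g_leak, v_e, v_i, v_shunt, v_leak):
    """Right-hand side of Equation 3.9, which equals -C dV/dt."""
    return (g_i * (v - v_i) + g_e * (v - v_e)
            + g_shunt * (v - v_shunt) + g_leak * (v - v_leak))


def simulate(c_m, g, i_d, v0=0.0, dt=1e-4, duration=0.5):
    """Forward-Euler integration of C dV/dt = I_d - g V with constant inputs."""
    v = v0
    for _ in range(int(round(duration / dt))):
        v += dt * (i_d - g * v) / c_m
    return v


# Illustrative parameters (arbitrary units, potentials relative to rest).
params = dict(g_e=0.3, g_i=0.1, g_shunt=0.5, g_leak=0.1,
              v_e=70.0, v_i=-10.0, v_shunt=0.0, v_leak=0.0)
g = total_conductance(params["g_e"], params["g_i"],
                      params["g_shunt"], params["g_leak"])
i_d = driving_current(**params)
v_final = simulate(c_m=0.02, g=g, i_d=i_d)
```

With constant inputs the simulated potential settles at I_d / g, so in this sketch a larger shunt conductance scales the response down without changing the driving current.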
Shunting inhibition. The shunting synaptic conductance has the property that
its equilibrium potential $V_{shunt}$ is the same as the resting potential of the cell, $V_{rest}$
(Fatt and Katz, 1953; Coombs et al., 1955; Koch and Poggio, 1987). For ease of
notation we pick this value as the origin of the membrane potential measurements:
$$ V_{shunt} = V_{rest} = 0. \tag{3.12} $$
Uncoupling current and conductance. Since $g_e$ and $g_i$ are in a push-pull
arrangement (Equation 3.8), and the equilibrium potential of the shunt is equal to
the resting potential of the cell (Equation 3.12), driving current and conductance
are decoupled:
- The overall conductance, $g = g_0 + g_{shunt}$, is completely controlled by the shunt
conductance $g_{shunt}$, independent of the excitatory and inhibitory inputs $g_e$
and $g_i$.
- The driving current, $I_d = g_e V_e + g_i V_i + g_{leak} V_{leak}$, is a function of the excitatory
and inhibitory inputs $g_e$ and $g_i$, independent of the shunt conductance $g_{shunt}$.
The expression for the overall conductance (Equation 3.4) can be obtained by
postulating that the shunt conductance $g_{shunt}$ grows with the activity of the normalization pool $\sum R$ as follows:
$$ g_{shunt} = g_0 \left( \frac{1}{\sqrt{1 - k \sum R}} - 1 \right). \tag{3.13} $$
Finally, since the expression for the driving current $I_d$ is linear in $g_e$ and $g_i$, and
the latter (and thus the inputs from the LGN) are presumed to depend linearly on
the local contrast of the stimuli, then the driving current itself is a linear function
of the local contrast of the stimuli, as expressed by Equation 3.1.
3.5 Appendix: Predicted Responses to Gratings
We here derive approximate closed-form equations for the responses of model cells
to drifting sinusoidal gratings. The derivation is based on the assumptions stated
in Equations 3.1, 3.2, 3.3, and 3.4.
Consistent with results obtained in the cat and monkey (Albrecht and Hamilton, 1982; Sclar et al., 1990), we assume the average exponent for the cells in the
normalization pool to be $n = 2$ (Heeger, 1992a). This assumption is justified a
posteriori by our own results (Section 3.2.6). As a result, in the absence of normalization the response of each cell in the normalization pool to a drifting sine grating
is a half-squared sinusoid. We call this rectified and squared linear response the
"unnormalized response". It is given by $\max(0, I_d)^2$.
The receptive fields of adjacent simple cells tend to exhibit either 90° or 180°
phase relationships (Palmer and Davis, 1981; Pollen and Ronner, 1981; Foster
et al., 1983; Liu et al., 1992). We can thus reasonably assume the normalization pool to contain quadruples of cells with the same amplitude response but
with phases 90° apart. For drifting sine grating stimuli, then, the sum of the
unnormalized responses of the four units in each quadruple is constant over time
and is proportional to the square of the stimulus contrast $c$ (Adelson and Bergen,
1985). This follows directly from $\sin^2 + \cos^2 = 1$. The sum of the unnormalized
responses of all the cells in the pool is thus a neural measure of local stimulus
energy: $\sum \max(0, I_d)^2 \propto c^2$.
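The constancy of the summed unnormalized responses can be checked numerically. The sketch below assumes four model units whose linear responses to a drifting grating are sinusoids of equal amplitude with phases 0°, 90°, 180° and 270°. The function names and the unit amplitude are illustrative assumptions.

```python
import math


def half_squared(x):
    """Rectify and square: the unnormalized response max(0, x)^2."""
    return max(0.0, x) ** 2


def quadruple_energy(contrast, theta, amplitude=1.0):
    """Summed unnormalized response of four units with phases 90 degrees apart.

    theta is the instantaneous phase of the drifting grating, in radians.
    """
    phases = (0.0, 0.5 * math.pi, math.pi, 1.5 * math.pi)
    return sum(half_squared(contrast * amplitude * math.sin(theta + p))
               for p in phases)


contrast = 0.5
# Sample the sum at 100 points along one stimulus cycle.
energies = [quadruple_energy(contrast, 2.0 * math.pi * i / 100)
            for i in range(100)]
```

At every point of the cycle two of the four units are silenced by rectification and the other two contribute sin^2 and cos^2, so each entry of `energies` equals the squared contrast (times the squared amplitude).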
If the membrane conductance changes slowly, $dg/dt \approx 0$, it is possible to directly relate two unknowns, the overall response of the pool $\sum R$ and the total
conductance of each cell $g$:
$$ \sum R \propto \sum \max(0, V)^2 \propto \sum \max(0, I_d)^2 / g^2 \propto c^2 / g^2. \tag{3.14} $$
There is another equation relating those unknowns: the definition of $g$ (Equation 3.4). It is thus easy to combine the two to eliminate $\sum R$ and obtain a relation
between conductance and local stimulus energy:
$$ g = \sqrt{g_0^2 + k^2 c^2}, \tag{3.15} $$
where k is proportional to the k in Equation 3.4. This relation is exact only at
steady state, when the conductance is constant in time. We have confirmed with
numerical simulations that the model does reach such a steady state for drifting
grating stimuli.
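As a consistency check on Equation 3.15, the two steady-state relations can be solved numerically and compared with the closed form. The sketch below does not simulate the model dynamics. It assumes that the conductance has the form g = g_0 / sqrt(1 - k * pool), which is what g = g_0 + g_shunt gives when the shunt conductance follows Equation 3.13, and it takes the proportionality in Equation 3.14 to be pool = a c^2 / g^2 with a hypothetical constant `a`. The self-consistent g is then found by bisection. All names and values are illustrative.

```python
import math


def residual(g, c, g0, k, a=1.0):
    """g * sqrt(1 - k * pool) - g0, which is zero at the self-consistent g."""
    pool = a * c ** 2 / g ** 2  # Equation 3.14 with assumed constant a
    return g * math.sqrt(max(0.0, 1.0 - k * pool)) - g0


def steady_state_conductance(c, g0, k, a=1.0, iterations=200):
    """Solve the two steady-state relations for g by bisection."""
    if c == 0.0:
        return g0  # no stimulus: conductance at rest
    lo = math.sqrt(k * a) * abs(c)  # residual is negative at this bound
    hi = lo + g0                    # residual is positive at this bound
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid, c, g0, k, a) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def closed_form(c, g0, k, a=1.0):
    """Equation 3.15, with k^2 there corresponding to k * a here."""
    return math.sqrt(g0 ** 2 + k * a * c ** 2)


contrasts = (0.1, 0.5, 1.0)
numeric = [steady_state_conductance(c, g0=1.0, k=9.0) for c in contrasts]
analytic = [closed_form(c, g0=1.0, k=9.0) for c in contrasts]
```

Under these assumptions `numeric` and `analytic` agree to numerical precision, and both grow with contrast.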
Once in steady state, the conductance does not change in time, and the cell
membrane behaves as a linear system. Stimulation with gratings of contrast $c$ thus
results in sinusoidal membrane potentials $V$. It is easy to show (by taking the
Fourier Transform of both sides of Equation 3.2) that the amplitude and phase of
such a sinusoid are given by
$$ \mathrm{amplitude}(V) = \frac{\mathrm{amplitude}(c L)}{\sqrt{g^2 + (2\pi\omega C)^2}}, \tag{3.16} $$
$$ \mathrm{phase}(V) = \mathrm{phase}(c L) - \mathrm{atan}(2\pi\omega C / g), $$
where $\omega$ is the stimulus temporal frequency (in Hz), and $I_d = c L(t)$ is the output
of the linear stage.
Since the first-harmonic amplitude of a rectified sinusoid raised to a power is proportional to
that power of the sinusoid's amplitude, we can rewrite the previous equations
for the first harmonic of the firing rate $R$ as
$$ \mathrm{amplitude}(R) \propto \left[ \frac{\mathrm{amplitude}(c L)}{\sqrt{g^2 + (2\pi\omega C)^2}} \right]^n, \tag{3.17} $$
$$ \mathrm{phase}(R) = \mathrm{phase}(V). $$
A few rearrangements yield the expressions for the first harmonic responses
of the normalization model to a drifting grating which are used throughout this
Chapter:
$$ \mathrm{amplitude}(R) \propto \left[ \mathrm{amplitude}(L) \frac{c}{\sqrt{\sigma(\omega)^2 + c^2}} \right]^n, \tag{3.18} $$
$$ \mathrm{phase}(R) = \mathrm{phase}(L) - \mathrm{atan}\left( \frac{2\pi\omega\tau_0}{\sqrt{1 + ((\tau_0/\tau_1)^2 - 1) c^2}} \right), \tag{3.19} $$
where
$$ \sigma(\omega)^2 = \frac{1 + (2\pi\omega\tau_0)^2}{(\tau_0/\tau_1)^2 - 1}. \tag{3.20} $$
The stimulus variables are the contrast $c$ and the temporal frequency $\omega$. The model
parameters are: the amplitude and phase of the response $L$ of the linear stage to
the grating at full contrast; the time constant at rest, $\tau_0 = C/g_0$; the time constant
at full contrast, $\tau_1 = C/\sqrt{k^2 + g_0^2}$; the exponent $n$ of the spike encoding stage.
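The closed-form predictions for gratings can be evaluated directly. The sketch below writes the frequency-dependent term of Equation 3.20 as `sigma_sq` = (1 + (2 pi omega tau_0)^2) / ((tau_0 / tau_1)^2 - 1), returns the amplitude only up to the unspecified proportionality constant, and uses parameter values that are purely illustrative. The function and variable names are assumptions.

```python
import math


def sigma_sq(omega, tau0, tau1):
    """Frequency-dependent semisaturation term (Equation 3.20)."""
    return (1.0 + (2.0 * math.pi * omega * tau0) ** 2) / ((tau0 / tau1) ** 2 - 1.0)


def grating_response(c, omega, amp_l, phase_l, tau0, tau1, n):
    """First-harmonic amplitude (up to a constant) and phase of the firing rate.

    c: contrast in [0, 1]; omega: temporal frequency in Hz;
    amp_l, phase_l: amplitude and phase of the linear stage at full contrast;
    tau0, tau1: time constants at rest and at full contrast, in seconds.
    """
    amplitude = (amp_l * c / math.sqrt(sigma_sq(omega, tau0, tau1) + c ** 2)) ** n
    lag = math.atan(2.0 * math.pi * omega * tau0
                    / math.sqrt(1.0 + ((tau0 / tau1) ** 2 - 1.0) * c ** 2))
    return amplitude, phase_l - lag


TAU0, TAU1 = 0.05, 0.01  # illustrative time constants, in seconds
responses = [grating_response(c, omega=4.0, amp_l=1.0, phase_l=0.0,
                              tau0=TAU0, tau1=TAU1, n=2.0)
             for c in (0.1, 0.5, 1.0)]
amps = [r[0] for r in responses]
phases = [r[1] for r in responses]
```

With these values the amplitude grows with contrast and saturates, while the phase lag shrinks (a phase advance). At full contrast the lag reduces to atan(2 pi omega tau_1), as expected from the definition of the time constant at full contrast.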
3.6 Appendix: Predicted Responses to Plaids
The expressions derived in Appendix Section 3.5 for the firing rate of simple cells to
drifting sinusoidal gratings can be approximately extended to stimuli composed of
more than one grating. This Appendix describes the extension to stimuli composed
of two drifting gratings. We will restrict our attention to the case in which the two
gratings have the same temporal frequency $\omega$.
Let $c_1$ and $c_2$ be the contrasts of the two gratings. Let $L_1$ and $L_2$ (sinusoids)
be the responses of the linear weighting function to the individual gratings. The
driving current is just the sum of the linear responses weighted by the contrasts:
$$ I_d(t) = c_1 L_1(t) + c_2 L_2(t). \tag{3.21} $$
The quantity $\sum I_d(t)^2$ is not in general constant in time, since it contains a
component at twice the temporal frequency of the stimulus. Nonetheless, if one
assumes that the relation between population firing rate and membrane conductance described by Figure 3.2B is preceded or followed by a temporal averaging
stage, then the resulting conductance change will be approximately constant in
time. The same result would be obtained by assuming that the normalization
pool is very large so it effectively performs a local spatial average, or by some
combination of spatial and temporal averaging.
If the conductance is approximately constant over time, the same arguments
as in the last Appendix apply, yielding:
$$ \mathrm{amplitude}(R) \propto \left[ \frac{\mathrm{amplitude}(c_1 L_1(t) + c_2 L_2(t))}{\sqrt{\sigma(\omega)^2 + c_1^2 + c_2^2}} \right]^n, \tag{3.22} $$
$$ \mathrm{phase}(R) = \mathrm{phase}(c_1 L_1(t) + c_2 L_2(t)) - \mathrm{atan}\left( \frac{2\pi\omega\tau_0}{\sqrt{1 + ((\tau_0/\tau_1)^2 - 1)(c_1^2 + c_2^2)}} \right), \tag{3.23} $$
where $\sigma$ is defined in Equation 3.20.
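Since both gratings share one temporal frequency, the sum of the two weighted linear responses is itself a sinusoid, and its amplitude and phase can be obtained by representing each linear response as a complex phasor. The sketch below relies on that representation, which is an implementation choice and not part of the derivation above. It also assumes that the phase lag keeps the same 2 pi omega tau_0 numerator as in the single-grating case. Names and values are illustrative.

```python
import cmath
import math


def plaid_response(c1, c2, l1, l2, omega, tau0, tau1, n):
    """First-harmonic amplitude (up to a constant) and phase for two gratings.

    l1, l2: complex phasors for the linear responses to each grating at
    full contrast; c1, c2: contrasts; omega: shared temporal frequency (Hz).
    """
    drive = c1 * l1 + c2 * l2                      # phasor of the driving current
    ratio = (tau0 / tau1) ** 2 - 1.0
    sigma_sq = (1.0 + (2.0 * math.pi * omega * tau0) ** 2) / ratio
    energy = c1 ** 2 + c2 ** 2                     # local stimulus energy
    amplitude = (abs(drive) / math.sqrt(sigma_sq + energy)) ** n
    lag = math.atan(2.0 * math.pi * omega * tau0
                    / math.sqrt(1.0 + ratio * energy))
    return amplitude, cmath.phase(drive) - lag


# A test grating alone, then with a mask that elicits no linear response
# (l2 = 0), as might be the case for an orthogonal grating.
kwargs = dict(l1=1.0 + 0.0j, l2=0.0j, omega=4.0, tau0=0.05, tau1=0.01, n=2.0)
alone = plaid_response(0.25, 0.0, **kwargs)
masked = plaid_response(0.25, 0.5, **kwargs)
```

In this sketch the mask contributes nothing to the driving current but adds to the stimulus energy, so `masked` has a smaller amplitude and a more advanced phase than `alone`.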
Chapter 4
Spike Encoding
Since the fundamental work of Hodgkin and Huxley (1952), a very large body
of data has become available on the mechanisms underlying the generation of
spike trains. In cortical cells, in particular, spike train encoding was found to be
controlled by an array of voltage- and calcium-dependent channels (see Gutnick
and Crill, 1995, for a recent review). Progress in modeling has made it possible
to incorporate the physiological findings into detailed simulations of single cells or
even whole networks (Koch and Segev, 1989; McCormick and Huguenard, 1992;
Bower and Beeman, 1995).
This extended and detailed knowledge is however not easily applied to the
context of systems neuroscience, where attention is largely concentrated on factors
such as the nature of the inputs to a network and the connectivity of different cell
types. In particular, when modeling the responses of cells in the primary visual
cortex one would like to devote the bulk of the model's parameters to factors
such as the visual properties of subcortical inputs, the wiring of these inputs onto
cortical cells and the nature of intracortical feedback. Adding to these a detailed
spike encoding mechanism results in tens of additional free parameters, and in a
heavy computational burden, which make it impossible to fit the model to actual
data (see e.g. Suarez et al. 1995). This suggests a need for a simple and robust
model of the transformation of synaptic currents into spike trains by cortical cells.
In studies of the visual cortex this transformation has been traditionally modeled with a simple stage that instantaneously converts somatic current or membrane potential into a continuous firing rate. Perhaps the simplest of these models
is the rectification model which we have used in the previous Chapters. It postulates that the firing rate is zero for membrane potentials below a threshold, and
grows linearly with the synaptic current above that threshold. Common variations of this model include functions with a smoother transition from rest (Heeger,
1992), and functions that saturate to a maximal firing rate, such as sigmoids.
These models are all static (or memoryless) nonlinearities, i.e. ones whose output
depends only on the present value of their input and not on past history.
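The static models mentioned above can be written as one-line functions of the instantaneous input. In the sketch below the rectifier follows the description in the text, half-squaring is given as one example of a smoother transition from rest, and the logistic function is one arbitrary choice of saturating sigmoid. Thresholds, gains and function names are illustrative assumptions.

```python
import math


def rectifier(x, threshold=0.0, gain=1.0):
    """Zero below threshold, linear growth above it."""
    return gain * max(0.0, x - threshold)


def half_squaring(x, threshold=0.0, gain=1.0):
    """Rectification followed by squaring: a smoother transition from rest."""
    return gain * max(0.0, x - threshold) ** 2


def sigmoid(x, r_max=100.0, midpoint=0.0, slope=1.0):
    """Logistic function saturating at a maximal firing rate r_max."""
    return r_max / (1.0 + math.exp(-slope * (x - midpoint)))


def apply_static(nonlinearity, inputs):
    """Apply a memoryless nonlinearity sample by sample."""
    return [nonlinearity(x) for x in inputs]


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
rates = apply_static(rectifier, inputs)
```

Because each output depends only on the current input sample, reordering the inputs simply reorders the outputs. None of these functions can therefore reproduce a response that changes over time while the input is held constant.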
Rectification and the other static models can be accurate in describing the
steady-state responses of cortical cells, but fail to predict the time-varying responses. There is indeed a large body of literature pointing to a linear or bilinear
steady-state relation between injected current and firing rate, once the current is
above a threshold level (see Stafstrom et al., 1984c, and references therein). In
the primary visual cortex, in particular, the firing rate grows roughly linearly with
injected current (Jagadeesh et al., 1992). When the stimuli are current steps, however, the firing rate of some cortical cells displays prominent adaptation (Connors
et al., 1982). Firing rate thus depends not only on the injected current $I$, but also
on time $t$. In addition, when the stimuli are current ramps, the resulting firing
rate depends on the slope $dI/dt$ of the ramp (Stafstrom et al., 1984c).
The aim of the present study is to gain a general understanding of the spike
encoding properties of cortical cells, and to provide a model of these properties
that lies between the excessive simplicity of the static nonlinearity models and the
complexity of the detailed biophysical descriptions. There are a number of models
that could in principle capture the spike encoding properties of cortical cells while
being described by a limited number of free parameters. Among these models
are variations on the integrate-and-fire scheme (e.g. Knight, 1972 and Getting,
1989), as well as sequences of basic signal-processing blocks such as linear filters
and static nonlinearities (French and Korenberg, 1989; Korenberg et al., 1989).
The model that we advocate in this Chapter belongs to the latter category, and
predicts smooth firing rates rather than spike trains.
We performed intracellular in vitro experiments on slices of the guinea pig visual
cortex. We recorded from regular-spiking cells, which are known to be pyramidal
or spiny stellate cells and to be excitatory (Connors and Gutnick, 1990). We
injected currents of various waveforms (sinusoids, broadband noise and steps), and
analyzed the cells' membrane potential and spike train responses.
In the first part of the Chapter (Section 4.2.1) we report on the spike train
responses. We found that the responses to sinusoidal currents have very different
properties from the responses to broadband noise currents. With sinusoidal currents the spike encoding mechanism acts as a band-pass filter, and the averaged
responses are very nonlinear. The nonlinearities are of two kinds: rectification,
which refers to the absence of response in the negative portion of the stimulus,
and spike synchronization, which refers to the recurrence of spikes at the same
exact points in the stimulus cycle. In response to broadband noise, the cells are
more responsive, and encode all the frequencies between 0.1 and 130 Hz equally
well. In addition, the averaged responses are much more linear than with sinusoidal
currents.
In the second part of the Chapter (Section 4.2.2) we show that the above-mentioned
properties of the spike responses are not present in the underlying
membrane potential traces. Indeed, a very large portion of the variance of the
membrane potential responses can be captured by a simple single-compartment
passive model of the cell, which is a linear low-pass filter.
In the final part of the Chapter (Section 4.2.3) we propose a sandwich model
that accounts quantitatively for the spike responses. It essentially consists of a
static nonlinearity (the rectification stage) sandwiched between two linear
filters (Victor et al., 1977; Korenberg et al., 1989). This model is an extension of
the rectification model, and is similar to one proposed by French and Korenberg
(1989) to describe the transformation of injected currents into spike trains by
cockroach mechanoreceptors. The linear filter that precedes the rectification stage
is low-pass, and is determined by the passive properties of the cell membrane.
The linear filter that follows the rectification stage is high-pass, and presumably
summarizes the effect of voltage- and calcium-dependent conductances. The whole
model is described by 6 parameters.
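The cascade described above (low-pass filter, rectifier, high-pass filter) can be sketched in discrete time as follows. This is only an illustration under stated assumptions: Equations 4.2 and 4.3 are not reproduced in this part of the text, so the threshold-linear rectifier and the particular first-order high-pass form used here, as well as every name and parameter value, are placeholders rather than the model as actually fitted.

```python
def low_pass(x, dt, tau, gain=1.0):
    """First-order low-pass filter, tau * dy/dt = gain * x - y (forward Euler)."""
    y, out = 0.0, []
    for xn in x:
        y += dt / tau * (gain * xn - y)
        out.append(y)
    return out


def sandwich(current_na, dt, r_mohm, tau, v_thresh, k, a, tau_h):
    # Stage 1: passive membrane, current (nA) -> depolarization from rest (mV).
    v = low_pass(current_na, dt, tau, gain=r_mohm)
    # Stage 2: rectifier (assumed threshold-linear form, two parameters).
    r = [k * max(0.0, vn - v_thresh) for vn in v]
    # Stage 3: high-pass filter, built here as the input minus a scaled,
    # low-passed copy of itself (an assumed form, two parameters).
    slow = low_pass(r, dt, tau_h)
    return [rn - a * sn for rn, sn in zip(r, slow)]


dt = 0.0005                        # time step (s)
step = [0.5] * 2000                # 1 s current step of 0.5 nA
rate = sandwich(step, dt, r_mohm=50.0, tau=0.010, v_thresh=10.0,
                k=2.0, a=0.5, tau_h=0.050)
```

With these made-up values the step response starts with a transient and then relaxes to a lower steady level, which is the qualitative signature of adaptation that a static rectifier alone cannot produce.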
In the Discussion we compare our approach to that of other studies, we examine
the possible role of the spike encoding mechanism in shaping the visual responses
of neurons in the primary visual cortex, and we speculate on the possible role of
the high-frequency fluctuations observed in vivo by Jagadeesh et al. (1992).
Portions of this work have been presented as conference abstracts (Carandini
et al., 1994b, 1995).
4.1 Methods
4.1.1 Preparation and Maintenance
Brain slices were prepared from albino or pigmented guinea pigs (150-600 g) that
were deeply anesthetized with pentobarbital (35-70 mg/kg) and then decapitated.
The skull was rapidly opened with rongeurs and the visual cortex was removed
and placed in ice-cold Ringer's solution. The cooled block was affixed to the stage
of a vibratome with cyanoacrylate and 350 µm thick slices were cut. Slices were
individually incubated at room temperature in a Ringer's solution continuously
bubbled with 95% O2 and 5% CO2 until being placed into a recording chamber
(between 1 and 12 hours later) that was maintained at 22-33 °C. The Ringer's solution contained (in mM) 124 NaCl, 5 KCl, 1.2 NaH2PO4, 2.7 CaCl2, 3 MgSO4,
26 NaHCO3, 10 glucose.
Cells were impaled with glass micropipettes filled with 3 M KCl having DC resistances of 70 to 150 MΩ.
Intracellular recordings were performed with a current-clamp recording amplifier (Axon Instruments) utilizing capacitance neutralization.
Current was injected through an active bridge circuit allowing the voltage drop
across the electrode resistance to be subtracted. The electrode was tested in the
solution to make sure it did not introduce substantial rectification or other nonlinearities. Stimulus generation and data acquisition were all performed by computer
through a CED 1401 Plus interface (Cambridge Electronic Design). Injection currents were sampled at 1-4 kHz; voltage traces were sampled at 4 kHz.
The recordings were obtained from neurons in the primary and secondary visual
cortices (Creel and Giolli, 1972; Choudhury, 1978; Wree et al., 1981; Spatz et al.,
1991). Neurons were identified as regular-spiking if their response to current steps
showed spike frequency adaptation, had no tendency to burst and had a definite
threshold for the generation of a single action potential (Connors and Gutnick,
1990).
4.1.2 Stimuli
We used three types of stimuli: (1) Steps, in which the current $I(t)$ stepped from
$I_0$ to $I_1$ and back. (2) Sinusoids, $I(t) = I_0 + I_1 \sin(2\pi f t)$. (3) Broadband noise,
obtained by adding 8 incommensurate sinusoids: $I(t) = I_c \sum_{i=1}^{8} \sin(2\pi f_i t + \phi_i)$.
The frequencies $f_i$ were chosen so that their sums and differences would not coincide
(Victor and Shapley, 1980). The deterministic nature of this broadband signal
makes it particularly useful for system identification purposes (Victor and Knight,
1979).
Experiments contained a sequence of stimuli lasting 2-16 s each, separated by
4 s intervals during which no current was injected. The stimuli in each experiment
were presented in random order to minimize the effect of slow drifts in the quality
of the impalement. The order of the stimuli was recorded, to assess the importance
of such drifts. Each cell was tested with three core experiments:
1. A sequence of steps of different amplitude $I_1$.
2. A sequence of sinusoids of different frequency $f$ and of different baseline
intensity $I_0$ and/or amplitude $I_1$.
3. An experiment consisting of 16 stimuli: 8 broadband noise stimuli, in which
the 8 component sinusoids assumed different relative phases, and 8 sinusoid
stimuli, in which the component sinusoids were presented alone. The values
of the phases $\{\phi_i\}$, as well as the methods for computing the frequency tuning
of the responses to the broadband stimuli, are described by Victor (1988). The
amplitude of the component sinusoids was $I_1$ = 0.25 to 1 nA when presented
alone, and $I_c$ = 0.075 to 0.3 nA when presented with the others. Their ratio
$I_1/I_c$ was between 3 and 4. The RMS intensity of the broadband stimuli was
70-95% that of the sinusoid stimuli. Two sets of frequencies $f_i$ were used
(in Hz): {0.933, 2, 4.133, 8.4, 16.933, 34, 68.133, 136.4} and {0.193, 0.452,
0.968, 2.0, 4.065, 8.193, 16.452, 32.968}. The frequencies in each set are
integer multiples of a fundamental frequency, which is 0.133 Hz for the first
set and 0.0643 Hz for the second set. Each stimulus lasted one period of the
fundamental frequency (7.5 s or 15.5 s).
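The broadband stimulus of item 3 can be sketched as a sum of eight sinusoids locked to a common fundamental. In this illustration the integer multiples are inferred by dividing the listed frequencies of the second set by 1/15.5 Hz, the phases are set to zero as a placeholder (the actual values are those of Victor, 1988, and are not given here), and all identifiers are hypothetical.

```python
import math

PERIOD_S = 15.5                              # stimulus duration, second set
FUNDAMENTAL_HZ = 1.0 / PERIOD_S
MULTIPLES = [3, 7, 15, 31, 63, 127, 255, 511]   # inferred from the listed f_i
PHASES = [0.0] * 8                           # placeholder phases (assumption)


def broadband_current(t, i_c, phases, f0=FUNDAMENTAL_HZ, multiples=MULTIPLES):
    """I(t) = I_c * sum_i sin(2*pi*f_i*t + phi_i), with f_i = m_i * f0."""
    return i_c * sum(math.sin(2.0 * math.pi * m * f0 * t + p)
                     for m, p in zip(multiples, phases))


frequencies_hz = [m * FUNDAMENTAL_HZ for m in MULTIPLES]
sample = broadband_current(1.234, i_c=0.135, phases=PHASES)
```

Because every component frequency is an integer multiple of the fundamental, the waveform repeats exactly after one fundamental period, which is why each stimulus can last a single period.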
4.1.3 Data Analysis
We analyzed two types of responses. The first was the raw membrane potential
response $V(t)$, an analog signal. The second was the spike train response $S(t)$, a
discrete signal which was computed from $V(t)$ offline by detecting the downward
crossings of $V(t)$ with a threshold around -10 mV. For mathematical analysis, the
individual spikes were considered as Dirac delta functions, i.e. as infinitesimal
intervals in which the firing rate was infinite. A spike train would thus have the
form $S(t) = \sum_{j=1}^{N} \delta(t - t_j)$, where $\{t_j\}_{j=1}^{N}$ are the spike times. The Fourier transform
of the spike train at a frequency $f$ is $\hat{S}(f) = (2/T) \sum_{j=1}^{N} \exp(2\pi i f t_j)$, where $T$ is
the stimulus duration.
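The two steps just described, detecting spikes as downward threshold crossings and evaluating the Fourier component of the resulting train of delta functions, might look as follows. This is a sketch, not the analysis code used in the study; the function names and the toy voltage trace are invented for illustration.

```python
import cmath
import math


def spike_times(v_mv, dt, threshold_mv=-10.0):
    """Times (s) at which the trace crosses the threshold going downward."""
    return [n * dt for n in range(1, len(v_mv))
            if v_mv[n - 1] >= threshold_mv > v_mv[n]]


def spike_fourier(times, f_hz, duration_s):
    """S^(f) = (2/T) * sum_j exp(2*pi*i*f*t_j) for a train of delta functions."""
    return (2.0 / duration_s) * sum(cmath.exp(2j * math.pi * f_hz * tj)
                                    for tj in times)


# Toy trace sampled at 4 kHz containing two spike-like excursions.
trace = [-70.0, -70.0, 20.0, -30.0, -70.0, 10.0, -60.0]
times = spike_times(trace, dt=0.00025)
component = spike_fourier(times, f_hz=5.0, duration_s=1.0)
```

The magnitude of `component` gives the response amplitude at `f_hz` in spikes/s and its argument gives the response phase.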
The sandwich model has three stages, defined respectively by Equations 4.1,
4.2 and 4.3 in the Results section. Each stage has two free parameters. The parameters of the first stage (membrane resistance and capacitance) were estimated
by fitting the transfer function of an RC circuit (Equation 4.1) to the first harmonics
of the membrane potential responses. The remaining parameters, two for the
rectifier (Equation 4.2) and two for the high-pass linear filter (Equation 4.3), were
subsequently estimated by a minimization routine that searched for the minimum
square difference between the model predictions and the spike trains. Both the
model predictions and the spike trains were low-pass filtered, usually with a cutoff
at 200 Hz, before computing their difference.
4.2 Results
We recorded from 26 cells in slices of the guinea pig visual cortex. Of these, 9
satisfied our criteria for healthy impalements, were classified as regular-spiking as
described in the Methods section, and were held long enough to be tested with the
core experiments in our paradigm. The average resting potential of these cells was
-70.2 ± 4.7 mV (mean ± s.e.m.); their spike threshold was 31.1 ± 5.3 mV; spike
height was 87.8 ± 6.2 mV. The average spike half-width was 1.1 ± 0.1 ms for the
7 cells recorded at 33 °C, and 2.5 and 3.4 ms respectively in the other two cells,
which were recorded at 22 °C.
In order to illustrate the relation between the responses to different stimuli, we
chose to show data from the same cell in all figures. This cell (19s2) is typical of
our sample, and the properties described in this study were extremely consistent
across cells.
We recorded no spontaneous spikes when searching for cells, and we did not
observe postsynaptic potentials in our intracellular recordings. This complete lack
of spontaneous activity is a fundamental difference between our in vitro preparation
and the normal in vivo conditions. Indeed, our voltage traces (e.g. Figure 4.1) look
very different from those obtained in vivo (e.g. Jagadeesh et al. 1992), in that they
are entirely stimulus-driven, and the only high-frequency fluctuation in membrane
potential is given by the action potentials. As a consequence, the cells in this
study should not be thought of as part of a network, but as single computational
elements.
Figure 4.1: Responses to sinusoidal current injections. The rows correspond to
different frequencies of stimulation ($f$ = 2, 5, 10, 20 Hz). The modulation
amplitude was $I_1$ = 0.6 nA in A and 0.3 nA in B and C. In C the stimulus had a
positive baseline intensity $I_0$ = 0.3 nA (the baseline intensities in A and B were
zero). Spikes are truncated at 0 mV. Cell 19s2, exp. 5.
[Panels A, B, C; abscissa: Time (s), 0 to 0.5; ordinate: Potential (mV), -80 to 0.]
4.2.1 Spike Train Responses
Sinusoidal stimuli. Figure 4.1 shows the responses of a cell to sinusoidal current
injection. The central column shows the responses elicited with a modest stimulus
amplitude (0.3 nA). The left and right columns show respectively the effect of
doubling the amplitude of the stimulus, and of introducing a positive baseline
current (0.3 nA). In both conditions the cell fired more spikes. This is particularly
clear for the 20 Hz stimuli (bottom row), which did not elicit any spikes at low
amplitude, and elicited many spikes at high amplitude or in the presence of a
positive baseline current.
The spike encoding mechanism of the cells exhibited a band-pass character.
For frequencies below a certain cutoff frequency, increasing the stimulus frequency
increased the number of spikes. Frequencies above the cutoff elicited no spikes.
The value of the cutoff depended on stimulus amplitude and baseline intensity. A
typical example of this behavior is shown in Figure 4.2, which plots the amplitude
and phase of the first harmonic component of the spike responses as a function of
stimulus frequency. The different curves correspond to different stimulus amplitudes (A) and baseline intensities (B). The responses to low amplitude sinusoids
in the absence of baseline injected currents are plotted in both Panels (small dark
symbols). In this condition the amplitude of the responses peaked at 5 Hz, and
was zero above 15 Hz. Panel A shows that increasing the stimulus amplitude $I_1$
uncovered strong responses to the 15 and 20 Hz stimuli. Panel B shows that introducing a positive baseline current had a similar effect. This behavior was typical
of all the cells we tested, irrespective of whether the spike responses were measured
by their first harmonic or by the mean firing rate.
Figure 4.2: Frequency tuning of the firing rate responses to sinusoidal
currents for different stimulus amplitudes (A) and baseline intensities (B).
The top panels show the amplitude of the responses, the bottom panels their
phases. The cell acted as a band-pass filter, whose best frequency depended
on the stimulus amplitude and baseline intensity. Firing rates were measured
by computing the first harmonic components of the spike trains. Some of
the responses appear in Figure 4.1. The continuous curves fitting the phase
data are the predictions of a delay. They would become lines if they were
plotted in linear scale. Cell 19s2, exp. 5.
[Axes: Frequency (Hz), 1 to 20; Firing Rate (Spikes/s), 5 to 50; Phase (deg), -45 to 45.
Legend: amplitude 0.6, 0.45, 0.3 nA (A); baseline 0.4, 0.2, 0 nA (B).]
The curves fit to the phase data in Figure 4.2 are the predictions of a simple
model consisting of a delay with an arbitrary phase lead. These curves would be
straight lines if plotted in a linear scale; the phase-vs-frequency plot of the output
of a delay is a line, whose negative slope is equal to the duration of the delay.
We will call the intercept of the fitted line with the ordinate the phase lead of
the system, and the negative slope of the line the integration time of the system.
The phase lead of the spike encoding mechanism was positive, indicating that the
spike responses to low-frequency sinusoids were concentrated on the rising phases
of the stimulus. This phase lead increased with stimulus amplitude or baseline
intensity. The integration time of the spike encoding mechanism did not vary with
the amplitude of the sinusoidal currents. It decreased with their baseline intensity,
a phenomenon that will be further examined in the context of the responses to
broadband noise.
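The delay-plus-phase-lead description amounts to fitting a straight line to phase as a function of frequency on a linear axis. A minimal least-squares sketch follows, assuming phases are expressed in degrees and already unwrapped, so that the slope must be divided by 360 to obtain a delay in seconds. The function name and the synthetic data are illustrative, not measurements from this study.

```python
def fit_delay(freqs_hz, phases_deg):
    """Least-squares line through (frequency, phase).

    Returns (phase_lead_deg, integration_time_s): the intercept at zero
    frequency and the negative slope converted from deg/Hz to seconds.
    """
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_p = sum(phases_deg) / n
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs_hz, phases_deg))
    sxx = sum((f - mean_f) ** 2 for f in freqs_hz)
    slope = sxy / sxx
    intercept = mean_p - slope * mean_f
    return intercept, -slope / 360.0


# Synthetic check: a 45 deg lead combined with a 14 ms delay.
freqs = [1.0, 2.0, 5.0, 10.0, 20.0]
phases = [45.0 - 360.0 * f * 0.014 for f in freqs]
lead_deg, delay_s = fit_delay(freqs, phases)
```

On real data the quality of such a fit depends on how well the phases have been unwrapped, which this sketch does not attempt.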
Broadband noise stimuli. The frequency tuning of the firing rate measured
with broadband noise was very different from that measured with single sinusoids. The cells were more responsive both to the low and to the high frequencies
when they were part of a broadband stimulus than when they appeared as a single sinusoid. A typical example of this behavior is shown in Figure 4.3. While
the frequency tuning of the firing rate gain measured with single sinusoids was
markedly band-pass (black symbols), that measured with broadband stimuli was
flat or mildly high-pass (gray symbols). In other words, all the frequencies present
in the broadband stimuli were represented in the spike train, with no sign of attenuation.
Figure 4.3: Comparison of the frequency tuning of the firing rate responses
to sinusoids and to broadband noise. The top panel shows the gain of the
responses, the bottom panel their phase. Gain was measured by computing the first harmonic of the spike trains at a given component frequency
and dividing the result by the intensity of that component in the stimulus.
Stimulation with broadband noise enhanced the cell's responsivity to the low
frequencies and uncovered strong responses to 33 Hz. The broadband noise
was the sum of 8 sinusoids whose amplitude was 0.135 nA. When injected
alone, the intensity of the sinusoids was 0.45 nA. The continuous curves
fitting the phase data are the predictions of a delay and would become lines
if they were plotted in linear scale. Their negative slope is a measure of
the delay of the responses ("integration time"). Their intercept with the
zero-frequency axis is a measure of the phase by which they lead the current
injections ("phase lead"). Cell 19s2, exp. 4.
[Axes: Frequency (Hz), 0.2 to 50; Firing rate gain (spikes/s/nA), 5 to 50; Phase (deg), -45 to 45.
Legend: Sinusoids, Noise.]
Response phase is another aspect in which the spike train responses to broadband noise differed from the responses to single sinusoids. The slope of the phase
vs. frequency lines was steeper for sinusoids than for broadband noise, indicating
a decrease in the integration time of the spike encoding mechanism. The responses
to broadband noise were advanced by around 10 ms with respect to sinusoids. The
phase lead also decreased, from around 45° for sinusoids to around 10° for broadband noise. This means that while with single low frequency sinusoids the spikes
were concentrated on the rising phases of the stimulus, when these sinusoids were
part of a broadband stimulus the responses were concentrated on the peak of the
stimulus cycle.
The results of the broadband noise experiments were nearly identical in all
the cells of our sample. This is made clear in Figure 4.4. The top panels show
the responsivity of the cells as measured with sinusoids (A) and with broadband
noise (B), as a percentage of the maximal responsivity to broadband noise. At
frequencies below 8 Hz the responsivity of all cells was mildly enhanced by the
broadband stimulation. Between 8 and 20 Hz the responsivity was essentially
the same with broadband noise and with sinusoids. At frequencies above 20 Hz
the responsivity was dramatically enhanced by the broadband stimulus. Panel
C shows how for all cells the integration time was shorter when measured with
broadband noise than when measured with sinusoids: the mean integration times
were respectively 3 ± 1 ms and 14 ± 5 ms. The phase lead also changed substantially
in the two conditions, from 28 ± 7° to 6 ± 3° (D).
Figure 4.4: Summary of the data obtained with sinusoid and broadband
noise currents in all the cells in our sample. The top panels show the frequency tuning of the responses as measured with sinusoidal currents (A)
and with broadband noise currents (B). To facilitate comparison, each cell's
responses were normalized by its maximal response to the broadband stimuli. The responsivity of every cell was enhanced by the broadband stimuli,
except in the range between 10 and 20 Hz, where on average it remained
constant. The bottom panels show the differences in the time course of the
responses to sinusoids and broadband noise. Integration time and phase lead
were obtained from the responses of each cell to sinusoids and broadband
noise, by fitting the phase vs. frequency data with lines, as in the bottom
panel of Figure 4.3. Both the integration time (C) and the phase lead (D)
were smaller in the responses to broadband noise than in the responses to
sinusoids.
[Axes: A, B: Firing rate gain (%) vs. Frequency (Hz); C: Integration time (ms), noise vs.
sinusoids; D: Phase lead (deg), noise vs. sinusoids.]
Nonlinearities. If spike train encoding were a linear system, the spike density in
response to a sinusoidal current injection would modulate sinusoidally. Figure 4.5A
shows that this was far from being the case. The period histograms of the spike
responses to sinusoidal current injection displayed two kinds of nonlinearity (French
et al., 1972; Ascoli et al., 1974). The first nonlinearity was response rectification:
there are spikes only when the sinusoidal currents are positive. This nonlinearity
is to be expected in visual cortical cells, since these cells usually show no spike
activity at rest, and firing rates cannot be negative. The second nonlinearity is
spike synchronization, which means that spikes tend to occur at particular times in
the stimulus cycle. This results in sharp peaks in the spike histograms.
Figure 4.5: Period histograms of the spike responses to sinusoids (A) and
to broadband noise (B). The horizontal lines show the mean firing rates and
the sinusoidal curves show the first harmonic of the responses. A: Responses
to eight different sinusoids. The amplitude of the sinusoids was $I_1$ = 0.45
nA. B: Response to broadband noise obtained by adding the eight sinusoids.
The amplitude of each sinusoid was $I_c$ = 0.135 nA. Each panel shows the
spike rate over the period of one of the component sinusoids. The eight
panels originate from the same spike train, and differ only in the period used
to average the responses. The sinusoid histograms show two nonlinearities:
"rectification" (the spikes do not encode the negative portions of the signal),
and "spike synchronization" (the cell tends to spike at particular stimulus
phases). In the presence of broadband noise both nonlinearities disappear:
histograms are much more sinusoidal ("linearization by noise"). Cell 19s2,
exp. 4.
[Panels, left to right: 0.19, 0.45, 0.97, 2, 4.1, 8.2, 16, 33 Hz; corresponding periods on the
Time (s) axis: 5.18, 2.21, 1.03, 0.5, 0.24, 0.12, 0.06, 0.03; ordinate: Firing Rate (Spikes/s/nA), 0 to 150.]
These nonlinearities are not present in the responses to broadband noise. An
example of this is shown in Figure 4.5B, which plots the period histograms corresponding to the eight frequencies present in the broadband stimulus. The stimulus
was proportional to the sum of the sinusoidal currents of Panel A. The responses
to the broadband stimuli were much more sinusoidal than the responses to the
sinusoids (Panel A), and indeed were well described by their rst harmonic. This
effect is known as linearization by noise (Spekreijse and Oosting, 1970).
The linearizing effect of broadband noise cannot be judged exclusively from
period histograms such as those shown in Figure 4.5. By construction, period
histograms average out any frequency components that are not multiples of the
fundamental frequency of the histogram. The responses to broadband stimuli
could still be very nonlinear and have power at frequencies that are not multiples
of the eight input frequencies; this nonlinearity would not appear in Figure 4.5B. In
addition, histogramming is a form of smoothing, which hides the high-frequency
components of the responses. For example, the third frequency in Figure 4.5
(0.965 Hz) is five times the first frequency (0.193 Hz). The strong component of
the responses at the higher frequency (third panel in B) does not however appear
in the period histogram of the lower frequency (first panel in B), because the
histograms contain 8 bars, and are thus only able to reveal harmonics below the
fourth.
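A period histogram of the kind discussed above can be sketched by folding the spike times onto one cycle of a chosen frequency and counting spikes per bin. The function name, the choice of returning raw counts rather than rates, and the toy spike times are assumptions made for illustration.

```python
def period_histogram(times, f_hz, n_bins=8):
    """Fold spike times onto one cycle of f_hz and count spikes per bin."""
    counts = [0] * n_bins
    for t in times:
        phase = (t * f_hz) % 1.0                   # fraction of a cycle
        counts[min(int(phase * n_bins), n_bins - 1)] += 1
    return counts


# Toy train: one spike per cycle of a 2 Hz stimulus, a quarter cycle in.
toy_times = [0.125, 0.625, 1.125, 1.625]
hist = period_histogram(toy_times, f_hz=2.0)
```

Folding the same spike train at each of the eight component frequencies in turn gives the eight panels of a figure such as Figure 4.5B; the number of bins sets the highest harmonic the histogram can resolve.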
To assess more precisely the degree to which broadband noise linearized the
spike responses, we analyzed their spectral composition. The responses of a nonlinear cell would contain sinusoids of frequencies not present in the stimulus. These
additional frequencies would be expected to include the sums and/or differences
of some of the frequencies present in the stimulus (Victor and Shapley, 1980).
For a single sinusoid stimulus, these correspond to the zeroth and second harmonics. Consistent with the nonlinearity observed in the period histograms, the
responses to sinusoidal currents showed substantial power at the second harmonic:
on average the peak power of the first harmonic responses to sinusoids was only
1.2 ± 0.3 times larger than that of the second harmonic responses. The broadband
stimuli contained eight frequencies, yielding 64 possible sums and differences of
frequencies. The Fourier component of the responses at these sum and difference
frequencies is called the "second order kernel" of the responses (Victor et al., 1977),
and is a measurement to which we will return when evaluating the predictions of
the sandwich model. We found that on average the peak power of the responses
at the stimulus frequencies was 2.4 ± 0.8 times larger than the peak power of the
second order kernel, confirming the linearizing effects of broadband noise on the
spike responses.
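One way to arrive at the count of 64 is to take the 36 pairwise sums (including each frequency with itself, i.e. the second harmonics) together with the 28 nonzero pairwise differences. The sketch below enumerates them in units of the fundamental frequency, using integer multiples inferred from the second frequency set listed in the Methods; both the counting convention and the multiples are assumptions, not statements from the original analysis.

```python
from itertools import combinations, combinations_with_replacement

# Component frequencies in units of the fundamental (inferred, second set).
multiples = [3, 7, 15, 31, 63, 127, 255, 511]

sums = [a + b for a, b in combinations_with_replacement(multiples, 2)]
differences = [b - a for a, b in combinations(multiples, 2)]
second_order = sums + differences

n_terms = len(second_order)
all_distinct = len(set(second_order)) == n_terms
clear_of_inputs = set(second_order).isdisjoint(multiples)
```

For these multiples the sum and difference frequencies are all distinct and none falls on an input frequency, which is the property that lets first-order and second-order responses be read off separately from the spectrum.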
4.2.2 Membrane Potential Responses
The simplest possible model of the transformation of injected currents into membrane potentials is a single-compartment passive model of the membrane composed
of a capacitance and a resistance in parallel (RC model). There are some indications
that this model may be at least partially successful in describing the subthreshold
responses of cortical neurons. For example Stafstrom et al. (1984b) reported an
approximately linear current-voltage relation for the membrane potentials of cortical cells below threshold. The passive model is however clearly incomplete above
threshold, where the membrane potential responses exhibit a variety of nonlinearities, which of course include the spikes themselves. These phenomena have been
extensively studied, and a large body of knowledge is now available that describes
the passive and active properties of these cells (Connors et al., 1982; Stafstrom
et al., 1984a,b,c; McCormick et al., 1985; Schwindt et al., 1988a,b,c; Connors and
Gutnick, 1990; Douglas and Martin, 1990; Lorenzon and Foehring, 1992; Gutfreund
et al., 1995).
We were interested in modeling the transformation of injected currents into
membrane potentials with as few free parameters as possible, so that we could use
such a model as the first stage of a model of the spike train responses to injected
currents. Being described by just two parameters, the RC model of the membrane
was an ideal candidate for this first stage. We therefore set out to measure the
discrepancy between its output and the membrane potential responses of our cells.
We were somewhat surprised to find that in most respects this simple linear model
provided an acceptable first approximation to the membrane potential responses,
even above threshold.
Sinusoidal stimuli. The membrane potential responses of our cells were largely
consistent with the output of a linear filter. This substantial linearity can be
assessed by observing the degree to which the responses to sinusoidal currents were
sinusoidal. Formally, this was done by investigating their spectral composition and
comparing the size of their first harmonic component with that of all the other
components. Figure 4.6 illustrates this analysis. Panel A shows one period of the
response of a cell to a sinusoidal current. Panel B shows the decomposition of
the response into a first-harmonic component and a residual response, which is
essentially constituted by the action potentials. Panel C plots the power of the
different frequencies in the response. The spikes are fast events that contribute
very little power: their presence does not strongly affect the spectral composition
of the responses. The first harmonic clearly dominated the response. The power of
the first harmonic was between 9 and 141 times the power of the second harmonic.
The Total Harmonic Distortion (the total power at the harmonics higher than
the first) was 3.8% (median) of the power at the first harmonic.
Figure 4.6: Predominance of the first harmonic of the membrane potential
response to sinusoidal current. A: the first period of a response to sinusoidal
current injection at a frequency $f$ = 5 Hz. The baseline intensity was $I_0$ = 0
and the amplitude was $I_1$ = 0.45 nA. B: decomposition of the response into
first harmonic and residual (actual response minus first harmonic). Besides
the spikes the residual traces are substantially flat. C: Squared amplitude
of the Fourier Transform of the response. The first harmonic ($f$ = 5 Hz) is
by far the largest frequency component in the response. Cell 19s2, exp. 5.
[Axes: A, B: Potential (mV), -100 to 0, vs. Time (s), 0 to 0.2; C: Power (mV^2), 0.01 to 100,
vs. Frequency (Hz), 1 to 500.]
We performed the same data analysis on the membrane potential responses as
on the spike train responses, to characterize the relation between the two types
of response. The frequency response of the membrane potential responses, shown
in Figure 4.7, was very different from that of the firing rate (Figure 4.2). The
membrane clearly acted as a low-pass filter, with a corner frequency around 10 Hz.
The curves shown in the Figure are the predictions of the RC model, the single-compartment
passive model of the membrane. The frequency tuning predicted by
the RC model is
$$\hat{V}(f) = \frac{\hat{I}(f)\,R}{1 + 2\pi i f \tau}, \qquad (4.1)$$
where $\hat{V}$ and $\hat{I}$ are the Fourier transforms of the membrane potential $V$ and of
the stimulus intensity $I$. The variable $f$ is the frequency of the stimulus, and the
parameters $R$ and $\tau$ are respectively the membrane resistance and time constant.
The model was fit to the amplitude and phase of all the responses shown in the
Figure, and all the curves that appear in the Figure were determined by the same
two parameters, $R$ and $\tau$. The fits of the RC model were generally satisfactory. On
average, the fits accounted for 80% of the variance of the first harmonic data. For
comparison, the fit in the Figure captures 87% of the variance. The parameters of
the fits to all the cells in our sample are listed in Table 4.1.
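Equation 4.1 can be evaluated directly as a complex impedance. The sketch below does so using the session-mean values of R and the time constant listed for cell 19s2 in Table 4.1; those means need not equal the parameters of the particular fit shown in Figure 4.7, and the function and variable names are illustrative.

```python
import cmath
import math


def rc_transfer(f_hz, r_mohm, tau_s):
    """Equation 4.1 as an impedance: V^(f) / I^(f), in mV/nA (i.e. MOhm)."""
    return r_mohm / (1.0 + 2j * math.pi * f_hz * tau_s)


R_MOHM, TAU_S = 58.3, 0.0093       # cell 19s2, Table 4.1 (session means)

# At the corner frequency 1 / (2*pi*tau) the gain has dropped by a factor
# of sqrt(2) and the response lags the current by 45 degrees.
f_corner = 1.0 / (2.0 * math.pi * TAU_S)
z = rc_transfer(f_corner, R_MOHM, TAU_S)
gain_mv_per_na = abs(z)
phase_deg = math.degrees(cmath.phase(z))
```

Fitting the model to data then means choosing the two parameters so that `rc_transfer` best matches the measured first-harmonic amplitudes and phases across frequencies.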
Figure 4.7: Frequency tuning of the membrane potential responses to sinusoidal
currents for different stimulus amplitudes (A) and baseline intensities
(B). The top panels show the amplitude of the responses, the bottom panels
their phases. Responses were measured by computing the first harmonic
components of the membrane potential traces. The cell acted as a low-pass
filter. By contrast, the firing rate responses of the same cell to the same
stimuli were more band-pass (Figure 4.2). The dashed curves show the
predictions of the "RC circuit", a single-compartment passive model of the
membrane, composed of a resistor and a capacitor in parallel (Equation 4.1).
Cell 19s2, exp. 5.
[Axes: Frequency (Hz), 1 to 50; Potential (mV), 10 to 40; Phase (deg), -90 to 0.
Legend: amplitude 0.6, 0.45, 0.3 nA (A); baseline 0.4, 0.2, 0 nA (B).]
Table 4.1: General properties of the cells in our sample. $V_{rest}$ is the resting
potential of the cells. Input resistance ($R$) and time constant ($\tau$) were measured
by fitting the frequency tuning of a single-compartment passive model
of the cell membrane (Equation 4.1) to the first harmonic of the membrane
potentials obtained with sinusoidal stimulation. The values in the table are
the means (± s.d.) of different measurements made during the course of a
recording session. The membrane resistance, time constant and resting potential
varied substantially during a recording session, presumably an effect
of the decaying quality of the impalements.
Cell    Layer     Vrest (mV)       R (MΩ)         τ (ms)
02s2    ?         -56.6            28.1           8.5
02s3    ?         -53.0 ± 1.6      48.4 ± 7.8     6.7 ± 1.3
05s1    III       -77.0 ± 1.4      34.5 ± 2.4     8.0 ± 1.3
15s1    II/III    -95.8 ± 3.8      109 ± 14       5.6 ± 0.6
16s1    IV/V      -70.9 ± 1.4      71.3 ± 16      17.6 ± 3.1
18s1    II/III    -76.0 ± 1.3      40.9 ± 7.9     7.1 ± 1.4
19s1    III/IV    -78.6 ± 0.1      52.3 ± 12      9.5 ± 2.4
19s2    IV/V      -70.7 ± 1.4      58.3 ± 3.5     9.3 ± 0.5
20s1    IV/V      -53.6 ± 0.5      73.4 ± 9.0     8.9 ± 1.6
Figure 4.7 illustrates some deviations from perfect linearity in the membrane
potential responses. Indeed, if the cells were perfectly linear, changing the baseline intensity of the stimuli would not affect the amplitude of their first harmonic
responses. In that case, the data points that have the same gray level (stimulus
amplitude) but different sizes (baseline intensities) would coincide. Instead, Figure 4.7B shows that the frequency tuning did show a mild dependence on the baseline intensity. In addition, if the cells were perfectly linear, neither the amplitude
nor the baseline intensity of the stimuli would affect the phase of the responses.
Instead, the phase data in the Figure do show a dependence on stimulus amplitude
and baseline intensity.
For some stimulus conditions the membrane potential responses of some cells
displayed mildly band-pass transfer functions, peaking at 4-8 Hz. An example of
such a transfer function is shown in Figure 4.8 (gray symbols). In this case,
the amplitude of the responses was higher at 8 Hz than at lower frequencies, and
the phases led the predictions of the RC model. This band-pass behavior was
overall very mild, and was consistent with the recent results of Gutfreund et al.
(1995), who in a similar preparation found subthreshold oscillations and resonant
frequencies in the 3-20 Hz range. Similar resonant behaviors have been observed
in other types of cortical cells (Silva et al., 1991; Llinas et al., 1991). Accounting
for this aspect of the frequency tuning of the membrane potential would require a
more sophisticated model than a single RC circuit, for example one that included
voltage-dependent conductances that would act as phenomenological inductances
(Cole and Baker, 1941; Koch, 1984; Gutfreund et al., 1995).
Broadband noise stimuli. Unlike the spike train responses, the membrane potential responses to broadband currents were closely predictable from the responses to sinusoids. This is illustrated in Figure 4.8: the membrane potential had essentially the same frequency tuning whether it was measured with sinusoids (black symbols) or with broadband noise (gray symbols). This behavior is another piece of evidence for the partial linearity of the membrane potential responses. Consistent with this linearity, most of the power of the membrane potential responses to the broadband stimuli was concentrated at the eight frequencies that composed the stimuli. The peak power of the responses at the component frequencies was between 11 and 426 (mean: 133) times larger than the peak power of the responses at the sums and differences of the component frequencies.

Figure 4.8: Comparison of the frequency tuning of the membrane potential responses to sinusoids and to broadband noise. The top panel shows the gain of the responses (the membrane impedance), the bottom panel their phase. Gain was measured by computing the first harmonic of the membrane potential responses at a given component frequency and dividing the result by the intensity of that component in the stimulus. The similarity of the tuning to sinusoids and broadband noise is consistent with the cell operating as a linear system. By contrast, the firing rate responses of the same cell to the same stimuli show a substantial difference between the sinusoid and broadband noise conditions (Figure 4.3). The dashed curves indicate the fit of the single-compartment passive model of the membrane (RC circuit) to the sinusoid data. Cell 19s2, exp. 4. [Axes: gain (mV/nA) and phase (deg) versus frequency (Hz).]

To ascertain whether the equivalence of the membrane potential responses to sinusoids and to broadband noise was shared by all our cells, we fit the first harmonic responses to broadband noise and sinusoids independently using Equation 4.1 and compared the parameters obtained in the two conditions. The result is illustrated in Figure 4.9: neither the resistance nor the time constant changed substantially between the two stimulus conditions. In some cells, however, the sinusoid measurements appeared more erratic, and the quality of the fits by Equation 4.1 was lower than that of the fits to the broadband noise measurements. On average, the RMS error in the fits to the broadband noise measurements was 72% of the RMS error in the fits to the sinusoid measurements. This occasional discrepancy may be due to the higher intensities used in the sinusoid stimuli, which may be activating strong active conductances.
4.2.3 The Sandwich Model
To account for the transformation of injected currents into firing rates, we have used a simple "sandwich" model. Sandwich models are composed of a static nonlinearity sandwiched between two linear filters, and have been used to model a variety of neural systems (Victor et al., 1977; Victor, 1988; French and Korenberg, 1989; Korenberg et al., 1989). Figure 4.10A illustrates the structure of the model.

Figure 4.9: Summary of the passive properties of the membrane, as measured with sinusoids (abscissae) and with broadband noise (ordinates). A: membrane resistance. B: membrane time constant. The values were obtained by fitting the broadband noise responses and the sinusoid responses with the predictions of a single-compartment passive model of the membrane. Each data point corresponds to one experiment in which sinusoid and broadband stimuli were randomly interleaved. Four of the nine cells in our sample were tested with two different sets of frequencies, yielding a total of 13 data points. The substantial identity of the fitted values suggests that the cell encodes the input currents into membrane potentials in a linear fashion. [Axes: A, resistance (MOhm), noise versus sinusoids; B, time constant (ms), noise versus sinusoids.]

Figure 4.10: The sandwich model. A: structure of the model. The first stage is a low-pass filter (the RC model of the membrane). Its inputs are the injected currents, and its outputs are the passive membrane potential responses. The second stage is rectification. Its output is the amount of activation of the cell. The third stage is a high-pass linear filter. Its output is fed to a (parameter-free) rectification stage, which ensures that the predicted firing rates are positive. B-E: output of the different stages for different input currents: a low-frequency sinusoid (B), a high-frequency sinusoid (C), the sum of the two (D), and a step (E). The model parameters used in this Figure are the median values of the parameters estimated for our population (Table 4.2). [Axes in A: gain (mV/nA or %) and phase (deg) versus frequency (Hz); in B-E: current (nA), potential (mV) and rate (spikes/s) versus time (s).]

The first stage of the model is a low-pass linear filter, the single-compartment passive model of the membrane. Its input is the injected current $I(t)$ and its output is a linear prediction of the membrane potential $V(t)$. This stage is described (in the frequency domain) by Equation 4.1, which is fully determined by two parameters: the membrane resistance R and the time constant $\tau$.
The second stage is a static nonlinearity: a rectifier with threshold $V_T$. Its output m is given by:

\[ m(t) = G \max\bigl(0,\; V(t) - V_T\bigr). \tag{4.2} \]

This stage has two free parameters: the threshold $V_T$ (in mV), and the gain G (in spikes/s/mV).
The third stage is a high-pass linear filter. Its output F is described in the Fourier domain by

\[ \hat{F}(f) = \hat{m}(f) \left( 1 - \frac{g_H}{1 + 2\pi i f \tau_H} \right), \tag{4.3} \]

where m is the output of the rectification stage, and $g_H$ and $\tau_H$ are free parameters that determine the shape of the transfer function.
The model also includes a fourth stage, a half-rectifier that ensures that the predicted firing rates are positive. This stage requires no parameters. Its output is the firing rate $R(t)$ predicted by the model:

\[ R(t) = \max\bigl(0,\; F(t)\bigr). \tag{4.4} \]
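
As a rough illustration, the four stages can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used for the fits: we assume that Equation 4.1 has the standard RC form R / (1 + 2πifτ), that the predicted potential is measured from a resting potential `v_rest` (our addition, so that the threshold can be expressed in absolute mV), and that each filter can be stepped in the time domain with a first-order update. All names and parameter values are hypothetical and are not the fitted values of Table 4.2.

```python
import math

def low_pass(x, tau, dt, gain=1.0):
    """First-order low-pass filter, gain / (1 + 2*pi*i*f*tau), stepped in time."""
    a = 1.0 - math.exp(-dt / tau)  # exact update when the input is constant over dt
    y, out = 0.0, []
    for xi in x:
        y += a * (gain * xi - y)
        out.append(y)
    return out

def sandwich(current, dt, R, tau, v_rest, v_t, G, g_h, tau_h):
    """Predicted firing rate (spikes/s) for an injected current trace (nA)."""
    # Stage 1: RC membrane. MOhm * nA = mV; adding v_rest is our assumption.
    v = [v_rest + dv for dv in low_pass(current, tau, dt, gain=R)]
    # Stage 2 (Equation 4.2): rectifier with threshold v_t and gain G.
    m = [G * max(0.0, vi - v_t) for vi in v]
    # Stage 3 (Equation 4.3): 1 - g_h / (1 + 2*pi*i*f*tau_h), i.e. the input
    # minus g_h times a low-passed copy of itself.
    slow = low_pass(m, tau_h, dt)
    f = [mi - g_h * si for mi, si in zip(m, slow)]
    # Stage 4 (Equation 4.4): half-rectification keeps the rate non-negative.
    return [max(0.0, fi) for fi in f]

# Hypothetical parameters (not fitted values): a 0.5 nA step lasting 1 s.
dt = 0.0005  # time step in seconds
step = [0.5] * 2000
rate = sandwich(step, dt, R=50.0, tau=0.010, v_rest=-70.0,
                v_t=-55.0, G=8.0, g_h=0.9, tau_h=0.020)
```

With these illustrative values the steady-state rate is G (R I + v_rest - v_t)(1 - g_h) = 8 spikes/s, reached after an onset transient that is several times larger.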
Figure 4.10B-E illustrates the output of the different stages for four different input currents. The current in B is a low-frequency sinusoid. The membrane potential response (V) predicted by the RC model of the membrane is quite large, and a substantial portion of it (m) is above threshold, and is input to the high-pass filter. The filter enhances the portions of its input that vary rapidly in time, and suppresses the portions that are roughly constant. Its rectified output (R) is thus concentrated in the upward-going portion of the sinusoidal input. The input current in C is a high-frequency sinusoid. It is substantially attenuated by the first low-pass filter, so no portion of it reaches threshold. As a consequence, the output of the model is zero. The input current in D is the sum of the ones in B and C. The output V of the first linear filter is thus the sum of its outputs in B and C. The two sinusoids help each other get across the threshold, and the output m of the rectification stage contains both the low and the high frequency. The high-pass filter greatly enhances the high-frequency components of the input current, so the firing rate R has strong high-frequency components. The current in E is a square wave. The output V of the first linear filter is a smoothed version of the current. A substantial portion of it is above threshold, so it appears in the output m of the rectification stage. The high-pass filter enhances the initial transient and suppresses the subsequent constant portion, yielding a rapidly adapting firing rate R.
The model provided good fits to our data. For broadband noise experiments (which included eight sinusoid stimuli and eight broadband stimuli) it accounted for 78-95% of the variance of the responses (median: 89%). The percentage of the variance was measured as the variance of the difference between predicted and actual responses, divided by the variance of the actual responses. Before this measurement both responses were smoothed with a cutoff of 150 Hz. The values of the parameters obtained for each cell are listed in Table 4.2; the median values of the parameters were used to draw Figure 4.10A. Figures 4.11, 4.12 and 4.13 illustrate the fits to the data set that we have previously used when comparing the responses to broadband noise to those to single sinusoids. As the model accounted for 86% of the variance of the responses in this data set, these figures give a conservative example of the quality of the fits.

Cell   V_T (mV)       G (spikes/s/mV)   g_H (%)    tau_H (ms)
02s2   -49.5          107.01            98         3.2
02s3   -48.7 ± 0.0    19.0 ± 9.9        97 ± 2     57.8 ± 60
05s1   -60.0 ± 3.0    21.4 ± 19.3       88 ± 5     17.1 ± 7.7
15s1   -79.2 ± 4.4    3.0 ± 1.2         85 ± 12    43.9 ± 9.3
16s1   -61.9 ± 2.6    7.7 ± 3.0         94 ± 3     14.6 ± 7.8
18s1   -52.3 ± 4.8    7.7 ± 5.3         88 ± 14    48.3 ± 46
19s1   -56.3 ± 4.9    5.7 ± 1.8         96 ± 5     32.1 ± 8.4
19s2   -60.1 ± 1.6    7.3 ± 0.7         94 ± 4     15.3 ± 2.4
20s1   -43.8 ± 2.4    8.2 ± 2.4         84 ± 4     16.3 ± 8.9

Table 4.2: Parameters of the sandwich model for the cells in our sample. The model was fit independently to each experiment. The first two parameters determine the rectification stage (Equation 4.2): $V_T$ is the threshold in mV and G is the gain in spikes/s/mV. The last two parameters determine the high-pass filter (Equation 4.3): $g_H$ is the zero-frequency attenuation (in percentage), and $\tau_H$ (in ms) is the time constant that sets the low-cut frequency.

Figure 4.11: Responses to a low-frequency sinusoid, to a high-frequency sinusoid and to a broadband noise stimulus. Top row: injected currents. Middle row: membrane potential responses. Bottom row: spike train responses (thick gray curves) and predictions of the sandwich model (thin black curves), both low-passed by convolving with a Gaussian ($\sigma$ = 25 Hz). The parameters of the sandwich model were obtained by fitting all the responses in the experiment, which consisted of eight sinusoid stimuli and eight broadband stimuli obtained by adding the sinusoids with eight different sets of relative phases. Cell 19s2, exp. 4. [Axes: current (nA), potential (mV) and rate (spikes/s) versus time (s).]

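
The goodness-of-fit measure quoted above can be sketched as follows. We assume here that the "percentage of the variance accounted for" is one minus the ratio described in the text (variance of the prediction error over variance of the actual response); the 150 Hz smoothing step is omitted, and the function names are ours.

```python
def variance(x):
    """Population variance of a sequence of numbers."""
    mean = sum(x) / len(x)
    return sum((xi - mean) ** 2 for xi in x) / len(x)

def fraction_of_variance_explained(predicted, actual):
    """1 - var(predicted - actual) / var(actual); 1.0 means a perfect fit."""
    residual = [p - a for p, a in zip(predicted, actual)]
    return 1.0 - variance(residual) / variance(actual)
```

Under this reading a model that reproduces the response exactly scores 1, and a model that predicts only the mean rate scores 0.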
The bottom row of Figure 4.11 illustrates how the sandwich model captures general features of the responses, such as the fact that the spikes are preferentially located on the rising phase of the sinusoidal currents. The thick curves are the actual firing rates of the cell, obtained by smoothing the spike trains, and the thin curves are the predictions of the model. The model predicts the locations of the clusters of spikes in response to the broadband stimuli, although it does not always predict the right number of spikes in the clusters. Another aspect of the responses that is captured by the model is the phenomenon of linearization by noise. We observed this by averaging the predicted firing rate responses over one period of a sinusoid composing the stimulus, as was done for the actual firing rate in Figure 4.5. The result was much more sinusoidal when the stimulus was broadband noise than when it was a sinusoid. As pointed out by Spekreijse (1970), linearization by noise is a general property of models that include a static nonlinearity such as the rectification stage.
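
Linearization by noise can be reproduced with nothing more than a rectifier. The sketch below is an illustration under our own assumptions, not the analysis performed on the data: it uses Gaussian white noise as a stand-in for the sum-of-sinusoids broadband stimulus, a unit-amplitude sinusoid, and a threshold of zero. It averages the rectified output over many stimulus cycles and measures how non-sinusoidal the average is, as the ratio of second-harmonic to first-harmonic amplitude.

```python
import cmath
import math
import random

def harmonic_amplitude(cycle, k):
    """Amplitude of the k-th harmonic of one averaged stimulus cycle."""
    n = len(cycle)
    coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n) for i, c in enumerate(cycle))
    return 2.0 * abs(coeff) / n

def cycle_average(noise_sd, n_cycles=1000, n_bins=64, threshold=0.0, seed=0):
    """Cycle-averaged rectified response to a unit sinusoid plus Gaussian noise."""
    rng = random.Random(seed)
    total = [0.0] * n_bins
    for _ in range(n_cycles):
        for i in range(n_bins):
            drive = math.sin(2.0 * math.pi * i / n_bins) + rng.gauss(0.0, noise_sd)
            total[i] += max(0.0, drive - threshold)
    return [t / n_cycles for t in total]

def distortion(cycle):
    """Second-harmonic amplitude relative to the first harmonic."""
    return harmonic_amplitude(cycle, 2) / harmonic_amplitude(cycle, 1)

alone = distortion(cycle_average(noise_sd=0.0))       # sinusoid only
with_noise = distortion(cycle_average(noise_sd=2.0))  # sinusoid plus noise
```

The distortion of the cycle average is substantially lower with noise than without it: the noise lets both phases of the sinusoid modulate the mean output, so the averaged response follows the sinusoid more closely.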
To better compare the predictions of the model with the data, we performed
on the simulated responses the same analysis that we had performed on the actual
responses of the cells. An example of the results is shown in Figure 4.12. The Figure
shows a comparison of the linear responses of the cell and of the model to sinusoid
and broadband noise stimuli. The model correctly predicts the band-pass tuning of
the responses to sinusoids, and the broadening of the tuning caused by stimulation
with broadband noise. In addition, the model captures the different response
phases obtained with the two kinds of stimulus, correctly predicting the lower
integration time and phase lead that are obtained with the broadband stimulus.
We further evaluated the quality of the fits by analyzing the second-order kernels of the broadband noise responses, which measure the frequency component of a response at the sums and differences of frequencies present in the input (Victor et al., 1977; Victor and Shapley, 1980). The second-order kernel of a response R is defined as $K_2(f_a, \pm f_b) = \hat{R}(f_a \pm f_b)$, where $f_a$ and $f_b$ are frequencies present in the input, and $\hat{R}(f)$ is the Fourier transform of the response R at the frequency f. Since our broadband stimuli contain eight different frequencies, $K_2$ assumes $2 \times 8 \times 8$ values, except that it is undefined in the 8 cases in which $f_a = -f_b$.
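
The definition above translates directly into code. The following is a minimal sketch that follows the definition in the text literally; the function names, the direct Fourier sum, and the absence of any normalization by stimulus amplitude are our assumptions.

```python
import cmath
import math

def fourier_component(signal, dt, freq):
    """Complex amplitude of `signal` at `freq` (Hz), via a direct Fourier sum."""
    n = len(signal)
    acc = sum(s * cmath.exp(-2j * math.pi * freq * i * dt) for i, s in enumerate(signal))
    return 2.0 * acc / n

def second_order_kernel(response, dt, freqs):
    """Map each pair (fa, +fb) or (fa, -fb) to the response component at fa +/- fb."""
    kernel = {}
    for fa in freqs:
        for fb in freqs:
            kernel[(fa, fb)] = fourier_component(response, dt, fa + fb)
            if fa != fb:  # fa - fb = 0 is left undefined, as in the text
                kernel[(fa, -fb)] = fourier_component(response, dt, fa - fb)
    return kernel
```

For eight input frequencies the dictionary holds 2 x 8 x 8 - 8 = 120 entries. For a linear system every entry would be close to zero, whereas a squaring nonlinearity applied to the sum of two sinusoids places unit-amplitude components at their sum and difference frequencies.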

Figure 4.12: Comparison of the first harmonic of the spike responses of a cell to sinusoids and broadband noise with the predictions of the sandwich model. The lines show the first harmonics of the responses predicted by the sandwich model. The model predicts the enhancement in responsivity observed with broadband stimulation. It also predicts the effects of broadband stimulation on response timing: the shortening of the integration time and of the phase lead. The data points are the same as in Figure 4.3. Some of the raw responses are shown in Figure 4.11. Cell 19s2, exp. 4. [Axes: gain (spikes/s/nA) and phase (deg) versus frequency (Hz).]

Figure 4.13: Second-order kernels of the responses to broadband noise stimuli. A: observed. B: predicted by sandwich model. At each pair of frequencies $f_a, f_b$ the second-order kernel $K_2(f_a, f_b)$ is given by the component of the response at the frequency $f_a + f_b$. In the plots, surface height represents response amplitude, and gray level represents response phase. The surfaces are not defined in the diagonal in which $f_a = -f_b$. The "Frequency b" axis is the juxtaposition of two logarithmic axes, one for the positive frequencies and one for the negative frequencies. The observed and predicted first-order kernels for this cell are shown in Figure 4.12 (gray circles and line fitting them). Cell 19s2, exp. 4. [Axes: amplitude (sp/s/nA) and phase (deg) versus frequency a and frequency b (Hz).]

Figure 4.13A shows the second-order kernel of the broadband noise responses of a cell. It is about half the size of the first-order kernel for the same experiment, which we illustrated in Figure 4.12 (gray circles). For a linear system the second-order kernel is zero because all the power is concentrated at the input frequencies. In very nonlinear systems such as Y ganglion cells in the cat retina, the second-order kernels can be much larger than the first-order kernels (Victor et al., 1977).
The second-order kernel of the responses predicted by the sandwich model is shown in Figure 4.13B. The model replicates the size, the phase and the general shape of the second-order kernel of the actual responses. It however consistently underestimates the highest frequency components, which are shown in the far edges of the surface plots. This underestimation was present in most fits, and was in some cases evident already in the first-order kernel. Indeed, in Figure 4.12 the 33 Hz component of the broadband noise response (rightmost gray data point) is underestimated by the model. For frequencies below around 30 Hz, the model provided good fits to the second-order kernels of all our cells.
Figure 4.14 compares the model predictions with the responses of two cells to sinusoidal currents of different amplitudes and baselines. The model captures the general behavior of the data: it predicts the dependence of the frequency tuning on the stimulus amplitude and baseline intensity. The quality of some of the fits to the sinusoid data, however, is not entirely satisfactory. An analysis of these errors shows that the model inherits the shortcomings of its first stage, the RC model of the cell's membrane. For example, the sandwich model underestimated the responses to the low-amplitude stimuli (dark symbols) in Panel C. This underestimation can be traced back to the RC model underestimating the membrane potential responses to those stimuli, which can be observed in Figure 4.7A. When the RC model provided good fits to the membrane potential data, as in Figure 4.8, the sandwich model provided good fits to the firing rate data (Figure 4.12).

Figure 4.14: Comparison of the amplitude of the first harmonic of the spike responses of two cells to sinusoids with the predictions of the sandwich model. A and B show the responses of a cell to sinusoids of different amplitudes and baselines. The model captures the rightward shift in the tuning curves observed with increasing amplitude or baseline. Cell 05s1, exp. 10/6. C and D show the same data as in Figure 4.2, fitted with the predictions of the sandwich model. [Axes: firing rate (sp/s) versus frequency (Hz).]

To measure the degree to which the RC model of the membrane contributed to the total error in the fits, we explored the effects of bypassing it, and feeding the rest of the model directly with the linear membrane potential responses of the cell such as that depicted in Figure 4.6B. Without its first stage the sandwich model becomes a model of the transformation of the "slow" membrane potential responses into firing rates similar to that proposed by Korenberg, Sakai and Naka (1989) for the catfish retina. The reduced model provided better fits to the spike data than the full sandwich model: while the fit of the full model shown in Figure 4.14C-D accounted for 69% of the variance of the data, the reduced model (not shown in the Figure) accounted for 84% of the variance. As a model of the transformation of the injected currents into spike trains, however, the reduced model does not constitute a good alternative to the sandwich model, because it requires knowledge of the first harmonic of the membrane potential responses to all the frequencies present in the stimulus.
A final example of the performance of the sandwich model is illustrated in Figure 4.15. The bottom row of the figure shows the responses of a cell to two current steps of different intensity. The thick traces show the smoothed spike trains, and the thin traces show the predictions of the sandwich model. The model exhibits spike rate adaptation because the step onset has strong high-frequency components, which get amplified by the high-pass filter much more than the subsequent constant current injection.
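
The adaptation can be read directly off Equation 4.3. As a sketch, assume that the rectified drive m jumps instantaneously from 0 to a constant value $m_0$ at t = 0 (an idealization of ours that ignores the smoothing by the membrane low-pass filter). Inverting the transfer function and integrating gives:

```latex
\begin{align*}
\hat{h}(f) &= 1 - \frac{g_H}{1 + 2\pi i f \tau_H}
  \quad\Longleftrightarrow\quad
  h(t) = \delta(t) - \frac{g_H}{\tau_H}\, e^{-t/\tau_H} \quad (t \ge 0), \\
F(t) &= m_0 \left[ 1 - g_H \left( 1 - e^{-t/\tau_H} \right) \right] \quad (t \ge 0).
\end{align*}
```

Under this idealization the predicted rate starts at $m_0$ and relaxes exponentially, with time constant $\tau_H$, to the adapted level $(1 - g_H)\, m_0$.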

Figure 4.15: Responses to current steps. Top row: injected currents. Middle row: membrane potential responses. Bottom row: spike train responses (thick gray curves) and predictions of the sandwich model (thin black curves), both low-passed by convolving with a Gaussian ($\sigma$ = 30 Hz). The model captures the spike rate adaptation because its last stage is a high-pass filter, which attenuates the responses to steady inputs. Cell 19s2, exp. 3. [Axes: current (nA), potential (mV) and rate (spikes/s) versus time (s).]

4.3 Discussion
The goal of this study was to gain insight into the spike encoding properties of
regular-spiking cells. To this end, we measured the spike responses of the cells to
injected currents with different waveforms. We found that the spike encoder had
markedly band-pass properties when measured with sinusoids, and was instead
not selective for stimulus frequency when measured with broadband noise, which
enhanced its responsivity both to the low and to the high frequencies. In addition
to enhancing the cells' responsivity, broadband noise also linearized their averaged
spike responses, which were otherwise quite nonlinear.
Nonlinearities in the spike train responses. The most evident type of nonlinearity that we encountered in the spike train responses was rectification. Rectification is to be expected in cortical cells because their low resting firing rate does not allow them to encode negative currents. By contrast, some noncortical neurons that have high resting firing rates can act as linear encoders (du Lac and Lisberger, 1995).
The other type of nonlinearity that we observed is spike synchronization. This nonlinearity has been observed in a variety of neural systems. These include cockroach mechanoreceptors (French et al., 1972), and eccentric cells of the Limulus eye (Ascoli et al., 1974). Knight (1972) observed that spike synchronization seriously limits the amount of information that a spike train can carry. He also showed that spike synchronization is predicted by a model of spike train generation as simple as the leaky integrate-and-fire model. Based on this model he concluded that even very low amounts of noise should be sufficient to get rid of spike synchronization in the averaged spike train responses. This prediction was confirmed experimentally by French et al. (1972) in their studies of the spike encoding properties of a cockroach mechanoreceptor, and we have seen that it is also correct for regular-spiking cells in the visual cortex. Many other results of this Chapter can be likened to results obtained in the cockroach mechanoreceptors and Limulus eccentric cells. For example, the frequency tunings of both types of cell measured with sinusoidal stimulation were found to be markedly band-pass (Knight et al., 1970; French et al., 1972).
The sandwich model. To account for the spike responses we used a "sandwich model" consisting of a low-pass linear filter (the RC circuit), a rectification nonlinearity, and a high-pass linear filter. Sandwich models have been successfully used for a variety of neural systems. Spekreijse (1969) used a sandwich model to predict the spike responses of ganglion cells in the goldfish retina to light stimulation, and Victor and Shapley (Victor et al., 1977; Victor, 1988) applied it to Y ganglion cells in the cat retina. Korenberg, Sakai and Naka (1989) used a sandwich model to describe the generator potential responses of catfish retinal ganglion cells to light stimulation. In addition, they used a static nonlinearity followed by a band-pass linear filter to describe the transformation of generator potentials into spike trains. More directly related to this study is the work of French and Korenberg (1989), who used a sandwich model to describe the transformation of injected currents into spike trains by cockroach mechanoreceptors. Our results confirm the validity of such a model, and extend it to a different type of neuron.
A major difference between our approach and the aforementioned studies lies in our effort to limit the number of free parameters to a bare minimum. This was motivated by our goal of eventually incorporating the sandwich model into large-scale models of the visual cortex. For this reason we imposed severe restrictions on the stages of the model. In particular, we required that the first stage be a single-compartment passive model of the cell membrane (RC model). Such a model is defined by two parameters, the gain and the time constant of the membrane. We also required that the static nonlinearity be a simple rectifier, which is also described by two parameters, i.e. the threshold and the gain. Finally, we required that the second linear filter be high-pass with a very strict functional form (Equation 4.3), also described by two parameters.
The first stage of the model (the RC model of the membrane) is the only one which we can directly relate to the cell biophysics. Its parameters are derived from the membrane potential responses of the cells. By contrast, the rectification stage has no firm biophysical interpretation. It embodies a threshold which was in general lower than the voltage threshold at which spikes were generated. We think of its output as a measure of sodium channel activation. Finally, the high-pass filter can be interpreted as a phenomenological description of the effects on the spike train of sodium inactivation and of the after-hyperpolarization currents present in cortical cells.
The sandwich model captures most of the essential properties of the firing rate responses of regular-spiking cells in the visual cortex to sinusoidal, broadband noise and step currents. In particular, the model predicts the approximately linear dependence of the firing rate on the injected current observed by Jagadeesh et al. (1992). If a stimulus is suprathreshold, increasing its amplitude results in a proportional increase in the predicted firing rate of the cells. According to the model, the slope of the current-rate line is a function of the stimulus frequency: the absolute value of the product of the two filters' transfer functions. Other properties of the spike encoding mechanism that are captured by the model include spike rate adaptation, band-pass tuning and rectification in the responses to sinusoids, the general shape and size of the second-order kernels, and the phenomenon of linearization by noise observed with broadband stimulation.
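
The frequency dependence of the current-rate slope can be written out explicitly. Equation 4.1 is not reproduced in this section, so the sketch below assumes that it has the standard RC form $\hat{Z}(f) = R / (1 + 2\pi i f \tau)$, and it writes $\hat{h}(f)$ for the high-pass transfer function of Equation 4.3 (both symbols are ours):

```latex
\[
\left| \hat{Z}(f)\, \hat{h}(f) \right|
  = \frac{R}{\sqrt{1 + (2\pi f \tau)^2}}
    \sqrt{\frac{(1 - g_H)^2 + (2\pi f \tau_H)^2}{1 + (2\pi f \tau_H)^2}} .
\]
```

Under this assumption, multiplying by the rectifier gain G expresses the slope in spikes/s/nA for the portion of the response that lies above threshold. The expression tends to $R\,(1 - g_H)$ at low frequencies and falls off as 1/f at high frequencies.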
The model however outputs analog firing rate traces, so it cannot predict spike synchronization and other phenomena related to the exact timing of the individual spikes. Another shortcoming of the model is that it relies on a very simplified model of the membrane potential responses, the RC model, which tends to be less accurate when the injected currents have large modulations or baseline intensities. The advantage of the RC model, however, is that it is specified by only two parameters.
It would be interesting to test the sandwich model on the data of Stafstrom et al. (1984c), who measured the firing rate responses of cortical cells to current ramps. In that study, the authors discuss (and ultimately reject) a model in which the firing rate R is a weighted sum of the membrane potential V and of its derivative dV/dt. Such a model is a particular type of high-pass filter, so it is possible that the sandwich model, being more general, would provide better fits to those data.
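
The relation between the two models can be made explicit in the Fourier domain. In the sketch below the weights a and b are hypothetical names for the two coefficients of the weighted sum, and the second line is a first-order expansion of Equation 4.3 that holds only at low frequencies:

```latex
\begin{align*}
R(t) = a\,V(t) + b\,\frac{dV}{dt}
  \quad &\Longleftrightarrow\quad
  \hat{R}(f) = \left( a + 2\pi i f\, b \right) \hat{V}(f), \\
1 - \frac{g_H}{1 + 2\pi i f \tau_H}
  &\approx (1 - g_H) + 2\pi i f\, g_H \tau_H
  \qquad (2\pi f \tau_H \ll 1).
\end{align*}
```

To first order the high-pass filter of the sandwich model therefore behaves like a weighted sum with $a = 1 - g_H$ and $b = g_H \tau_H$, but its gain saturates at high frequencies instead of growing without bound.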
Neuronal inputs and outputs. In natural conditions the input to a neuron is constituted by synaptic conductances. To what degree can somatic current injection simulate synaptic stimulation? Schwindt and Calvin (1973), and more recently Powers et al. (1992, 1995), showed that somatic injection of current into motoneurons has the same effect on the spike trains as synaptic stimulation. The spike train of the cell can thus be taken to reflect the overall synaptic current that reaches the site of spike initiation, which is located in the axon hillock (Stuart and Sakmann, 1994). A similar conclusion was drawn by Ahmed et al. (1993), who reported that the net somatic input current can be estimated from the spike discharge of a neuron "by deconvolving the spike train response to visual stimuli with a suitably transformed response to somatic step current" (R. Douglas, personal communication). This is akin to considering the firing rate as the output of a linear filter whose input is the synaptic current. Above threshold, this model resembles the sandwich model.
Role in Visual Responses. The nonlinear nature of firing rate encoding in
visual cortical neurons may contribute to the many nonlinearities of their contrast
responses (reviewed in Heeger, 1992a,b). The extent of this contribution can be
roughly assessed by equating our current injection with the synaptic current resulting from visual contrast stimulation. It is reasonable to assume that the relation
between visual contrast and synaptic current is monotonic, with higher visual contrasts resulting in larger modulations in the synaptic currents. For simple cells,
in particular, this assumption has been tested intracellularly. The membrane potential responses were found to be consistent with a linear dependence of synaptic
current on visual contrast (Jagadeesh et al., 1993).
Remarkably, the period histograms obtained from simple cells with sinusoidal
contrast modulation look much more sinusoidal than those that we obtained with
sinusoidal current injection. This must be due to the synaptic and active currents
that result from sinusoidal visual stimulation containing power at many frequencies
besides that of the stimulus (Jagadeesh et al., 1993). This "noise" linearizes the
averaged responses, which are therefore more sinusoidal than they would have been
if they had received a perfectly sinusoidal current as input.
Based on our results, we suggest that spike encoding may contribute to the
increase in the temporal resolution of V1 cells observed with increasing visual contrasts (Holub and Morton-Gibson, 1981; Hawken et al., 1992). Indeed, increasing
the current amplitude in our sinusoidal current injections had a similar effect on the high-cut temporal frequency (Figure 4.2A).
Spike encoding is also likely to contribute to the differences in the temporal frequency bandwidth and integration time observed with broadband noise and sinusoidal contrast modulations (Reid et al., 1992). The broadband stimuli used in the present study were constructed in the same way as those used by Reid et al., i.e. by adding eight different sinusoids. Similarly to Reid et al., we found that the cells were more responsive to the low and high frequencies when they were stimulated with broadband noise than when the stimuli were sinusoidal. Moreover, we found that the increase in bandwidth was accompanied by a decrease in integration time of about 10 ms. By comparison, the decrease in integration time reported by Reid et al. with visual contrast modulation is 20-30 ms. The difference in bandwidth between the sinusoid and broadband noise conditions was however most probably larger in our injected currents than in the synaptic and active currents resulting from visual stimulation. As a consequence, the spike encoding mechanism likely played a lesser role in the results of Reid et al. than in our experiments.
Another temporal nonlinearity in which spike encoding may play a role is
the increased transiency of the responses to contrast steps with respect to the
predictions from sinusoidal contrast modulation (Tolhurst et al., 1980). Indeed,
the spike responses to step currents are more transient than predicted from the
tuning to sinusoids: the frequency tuning deduced from the step responses showed
more low-frequency attenuation than the one measured with sinusoids (Carandini
et al., 1995).
High Frequencies and Visual Responses. Intracellular in vivo records of visual cortical cells show that their membrane potential contains substantial power at frequencies in the 40-100 Hz range (Jagadeesh et al., 1992). We have seen that if other frequencies are present in the stimulus, as is the case for our broadband stimulus, the cortical cells can encode such high frequency signals into their spike trains. Indeed, high-frequency components have been observed in the spike responses of visual cortical cells (Gray and Singer, 1989), and are considered by some to carry an important signal (Singer, 1991). High frequency signals may not be sufficiently strong to elicit spikes by themselves, but could serve to determine the timing of the spikes when larger, but more slowly varying, signals are superimposed.
Our results suggest an additional role for these high-frequency fluctuations,
related to the phenomenon of linearization by noise and to the broadening of the
range of encoded frequencies caused by broadband stimulation that we observed in
the spike mechanism. By increasing the stimulus bandwidth, the high-frequency
fluctuations can act to effectively amplify and linearize the spike responses to lower
frequency currents, improving the ability of the cells to pass information about the
visual stimuli.
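The linearization by noise invoked above can be illustrated with a toy calculation. The sketch below is not the sandwich model: it assumes a static threshold-linear unit (`rectifier`), zero-mean Gaussian fluctuations drawn independently on every trial as a stand-in for high-frequency input, and arbitrary units. All names and parameter values are hypothetical. It suggests how a slow sinusoid too small to cross threshold on its own could produce a modulated mean response once fluctuations are added.

```python
import math
import random

def rectifier(current, threshold=1.0):
    """Threshold-linear stand-in for spike encoding (an assumption, not the sandwich model)."""
    return max(0.0, current - threshold)

def mean_rate_per_phase(amplitude, noise_sd, n_phases=16, n_trials=2000, seed=0):
    """Trial-averaged output at each phase of a slow sinusoid plus Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rates = []
    for k in range(n_phases):
        signal = amplitude * math.sin(2.0 * math.pi * k / n_phases)
        total = 0.0
        for _ in range(n_trials):
            total += rectifier(signal + rng.gauss(0.0, noise_sd))
        rates.append(total / n_trials)
    return rates

def first_harmonic(rates):
    """Amplitude of the response component at the frequency of the sinusoid."""
    n = len(rates)
    re = sum(r * math.cos(2.0 * math.pi * k / n) for k, r in enumerate(rates))
    im = sum(r * math.sin(2.0 * math.pi * k / n) for k, r in enumerate(rates))
    return 2.0 * math.hypot(re, im) / n

# Subthreshold sinusoid (amplitude 0.5, threshold 1.0), without and with noise.
quiet = mean_rate_per_phase(amplitude=0.5, noise_sd=0.0)
noisy = mean_rate_per_phase(amplitude=0.5, noise_sd=1.0)
print("first harmonic without noise:", first_harmonic(quiet))
print("first harmonic with noise:   ", first_harmonic(noisy))
```

Without noise the unit stays silent at every phase. With noise the trial-averaged output follows the sinusoid, which is the sense in which fluctuations can amplify and linearize the response to a slower signal in this toy setting.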
In conclusion, we have described the basic properties of spike encoding in a
class of cortical cells. We found that the bandwidth of the signals can profoundly
affect the input/output properties of the neurons. In addition, we have proposed a
model that succeeds in describing the spike encoding mechanism while using very
few parameters. Our hope is that such a model will help in the interpretation
of extracellularly recorded spike trains, and will improve the design of large-scale
models of the visual cortex.
Chapter 5
Conclusions
In this Thesis we have discussed four models related to the workings of the visual
cortex (Figure 1.1, page 2). The first three (the linear model, the normalization
model, and the RC implementation of normalization) are models of the
transformation of light intensity into V1 simple cell firing rates. The last, the
sandwich model, describes the transformation of intracellular current into firing rate
by regular-spiking cells in the visual cortex.
The sandwich model of spike rate encoding could be integrated into the RC
model of simple cell visual responses. This would make the latter more realistic:
we have seen in the last Chapter that the sandwich model represents a dramatic
improvement over the simple rectification stage used by the linear and normalization
models to describe spike rate encoding. The new, integrated model would be
a linear stage followed, in order, by an RC circuit, a rectification stage, and a
high-pass filter.
In the Discussion of Chapter 3, however, we have hinted at a number of other
possible improvements of the RC/normalization model. While all of these improvements
would have a cost in free parameters, some may have larger yields in
realism than others. For example, allowing for a degree of nonlinearity in the LGN
responses may constitute a greater increase in the realism of the model, and result in
a more drastic improvement of the fits, than adding a high-pass stage to the spike
rate encoding stage.
Figure 5.1: Effect of decreasing the parameter τ_H in the high-pass filter
that is part of the sandwich model. Continuous curves show a hypothetical
transfer function at rest (τ_H = 0.1 s), dashed lines show a hypothetical transfer
function at 100% contrast (τ_H = 0.025 s). Arrows indicate decrease in gain
and phase advance at the four temporal frequencies commonly used in the
experiments of Chapter 3 (1.6, 3.3, 6.5 and 13 Hz). The parameter g_H was
set to 85%. [Axes: gain (%) and phase (deg) versus temporal frequency (Hz).]
Incorporating the sandwich model into the normalization scheme may, on the
other hand, lead to a different model for response normalization, one that is not
based on shunting inhibition. In the new model the intracellular potentials would
be entirely linear functions of the intensity distribution on the retina, and gain
control would operate on the spike encoding mechanism. In particular, the new
model would control the high-pass stage of the sandwich model. This would be
reminiscent of the model proposed by Shapley and Victor for gain control in cat
retinal X ganglion cells (Shapley and Victor, 1978; Victor, 1987).
Figure 5.1 illustrates an example of how gain control could affect the high-pass
filter that is part of the sandwich model. The filter is described by two parameters,
g_H and τ_H (Equation 4.3). The first parameter determines the steady-state gain of
the filter; the second one scales the temporal frequency, and thus determines the
horizontal position of the transfer function. If gain control operated by decreasing
the second parameter, the transfer function would shift to the right, and the gain of the
filter would decrease more at low frequencies than at high frequencies. At the same
time, for a wide range of temporal frequencies the phase of the responses would
advance. As we have seen in Chapter 3, these are the signatures of gain control
in the cortex. If, in addition, the gain control mechanism affected the overall gain
of the filter, it would shift the curve in the top panel down on a logarithmic scale,
leading to stronger suppression than shown in the Figure.
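The shift described above can be checked numerically. The following is a minimal sketch and makes one important assumption: Equation 4.3 is not reproduced in this chapter, so the first-order form used in `high_pass` is a guess at a filter whose two parameters behave as described (a gain parameter and a time constant, written g_H and τ_H here). The function and constant names are hypothetical. The parameter values are those quoted in the caption of Figure 5.1.

```python
import cmath
import math

def high_pass(freq_hz, tau_h, g_h=0.85):
    """Assumed first-order high-pass filter: H(f) = 1 - g_H / (1 + i*2*pi*f*tau_H).

    This form is an assumption for illustration, not a copy of Equation 4.3.
    """
    return 1.0 - g_h / (1.0 + 1j * 2.0 * math.pi * freq_hz * tau_h)

def gain_and_phase(freq_hz, tau_h, g_h=0.85):
    """Return (gain, phase in degrees) of the assumed filter at one frequency."""
    h = high_pass(freq_hz, tau_h, g_h)
    return abs(h), math.degrees(cmath.phase(h))

FREQS_HZ = (1.6, 3.3, 6.5, 13.0)   # temporal frequencies marked in Figure 5.1
TAU_REST = 0.1                     # s, value at rest
TAU_HIGH_CONTRAST = 0.025          # s, value at 100% contrast

for f in FREQS_HZ:
    g0, p0 = gain_and_phase(f, TAU_REST)
    g1, p1 = gain_and_phase(f, TAU_HIGH_CONTRAST)
    print(f"{f:5.1f} Hz  gain {100 * g0:5.1f}% -> {100 * g1:5.1f}%  "
          f"phase {p0:5.1f} -> {p1:5.1f} deg")
```

Under this assumed form, lowering the time constant reduces the gain at all four frequencies, more so at the low ones, and advances the phase, matching the qualitative behavior described in the text.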
A testable prediction of the new model is that at very low temporal frequencies
the contrast responses should show phase decrease rather than phase advance.
Preliminary data indicate that at least for some cells this prediction may be correct.
Another testable prediction is related to the shape of the response histograms. As
illustrated in Figure 4.10, the high-pass filter strongly affects the shape of the firing
rate responses, and makes the cell respond more strongly to the rising portions of
its input than to the falling portions. Increasing the low-cut frequency of the filter
as we propose would accentuate this behavior. If the new model is correct, then,
increasing the energy of the stimulus should make the firing rate responses look less
and less like half-rectified sinusoids, concentrating more spikes in the rising phase
of the response than in the falling phase. Preliminary results indicate that some
cells do show signs of this behavior. An example of this is shown in Figure 5.2,
which plots the period histograms of the responses of a V1 simple cell to drifting
gratings in the presence of spatiotemporal white noise masks, for various contrasts
of gratings and noise. As the noise contrast increases, the responses are clearly
suppressed, and we have seen in Figure 3.14 (which shows the same data set) that
the phase advances. In addition, the spikes tend to concentrate more on the rising
phases of the responses. This effect is slight and may be due to a variety of other
mechanisms (including the retinal contrast gain control described by Shapley and
Victor) but is nonetheless encouraging for the new model.
Figure 5.2: Period histograms of the responses to 5 different grating contrasts
(rows) and 8 different noise mask contrasts (columns). Curves are fits
of the normalization model. The data set is the same as in Figure 3.14.
[Axes: response (sp/s) versus time (ms); panel labels give contrasts from 0% to 50%.]
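The first prediction of the new model, a phase decrease at very low temporal frequencies, can be made concrete with a short derivation. It is only a sketch: Equation 4.3 is not reproduced in this chapter, so the first-order form of H below is an assumption, chosen because its two parameters play the roles described for Figure 5.1. The symbols τ_1 and τ_2 are hypothetical labels for the resting and high-contrast values of the time constant.

```latex
\begin{align*}
% Assumed form of the high-pass filter (not copied from Equation 4.3)
H(\omega) &= 1 - \frac{g_H}{1 + i\omega\tau_H}
           = \frac{(1 - g_H) + i\omega\tau_H}{1 + i\omega\tau_H} \\
% Phase of the filter
\phi(\omega) &= \arctan\frac{\omega\tau_H}{1 - g_H} - \arctan(\omega\tau_H) \\
% With x = \omega\tau_H and a = 1 - g_H, the phase peaks where
\frac{d\phi}{dx} &= \frac{a}{a^2 + x^2} - \frac{1}{1 + x^2} = 0
  \quad\Longleftrightarrow\quad x^2 = a \\
% \phi is unchanged under x \to a/x, so the curves for \tau_1 > \tau_2
% cross where (\omega\tau_1)(\omega\tau_2) = a, that is at the frequency
f_c &= \frac{\sqrt{1 - g_H}}{2\pi\sqrt{\tau_1\tau_2}}
\end{align*}
```

With g_H = 0.85, τ_1 = 0.1 s and τ_2 = 0.025 s (the values of Figure 5.1), f_c is about 1.2 Hz. Under these assumptions, decreasing the time constant would retard the phase below roughly 1.2 Hz and advance it above, which agrees with the phase advance drawn at 1.6 Hz and higher in the Figure.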
The decisive test of this model would require intracellular measurements. For
example, the new model predicts that the membrane potential responses should
show no sign of gain control, and that adding energy to the visual stimuli should
decrease the effectiveness of the spike encoding mechanism. In particular, the
latter should become more high-pass, so that a step of current should result in
a smaller and more transient burst of spikes. By contrast, the predictions of the
RC implementation of the normalization model are the opposite: gain control
should be evident already at the level of the membrane potential responses, the
conductance of the membrane should grow with stimulus energy, and the spike
encoding mechanism should be unaffected by changes in the visual stimulus.
Clearly, more work is needed to decide which mechanism underlies gain control
in the visual cortex. In this Thesis we have used extracellular in vivo methods
to study gain control, and we have developed a quantitative model for how visual
stimuli affect the responsivity of simple cells in the primary visual cortex. We
have also used intracellular in vitro methods to study spike encoding, and we have
developed a quantitative model for the transformation of currents into firing rates
by a class of cortical cells. Our plan is to proceed to the next logical step in this
research, which involves a test of these models with intracellular in vivo methods.
Bibliography
Adelson, E. H. and Bergen, J. R. (1985). Spatiotemporal energy models for the
perception of motion. J. Opt. Soc. Am. A, 2, 284-299.
Ahmed, B., Anderson, J. C., Douglas, R. J., Martin, K. A. C. and Whitteridge,
D. (1993). A method of estimating net somatic input current from the action
potential discharge of neurones in the visual cortex of the anaesthetized cat.
J. Physiol. (London), 459, 134.
Albrecht, D. G. (1995). Visual cortex neurons in monkey and cat: effect of contrast
on the spatial and temporal phase transfer functions. Vis. Neurosci., 12,
1191-1210.
Albrecht, D. G. and Geisler, W. S. (1991). Motion sensitivity and the contrast-response
function of simple cells in the visual cortex. Vis. Neurosci., 7, 531-546.
Albrecht, D. G. and Hamilton, D. B. (1982). Striate cortex of monkey and cat:
Contrast response function. J. Neurophysiol., 48, 217-237.
Andrews, B. W. and Pollen, D. A. (1979). Relationship between spatial frequency
selectivity and receptive field profile of simple cells. J. Physiol. (London), 287,
163-176.
Ascoli, C., Barbi, M., Ghelardini, G. and Petracchi, D. (1974). Rectification and
spike synchronization in the Limulus lateral eye. Kybernetik, 14, 155-160.
Bauman, L. A. and Bonds, A. B. (1991). Inhibitory refinement of spatial frequency
selectivity in single cells of the cat striate cortex. Vis. Res., 31, 933-944.
Baylor, D. A. and Hodgkin, A. L. (1974). Changes in time scale and sensitivity in
turtle photoreceptors. J. Physiol. (London), 242, 729-758.
Ben-Yishai, R., Bar Or, R. L. and Sompolinsky, H. (1995). Theory of orientation
tuning in the visual cortex. Proc. Natl. Acad. Sci., 92, 3844-3848.
Benardete, E. A. and Kaplan, E. (1995). The receptive field of the primate P
retinal ganglion cell. I: linear dynamics. Submitted to?
Benardete, E. A., Kaplan, E. and Knight, B. W. (1992). Contrast gain in the
primate retina: P cells are not X-like, some M cells are. Vis. Neurosci., 8,
483-486.
Berman, N. J., Douglas, R. J., Martin, K. A. C. and Whitteridge, D. (1991).
Mechanisms of inhibition in cat visual cortex. J. Physiol. (London), 440,
697-722.
Bernander, O., Douglas, R. J., Martin, K. A. C. and Koch, C. (1991). Synaptic
background activity influences spatiotemporal integration in single pyramidal
cells. Proc. Natl. Acad. Sci., 88, 11569-11573.
Bishop, P. O., Coombs, J. S. and Henry, G. H. (1973). Receptive fields of simple
cells in the cat striate cortex. J. Physiol. (London), 231, 31-60.
Blasdel, G. G. and Lund, J. S. (1983). Termination of afferent axons in macaque
striate cortex. J. Neurosci., 3, 1389-1413.
Blomfield, S. (1974). Arithmetical operations performed by nerve cells. Br. Res.,
69, 115-124.
Bonds, A. B. (1989). Role of inhibition in the specification of orientation selectivity
of cells in the cat striate cortex. Vis. Neurosci., 2, 41-55.
Bonds, A. B. (1991). Temporal dynamics of contrast gain in single cells of the cat
striate cortex. Vis. Neurosci., 6, 239-255.
Bonds, A. B. (1992). Spatial and temporal nonlinearities in receptive fields on the
cat striate cortex. In R. B. Pinter and B. Nabet (Eds.), Nonlinear vision.
CRC Press, Boca Raton, FL.
Born and Tootell (1991). Single-unit and 2-deoxyglucose studies of side inhibition
in macaque striate cortex. Proc. Natl. Acad. Sci., 88, 7071-7075.
Bower, J. M. and Beeman, D. (1995). The book of GENESIS. New York: Springer-Verlag.
Burr, D., Morrone, C. and Maffei, L. (1981). Intra-cortical inhibition prevents
simple cells from responding to textured visual patterns. Exp. Br. Res., 43,
455-458.
Carandini, M. and Heeger, D. J. (1993). Normalization with shunting inhibition
explains simple cell response phase and integration time. Inv. Opht. and Vis.
Sci. (Suppl.), 34, 907.
Carandini, M. and Heeger, D. J. (1994). Summation and division by neurons in
visual cortex. Science, 264, 1333-1336.
Carandini, M. and Heeger, D. J. (1995). Summation and division in V1 simple
cells. In J. M. Bower (Ed.), The neurobiology of computation: proceedings
of the third annual Computation and Neural Systems conference (pp. 59-65).
Norwell, MA: Kluwer.
Carandini, M., Heeger, D. J. and Movshon, J. A. (1993a). Amplitude and phase
of contrast responses in LGN and V1. Soc. Neurosci. Abs., 19, 628.
Carandini, M., Heeger, D. J. and Movshon, J. A. (1993b). Normalization and
phase advance in simple cell responses. Perception (Supplement), 22, 43.
Carandini, M., Heeger, D. J. and Movshon, J. A. (1996a). Linearity and gain
control in V1 simple cells. In E. G. Jones and P. S. Ulinski (Eds.), Cerebral
cortex, Vol. XII: Cortical models. New York: Plenum. In press.
Carandini, M., Heeger, D. J., O'Keefe, L. P., Tang, C. and Movshon, J. A. (1994a).
Simple cells, spatiotemporal frequency and contrast. Inv. Opht. and Vis. Sci.
(Suppl.), 35, 1469.
Carandini, M., Mechler, F., Leonard, C. S. and Movshon, J. A. (1994b). Firing
rate encoding by visual cortical neurons in vitro. Soc. Neurosci. Abs., 20, 624.
Carandini, M., Mechler, F., Leonard, C. S. and Movshon, J. A. (1995). Spike rate
encoding can explain some visual properties of cortical cells. Inv. Opht. and
Vis. Sci. (Suppl.), 36, S692.
Carandini, M., Mechler, F., Leonard, C. S. and Movshon, J. A. (1996b). Spike
train encoding in regular-spiking cells of the visual cortex in vitro. In press.
Carandini, M. and Ringach, D. (1997). Some properties of recurrent models of
orientation selectivity. In J. M. Bower (Ed.), The neurobiology of computation:
proceedings of the fifth annual Computation and Neural Systems conference.
Academic Press. In preparation.
Choudhury, B. P. (1978). Retinotopic organization of the guinea pig's visual cortex.
Br. Res., 144, 19-29.
Cole, K. S. and Baker, R. F. (1941). Longitudinal impedance of the squid giant
axon. J. Gen. Physiol., 24, 771-788.
Connors, B. W. and Gutnick, M. J. (1990). Intrinsic firing patterns of neocortical
neurons. Trends in Neuroscience, 13.
Connors, B. W., Gutnick, M. J. and Prince, D. A. (1982). Electrophysiological
properties of neocortical neurons in vitro. J. Neurophysiol., 48, 1302-1320.
Coombs, J. S., Eccles, J. C. and Fatt, P. (1955). The inhibitory suppression of
reflex discharges from motoneurones. J. Physiol. (London), 130, 396-413.
Creel, D. J. and Giolli, R. A. (1972). Reticulogeniculate projections in the guinea
pigs: albino and pigmented strains compared. Experimental Neurology, 36,
411-425.
Creutzfeldt, O. D. and Ito, M. (1968). Functional synaptic organization of primary
visual cortex neurones in the cat. Exp. Br. Res., 6, 324-352.
De Valois, K. and Tootell, R. (1983). Spatial-frequency-specific inhibition in cat
striate cortex cells. J. Physiol. (London), 336, 359-376.
De Valois, K. K., De Valois, R. L. and Yund, E. W. (1979). Responses of striate
cortex cells to grating and checkerboard patterns. J. Physiol. (London), 291,
483-505.
De Valois, R. L., Albrecht, D. G. and Thorell, L. G. (1982). Spatial frequency
selectivity of cells in macaque visual cortex. Vis. Res., 22, 545-559.
De Valois, R. L., Thorell, L. G. and Albrecht, D. G. (1985). Periodicity of
striate-cortex-cell receptive fields. J. Opt. Soc. Am. A, 2, 1115-1123.
Dean, A. F. (1981). The relationship between response amplitude and contrast for
cat striate cortical neurones. J. Physiol. (London), 318, 413-427.
Dean, A. F., Hess, R. F. and Tolhurst, D. J. (1980). Divisive inhibition involved
in direction selectivity. J. Physiol. (London), 308, 84p-85p.
Dean, A. F. and Tolhurst, D. J. (1983). On the distinctiveness of simple and
complex cells in the visual cortex of the cat. J. Physiol. (London), 344,
305-325.
Dean, A. F. and Tolhurst, D. J. (1986). Factors influencing the temporal phase of
response to bar and grating stimuli for simple cells in the cat striate cortex.
Exp. Br. Res., 62, 143-151.
DeAngelis, G. C., Freeman, R. D. and Ohzawa, I. (1994). Length and width tuning
of neurons in the cat's primary visual cortex. J. Neurophysiol., 71, 347-374.
DeAngelis, G. C., Ohzawa, I. and Freeman, R. D. (1993a). The spatiotemporal
organization of simple cell receptive fields in the cat's striate cortex. I. General
characteristics and postnatal development. J. Neurophysiol., 69, 1091-1117.
DeAngelis, G. C., Ohzawa, I. and Freeman, R. D. (1993b). The spatiotemporal
organization of simple cell receptive fields in the cat's striate cortex. II. Linearity
of temporal and spatial summation. J. Neurophysiol., 69, 1118-1135.
DeAngelis, G. C., Robson, J. G., Ohzawa, I. and Freeman, R. D. (1992). The
organization of suppression in receptive fields of neurons in the cat's visual
cortex. J. Neurophysiol., 68, 144-163.
deBoer, E. and Kuyper, P. (1968). Triggered correlation. IEEE Trans. Biomed.
Eng., 15, 169-179.
Derrington, A. M. and Lennie, P. (1984). Spatial and temporal contrast sensitivities
of neurones in lateral geniculate nucleus of macaque. J. Physiol. (London),
357, 219-240.
Douglas, R. J. and Martin, K. A. (1990). Neocortex. In G. M. Shepherd (Ed.), The
synaptic organization of the brain (pp. 389-438). Oxford, UK: Oxford UP.
Douglas, R. J., Martin, K. A. C. and Whitteridge, D. (1991). An intracellular
analysis of the visual responses of neurones in cat visual cortex. J. Physiol.
(London), 440, 659-696.
Dreher, B., Fukada, Y. and Rodieck, R. W. (1976). Identification, classification
and anatomical segregation of cells with X-like and Y-like properties in the
lateral geniculate nucleus of old-world primates. J. Physiol. (London), 258,
433-452.
Dreifuss, J. J., Kelly, J. S. and Krnjevic, K. (1969). Cortical inhibition and
gamma-aminobutyric acid. Exp. Br. Res., 9, 137-154.
Efron, B. and Tibshirani, R. J. (1991). Statistical data analysis in the computer
age. Science, 253, 390-395.
Efron, B. and Tibshirani, R. J. (1993). An introduction to the bootstrap. Number 57
in Monographs on statistics and applied probability. New York: Chapman &
Hall.
Einstein, G., Davis, T. L. and Sterling, P. (1987). Ultrastructure of synapses from
the A-laminae of the lateral geniculate nucleus in layer IV of the cat striate
cortex. J. Comp. Neurol., 260, 63-75.
Emerson, R. C. (1988). A linear model for symmetric receptive fields: Implications
for classification test with flashed and moving images. Spatial Vision, 3,
159-177.
Emerson, R. C. and Citron, M. C. (1992). Linear and nonlinear mechanisms of
motion selectivity in simple cells of the cat's striate cortex. In R. B. Pinter
and B. Nabet (Eds.), Nonlinear vision. CRC Press, Boca Raton, FL.
Enroth-Cugell, C. and Robson, J. G. (1966). The contrast sensitivity of retinal
ganglion cells of the cat. J. Physiol. (London), 187, 517-552.
Enroth-Cugell, C. and Robson, J. G. (1984). Functional characteristics and diversity
of cat retinal ganglion cells. Inv. Opht. and Vis. Science, 25, 250-267.
Enroth-Cugell, C., Robson, J. G., Schweitzer-Tong, D. E. and Watson, A. B.
(1983). Spatio-temporal interactions in cat retinal ganglion cells showing linear
spatial summation. J. Physiol. (London), 341, 279-307.
van Essen, D. C., DeYoe, E. A., Olavarria, J. F., Knierim, J. J., Sagi, D., Fox, J. M.
and Julesz, B. (1989). Neural responses to static and moving texture patterns
in visual cortex of the macaque monkey. In D. M. K. Lan and C. D. Gilbert
(Eds.), Neural Mechanisms of Visual Perception (pp. 137-156). Woodlands,
Texas: Portfolio Publishing.
Fahle, M. and Poggio, T. (1981). Visual hyperacuity: spatiotemporal interpolation
in human vision. Proc. Roy. Soc. Lon. B, 213, 451-477.
Fatt, P. and Katz, B. (1953). The effect of inhibitory nerve impulses on a crustacean
muscle fibre. J. Physiol., 121, 374-389.
Ferster, D. (1981). A comparison of binocular depth mechanisms in areas 17 and
18 of the cat visual cortex. J. Physiol. (London), 311, 623-655.
Ferster, D. (1986). Orientation selectivity of synaptic potentials in neurons of cat
primary visual cortex. J. Neurosci., 6, 1284-1301.
Ferster, D. (1988). Spatially opponent excitation and inhibition in simple cells of
the cat visual cortex. J. Neurosci., 8, 1172-1180.
Ferster, D. (1990a). X- and Y-mediated current sources in areas 17 and 18 of cat
visual cortex. Vis. Neurosci., 4, 135-145.
Ferster, D. (1990b). X- and Y-mediated synaptic potentials in neurons of areas 17
and 18 of cat visual cortex. Vis. Neurosci., 4, 115-133.
Ferster, D., Chung, S. and Wheat, H. S. (1996). Orientation selectivity of thalamic
input to simple cells of cat visual cortex. Nature, 380, 249-252.
Ferster, D. and Jagadeesh, B. (1992). EPSP-IPSP interactions in cat visual cortex
studied with in vivo whole-cell patch recording. J. Neurosci., 12, 1262.
Ferster, D. and Koch, C. (1987). Neuronal connections underlying selectivity in
cat visual cortex. Trends in Neuroscience, 10, 487-492.
Ferster, D. and Lindstrom, S. (1983). An intracellular analysis of geniculo-cortical
connectivity in area 17 of the cat. J. Physiol. (London), 342, 181-215.
Field, D. J. and Tolhurst, D. J. (1986). The structure and symmetry of simple-cell
receptive field profiles in the cat's visual cortex. Proc. R. Soc. Lon. B, 228,
379-400.
Foster, K. H., Gaska, J. P., Marcelja, S. and Pollen, D. A. (1983). Phase relationships
between adjacent simple cells in the feline visual cortex. J. Physiol.
(London), 345, 22P.
Freeman, R. D., Ohzawa, I. and Robson, J. G. (1987). A comparison of monocular
and binocular inhibitory processes in the visual cortex of cat. J. Physiol.
(London), 396, 69p.
French, A. S., Holden, A. V. and Stein, R. B. (1972). The estimation of the
frequency response function of a mechanoreceptor. Kybernetik, 11, 15-23.
French, A. S. and Korenberg, M. J. (1989). A nonlinear cascade model for action
potential encoding in an insect sensory neuron. Biophys. J., 55, 655-661.
Garey, L. J. and Powell, T. P. S. (1971). An experimental study of the termination
of the lateral geniculo-cortical pathway in the cat and monkey. Proc. R. Soc.
Lon. B, 179, 41-63.
Gaudiano, P. (1992). A unified neural network model of spatiotemporal processing
in X and Y retinal ganglion cells I: Analytical results. Biol. Cyb., 67, 11-21.
Geisler, W. S. and Albrecht, D. G. (1992). Cortical neurons: isolation of contrast
gain control. Vis. Res., 8, 1409-1410.
Getting, P. A. (1989). Reconstruction of small neural networks. In C. Koch and
I. Segev (Eds.), Methods in neuronal modeling (pp. 171-194). MIT Press.
Gilbert, C. D., Das, A., Ito, M., Kapadia, M. and Westheimer, G. (1996). Spatial
integration and cortical dynamics. Proc. Natl. Acad. Sci., 93, 615-622.
Gilbert, C. D. and Wiesel, T. N. (1990). The influence of contextual stimuli on the
orientation selectivity of cells in primary visual cortex of the cat. Vis. Res.,
30, 1689-1701.
Gizzi, M. S., Katz, E., Schumer, R. A. and Movshon, J. A. (1990). Selectivity
for orientation and direction of motion of single neurons in cat striate and
extrastriate visual cortex. J. Neurophysiol., 63, 1529-1543.
Glezer, V. D., Tscherbach, T. A., Gauselman, V. E. and Bondarko, V. E. (1980).
Linear and nonlinear properties of simple and complex receptive fields in area
17 of the cat visual cortex. Biol. Cyb., 37, 195-208.
Glezer, V. D., Tscherbach, T. A., Gauselman, V. E. and Bondarko, V. E. (1982).
Spatio-temporal organization of receptive fields of the cat striate cortex. Biol.
Cyb., 43, 35-49.
Gray, C. M. and Singer, W. (1989). Stimulus-specific neuronal oscillations in
orientation columns of cat visual cortex. Proc. Natl. Acad. Sci., 86,
1698-1702.
Grossberg, S. (1988). Nonlinear neural networks: principles, mechanisms and
architectures. Neural Networks, 1, 17-61.
Grossberg, S. and Todorovic, D. (1988). Neural dynamics of 1-d and 2-d brightness
perception: A unified model of classical and recent phenomena. Percept. &
Psychophys., 43, 241-277.
Gulyas, B., Orban, G. A., Duysens, J. and Maes, H. (1987). The suppressive
influence of moving textured backgrounds on responses of cat striate neurons
to moving bars. J. Neurophysiol., 57, 1767-1791.
Gutfreund, Y., Yarom, Y. and Segev, I. (1995). Subthreshold oscillations and
resonant frequency in guinea-pig cortical neurons: physiology and modelling.
J. Physiol. (London), 483, 621-40.
Gutnick, M. J. and Crill, W. E. (1995). The cortical neuron as an electrophysiological
unit. In M. J. Gutnick and I. Mody (Eds.), The cortical neuron. Oxford
University Press.
Hammond, P. and MacKay, D. M. (1977). Differential responsiveness of simple
and complex cells in cat striate cortex to visual texture. Exp. Br. Res., 30,
275-296.
Hammond, P. and MacKay, D. M. (1981). Modulatory influences of moving textured
backgrounds on responsiveness of simple cells in feline striate cortex. J.
Physiol. (London), 319, 431-442.
Hawken, M. J., Shapley, R. M., Gordon, J., Grosof, D. H. and Mechler, F. (1994).
Comparison of temporal tuning in primate geniculate and V1. Inv. Opht. and
Vis. Sci. (Suppl.), 35, 1662.
Hawken, M. J., Shapley, R. M. and Grosof, D. H. (1992). Temporal frequency tuning
of neurons in macaque V1: effects of luminance contrast and chromaticity.
Inv. Opht. and Vis. Sci. (Suppl.), 33, 955.
Heeger, D. J. (1991). Nonlinear model of neural responses in cat visual cortex.
In M. Landy and J. A. Movshon (Eds.), Computational Models of Visual
Processing (pp. 119-133). Cambridge, MA: MIT Press.
Heeger, D. J. (1992a). Half-squaring in responses of cat simple cells. Vis. Neurosci.,
9, 427-443.
Heeger, D. J. (1992b). Normalization of cell responses in cat striate cortex. Vis.
Neurosci., 9, 181-198.
Heeger, D. J. (1993). Modeling simple cell direction selectivity with normalized,
half-squared, linear operators. J. Neurophysiol., 70, 1885-1897.
Heggelund, P. (1981). Receptive-field organization of simple cells in cat striate
cortex. Exp. Br. Res., 42, 89-98.
Heggelund, P. (1986). Quantitative studies of enhancement and suppression zones
in the receptive field of simple cells in cat striate cortex. J. Physiol. (London),
373, 293-310.
Hendrickson, A. E., Wilson, J. R. and Ogren, M. P. (1978). The neuroanatomical
organization of pathways between the dorsal lateral geniculate nucleus and
visual cortex in Old World and New World primates. J. Comp. Neurol., 182,
123-136.
Hochstein, S. and Shapley, R. M. (1976). Quantitative analysis of retinal ganglion
cell classifications. J. Physiol., 262, 237-264.
Hodgkin, A. L. and Huxley, A. F. (1952). A quantitative description of membrane
current and its application to conduction and excitation in nerve. J. Physiol.
(London), 117, 500-544.
Holub, R. A. and Morton-Gibson, M. (1981). Response of visual cortical neurons
of the cat to moving sinusoidal gratings: Response-contrast functions and
spatiotemporal interactions. J. Neurophysiol., 46, 1244-1259.
Hubel, D. and Wiesel, T. (1962). Receptive fields, binocular interaction, and
functional architecture in the cat's visual cortex. J. Physiol. (London), 160,
106-154.
Hubel, D. H. and Wiesel, T. N. (1972). Laminar and columnar distribution of
geniculo-cortical fibers in macaque monkeys. J. Comp. Neurol., 146, 421-450.
Jack, J. J. B., Noble, D. and Tsien, R. W. (1975). Electric current flow in excitable
cells. Oxford, UK: Oxford University Press.
Jagadeesh, B., Gray, C. M. and Ferster, D. (1992). Visually evoked oscillations of
membrane potential in cells of cat visual cortex. Science, 257, 552-554.
Jagadeesh, B., Wheat, H. S. and Ferster, D. (1993). Linearity of summation of
synaptic potentials underlying direction selectivity in simple cells of the cat
visual cortex. Science, 262, 1901-1904.
Jones, J. P. and Palmer, L. A. (1987a). An evaluation of the two-dimensional Gabor
filter model of simple receptive fields in cat striate cortex. J. Neurophysiol.,
58, 1233-1258.
Jones, J. P. and Palmer, L. A. (1987b). The two-dimensional spatial structure of
simple receptive fields in cat striate cortex. J. Neurophysiol., 58, 1187-1211.
Jones, J. P., Stepnoski, A. and Palmer, L. A. (1987). The two-dimensional spectral
structure of simple receptive fields in cat striate cortex. J. Neurophysiol., 58,
1212-1232.
Kaji, S. and Kawabata, N. (1985). Neural interactions of two moving patterns
in the direction and orientation domain in the complex cells of cat's visual
cortex. Vis. Res., 25, 749-753.
Kapadia, M. K., Ito, M., Gilbert, C. D. and Westheimer, G. (1995). Improvement
in visual sensitivity by changes in local context: Parallel studies in human
observers and in V1 of alert monkeys. Neuron, 15, 843-856.
Kaplan, E., Purpura, K. and Shapley, R. (1987). Contrast affects the transmission
of visual information through the mammalian lateral geniculate nucleus. J.
Physiol. (London), 391, 267-288.
Kaplan, E. and Shapley, R. M. (1982). X and Y cells in the lateral geniculate
nucleus of the macaque monkeys. J. Physiol. (London), 330, 125-43.
Kaplan, E. and Shapley, R. M. (1989). Illumination of the receptive field surround
controls the contrast gain of macaque P retinal ganglion cells. Soc. Neurosci.
Abs., 15, 174.
Knight, B. W. (1972). Dynamics of encoding in a population of neurons. J. Gen.
Phys., 59, 734-766.
Knight, B. W., Toyoda, J. and Dodge, F. A. (1970). A quantitative description of
the dynamics of excitation and inhibition in the eye of Limulus. Journal of
General Physiology, 56, 421-437.
Koch, C. (1984). Cable theory in neurons with active, linearized membranes. Biol.
Cyb., 50, 15-33.
Koch, C. and Poggio, T. (1987). Biophysics of computation: neurons, synapses
and membranes. In G. M. Edelman, W. E. Gall and W. M. Cowan (Eds.),
Synaptic function. Wiley, NY.
Koch, C. and Segev, I. (1989). Methods in Neuronal modeling. Cambridge, MA:
MIT Press.
Korenberg, M. J., Sakai, H. M. and Naka, K. (1989). Dissection of the neuron
network in the catfish inner retina. III. Interpretation of spike kernels. J.
Neurophysiol., 61, 1110-1120.
Kulikowski, J. J. and Bishop, P. O. (1981a). Fourier analysis and spatial representation
in the visual cortex. Experientia, 37, 160-163.
Kulikowski, J. J. and Bishop, P. O. (1981b). Linear analysis of the response of
simple cells in the cat visual cortex. Exp. Br. Res., 44, 386-400.
du Lac, S. and Lisberger, S. G. (1995). Cellular processing of temporal information
in medial vestibular nucleus neurons. J. Neurosci., 15, 8000-8010.
Lahica, E. A., Beck, P. D. and Casagrande, V. A. (1992). Parallel pathways in
macaque monkey striate cortex: Anatomically defined columns in layer III.
Proc. Natl. Acad. Sci., 89, 3566-3570.
Lee, B. B., Pokorny, J., Smith, V. C. and Kremers, J. (1994). Responses to pulses
and sinusoids in macaque ganglion cells. Vis. Res., 34, 3081-3096.
Li, C. Y. and Creutzfeldt, O. (1984). The representation of contrast and other
stimulus parameters by single neurons in area 17 of the cat. Pugers Archives,
401, 304{314.
Lisberger, S. G. and Sejnowski, T. J. (1992). Motor learning in a recurrent network
model based on the vestibulo-ocular reex. Nature, 360, 159{161.
Liu, Z., Gaska, J. P., Jacobson, L. D. and Pollen, D. A. (1992). Interneuronal
interaction between members of quadrature phase and anti-phase pairs in the
cat's visual cortex. Vis. Res., 7, 1193{1198.
Llinas, R. R., Grace, A. A. and Yarom, Y. (1991). In vitro neurons in mammalian cortical layer 4 exhibit intrinsic oscillatory activity in the 10- to 50-hz
frequency range. Proc. Natl. Acad. Sci., 88, 897{901.
Lorenzon, N. M. and Foehring, R. C. (1992). Relationship between repetitive ring
and afterhyperpolarizations in human neocortical neurons. J. Neurophysiol.,
67, 350{363.
Maffei, L. (1985). Complex cells control simple cells. In D. Rose and V. G. Dobson
(Eds.), Models of the visual cortex (pp. 334-340). Wiley.
Maffei, L. and Fiorentini, A. (1973). The visual cortex as a spatial frequency
analyzer. Vis. Res., 13, 1255-1267.
Maffei, L. and Fiorentini, A. (1976). The unresponsive regions of visual cortical
receptive fields. Vis. Res., 16, 1131-1139.
Maffei, L., Fiorentini, A. and Bisti, S. (1973). Neural correlate of perceptual
adaptation to gratings. Science, 182, 1036-1038.
Maffei, L., Morrone, C., Pirchio, M. and Sandini, G. (1979). Responses of visual
cortical cells to periodic and nonperiodic stimuli. J. Physiol. (London), 296,
27-47.
Malpeli, J. G., Schiller, P. H. and Colby, C. L. (1981). Response properties of
single cells in monkey striate cortex during reversible inactivation of individual
lateral geniculate laminae. J. Neurophysiol., 46, 1102-1119.
Marr, D. (1982). Vision. San Francisco: W. H. Freeman and Co.
Mastronarde, D. N. (1987). Two classes of single-input X-cells in cat lateral geniculate nucleus. I. Receptive field properties and classification of cells. J. Neurophysiol., 57, 357-380.
Maunsell, J. H. R. and Gibson, J. R. (1992). Visual response latencies of striate
cortex of the macaque monkey. J. Neurophysiol., 68, 1332-1344.
McCormick, D. A., Connors, B. W., Lighthall, J. W. and Prince, D. A. (1985).
Comparative electrophysiology of pyramidal and sparsely spiny stellate neurons of the neocortex. J. Neurophysiol., 54, 782-806.
McCormick, D. A. and Huguenard, J. R. (1992). A model of the electrophysiological properties of thalamocortical relay neurons. J. Neurophysiol., 68,
1384-1400.
McLean, J. and Palmer, L. A. (1989). Contribution of linear spatiotemporal receptive field structure to velocity selectivity of simple cells in area 17 of cat.
Vis. Res., 29, 675-679.
McLean, J., Raab, S. and Palmer, L. A. (1994). Contribution of linear mechanisms
to the specification of local motion by simple cells in areas 17 and 18 of the
cat. Vis. Neurosci., 11, 271-294.
Merrill, E. G. and Ainsworth, A. (1972). Glass-coated platinum-plated tungsten
microelectrode. Med. Biol. Eng., 10, 495-504.
Morrone, M. C., Burr, D. C. and Maffei, L. (1982). Functional implications of cross-orientation inhibition of cortical visual cells. 1. Neurophysiological evidence.
Proc. R. Soc. Lon. B, 216, 335-354.
Movshon, J. A., Hawken, M. J., Kiorpes, L., Skoczenski, A. M., Tang, C. and
O'Keefe, L. P. (1994). Visual noise masking in macaque LGN neurons. Inv.
Opht. and Vis. Sci. (Suppl.), 35, 1662.
Movshon, J. A. and Lennie, P. (1979). Pattern-selective adaptation in visual cortical neurones. Nature, 278, 850-852.
Movshon, J. A., Thompson, I. D. and Tolhurst, D. J. (1978a). Spatial summation
in the receptive fields of simple cells in the cat's striate cortex. J. Physiol.
(London), 283, 53-77.
Movshon, J. A., Thompson, I. D. and Tolhurst, D. J. (1978b). Receptive field
organization of complex cells in the cat's striate cortex. J. Physiol. (London),
283, 79-99.
Movshon, J. A., Thompson, I. D. and Tolhurst, D. J. (1978c). Spatial and temporal
contrast sensitivity of neurones in areas 17 and 18 of the cat's visual cortex.
J. Physiol. (London), 283, 101-120.
Nealey, T. A. and Maunsell, J. H. (1994). Magnocellular and parvocellular contributions to the responses of neurons in macaque striate cortex. J. Neurosci.,
14, 2069-79.
Nelson, J. I. and Frost, B. (1985). Intracortical facilitation among co-oriented,
co-axially aligned simple cells in cat striate cortex. Exp. Br. Res., 6, 54-61.
Nelson, S. B. (1991). Temporal interactions in the cat visual system. I. Orientation-selective suppression in visual cortex. J. Neurosci., 11, 344-356.
Nestares, O. and Heeger, D. J. (1996). Modeling the apparent frequency-specific
suppression in simple cell responses. Vis. Res. Submitted.
Ohzawa, I. and Freeman, R. D. (1986a). The binocular organization of complex
cells in the cat's visual cortex. J. Neurophysiol., 56, 243-259.
Ohzawa, I. and Freeman, R. D. (1986b). The binocular organization of simple cells
in the cat's visual cortex. J. Neurophysiol., 56, 221-242.
Ohzawa, I., Sclar, G. and Freeman, R. D. (1982). Contrast gain control in the cat
visual cortex. Nature, 298, 266-268.
Ohzawa, I., Sclar, G. and Freeman, R. D. (1985). Contrast gain control in the
cat's visual system. J. Neurophysiol., 54, 651-667.
Palmer, L. A. and Davis, T. L. (1981). Receptive-field structure in cat striate
cortex. J. Neurophysiol., 46, 260-276.
Poirson, A. B., O'Keefe, L. P., Carandini, M. and Movshon, J. A. (1995). Spatial
adaptation and masking in macaque V1. Soc. Neurosci. Abs., 21, 22.
Pollen, D. and Ronner, S. (1981). Phase relationships between adjacent simple
cells in the visual cortex. Science, 212, 1409-1411.
Pollen, D. and Ronner, S. (1982). Spatial computation performed by simple and
complex cells in the visual cortex of the cat. Vis. Res., 22, 101-118.
Pollen, D. A., Gaska, J. P. and Jacobson, L. D. (1988). Responses of simple and
complex cells to compound sine-wave gratings. Vis. Res., 28, 25-39.
Powers, R. K. and Binder, M. D. (1995). Effective synaptic current and motoneuron
firing rate modulation. J. Neurophysiol., 74, 793-801.
Powers, R. K., Robinson, F. R., Konodi, M. A. and Binder, M. D. (1992). Effective
synaptic current can be estimated from measurements of neuronal discharge.
J. Neurophysiol., 68, 964-968.
Reichardt, W., Poggio, T. and Hausen, K. (1983). Figure-ground discrimination
by relative movement in the visual system of the fly. Part II. Towards the
neural circuitry. Biol. Cyb., 46 (Suppl.), 1-30.
Reid, R. C. and Alonso, J. M. (1995). Specificity of monosynaptic connections
from thalamus to visual cortex. Nature, 378, 281-284.
Reid, R. C., Soodak, R. E. and Shapley, R. M. (1987). Linear mechanisms of
directional selectivity in simple cells of cat striate cortex. Proc. Natl. Acad.
Sci., 84, 8740-8744.
Reid, R. C., Soodak, R. E. and Shapley, R. M. (1991). Directional selectivity and
spatiotemporal structure of receptive fields of simple cells in cat striate cortex.
J. Neurophysiol., 66, 505-529.
Reid, R. C., Victor, J. D. and Shapley, R. M. (1992). Broadband temporal stimuli
decrease the integration time of neurons in cat striate cortex. Vis. Neurosci.,
9, 39-45.
Rose, D. (1977). On the arithmetical operation performed by inhibitory synapses
onto the neuronal soma. Exp. Br. Res., 28, 221-223.
van Santen, J. P. H. and Sperling, G. (1985). Elaborated Reichardt detectors. J.
Opt. Soc. Am. A, 2, 300-321.
Saul, A. B. and Humphrey, A. L. (1990). Spatial and temporal response properties
of lagged and nonlagged cells in cat lateral geniculate nucleus. J. Neurophysiol., 64, 206-224.
Saul, A. B. and Humphrey, A. L. (1992). Evidence for input from nonlagged cells
in the lateral geniculate nucleus to simple cells in cortical area 17 of the cat.
J. Neurophysiol., 68, 1190-.
Schiller, P. H., Finlay, B. L. and Volman, S. F. (1976). Quantitative studies of
single-cell properties in monkey striate cortex. I. Spatiotemporal organization
of receptive fields. J. Neurophysiol., 39, 1288-1319.
Schumer, R. A. and Movshon, J. A. (1984). Length summation in simple cells of
cat striate cortex. Vis. Res., 24, 565-571.
Schwindt, P. C. and Calvin, W. H. (1973). Equivalence of synaptic and injected
current in determining the membrane potential trajectory during motoneuron
rhythmic firing. Brain Res., 59, 389-394.
Schwindt, P. C., Spain, W. J. and Crill, W. E. (1988a). Influence of anomalous rectifier activation on afterhyperpolarizations of neurons from cat sensorimotor
cortex in vitro. J. Neurophysiol., 59, 468-481.
Schwindt, P. C., Spain, W. J., Foehring, R. C., Chubb, M. C. and Crill, W. E.
(1988b). Slow conductances in neurons from cat sensorimotor cortex in vitro
and their role in slow excitability changes. J. Neurophysiol., 59, 450-467.
Schwindt, P. C., Spain, W. J., Foehring, R. C., Stafstrom, C. E., Chubb, M. C. and
Crill, W. E. (1988c). Multiple potassium conductances and their functions in
neurons from cat sensorimotor cortex in vitro. J. Neurophysiol., 59, 424-449.
Sclar, G. and Freeman, R. D. (1982). Orientation selectivity of the cat's striate
cortex is invariant with stimulus contrast. Exp. Br. Res., 46, 457-461.
Sclar, G., Lennie, P. and DePriest, D. D. (1989). Contrast adaptation in striate
cortex of macaque. Vis. Res., 29, 747-755.
Sclar, G., Maunsell, J. H. R. and Lennie, P. (1990). Coding of image contrast in
central visual pathways of the macaque monkey. Vis. Res., 30, 1-10.
Sengpiel, F. and Blakemore, C. (1994). Interocular control of neuronal responsiveness in cat visual cortex. Nature, 368, 847-850.
Sengpiel, F., Blakemore, C. and Harrad, R. (1995). Interocular suppression in the
primary visual cortex: a possible neural basis of binocular rivalry. Vis. Res.,
35, 179-196.
Shapley, R. and Enroth-Cugell, C. (1984). Visual adaptation and retinal gain
control. Progress in Retinal Research, 3, 263-346.
Shapley, R., Reid, R. C. and Soodak, R. (1991). Spatiotemporal receptive fields and
direction selectivity. In M. Landy and J. A. Movshon (Eds.), Computational
Models of Visual Processing (pp. 109-118). Cambridge, MA: MIT Press.
Shapley, R. M. and Perry, V. H. (1986). Cat and monkey retinal ganglion cells
and their visual functional roles. TINS, (pp. 1-7).
Shapley, R. M. and Victor, J. D. (1978). The effect of contrast on the transfer
properties of cat retinal ganglion cells. J. Physiol., 285, 275-298.
Sherman, S. M., Schumer, R. A. and Movshon, J. A. (1984). Functional cell classes
in the macaque's LGN. Soc. Neurosci. Abs., 10, 296.
Sillito, A. M., Grieve, K. L., Jones, H. E., Cudeiro, J. and Davis, J. (1995). Visual
cortical mechanisms detecting focal orientation discontinuities. Nature, 378,
492-496.
Silva, L. R., Amitai, Y. and Connors, B. W. (1991). Intrinsic oscillations of neocortex generated by layer 5 pyramidal neurons. Science, 251, 432-435.
Simoncelli, E. P. and Heeger, D. J. (1996). A comprehensive model of neural
responses in area MT. In preparation.
Singer, W. (1991). The formation of cooperative cell assemblies in the visual
cortex. In J. Kruger (Ed.), Neuronal Cooperativity (pp. 165-183). Berlin:
Springer-Verlag.
Skottun, B. C., Bradley, A., Sclar, G., Ohzawa, I. and Freeman, R. D. (1987). The
effects of contrast on visual orientation and spatial frequency discrimination:
A comparison of single cells and behavior. J. Neurophysiol., 57, 773-786.
Skottun, B. C., De Valois, R. L., Grosof, D. H., Movshon, J. A., Albrecht, D. G.
and Bonds, A. B. (1991). Classifying simple and complex cells on the basis of
response modulation. Vis. Res., 31, 1079-1086.
Somers, D. C., Nelson, S. B. and Sur, M. (1995). An emergent model of orientation
selectivity in cat visual cortical simple cells. J. Neurosci., 15, 5448-5465.
Spatz, W. B., Vogt, D. M. and Illing, R. B. (1991). Delineation of the striate
cortex and the striate-peristriate projections in the guinea pig. Exp. Br. Res.,
84, 495-504.
Spekreijse, H. (1969). Rectification in the goldfish retina: analysis by sinusoidal
and auxiliary stimulation. Vis. Res., 9, 1461-1472.
Spekreijse, H. and Oosting, H. (1970). Linearizing: a method for analysing and
synthesizing nonlinear systems. Kybernetik, 7, 1461-1472.
Sperling, G. and Sondhi, M. M. (1968). Model for visual luminance discrimination
and flicker detection. J. Opt. Soc. Am. A, 58, 1133-1145.
Stafstrom, C. E., Schwindt, P. C. and Crill, W. E. (1984a). Cable properties of
layer V neurons from cat sensorimotor cortex in vitro. J. Neurophysiol., 52,
278-288.
Stafstrom, C. E., Schwindt, P. C. and Crill, W. E. (1984b). Properties of subthreshold response and action potential recorded in layer V neurons from cat
sensorimotor cortex in vitro. J. Neurophysiol., 52, 244-263.
Stafstrom, C. E., Schwindt, P. C. and Crill, W. E. (1984c). Repetitive firing in
layer V neurons from cat neocortex in vitro. J. Neurophysiol., 52, 264-277.
Stuart, G. J. and Sakmann, B. (1994). Active propagation of somatic action
potentials into neocortical pyramidal cell dendrites. Nature, 367, 69-72.
Suarez, H. H., Koch, C. and Douglas, R. J. (1995). Modeling direction selectivity
of simple cells in striate visual cortex within the framework of the canonical
microcircuit. J. Neurosci., 15, 6700-6719.
Tadmor, Y. and Tolhurst, D. J. (1989). The effect of threshold on the relationship
between the receptive-field profile and the spatial-frequency tuning curve in
simple cells of the cat's striate cortex. Vis. Neurosci., 3, 445-454.
Tanaka, K. (1983). Cross-correlation analysis of geniculostriate neuronal relationships in cats. J. Neurophysiol., 49, 1303-1318.
Tolhurst, D. J. and Dean, A. F. (1987). Spatial summation by simple cells in the
striate cortex of the cat. Exp. Br. Res., 66, 607-620.
Tolhurst, D. J. and Dean, A. F. (1990). The effects of contrast on the linearity of
spatial summation of simple cells in the cat's striate cortex. Exp. Brain Res.,
79, 582-588.
Tolhurst, D. J. and Dean, A. F. (1991). Evaluation of a linear model of directional
selectivity in simple cells of the cat's striate cortex. Vis. Neurosci., 6, 421-428.
Tolhurst, D. J. and Heeger, D. J. (1996a). Contrast normalization and a linear
model for the directional selectivity of simple cells in cat striate cortex. Vis.
Neurosci. In press.
Tolhurst, D. J. and Heeger, D. J. (1996b). Contrast normalization and hard threshold models of the responses of simple cells in cat striate cortex. Vis. Neurosci.
In press.
Tolhurst, D. J., Movshon, J. A. and Dean, A. F. (1983). The statistical reliability
of single neurons in cat and monkey visual cortex. Vis. Res., 23, 775-785.
Tolhurst, D. J., Walker, N. S., Thompson, I. D. and Dean, A. F. (1980). Nonlinearities of temporal summation in neurones in area 17 of the cat. Exp. Br.
Res., 38, 431-435.
Toyama, K., Kimura, M., Shiida, T. and Takeda, T. (1977a). Convergence of
retinal inputs onto visual cortical cells: II. A study of the cells disynaptically
excited from the lateral geniculate body. Br. Res., 137, 221-231.
Toyama, K., Maikawa, K. and Tanaka, T. (1977b). Convergence of retinal inputs
onto visual cortical cells: I. A study of the cells monosynaptically excited from
the lateral geniculate body. Br. Res., 137, 207-220.
Toyama, K., Matsunami, K., Ohno, T. and Tokashiki, S. (1974). An intracellular
study of neuronal organization in the visual cortex. Exp. Br. Res., 21, 45-66.
Toyama, K. and Takeda, T. (1974). A unique class of cat's visual cortical cells
that exhibit either ON or OFF excitation for stationary light slits and are
responsive to moving edge patterns. Br. Res., 73, 350-355.
Troy, J. B. (1983). Spatial contrast sensitivities of X and Y type neurones in the
cat's dorsal lateral geniculate nucleus. J. Physiol. (London), 344, 399-417.
Victor, J. (1987). The dynamics of the cat retinal X cell centre. J. Physiol.
(London), 386, 219-246.
Victor, J. (1988). The dynamics of the cat retinal Y cell subunit. J. Physiol.
(London), 405, 289-320.
Victor, J. and Shapley, R. M. (1980). A method of nonlinear analysis in the
frequency domain. Biophys. J., 29, 459-484.
Victor, J., Shapley, R. M. and Knight, B. W. (1977). Nonlinear analysis of cat
retinal ganglion cells in the frequency domain. Proc. Natl. Acad. Sci., 74,
3068-3072.
Victor, J. D. and Knight, B. W. (1979). Nonlinear analysis with an arbitrary
stimulus ensemble. Q. App. Math., 37, 113-136.
Walker, G. A., Ohzawa, I. and Freeman, R. D. (1996). Interocular transfer of
cross-orientation suppression in the cat's visual cortex. Inv. Opht. and Vis.
Sci. (Suppl.), 37, S485.
Watanabe, S., Konishi, M. and Creutzfeldt, O. D. (1966). Postsynaptic potentials
in the cat's visual cortex following electrical stimulation of afferent pathways.
Exp. Br. Res., 1, 272-283.
Watson, A. B. and Ahumada, A. J. (1985). Model of human visual-motion sensing.
J. Opt. Soc. Am. A, 2, 322-342.
Wree, A., Zilles, K. and Schleicher, A. (1981). A quantitative approach to cytoarchitectonics. VII. The areal pattern of the cortex in the guinea pig. Anatomy
and Embryology, 162, 81-103.
Yoshioka, T., Levitt, J. B. and Lund, J. (1994). Independence and merger of thalamocortical channels within macaque monkey primary visual cortex: anatomy
of interlaminar projections. Vis. Neurosci., 11, 467-489.