Millisecond encoding precision of auditory cortex neurons

Millisecond encoding precision of auditory
cortex neurons
Christoph Kaysera,1, Nikos K. Logothetisa,b,1, and Stefano Panzeria,c
a
Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany; bDivision of Imaging Science and Biomedical Engineering, University of
Manchester, Manchester M13 9PT, United Kingdom; and cRobotics, Brain, and Cognitive Sciences Department, Italian Institute of Technology, 30 16163
Genoa, Italy
Neurons in auditory cortex are central to our perception of sounds.
However, the underlying neural codes, and the relevance of millisecond-precise spike timing in particular, remain debated. Here,
we addressed this issue in the auditory cortex of alert nonhuman
primates by quantifying the amount of information carried by precise spike timing about complex sounds presented for extended
periods of time (random tone sequences and natural sounds). We
investigated the dependence of stimulus information on the temporal precision at which spike times were registered and found
that registering spikes at a precision coarser than a few milliseconds significantly reduced the encoded information. This dependence demonstrates that auditory cortex neurons can carry
stimulus information at high temporal precision. In addition, we
found that the main determinant of finely timed information was
rapid modulation of the firing rate, whereas higher-order correlations between spike times contributed negligibly. Although the
neural coding precision was high for random tone sequences
and natural sounds, the information lost at a precision coarser
than a few milliseconds was higher for the stimulus sequence that
varied on a faster time scale (random tones), suggesting that the
precision of cortical firing depends on the stimulus dynamics. Together, these results provide a neural substrate for recently reported behavioral relevance of precisely timed activity patterns
with auditory cortex. In addition, they highlight the importance
of millisecond-precise neural coding as general functional principle
of auditory processing—from the periphery to cortex.
hearing
| neural code | spike timing | natural sounds | mutual information
O
ur auditory system can reliably encode natural sounds that
vary simultaneously over many time scales, and from these
sounds, it can extract behaviorally relevant information such as
a speaker’s identity or the sound qualities of musical instruments
(1, 2). Neurons in the auditory cortex are sensitive to temporally
structured acoustic stimuli and likely play a central role in mediating their perception (3). Yet, how exactly auditory cortex
neurons use their temporal patterns of action potentials (spikes)
to provide an information-rich representation of complex or
natural sounds remains debated.
One central question pertains to the relevance of millisecondprecise spike times of individual auditory cortical neurons for
stimulus encoding. Previous work has shown that onset responses
to isolated brief sounds can be millisecond-precise and carry
acoustic information (4–6). However, auditory cortex neurons
cannot lock their responses to rapid sequences of short stimuli.
For example, during the repetitive presentation of brief sounds,
their response precision degrades when the intersound interval
becomes shorter than ≈10–20 ms. This finding argues against a role
of millisecond-precise spike timing in the encoding of fast
sequences of short sounds (7, 8). In addition, studies on the
decoding of complex sounds from single-trial responses reported
time scales of 10–50 ms as optimal (9) or suggested only small
benefits of knowing the spike timing in addition to knowing the
spike rate (10, 11). Together, these results may be taken to suggest
that millisecond-precise spike timing of individual auditory cortex
www.pnas.org/cgi/doi/10.1073/pnas.1012656107
neurons may not encode additional information about complex
sounds beyond that contained in the spike rate on the 10- to 20-ms
scale. However, it could also be that the importance of precisely
stimulus-locked spike patterns was masked in previous studies,
because anesthetic drugs can suppress stimulus-synchronized
spikes (12, 13), because stimulus-induced temporal patterns of
cortical activity may be less precise during the presentation of brief
artificial sounds than during the presentation of extended periods of
naturalistic sounds (14, 15), or because some previously used
analysis methods did not capture all possible information carried by
the responses (16).
Here, we investigated the role of millisecond-precise spike timing of individual neurons in the auditory cortex of alert nonhuman
primates. In particular, unlike previous studies that used isolated
and brief sounds, we considered the encoding of prolonged complex sounds that were presented as a continuous sequence. We
computed direct estimates of the mutual information between
stimulus and neural response, which can capture all of the information that neural responses carry about the sound stimuli. We
found that auditory cortex neurons encode complex sounds by
millisecond-precise variations in firing rates. Registering the same
responses at a coarser (e.g., tens of millisecond) precision considerably reduced the amount of information that can be recovered.
Results
Encoding of Complex Sounds. We recorded single neurons’ respon-
ses in the caudal auditory cortex of passively listening macaque
monkeys. In a first experiment, we recorded the responses of 41
neurons to a sequence of pseudorandom tones (so called “random
chords”; Fig. 1A). Given that this stimulus has a short (few milliseconds) correlation time scale (Fig. 1B), and given the debate
about whether there could be a precise temporal encoding in the
absence of response locking to fast sequences of stimulus presentations (7), random chords are particularly suited to determine
whether precise spike timing can contribute to the encoding of
complex sounds within a rich acoustic background. We found that
responses to random chords were highly reliable and precisely timelocked across stimulus repeats, resulting in good alignment of spikes
across trials and sharp peaks in the trial-averaged response (the
peristimulus time histogram, PSTH). This precision is visualized by
the example responses in Fig. 1C and Fig. S1. The width of the peaks
in the PSTH reflects the duration of stimulus-induced firing events
and their trial-to-trial variability and can be used as a simple measure of response precision (17, 18). By following Desbordes et al.
(19), we used the PSTH autocorrelogram to quantify the width of
such peaks. For many neurons, this analysis revealed very narrow
correlogram peaks (as illustrated by the example data; Fig. 1C and
Author contributions: C.K. and S.P. designed research; C.K. performed research; C.K. and
S.P. analyzed data; and C.K., N.K.L., and S.P. wrote the paper.
The authors declare no conflict of interest.
1
To whom correspondence may be addressed. E-mail: [email protected].
de or [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.
1073/pnas.1012656107/-/DCSupplemental.
PNAS Early Edition | 1 of 6
NEUROSCIENCE
Contributed by Nikos K. Logothetis, August 25, 2010 (sent for review June 24, 2010)
0
A
B
0
1
Frequency [Hz]
Frequency [Hz]
2000
4000
6000
8000
400
1600
6400
10000
0
−80
−40
0
40
80
Time lag [ms]
1s
C
Information [bit]
Trials
-100
0
100
Time lag [ms]
0.4
0.3
0.2
200
300
E
12
# neurons
400
6
0
0
5
10
15
HWHH [ms]
20
500
Time [ms]
600
700
800
900
100
F
80
60
2
4 6 12
Effective precision [ms]
1000
% neurons
D
100
Normalized information
0
80
60
40
20
40
0
20
1
2
4
6
12
24
Effective precision [ms]
48
2
4
6
12
24
48
Effective precision [ms]
Fig. 1. Responses to random chords. (A) Time-frequency representation of a short stimulus section. Red colors indicate high sound amplitude, and individual
tones of the random chord stimulus are visible as red lines. (B) Autocorrelation of the amplitude of individual sound frequency bands. Strong correlations (red
colors) are limited to within ±20 ms. (C) Example data showing the response of a single neuron for a stimulus section of 1-s duration. (Left) Spike times on
individual stimulus repeats and the trial-averaged response (PSTH). (Right) Autocorrelation of the PSTH (Upper) and stimulus information (in bits) obtained at
different effective temporal precisions (Lower). The high response precision of the neuron is visible as the tight alignment of spikes across trials, the narrow
peak of the correlogram, and the rapid drop of information with decreasing effective precision. Note that information values are only shown for precision <2
ms, because finer precision did not add information (compare E and F). (D) Distribution of the PSTH autocorrelogram half-width (HWHH) across neurons (n =
41). The median value is 5 ms. (E) Normalized stimulus information obtained from responses sampled at different effective precisions. For each neuron, the
absolute information values were normalized (100%) to the value at 1-ms precision. Boxplots indicate the median and 25th and 75th percentile across
neurons. Information values from 1 to 6 ms were calculated by using Δt = 1-ms bins, whereby the effective precisions of 2, 3, and 6 ms were obtained by
shuffling spikes in two, three, and six neighboring bins; values for 12, 24, and 48-ms precision were computed by using bins of Δt = 2, 4, and 8 ms, respectively,
and by shuffling spikes in six neighboring bins. (F) Percentage of neurons with significant information loss (compared with 1-ms precision) at each effective
temporal precision (bootstrap procedure, P < 0.05). Note that neurons with a significant information loss at (e.g., 4-ms) precision also lose information at even
coarser precisions, hence resulting in an increasing proportion of neurons with decreasing effective precision.
Fig. S1), indicating a response precision finer than 10 ms (the
population median of the half-width at half-height of the PSTH
autocorrelogram was 5 ms; Fig. 1D).
Yet, the width of PSTH peaks does not fully establish the
temporal scale at which spikes carry information, because, for
example, spikes might encode stimulus information depending
on where they occur within a PSTH peak. We thus used Shannon
information [I(S;R); Eq. 1] to directly quantify the information
carried by temporal spike patterns sampled at different temporal
precision. Shannon information measures the reduction of uncertainty about the stimulus gained by observing a single trial response (averaged over all stimuli and responses) and was computed
from many (average 55) repeats of the same stimulus sequence by
using the direct method (20). We computed the stimulus information carried by spike patterns characterized as “binary”
(spike/no spike) sequences sampled in fine (1-ms) time bins, and
used a temporal shuffling procedure to compare the stimulus information encoded at different “effective” response precisions.
This shuffling procedure entailed shuffling spikes in nearby time
2 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.1012656107
bins (compare Methods and Fig. S2A) and can be used to progressively degrade the effective precision of a spike train without
affecting the statistical dimensionality of the data. As “stimuli” we
considered eight randomly selected sections form the long stimulus
sequence (Fig. S2B). The resulting information estimates hence
indicate how well these different sounds can be discriminated given
the observed responses.
Estimating stimulus information from responses at effective
precisions coarser than 1 ms resulted in a considerable information
loss. This information loss is illustrated for the example neurons,
which reveal a considerable drop in stimulus information with
progressively degrading temporal precision (Fig. 1C and Fig. S1A).
Across the population of neurons, the dependency of stimulus information on response precision was quantified by normalizing
information values at coarser precisions by the information derived
from the original (1-ms precision) response (Fig. 1E). Across
neurons the information loss amounted to 5, 11, and 20% (median)
for effective precisions of 4, 6, and 12 ms, respectively, demonKayser et al.
Response Features Contributing to Millisecond-Precise Coding. The
above analysis demonstrates that auditory cortex neurons encode
sounds by using responses with a temporal precision of a few
milliseconds. This result raises the question of what specific
aspects of the spike train carry this information. It could be that
the only information-bearing feature is the temporal modulation
of the PSTH, hence the time-dependent firing rate. However, it
could also be that higher-order correlations within the spike response pattern carry additional information. We quantified the
relative importance of PSTH modulations and of correlations
between spike times by computing IPSTH (SI Methods, Eq. S1),
which is a lower bound to the amount of stimulus information that
can be extracted by an observer that considers only the timedependent firing rate but ignores higher-order correlations (21).
We found that IPSTH accounted for 99 ± 0.2% of the total information I(S;R) that can be derived from the responses for precisions below 48 ms and 96 ± 1.6% at 48-ms precision (mean ±
SEM across neurons). This high fraction demonstrates that essentially the entire stimulus information can be extracted without
knowledge of higher-order correlations. Modulations of timedependent firing rate are thus a key factor in the auditory cortical
representation of sounds.
Encoding of Natural Sounds. In a second experiment we further
tested the importance of sampling responses at millisecond precision using an additional set of neurons (n = 31) recorded during
the presentation of natural sounds. This stimulus (Fig. 2A), which
has correlation time scale of few tens of milliseconds (Fig. 2B),
comprised a continuous sequence of environmental sounds, animal vocalizations and conspecific macaque vocalizations.
We found that responses to natural sounds were also characterized by high temporal precision: Several neurons (e.g., Fig. 2C,
Upper) exhibited similarly precise responses as observed during
random chord stimulation. For these neurons, spike times were
reliable and precisely time-locked across stimulus repeats, resulting in sharp peaks in the trial-averaged response and narrow peaks
in the PSTH autocorrelogram. However, other neurons exhibited
responses with coarser precision and wider peaks in the PSTH
(e.g., Fig. 2C, Lower). This neuron by neuron variability is reflected
in the wide distribution of the PSTH-derived response precision
(median, 16 ms; Fig. 2D) and the wide distribution of information
loss when reading responses at different effective temporal precisions (Fig. 2E). In fact, across the entire sample of neurons, the
proportional information lost by reducing the effective temporal
Kayser et al.
precision was smaller than the results obtained during stimulation
with random chords (compare Fig. 1).
Importantly, however, a subset of neurons carried considerable
stimulus information at high temporal precision and revealed
a significant information loss when ignoring the temporal response
precision. The fraction of neurons with significant information
loss (bootstrap test; P < 0.05) was 11 and 17% at 6 and 12 ms,
respectively (Fig. 2F). For those with significant loss at 12 ms, the
information transmitted per spike dropped from 3.8 bits/spike at
1 ms to 3.2 and 2.7 bits/spike (median) at 6 and 12 ms precision,
respectively. This result demonstrates that millisecond-precise
spike timing can also carry additional information about natural
sounds that cannot be recovered from the spike count on the scale
of ≈10 ms or coarser.
Discussion
Temporal Encoding of Sensory Information. Determining the precision of the neural code (i.e., the coarsest resolution sufficient to
extract all of the information carried by a neuron’s response) is
a fundamental prerequisite for characterizing and understanding
sensory representations (17, 18, 22–24). In recent years, the temporal precision of neural codes has been systematically investigated in several sensory structures, and in many cases, it was
suggested that the nervous system uses millisecond-precise activity
to encode sensory information (22, 25, 26). For example, neurons
in rat somatosensory cortex encode information about whisker
movements with a precision of a few milliseconds (27), and the
responses of retinal ganglion cells can account for the visual discrimination performance of the animal only if they are registered
with millisecond precision (28).
In the auditory system, the role of millisecond-precise spike
times is well established at the peripheral and subcortical level (29,
30), but it remains debated whether auditory cortical neurons encode acoustic information with millisecond-precise spike timing (8,
9, 11). Our results demonstrate that neurons in primate auditory
cortex can indeed use spike times with a precision of a few milliseconds to carry information about complex and behaviorally relevant sounds, albeit at a precision coarser than found for peripheral
or subcortical auditory structures. When sampling or decoding
these responses at millisecond precision, considerably more information can be obtained than when decoding the same response
at coarser precision. For example, about a third of the neurons
exhibited a significant information loss at 6-ms effective precision.
Our results also revealed that basically the entire information in the
spike train is provided by the time-dependent firing rate and that
higher order correlations between spike times add negligibly to this.
Together these results show that auditory cortex neurons encode
complex sounds by using finely timed modulations of the firing rate
that occur on the scale of a few milliseconds.
It is tempting to hypothesize that such millisecond-precise single
neuron-encoding serves as neural substrate for recently reported
behavioral correlates of millisecond-precise cortical activity. For
example, trained rats were shown to being able to report millisecond differences in patterns of electrical microstimulation applied to auditory cortex (31), demonstrating that auditory cortex
can access temporally precise patterns of local activity. In addition,
another study demonstrated that behavioral performance can be
better explained when decoding neural population activity sampled at a precision well below 10 ms than when decoding the same
responses at coarser time scales (32). The present results demonstrate that the responses of individual neurons can indeed
contribute to such millisecond encoding of complex sounds.
Our finding that millisecond-precise spike patterns can carry
considerable information about the acoustic environment suggests
that neurons in higher auditory cortices should be sensitive to the
timing of neurons in earlier cortices to extract such information.
Acoustic information might hence be passed through the different
stages of acoustic analysis by either preserving the temporal prePNAS Early Edition | 3 of 6
NEUROSCIENCE
strating that a considerable fraction of the encoded stimulus information is lost when sampling responses at coarser resolution.
We evaluated the statistical significance of the information
lost at coarser precision for individual neurons by using a bootstrap test. For each neuron we estimated the information lost by
reducing the temporal precision given a (artificially generated)
response that does not carry information at that particular precision (compare Methods). This bootstrap test demonstrated that
the information loss was significant (P < 0.05) for some neurons
already at 4-ms effective precision (7% of the neurons), and for
nearly half the neurons (46%) at 12-ms precision (Fig. 1F). In
addition, we did not find neurons with a significant information
loss when reducing the response precision from 1- to 2-ms bins.
This result demonstrates that for the present dataset the temporal precision of information coding is coarser than one, but
finer than ≈4–6 ms.
The information that each spike gains when registered at high
precision can be better appreciated by expressing information
values as information per spike. For those neurons with a significant loss at 12-ms precision (46% of the neurons), information per spike was 4.3 bits/spike (median) at 1-ms response
precision and dropped to 3.6 and 2.6 bits/spike at 6 and 12-ms
precision, respectively.
A
B
0
0
1
Frequency [Hz]
Frequency [Hz]
2000
4000
6000
8000
400
1600
6400
10000
0
−80
−40
0
40
80
Time lag [ms]
1s
C
Information [bit]
-100
0
0.3
0.2
0.1
0
2
Information [bit]
-100
300
400
E
5
0
0
10
20 30 40
HWHH [ms]
50
500
600
Time [ms]
700
800
900
1000
F
100
80
% neurons
200
# neurons
D
100
Normalized information
0
60
40
100
0.4
4
6
12
0
100
Time lag [ms]
0.4
0.3
0.2
0.1
0
2
4
6 12
Effective precision [ms]
40
30
20
10
0
20
1
2
4
6 12 24
Effective precision [ms]
48
2
4
6 12 24 48
Effective precision [ms]
Fig. 2. Responses to natural sounds. (A) Time-frequency representation of a short stimulus section. Red colors indicate high sound amplitude. (B) Autocorrelation
of the amplitude of individual sound frequency bands. Strong correlations (red colors) extend over several tens of milliseconds. (C) Example data showing the
responses of two neurons for a stimulus section of 1-s duration. (Left) Spike times on individual stimulus repeats and the trial-averaged response (PSTH). (Right)
Autocorrelation of the PSTH (Upper) and stimulus information (in bits) obtained at different effective temporal precisions (Lower). Upper is characterized by
narrow peaks in the PSTH and the PSTH autocorrelogram, whereas Lower is characterized by broad peaks and a marginal drop of stimulus information obtained
at different effective temporal resolutions. Note that information values are only shown for precision <2 ms, because finer precision did not add information
(compare E and F). (D) Distribution of the PSTH autocorrelogram half-width (HWHH) across neurons (n = 31). (E) Normalized stimulus information obtained from
responses sampled at different effective precisions. For each neuron, the absolute information values were normalized (100%) to the value at 1-ms precision.
Boxplots indicate the median and 25th and 75th percentile across neurons. (F) Percentage of neurons with significant information loss (compared with 1-ms
precision) at each effective temporal precision (bootstrap procedure, P < 0.05). Note that neurons with a significant information loss at (e.g., 4-ms) precision also
lose information at even coarser precisions, hence resulting in an increasing proportion of neurons with decreasing effective precision.
cision of activity volleys generated in the auditory periphery, or by
creating de novo temporally precise activity patterns that encode
nontemporal features. In either case, our findings emphasize that
precise response timing not only prevails at peripheral stages of
the auditory system, but is a general principle of auditory coding
throughout the sensory pathway.
Stimulus Decoding Versus Direct Information Estimates. Several
previous analyses on the time scales of stimulus representations in
auditory cortex were based on decoding techniques, which predict
the most likely stimulus given the observation of a neural response
(e.g., ref. 9), rather than direct estimates of mutual information as
used here. By making specific assumptions, these methods reduce
the statistical complexity and can yield reliable results also when
only small amounts of experimental data are available. However,
4 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.1012656107
even well constructed decoders may not be able to capture all of
the information carried by neural responses (16). To determine
whether such methodological aspects may affect the apparent temporal encoding precision, we compared results from linear stimulus
decoding and direct information methods (compare SI Materials
and Methods and SI Results). We found that linear decoding resulted in smaller losses of stimulus information when degrading
temporal precision and, hence, would have indicated a coarser
encoding time scale than that found with direct estimates of the
stimulus information (SI Results and Fig. S3). This lower temporal
sensitivity suggests that some decoding procedures may fail to reveal
part of the information encoded by neurons at very fine precisions.
Dependency of Coding Precision on Stimulus Dynamics. Our results
reveal a dependency of neural response precision on the promKayser et al.
Materials and Methods
Recording Procedures and Data Extraction. All procedures were approved by
the local authorities (Regierungspräsidium Tübingen) and were in full
compliance with the guidelines of the European Community (Directive 86/
609/EEC). Neural activity was recorded from the caudal auditory cortex of
three adult rhesus monkeys (Macaca mulatta) by using procedures detailed
in studies (15, 37). Briefly, single neuron responses were recorded by using
multiple microelectrodes (1–6 MOhm impedance) and amplified (Alpha
Omega system) and digitized at 20.83 kHz. Recordings were performed in
a dark and anechoic booth, while animals were passively listening to the
acoustic stimuli. Recording sites were assigned to auditory fields (primary
Kayser et al.
field A1 and caudal belt fields CM and CL) based on stereotaxic coordinates,
frequency maps constructed for each animal, and the responsiveness for
tone vs. band-passed stimuli (37). Spike-sorted activity was extracted by using commercial spike-sorting software (Plexon Offline Sorter) after high-pass
filtering the raw signal at 500 Hz (third-order Butterworth filter). Only units
with high signal to noise (SNR > 10) and <1% of spikes with interspike
intervals shorter than 2 ms were included in this study.
Acoustic Stimuli. Sounds (average 65 dB sound pressure level) were delivered
from two calibrated free field speakers (JBL Professional) at 70-cm distance.
Two different kinds of stimuli were used. The first was a 40-s sequence of
pseudorandom tones (“random chords”). This sequence was generated by
presenting multiple tones (125-ms duration) at different frequencies (12 fixed
frequency bins per octave), with each tone frequency appearing (independently of the others) with an exponentially distributed intertone interval (range 30–1,000 ms, median 250 ms; see Fig. 1A for a spectral representation). The second was a continuous 52-s sequence of natural sounds. This
sequence was created by concatenating 21 each 1- to 4-sec-long snippets of
various naturalistic sounds, without periods of silence in between (animal
vocalizations, environmental sounds, conspecific vocalizations, and short
samples of human speech; Fig. 2A). Each stimulus was repeated multiple times
(on average 55 repeats of the same stimulus, range 39–70 repeats) and for
a given recording site only one of the two stimuli was presented.
The temporal structure of these stimuli was quantified by using the autocorrelation of individual frequency bands. The envelope of individual
quarter octave bands was extracted by using third-order Butterworth filters
and applying the Hilbert transform, and the autocorrelation was computed.
This calculation revealed that the correlation time constant of the random
chords extended over <20 ms (Fig. 1B) and was considerably shorter than
that of the natural sounds (Fig. 2B).
Information Theoretic Analysis. Information theoretic analysis was based on
direct estimates of stimulus information I(S;R) and followed detailed procedures (15, 38, 39). The Shannon (mutual) information between a set of
stimuli S and a set of neural responses R is defined as
IðS; RÞ ¼ ∑r;s PðsÞPðrjsÞlog2
PðrjsÞ
;
PðrÞ
[1]
with P(s) the probability stimulus s, P(r |s) the probability of the response r
given presentation of stimulus s, and P(r) the probability of response r across
all trials to any stimulus. It is important to note that, in the formalism used
here, the information theoretic calculation does not make assumptions about
the particular acoustic features driving the responses (see also refs. 15 and
40). As a result, our information estimates the overall amount of knowledge
about the sensory environment that can be gained from neural response.
The stimuli for the information analysis were defined by eight (randomly
selected) 1-s-long sections from the long sound sequences (Fig. S2B). For each
section, the time axis was divided into a number of nonoverlapping time
windows of length T (here ranging from 6 to 48 ms), and for each window,
we computed the information carried by the neural responses about which
of these eight different complex sounds was being presented. Information
values were then averaged over all time windows and over 50 sets of randomly selected eight sound sections to ensure independence of results on
the exact selection of stimulus chunks. The limited sampling bias of the information values was corrected by using established procedures described
in SI Materials and Methods.
Information Dependence on Effective Response Precision. Responses were
characterized as binary n-letter words, whereby the letter in each time (Δt)
indicates the absence (0) or presence of at least one spike (1). Here, we considered bins of Δt = 1, 2, 4, and 8 ms and patterns of length n = 6. To quantify
the impact of reading responses at different effective precisions, we compared
the information in the full pattern and patterns of degraded precision. To
reduce the effective response precision we shuffled (independently across
trials and stimuli) the responses in groups of N neighboring time bins, where
the parameter N indicates the temporal extent of the shuffling (Fig. S2A). It is
important to note that this shuffling procedure does not alter the statistical
dimensionality (number of time bins N) of the data and allows investigating
different effective temporal precisions without changing the absolute length
of the time window T. The latter is important because the associated biases of
the information estimate are kept constant (38) and because it avoids spurious
scaling of the information values by changing the considered time window
T: Direct information estimates increase rapidly with decreasing window T
because the use of short windows can underestimate the redundancy of the
PNAS Early Edition | 5 of 6
NEUROSCIENCE
inent stimulus time scales: The average PSTH-derived response
precision (Figs. 1D and 2D) was finer, and the information loss
with decreasing response precision (Figs. 1E and 2E) was larger,
for the chord stimulus, which had a much shorter autocorrelation
time scale than the natural sounds (compare Figs. 1 and 2B). Our
results hence suggest that the neural response precision and the
importance of millisecond-precise spike patterns may depend on
the overall dynamics of stimulus presentation. In this regard, our
findings parallel previous results obtained in the visual thalamus
of anesthetized cats, where visual neurons were found to exhibit
higher response precision for faster visual stimuli (18). Our
results hence generalize this dependency of neural coding time
scales upon stimulus time scales to the auditory system, sensory
cortex, and the alert primate.
The apparent dependency of the encoding precision on stimulus
dynamics also suggests that possibly even finer encoding precisions
than reported here may be found when using stimuli of yet faster
dynamics. In light of this stimulus dependency, our results may
rather provide a lower bound on the effective precision, leaving the
possibility that auditory cortex neurons might encode sensory information at the (sub) millisecond scale.
What biophysical mechanisms could be responsible for the generation of stimulus information at high temporal spike precision in
the context of more rapidly modulating sensory stimuli? A simple
hypothesis is that precision of spike times is determined by the slope
of the membrane voltage when crossing the firing threshold, with
steeper slopes resulting in more precise spike times (17, 33). On the
basis of this hypothesis, our results suggest that stimuli with faster
dynamics elicit more rapid membrane depolarizations and, hence,
more precisely timed spikes that, in return, then encode stimulus
information at high temporal precision. Moreover, a recent study
proposed that cortical networks driven by static or slowly varying
inputs are unable to encode information by reliable and precise
spikes times (34). In light of this previous result, our results suggest
that during stimulation with complex sounds input to auditory
cortex is varying sufficiently rapid to enable the representation of
stimulus information by millisecond-precise spike timing.
In addition to depending on the overall stimulus dynamics, auditory cortex responses are also modulated by sounds that occurred
several seconds in the past (35, 36). This history or context dependence might be very relevant to the study of neural codes: For
example, previous studies reporting a 10- to 20-ms coarse coding
precision of auditory cortex neurons were based on the presentation of short stimuli (natural sounds or clicks) interspersed
with long periods of silence (9, 11). In such paradigms, individual
“target” sounds are presented in a context of silence. In contrast to
such a paradigm, we here presented continuous sequences of
sounds, hence effectively embedding each target sound in the
context of other (similar) sounds. Our finding of a higher coding
precision than in previous studies might well result from this difference in stimulation paradigm. Future studies should directly
investigate the context dependency of neural coding time scales, by
e.g., presenting the same set of stimuli either separated by extended
periods of silence or embedded in a context of stimuli of the same
kind. Addressing this issue will be crucial to understand how the
brain forms sensory representations which can operate efficiently in
continuously changing sensory environments.
information carried by spikes in adjacent response windows when these windows become too short (41, 42). Note that this shuffling procedure is equivalent to decreasing the spiking precision by randomly jittering spike times
within the window T with a jitter taken from a uniform distribution.
To quantify the information loss induced by ignoring the response precision,
we normalized (for each neuron) the information value obtained at coarser
precisions by the values at 1-msprecision (hence defined as 100%, compare Figs. 1E
and 2E) and calculated the information loss as the difference of the normalized
information to 100%. To evaluate the statistical significance of the information
loss for individual neurons, we used a nonparametric bootstrap procedure: For
each unit, we first randomly shuffled along the time dimension the neural response independently in each trial and then computed the percent information
loss when evaluating the randomized response at coarser effective precisions.
This procedure was repeated 500 times, and the 95th percentile (P = 0.05) was
used as the statistical threshold to identify significant information loss.
1. Plack CJ, Oxenham AJ, Fay RR, Popper AN, eds (2005) Pitch: Neural Coding and
Perception (Springer, New York).
2. Rosen S (1992) Temporal information in speech: Acoustic, auditory and linguistic
aspects. Philos Trans R Soc Lond B Biol Sci 336:367–373.
3. Zatorre RJ (1988) Pitch perception of complex tones and human temporal-lobe
function. J Acoust Soc Am 84:566–572.
4. DeWeese MR, Wehr M, Zador AM (2003) Binary spiking in auditory cortex. J Neurosci
23:7940–7949.
5. Heil P, Irvine DR (1997) First-spike timing of auditory-nerve fibers and comparison
with auditory cortex. J Neurophysiol 78:2438–2454.
6. Furukawa S, Xu L, Middlebrooks JC (2000) Coding of sound-source location by
ensembles of cortical neurons. J Neurosci 20:1216–1228.
7. Lu T, Liang L, Wang X (2001) Temporal and rate representations of time-varying
signals in the auditory cortex of awake primates. Nat Neurosci 4:1131–1138.
8. Wang X, Lu T, Bendor D, Bartlett E (2008) Neural coding of temporal information in
auditory thalamus and cortex. Neuroscience 157:484–494.
9. Schnupp JW, Hall TM, Kokelaar RF, Ahmed B (2006) Plasticity of temporal pattern
codes for vocalization stimuli in primary auditory cortex. J Neurosci 26:4785–4795.
10. Bizley JK, Walker KM, King AJ, Schnupp JW (2010) Neural ensemble codes for stimulus
periodicity in auditory cortex. J Neurosci 30:5078–5091.
11. Nelken I, Chechik G, Mrsic-Flogel TD, King AJ, Schnupp JW (2005) Encoding stimulus
information by spike numbers and mean response time in primary auditory cortex. J
Comput Neurosci 19:199–221.
12. Ter-Mikaelian M, Sanes DH, Semple MN (2007) Transformation of temporal properties
between auditory midbrain and cortex in the awake Mongolian gerbil. J Neurosci 27:
6091–6102.
13. Rennaker RL, Carey HL, Anderson SE, Sloan AM, Kilgard MP (2007) Anesthesia
suppresses nonsynchronous responses to repetitive broadband stimuli. Neuroscience
145:357–369.
14. Schroeder CE, Lakatos P, Kajikawa Y, Partan S, Puce A (2008) Neuronal oscillations and
visual amplification of speech. Trends Cogn Sci 12:106–113.
15. Kayser C, Montemurro MA, Logothetis NK, Panzeri S (2009) Spike-phase coding
boosts and stabilizes information carried by spatial and temporal spike patterns.
Neuron 61:597–608.
16. Quian Quiroga R, Panzeri S (2009) Extracting information from neuronal populations:
Information theory and decoding approaches. Nat Rev Neurosci 10:173–185.
17. Mainen ZF, Sejnowski TJ (1995) Reliability of spike timing in neocortical neurons.
Science 268:1503–1506.
18. Butts DA, et al. (2007) Temporal precision in the neural code and the timescales of
natural vision. Nature 449:92–95.
19. Desbordes G, et al. (2008) Timing precision in population coding of natural scenes in
the early visual system. PLoS Biol 6:e324.
20. Rieke F, Warland D, de Ruyter van Steveninck RR, Bialek W (1999) Spikes—Exploring
the Neural Code (MIT Press, Cambridge, MA).
21. Latham PE, Nirenberg S (2005) Synergy, redundancy, and independence in population
codes, revisited. J Neurosci 25:5195–5206.
22. Panzeri S, Brunel N, Logothetis NK, Kayser C (2010) Sensory neural codes using
multiplexed temporal scales. Trends Neurosci 33:111–120.
23. Bialek W, Rieke F, de Ruyter van Steveninck RR, Warland D (1991) Reading a neural
code. Science 252:1854–1857.
24. Victor JD (2000) How the brain uses time to represent and process visual information
(1). Brain Res 886:33–46.
25. Reinagel P, Reid RC (2000) Temporal coding of visual information in the thalamus.
J Neurosci 20:5392–5400.
26. Reich DS, Mechler F, Purpura KP, Victor JD (2000) Interspike intervals, receptive fields,
and information encoding in primary visual cortex. J Neurosci 20:1964–1974.
27. Arabzadeh E, Panzeri S, Diamond ME (2006) Deciphering the spike train of a sensory
neuron: Counts and temporal patterns in the rat whisker pathway. J Neurosci 26:
9216–9226.
28. Jacobs AL, et al. (2009) Ruling out and ruling in neural codes. Proc Natl Acad Sci USA
106:5936–5941.
29. Johnson DH (1980) The relationship between spike rate and synchrony in responses of
auditory-nerve fibers to single tones. J Acoust Soc Am 68:1115–1122.
30. Lesica NA, Grothe B (2008) Dynamic spectrotemporal feature selectivity in the
auditory midbrain. J Neurosci 28:5412–5421.
31. Yang Y, DeWeese MR, Otazu GH, Zador AM (2008) Millisecond-scale differences in
neural activity in auditory cortex can drive decisions. Nat Neurosci 11:1262–1263.
32. Engineer CT, et al. (2008) Cortical activity patterns predict speech discrimination
ability. Nat Neurosci 11:603–608.
33. Azouz R, Gray CM (1999) Cellular mechanisms contributing to response variability of
cortical neurons in vivo. J Neurosci 19:2209–2223.
34. London M, Roth A, Beeren L, Häusser M, Latham PE (2010) Sensitivity to perturbations
in vivo implies high noise and suggests rate coding in cortex. Nature 466:123–127.
35. Asari H, Zador AM (2009) Long-lasting context dependence constrains neural
encoding models in rodent auditory cortex. J Neurophysiol 102:2638–2656.
36. Ulanovsky N, Las L, Farkas D, Nelken I (2004) Multiple time scales of adaptation in
auditory cortex neurons. J Neurosci 24:10440–10453.
37. Kayser C, Petkov CI, Logothetis NK (2008) Visual modulation of neurons in auditory
cortex. Cereb Cortex 18:1560–1574.
38. Panzeri S, Senatore R, Montemurro MA, Petersen RS (2007) Correcting for the sampling
bias problem in spike train information measures. J Neurophysiol 98:1064–1072.
39. Magri C, Whittingstall K, Singh V, Logothetis NK, Panzeri S (2009) A toolbox for the
fast information analysis of multiple-site LFP, EEG and spike train recordings. BMC
Neurosci 10:81.
40. Strong SP, Koberle R, de Ruyter van Steveninck RR, Bialek W (1998) Entropy and
information in neural spike trains. Phys Rev Lett 80:197–200.
41. Panzeri S, Schultz SR (2001) A unified approach to the study of temporal, correlational,
and rate coding. Neural Comput 13:1311–1349.
42. Oizumi M, Ishii T, Ishibashi K, Hosoya T, Okada M (2010) Mismatched decoding in the
brain. J Neurosci 30:4815–4826.
6 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.1012656107
ACKNOWLEDGMENTS. We are supported by the Max Planck Society, the Brain
Machine Interface (BMI) project of the Italian Institute of Technology, the
Compagnia di San Paolo, and the Bernstein Center for Computational Neuroscience Tübingen.
Kayser et al.
Supporting Information
Kayser et al. 10.1073/pnas.1012656107
SI Materials and Methods
Correction of the Limited Sampling Bias in Direct Estimates of
Information. Direct estimates of mutual information can suffer
from a bias due to the limited amount of data available to calculate
the conditional probabilities P(r|s) (compare Eq. 1). This sampling
bias in the information estimate was corrected by using a multistep
procedure, which combines the established shuffling technique (1)
and the quadratic extrapolation procedure (2). For details of this
procedure, see refs. 3 and 4. The performance of this procedure
with the number of trials was tested on simulated data in several
previous reports (3–6). A test of the performance on simulated
data with first-order statistical properties matching those of the
real auditory cortical neurons analyzed here was performed as
follows (we refer to ref. 3 for full details): We computed the time
varying peristimulus time histogram (PSTH) in 4-ms bins collected
in response to the natural sound stimulus. Then, for a given neuron
a simulated set of responses was generated by using an inhomogeneous Poisson process with the same PSTH as in the real
experiment. We then studied the variation of the information estimates as a function of the number of simulated trials included in
the estimate (see figure S2 in ref. 3 for an example). The critical
parameters for the performance of bias correction procedures are
the number of trials per stimulus N (which was in the range 39–70
in the real data, with mean 55) and the cardinality R of the response space (i.e., the number of possible different responses that
can be observed), which was R = 26 = 64 in this study. The larger R
(and the smaller N), the more difficult is to correct for the bias (1).
The simulations showed that our algorithm is highly data robust,
with very mild degradation of performance when decreasing the
number of trials. In particular, the information estimate converged
to within 4 and 1% of the “asymptotic” (large trial number) value
when using 32 and 64 trials per stimulus, respectively (average over
all simulated neurons), suggesting that our information theoretic
calculations provide a good estimate of the actual values.
Evaluation of the Relative Importance of PSTH Modulations and of
Correlations Between Spike Times in Information Transmission. In the
main text, we computed how mutual information I(S; R) (Eq. 1)
depends on the precision at which neural responses are sampled.
This calculation is useful to prove that precise spike times are
important for encoding, but it is not sufficient to determine what
aspects of the spike train are most crucial for information transmission. The simplest possibility is that the only informationbearing feature of the spike train is the temporal modulation of
the PSTH. The PSTH is proportional to the time-dependent firing
rate and expresses the first-order statistical properties of the spike
train at a given temporal resolution. A second possibility is that
neuronal firing cannot be described completely by its first-order
statistics, and information is encoded by high order statistical
properties of the spike train, such as correlations between spike
pairs that cannot be explained by firing rate modulations. In this
second case, an observer of the spike train would not be able to
decode all information available in the spike train unless it takes
into account such higher-order correlations between spike times.
We characterized the relative importance of PSTH modulations
and correlations between spike times by estimating how much
stimulus information can be extracted even by a decoder that
considers only the time-dependent firing rate and ignores temporal
correlations. By following refs. 7–9, a lower bound to this information can be computed by using the following equation:
Kayser et al. www.pnas.org/cgi/content/short/1012656107
IPSTH ¼ ∑ PðrjsÞ log2
r; s
PPSTH ðrjsÞ
;
PPSTH ðrÞ
[S1]
where PPSTH ðrjsÞis the probability of observing response r to
stimulus s if the neuron fired according to a time-dependent
Poisson process with the same PSTH as the one of the neuron
under analysis, and PPSTH ðrÞ ¼ ∑ PPSTH ðrjsÞPðsÞ.
r
To determine whether correlations are needed to extract
most information from the spike train, it is useful to compare
the value of IPSTH to the value of IðS; RÞ, which quantifies the
overall amount of information that can be obtained from the
neuronal response taking into account all its characteristics.
By information theoretic inequalities it can be proved that
IPSTH ≤ IðS; RÞ. If the ratio IPSTH =IðS; RÞ is close to 1, then most
information carried by the spike train can be extracted even by
a downstream observer of neural activity that pays attention only
to the first-order statistics and neglects correlations. The
quantity IPSTH was corrected for the limited sampling bias using
the same procedures used for the overall information IðS; RÞand
reported above.
Stimulus Decoding. An alternative approach to computing the mutual information between stimulus and neural response are methods
based on explicit models for stimulus decoding, such as linear discriminant analysis (10, 11). By making particular assumptions, such
as the linear separability of the responses to different stimuli, these
methods reduce the statistical complexity of estimating the information conveyed by neural responses about the stimulus and can
yield reliable results also for problems where only small amounts of
experimental data are available.
However, a decoding approach may miss critical ways in which
a response eliminates uncertainty about the stimulus, such as by
making incorrect or oversimplified assumptions about the stimulus–
response relationship. For example, a linear decoder (as e.g., described below) would fail to capture the contribution of nonlinear
stimulus–response relationships. Another reason why decoding
approaches may not capture all information encoded by a neural
response is that responses may encode information by other means
than just indicating the most likely stimulus (11, 12). For example,
the knowledge that a particular stimulus is totally unlikely can
provide considerable information that would not be captured by
a decoder extracting the most likely stimulus, even if this decoder
makes all of the proper assumptions about the stimulus–response
relationship (11). In contrast to the decoder, information theory
has the advantages of quantifying all of the knowledge that can be
gained from a response and of taking possible nonlinear stimulus–
response relationships into account.
Here, we used a linear decoder to probe the stimulus–response
relation at different response precisions. Details of the procedure are described as follows.
General principle. Stimulus decoding analysis is a direct approach to
test howwell apredeterminedset of stimuli canbediscriminated given
the set of observed responses. Here we used a linear discriminant
decoder in conjunction with a leave-one-out cross-validation procedure. Practically, the decoding was done as follows. We randomly
selected eight 1-s chunks from each of the long sounds (Fig. S2) and
used these as eight stimuli for decoding. For each trial and stimulus,
we then repeated the following: (i) The average responses to all other
seven stimuli were computed across all repeats of the respective
stimuli. (ii) For the current stimulus the mean response was computed by averaging across all trials, excluding the current “test” trial.
(iii) The Euclidean distance (over time points) was computed be1 of 4
tween the response on the test trial and all eight average responses.
The test trial was decoded as that stimulus yielding the minimal
distance to the test response. This procedure was repeated for each
trial of each of the eight stimuli, and the total percentage of correctly
decoded trials and the confusion matrix were determined. It should
be noted that this decoding operation is based only on the likelihood
of the neural response given the stimulus and, therefore, implicitly
assumes that all stimuli have the same prior probability of being
presented, as it actually happens in our experiment by design.
Response definition and temporal precision. Decoding was applied to
the same eight stimulus chunks used for the information analysis
and individual neuron’s responses were sampled at 1-ms resolution
(resulting in a response vector of 250 points). To study the impact
of temporal response precision on decoding, we used a temporal
shuffling procedure. The temporal precision of the response was
manipulated by shuffling (independently across trials and stimuli)
the responses in groups of n neighboring time bins, where the
parameter n indicates the temporal extend of the shuffling. Importantly, this shuffling procedure preserves the original number of
time bins and, hence, the statistical dimension of the data, but
reduces the effective precision at which the response is “read” by
the decoder.
Decoding performance quantification. To quantify the performance of
the decoder, we derived the information provided by the “confusion” matrix of the decoder. The values on a given row s and column d of the confusion matrix Q(d | s) represent the fraction of
trials on which the presented stimulus s was decoded to be stimulus
d. If decoding is perfect, the values in Q will be one on the diagonal
and zero otherwise. The confusion matrix is of interest not only
because it provides an intuitive picture of the decoding success and
errors, but also because it provides a direct link to information
theory. The information in the confusion matrix is defined by the
following equation:
insight about the stimulus–response relationship but may fall
short in capturing all of the knowledge that can be gained by
observing a neural response.
Characterization of Neural Precision by Peristimulus Time Histogram
(PSTH) Autocorrelation Width. By following Desbordes et al. (15), the
temporal precision of a neuron’s response can be measured by
using the width of the central peak in the temporal autocorrelogram of the trial averaged PSTH. Under assumptions described in
ref. 15, this analysis provides a good measure of the width of typical
peaks in the neuron’s response. We here computed this measure by
first calculating the temporal autocorrelation of the trial averaged
response and defined the response precision as the half-width at
half-height (HWHH) of the central peak of the autocorrelogram.
SI Results
Stimulus Decoding Versus Direct Information Estimates. Several
Information theoretic inequalities ensure that I(S;D) ≤ I(S;R),
with I(S;R) being the direct information estimate (13). It is important to note that even for an optimal decoder, the extracted
information may be strictly less than the information available in
the response. This inequality comes because the decoding operation captures only one aspect of the information carried by the
population response, namely the identity of the most likely stimulus. Mutual information [I(S;R)], in contrast, quantifies the
overall knowledge about the stimulus gained with the single-trial
response, including information carried by the absence of a response in a particular time window, or by providing evidence
about the relative likelihood of different stimuli (14). As a consequence, methods of (linear) decoding can provide considerable
previous analyses on the time scales of stimulus representations
were based on decoding techniques. By making specific assumptions, such as the linear separability of the responses to different
stimuli, these methods reduce the statistical complexity and can
yield reliable results also when only small amounts of experimental
data are available. However, even for optimal well constructed
decoders, the extracted information may be strictly less than the
total information available in the response (11), because the decoding operation captures only one aspect of the response: the
identity of the most likely stimulus. Direct estimates of mutual
information [I(S;R)], in contrast, quantify the overall knowledge
about the stimulus gained from a single-trial response including,
for example, information carried by the amount of certainty with
which the most likely stimulus was predicted, or by providing evidence about which are the less likely stimuli given the neural response. As a consequence, decoding methods might underestimate
the contribution of precise spike timing.
Indeed, when using a linear decoder to probe the stimulus–response relation at different effective precisions, we found a reduced
information loss compared with the direct information estimates
reported above. Fig. S3 directly compares the normalized information values obtained with the direct estimate from the responses
to random chords (as in Fig. 1E) to the information extracted from
the linear decoder (Eq. S2). For all temporal precisions, the information loss suggested by the decoder was smaller than that obtained from the direct information estimate, with the difference
between the two methods amounting up to 60% at 48 ms effective
precision. This result shows that the information carried by precisely timed spikes is better revealed when considering the full
dependency between stimulus and response and might be underestimated by decoding approaches.
1. Panzeri S, Senatore R, Montemurro MA, Petersen RS (2007) Correcting for the sampling
bias problem in spike train information measures. J Neurophysiol 98:1064–1072.
2. Strong SP, Koberle R, de Ruyter van Steveninck RR, Bialek W (1998) Entropy and
information in neural spike trains. Phys Rev Lett 80:197–200.
3. Kayser C, Montemurro MA, Logothetis NK, Panzeri S (2009) Spike-phase coding boosts and
stabilizes information carried by spatial and temporal spike patterns. Neuron 61:597–608.
4. Montemurro MA, Rasch MJ, Murayama Y, Logothetis NK, Panzeri S (2008) Phase-offiring coding of natural visual stimuli in primary visual cortex. Curr Biol 18:375–380.
5. Montemurro MA, Senatore R, Panzeri S (2007) Tight data-robust bounds to mutual
information combining shuffling and model selection techniques. Neural Comput 19:
2913–2957.
6. Magri C, Whittingstall K, Singh V, Logothetis NK, Panzeri S (2009) A toolbox for the
fast information analysis of multiple-site LFP, EEG and spike train recordings. BMC
Neurosci 10:81.
7. Nirenberg S, Latham PE (2003) Decoding neuronal spike trains: How important are
correlations? Proc Natl Acad Sci USA 100:7348–7353.
8. Pola G, Thiele A, Hoffmann KP, Panzeri S (2003) An exact method to quantify the
information transmitted by different mechanisms of correlational coding. Network
14:35–60.
9. Latham PE, Nirenberg S (2005) Synergy, redundancy, and independence in population
codes, revisited. J Neurosci 25:5195–5206.
10. Nelken I, Chechik G (2007) Information theory in auditory research. Hear Res 229:
94–105.
11. Quian Quiroga R, Panzeri S (2009) Extracting information from neuronal populations:
Information theory and decoding approaches. Nat Rev Neurosci 10:173–185.
12. Rieke F, Warland D, de Ruyter van Steveninck RR, Bialek W (1999) Spikes - Exploring
the Neural Code (MIT Press, Cambridge, MA).
13. Cover TM, Thomas JA (1991) Elements of Information Theory (Wiley-Interscience, London).
14. Panzeri S, Treves A, Schultz S, Rolls ET (1999) On decoding the responses of
a population of neurons from short time windows. Neural Comput 11:1553–1577.
15. Desbordes G, et al. (2008) Timing precision in population coding of natural scenes in
the early visual system. PLoS Biol 6:e324.
IðS; DÞ ¼ ∑ PðsÞQðdjsÞlog2
d; s
QðdjsÞ
:
QðdÞ
Kayser et al. www.pnas.org/cgi/content/short/1012656107
[S2]
2 of 4
A
Information [bit]
0.5
0.4
0.3
0.2
2
4 6 12
Effective precision [ms]
0
100
200
300
400
500
600
700
800
900
-100
1000
0
100
Time lag [ms]
Time [ms]
B
Information [bit]
0.2
0.1
0
2
4 6 12
Effective precision [ms]
0
100
200
300
400
500
600
700
800
900
1000
-100
0
100
Time lag [ms]
Time [ms]
Fig. S1. Responses of example neurons to random chords (A) and natural sounds (B). (Left) Spike times on individual repeats of the same stimulus together
with the trial-averaged response (upper trace). (Right) The stimulus information obtained at different effective temporal precisions (Upper) and the response
autocorrelation (Lower).
Kayser et al. www.pnas.org/cgi/content/short/1012656107
3 of 4
A
B
Effective precision 2ms, no shuffling
1
0
1
0
1
1
0
1
0
1
1
1
Effective precision 6ms, shuffling N=3 bins
0
0
1
1
0
1
Effective precision 12ms, shuffling N=6 bins
0
0
0
1
1
1
Sections from sound sequence
Stimulus 1
Effective precision 4ms, shuffling N=2 bins
Stimulus 2
.
.
.
.
...
T1=n*Dt
T2
T3
Fig. S2. Details of the temporal shuffling procedure and the stimulus definition. (A) Temporal shuffling procedure to reduce the effective response precision.
Spikes were shuffled (independently across trials and stimuli) across N neighboring time bins. In particular, we shuffled spikes in neighboring bins (doubling the
effective resolution, n = 2), in triplets (increasing resolution by a factor of three, n = 3) or all six spikes in the pattern (increasing resolution by a factor of six, n =
6). Boxes correspond to time bins (here, Δt = 2 ms). (B) Stimulus definition for information calculation. Stimuli were eight sections (black traces) randomly
chosen from a long sound sequence (gray trace). For the quantification of spike patterns, each section was divided into nonoverlapping time windows Ti
consisting of n = 6 time bins (Δt = 1, 2, 4, or 8 ms). Mutual information was calculated between responses (spike patterns), and these eight stimuli separately for
each time window Ti. Information values were averaged over all subsequent time windows to provide the information conveyed by the responses for discriminating the eight different sound sections. To ensure the independence of our results from the eight stimulus sections used to calculate the information
values, we repeated the same procedure by using 50 sets of randomly selected sections and averaged the information values.
Normalized values
100
80
60
40
Direct information estimate
Information from linear decoder
20
2
4
6
12
Effective precision [ms]
24
48
Fig. S3. Comparison of direct information estimates and those derived from stimulus decoding. The figure displays the normalized information values derived
from direct estimates of the stimulus information (black, same as in Fig. 1E) and from the confusion matrix of a linear decoder (red). It demonstrates that in this
situation, a linear decoding method reveals a smaller drop in stimulus information with progressively coarser effective temporal precision than the direct
method. For each neuron, information was normalized (100%) to the value at 1-ms precision. Boxplots indicate the median and 25th–75th percentile across
neurons.
Kayser et al. www.pnas.org/cgi/content/short/1012656107
4 of 4