The Temporal and Spatial Allocation of Motor Preparation during a

Articles in PresS. J Neurophysiol (July 30, 2008). doi:10.1152/jn.90703.2008
The Temporal and Spatial Allocation of Motor Preparation during a
Mixed-Strategy Game
Areh Mikulić and Michael C. Dorris
Department of Physiology, Centre for Neuroscience Studies and Canadian Institutes of
Health Research Group in Sensory-Motor Systems, Queen‟s University,
Kingston, ON K7L 3N6
Canada
Running Head: Motor Preparation Preceding Strategic Responses
Corresponding Author: Michael Dorris, Department of Physiology, Queen‟s University,
Botterell Hall, Room 440, Kingston ON, K7L 3N6
[email protected]
1
Copyright © 2008 by the American Physiological Society.
ABSTRACT
Adopting a mixed response strategy in competitive situations can prevent opponents from
exploiting predictable play. What drives stochastic action selection is unclear given that
choice patterns suggest that, on average, players are indifferent to available options
during mixed-strategy equilibria. To gain insight into this stochastic selection process we
examined how motor preparation was allocated during a mixed-strategy game. If
selection processes on each trial reflect a global indifference between options then there
should be no bias in motor preparation („unbiased preparation hypothesis‟). If, however,
differences exist in the desirability of options on each trial then motor preparation should
be biased towards the preferred option („biased preparation hypothesis‟). We tested
between these alternatives by examining how saccade preparation was allocated as
human subjects competed against an adaptive computer opponent in an oculomotor
version of the game „matching pennies‟. Subjects were free to choose between two visual
targets using a saccadic eye movement. Saccade preparation was probed by occasionally
flashing a visual distractor at a range of times preceding target presentation. The
probability that a distractor would evoke a saccade error and, when it failed to do so, the
probability of choosing each of the subsequent targets quantified the temporal and spatial
evolution of saccade preparation, respectively. Our results show that saccade preparation
became increasingly biased as the time of target presentation approached. Specifically,
the spatial locus to which saccade preparation was directed varied from trial to trial and
its time course depended on task timing.
Keywords: saccade, Nash equilibrium, game theory, decision-making, neuroeconomics
2
INTRODUCTION
Neuroscientists have largely focused on the visuosaccadic system to examine the
decision-making processes that precede motor actions (Glimcher, 2003;Schall,
2001;Schultz, 2006). When a saccade is initiated and where it is directed can be modeled
by determining the region that first surpasses a threshold level of activity on
topographically organized maps of potential saccade goals (Trappenberg et al.,
2001;Usher and McClelland, 2001;Wilimzig et al., 2006). This selection process has been
examined predominantly under conditions in which deterministic strategies were optimal;
that is, either sensory cues or differences in expected payoffs indicated the „correct‟
choice for maximizing the intake of reward. Under these conditions, the quality of
sensory instruction influences the rate at which activity rises towards threshold (Hanes
and Schall, 1996;Shadlen and Newsome, 2001) whereas differences in expected payoffs
predisposes baseline activation in favor of one of the potential actions (Carpenter and
Williams, 1995;Dorris and Munoz, 1998;Ikeda and Hikosaka, 2003).
In competitive situations, however, there often is no single correct action and
instead a mixed-strategy is required to prevent opponents from exploiting predictable
play (Fundenberg and Tirole, 1991). During repeated play of „rock-paper-scissors‟, for
example, dynamic interactions between the two players drives the game towards a mixedstrategy equilibrium. For this game, the predicted equilibrium strategy is for each player
to choose the available actions in equal proportions and stochastically from trial to trial
(Nash, 1950). Such a response pattern suggests that players perceive the relative
desirability of each action, on average, as equal. Equal desirability is further inferred
from the fact that there is no incentive for varying from this equilibrium strategy and that
3
if a player perceived one action as more desirable they presumably would choose that
action exclusively. Debate has arisen over what „drives‟ the selection process on
individual trials if, overall, subjects are indifferent between the available actions and
there are no instructive sensory cues in which to guide behavior (Aumann,
1985;Rubinstein, 1991). Understanding the neural processes leading to stochastic action
selection has broad significance because a wide range of inter-personal, geo-political,
pursuit-evasion, evolutionary and competitive sports relationships require such mixedstrategies (Driver and Humphries, 1988;Maynard Smith, 1982;Miller, 1997;PalaciosHeurta, 2003;Shinar et al., 1994;Smith, 1982;Walker and Wooders, 2001).
Although optimal strategies have been outlined in game theory (Fundenberg and
Tirole, 1991;Nash, 1950) and behavioral economists have introduced descriptive models
that approximate human choice patterns (Camerer et al., 2002;Erev and Roth, 1998), only
recently have the neural processes underlying mixed-strategy decision-making been
examined. These neurophysiology experiments have focused on evaluative processes
involved in assessing the outcome of past actions and rewards and in representing the
desirability of choice stimuli (Barraclough et al., 2004;Cohen and Ranganath,
2007;Dorris and Glimcher, 2004;Seo and Lee, 2007). Little is known, however, about the
processes immediately preceding action selection during mixed-strategy tasks. The
current experiments use a behavioral measure of motor preparation to gain insight into
the spatial and temporal allocation of mixed-strategy action selection.
We put forth two possible processes, based on previous economic and
neurophysiology considerations, which could lead to mixed-strategy action selection.
First, the unbiased preparation hypothesis posits that selection processes associated with
4
competing actions are equivalent and this leads to unbiased motor preparation. If, on
average, actions are equally desirable over many trials of a mixed-strategy game then it is
plausible that actions are equally desirable on individual trials as well. In support of this,
no bias is observed in the activity related to the selection and preparation of upcoming
actions when the value of those actions are equivalent during the period of uncertainty
preceding instructive sensory cues (Cisek and Kalaska, 2005;Dorris and Munoz,
1998;Platt and Glimcher, 1999).
The second biased preparation hypothesis stems from an alternative theoretical
perspective. Although actions may be equally desirable on average during mixed-strategy
equilibria, slight differences may exist in their desirability from trial to trial (Harsanyi,
1974). Neurophysiological evidence in support of this hypothesis comes from free-choice
experiments in which neuronal activities associated with choice alternatives are
predictive of upcoming actions (Coe et al., 2002;Dorris and Glimcher, 2004;Sugrue et al.,
2004). Moreover, under conditions of equal desirability, motor preparation signals are
influenced by the previous sequence of events (Dorris et al., 2000). These small, initial
trial-by-trial differences in selection processes could evolve to bias the allocation of
motor preparation in advance of choice stimuli.
To test between these two possibilities, humans played an oculomotor version of
the mixed-strategy game „matching-pennies‟ using saccadic eye movements to indicate
their choices. The otherwise covert process of saccade preparation was probed by the
ability of visual distractors presented in advance of the choice targets to trigger erroneous
saccades known as oculomotor captures (Theeuwes et al., 1998). The use of oculomotor
captures is similar in its logic to how electrical micro-stimulation of the visuosaccadic
5
circuitry has been used to examine the ongoing formation of perceptual-based decisions
(Gold and Shadlen, 2000;Gold and Shadlen, 2003). In those experiments, the deviation in
the endpoints of electrically-evoked saccades provided an instantaneous read-out of the
degree to which developing perceptual decisions shaped activity patterns in brain regions
responsible for producing the appropriate motor response. Here, the type of saccadic
response following non-invasive visual distractors provided the instantaneous read-out of
the degree to which ongoing selection processes shape saccade preparation. Specifically,
the probability that a distractor evokes an oculomotor capture and, when it fails to do so,
the probability of choosing each of the subsequent targets are indicative of the level of
saccade preparation at the time and location at which the distractor is presented (Dorris et
al., 2007;Milstein and Dorris, 2007). Unequal allocation of either oculomotor captures or
subsequent target choices would provide support for the biased preparation hypothesis.
Equal allocation of these behavioral measures would provide support for the unbiased
preparation hypothesis.
Importantly, we are not trying to make the claim that mixed-strategy actions are
selected exclusively by the brain‟s motor structures. Although oculomotor captures
measure underlying saccade preparation most directly, other cognitive processes, such as
those involved in decision-making on a higher level of abstraction and in visuospatial
attention, are also likely important for this selection process. The use of oculomotor
captures takes advantage of the fact that potentially diverse cognitive processes involved
in this selection process must ultimately be consolidated in order to produce a single
action. Therefore, our claim is that oculomotor captures provide insight into how the
6
summative selection process shapes the spatial and temporal allocation of saccade
preparation.
7
METHODS
Experiments were conducted using 10 human subjects (four female and six male,
22 to 34 years old, including the two authors) using eye-tracking procedures described
previously (Milstein and Dorris, 2007). All procedures were approved by the Queen's
University Human Research Ethics Board.
Subjects were seated in front of a computer monitor with their heads stabilized on
a chin rest positioned 59 cm from the centre of a 17‟‟ CRT monitor (refresh rate 100Hz)
that spanned 32° of their central visual field. Left eye position was recorded at 250 Hz
with resolution of 0.1˚ using an infra-red eye tracker system (Eyelink II, SR Research).
Real-time data acquisition software (Gramalkn, Ryklin Software) was used for stimuli
presentation and data collection. Data analysis was performed offline using MATLAB,
version 7.04 (Mathworks Inc) on a Pentium 4 personal computer.
Each task consisted of randomly interleaved standard trials (Fig. 1A – 75%) and
distractor trials (Fig. 1B – 25%). Subjects were required to maintain central gaze fixation
throughout the 800 ms presentation of the fixation point and after its removal during a
fixed 600 ms warning period. Subjects indicated their choice by directing a saccade to
one of two targets presented simultaneously 8° right and left of center. The fixed warning
period and known target locations facilitated advanced saccade preparation (Dorris and
Munoz, 1998). Following choice selection, a red box appeared around one of the targets
for 400 ms indicating the computer opponent‟s choice (see below). Each trial ended with
a central display of the monetary payoff that lasted 1000 ms followed by a 1000 ms intertrial interval.
8
The level of saccade preparation allocated towards the two targets was probed at
specific times during the warning period of distractor trials (Fig. 1B). A single green
stimulus was briefly flashed for 70 ms with equal probability at each of the two
upcoming target locations. The distractor was equally likely to be presented at 100, 200,
300, 400, or 500 ms into the 600 ms warning period. Distractors occasionally triggered
erroneous saccades referred to as oculomotor captures (Theeuwes et al., 1998) despite
instructions not to look to the distractor and reward being withheld for doing so.
Mixed-Strategy Task. Subjects competed in an oculomotor version of the game
„matching pennies‟ against a dynamic computer opponent (Fig. 1). This opponent
performed statistical analyses on the subject‟s history of previous choices and payoffs to
uncover systematic biases in their choice strategy (see algorithm 2 from (Lee et al., 2004)
for specific details). If both players chose the same target, the subject won $0.04;
otherwise no monetary reward was received. Importantly, subjects were fully informed of
the rules of the game and that they were playing a strategic game against a dynamic,
competitive computer opponent.
To examine how task timing influenced strategic responses, all subjects also
performed a version of the mixed-strategy task in which the fixed warning period was
extended from 600 to 1200 ms. Under this condition, distractors were equally likely to be
presented at 100, 300, 500, 800 and 1100ms into the 1200 ms warning period.
Pure-Strategy Task. The pure-strategy task (Fig. 1) was identical to the mixed-strategy
task except the computer opponent always chose the same target for an entire block of
trials and the payoff was reduced from $0.04 to $0.02 per trial. The rewarded direction
alternated between the left and right targets across blocks of trials.
9
Instructed Task. The instructed task was also identical to the mixed-strategy task except
that a single saccadic target was presented on each trial with equal probability to the left
or right. Subjects received $0.02 for successfully acquiring the target with a saccade.
Data Analysis. Subjects came in on four separate days and completed four, 220 trial
blocks of each task. The first day was a practice session and this data was not included in
the final analysis. Only seven of the ten subjects completed the pure-strategy task. The
first 20 trials of each block were discarded from analysis to allow subjects time to adjust
to the new task conditions.
A correct saccade was defined as the first saccade initiated between 120 ms and
350 ms after target onset that landed within 3˚ of the target. An oculomotor capture was
defined as a saccade initiated between 70 ms to 220 ms after the distractor onset that
landed within 5° of the distractor location. The temporal and spatial constraints were
relaxed slightly for distractor-triggered oculomotor captures because these saccades are
known to be of shorter latency and hypometric relative to slower correct saccades
(Milstein and Dorris, 2007;Theeuwes et al., 1999).
The degree of response stochasticity was quantified using entropy (Lee et al.,
2004). Entropy describes the measure of uncertainty about the state of the observed
system. The entropy, H, can be calculated as follows:
pk log 2 pk ,
H
k
where pk is the probability of the system being in the state k. Our blocks of trials were of
sufficient length to look at patterns of three consecutive trials. There are eight possible
left (L) right (R) sequences (RRR, RRL, RLR, RLL, LRR, LRL, LLR, and LLL).
10
8
Therefore, the specific calculation of entropy used here is H
p k log 2 p k , whereby
k 1
pk is the probability of finding a sequence k in the array of subject‟s outcomes. If behavior
is truly stochastic, each of these sequences should be represented with equal probability
corresponding to maximum entropy of 3 bits.
We subjected data from distractor trials to a three-way repeated measures
ANOVA with task type, distractor timing and the target response direction as factors. A
chi-square test was used to test whether there were differences in proportions of
responses to distractors presented at different times during the warning period. To test
whether the proportion of responses increased or decreased we compared the first (i.e.,
100 ms) and last (i.e., 500 ms) time of distractor presentation using 95% confidence
intervals for populations of proportions (Utts and Heckard, 2007).
11
RESULTS
Temporal Preparation of Mixed-Strategy Responses: Oculomotor Captures
The first results section analyzes the pattern of distractor-directed oculomotor
captures to determine how motor preparation was temporally allocated in advance of
strategic response. First, to establish that distractors were effective probes of underlying
saccade preparation, the pattern of oculomotor captures was examined during the purestrategy task in which the planned saccade was known with near certainty. During a
representative block, subjects chose the rewarded target exclusively during standard trials
(Fig. 2A) and, importantly, only distractors presented at the rewarded location triggered
oculomotor captures (Fig. 2B). In total, subjects chose the rewarded target 99.5% of the
time during standard trials and oculomotor captures were directed almost exclusively
towards distractors flashed at rewarded rather than unrewarded locations (Fig. 3A; χ2 test,
p<0.001). This conjunction between the direction of saccadic targets and oculomotor
captures suggests that distractors effectively probed the spatial allocation of saccade
preparation.
Ideally, choices should be allocated equally and stochastically between the two
targets during the instructed and mixed-strategy standard trials (Fig. 2C and E). This
behavioral pattern is expected during instructed trials because the computer randomly
selected either target location. This behavioural pattern should be approached during
mixed-strategy trials because the dynamic computer opponent encouraged random choice
selection by exploiting predictability in the subjects‟ behavior. We used the term
“approached” because it is well documented that humans often fall short of achieving the
exact proportions and stochasticity requirements predicted by the Nash equilibrium
12
solution (Erev and Roth, 1998). Overall, the percentage of saccades directed to each
target did not differ from 50% during these two task (instructed: 49.6±0.9% left, t-test,
p>0.05; mixed-strategy: 48.9±0.8% left, t-test, p>0.05) nor did they differ between these
two tasks (t-test, p=0.55). To measure stochasticity, we calculated entropy in patterns of
choices (see METHODS). Instructed and mixed-strategy tasks both approached maximal
entropy of 3 bits of information (instructed: 2.95±0.007% bits; mixed-strategy:
2.94±0.009% bits; t-test, p>0.05). Consequently, any differences that exist between these
two tasks during the following analysis of distractor trials cannot be accounted for by any
differences in overall response patterns but presumably reflect the instructed versus
voluntary nature of these two tasks.
During the instructed task, oculomotor captures were directed to both left and
right distractors which is consistent with the uncertainty of the upcoming target location
(Fig. 2D). Overall, the pattern of oculomotor captures did not differ towards left or right
distractors (not shown; p>0.05). In addition, the overall percentage of oculomotor
captures increased as distractors were presented later in the warning period (Fig. 3C;
ANOVA, p<0.001).
Having established the pattern of oculomotor captures when the upcoming
response was nearly certain (the pure-strategy task) and uncertain (the instructed task),
we now examine the pattern of oculomotor captures during the mixed-strategy task. Like
the two other tasks, the percentage of oculomotor captures increased when distractors
were presented later in the warning period (Fig. 3E; p<0.001). Moreover, the pattern of
oculomotor captures did not differ towards left or right distractors (not shown; p>0.05).
Together, this pattern of oculomotor captures suggests that during the mixed-strategy task
13
saccade preparation begins in advance of target presentation and increases as the time of
target presentation approaches.
Increasing the warning period delayed the time course of saccade preparation
(Fig. 4). Initially (i.e., ≤300ms distractor presentation), the percentages of oculomotor
captures did not differ between short and long warning period blocks but these
percentages diverged when distractors were presented later in the warning period (i.e.,
500ms distractor presentation; p<0.01). By the end of each of the respective warning
periods, the percentage of oculomotor captures reached similar levels.
Lastly, we examined whether the time at which the distractor was presented
during the warning period influenced oculomotor capture reaction times. Higher
preparatory activity later in the warning period could not only increase the likelihood that
visual transients associated with distractor presentation would surpass saccadic threshold
but it could also reduce the time required for these visual transients to surpass saccadic
threshold. We performed a two-way repeated measures ANOVA on the saccadic reaction
times associated with oculomotor captures and found no significant effect of the factors
task type (p=0.47) or distractor timing (p=0.73). This is consistent with other studies
suggesting that transient visual bursts of activity in visuosaccadic structures are of such
rapid onset and high frequency that if they trigger a saccade these saccades occur at very
stereotyped latencies (Dorris et al., 1997;Dorris et al., 2007;Godijn and Theeuwes, 2002).
Spatial Preparation of Mixed-Strategy Responses: Target-Directed Saccades
The analysis of oculomotor captures revealed that during the mixed-strategy task
saccade preparation increased as the time of target presentation approached. It is
important to recognize that the percentage of oculomotor captures indicates how saccade
14
preparation is allocated probabilistically across blocks of trials rather than on a trial-bytrial basis. Equal proportions of left and right oculomotor captures during the mixedstrategy task could arise if saccades were prepared equally in both directions on every
trial (i.e., unbiased preparation hypothesis) or biased in a particular direction on every
trial (i.e., biased preparation hypothesis). For the latter to be the case, however, the
direction of the planned saccade must vary stochastically from trial to trial. An analysis
of target choices during distractors trials follows, whose purpose is to distinguish whether
saccade preparation is spatially allocated in a biased or unbiased manner.
This section analyzes the choice of targets on those distractor trials in which the
distractor failed to trigger a saccade. The following logic is critical for establishing this
analysis: saccades planned stochastically either towards the left or right, coupled with
distractors presented equally likely at either location, results in distractors that are 50%
likely to be presented at the same location to which a saccade is being prepared on any
given trial. Therefore, target-directed saccades that occur during distractor trials will be
classified as (Fig. 1B): 1) those directed towards the target on the same side as the
distractor (Tsame), and 2) those directed towards the target on the side opposite the
distractor (Topp). The assumption underlying this analysis is that the selection process
continues to culminate towards choosing the target that would have occurred had the
distractor not been presented.
To test this assumption, target-directed saccades were first analyzed during the
pure-strategy task. During the representative block, the subject chose the rewarded target
regardless of where the distractor was presented (Fig. 2B – black traces) and this
observation held for over 97% of all trials across subjects. This finding suggests that
15
ongoing saccade preparation processes continued largely undisturbed on those trials in
which the distractor failed to trigger an oculomotor capture.
Target choices during the pure-strategy task were then analyzed relative to
distractor location rather than reward location (Fig. 3B). The percentage of Topp did not
vary with the time of distractor presentation (χ2 test, p>0.05) remaining near 50%
throughout the warning period (Fig. 3B – solid line). This pattern arose because
oculomotor captures were successfully withheld during the 50% of trials in which the
distractor was presented on the unrewarded side (Fig. 3A – unfilled squares) and saccades
were subsequently directed towards the target on the opposite, rewarded side. The
percentage of Tsame, however, decreased as the time of target presentation approached
(Fig. 3B – dashed line; χ2 test, p<0.001). This pattern arose because as saccade
preparation towards the rewarded target increased during the warning period, distractors
presented on that same side were more likely to trigger an oculomotor capture.
Importantly, conducting this target-directed analysis under conditions with a known
saccade goal establishes how Topp and Tsame differ when saccade preparation is biased
towards one location.
There was no a priori reason that saccade preparation should be spatially biased
during the instructed task because the target was equally likely to be presented in either
direction. Consistent with this, the percentages of both Tsame and Topp decreased as the
time of target presentation approached (Fig. 3D - Tsame, p<0.001; Topp, p<0.05) and these
curves did not differ significantly from each other (p>0.05).
Target choices following distractor presentation during the mixed-strategy task
more closely resembled that of the pure-strategy task than the instructed task; the
16
percentage of Tsame responses decreased (χ2 test, p<0.001) while the percentage of Topp
responses did not vary with the time of distractor presentation (χ2 test, p>0.05) (Fig. 3F).
Importantly, the Topp and Tsame curves significantly differed at the end of the warning
period suggesting a bias existed in favor of one response just before the target
presentation in the mixed-strategy task (Fig. 3F; p<0.001).
To describe the overall effects, we subjected data from distractor trials to a threeway repeated measures ANOVA with task type, distractor timing and target response
direction (Tsame vs. Topp) as factors. There was an effect of the task type (p<0.0001),
distractor timing (p<0.0001) and target response direction (p<0.0001) on the proportion
of target-directed saccades following distractor presentation. There were significant
interactions between task type and distractor timing (p<0.05), task type and target
response direction (p<0.01), and distractor timing and target response direction (p<0.05).
Lastly, to differentiate saccade preparation processes underlying the instructed and
mixed-strategy tasks we conducted another three-way repeated measures ANOVA that
included only these two task types. We found a significant interaction between task type
and target response direction (p<0.005). This final statistical interaction is of particular
importance because it highlights the differences in saccade preparation processes that
exist between the instructed and mixed-strategy tasks.
DISCUSSION
Our results support a biased preparation mechanism preceding mixed-strategy
actions. First, we found that strategic saccade preparation accumulated in advance of the
presentation of target stimuli. This temporal property of the preparatory process was
indexed by an increase in oculomotor captures as the time of target presentation
17
approached (Fig. 3A,C,E). This accumulation depended on the specific timing of the task
as evidenced by a delay in the time course of oculomotor captures that accompanied
extending the fixed warning period (Fig. 4). Second, we found that strategic saccade
preparation became biased to one target over another in advance of the target stimuli.
This spatial property of saccade preparation was evidenced by the pattern of targetdirected saccades that emerged when analyzed relative to the distractor location (Fig.
3B,D,F).
Temporal and Spatial Allocation of Saccade Preparation Preceding Mixed-Strategy
Responses
This oculomotor version of „matching-pennies‟ mimics many features of the more
familiar manual version of the game (a.k.a., „odds-evens‟, „matching‟, or „choosing‟). In
the manual version, players indicate their choices by displaying either one or two fingers.
Before game play, players must determine who will fulfill the role of „odds‟ and who
„evens‟. „Odds‟ wins if the players display different numbers of fingers (i.e., one/two or
two/one) and „evens‟ wins if the players display the same number of fingers (i.e., one/one
or two/two). To facilitate simultaneous presentation, players must also agree upon the
number of „primes‟ or fist-pumps that will be required before displaying their manual
choices. The fixed warning period aids subjects in initiating their saccade choices within
a narrow temporal window (Munoz et al., 2000) (i.e., 120-350 ms after target
presentation) and thus performs a role analogous to priming. Oculomotor captures
increased as distractors were presented later in the warning period (Fig. 3) and this
increase was delayed in time when the duration of the warning period was extended (Fig.
4). Both of these observations suggest that motor preparation peaked coinciding with the
18
predictable timing of the target stimuli. This makes sense from a strategic standpoint;
maintaining motor preparation at a high sustained level could lead to premature responses
thus providing one‟s opponent with a strategic advantage whereas maintaining motor
preparation at a low sustained level could result in delayed responses and disqualification
from that round of play. Recently described behavioral and neural effects associated with
predictably timed responses (Ding and Hikosaka, 2007;Janssen and Shadlen,
2005;Maimon and Assad, 2006) likely played a role in the timing of the strategic
responses described here.
The predictable target locations associated with this oculomotor „matching
pennies‟ game also enabled us to examine the spatial allocation of motor preparation.
This spatial analysis may not have been feasible with complexly encoded hand
configurations typical of the manual version of the game. Moreover, probing with visual
distractors provided insight into the spatial allocation of saccade preparation that would
otherwise be difficult given the stochastic nature of choice selection during mixedstrategy games. Specifically, the analysis of target choices following successfully
withheld oculomotor captures rests upon the underlying assumption that the same target
would have been selected had the distractor not been presented. That subjects
overwhelmingly continued to choose the rewarded target under the predictable purestrategy task partly bears this assumption out.
Overall, the pattern of target choices following distractors (Fig. 3) shows that
during the mixed-strategy task, preparatory processes became biased in favor of one
response before the presentation of target stimuli. This locus of saccade preparation
varied stochastically from trial to trial. The extent to which preparatory processes were
19
spatially biased was not as pronounced preceding strategic responses as when the
rewarded saccade was known with full foreknowledge (i.e., compare the separation in the
Topp and Tsame curves for the mixed- (Fig. 3F) and pure-strategy (Figs. 3B) tasks). The
differing time course and extent of saccade preparation bias may reflect that ongoing
deliberation takes place during the mixed-strategy task compared to early commitment
during the pure-strategy task.
The current results are important because they suggest that accumulation-towardsthreshold models, which have been so successful in describing simple perceptual
decision-making, may also, be applicable to the formation of strategic decisions. The rate
of activation accumulation preceding perceptual decisions is largely driven by the quality
of sensory evidence (Hanes and Schall, 1996;Shadlen and Newsome, 2001) whereas
baseline activation is largely shaped by prior information such as the probability or
magnitude of reward (Dorris and Munoz, 1998;Ikeda and Hikosaka, 2003). During the
mixed-strategy task, however, both of these selection factors are lacking. First, there is no
sensory evidence indicating the correct choice; in fact, subjects are situated in the dark
during the warning period. Second, subjects are presumably indifferent on average
between available responses because their overall payoffs are equal once the mixedstrategy equilibrium is established (Nash, 1950). We speculate on two mechanisms that
could lead to selection biases proceeding stochastic responses. One possibility is that
noise in the accumulation signal randomly biases the selection process in favor of one of
the options which, coupled with local recurrent excitation and distal competitive
inhibition, further strengthens this selection path over time (Dorris et al., 2007;Wong and
Wang, 2006). Another possibility is that the recent history of responses and rewards
20
subtly biases subsequent selection processes. In this case, a player would never actually
be indifferent to the available responses because of small perturbations in subjective
desirability between the options from trial to trial (Harsanyi, 1974). It was not feasible to
examine this possibility in the current data set because interleaved distractor trials
prohibitively disrupted the natural sequence of events. However, previous studies
examining the choice patterns during mixed-strategy games suggests that the previous
history of choices and rewards may indeed have a subtle effect on selecting upcoming
choices (Corrado et al., 2005;Dorris et al., 2000;Lau and Glimcher, 2005;Lee et al.,
2004;Thevarajah and Dorris, 2007). Of course, following a “win-stay/lose shift” strategy
too strictly would lead to predictable behavior that could be exploited by one‟s opponent.
Outstanding Issues
Although our results are consistent with the gradual accumulation of preparatory
activation towards one saccade threshold over another, they could also be achieved if the
level of activation increased abruptly on each trial (i.e., a “Eureka!” transition). The
pattern of increasing oculomotor captures could arise from this latter mechanism if such
step transitions occurred more frequently as the time of the sensory trigger approached.
Oculomotor captures only describe the probabilistic time course of saccade preparation
across blocks of trials. Direct neuronal recording within visuosaccadic structures will be
required to distinguish whether there is a gradual or step transition in preparation signals
during single trials.
Finally, it is worth reiterating that oculomotor captures more accurately reflect the
temporal and spatial allocation of saccade preparation rather than decision or attention
processes. Under some conditions, the formation of perceptual decisions appear to share a
21
common neural substrate with preparatory signals in brain regions involved in generating
the appropriate motor action (Gold and Shadlen, 2000;Gold and Shadlen, 2003;Juan et
al., 2004). Unlike perceptual decisions which require the decoding of immediate sensory
evidence, strategic decisions can be made well in advance of stimulus presentation and
even before the beginning of a trial. Although the decision-making process can occur
well in advance, to prevent premature or delayed strategic responses, it is still prudent for
motor preparation to reflect the specific timing of the task. Nevertheless, it would be
illogical for motor preparation to precede the decision process. Therefore, we conclude
that the decision process also occurs in advance of the trigger stimuli during mixedstrategy tasks, either before, or simultaneous with, motor preparatory processes.
22
Acknowledgements: This work was supported by a career development award from the
Human Frontier Science Program (HFSP) and a group grant from the Canadian Institutes
of Health Research (CIHR) and awarded to MCD. We thank J. Green, S. Hickman, F.
Paquin and R. Pengelly for technical assistance. J. Turner provided programming
expertise and E. Ryklin customized the data acquisition program. J. Chan, J. Kan, D.M.
Milstein, D. Theverajah, and R.Webb provided constructive feedback regarding the
manuscript.
23
FIGURE LEGENDS
Figure 1. All behavioral paradigms consisted of 75% standard trials (A) and 25%
distractor trials (B). Each panel represents successive computer screens in time. Arrows
indicate the subject‟s possible choices and the box indicates the computer‟s choice. These
schematics correspond to the pure-strategy and mixed-strategy tasks; for the instructed
task (not shown) only one target was presented per trial. Tsame denotes choice of target on
same side as distractor, Topp, choice of target on opposite side of distractor.
Figure 2. Eye position traces from representative blocks of the pure-strategy (A and B),
instructed (C and D) and mixed-strategy (E and F) tasks from one subject. Each block of
trials is segregated into standard trials (top panels) and distractor trials (bottom panels).
Black traces correspond to target-directed correct saccades and gray traces to distractordirected oculomotor captures. Successive black vertical dashed lines represent the
beginning and the end of the warning period, respectively. Vertical gray lines represent
the 5 possible times of distractor presentation.
Figure 3. Influence of distractor timing on oculomotor captures (top panels) and correct
saccades (bottom panels) across all subjects. Error bars represent 95% confidence
intervals. Asterisks denote times in which the percentages of Topp and Tsame significantly
differed (χ2 test, p<0.01). Note that the Topp, Tsame and oculomotor captures sum to 100%
for each time epoch in each task.
Figure 4. Influence of task timing on the percentage of oculomotor captures during the
mixed-strategy task. The solid and dashed lines represent the pattern of oculomotor
captures when the warning period was fixed at 600 and 1200 ms, respectively. Distractors
were equally likely to be presented at 100, 300, 500, 800 and 1100ms during the 1200ms
24
warning period block. Error bars represent 95% confidence intervals. The asterisks
denote a statistically significant difference between percentages of responses at the same
time points (χ2 test, p<0.01).
25
Reference List
Aumann RJ (1985) What is Game Theory Trying to Accomplish? In: Frontiers of
Economics (Arrow K, Hankapohja S, eds), pp 28-78. Oxford: Basil Blackwell.
Barraclough DJ, Conroy ML, Lee D (2004) Prefrontal cortex and decision making in a
mixed-strategy game. Nat Neurosci 7: 404-410.
Camerer CF, Ho TH, Chong JK (2002) Sophisticated experience-weighted attraction
learning and strategic teaching in repeated games. Journal of Economic Theory 104: 137188.
Carpenter RH, Williams ML (1995) Neural computation of log likelihood in control of
saccadic eye movements. Nature 377: 59-62.
Cisek P, Kalaska JF (2005) Neural correlates of reaching decisions in dorsal premotor
cortex: specification of multiple direction choices and final selection of action. Neuron
45: 801-814.
Coe B, Tomihara K, Matsuzawa M, Hikosaka O (2002) Visual and anticipatory bias in
three cortical eye fields of the monkey during an adaptive decision-making task. J
Neurosci 22: 5081-5090.
Cohen MX, Ranganath C (2007) Reinforcement learning signals predict future decisions.
J Neurosci 27: 371-378.
Corrado GS, Sugrue LP, Seung HS, Newsome WT (2005) Linear-Nonlinear-Poisson
models of primate choice dynamics. J Exp Anal Behav 84: 581-617.
Ding L, Hikosaka O (2007) Temporal development of asymmetric reward-induced bias in
macaques. J Neurophysiol 97: 57-61.
Dorris MC, Glimcher PW (2004) Activity in posterior parietal cortex is correlated with
the relative subjective desirability of action. Neuron 44: 365-378.
Dorris MC, Munoz DP (1998) Saccadic probability influences motor preparation signals
and time to saccadic initiation. J Neurosci 18: 7015-7026.
Dorris MC, Olivier E, Munoz DP (2007) Competetive Integration of Visual and
Preparatory Signals in the Superior Colliculus during Saccadic Programming. J Neurosci
27: 5053-5062.
Dorris MC, Pare M, Munoz DP (1997) Neuronal activity in monkey superior colliculus
related to the initiation of saccadic eye movements. J Neurosci 17: 8566-8579.
Dorris MC, Pare M, Munoz DP (2000) Immediate neural plasticity shapes motor
performance. J Neurosci 20: RC52.
26
Driver PM, Humphries DA (1988) Protean Behaviour: The Biology of Unpredictability.
Oxford University Press.
Erev I, Roth AE (1998) Predicting how people play games: Reinforcement learning in
experimental games with unique, mixed strategy equilibria. American Economic Review
88: 848-881.
Fundenberg D, Tirole J (1991) Game Theory. Cambridge: MIT Press.
Glimcher PW (2003) The neurobiology of visual-saccadic decision making. Annu Rev
Neurosci 26: 133-179.
Godijn R, Theeuwes J (2002) Programming of endogenous and exogenous saccades:
evidence for a competitive integration model. J Exp Psychol Hum Percept Perform 28:
1039-1054.
Gold JI, Shadlen MN (2000) Representation of a perceptual decision in developing
oculomotor commands. Nature 404: 390-394.
Gold JI, Shadlen MN (2003) The influence of behavioral context on the representation of
a perceptual decision in developing oculomotor commands. J Neurosci 23: 632-651.
Hanes DP, Schall JD (1996) Neural control of voluntary movement initiation. Science
274: 427-430.
Harsanyi JC (1974) Equilibrium-Point Interpretation of Stable Sets and A Proposed
Alternative Definition. Management Science Series A-Theory 20: 1472-1495.
Ikeda T, Hikosaka O (2003) Reward-dependent gain and bias of visual responses in
primate superior colliculus. Neuron 39: 693-700.
Janssen P, Shadlen MN (2005) A representation of the hazard rate of elapsed time in
macaque area LIP. Nat Neurosci 8: 234-241.
Juan CH, Shorter-Jacobi SM, Schall JD (2004) Dissociation of spatial attention and
saccade preparation. Proc Natl Acad Sci U S A 101: 15541-15544.
Lau B, Glimcher PW (2005) Dynamic response-by-response models of matching
behavior in rhesus monkeys. J Exp Anal Behav 84: 555-579.
Lee D, Conroy ML, McGreevy BP, Barraclough DJ (2004) Reinforcement learning and
decision making in monkeys during a competitive game. Brain Res Cogn Brain Res 22:
45-58.
Maimon G, Assad JA (2006) A cognitive signal for the proactive timing of action in
macaque LIP. Nat Neurosci 9: 948-955.
27
Maynard Smith J (1982) Evolution and the Theory of Games. Cambridge: Cambridge
University Press.
Miller GF (1997) Protean Primates: the evolution of adaptive unpredictability in
competition and courtship. In: Machiavellian Intelligence II: Extensions and Evaluations
(Whiten A BRW, ed), pp 312-340. Cambridge University Press.
Milstein DM, Dorris MC (2007) The influence of expected value on saccadic preparation.
J Neurosci 27: 4810-4818.
Munoz DP, Dorris MC, Pare M, Everling S (2000) On your mark, get set: brainstem
circuitry underlying saccadic initiation. Can J Physiol Pharmacol 78: 934-944.
Nash JF (1950) Equilibrium Points in N-Person Games. Proceedings of the National
Academy of Sciences of the United States of America 36: 48-49.
Palacios-Heurta I (2003) Professionals Play Minimax. Review of Economic Studies 70:
395-415.
Platt ML, Glimcher PW (1999) Neural correlates of decision variables in parietal cortex.
Nature 400: 233-238.
Rubinstein A (1991) Comments on the Interpretation of Game-Theory. Econometrica 59:
909-924.
Schall JD (2001) Neural basis of deciding, choosing and acting. Nat Rev Neurosci 2: 3342.
Schultz W (2006) Behavioral theories and the neurophysiology of reward. Annu Rev
Psychol 57: 87-115.
Seo H, Lee D (2007) Temporal filtering of reward signals in the dorsal anterior cingulate
cortex during a mixed-strategy game. J Neurosci 27: 8366-8377.
Shadlen MN, Newsome WT (2001) Neural basis of a perceptual decision in the parietal
cortex (area LIP) of the rhesus monkey. J Neurophysiol 86: 1916-1936.
Shinar J, Forte I, Kantor B (1994) Mixed Strategy Guidance: A New High-Performance
Missle Guidance Law. Journal of Guidance, Control and Dynamics 17: 129-135.
Smith JM (1982) Evolution and the Theory of Games. Cambridge University Press.
Sugrue LP, Corrado GS, Newsome WT (2004) Matching behavior and the representation
of value in the parietal cortex. Science 304: 1782-1787.
Theeuwes J, Kramer AF, Hahn S, Irwin DE (1998) Our eyes do not always go where we
want them to go: Capture of the eyes by new objects. Psychological Science 9: 379-385.
28
Theeuwes J, Kramer AF, Hahn S, Irwin DE, Zelinsky GJ (1999) Influence of attentional
capture on oculomotor control. J Exp Psychol Hum Percept Perform 25: 1595-1608.
Thevarajah, D. and Dorris, M. C. Influence of previous choices and rewards on superior
colliculus activity is modulated by behavioral context. Society for Neuroscience . 2007.
Ref Type: Abstract
Trappenberg TP, Dorris MC, Munoz DP, Klein RM (2001) A model of saccade initiation
based on the competitive integration of exogenous and endogenous signals in the superior
colliculus. J Cogn Neurosci 13: 256-271.
Usher M, McClelland JL (2001) The time course of perceptual choice: the leaky,
competing accumulator model. Psychol Rev 108: 550-592.
Utts JM, Heckard RF (2007) Mind on Statistics. Thompson.
Walker M, Wooders J (2001) Minmax Play at Wimbledon. The American Economic
Review 91: 1521-1538.
Wilimzig C, Schneider S, Schoner G (2006) The time course of saccadic decision
making: dynamic field theory. Neural Netw 19: 1059-1074.
Wong K, Wang X (2006) A Recurrent Network Mechanism of Time Integration in
Perceptual Decisions. J Neurosci 26: 1314-1328.
29
$
%
6WDQGDUG7ULDOV 'LVWUDFWRU7ULDOV
ORMS
'LVWUDFWRU
:DUQLQJ
3HULRG
PV
"
4SAME
"
4OPP
6XEMHFW&KRLFH
&RPSXWHU&KRLFH
$
&
3XUH6WUDWHJ\7DVN
(
,QVWUXFWHG7DVN
0L[HG6WUDWHJ\7DVN
+RUL]RQWDO(\H3RVLWLRQ'HJUHHV
%
'
)
7LPHIURP)L[DWLRQ3RLQW2IIVHWPV
!
$ISTRACTOR$IRECTED
/CULOMOTOR#APTURES
%
,QVWUXFWHG7DVN
"
$
&
4ARGET$IRECTED
#ORRECT3ACCADES
#
3XUH6WUDWHJ\7DVN
"OTH$IRECTIONS
2EWARDED$IRECTION
5NREWARDED$IRECTION
4OPP
4SAME
0L[HG6WUDWHJ\7DVN
4IMEFROM&IXATION0OINT/FFSETMS
$ISTRACTOR$IRECTED
/CULOMOTOR#APTURES
MS7ARNING0ERIOD
MS7ARNING0ERIOD
4IMEFROM&IXATION0OINT/FFSETMS