Articles in PresS. J Neurophysiol (July 30, 2008). doi:10.1152/jn.90703.2008 The Temporal and Spatial Allocation of Motor Preparation during a Mixed-Strategy Game Areh Mikulić and Michael C. Dorris Department of Physiology, Centre for Neuroscience Studies and Canadian Institutes of Health Research Group in Sensory-Motor Systems, Queen‟s University, Kingston, ON K7L 3N6 Canada Running Head: Motor Preparation Preceding Strategic Responses Corresponding Author: Michael Dorris, Department of Physiology, Queen‟s University, Botterell Hall, Room 440, Kingston ON, K7L 3N6 [email protected] 1 Copyright © 2008 by the American Physiological Society. ABSTRACT Adopting a mixed response strategy in competitive situations can prevent opponents from exploiting predictable play. What drives stochastic action selection is unclear given that choice patterns suggest that, on average, players are indifferent to available options during mixed-strategy equilibria. To gain insight into this stochastic selection process we examined how motor preparation was allocated during a mixed-strategy game. If selection processes on each trial reflect a global indifference between options then there should be no bias in motor preparation („unbiased preparation hypothesis‟). If, however, differences exist in the desirability of options on each trial then motor preparation should be biased towards the preferred option („biased preparation hypothesis‟). We tested between these alternatives by examining how saccade preparation was allocated as human subjects competed against an adaptive computer opponent in an oculomotor version of the game „matching pennies‟. Subjects were free to choose between two visual targets using a saccadic eye movement. Saccade preparation was probed by occasionally flashing a visual distractor at a range of times preceding target presentation. The probability that a distractor would evoke a saccade error and, when it failed to do so, the probability of choosing each of the subsequent targets quantified the temporal and spatial evolution of saccade preparation, respectively. Our results show that saccade preparation became increasingly biased as the time of target presentation approached. Specifically, the spatial locus to which saccade preparation was directed varied from trial to trial and its time course depended on task timing. Keywords: saccade, Nash equilibrium, game theory, decision-making, neuroeconomics 2 INTRODUCTION Neuroscientists have largely focused on the visuosaccadic system to examine the decision-making processes that precede motor actions (Glimcher, 2003;Schall, 2001;Schultz, 2006). When a saccade is initiated and where it is directed can be modeled by determining the region that first surpasses a threshold level of activity on topographically organized maps of potential saccade goals (Trappenberg et al., 2001;Usher and McClelland, 2001;Wilimzig et al., 2006). This selection process has been examined predominantly under conditions in which deterministic strategies were optimal; that is, either sensory cues or differences in expected payoffs indicated the „correct‟ choice for maximizing the intake of reward. Under these conditions, the quality of sensory instruction influences the rate at which activity rises towards threshold (Hanes and Schall, 1996;Shadlen and Newsome, 2001) whereas differences in expected payoffs predisposes baseline activation in favor of one of the potential actions (Carpenter and Williams, 1995;Dorris and Munoz, 1998;Ikeda and Hikosaka, 2003). In competitive situations, however, there often is no single correct action and instead a mixed-strategy is required to prevent opponents from exploiting predictable play (Fundenberg and Tirole, 1991). During repeated play of „rock-paper-scissors‟, for example, dynamic interactions between the two players drives the game towards a mixedstrategy equilibrium. For this game, the predicted equilibrium strategy is for each player to choose the available actions in equal proportions and stochastically from trial to trial (Nash, 1950). Such a response pattern suggests that players perceive the relative desirability of each action, on average, as equal. Equal desirability is further inferred from the fact that there is no incentive for varying from this equilibrium strategy and that 3 if a player perceived one action as more desirable they presumably would choose that action exclusively. Debate has arisen over what „drives‟ the selection process on individual trials if, overall, subjects are indifferent between the available actions and there are no instructive sensory cues in which to guide behavior (Aumann, 1985;Rubinstein, 1991). Understanding the neural processes leading to stochastic action selection has broad significance because a wide range of inter-personal, geo-political, pursuit-evasion, evolutionary and competitive sports relationships require such mixedstrategies (Driver and Humphries, 1988;Maynard Smith, 1982;Miller, 1997;PalaciosHeurta, 2003;Shinar et al., 1994;Smith, 1982;Walker and Wooders, 2001). Although optimal strategies have been outlined in game theory (Fundenberg and Tirole, 1991;Nash, 1950) and behavioral economists have introduced descriptive models that approximate human choice patterns (Camerer et al., 2002;Erev and Roth, 1998), only recently have the neural processes underlying mixed-strategy decision-making been examined. These neurophysiology experiments have focused on evaluative processes involved in assessing the outcome of past actions and rewards and in representing the desirability of choice stimuli (Barraclough et al., 2004;Cohen and Ranganath, 2007;Dorris and Glimcher, 2004;Seo and Lee, 2007). Little is known, however, about the processes immediately preceding action selection during mixed-strategy tasks. The current experiments use a behavioral measure of motor preparation to gain insight into the spatial and temporal allocation of mixed-strategy action selection. We put forth two possible processes, based on previous economic and neurophysiology considerations, which could lead to mixed-strategy action selection. First, the unbiased preparation hypothesis posits that selection processes associated with 4 competing actions are equivalent and this leads to unbiased motor preparation. If, on average, actions are equally desirable over many trials of a mixed-strategy game then it is plausible that actions are equally desirable on individual trials as well. In support of this, no bias is observed in the activity related to the selection and preparation of upcoming actions when the value of those actions are equivalent during the period of uncertainty preceding instructive sensory cues (Cisek and Kalaska, 2005;Dorris and Munoz, 1998;Platt and Glimcher, 1999). The second biased preparation hypothesis stems from an alternative theoretical perspective. Although actions may be equally desirable on average during mixed-strategy equilibria, slight differences may exist in their desirability from trial to trial (Harsanyi, 1974). Neurophysiological evidence in support of this hypothesis comes from free-choice experiments in which neuronal activities associated with choice alternatives are predictive of upcoming actions (Coe et al., 2002;Dorris and Glimcher, 2004;Sugrue et al., 2004). Moreover, under conditions of equal desirability, motor preparation signals are influenced by the previous sequence of events (Dorris et al., 2000). These small, initial trial-by-trial differences in selection processes could evolve to bias the allocation of motor preparation in advance of choice stimuli. To test between these two possibilities, humans played an oculomotor version of the mixed-strategy game „matching-pennies‟ using saccadic eye movements to indicate their choices. The otherwise covert process of saccade preparation was probed by the ability of visual distractors presented in advance of the choice targets to trigger erroneous saccades known as oculomotor captures (Theeuwes et al., 1998). The use of oculomotor captures is similar in its logic to how electrical micro-stimulation of the visuosaccadic 5 circuitry has been used to examine the ongoing formation of perceptual-based decisions (Gold and Shadlen, 2000;Gold and Shadlen, 2003). In those experiments, the deviation in the endpoints of electrically-evoked saccades provided an instantaneous read-out of the degree to which developing perceptual decisions shaped activity patterns in brain regions responsible for producing the appropriate motor response. Here, the type of saccadic response following non-invasive visual distractors provided the instantaneous read-out of the degree to which ongoing selection processes shape saccade preparation. Specifically, the probability that a distractor evokes an oculomotor capture and, when it fails to do so, the probability of choosing each of the subsequent targets are indicative of the level of saccade preparation at the time and location at which the distractor is presented (Dorris et al., 2007;Milstein and Dorris, 2007). Unequal allocation of either oculomotor captures or subsequent target choices would provide support for the biased preparation hypothesis. Equal allocation of these behavioral measures would provide support for the unbiased preparation hypothesis. Importantly, we are not trying to make the claim that mixed-strategy actions are selected exclusively by the brain‟s motor structures. Although oculomotor captures measure underlying saccade preparation most directly, other cognitive processes, such as those involved in decision-making on a higher level of abstraction and in visuospatial attention, are also likely important for this selection process. The use of oculomotor captures takes advantage of the fact that potentially diverse cognitive processes involved in this selection process must ultimately be consolidated in order to produce a single action. Therefore, our claim is that oculomotor captures provide insight into how the 6 summative selection process shapes the spatial and temporal allocation of saccade preparation. 7 METHODS Experiments were conducted using 10 human subjects (four female and six male, 22 to 34 years old, including the two authors) using eye-tracking procedures described previously (Milstein and Dorris, 2007). All procedures were approved by the Queen's University Human Research Ethics Board. Subjects were seated in front of a computer monitor with their heads stabilized on a chin rest positioned 59 cm from the centre of a 17‟‟ CRT monitor (refresh rate 100Hz) that spanned 32° of their central visual field. Left eye position was recorded at 250 Hz with resolution of 0.1˚ using an infra-red eye tracker system (Eyelink II, SR Research). Real-time data acquisition software (Gramalkn, Ryklin Software) was used for stimuli presentation and data collection. Data analysis was performed offline using MATLAB, version 7.04 (Mathworks Inc) on a Pentium 4 personal computer. Each task consisted of randomly interleaved standard trials (Fig. 1A – 75%) and distractor trials (Fig. 1B – 25%). Subjects were required to maintain central gaze fixation throughout the 800 ms presentation of the fixation point and after its removal during a fixed 600 ms warning period. Subjects indicated their choice by directing a saccade to one of two targets presented simultaneously 8° right and left of center. The fixed warning period and known target locations facilitated advanced saccade preparation (Dorris and Munoz, 1998). Following choice selection, a red box appeared around one of the targets for 400 ms indicating the computer opponent‟s choice (see below). Each trial ended with a central display of the monetary payoff that lasted 1000 ms followed by a 1000 ms intertrial interval. 8 The level of saccade preparation allocated towards the two targets was probed at specific times during the warning period of distractor trials (Fig. 1B). A single green stimulus was briefly flashed for 70 ms with equal probability at each of the two upcoming target locations. The distractor was equally likely to be presented at 100, 200, 300, 400, or 500 ms into the 600 ms warning period. Distractors occasionally triggered erroneous saccades referred to as oculomotor captures (Theeuwes et al., 1998) despite instructions not to look to the distractor and reward being withheld for doing so. Mixed-Strategy Task. Subjects competed in an oculomotor version of the game „matching pennies‟ against a dynamic computer opponent (Fig. 1). This opponent performed statistical analyses on the subject‟s history of previous choices and payoffs to uncover systematic biases in their choice strategy (see algorithm 2 from (Lee et al., 2004) for specific details). If both players chose the same target, the subject won $0.04; otherwise no monetary reward was received. Importantly, subjects were fully informed of the rules of the game and that they were playing a strategic game against a dynamic, competitive computer opponent. To examine how task timing influenced strategic responses, all subjects also performed a version of the mixed-strategy task in which the fixed warning period was extended from 600 to 1200 ms. Under this condition, distractors were equally likely to be presented at 100, 300, 500, 800 and 1100ms into the 1200 ms warning period. Pure-Strategy Task. The pure-strategy task (Fig. 1) was identical to the mixed-strategy task except the computer opponent always chose the same target for an entire block of trials and the payoff was reduced from $0.04 to $0.02 per trial. The rewarded direction alternated between the left and right targets across blocks of trials. 9 Instructed Task. The instructed task was also identical to the mixed-strategy task except that a single saccadic target was presented on each trial with equal probability to the left or right. Subjects received $0.02 for successfully acquiring the target with a saccade. Data Analysis. Subjects came in on four separate days and completed four, 220 trial blocks of each task. The first day was a practice session and this data was not included in the final analysis. Only seven of the ten subjects completed the pure-strategy task. The first 20 trials of each block were discarded from analysis to allow subjects time to adjust to the new task conditions. A correct saccade was defined as the first saccade initiated between 120 ms and 350 ms after target onset that landed within 3˚ of the target. An oculomotor capture was defined as a saccade initiated between 70 ms to 220 ms after the distractor onset that landed within 5° of the distractor location. The temporal and spatial constraints were relaxed slightly for distractor-triggered oculomotor captures because these saccades are known to be of shorter latency and hypometric relative to slower correct saccades (Milstein and Dorris, 2007;Theeuwes et al., 1999). The degree of response stochasticity was quantified using entropy (Lee et al., 2004). Entropy describes the measure of uncertainty about the state of the observed system. The entropy, H, can be calculated as follows: pk log 2 pk , H k where pk is the probability of the system being in the state k. Our blocks of trials were of sufficient length to look at patterns of three consecutive trials. There are eight possible left (L) right (R) sequences (RRR, RRL, RLR, RLL, LRR, LRL, LLR, and LLL). 10 8 Therefore, the specific calculation of entropy used here is H p k log 2 p k , whereby k 1 pk is the probability of finding a sequence k in the array of subject‟s outcomes. If behavior is truly stochastic, each of these sequences should be represented with equal probability corresponding to maximum entropy of 3 bits. We subjected data from distractor trials to a three-way repeated measures ANOVA with task type, distractor timing and the target response direction as factors. A chi-square test was used to test whether there were differences in proportions of responses to distractors presented at different times during the warning period. To test whether the proportion of responses increased or decreased we compared the first (i.e., 100 ms) and last (i.e., 500 ms) time of distractor presentation using 95% confidence intervals for populations of proportions (Utts and Heckard, 2007). 11 RESULTS Temporal Preparation of Mixed-Strategy Responses: Oculomotor Captures The first results section analyzes the pattern of distractor-directed oculomotor captures to determine how motor preparation was temporally allocated in advance of strategic response. First, to establish that distractors were effective probes of underlying saccade preparation, the pattern of oculomotor captures was examined during the purestrategy task in which the planned saccade was known with near certainty. During a representative block, subjects chose the rewarded target exclusively during standard trials (Fig. 2A) and, importantly, only distractors presented at the rewarded location triggered oculomotor captures (Fig. 2B). In total, subjects chose the rewarded target 99.5% of the time during standard trials and oculomotor captures were directed almost exclusively towards distractors flashed at rewarded rather than unrewarded locations (Fig. 3A; χ2 test, p<0.001). This conjunction between the direction of saccadic targets and oculomotor captures suggests that distractors effectively probed the spatial allocation of saccade preparation. Ideally, choices should be allocated equally and stochastically between the two targets during the instructed and mixed-strategy standard trials (Fig. 2C and E). This behavioral pattern is expected during instructed trials because the computer randomly selected either target location. This behavioural pattern should be approached during mixed-strategy trials because the dynamic computer opponent encouraged random choice selection by exploiting predictability in the subjects‟ behavior. We used the term “approached” because it is well documented that humans often fall short of achieving the exact proportions and stochasticity requirements predicted by the Nash equilibrium 12 solution (Erev and Roth, 1998). Overall, the percentage of saccades directed to each target did not differ from 50% during these two task (instructed: 49.6±0.9% left, t-test, p>0.05; mixed-strategy: 48.9±0.8% left, t-test, p>0.05) nor did they differ between these two tasks (t-test, p=0.55). To measure stochasticity, we calculated entropy in patterns of choices (see METHODS). Instructed and mixed-strategy tasks both approached maximal entropy of 3 bits of information (instructed: 2.95±0.007% bits; mixed-strategy: 2.94±0.009% bits; t-test, p>0.05). Consequently, any differences that exist between these two tasks during the following analysis of distractor trials cannot be accounted for by any differences in overall response patterns but presumably reflect the instructed versus voluntary nature of these two tasks. During the instructed task, oculomotor captures were directed to both left and right distractors which is consistent with the uncertainty of the upcoming target location (Fig. 2D). Overall, the pattern of oculomotor captures did not differ towards left or right distractors (not shown; p>0.05). In addition, the overall percentage of oculomotor captures increased as distractors were presented later in the warning period (Fig. 3C; ANOVA, p<0.001). Having established the pattern of oculomotor captures when the upcoming response was nearly certain (the pure-strategy task) and uncertain (the instructed task), we now examine the pattern of oculomotor captures during the mixed-strategy task. Like the two other tasks, the percentage of oculomotor captures increased when distractors were presented later in the warning period (Fig. 3E; p<0.001). Moreover, the pattern of oculomotor captures did not differ towards left or right distractors (not shown; p>0.05). Together, this pattern of oculomotor captures suggests that during the mixed-strategy task 13 saccade preparation begins in advance of target presentation and increases as the time of target presentation approaches. Increasing the warning period delayed the time course of saccade preparation (Fig. 4). Initially (i.e., ≤300ms distractor presentation), the percentages of oculomotor captures did not differ between short and long warning period blocks but these percentages diverged when distractors were presented later in the warning period (i.e., 500ms distractor presentation; p<0.01). By the end of each of the respective warning periods, the percentage of oculomotor captures reached similar levels. Lastly, we examined whether the time at which the distractor was presented during the warning period influenced oculomotor capture reaction times. Higher preparatory activity later in the warning period could not only increase the likelihood that visual transients associated with distractor presentation would surpass saccadic threshold but it could also reduce the time required for these visual transients to surpass saccadic threshold. We performed a two-way repeated measures ANOVA on the saccadic reaction times associated with oculomotor captures and found no significant effect of the factors task type (p=0.47) or distractor timing (p=0.73). This is consistent with other studies suggesting that transient visual bursts of activity in visuosaccadic structures are of such rapid onset and high frequency that if they trigger a saccade these saccades occur at very stereotyped latencies (Dorris et al., 1997;Dorris et al., 2007;Godijn and Theeuwes, 2002). Spatial Preparation of Mixed-Strategy Responses: Target-Directed Saccades The analysis of oculomotor captures revealed that during the mixed-strategy task saccade preparation increased as the time of target presentation approached. It is important to recognize that the percentage of oculomotor captures indicates how saccade 14 preparation is allocated probabilistically across blocks of trials rather than on a trial-bytrial basis. Equal proportions of left and right oculomotor captures during the mixedstrategy task could arise if saccades were prepared equally in both directions on every trial (i.e., unbiased preparation hypothesis) or biased in a particular direction on every trial (i.e., biased preparation hypothesis). For the latter to be the case, however, the direction of the planned saccade must vary stochastically from trial to trial. An analysis of target choices during distractors trials follows, whose purpose is to distinguish whether saccade preparation is spatially allocated in a biased or unbiased manner. This section analyzes the choice of targets on those distractor trials in which the distractor failed to trigger a saccade. The following logic is critical for establishing this analysis: saccades planned stochastically either towards the left or right, coupled with distractors presented equally likely at either location, results in distractors that are 50% likely to be presented at the same location to which a saccade is being prepared on any given trial. Therefore, target-directed saccades that occur during distractor trials will be classified as (Fig. 1B): 1) those directed towards the target on the same side as the distractor (Tsame), and 2) those directed towards the target on the side opposite the distractor (Topp). The assumption underlying this analysis is that the selection process continues to culminate towards choosing the target that would have occurred had the distractor not been presented. To test this assumption, target-directed saccades were first analyzed during the pure-strategy task. During the representative block, the subject chose the rewarded target regardless of where the distractor was presented (Fig. 2B – black traces) and this observation held for over 97% of all trials across subjects. This finding suggests that 15 ongoing saccade preparation processes continued largely undisturbed on those trials in which the distractor failed to trigger an oculomotor capture. Target choices during the pure-strategy task were then analyzed relative to distractor location rather than reward location (Fig. 3B). The percentage of Topp did not vary with the time of distractor presentation (χ2 test, p>0.05) remaining near 50% throughout the warning period (Fig. 3B – solid line). This pattern arose because oculomotor captures were successfully withheld during the 50% of trials in which the distractor was presented on the unrewarded side (Fig. 3A – unfilled squares) and saccades were subsequently directed towards the target on the opposite, rewarded side. The percentage of Tsame, however, decreased as the time of target presentation approached (Fig. 3B – dashed line; χ2 test, p<0.001). This pattern arose because as saccade preparation towards the rewarded target increased during the warning period, distractors presented on that same side were more likely to trigger an oculomotor capture. Importantly, conducting this target-directed analysis under conditions with a known saccade goal establishes how Topp and Tsame differ when saccade preparation is biased towards one location. There was no a priori reason that saccade preparation should be spatially biased during the instructed task because the target was equally likely to be presented in either direction. Consistent with this, the percentages of both Tsame and Topp decreased as the time of target presentation approached (Fig. 3D - Tsame, p<0.001; Topp, p<0.05) and these curves did not differ significantly from each other (p>0.05). Target choices following distractor presentation during the mixed-strategy task more closely resembled that of the pure-strategy task than the instructed task; the 16 percentage of Tsame responses decreased (χ2 test, p<0.001) while the percentage of Topp responses did not vary with the time of distractor presentation (χ2 test, p>0.05) (Fig. 3F). Importantly, the Topp and Tsame curves significantly differed at the end of the warning period suggesting a bias existed in favor of one response just before the target presentation in the mixed-strategy task (Fig. 3F; p<0.001). To describe the overall effects, we subjected data from distractor trials to a threeway repeated measures ANOVA with task type, distractor timing and target response direction (Tsame vs. Topp) as factors. There was an effect of the task type (p<0.0001), distractor timing (p<0.0001) and target response direction (p<0.0001) on the proportion of target-directed saccades following distractor presentation. There were significant interactions between task type and distractor timing (p<0.05), task type and target response direction (p<0.01), and distractor timing and target response direction (p<0.05). Lastly, to differentiate saccade preparation processes underlying the instructed and mixed-strategy tasks we conducted another three-way repeated measures ANOVA that included only these two task types. We found a significant interaction between task type and target response direction (p<0.005). This final statistical interaction is of particular importance because it highlights the differences in saccade preparation processes that exist between the instructed and mixed-strategy tasks. DISCUSSION Our results support a biased preparation mechanism preceding mixed-strategy actions. First, we found that strategic saccade preparation accumulated in advance of the presentation of target stimuli. This temporal property of the preparatory process was indexed by an increase in oculomotor captures as the time of target presentation 17 approached (Fig. 3A,C,E). This accumulation depended on the specific timing of the task as evidenced by a delay in the time course of oculomotor captures that accompanied extending the fixed warning period (Fig. 4). Second, we found that strategic saccade preparation became biased to one target over another in advance of the target stimuli. This spatial property of saccade preparation was evidenced by the pattern of targetdirected saccades that emerged when analyzed relative to the distractor location (Fig. 3B,D,F). Temporal and Spatial Allocation of Saccade Preparation Preceding Mixed-Strategy Responses This oculomotor version of „matching-pennies‟ mimics many features of the more familiar manual version of the game (a.k.a., „odds-evens‟, „matching‟, or „choosing‟). In the manual version, players indicate their choices by displaying either one or two fingers. Before game play, players must determine who will fulfill the role of „odds‟ and who „evens‟. „Odds‟ wins if the players display different numbers of fingers (i.e., one/two or two/one) and „evens‟ wins if the players display the same number of fingers (i.e., one/one or two/two). To facilitate simultaneous presentation, players must also agree upon the number of „primes‟ or fist-pumps that will be required before displaying their manual choices. The fixed warning period aids subjects in initiating their saccade choices within a narrow temporal window (Munoz et al., 2000) (i.e., 120-350 ms after target presentation) and thus performs a role analogous to priming. Oculomotor captures increased as distractors were presented later in the warning period (Fig. 3) and this increase was delayed in time when the duration of the warning period was extended (Fig. 4). Both of these observations suggest that motor preparation peaked coinciding with the 18 predictable timing of the target stimuli. This makes sense from a strategic standpoint; maintaining motor preparation at a high sustained level could lead to premature responses thus providing one‟s opponent with a strategic advantage whereas maintaining motor preparation at a low sustained level could result in delayed responses and disqualification from that round of play. Recently described behavioral and neural effects associated with predictably timed responses (Ding and Hikosaka, 2007;Janssen and Shadlen, 2005;Maimon and Assad, 2006) likely played a role in the timing of the strategic responses described here. The predictable target locations associated with this oculomotor „matching pennies‟ game also enabled us to examine the spatial allocation of motor preparation. This spatial analysis may not have been feasible with complexly encoded hand configurations typical of the manual version of the game. Moreover, probing with visual distractors provided insight into the spatial allocation of saccade preparation that would otherwise be difficult given the stochastic nature of choice selection during mixedstrategy games. Specifically, the analysis of target choices following successfully withheld oculomotor captures rests upon the underlying assumption that the same target would have been selected had the distractor not been presented. That subjects overwhelmingly continued to choose the rewarded target under the predictable purestrategy task partly bears this assumption out. Overall, the pattern of target choices following distractors (Fig. 3) shows that during the mixed-strategy task, preparatory processes became biased in favor of one response before the presentation of target stimuli. This locus of saccade preparation varied stochastically from trial to trial. The extent to which preparatory processes were 19 spatially biased was not as pronounced preceding strategic responses as when the rewarded saccade was known with full foreknowledge (i.e., compare the separation in the Topp and Tsame curves for the mixed- (Fig. 3F) and pure-strategy (Figs. 3B) tasks). The differing time course and extent of saccade preparation bias may reflect that ongoing deliberation takes place during the mixed-strategy task compared to early commitment during the pure-strategy task. The current results are important because they suggest that accumulation-towardsthreshold models, which have been so successful in describing simple perceptual decision-making, may also, be applicable to the formation of strategic decisions. The rate of activation accumulation preceding perceptual decisions is largely driven by the quality of sensory evidence (Hanes and Schall, 1996;Shadlen and Newsome, 2001) whereas baseline activation is largely shaped by prior information such as the probability or magnitude of reward (Dorris and Munoz, 1998;Ikeda and Hikosaka, 2003). During the mixed-strategy task, however, both of these selection factors are lacking. First, there is no sensory evidence indicating the correct choice; in fact, subjects are situated in the dark during the warning period. Second, subjects are presumably indifferent on average between available responses because their overall payoffs are equal once the mixedstrategy equilibrium is established (Nash, 1950). We speculate on two mechanisms that could lead to selection biases proceeding stochastic responses. One possibility is that noise in the accumulation signal randomly biases the selection process in favor of one of the options which, coupled with local recurrent excitation and distal competitive inhibition, further strengthens this selection path over time (Dorris et al., 2007;Wong and Wang, 2006). Another possibility is that the recent history of responses and rewards 20 subtly biases subsequent selection processes. In this case, a player would never actually be indifferent to the available responses because of small perturbations in subjective desirability between the options from trial to trial (Harsanyi, 1974). It was not feasible to examine this possibility in the current data set because interleaved distractor trials prohibitively disrupted the natural sequence of events. However, previous studies examining the choice patterns during mixed-strategy games suggests that the previous history of choices and rewards may indeed have a subtle effect on selecting upcoming choices (Corrado et al., 2005;Dorris et al., 2000;Lau and Glimcher, 2005;Lee et al., 2004;Thevarajah and Dorris, 2007). Of course, following a “win-stay/lose shift” strategy too strictly would lead to predictable behavior that could be exploited by one‟s opponent. Outstanding Issues Although our results are consistent with the gradual accumulation of preparatory activation towards one saccade threshold over another, they could also be achieved if the level of activation increased abruptly on each trial (i.e., a “Eureka!” transition). The pattern of increasing oculomotor captures could arise from this latter mechanism if such step transitions occurred more frequently as the time of the sensory trigger approached. Oculomotor captures only describe the probabilistic time course of saccade preparation across blocks of trials. Direct neuronal recording within visuosaccadic structures will be required to distinguish whether there is a gradual or step transition in preparation signals during single trials. Finally, it is worth reiterating that oculomotor captures more accurately reflect the temporal and spatial allocation of saccade preparation rather than decision or attention processes. Under some conditions, the formation of perceptual decisions appear to share a 21 common neural substrate with preparatory signals in brain regions involved in generating the appropriate motor action (Gold and Shadlen, 2000;Gold and Shadlen, 2003;Juan et al., 2004). Unlike perceptual decisions which require the decoding of immediate sensory evidence, strategic decisions can be made well in advance of stimulus presentation and even before the beginning of a trial. Although the decision-making process can occur well in advance, to prevent premature or delayed strategic responses, it is still prudent for motor preparation to reflect the specific timing of the task. Nevertheless, it would be illogical for motor preparation to precede the decision process. Therefore, we conclude that the decision process also occurs in advance of the trigger stimuli during mixedstrategy tasks, either before, or simultaneous with, motor preparatory processes. 22 Acknowledgements: This work was supported by a career development award from the Human Frontier Science Program (HFSP) and a group grant from the Canadian Institutes of Health Research (CIHR) and awarded to MCD. We thank J. Green, S. Hickman, F. Paquin and R. Pengelly for technical assistance. J. Turner provided programming expertise and E. Ryklin customized the data acquisition program. J. Chan, J. Kan, D.M. Milstein, D. Theverajah, and R.Webb provided constructive feedback regarding the manuscript. 23 FIGURE LEGENDS Figure 1. All behavioral paradigms consisted of 75% standard trials (A) and 25% distractor trials (B). Each panel represents successive computer screens in time. Arrows indicate the subject‟s possible choices and the box indicates the computer‟s choice. These schematics correspond to the pure-strategy and mixed-strategy tasks; for the instructed task (not shown) only one target was presented per trial. Tsame denotes choice of target on same side as distractor, Topp, choice of target on opposite side of distractor. Figure 2. Eye position traces from representative blocks of the pure-strategy (A and B), instructed (C and D) and mixed-strategy (E and F) tasks from one subject. Each block of trials is segregated into standard trials (top panels) and distractor trials (bottom panels). Black traces correspond to target-directed correct saccades and gray traces to distractordirected oculomotor captures. Successive black vertical dashed lines represent the beginning and the end of the warning period, respectively. Vertical gray lines represent the 5 possible times of distractor presentation. Figure 3. Influence of distractor timing on oculomotor captures (top panels) and correct saccades (bottom panels) across all subjects. Error bars represent 95% confidence intervals. Asterisks denote times in which the percentages of Topp and Tsame significantly differed (χ2 test, p<0.01). Note that the Topp, Tsame and oculomotor captures sum to 100% for each time epoch in each task. Figure 4. Influence of task timing on the percentage of oculomotor captures during the mixed-strategy task. The solid and dashed lines represent the pattern of oculomotor captures when the warning period was fixed at 600 and 1200 ms, respectively. Distractors were equally likely to be presented at 100, 300, 500, 800 and 1100ms during the 1200ms 24 warning period block. Error bars represent 95% confidence intervals. The asterisks denote a statistically significant difference between percentages of responses at the same time points (χ2 test, p<0.01). 25 Reference List Aumann RJ (1985) What is Game Theory Trying to Accomplish? In: Frontiers of Economics (Arrow K, Hankapohja S, eds), pp 28-78. Oxford: Basil Blackwell. Barraclough DJ, Conroy ML, Lee D (2004) Prefrontal cortex and decision making in a mixed-strategy game. Nat Neurosci 7: 404-410. Camerer CF, Ho TH, Chong JK (2002) Sophisticated experience-weighted attraction learning and strategic teaching in repeated games. Journal of Economic Theory 104: 137188. Carpenter RH, Williams ML (1995) Neural computation of log likelihood in control of saccadic eye movements. Nature 377: 59-62. Cisek P, Kalaska JF (2005) Neural correlates of reaching decisions in dorsal premotor cortex: specification of multiple direction choices and final selection of action. Neuron 45: 801-814. Coe B, Tomihara K, Matsuzawa M, Hikosaka O (2002) Visual and anticipatory bias in three cortical eye fields of the monkey during an adaptive decision-making task. J Neurosci 22: 5081-5090. Cohen MX, Ranganath C (2007) Reinforcement learning signals predict future decisions. J Neurosci 27: 371-378. Corrado GS, Sugrue LP, Seung HS, Newsome WT (2005) Linear-Nonlinear-Poisson models of primate choice dynamics. J Exp Anal Behav 84: 581-617. Ding L, Hikosaka O (2007) Temporal development of asymmetric reward-induced bias in macaques. J Neurophysiol 97: 57-61. Dorris MC, Glimcher PW (2004) Activity in posterior parietal cortex is correlated with the relative subjective desirability of action. Neuron 44: 365-378. Dorris MC, Munoz DP (1998) Saccadic probability influences motor preparation signals and time to saccadic initiation. J Neurosci 18: 7015-7026. Dorris MC, Olivier E, Munoz DP (2007) Competetive Integration of Visual and Preparatory Signals in the Superior Colliculus during Saccadic Programming. J Neurosci 27: 5053-5062. Dorris MC, Pare M, Munoz DP (1997) Neuronal activity in monkey superior colliculus related to the initiation of saccadic eye movements. J Neurosci 17: 8566-8579. Dorris MC, Pare M, Munoz DP (2000) Immediate neural plasticity shapes motor performance. J Neurosci 20: RC52. 26 Driver PM, Humphries DA (1988) Protean Behaviour: The Biology of Unpredictability. Oxford University Press. Erev I, Roth AE (1998) Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. American Economic Review 88: 848-881. Fundenberg D, Tirole J (1991) Game Theory. Cambridge: MIT Press. Glimcher PW (2003) The neurobiology of visual-saccadic decision making. Annu Rev Neurosci 26: 133-179. Godijn R, Theeuwes J (2002) Programming of endogenous and exogenous saccades: evidence for a competitive integration model. J Exp Psychol Hum Percept Perform 28: 1039-1054. Gold JI, Shadlen MN (2000) Representation of a perceptual decision in developing oculomotor commands. Nature 404: 390-394. Gold JI, Shadlen MN (2003) The influence of behavioral context on the representation of a perceptual decision in developing oculomotor commands. J Neurosci 23: 632-651. Hanes DP, Schall JD (1996) Neural control of voluntary movement initiation. Science 274: 427-430. Harsanyi JC (1974) Equilibrium-Point Interpretation of Stable Sets and A Proposed Alternative Definition. Management Science Series A-Theory 20: 1472-1495. Ikeda T, Hikosaka O (2003) Reward-dependent gain and bias of visual responses in primate superior colliculus. Neuron 39: 693-700. Janssen P, Shadlen MN (2005) A representation of the hazard rate of elapsed time in macaque area LIP. Nat Neurosci 8: 234-241. Juan CH, Shorter-Jacobi SM, Schall JD (2004) Dissociation of spatial attention and saccade preparation. Proc Natl Acad Sci U S A 101: 15541-15544. Lau B, Glimcher PW (2005) Dynamic response-by-response models of matching behavior in rhesus monkeys. J Exp Anal Behav 84: 555-579. Lee D, Conroy ML, McGreevy BP, Barraclough DJ (2004) Reinforcement learning and decision making in monkeys during a competitive game. Brain Res Cogn Brain Res 22: 45-58. Maimon G, Assad JA (2006) A cognitive signal for the proactive timing of action in macaque LIP. Nat Neurosci 9: 948-955. 27 Maynard Smith J (1982) Evolution and the Theory of Games. Cambridge: Cambridge University Press. Miller GF (1997) Protean Primates: the evolution of adaptive unpredictability in competition and courtship. In: Machiavellian Intelligence II: Extensions and Evaluations (Whiten A BRW, ed), pp 312-340. Cambridge University Press. Milstein DM, Dorris MC (2007) The influence of expected value on saccadic preparation. J Neurosci 27: 4810-4818. Munoz DP, Dorris MC, Pare M, Everling S (2000) On your mark, get set: brainstem circuitry underlying saccadic initiation. Can J Physiol Pharmacol 78: 934-944. Nash JF (1950) Equilibrium Points in N-Person Games. Proceedings of the National Academy of Sciences of the United States of America 36: 48-49. Palacios-Heurta I (2003) Professionals Play Minimax. Review of Economic Studies 70: 395-415. Platt ML, Glimcher PW (1999) Neural correlates of decision variables in parietal cortex. Nature 400: 233-238. Rubinstein A (1991) Comments on the Interpretation of Game-Theory. Econometrica 59: 909-924. Schall JD (2001) Neural basis of deciding, choosing and acting. Nat Rev Neurosci 2: 3342. Schultz W (2006) Behavioral theories and the neurophysiology of reward. Annu Rev Psychol 57: 87-115. Seo H, Lee D (2007) Temporal filtering of reward signals in the dorsal anterior cingulate cortex during a mixed-strategy game. J Neurosci 27: 8366-8377. Shadlen MN, Newsome WT (2001) Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. J Neurophysiol 86: 1916-1936. Shinar J, Forte I, Kantor B (1994) Mixed Strategy Guidance: A New High-Performance Missle Guidance Law. Journal of Guidance, Control and Dynamics 17: 129-135. Smith JM (1982) Evolution and the Theory of Games. Cambridge University Press. Sugrue LP, Corrado GS, Newsome WT (2004) Matching behavior and the representation of value in the parietal cortex. Science 304: 1782-1787. Theeuwes J, Kramer AF, Hahn S, Irwin DE (1998) Our eyes do not always go where we want them to go: Capture of the eyes by new objects. Psychological Science 9: 379-385. 28 Theeuwes J, Kramer AF, Hahn S, Irwin DE, Zelinsky GJ (1999) Influence of attentional capture on oculomotor control. J Exp Psychol Hum Percept Perform 25: 1595-1608. Thevarajah, D. and Dorris, M. C. Influence of previous choices and rewards on superior colliculus activity is modulated by behavioral context. Society for Neuroscience . 2007. Ref Type: Abstract Trappenberg TP, Dorris MC, Munoz DP, Klein RM (2001) A model of saccade initiation based on the competitive integration of exogenous and endogenous signals in the superior colliculus. J Cogn Neurosci 13: 256-271. Usher M, McClelland JL (2001) The time course of perceptual choice: the leaky, competing accumulator model. Psychol Rev 108: 550-592. Utts JM, Heckard RF (2007) Mind on Statistics. Thompson. Walker M, Wooders J (2001) Minmax Play at Wimbledon. The American Economic Review 91: 1521-1538. Wilimzig C, Schneider S, Schoner G (2006) The time course of saccadic decision making: dynamic field theory. Neural Netw 19: 1059-1074. Wong K, Wang X (2006) A Recurrent Network Mechanism of Time Integration in Perceptual Decisions. J Neurosci 26: 1314-1328. 29 $ % 6WDQGDUG7ULDOV 'LVWUDFWRU7ULDOV ORMS 'LVWUDFWRU :DUQLQJ 3HULRG PV " 4SAME " 4OPP 6XEMHFW&KRLFH &RPSXWHU&KRLFH $ & 3XUH6WUDWHJ\7DVN ( ,QVWUXFWHG7DVN 0L[HG6WUDWHJ\7DVN +RUL]RQWDO(\H3RVLWLRQ'HJUHHV % ' ) 7LPHIURP)L[DWLRQ3RLQW2IIVHWPV ! $ISTRACTOR$IRECTED /CULOMOTOR#APTURES % ,QVWUXFWHG7DVN " $ & 4ARGET$IRECTED #ORRECT3ACCADES # 3XUH6WUDWHJ\7DVN "OTH$IRECTIONS 2EWARDED$IRECTION 5NREWARDED$IRECTION 4OPP 4SAME 0L[HG6WUDWHJ\7DVN 4IMEFROM&IXATION0OINT/FFSETMS $ISTRACTOR$IRECTED /CULOMOTOR#APTURES MS7ARNING0ERIOD MS7ARNING0ERIOD 4IMEFROM&IXATION0OINT/FFSETMS
© Copyright 2026 Paperzz