J Neurophysiol 114: 3374 –3385, 2015. First published October 14, 2015; doi:10.1152/jn.00884.2015. Action-outcome relationships are represented differently by medial prefrontal and orbitofrontal cortex neurons during action execution Nicholas W. Simon, Jesse Wood, and Bita Moghaddam University of Pittsburgh, Department of Neuroscience, Pittsburgh, Pennsylvania Submitted 14 September 2015; accepted in final form 13 October 2015 prefrontal cortex; orbitofrontal cortex; instrumental behavior; electrophysiology; reward (PFC) has long been implicated in the planning and execution of goal-directed actions (Arnsten et al. 2012; Baddeley and Della Sala 1996; Fuster 2001). This region is involved in emotion, perception, attention, and working memory, and thus is well suited to integrate these cognitive functions during action selection. Accordingly, PFC damage has been associated with aberrant decision-making and rewardrelated behavior, possibly resulting from failure to utilize representations of action-outcome relationships to guide action selection (Bechara and Van Der Linden 2005; Cardinal 2006; Killcross and Coutureau 2003; Ostlund and Balleine 2007; Schwartz 2006; Tran-Tu-Yen et al. 2009; Walton et al. 2007). Neuronal correlates of reward-directed actions have been observed across multiple regions of PFC (Furuyashiki et al. 2008; Homayoun and Moghaddam 2009; Hyman et al. 2013; Luk and Wallis 2009; Mulder et al. 2003; Narayanan and Laubach 2006; Peters et al. 2005; Sturman and Moghaddam 2011; Young and Shapiro 2011). Selection of the optimal behavioral strategy requires information about the relationship between actions and outcomes. Because prior research suggests that PFC acts as an “executor” region, PFC neural PREFRONTAL CORTEX Address for reprint requests and other correspondence: N. W. Simon, Univ. of Pittsburgh, Dept. of Neuroscience, A210 Langley Hall, Pittsburgh, PA 15260 (e-mail: [email protected]). 3374 activity would be expected to encode information about these relationships during action performance. PFC is divided into multiple subregions that play complementary roles in reward-related behavior (Brown and Bowman 2002; Buckley et al. 2009; Chudasama and Robbins 2003; Glascher et al. 2012; Gourley et al. 2010; Luk and Wallis 2013; Moghaddam and Homayoun 2007; Rudebeck et al. 2006). The orbitofrontal cortex (OFC) has been implicated in outcome representation (Fellows 2011; Gallagher et al. 1999; Izquierdo et al. 2004; Morrison and Salzman 2009; Padoa-Schioppa and Assad 2006; Roesch and Olson 2004; Rolls 2004; Rolls and Baylis 1994; Schoenbaum et al. 1998; Simmons and Richmond 2008; Tremblay and Schultz 1999; van Duuren et al. 2007), whereas medial PFC (mPFC) has been implicated in assigning value to actions (Behrens et al. 2007; Corbit and Balleine 2003; Histed et al. 2009; Luk and Wallis 2013; Matsumoto et al. 2007; Matsumoto and Tanaka 2004; Mulder et al. 2003; Oliveira et al. 2007; Rudebeck et al. 2008). In the current study, we recorded single units in rat OFC and mPFC as we independently manipulated the relationship between outcomes and actions (specifically, outcome value, and the number of action trials required to earn an outcome, or “action requirement”). This design allowed us to parse different aspects of the action-outcome relationship and provided insight about how PFC neurons encode this information. Finally, recording simultaneously from OFC and mPFC allowed us to assess whether this information is segregated among PFC regions. MATERIALS AND METHODS Subjects Male Sprague-Dawley rats (n ⫽ 21, 250 g upon arrival, Harlan, Frederick, MD) were used for this experiment. All subjects initially were pair-housed with a similarly aged animal on a 12:12-h reverse dark/light cycle (lights on at 7 pm), then individually housed following surgery. Rats were food deprived to ⬃90% of free feeding weight throughout the experiment. All research methods included here were approved by the Institutional Animal Care and Use Committee at the University of Pittsburgh. Apparatus Behavioral testing was performed in chambers with metal front and back walls, transparent Plexiglas side walls, and a steel wire floor (Coulbourn Instruments, Allentown, PA). The chamber was equipped with an automated feeder and corresponding food trough on one wall, and a nosepoke port mounted on the opposite wall. The chamber was located inside a sound attenuating cubicle, and behavior was monitored throughout each session using a camera mounted on the back wall. All behavioral experiments were designed and performed with Graphic State software (Coulbourn Instruments, Allentown, PA). 0022-3077/15 Copyright © 2015 the American Physiological Society www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Simon NW, Wood J, Moghaddam B. Action-outcome relationships are represented differently by medial prefrontal and orbitofrontal cortex neurons during action execution. J Neurophysiol 114: 3374–3385, 2015. First published October 14, 2015; doi:10.1152/jn.00884.2015.—Internal representations of action-outcome relationships are necessary for flexible adaptation of motivated behavior in dynamic environments. Prefrontal cortex (PFC) is implicated in flexible planning and execution of goal-directed actions, but little is known about how information about action-outcome relationships is represented across functionally distinct regions of PFC. Here, we observe distinct patterns of actionevoked single unit activity in the medial prefrontal cortex (mPFC) and orbitofrontal cortex (OFC) during a task in which the relationship between outcomes and actions was independently manipulated. The mPFC encoded changes in the number of actions required to earn a reward, but not fluctuations in outcome magnitude. In contrast, OFC neurons decreased firing rates as outcome magnitude was increased, but were insensitive to changes in action requirement. A subset of OFC neurons also tracked outcome availability. Pre-outcome anticipatory activity in both mPFC and OFC was altered when reward expectation was reduced, but did not differ with outcome magnitude. These data provide novel evidence that PFC regions encode distinct information about the relationship between actions and impending outcomes during action execution. PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS Surgery Before behavioral training, rats were implanted bilaterally under isoflurane anesthesia with eight-channel microelectrode arrays for single unit recording in mPFC and OFC. Brain regions were counterbalanced by hemisphere across rats. Coordinates were ⫹3 AP, ⫹0.5 ML, and ⫺4 DV for mPFC, and ⫹3 AP, ⫹3.3 ML, and ⫺5.5 DV for OFC. Habituation, as described above, started at least 1 wk after surgery. Behavioral Training sucrose pellet reward. During this block of trials, the task proceeded identically to the training sessions. During a second block of 30 trials, every action was reinforced with delivery of four sucrose pellets. During the third and final block of 30 trials, actions were again reinforced with a single sucrose pellet. Therefore, outcome magnitude was manipulated (1:4:1 sucrose pellets), while the number of actions required remained constant. After this session, each animal was run on a session in which 1 action was reinforced with 1 reward, in order to ensure that the original action-outcome relationship was intact before proceeding with the second shift. Shift 2: alternating reward trials (action requirement shift). During the first block of 30 trials, a single action was reinforced with a single sucrose pellet. For the remainder of the session, the first action during the response period was unrewarded and caused immediate progression to the ITI. In the following trial, an action was reinforced with a single pellet. As such, trials alternated between unrewarded and rewarded, and two actions in consecutive trials were required for reinforcement. Because two actions were required for each reward delivery, we refer to this phase of the task as a shift in action requirement. It could also be interpreted by the rat as a change in reward probability, as one in every 2 responses is reinforced. If the subject performed a correct action but then missed the response period on the following trial, the two trial sequence reset. The subjects each performed 60 trials in the alternating reward trial block. Details of the task are depicted in Fig. 1. Electrophysiology Neural signals were buffered by a headstage amplifier before being amplified and analog band pass filtered by a preamplifier. The headstage amplifier and cables were connected to a commutator allowing the animal to move freely in the behavioral chamber. Spikes were analog filtered between 300 Hz and 8 kHz before being digitized at 40 kHz. Voltage threshold crossing waveforms were saved to disk by PC controlled recording software (Plexon, Dallas, TX). The behavioral system controller sent TTL pulses to the neural data acquisition system that were used to synchronize behavioral events with neural data. Units were sorted offline using standard techniques and the Offline Sorter software package (Plexon). Fig. 1. Behavioral procedures and histology. A: schematic detailing the task progression across experimental sessions. Following shaping, animals were given a baseline session, followed by the outcome magnitude shift session. After the magnitude shift session, another baseline session was performed before the action requirement shift was instituted. B: during the outcome magnitude shift session, the first 30 trials were 1 action:1 pellet, the second 30 were 1 action:4 pellets, and the final 30 were restored to 1 action:1 pellet. C: during the action shift session, the first 30 trials were 1 action:1 pellet, then the following 60 trials were 2 actions:1 pellet. In the action requirement shift session, each successful action during the cue was counted as a trial; thus 30 of these trials were unrewarded, and 30 were rewarded. In order to receive reinforcement, rats were required to complete 2 consecutive trials; a missed trial caused the 2 actions:1 pellet sequence to reset. D: electrode placements in medial prefrontal (mPFC) and orbitofrontal cortex (OFC). Each box represents the placement of one electrode array. Note that all OFC placements were located in the lateral subregion of the OFC. J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Rats were handled and habituated to the testing room, sucrose pellets, and connecting or disconnecting the headstage cable. After habituation, they were given a 30-min magazine training session, during which sucrose pellets (45 mg, Bio-Serv, Frenchtown, NJ) were delivered into a food trough at random intervals (75 ⫾ 45 s). On day 2, rats were trained to perform a single nose-poke when a port was illuminated, which was followed by immediate pellet delivery, then a 5-s intertrial interval (ITI). The nose-poke port light was extinguished immediately after the nose-poke was performed. This training session was repeated daily until rats completed at least 50 reinforced nosepokes. Once rats met this criterion, in all trials the nose-poke port was again illuminated, but only for a 10-s response period. A nose-poke during this period resulted in termination of the port light, and food was delivered after a 2-s delay. This delay was instituted to provide clear temporal dissociation of the action from outcome delivery, and was fixed at 2 s regardless of whether the rat entered the trough or not after the action. After a variable ITI of 10 –20 s, another trial was initiated. If the rat failed to respond during the 10-s response period, then the port light was extinguished and the trial proceeded directly to the ITI. Each trial was 30 s long, and each 45-min session consisted of 90 trials. Rats were trained until they were able to correctly perform the action in 80% of the trials. To assess how PFC units responded to changes in action-outcome relationships, rats were subjected to two shifts in task components on separate days: an outcome magnitude shift and an action requirement shift. Shift 1: outcome magnitude shift. During the first block of 30 trials, a single action during the response period was reinforced with a single 3375 3376 PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS Histology Statistical Analysis After completion of the experiment, rats were anesthetized with chloral hydrate and perfused with saline and 10% formalin solution. Brains were removed and placed in 10% formalin, then cryoprotected in a 30% formalin sucrose solution. Brains were sectioned in 60-m coronal slices, stained with cresyl violet, and mounted to microscope slides. Finally, electrode-tip placements were verified under a light microscope (Fig. 1). For comparisons in behavior between trial types, we used one-way repeated-measures ANOVA as an omnibus test, and paired t-tests to compare specific trial types. Paired t-tests were used to compare average neuronal activity between different trial types (1 pellet vs. 4 pellets) during the epochs of interest for the magnitude shift. One-way ANOVA was used to compare all three trial types (preshift, postshiftunrewarded, postshift-rewarded) for the alternating reward trial shift, with individual repeated-measures comparisons used to further analyze differences between pairs of trial types. Behavior Analysis Electrophysiological Analysis Analyses focused on data obtained from 3 sessions: 1) the final day of baseline training, 2) the outcome magnitude shift, and 3) the action requirement shift. All analyses were conducted using custom-made analysis routines executed in Matlab version 2010b (MathWorks, Nattick, MA). All single unit discharge rates were calculated in 50-ms bins and Z-score normalized against a 5-s segment of the ITI (9 s prior to trial initiation). Data were smoothed using a 5-point moving average filter prior to both analysis and presentation to minimize the error in estimates of firing rate as recommended by Kass et al. (2003). The current analyses focused on two periods: a peri-action window (5 bins, 250 ms total length) centered on the execution of each action, and an outcome anticipation window (10 bins, 500 ms) that immediately preceded reward delivery. These windows were selected based on the most commonly observed patterns of responses in the data (see Fig. 4). Selective units were defined as “activated” or “suppressed” for either of these events based on two criteria: 1) average firing rate (Hz) during the period surrounding the action/prior to the outcome was significantly different from average firing rate during the baseline period via a paired-samples t-test; and 2) the period contained at least three consecutive 50-ms bins in which the Z score exceeded 1 for activated units, or was less than ⫺0.75 for suppressed units. (A less stringent criterion was used for suppressed units because of floor effects.) All neuronal responses were visually verified to ensure the accuracy of the classification criteria. Units with a baseline firing rate greater than 10 Hz, and spike duration less than 0.5 ms, were classified as fast-spiking units (Connors and Gutnick 1990; Homayoun and Moghaddam 2007; Tierney et al. 2004). The percentage of units meeting these criteria was too low to permit reliable statistical analysis (5.65% across all three days of analysis), and therefore action and outcome anticipation-selective fast spiking units were excluded from this study. RESULTS Behavior Rats were first trained on a simple action-outcome task in which a nose-poke into a port was reinforced with a single sugar pellet. Then, to assess how PFC neurons encode shifts in action-outcome relationships, rats were subjected to two shifts in task contingencies on separate days: 1) an outcome magnitude shift, in which one instrumental action was rewarded with four pellets, then 2) an action requirement shift in which two actions were required to receive one pellet (Fig. 1). Before the shifts, rats required 5.8 ⫾ 0.4 sessions to achieve task criterion on the action-outcome task (responding correctly on at least 80% of trials). In the final training session, subjects completed 84.51% of trials, with an average latency to respond of 3.71 s. During the magnitude shift session, rats generally entered the food trough prior to food delivery. There was a difference between trial types in latency to enter the food trough after the nose poke (which is henceforth referred to as the “action”), such that rats were quicker to enter the food trough for the 4 pellet outcome than the 1 pellet outcome (t14 ⫽ 3.09, P ⫽ 0.01; Fig. 2A). This measure is indicative of greater motivation on the 4 pellet trials (Bohn et al. 2003; Holland and Straub 1979; Schoenbaum and Setlow 2003) and provides a critical behavior index demonstrating that rats actively discriminated between trial types. However, there was no difference in latency to perform the action between different outcome magnitudes (t21 ⫽ 0.22, P ⫽ 0.83; Fig. 2C). During the action requirement shift session, in contrast, a difference in response latency was observed between trial types (F2,32 ⫽ 7.031, P ⫽ 0.003; Fig. 2B). Rats demonstrated an increased latency to respond on unrewarded compared with rewarded trials (t16 ⫽ 3.58, P ⫽ 0.003). This indicated that they were able to track impending rewards on a trial-by-trial basis rather than expecting a 50% probability of reinforcement on all trials following the shift (Fig. 2). Data showing latency to enter the food trough from the action shift session are not reported here because there were a limited number of trials in which rats entered the food trough prior to reward delivery after this shift. Collectively, these data provide evidence that rats behaviorally distinguished trial types before and after shifts of both outcome magnitude and action requirement. Furthermore, latency to enter the trough was sensitive to changes in outcome magnitude, whereas latency to perform the action was sensitive to reward availability. Neural Activity During Behavior A total of 22 rats with microelectrode arrays implanted in mPFC and/or OFC were utilized for this experiment (Fig. 1). We recorded from 228 units in mPFC and 164 units in OFC J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 To assess whether rats were able to detect the shifts in actionoutcome relationships, we measured different aspects of behavioral performance. First, we assessed the latency to perform the action (nose-poke) after cue presentation. Following the alternating reward trial shift, when two actions were required to receive a reward, only trials that were part of a full, rewarded sequence were included in this measure (i.e., omitted trials and unrewarded trials not followed by a rewarded action were not included). Latency to enter the food trough after the action was also measured. Because there was a 2-s gap between the action and reward delivery, rats were often able to enter the trough prior to food delivery. For this analysis only trials in which rats entered the trough prior to food delivery were included, as latency to enter the food trough has often been used as a measure of motivation/expectation (Bohn et al. 2003; Holland and Straub 1979; Schoenbaum and Setlow 2003). Entries into the trough after reward delivery were not included in this measure, as these trials were likely more reflective of reaction time to pellet delivery (rather than reward expectation). This measure was used only for the outcome magnitude shift session, because rats were less likely to enter the food trough prior to reward delivery following the alternating reward trial shift. PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS 3377 across the 3 sessions for which data are shown (final training session, outcome magnitude, and action-requirement shift). Our analyses focused on the peri-action and outcome anticipation periods of these sessions. Characterization of action and outcome anticipation evoked neuronal firing. During the final training session, after animals had acquired the action-outcome relationship, we recorded 77 mPFC and 68 OFC units. As expected, we observed selectivity for different task events in both PFC regions, with some units activated by events, and others suppressed by events (Fig. 3). Several of these units were responsive to the instrumental action, with the action temporally defined as when the rat activated a photodetector stationed inside the nose poke port. This period includes action preparation, action execution, and cue light offset. Among these units, 15 in mPFC and 11 in OFC units were activated in the periaction period, and 14 in mPFC and 14 in OFC units were suppressed by the action (Fig. 3). Representative examples of action-selective units are shown in Fig. 4, as is the average response of all selective PFC units. Note that, in both brain regions, the average of activated and suppressed neurons began their phasic response firing rate before the completion of the action (Fig. 4). This implies that units were not simply encoding the cue termination following the action, but the action itself. In general, subpopulations of units in both PFC regions represented action-related information with no overt differences in the topography of the neuronal response. In both regions, there also were subsets of units that demonstrated a sustained activation or suppression in firing rate after the action or before outcome delivery. In mPFC, 14 units were selectively activated during the outcome anticipation period (beginning 0.5 s before delivery and culminated at delivery), and 14 were suppressed (Fig. 3). In OFC, 11 units were activated, and 12 were suppressed during this period. As previously described, representative examples of pre-outcome responding and the average group response are depicted in Fig. 4, I–L. In most cases, neuronal responses took place in either the peri-action period, or after the action and prior to outcome delivery, with a relatively small percentage modulated by both events (Fig. 3). Outcome magnitude shift. The outcome magnitude shift session was arranged in an A-B-A design, that began with a single action producing a single sugar pellet (1 action:1 outcome), shifted to each action producing four pellets (1:4), then was restored to one action producing one pellet (1:1). We recorded 70 mPFC units and 45 OFC units in this session. In mPFC, 32 of these units were modulated (activated or suppressed) during action execution during the first 1:1 block of trials, with 16 of these being activated during the action and 16 suppressed during the action (Table 1). In activated units, there was no difference in the magnitude of population activity between the early and late 1:1 conditions in mPFC activated units (t15 ⫽ 0.91, P ⫽ 0.39); hence, these conditions were averaged together and compared with the 1:4 condition. There J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Fig. 2. Rats behaviorally discriminate between trial types. A: there was a difference in latency to enter the food trough after the action (and prior to reward delivery) based on outcome magnitude, with rats entering the trough more quickly on 4 pellet trials. B: there was no difference in latency to perform the action at the beginning of each trial based on outcome magnitude. C: there was a difference in latency to the action during the action requirement shift, such that subjects responded more quickly on rewarded trials than unrewarded trials following the shift. Trial types highlighted in gray on the X axis are post-shift, and rats alternated between these trials. All data are displayed as means ⫾ SE. Asterisk represents statistical significance as assessed by t-tests. 3378 PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS was no modulation of peri-action mPFC activity caused by the outcome magnitude shift (Fig. 5A, t15 ⫽ 1.66, P ⫽ 0.12). The lack of a consistent effect of outcome magnitude shift on neuronal activity is evident when each individual unit’s response is examined (Fig. 5B). Although more units showed reduced activity than increased activity (Table 1), neither group was substantial enough to produce significant effects. Units suppressed by the action in mPFC were also unresponsive to increases in outcome magnitude (t15 ⫽ 0.52, P ⫽ 0.61; Fig. 5). Similar to Fig. 4F, a subset of units (n ⫽ 20) demonstrated a gradual increase in neuronal activity before outcome delivery. There was no difference in pre-outcome activity between the early and late 1:1 blocks (t19 ⫽ 1.86, P ⫽ 0.078). When these responses were combined and compared to the preoutcome period during the 1:4 block, there was again no difference (t19 ⫽ 0.12, P ⫽ 0.902; Fig. 5, E and F). This indicated that mPFC units are activated prior to a rewarding outcome, but that the magnitude of this neural signal does not differ based on outcome magnitude. In OFC, 11 of OFC units were activated by actions, with no difference in the magnitude of neuronal activation observed between the 1:1 blocks of trials (t10 ⫽ 1.004, P ⫽ 0.34). In contrast to mPFC, OFC population activity increased less when the magnitude of the outcome was 4 pellets vs. 1 pellet (Fig. 5C, t10 ⫽ 2.61, P ⫽ 0.026). This response was carried by the majority of the units in the subpopulation (Fig. 5D). We probed for correlations between individual units and trial number after the shift, and only one unit demonstrated a significant negative correlation, indicating that the majority of units did not reduce activity in a linear fashion following the shift (mean r of activated units ⫽ ⫺0.10). Rather, units showed a reduction in action-evoked activity almost immediately following the shift to four pellets, and there was no difference between early and late trials following both the shift to four pellets (t10 ⫽ 0.29, P ⫽ 0.78) and the shift back to one pellet (t10 ⫽ 0.28, P ⫽ 0.79). A small subset (n ⫽ 8) of OFC units were suppressed during action performance, and these units did not discriminate between 1 and 4 pellet trials (t7 ⫽ 1.15, P ⫽ 0.29). Therefore, J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Fig. 3. mPFC and OFC units are modulated by multiple events during action-outcome task. A and B: individual mPFC unit and OFC unit activity aligned to action execution. C and D: percentage of units in mPFC (C) and OFC (D) activated or suppressed by action execution and outcome anticipation. PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS 3379 action-selective units in OFC were modulated by changes in outcome magnitude, although this effect was confined to units activated during the action. Thirteen OFC units were activated during outcome anticipation. There was no difference in the magnitude of this activity based on upcoming outcome magnitude (1:1 vs. 1:1 block comparison: t12 ⫽ 0.889, P ⫽ 0.391; 1:1 vs. 1:4 comparison: t12 ⫽ 1.286, P ⫽ 0.229; Fig. 5, G and H), indicating that OFC neurons encode information about outcome magnitude during action performance but not during outcome anticipation. Alternating reward trials shift. The alternating reward trials shift, or shift in action requirement (number of actions per reward), began with a single action required to earn a single sugar pellet (1:1). The number of actions required to produce a sugar pellet was then increased (2:1). We recorded 81 mPFC units and 51 OFC units during this session. In mPFC, 15 units were classified as being activated during the action. We compared activity evoked by three types of actions within the session—preshift actions (all of which were rewarded), postshift unrewarded actions, and postshift rewarded actions—and found a difference in population response evoked by these actions (F2,30 ⫽ 10.98, P ⬍ 0.001, Fig. 6, A and B). Individual contrasts revealed that the preshift actions evoked greater neural activity than both unrewarded (P ⫽ 0.002) and rewarded actions (P ⫽ 0.001) following the shift. Unrewarded and rewarded actions evoked similar levels of activation in mPFC (t15 ⫽ 1.19, P ⫽ 0.25; Fig. 6). Although the percentage of mPFC units that exhibited decreased activity after the action shift was comparable to during the outcome shift (Table 1), the magnitude of this decrease was substantially greater here (105.59% decrease in activity during the action shift; 37.30% decrease during the outcome shift), indicating that mPFC units are much more sensitive to changes in actions required compared with outcome value. There were no correlations between action-evoked activation and trial number for any of these units after the action requirement shift (mean r of activated units ⫽ 0.002), indicating that activity was not reduced in a linear fashion after the shift. We compared blocks of early and late trials following the shift and found no differences between these blocks (t13 ⫽ 0.26, P ⫽ 0.80), indicating that activity was reduced immediately following the shift, and remained at this level throughout the session. Units in mPFC that were suppressed during the action (n ⫽ 17) also encoded information about the shift in action require- J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Fig. 4. Event-selective units in mPFC and OFC during action-outcome task. A–D: representative sample raster plots and peri-event time histograms from action-selective activated units are displayed as mPFC (A) and OFC (C). Firing rate in histogram is displayed as spikes/s (Hz), averaged across all trials, and aligned to the execution of the action. Z-score normalized firing rates of all action selective units in mPFC (B) and OFC (D) are depicted as means ⫾ SE (shaded area) aligned to execution of the action. E–H: representative samples from action-selective suppressed units are displayed as mPFC (E) and OFC (G). Z-score normalized firing rates of all action selective units are displayed as mPFC (F) and OFC (H). I–L: example mPFC (I) and OFC (K) units with activation during anticipation of outcome delivery. Mean firing rates of outcome-anticipation selective units aligned to outcome delivery are displayed as mPFC (J) and OFC (L). 3380 PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS ment. There was a near-significant difference in action-evoked suppression between the three trial types (F2,32 ⫽ 3.13, P ⫽ 0.057). Specifically, there was a significant difference between trials before and after the shift, such that units increased activity (i.e., decreased suppression) after the number of actions required for reward was shifted from 1 to 2 (t16 ⫽ 2.23, P ⫽ 0.041). Thus units that were either activated or suppressed by actions were sensitive to changes in action requirement. In the subset of units that activated during outcome anticipation (n ⫽ 24), mPFC neuronal activity was increased relative to baseline during the outcome anticipation period, reaching peak activity around delivery (Fig. 6, E and F). There was a difference in this anticipatory activity between the three action types (F2,46 ⫽ 5.13, P ⫽ 0.01), such that pre-shift activity was greater than post-shift activity during both unrewarded (P ⫽ 0.006) and rewarded trials (P ⫽ 0.007). There was again no difference between unrewarded and rewarded trials (P ⫽ 0.475). Taken together, these data suggest that mPFC neurons represented information about action requirement, but not outcome availability or the result of the past trial, by reducing firing rates during action execution and outcome anticipation when multiple actions were required for reward. In OFC, 29.41% of units were activated by action execution. There was no difference in population response between any action types (F2,14 ⫽ 0.833, P ⫽ 0.45; Fig. 6, C and D). A separate subset of action-selective OFC units was suppressed by the action (15.69%). In this case, there was a significant difference in activity among trial types (F1,14 ⫽ 11.54, P ⫽ 0.001; Fig. 6). Individual repeated-measures post hoc tests revealed a difference between unrewarded and rewarded trials J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Fig. 5. OFC but not mPFC units encode changes in outcome magnitude during action execution. Subjects performed an action to receive an outcome of 1, 4, or 1 pellets in successive blocks. A: mPFC units classified as activated by actions did not alter firing rate with outcome magnitude. Data are aligned to action execution and depicted as the Z score normalized mean ⫾ SE of selective units. As there were no significant differences in firing rates between the two 1 action:1 pellet trial blocks (data not shown), these data are depicted averaged together. B: mean firing rates for each individual activated unit in both conditions, with each pair of points representing a single unit’s normalized firing rate measured within a window spanning 250 ms surrounding the action. The overall group mean is depicted as a red line. C: OFC units classified as activated by action execution decreased firing rate when outcome magnitude was increased. D: mean firing rates for each individual action-activated unit. E: mPFC units suppressed by actions did not alter firing rate with changes in outcome magnitude. F: mean firing rates for each mPFC unit suppressed by the action. G: OFC units suppressed during the action did not alter firing rate with changes in outcome magnitude. H: mean firing rates for each unit suppressed by the action. I: mPFC units classified as activated by outcome anticipation did not alter firing rate with changes in outcome magnitude. J: mean firing rates for each individual unit activated by outcome anticipation, averaged across a 500-ms window before outcome delivery. K: OFC units classified as activated by outcome anticipation did not alter firing rate with changes in outcome magnitude. L: mean firing rates for each individual unit activated by outcome anticipation. Asterisks represent statistical significance via paired sample t-tests at ␣ ⫽ 0.05. PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS Table 1. Overall summary of action and preoutcome selective neuronal activity during shifts Outcome Shift mPFC OFC mPFC OFC 23% 9% 64% 27% 23% 50% 31% 19% 24% 13% 81%* 6% 18% 38% 50% 13% 20% 56%* 25% 19% 28% 76%* 18% 6% 29% 33% 60% 7% 16% 88%* 12% 0% 23% 50% 37% 13% 29% 23% 54% 23% 29% 25% 71%* 4% 21% 10% 90%* 0% Percentage of units that were activated or suppressed by actions or outcome anticipation, and how these units were modulated following shifts in outcome magnitude or action requirement. A unit was considered increased or decreased if there was a minimum of 10% change in normalized firing rate. mPFC, medial prefontal cortex; OFC, orbitofrontal cortext. *Significant group effect in this direction. following the shift (P ⫽ 0.02), such that units were suppressed only during the actions of rewarded trials. Additionally, there was a difference between preshift rewarded trials and postshift unrewarded trials (P ⫽ 0.005), and no differences between rewarded trials pre- or postshift (P ⫽ 0.134). This suggests that units suppressed by actions in OFC track impending outcome availability. OFC units activated during the outcome anticipation period (n ⫽ 10) showed a difference in activity among the three trial types (F2,18 ⫽ 5.54, P ⫽ 0.011; Fig. 6, G and H), such that the first block (1:1) was different from both trial types in the second block (2:1; P values ⬍ 0.02). As with mPFC, there was no difference between the unrewarded and rewarded blocks following the action requirement shift (P ⫽ 0.96). Thus, in contrast with action-evoked activity, both mPFC and OFC encode action requirement during the preoutcome anticipation period. DISCUSSION We observed that neuronal activity in mPFC and OFC encoded distinct information about the causal relationship between actions and the resulting outcomes. Furthermore, these neuronal changes were mirrored by differences in behavior. During action execution, mPFC units encoded changes in action requirement (i.e., number of actions required to receive an outcome) while performing the action, but did not represent changes in outcome magnitude. In contrast, OFC units activated by action execution encoded changes in outcome magnitude during action performance, but were insensitive to changes in action requirement. OFC units suppressed by the action also encoded information about impending outcome availability. PFC Regions Encode Distinct Information about ActionOutcome Relationships During Action Performance Our observation that mPFC encodes the number of actions/ trials required for reward, whereas OFC encodes outcome magnitude, highlights differential encoding of information about action-outcome relationships in these two regions. This is consistent with previous studies demonstrating functional dissociation within PFC. Lesions of mPFC disrupt acquisition and adaptation of goal-directed behavior and changes in actionoutcome contingency (Balleine and Dickinson 1998; Birrell and Brown 2000; Corbit and Balleine 2003; Killcross and Coutureau 2003; Naneix et al. 2009; Ostlund and Balleine 2005). In addition, our data are in agreement with previous data showing that mPFC neuronal populations do not encode changes in outcome magnitude or availability through actionevoked population response (Kargo et al. 2007; Peters et al. 2005). Our OFC data are consistent with lesion studies suggesting that this region is critical for modifying behavior with changes in reward value (Fellows 2011; Schoenbaum et al. 2003; Sellitto et al. 2010), as well as recording studies demonstrating encoding of information about expected rewards during reward anticipation and reward-predictive cues (PadoaSchioppa and Assad 2006; Rogers et al. 1999; Rolls and Baylis 1994; Schoenbaum et al. 1998; Schultz 2000; Wallis and Miller 2003). The current study expands on previous work by recording from OFC and mPFC in the same subjects engaged in a task where the relationship between rewards and number of actions was independently manipulated. This design allowed us to observe that differences in OFC and mPFC processing exist during reward-driven instrumental action. This study also demonstrates that other aspects of action-outcome relationships (action requirement and outcome magnitude) are encoded in parallel during action performance. These similarities and differences are likely related to overlapping or distinct subcortical and sensory inputs to these PFC regions (Chandler et al. 2013; Kolb et al. 2004; Ongur and Price 2000). It is possible that the reduction of action-evoked activity in mPFC during the action shift was reflective of reduced reward value or general motivation, as requiring two actions per reward reduces the average reward associated with each action. Indeed, it has been shown that mPFC encodes information about incentive value during consummatory behavior (Parent et al. 2015). However, if mPFC activity were solely reflective of reward value, differences in activity would have been evident during the outcome magnitude shift, which directly manipulates reward value. Furthermore, if action-evoked activity in mPFC were reflective of general response motivation, this also would have been affected during the outcome shift. Therefore, at least during action performance, mPFC activity seems to provide information specifically about action requirement but not outcome value or overall motivational state. During the increase in action requirement, it is possible that the activity of action-selective mPFC neurons is modulated by effort expenditure, as effort-based decision making involves the mPFC (Floresco et al. 2008; Hosking et al. 2014; Khani et al. 2015). Increased effort is a direct function of the number of actions required for each reward; thus effort encoding in mPFC is consistent with a role for these networks in encoding information about action-outcome relationships. Similarly, an increased action requirement increases the average amount of delay between actions and rewards. OFC also has been implicated in incorporating delays into encoding of upcoming rewards (Roesch et al. 2006; Rudebeck et al. 2006). Consistent J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Units activated during action Increase postshift Decrease postshift No change Units suppressed during action Increase postshift Decrease postshift No change Units activated during outcome anticipation Increase postshift Decrease postshift No change Action Shift 3381 3382 PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS with these findings, unrewarded trials, which were more temporally distant from rewards than rewarded trials, were characterized by less action-evoked OFC activity. We have interpreted this as OFC signaling reward availability on a trial-by-trial basis; but an alternative interpretation is that action-evoked OFC activity provides information about the amount of “waiting” required prior to reward delivery. Negative Value Coding of Outcome Magnitude in OFC We observed that OFC activity was reduced when the outcome magnitude increased but restored to preshift activa- tion levels when the outcome returned to its initial, lower magnitude. Other studies have demonstrated either positive or negative reward value coding in OFC, and some have observed both occurring simultaneously in separate neural subpopulations (Burton et al. 2014; Morrison and Salzman 2009; O’Neill and Schultz 2010; Padoa-Schioppa and Assad 2006; Roesch and Olson 2004; Roesch et al. 2006; van Duuren et al. 2007). These studies differed from the current work by focusing on different task epochs, offering multiple rewarding options, and/or using well-trained animals responding to predictable changes in reward value. The current study suggests that J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Fig. 6. mPFC encodes shift in action requirement. For the alternating trials/action requirement shift, subjects performed a single action to receive a pellet reward, then, in a second block, performed two actions on consecutive trials before receiving reward. A: mPFC units classified as activated by action execution reduced firing rate as action requirement increased. There was no difference between unrewarded and rewarded actions during the second block. Data are aligned to action execution and depicted as Z score normalized mean ⫾ SE. B: mean firing rates for each individual mPFC action-activated unit for the first block and unrewarded and rewarded responses in the second block (shaded) during a 250-ms window surrounding the action. The overall mean is depicted as a red line. C: OFC units classified as activated by action execution did not alter firing rate following the change in action requirement. D: mean firing rates for each individual OFC unit activated by action execution. E: mPFC activity in neurons suppressed during the peri-action window. Action-evoked suppression was eliminated following the increase in action requirement. There was no difference between unrewarded and rewarded trials after the shift. F: mean firing rates for each mPFC unit suppressed during the action. G: OFC activity in units suppressed during the action. Suppressed units increased their action evoked activity on unrewarded trials, and their suppression was restored on rewarded trials. H: mean firing rates for each suppressed OFC unit. I: mPFC activity during outcome anticipation (X-axis aligned to outcome delivery) was reduced in a subpopulation of units following an increase in action requirement. There was no difference between unrewarded and rewarded trials after the shift. J: mean firing rates for each mPFC activated by outcome anticipation, with means taken from the 500-ms period preceding the reward. K: OFC activity during outcome anticipation was reduced following an increase in action requirement. As with mPFC, there was no difference between unrewarded and rewarded trials after the shift. L: mean firing rates for each OFC unit selective for outcome anticipation. For B, D, F, and H, asterisks represent statistical significance at ␣ ⫽ 0.05. PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS changes in outcome magnitude are encoded predominantly via negative coding during action-evoked activity when alterations in outcome magnitude are unexpected and are not accompanied by changes in environmental cues. This contrasted with the positive action-encoding observed in mPFC during the action shift, in which a decrease in the possibility of reward was associated with a decrease in action-evoked activity. This suggests that, at least during actions, encoding of actionoutcome relationships does not adhere to a single value system in which either increasing the amount of actions required for a reward or decreasing reward magnitude are encoded in similar fashion. Rather, fluctuations in action requirement and outcome magnitude are encoded differently by distinct PFC regions. During outcome anticipation, we observed that both mPFC and OFC encoded information about decreased availability of expected reward, but not upcoming outcome magnitude. Sustained neuronal activation during anticipation of a reward or stimulus is commonly observed in PFC (Deco et al. 2005; Funahashi et al. 1989; Niki and Watanabe 1979; Totah et al. 2009). This persistent elevation in firing has been associated with maintaining a mental representation of upcoming rewards/ cues, and directing attention toward the expected outcome. Consistent with this idea, mPFC and OFC both signal reduced expectation of reward delivery during reward anticipation following an action, but provide divergent signals that represent either action requirement (mPFC), outcome magnitude (activated units in OFC), or impending reward availability (suppressed units in OFC) during action execution. It also is important to note that this information was carried by distinct subpopulations that subserved different task epochs. The observation that preoutcome OFC activity did not encode outcome magnitude was different from several previous studies that found OFC neuronal correlates of outcome magnitude during outcome anticipation (Roesch et al. 2006; van Duuren et al. 2007). Experimental design differences that may account for differences in outcome include: 1) presentation of cues that predicted the size of the upcoming reward; 2) extensive training relative to our study, in which changes in outcome magnitude had never previously been experienced; and 3) an extended outcome anticipation period that included reward delivery. current finding may be that OFC activity is responding to the stimulus of cue offset that immediately follows the action. If this were the case, one would expect the neuronal response to begin immediately upon cue termination, whereas we observed that the response typically began before cue termination. In addition, action-selective units were not modulated on trials where the subject failed to respond and the cue was terminated. Collectively, these data suggest that OFC neurons encode information about action-outcome relationships in addition to their previously ascribed role in Pavlovian conditioning. Conclusions Previous studies investigating the role of PFC during goaldirected behavior have focused on encoding of working memory during a discriminative cue, the anticipatory period prior to outcome delivery, or feedback following outcome or outcome omission (Fuster and Alexander 1971, Funahashi et al. 1993; Fuster and Alexander 1971; Luk and Wallis 2009; Matsumoto et al. 2007; Moorman and Aston-Jones, 2014; Padoa-Schioppa and Assad 2006; Rainer et al. 1998; Roesch et al. 2006; Roesch and Olson 2004; Simmons and Richmond 2008). Here we focused on action-evoked neuronal activity that peaked near action completion. We found that information about the value of actions and rewards is encoded separately and in parallel, with action requirement represented in mPFC, and outcome magnitude and availability represented in OFC. These data provide novel evidence that PFC subregions encode behaviorally relevant information during the execution of goal-directed actions, implicating PFC as an executive region that directly informs action selection. Inability to accurately internalize this information may contribute to the aberrant behavioral flexibility observed in psychiatric disorders involving PFC dysfunction. ACKNOWLEDGMENTS We thank Dr. S. Morrison for helpful comments on this manuscript. GRANTS This work was supported by National Institutes of Health Grants R01-MH084906 (B. Moghaddam), F32-DA-035050 (N. W. Simon), and T32-DA03111 (N. W. Simon, J. Wood). DISCLOSURES No conflicts of interest, financial or otherwise, are declared by the author(s). Action-Outcome Representations in OFC AUTHOR CONTRIBUTIONS Our findings demonstrate that both OFC and mPFC represent information about outcomes during action performance. The concept of OFC contributing to action-outcome learning is not unprecedented, as OFC lesions can impair action-outcome learning and alter decision-making (Noonan et al. 2012; Rhodes and Murray 2013). Furthermore, OFC activity related to instrumental actions has been observed previously, and OFC activity appears to inform decision-making directly (Furuyashiki et al. 2008; Noonan et al. 2012; Padoa-Schioppa and Assad 2006; Sul et al. 2010). OFC is often associated with Pavlovian (stimulus-outcome) learning (Balleine et al. 2011; Fellows and Farah 2005; Luk and Wallis 2013). Thus an alternative interpretation of the Author contributions: N.W.S., J.W., and B.M. conception and design of research; N.W.S. performed experiments; N.W.S. analyzed data; N.W.S., J.W., and B.M. interpreted results of experiments; N.W.S. prepared figures; N.W.S. drafted manuscript; N.W.S., J.W., and B.M. edited and revised manuscript; N.W.S., J.W., and B.M. approved final version of manuscript. REFERENCES Arnsten AF, Wang MJ, Paspalas CD. Neuromodulation of thought: flexibilities and vulnerabilities in prefrontal cortical network synapses. Neuron 76: 223–239, 2012. Baddeley A, Della Sala S. Working memory and executive control. Philos Trans R Soc Lond B Biol Sci 351: 1397–1403, 1996. Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37: 407– 419, 1998. J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 PFC Subregions Encode a Uniform Signal During Outcome Anticipation 3383 3384 PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS Holland PC, Straub JJ. Differential effects of two ways of devaluing the unconditioned stimulus after Pavlovian conditioning. J Exp Psychol Anim Behav Process 5: 65–78, 1979. Homayoun H, Moghaddam B. NMDA receptor hypofunction produces opposite effects on prefrontal cortex interneurons and pyramidal neurons. J Neurosci 27: 11496 –11500, 2007. Homayoun H, Moghaddam B. Differential representation of Pavlovianinstrumental transfer by prefrontal cortex subregions and striatum. Eur J Neurosci 29: 1461–1476, 2009. Hosking JG, Cocker PJ, Winstanley CA. Dissociable contributions of anterior cingulate cortex and basolateral amygdala on a rodent cost/benefit decision-making task of cognitive effort. Neuropsychopharmacology 39: 1558 –1567, 2014. Hyman JM, Whitman J, Emberly E, Woodward TS, Seamans JK. Action and outcome activity state patterns in the anterior cingulate cortex. Cereb Cortex 23: 1257–1268, 2013. Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex lesions in Rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci 24: 7540 –7548, 2004. Kargo WJ, Szatmary B, Nitz DA. Adaptation of prefrontal cortical firing patterns and their fidelity to changes in action reward contingencies. J Neurosci 27: 3548 –3559, 2007. Kass RE, Ventura V, Cai C. Statistical smoothing of neuronal data. Network 14: 5–15, 2003. Khani A, Kermani M, Hesam S, Haghparast A, Argandona EG, Rainer G. Activation of cannabinoid system in anterior cingulate cortex and orbitofrontal cortex modulates cost-benefit decision making. Psychopharmacology (Berl) 232: 2097–2112, 2015. Killcross S, Coutureau E. Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb Cortex 13: 400 – 408, 2003. Kolb B, Pellis S, Robinson TE. Plasticity and functions of the orbital frontal cortex. Brain Cogn 55: 104 –115, 2004. Luk CH, Wallis JD. Dynamic encoding of responses and outcomes by neurons in medial prefrontal cortex. J Neurosci 29: 7526 –7539, 2009. Luk CH, Wallis JD. Choice coding in frontal cortex during stimulus-guided or action-guided decision-making. J Neurosci 33: 1864 –1871, 2013. Matsumoto K, Tanaka K. The role of the medial prefrontal cortex in achieving goals. Curr Opin Neurobiol 14: 178 –185, 2004. Matsumoto M, Matsumoto K, Abe H, Tanaka K. Medial prefrontal cell activity signaling prediction errors of action values. Nat Neurosci 10: 647– 656, 2007. Mitra P, Bokil H. Observed Brain Dynamics. New York: Oxford Univ. Press, 2008. Moghaddam B, Homayoun H. Divergent plasticity of prefrontal cortex networks. Neuropsychopharmacology 33: 42–55, 2007. Moorman DE, Aston-Jones G. Orbitofrontal cortical neurons encode expectation-driven initiation of reward-seeking. J Neurosci 34: 10234 –10246, 2014. Morrison SE, Salzman CD. The convergence of information about rewarding and aversive stimuli in single neurons. J Neurosci 29: 11471–11483, 2009. Mulder AB, Nordquist RE, Orgut O, Pennartz CMA. Learning-related changes in response patterns of prefrontal neurons during instrumental conditioning. Behav Brain Res 146: 77– 88, 2003. Naneix F, Marchand AR, Di Scala G, Pape JR, Coutureau E. A role for medial prefrontal dopaminergic innervation in instrumental conditioning. J Neurosci 29: 6599 – 6606, 2009. Narayanan NS, Laubach M. Top-down control of motor cortex ensembles by dorsomedial prefrontal cortex. Neuron 52: 921–931, 2006. Niki H, Watanabe M. Prefrontal and cingulate unit activity during timing behavior in the monkey. Brain Res 171: 213–224, 1979. Noonan MP, Kolling N, Walton ME, Rushworth MF. Re-evaluating the role of the orbitofrontal cortex in reward and reinforcement. Eur J Neurosci 35: 997–1010, 2012. O’Neill M, Schultz W. Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 68: 789 – 800, 2010. Oliveira FTP, McDonald JJ, Goodman D. Performance monitoring in the anterior cingulate is not all error related: expectancy deviation and the representation of action-outcome associations. J Cogn Neurosci 19: 1994 – 2004, 2007. Ongur D, Price JL. The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cereb Cortex 10: 206 –219, 2000. J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Balleine BW, Leung BK, Ostlund SB. The orbitofrontal cortex, predicted value, and choice. Ann NY Acad Sci 1239: 43–50, 2011. Bechara A, Van Der Linden M. Decision-making and impulse control after frontal lobe injuries. Curr Opin Neurobiol 18: 734 –739, 2005. Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS. Learning the value of information in an uncertain world. Nat Neurosci 10: 1214 –1221, 2007. Birrell JM, Brown VJ. Medial frontal cortex mediates perceptual attentional set shifting in the rat. J Neurosci 20: 4320 – 4324, 2000. Bohn I, Giertler C, Hauber W. Orbital prefrontal cortex and guidance of instrumental behavior of rats by visuospatial stimuli predicting reward magnitude. Learn Mem 10: 177–186, 2003. Brown VJ, Bowman EM. Rodent models of prefrontal cortical function. Trends Neurosci 25: 340 –343, 2002. Buckley MJ, Mansouri FA, Hoda H, Mahboubi M, Browning PGF, Kwok SC, Phillips A, Tanaka K. Dissociable components of rule-guided behavior depend on distinct medial and prefrontal regions. Science 325: 52–58, 2009. Burton AC, Kashtelyan V, Bryden DW, Roesch MR. Increased firing to cues that predict low-value reward in the medial orbitofrontal cortex. Cereb Cortex 24: 3310 –3321, 2014. Cardinal RN. Neural systems implicated in delayed and probabilistic reinforcement. Neural Netwk 19: 1277–1301, 2006. Chandler DJ, Lamperski CS, Waterhouse BD. Identification and distribution of projections from monoaminergic and cholinergic nuclei to functionally differentiated subregions of prefrontal cortex. Brain Res 1522: 38 –58, 2013. Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal and infralimbic cortex to Pavlovian autoshaping and discrimination reversal learning: further evidence for the functional heterogeneity of the rodent frontal cortex. J Neurosci 23: 8771– 8780, 2003. Connors BW, Gutnick MJ. Intrinsic firing patterns of diverse neocortical neurons. Trends Neurosci 13: 99 –104, 1990. Corbit LH, Balleine BW. The role of prelimbic cortex in instrumental conditioning. Behav Brain Res 146: 145–157, 2003. Deco G, Ledberg A, Almeida R, Fuster J. Neural dynamics of cross-modal and cross-temporal associations. Exp Brain Res 166: 325–336, 2005. Fellows LK. Orbitofrontal contributions to value-based decision making: evidence from humans with frontal lobe damage. Ann NY Acad Sci 1239: 51–58, 2011. Fellows LK, Farah MJ. Is anterior cingulate cortex necessary for cognitive control? Brain 128: 788 –796, 2005. Floresco SB, St. Onge JR, Ghods-Sharifi S, Winstanley CA. Cortico-limbicstriatal circuits subserving different forms of cost-benefit decision-making. Cogn Affect Behav Neurosci 8: 375–389, 2008. Funahashi S, Bruce CJ, Goldman-Rakic PS. Mnemonic coding of visual space in the monkey’s dorsolateral prefrontal cortex. J Neurophysiol 61: 331–349, 1989. Funahashi S, Chafee MV, Goldman-Rakic PS. Prefrontal neuronal activity in rhesus monkeys performing a delayed anti-saccade task. Nature 365: 753–756, 1993. Furuyashiki T, Holland PC, Gallagher M. Rat orbitofrontal cortex separately encodes response and outcome information during performance of goal-directed behavior. J Neurosci 28: 5127–5138, 2008. Fuster JM. The prefrontal cortex—an update: time is of the essence. Neuron 30: 319 –333, 2001. Fuster JM, Alexander GE. Neuron activity related to short-term memory. Science 173: 652– 654, 1971. Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. J Neurosci 19: 6610 – 6614, 1999. Glascher J, Adolphs R, Damasio H, Bechara A, Rudrauf D, Calamia M, Paul LK, Tranel D. Lesion mapping of cognitive control and value-based decision making in the prefrontal cortex. Proc Natl Acad Sci USA 109: 14681–14686, 2012. Goldman-Rakic PS. Cellular basis of working memory. Neuron 14: 477– 485, 1995. Gourley SL, Lee AS, Howell JL, Pittenger C, Taylor JR. Dissociable regulation of instrumental action within mouse prefrontal cortex. Eur J Neurosci 32: 1726 –1734, 2010. Histed MH, Pasupathy A, Miller EK. Learning substrates in the primate prefrontal cortex and striatum: Sustained activity related to successful actions. Neuron 63: 244 –253, 2009. PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS Schoenbaum G, Setlow B, Nugent SL, Saddoris MP, Gallagher M. Lesions of orbitofrontal cortex and basolateral amygdala complex disrupt acquisition of odor-guided discriminations and reversals. Learn Mem 10: 129 –140, 2003. Schultz W. Multiple reward signals in the brain. Nat Rev Neurosci 1: 199 –207, 2000. Schwartz MF. The cognitive neuropsychology of everyday action and planning. Cogn Neuropsychol 23: 202–221, 2006. Sellitto M, Ciaramelli E, di Pellegrino G. Myopic discounting of future rewards after medial orbitofrontal damage in humans. J Neurosci 30: 16429 –16436, 2010. Simmons JM, Richmond BJ. Dynamic changes in representations of preceding and upcoming reward in monkey orbitofrontal cortex. Cereb Cortex 18: 93–103, 2008. Sturman DA, Moghaddam B. Reduced neuronal inhibition and coordination of adolescent prefrontal cortex during motivated behavior. J Neurosci 31: 1471–1478, 2011. Sul JH, Kim H, Huh N, Lee D, Jung MW. Distinct roles of rodent orbitofrontal and medial prefrontal cortex in decision making. Neuron 66: 449 – 460, 2010. Tierney PL, Dégenètais E, Thierry AM, Glowinski J, Gioanni Y. Influence of the hippocampus on interneurons of the rat prefrontal cortex. Eur J Neurosci 20: 514 –524, 2004. Totah NK, Kim YB, Homayoun H, Moghaddam B. Anterior cingulate neurons represent errors and preparatory attention within the same behavioral sequence. J Neurosci 29: 6418 – 6426, 2009. Tran-Tu-Yen DAS, Marchand AR, Pape J-R, Di Scala G, Coutureau E. Transient role of the rat prelimbic cortex in goal-directed behaviour. Eur J Neurosci 30: 464 – 471, 2009. Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature 398: 704 –708, 1999. van Duuren E, Escamez FA, Joosten RN, Visser R, Mulder AB, Pennartz CMA. Neural coding of reward magnitude in the orbitofrontal cortex of the rat during a five-odor olfactory discrimination task. Learn Mem 14: 446 – 456, 2007. Wallis JD, Miller EK. Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task. Eur J Neurosci 18: 2069 –2081, 2003. Walton ME, Rudebeck PH, Bannerman DM, Rushworth MFS. Calculating the cost of acting in frontal cortex. Ann NY Acad Sci 1104: 340 –356, 2007. Young JJ, Shapiro ML. Dynamic coding of goal-directed paths by orbital prefrontal cortex. J Neurosci 31: 5989 – 6000, 2011. J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017 Ostlund SB, Balleine BW. Lesions of medial prefrontal cortex disrupt the acquisition but not the expression of goal-directed learning. J Neurosci 25: 7763–7770, 2005. Ostlund SB, Balleine BW. The contribution of orbitofrontal cortex to action selection. Ann NY Acad Sci 1121: 174 –192, 2007. Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature 441: 223–226, 2006. Parent MA, Amarante LM, Liu B, Weikum D, Laubach M. The medial prefrontal cortex is crucial for the maintenance of persistent licking and the expression of incentive contrast. Front Integr Neurosci 9: 23, 2015. Peters YM, O’Donnell P, Carelli RM. Prefrontal cortical cell firing during maintenance, extinction, and reinstatement of goal-directed behavior for natural reward. Synapse 56: 74 – 83, 2005. Rainer G, Asaad WF, Miller EK. Selective representation of relevant information by neurons in the primate prefrontal cortex. Nature 393: 577–579, 1998. Rhodes SE, Murray EA. Differential effects of amygdala, orbital prefrontal cortex, and prelimbic cortex lesions on goal-directed behavior in rhesus macaques. J Neurosci 33: 3380 –3389, 2013. Roesch MR, Olson CR. Neuronal activity related to reward value and motivation in primate frontal cortex. Science 304: 307–310, 2004. Roesch MR, Taylor AR, Schoenbaum G. Encoding of time-discounted rewards in orbitofrontal cortex is independent of value representation. Neuron 51: 509 –520, 2006. Rogers RD, Owen AM, Middleton HC, Williams EJ, Pickard JD, Sahakian BJ, Robbins TW. Choosing between small, likely rewards and large, unlikely rewards activates inferior and orbital prefrontal cortex. J Neurosci 19: 9029 –9038, 1999. Rolls E, Baylis L. Gustatory, olfactory, and visual convergence within the primate orbitofrontal cortex. J Neurosci 14: 5437–5452, 1994. Rolls ET. The functions of the orbitofrontal cortex. Brain Cogn 55: 11–29, 2004. Rudebeck PH, Behrens TE, Kennerley SW, Baxter MG, Buckley MJ, Walton ME, Rushworth MFS. Frontal cortex subregions play distinct roles in choices between actions and stimuli. J Neurosci 28: 13775–13785, 2008. Rudebeck PH, Walton ME, Smyth AN, Bannerman DM, Rushworth MFS. Separate neural pathways process different decision costs. Nat Neurosci 9: 1161–1168, 2006. Schoenbaum G, Chiba AA, Gallagher M. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci 1: 155–159, 1998. Schoenbaum G, Setlow B. Lesions of nucleus accumbens disrupt learning about aversive outcomes. J Neurosci 23: 9833–9841, 2003. 3385
© Copyright 2026 Paperzz