Action-outcome relationships are represented differently by medial

J Neurophysiol 114: 3374 –3385, 2015.
First published October 14, 2015; doi:10.1152/jn.00884.2015.
Action-outcome relationships are represented differently by medial prefrontal
and orbitofrontal cortex neurons during action execution
Nicholas W. Simon, Jesse Wood, and Bita Moghaddam
University of Pittsburgh, Department of Neuroscience, Pittsburgh, Pennsylvania
Submitted 14 September 2015; accepted in final form 13 October 2015
prefrontal cortex; orbitofrontal cortex; instrumental behavior; electrophysiology; reward
(PFC) has long been implicated in the
planning and execution of goal-directed actions (Arnsten et al.
2012; Baddeley and Della Sala 1996; Fuster 2001). This region
is involved in emotion, perception, attention, and working
memory, and thus is well suited to integrate these cognitive
functions during action selection. Accordingly, PFC damage
has been associated with aberrant decision-making and rewardrelated behavior, possibly resulting from failure to utilize
representations of action-outcome relationships to guide action
selection (Bechara and Van Der Linden 2005; Cardinal 2006;
Killcross and Coutureau 2003; Ostlund and Balleine 2007;
Schwartz 2006; Tran-Tu-Yen et al. 2009; Walton et al. 2007).
Neuronal correlates of reward-directed actions have been
observed across multiple regions of PFC (Furuyashiki et al.
2008; Homayoun and Moghaddam 2009; Hyman et al. 2013;
Luk and Wallis 2009; Mulder et al. 2003; Narayanan and
Laubach 2006; Peters et al. 2005; Sturman and Moghaddam
2011; Young and Shapiro 2011). Selection of the optimal
behavioral strategy requires information about the relationship
between actions and outcomes. Because prior research suggests that PFC acts as an “executor” region, PFC neural
PREFRONTAL CORTEX
Address for reprint requests and other correspondence: N. W. Simon, Univ.
of Pittsburgh, Dept. of Neuroscience, A210 Langley Hall, Pittsburgh, PA
15260 (e-mail: [email protected]).
3374
activity would be expected to encode information about these
relationships during action performance.
PFC is divided into multiple subregions that play complementary roles in reward-related behavior (Brown and Bowman
2002; Buckley et al. 2009; Chudasama and Robbins 2003;
Glascher et al. 2012; Gourley et al. 2010; Luk and Wallis 2013;
Moghaddam and Homayoun 2007; Rudebeck et al. 2006). The
orbitofrontal cortex (OFC) has been implicated in outcome
representation (Fellows 2011; Gallagher et al. 1999; Izquierdo
et al. 2004; Morrison and Salzman 2009; Padoa-Schioppa and
Assad 2006; Roesch and Olson 2004; Rolls 2004; Rolls and
Baylis 1994; Schoenbaum et al. 1998; Simmons and Richmond
2008; Tremblay and Schultz 1999; van Duuren et al. 2007),
whereas medial PFC (mPFC) has been implicated in assigning
value to actions (Behrens et al. 2007; Corbit and Balleine 2003;
Histed et al. 2009; Luk and Wallis 2013; Matsumoto et al.
2007; Matsumoto and Tanaka 2004; Mulder et al. 2003;
Oliveira et al. 2007; Rudebeck et al. 2008). In the current
study, we recorded single units in rat OFC and mPFC as we
independently manipulated the relationship between outcomes
and actions (specifically, outcome value, and the number of
action trials required to earn an outcome, or “action requirement”). This design allowed us to parse different aspects of the
action-outcome relationship and provided insight about how
PFC neurons encode this information. Finally, recording simultaneously from OFC and mPFC allowed us to assess
whether this information is segregated among PFC regions.
MATERIALS AND METHODS
Subjects
Male Sprague-Dawley rats (n ⫽ 21, 250 g upon arrival, Harlan,
Frederick, MD) were used for this experiment. All subjects initially
were pair-housed with a similarly aged animal on a 12:12-h reverse
dark/light cycle (lights on at 7 pm), then individually housed following surgery. Rats were food deprived to ⬃90% of free feeding weight
throughout the experiment. All research methods included here were
approved by the Institutional Animal Care and Use Committee at the
University of Pittsburgh.
Apparatus
Behavioral testing was performed in chambers with metal front and
back walls, transparent Plexiglas side walls, and a steel wire floor
(Coulbourn Instruments, Allentown, PA). The chamber was equipped
with an automated feeder and corresponding food trough on one wall,
and a nosepoke port mounted on the opposite wall. The chamber was
located inside a sound attenuating cubicle, and behavior was monitored throughout each session using a camera mounted on the back
wall. All behavioral experiments were designed and performed with
Graphic State software (Coulbourn Instruments, Allentown, PA).
0022-3077/15 Copyright © 2015 the American Physiological Society
www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Simon NW, Wood J, Moghaddam B. Action-outcome relationships are
represented differently by medial prefrontal and orbitofrontal cortex neurons
during action execution. J Neurophysiol 114: 3374–3385, 2015. First published October 14, 2015; doi:10.1152/jn.00884.2015.—Internal representations of action-outcome relationships are necessary for flexible
adaptation of motivated behavior in dynamic environments. Prefrontal
cortex (PFC) is implicated in flexible planning and execution of
goal-directed actions, but little is known about how information about
action-outcome relationships is represented across functionally distinct regions of PFC. Here, we observe distinct patterns of actionevoked single unit activity in the medial prefrontal cortex (mPFC) and
orbitofrontal cortex (OFC) during a task in which the relationship
between outcomes and actions was independently manipulated. The
mPFC encoded changes in the number of actions required to earn a
reward, but not fluctuations in outcome magnitude. In contrast, OFC
neurons decreased firing rates as outcome magnitude was increased,
but were insensitive to changes in action requirement. A subset of
OFC neurons also tracked outcome availability. Pre-outcome anticipatory activity in both mPFC and OFC was altered when reward
expectation was reduced, but did not differ with outcome magnitude.
These data provide novel evidence that PFC regions encode distinct
information about the relationship between actions and impending
outcomes during action execution.
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
Surgery
Before behavioral training, rats were implanted bilaterally under
isoflurane anesthesia with eight-channel microelectrode arrays for
single unit recording in mPFC and OFC. Brain regions were counterbalanced by hemisphere across rats. Coordinates were ⫹3 AP, ⫹0.5
ML, and ⫺4 DV for mPFC, and ⫹3 AP, ⫹3.3 ML, and ⫺5.5 DV for
OFC. Habituation, as described above, started at least 1 wk after
surgery.
Behavioral Training
sucrose pellet reward. During this block of trials, the task proceeded
identically to the training sessions. During a second block of 30 trials,
every action was reinforced with delivery of four sucrose pellets.
During the third and final block of 30 trials, actions were again
reinforced with a single sucrose pellet. Therefore, outcome magnitude
was manipulated (1:4:1 sucrose pellets), while the number of actions
required remained constant. After this session, each animal was run on
a session in which 1 action was reinforced with 1 reward, in order to
ensure that the original action-outcome relationship was intact before
proceeding with the second shift.
Shift 2: alternating reward trials (action requirement shift). During
the first block of 30 trials, a single action was reinforced with a single
sucrose pellet. For the remainder of the session, the first action during
the response period was unrewarded and caused immediate progression to the ITI. In the following trial, an action was reinforced with a
single pellet. As such, trials alternated between unrewarded and
rewarded, and two actions in consecutive trials were required for
reinforcement. Because two actions were required for each reward
delivery, we refer to this phase of the task as a shift in action
requirement. It could also be interpreted by the rat as a change in
reward probability, as one in every 2 responses is reinforced. If the
subject performed a correct action but then missed the response period
on the following trial, the two trial sequence reset. The subjects each
performed 60 trials in the alternating reward trial block. Details of the
task are depicted in Fig. 1.
Electrophysiology
Neural signals were buffered by a headstage amplifier before being
amplified and analog band pass filtered by a preamplifier. The headstage amplifier and cables were connected to a commutator allowing
the animal to move freely in the behavioral chamber. Spikes were
analog filtered between 300 Hz and 8 kHz before being digitized at 40
kHz. Voltage threshold crossing waveforms were saved to disk by PC
controlled recording software (Plexon, Dallas, TX). The behavioral
system controller sent TTL pulses to the neural data acquisition
system that were used to synchronize behavioral events with neural
data. Units were sorted offline using standard techniques and the
Offline Sorter software package (Plexon).
Fig. 1. Behavioral procedures and histology. A: schematic detailing the task progression across experimental sessions. Following shaping, animals were given
a baseline session, followed by the outcome magnitude shift session. After the magnitude shift session, another baseline session was performed before the action
requirement shift was instituted. B: during the outcome magnitude shift session, the first 30 trials were 1 action:1 pellet, the second 30 were 1 action:4 pellets,
and the final 30 were restored to 1 action:1 pellet. C: during the action shift session, the first 30 trials were 1 action:1 pellet, then the following 60 trials were
2 actions:1 pellet. In the action requirement shift session, each successful action during the cue was counted as a trial; thus 30 of these trials were unrewarded,
and 30 were rewarded. In order to receive reinforcement, rats were required to complete 2 consecutive trials; a missed trial caused the 2 actions:1 pellet sequence
to reset. D: electrode placements in medial prefrontal (mPFC) and orbitofrontal cortex (OFC). Each box represents the placement of one electrode array. Note
that all OFC placements were located in the lateral subregion of the OFC.
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Rats were handled and habituated to the testing room, sucrose
pellets, and connecting or disconnecting the headstage cable. After
habituation, they were given a 30-min magazine training session,
during which sucrose pellets (45 mg, Bio-Serv, Frenchtown, NJ) were
delivered into a food trough at random intervals (75 ⫾ 45 s). On day
2, rats were trained to perform a single nose-poke when a port was
illuminated, which was followed by immediate pellet delivery, then a
5-s intertrial interval (ITI). The nose-poke port light was extinguished
immediately after the nose-poke was performed. This training session
was repeated daily until rats completed at least 50 reinforced nosepokes. Once rats met this criterion, in all trials the nose-poke port was
again illuminated, but only for a 10-s response period. A nose-poke
during this period resulted in termination of the port light, and food
was delivered after a 2-s delay. This delay was instituted to provide
clear temporal dissociation of the action from outcome delivery, and
was fixed at 2 s regardless of whether the rat entered the trough or not
after the action. After a variable ITI of 10 –20 s, another trial was
initiated. If the rat failed to respond during the 10-s response period,
then the port light was extinguished and the trial proceeded directly to
the ITI. Each trial was 30 s long, and each 45-min session consisted
of 90 trials. Rats were trained until they were able to correctly perform
the action in 80% of the trials.
To assess how PFC units responded to changes in action-outcome
relationships, rats were subjected to two shifts in task components on
separate days: an outcome magnitude shift and an action requirement
shift.
Shift 1: outcome magnitude shift. During the first block of 30 trials,
a single action during the response period was reinforced with a single
3375
3376
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
Histology
Statistical Analysis
After completion of the experiment, rats were anesthetized with
chloral hydrate and perfused with saline and 10% formalin solution.
Brains were removed and placed in 10% formalin, then cryoprotected
in a 30% formalin sucrose solution. Brains were sectioned in 60-␮m
coronal slices, stained with cresyl violet, and mounted to microscope
slides. Finally, electrode-tip placements were verified under a light
microscope (Fig. 1).
For comparisons in behavior between trial types, we used one-way
repeated-measures ANOVA as an omnibus test, and paired t-tests to
compare specific trial types. Paired t-tests were used to compare
average neuronal activity between different trial types (1 pellet vs. 4
pellets) during the epochs of interest for the magnitude shift. One-way
ANOVA was used to compare all three trial types (preshift, postshiftunrewarded, postshift-rewarded) for the alternating reward trial shift,
with individual repeated-measures comparisons used to further analyze differences between pairs of trial types.
Behavior Analysis
Electrophysiological Analysis
Analyses focused on data obtained from 3 sessions: 1) the final day
of baseline training, 2) the outcome magnitude shift, and 3) the action
requirement shift. All analyses were conducted using custom-made
analysis routines executed in Matlab version 2010b (MathWorks,
Nattick, MA). All single unit discharge rates were calculated in 50-ms
bins and Z-score normalized against a 5-s segment of the ITI (9 s prior
to trial initiation). Data were smoothed using a 5-point moving
average filter prior to both analysis and presentation to minimize the
error in estimates of firing rate as recommended by Kass et al. (2003).
The current analyses focused on two periods: a peri-action window
(5 bins, 250 ms total length) centered on the execution of each action,
and an outcome anticipation window (10 bins, 500 ms) that immediately preceded reward delivery. These windows were selected based
on the most commonly observed patterns of responses in the data (see
Fig. 4). Selective units were defined as “activated” or “suppressed” for
either of these events based on two criteria: 1) average firing rate (Hz)
during the period surrounding the action/prior to the outcome was
significantly different from average firing rate during the baseline
period via a paired-samples t-test; and 2) the period contained at least
three consecutive 50-ms bins in which the Z score exceeded 1 for
activated units, or was less than ⫺0.75 for suppressed units. (A less
stringent criterion was used for suppressed units because of floor
effects.) All neuronal responses were visually verified to ensure the
accuracy of the classification criteria.
Units with a baseline firing rate greater than 10 Hz, and spike
duration less than 0.5 ms, were classified as fast-spiking units (Connors and Gutnick 1990; Homayoun and Moghaddam 2007; Tierney et
al. 2004). The percentage of units meeting these criteria was too low
to permit reliable statistical analysis (5.65% across all three days of
analysis), and therefore action and outcome anticipation-selective fast
spiking units were excluded from this study.
RESULTS
Behavior
Rats were first trained on a simple action-outcome task in
which a nose-poke into a port was reinforced with a single
sugar pellet. Then, to assess how PFC neurons encode shifts in
action-outcome relationships, rats were subjected to two shifts
in task contingencies on separate days: 1) an outcome magnitude shift, in which one instrumental action was rewarded with
four pellets, then 2) an action requirement shift in which two
actions were required to receive one pellet (Fig. 1). Before the
shifts, rats required 5.8 ⫾ 0.4 sessions to achieve task criterion
on the action-outcome task (responding correctly on at least
80% of trials). In the final training session, subjects completed
84.51% of trials, with an average latency to respond of 3.71 s.
During the magnitude shift session, rats generally entered
the food trough prior to food delivery. There was a difference
between trial types in latency to enter the food trough after the
nose poke (which is henceforth referred to as the “action”),
such that rats were quicker to enter the food trough for the 4
pellet outcome than the 1 pellet outcome (t14 ⫽ 3.09, P ⫽ 0.01;
Fig. 2A). This measure is indicative of greater motivation on
the 4 pellet trials (Bohn et al. 2003; Holland and Straub 1979;
Schoenbaum and Setlow 2003) and provides a critical behavior
index demonstrating that rats actively discriminated between
trial types. However, there was no difference in latency to
perform the action between different outcome magnitudes
(t21 ⫽ 0.22, P ⫽ 0.83; Fig. 2C). During the action requirement
shift session, in contrast, a difference in response latency was
observed between trial types (F2,32 ⫽ 7.031, P ⫽ 0.003; Fig.
2B). Rats demonstrated an increased latency to respond on
unrewarded compared with rewarded trials (t16 ⫽ 3.58, P ⫽
0.003). This indicated that they were able to track impending
rewards on a trial-by-trial basis rather than expecting a 50%
probability of reinforcement on all trials following the shift
(Fig. 2). Data showing latency to enter the food trough from the
action shift session are not reported here because there were a
limited number of trials in which rats entered the food trough
prior to reward delivery after this shift. Collectively, these data
provide evidence that rats behaviorally distinguished trial types
before and after shifts of both outcome magnitude and action
requirement. Furthermore, latency to enter the trough was
sensitive to changes in outcome magnitude, whereas latency to
perform the action was sensitive to reward availability.
Neural Activity During Behavior
A total of 22 rats with microelectrode arrays implanted in
mPFC and/or OFC were utilized for this experiment (Fig. 1).
We recorded from 228 units in mPFC and 164 units in OFC
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
To assess whether rats were able to detect the shifts in actionoutcome relationships, we measured different aspects of behavioral
performance. First, we assessed the latency to perform the action
(nose-poke) after cue presentation. Following the alternating reward
trial shift, when two actions were required to receive a reward, only
trials that were part of a full, rewarded sequence were included in this
measure (i.e., omitted trials and unrewarded trials not followed by a
rewarded action were not included).
Latency to enter the food trough after the action was also measured.
Because there was a 2-s gap between the action and reward delivery,
rats were often able to enter the trough prior to food delivery. For this
analysis only trials in which rats entered the trough prior to food
delivery were included, as latency to enter the food trough has often
been used as a measure of motivation/expectation (Bohn et al. 2003;
Holland and Straub 1979; Schoenbaum and Setlow 2003). Entries into
the trough after reward delivery were not included in this measure, as
these trials were likely more reflective of reaction time to pellet
delivery (rather than reward expectation). This measure was used only
for the outcome magnitude shift session, because rats were less likely
to enter the food trough prior to reward delivery following the
alternating reward trial shift.
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
3377
across the 3 sessions for which data are shown (final training
session, outcome magnitude, and action-requirement shift).
Our analyses focused on the peri-action and outcome anticipation periods of these sessions.
Characterization of action and outcome anticipation evoked
neuronal firing. During the final training session, after animals
had acquired the action-outcome relationship, we recorded 77
mPFC and 68 OFC units. As expected, we observed selectivity
for different task events in both PFC regions, with some units
activated by events, and others suppressed by events (Fig. 3).
Several of these units were responsive to the instrumental
action, with the action temporally defined as when the rat
activated a photodetector stationed inside the nose poke port.
This period includes action preparation, action execution, and
cue light offset. Among these units, 15 in mPFC and 11 in OFC
units were activated in the periaction period, and 14 in mPFC
and 14 in OFC units were suppressed by the action (Fig. 3).
Representative examples of action-selective units are shown in
Fig. 4, as is the average response of all selective PFC units.
Note that, in both brain regions, the average of activated and
suppressed neurons began their phasic response firing rate
before the completion of the action (Fig. 4). This implies that
units were not simply encoding the cue termination following
the action, but the action itself. In general, subpopulations of
units in both PFC regions represented action-related information with no overt differences in the topography of the neuronal
response.
In both regions, there also were subsets of units that demonstrated a sustained activation or suppression in firing rate
after the action or before outcome delivery. In mPFC, 14 units
were selectively activated during the outcome anticipation
period (beginning 0.5 s before delivery and culminated at
delivery), and 14 were suppressed (Fig. 3). In OFC, 11 units
were activated, and 12 were suppressed during this period. As
previously described, representative examples of pre-outcome
responding and the average group response are depicted in Fig.
4, I–L. In most cases, neuronal responses took place in either
the peri-action period, or after the action and prior to outcome
delivery, with a relatively small percentage modulated by both
events (Fig. 3).
Outcome magnitude shift. The outcome magnitude shift
session was arranged in an A-B-A design, that began with a
single action producing a single sugar pellet (1 action:1 outcome), shifted to each action producing four pellets (1:4), then
was restored to one action producing one pellet (1:1). We
recorded 70 mPFC units and 45 OFC units in this session. In
mPFC, 32 of these units were modulated (activated or suppressed) during action execution during the first 1:1 block of
trials, with 16 of these being activated during the action and 16
suppressed during the action (Table 1). In activated units, there
was no difference in the magnitude of population activity
between the early and late 1:1 conditions in mPFC activated
units (t15 ⫽ 0.91, P ⫽ 0.39); hence, these conditions were
averaged together and compared with the 1:4 condition. There
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Fig. 2. Rats behaviorally discriminate between trial types. A: there was a difference in latency to enter the food trough after the action (and prior to reward
delivery) based on outcome magnitude, with rats entering the trough more quickly on 4 pellet trials. B: there was no difference in latency to perform the action
at the beginning of each trial based on outcome magnitude. C: there was a difference in latency to the action during the action requirement shift, such that subjects
responded more quickly on rewarded trials than unrewarded trials following the shift. Trial types highlighted in gray on the X axis are post-shift, and rats
alternated between these trials. All data are displayed as means ⫾ SE. Asterisk represents statistical significance as assessed by t-tests.
3378
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
was no modulation of peri-action mPFC activity caused by the
outcome magnitude shift (Fig. 5A, t15 ⫽ 1.66, P ⫽ 0.12). The
lack of a consistent effect of outcome magnitude shift on
neuronal activity is evident when each individual unit’s response is examined (Fig. 5B). Although more units showed
reduced activity than increased activity (Table 1), neither
group was substantial enough to produce significant effects.
Units suppressed by the action in mPFC were also unresponsive to increases in outcome magnitude (t15 ⫽ 0.52, P ⫽ 0.61;
Fig. 5).
Similar to Fig. 4F, a subset of units (n ⫽ 20) demonstrated
a gradual increase in neuronal activity before outcome delivery. There was no difference in pre-outcome activity between
the early and late 1:1 blocks (t19 ⫽ 1.86, P ⫽ 0.078). When
these responses were combined and compared to the preoutcome period during the 1:4 block, there was again no
difference (t19 ⫽ 0.12, P ⫽ 0.902; Fig. 5, E and F). This
indicated that mPFC units are activated prior to a rewarding
outcome, but that the magnitude of this neural signal does not
differ based on outcome magnitude.
In OFC, 11 of OFC units were activated by actions, with no
difference in the magnitude of neuronal activation observed
between the 1:1 blocks of trials (t10 ⫽ 1.004, P ⫽ 0.34). In
contrast to mPFC, OFC population activity increased less when
the magnitude of the outcome was 4 pellets vs. 1 pellet (Fig.
5C, t10 ⫽ 2.61, P ⫽ 0.026). This response was carried by the
majority of the units in the subpopulation (Fig. 5D). We
probed for correlations between individual units and trial
number after the shift, and only one unit demonstrated a
significant negative correlation, indicating that the majority
of units did not reduce activity in a linear fashion following
the shift (mean r of activated units ⫽ ⫺0.10). Rather, units
showed a reduction in action-evoked activity almost immediately following the shift to four pellets, and there was no
difference between early and late trials following both the
shift to four pellets (t10 ⫽ 0.29, P ⫽ 0.78) and the shift back
to one pellet (t10 ⫽ 0.28, P ⫽ 0.79).
A small subset (n ⫽ 8) of OFC units were suppressed during
action performance, and these units did not discriminate between 1 and 4 pellet trials (t7 ⫽ 1.15, P ⫽ 0.29). Therefore,
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Fig. 3. mPFC and OFC units are modulated by multiple events during action-outcome task. A and B: individual mPFC unit and OFC unit activity aligned to action
execution. C and D: percentage of units in mPFC (C) and OFC (D) activated or suppressed by action execution and outcome anticipation.
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
3379
action-selective units in OFC were modulated by changes in
outcome magnitude, although this effect was confined to units
activated during the action.
Thirteen OFC units were activated during outcome anticipation. There was no difference in the magnitude of this
activity based on upcoming outcome magnitude (1:1 vs. 1:1
block comparison: t12 ⫽ 0.889, P ⫽ 0.391; 1:1 vs. 1:4 comparison: t12 ⫽ 1.286, P ⫽ 0.229; Fig. 5, G and H), indicating
that OFC neurons encode information about outcome magnitude during action performance but not during outcome
anticipation.
Alternating reward trials shift. The alternating reward trials
shift, or shift in action requirement (number of actions per
reward), began with a single action required to earn a single
sugar pellet (1:1). The number of actions required to produce a
sugar pellet was then increased (2:1). We recorded 81 mPFC
units and 51 OFC units during this session. In mPFC, 15 units
were classified as being activated during the action. We compared activity evoked by three types of actions within the
session—preshift actions (all of which were rewarded), postshift unrewarded actions, and postshift rewarded actions—and
found a difference in population response evoked by these
actions (F2,30 ⫽ 10.98, P ⬍ 0.001, Fig. 6, A and B). Individual
contrasts revealed that the preshift actions evoked greater
neural activity than both unrewarded (P ⫽ 0.002) and rewarded actions (P ⫽ 0.001) following the shift. Unrewarded
and rewarded actions evoked similar levels of activation in
mPFC (t15 ⫽ 1.19, P ⫽ 0.25; Fig. 6). Although the percentage
of mPFC units that exhibited decreased activity after the action
shift was comparable to during the outcome shift (Table 1), the
magnitude of this decrease was substantially greater here
(105.59% decrease in activity during the action shift; 37.30%
decrease during the outcome shift), indicating that mPFC units
are much more sensitive to changes in actions required compared with outcome value.
There were no correlations between action-evoked activation and trial number for any of these units after the action
requirement shift (mean r of activated units ⫽ 0.002), indicating that activity was not reduced in a linear fashion after the
shift. We compared blocks of early and late trials following the
shift and found no differences between these blocks (t13 ⫽
0.26, P ⫽ 0.80), indicating that activity was reduced immediately following the shift, and remained at this level throughout
the session.
Units in mPFC that were suppressed during the action (n ⫽
17) also encoded information about the shift in action require-
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Fig. 4. Event-selective units in mPFC and OFC during action-outcome task. A–D: representative sample raster plots and peri-event time histograms from
action-selective activated units are displayed as mPFC (A) and OFC (C). Firing rate in histogram is displayed as spikes/s (Hz), averaged across all trials, and
aligned to the execution of the action. Z-score normalized firing rates of all action selective units in mPFC (B) and OFC (D) are depicted as means ⫾ SE (shaded
area) aligned to execution of the action. E–H: representative samples from action-selective suppressed units are displayed as mPFC (E) and OFC (G). Z-score
normalized firing rates of all action selective units are displayed as mPFC (F) and OFC (H). I–L: example mPFC (I) and OFC (K) units with activation during
anticipation of outcome delivery. Mean firing rates of outcome-anticipation selective units aligned to outcome delivery are displayed as mPFC (J) and OFC (L).
3380
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
ment. There was a near-significant difference in action-evoked
suppression between the three trial types (F2,32 ⫽ 3.13, P ⫽
0.057). Specifically, there was a significant difference between
trials before and after the shift, such that units increased
activity (i.e., decreased suppression) after the number of actions required for reward was shifted from 1 to 2 (t16 ⫽ 2.23,
P ⫽ 0.041). Thus units that were either activated or suppressed
by actions were sensitive to changes in action requirement.
In the subset of units that activated during outcome anticipation (n ⫽ 24), mPFC neuronal activity was increased relative
to baseline during the outcome anticipation period, reaching
peak activity around delivery (Fig. 6, E and F). There was a
difference in this anticipatory activity between the three action
types (F2,46 ⫽ 5.13, P ⫽ 0.01), such that pre-shift activity was
greater than post-shift activity during both unrewarded (P ⫽
0.006) and rewarded trials (P ⫽ 0.007). There was again no
difference between unrewarded and rewarded trials (P ⫽
0.475). Taken together, these data suggest that mPFC neurons
represented information about action requirement, but not
outcome availability or the result of the past trial, by reducing
firing rates during action execution and outcome anticipation
when multiple actions were required for reward.
In OFC, 29.41% of units were activated by action execution.
There was no difference in population response between any
action types (F2,14 ⫽ 0.833, P ⫽ 0.45; Fig. 6, C and D). A
separate subset of action-selective OFC units was suppressed
by the action (15.69%). In this case, there was a significant
difference in activity among trial types (F1,14 ⫽ 11.54, P ⫽
0.001; Fig. 6). Individual repeated-measures post hoc tests
revealed a difference between unrewarded and rewarded trials
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Fig. 5. OFC but not mPFC units encode changes in outcome magnitude during action execution. Subjects performed an action to receive an outcome of 1, 4,
or 1 pellets in successive blocks. A: mPFC units classified as activated by actions did not alter firing rate with outcome magnitude. Data are aligned to action
execution and depicted as the Z score normalized mean ⫾ SE of selective units. As there were no significant differences in firing rates between the two 1 action:1
pellet trial blocks (data not shown), these data are depicted averaged together. B: mean firing rates for each individual activated unit in both conditions, with each
pair of points representing a single unit’s normalized firing rate measured within a window spanning 250 ms surrounding the action. The overall group mean
is depicted as a red line. C: OFC units classified as activated by action execution decreased firing rate when outcome magnitude was increased. D: mean firing
rates for each individual action-activated unit. E: mPFC units suppressed by actions did not alter firing rate with changes in outcome magnitude. F: mean firing
rates for each mPFC unit suppressed by the action. G: OFC units suppressed during the action did not alter firing rate with changes in outcome magnitude. H:
mean firing rates for each unit suppressed by the action. I: mPFC units classified as activated by outcome anticipation did not alter firing rate with changes in
outcome magnitude. J: mean firing rates for each individual unit activated by outcome anticipation, averaged across a 500-ms window before outcome delivery.
K: OFC units classified as activated by outcome anticipation did not alter firing rate with changes in outcome magnitude. L: mean firing rates for each individual
unit activated by outcome anticipation. Asterisks represent statistical significance via paired sample t-tests at ␣ ⫽ 0.05.
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
Table 1. Overall summary of action and preoutcome selective
neuronal activity during shifts
Outcome Shift
mPFC
OFC
mPFC
OFC
23%
9%
64%
27%
23%
50%
31%
19%
24%
13%
81%*
6%
18%
38%
50%
13%
20%
56%*
25%
19%
28%
76%*
18%
6%
29%
33%
60%
7%
16%
88%*
12%
0%
23%
50%
37%
13%
29%
23%
54%
23%
29%
25%
71%*
4%
21%
10%
90%*
0%
Percentage of units that were activated or suppressed by actions or outcome
anticipation, and how these units were modulated following shifts in outcome
magnitude or action requirement. A unit was considered increased or decreased
if there was a minimum of 10% change in normalized firing rate. mPFC,
medial prefontal cortex; OFC, orbitofrontal cortext. *Significant group effect
in this direction.
following the shift (P ⫽ 0.02), such that units were suppressed
only during the actions of rewarded trials. Additionally, there
was a difference between preshift rewarded trials and postshift
unrewarded trials (P ⫽ 0.005), and no differences between
rewarded trials pre- or postshift (P ⫽ 0.134). This suggests that
units suppressed by actions in OFC track impending outcome
availability.
OFC units activated during the outcome anticipation period
(n ⫽ 10) showed a difference in activity among the three trial
types (F2,18 ⫽ 5.54, P ⫽ 0.011; Fig. 6, G and H), such that the
first block (1:1) was different from both trial types in the
second block (2:1; P values ⬍ 0.02). As with mPFC, there was
no difference between the unrewarded and rewarded blocks
following the action requirement shift (P ⫽ 0.96). Thus, in
contrast with action-evoked activity, both mPFC and OFC
encode action requirement during the preoutcome anticipation
period.
DISCUSSION
We observed that neuronal activity in mPFC and OFC
encoded distinct information about the causal relationship
between actions and the resulting outcomes. Furthermore,
these neuronal changes were mirrored by differences in behavior. During action execution, mPFC units encoded changes in
action requirement (i.e., number of actions required to receive
an outcome) while performing the action, but did not represent
changes in outcome magnitude. In contrast, OFC units activated by action execution encoded changes in outcome magnitude during action performance, but were insensitive to
changes in action requirement. OFC units suppressed by the
action also encoded information about impending outcome
availability.
PFC Regions Encode Distinct Information about ActionOutcome Relationships During Action Performance
Our observation that mPFC encodes the number of actions/
trials required for reward, whereas OFC encodes outcome
magnitude, highlights differential encoding of information
about action-outcome relationships in these two regions. This
is consistent with previous studies demonstrating functional
dissociation within PFC. Lesions of mPFC disrupt acquisition
and adaptation of goal-directed behavior and changes in actionoutcome contingency (Balleine and Dickinson 1998; Birrell
and Brown 2000; Corbit and Balleine 2003; Killcross and
Coutureau 2003; Naneix et al. 2009; Ostlund and Balleine
2005).
In addition, our data are in agreement with previous data
showing that mPFC neuronal populations do not encode
changes in outcome magnitude or availability through actionevoked population response (Kargo et al. 2007; Peters et al.
2005). Our OFC data are consistent with lesion studies suggesting that this region is critical for modifying behavior with
changes in reward value (Fellows 2011; Schoenbaum et al.
2003; Sellitto et al. 2010), as well as recording studies demonstrating encoding of information about expected rewards
during reward anticipation and reward-predictive cues (PadoaSchioppa and Assad 2006; Rogers et al. 1999; Rolls and Baylis
1994; Schoenbaum et al. 1998; Schultz 2000; Wallis and
Miller 2003).
The current study expands on previous work by recording
from OFC and mPFC in the same subjects engaged in a task
where the relationship between rewards and number of actions
was independently manipulated. This design allowed us to
observe that differences in OFC and mPFC processing exist
during reward-driven instrumental action. This study also demonstrates that other aspects of action-outcome relationships
(action requirement and outcome magnitude) are encoded in
parallel during action performance. These similarities and
differences are likely related to overlapping or distinct subcortical and sensory inputs to these PFC regions (Chandler et al.
2013; Kolb et al. 2004; Ongur and Price 2000).
It is possible that the reduction of action-evoked activity in
mPFC during the action shift was reflective of reduced reward
value or general motivation, as requiring two actions per
reward reduces the average reward associated with each action.
Indeed, it has been shown that mPFC encodes information
about incentive value during consummatory behavior (Parent
et al. 2015). However, if mPFC activity were solely reflective
of reward value, differences in activity would have been
evident during the outcome magnitude shift, which directly
manipulates reward value. Furthermore, if action-evoked activity in mPFC were reflective of general response motivation,
this also would have been affected during the outcome shift.
Therefore, at least during action performance, mPFC activity
seems to provide information specifically about action requirement but not outcome value or overall motivational state.
During the increase in action requirement, it is possible that
the activity of action-selective mPFC neurons is modulated by
effort expenditure, as effort-based decision making involves
the mPFC (Floresco et al. 2008; Hosking et al. 2014; Khani et
al. 2015). Increased effort is a direct function of the number of
actions required for each reward; thus effort encoding in mPFC
is consistent with a role for these networks in encoding information about action-outcome relationships. Similarly, an increased action requirement increases the average amount of
delay between actions and rewards. OFC also has been implicated in incorporating delays into encoding of upcoming rewards (Roesch et al. 2006; Rudebeck et al. 2006). Consistent
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Units activated during action
Increase postshift
Decrease postshift
No change
Units suppressed during action
Increase postshift
Decrease postshift
No change
Units activated during outcome
anticipation
Increase postshift
Decrease postshift
No change
Action Shift
3381
3382
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
with these findings, unrewarded trials, which were more
temporally distant from rewards than rewarded trials, were
characterized by less action-evoked OFC activity. We have
interpreted this as OFC signaling reward availability on a
trial-by-trial basis; but an alternative interpretation is that
action-evoked OFC activity provides information about the
amount of “waiting” required prior to reward delivery.
Negative Value Coding of Outcome Magnitude in OFC
We observed that OFC activity was reduced when the
outcome magnitude increased but restored to preshift activa-
tion levels when the outcome returned to its initial, lower
magnitude. Other studies have demonstrated either positive or
negative reward value coding in OFC, and some have observed
both occurring simultaneously in separate neural subpopulations (Burton et al. 2014; Morrison and Salzman 2009; O’Neill
and Schultz 2010; Padoa-Schioppa and Assad 2006; Roesch
and Olson 2004; Roesch et al. 2006; van Duuren et al. 2007).
These studies differed from the current work by focusing on
different task epochs, offering multiple rewarding options,
and/or using well-trained animals responding to predictable
changes in reward value. The current study suggests that
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Fig. 6. mPFC encodes shift in action requirement. For the alternating trials/action requirement shift, subjects performed a single action to receive a pellet reward,
then, in a second block, performed two actions on consecutive trials before receiving reward. A: mPFC units classified as activated by action execution reduced
firing rate as action requirement increased. There was no difference between unrewarded and rewarded actions during the second block. Data are aligned to action
execution and depicted as Z score normalized mean ⫾ SE. B: mean firing rates for each individual mPFC action-activated unit for the first block and unrewarded
and rewarded responses in the second block (shaded) during a 250-ms window surrounding the action. The overall mean is depicted as a red line. C: OFC units
classified as activated by action execution did not alter firing rate following the change in action requirement. D: mean firing rates for each individual OFC unit
activated by action execution. E: mPFC activity in neurons suppressed during the peri-action window. Action-evoked suppression was eliminated following the
increase in action requirement. There was no difference between unrewarded and rewarded trials after the shift. F: mean firing rates for each mPFC unit
suppressed during the action. G: OFC activity in units suppressed during the action. Suppressed units increased their action evoked activity on unrewarded trials,
and their suppression was restored on rewarded trials. H: mean firing rates for each suppressed OFC unit. I: mPFC activity during outcome anticipation (X-axis
aligned to outcome delivery) was reduced in a subpopulation of units following an increase in action requirement. There was no difference between unrewarded
and rewarded trials after the shift. J: mean firing rates for each mPFC activated by outcome anticipation, with means taken from the 500-ms period preceding
the reward. K: OFC activity during outcome anticipation was reduced following an increase in action requirement. As with mPFC, there was no difference
between unrewarded and rewarded trials after the shift. L: mean firing rates for each OFC unit selective for outcome anticipation. For B, D, F, and H, asterisks
represent statistical significance at ␣ ⫽ 0.05.
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
changes in outcome magnitude are encoded predominantly via
negative coding during action-evoked activity when alterations
in outcome magnitude are unexpected and are not accompanied
by changes in environmental cues. This contrasted with the
positive action-encoding observed in mPFC during the action
shift, in which a decrease in the possibility of reward was
associated with a decrease in action-evoked activity. This
suggests that, at least during actions, encoding of actionoutcome relationships does not adhere to a single value system
in which either increasing the amount of actions required for a
reward or decreasing reward magnitude are encoded in similar
fashion. Rather, fluctuations in action requirement and outcome magnitude are encoded differently by distinct PFC regions.
During outcome anticipation, we observed that both mPFC
and OFC encoded information about decreased availability of
expected reward, but not upcoming outcome magnitude. Sustained neuronal activation during anticipation of a reward or
stimulus is commonly observed in PFC (Deco et al. 2005;
Funahashi et al. 1989; Niki and Watanabe 1979; Totah et al.
2009). This persistent elevation in firing has been associated
with maintaining a mental representation of upcoming rewards/
cues, and directing attention toward the expected outcome.
Consistent with this idea, mPFC and OFC both signal reduced
expectation of reward delivery during reward anticipation following an action, but provide divergent signals that represent
either action requirement (mPFC), outcome magnitude (activated units in OFC), or impending reward availability (suppressed units in OFC) during action execution. It also is
important to note that this information was carried by distinct
subpopulations that subserved different task epochs.
The observation that preoutcome OFC activity did not encode outcome magnitude was different from several previous
studies that found OFC neuronal correlates of outcome magnitude during outcome anticipation (Roesch et al. 2006; van
Duuren et al. 2007). Experimental design differences that may
account for differences in outcome include: 1) presentation of
cues that predicted the size of the upcoming reward; 2) extensive training relative to our study, in which changes in outcome
magnitude had never previously been experienced; and 3) an
extended outcome anticipation period that included reward
delivery.
current finding may be that OFC activity is responding to the
stimulus of cue offset that immediately follows the action. If
this were the case, one would expect the neuronal response to
begin immediately upon cue termination, whereas we observed
that the response typically began before cue termination. In
addition, action-selective units were not modulated on trials
where the subject failed to respond and the cue was terminated.
Collectively, these data suggest that OFC neurons encode
information about action-outcome relationships in addition to
their previously ascribed role in Pavlovian conditioning.
Conclusions
Previous studies investigating the role of PFC during goaldirected behavior have focused on encoding of working memory during a discriminative cue, the anticipatory period prior to
outcome delivery, or feedback following outcome or outcome
omission (Fuster and Alexander 1971, Funahashi et al. 1993;
Fuster and Alexander 1971; Luk and Wallis 2009; Matsumoto
et al. 2007; Moorman and Aston-Jones, 2014; Padoa-Schioppa
and Assad 2006; Rainer et al. 1998; Roesch et al. 2006; Roesch
and Olson 2004; Simmons and Richmond 2008). Here we
focused on action-evoked neuronal activity that peaked near
action completion. We found that information about the value
of actions and rewards is encoded separately and in parallel,
with action requirement represented in mPFC, and outcome
magnitude and availability represented in OFC. These data
provide novel evidence that PFC subregions encode behaviorally relevant information during the execution of goal-directed
actions, implicating PFC as an executive region that directly
informs action selection. Inability to accurately internalize this
information may contribute to the aberrant behavioral flexibility observed in psychiatric disorders involving PFC dysfunction.
ACKNOWLEDGMENTS
We thank Dr. S. Morrison for helpful comments on this manuscript.
GRANTS
This work was supported by National Institutes of Health Grants R01-MH084906 (B. Moghaddam), F32-DA-035050 (N. W. Simon), and T32-DA03111 (N. W. Simon, J. Wood).
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the author(s).
Action-Outcome Representations in OFC
AUTHOR CONTRIBUTIONS
Our findings demonstrate that both OFC and mPFC represent information about outcomes during action performance.
The concept of OFC contributing to action-outcome learning is
not unprecedented, as OFC lesions can impair action-outcome
learning and alter decision-making (Noonan et al. 2012; Rhodes and Murray 2013). Furthermore, OFC activity related to
instrumental actions has been observed previously, and OFC
activity appears to inform decision-making directly (Furuyashiki et al. 2008; Noonan et al. 2012; Padoa-Schioppa and
Assad 2006; Sul et al. 2010).
OFC is often associated with Pavlovian (stimulus-outcome)
learning (Balleine et al. 2011; Fellows and Farah 2005; Luk
and Wallis 2013). Thus an alternative interpretation of the
Author contributions: N.W.S., J.W., and B.M. conception and design of
research; N.W.S. performed experiments; N.W.S. analyzed data; N.W.S., J.W.,
and B.M. interpreted results of experiments; N.W.S. prepared figures; N.W.S.
drafted manuscript; N.W.S., J.W., and B.M. edited and revised manuscript;
N.W.S., J.W., and B.M. approved final version of manuscript.
REFERENCES
Arnsten AF, Wang MJ, Paspalas CD. Neuromodulation of thought: flexibilities and vulnerabilities in prefrontal cortical network synapses. Neuron
76: 223–239, 2012.
Baddeley A, Della Sala S. Working memory and executive control. Philos
Trans R Soc Lond B Biol Sci 351: 1397–1403, 1996.
Balleine BW, Dickinson A. Goal-directed instrumental action: contingency
and incentive learning and their cortical substrates. Neuropharmacology 37:
407– 419, 1998.
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
PFC Subregions Encode a Uniform Signal During Outcome
Anticipation
3383
3384
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
Holland PC, Straub JJ. Differential effects of two ways of devaluing the
unconditioned stimulus after Pavlovian conditioning. J Exp Psychol Anim
Behav Process 5: 65–78, 1979.
Homayoun H, Moghaddam B. NMDA receptor hypofunction produces
opposite effects on prefrontal cortex interneurons and pyramidal neurons. J
Neurosci 27: 11496 –11500, 2007.
Homayoun H, Moghaddam B. Differential representation of Pavlovianinstrumental transfer by prefrontal cortex subregions and striatum. Eur J
Neurosci 29: 1461–1476, 2009.
Hosking JG, Cocker PJ, Winstanley CA. Dissociable contributions of
anterior cingulate cortex and basolateral amygdala on a rodent cost/benefit
decision-making task of cognitive effort. Neuropsychopharmacology 39:
1558 –1567, 2014.
Hyman JM, Whitman J, Emberly E, Woodward TS, Seamans JK. Action
and outcome activity state patterns in the anterior cingulate cortex. Cereb
Cortex 23: 1257–1268, 2013.
Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex
lesions in Rhesus monkeys disrupt choices guided by both reward value and
reward contingency. J Neurosci 24: 7540 –7548, 2004.
Kargo WJ, Szatmary B, Nitz DA. Adaptation of prefrontal cortical firing
patterns and their fidelity to changes in action reward contingencies. J
Neurosci 27: 3548 –3559, 2007.
Kass RE, Ventura V, Cai C. Statistical smoothing of neuronal data. Network
14: 5–15, 2003.
Khani A, Kermani M, Hesam S, Haghparast A, Argandona EG, Rainer G.
Activation of cannabinoid system in anterior cingulate cortex and orbitofrontal cortex modulates cost-benefit decision making. Psychopharmacology (Berl) 232: 2097–2112, 2015.
Killcross S, Coutureau E. Coordination of actions and habits in the medial
prefrontal cortex of rats. Cereb Cortex 13: 400 – 408, 2003.
Kolb B, Pellis S, Robinson TE. Plasticity and functions of the orbital frontal
cortex. Brain Cogn 55: 104 –115, 2004.
Luk CH, Wallis JD. Dynamic encoding of responses and outcomes by
neurons in medial prefrontal cortex. J Neurosci 29: 7526 –7539, 2009.
Luk CH, Wallis JD. Choice coding in frontal cortex during stimulus-guided
or action-guided decision-making. J Neurosci 33: 1864 –1871, 2013.
Matsumoto K, Tanaka K. The role of the medial prefrontal cortex in
achieving goals. Curr Opin Neurobiol 14: 178 –185, 2004.
Matsumoto M, Matsumoto K, Abe H, Tanaka K. Medial prefrontal cell
activity signaling prediction errors of action values. Nat Neurosci 10:
647– 656, 2007.
Mitra P, Bokil H. Observed Brain Dynamics. New York: Oxford Univ. Press,
2008.
Moghaddam B, Homayoun H. Divergent plasticity of prefrontal cortex
networks. Neuropsychopharmacology 33: 42–55, 2007.
Moorman DE, Aston-Jones G. Orbitofrontal cortical neurons encode expectation-driven initiation of reward-seeking. J Neurosci 34: 10234 –10246,
2014.
Morrison SE, Salzman CD. The convergence of information about rewarding
and aversive stimuli in single neurons. J Neurosci 29: 11471–11483, 2009.
Mulder AB, Nordquist RE, Orgut O, Pennartz CMA. Learning-related
changes in response patterns of prefrontal neurons during instrumental
conditioning. Behav Brain Res 146: 77– 88, 2003.
Naneix F, Marchand AR, Di Scala G, Pape JR, Coutureau E. A role for
medial prefrontal dopaminergic innervation in instrumental conditioning. J
Neurosci 29: 6599 – 6606, 2009.
Narayanan NS, Laubach M. Top-down control of motor cortex ensembles by
dorsomedial prefrontal cortex. Neuron 52: 921–931, 2006.
Niki H, Watanabe M. Prefrontal and cingulate unit activity during timing
behavior in the monkey. Brain Res 171: 213–224, 1979.
Noonan MP, Kolling N, Walton ME, Rushworth MF. Re-evaluating the role
of the orbitofrontal cortex in reward and reinforcement. Eur J Neurosci 35:
997–1010, 2012.
O’Neill M, Schultz W. Coding of reward risk by orbitofrontal neurons is
mostly distinct from coding of reward value. Neuron 68: 789 – 800, 2010.
Oliveira FTP, McDonald JJ, Goodman D. Performance monitoring in the
anterior cingulate is not all error related: expectancy deviation and the
representation of action-outcome associations. J Cogn Neurosci 19: 1994 –
2004, 2007.
Ongur D, Price JL. The organization of networks within the orbital and
medial prefrontal cortex of rats, monkeys and humans. Cereb Cortex 10:
206 –219, 2000.
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Balleine BW, Leung BK, Ostlund SB. The orbitofrontal cortex, predicted
value, and choice. Ann NY Acad Sci 1239: 43–50, 2011.
Bechara A, Van Der Linden M. Decision-making and impulse control after
frontal lobe injuries. Curr Opin Neurobiol 18: 734 –739, 2005.
Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS. Learning the
value of information in an uncertain world. Nat Neurosci 10: 1214 –1221,
2007.
Birrell JM, Brown VJ. Medial frontal cortex mediates perceptual attentional
set shifting in the rat. J Neurosci 20: 4320 – 4324, 2000.
Bohn I, Giertler C, Hauber W. Orbital prefrontal cortex and guidance of
instrumental behavior of rats by visuospatial stimuli predicting reward
magnitude. Learn Mem 10: 177–186, 2003.
Brown VJ, Bowman EM. Rodent models of prefrontal cortical function.
Trends Neurosci 25: 340 –343, 2002.
Buckley MJ, Mansouri FA, Hoda H, Mahboubi M, Browning PGF, Kwok
SC, Phillips A, Tanaka K. Dissociable components of rule-guided behavior
depend on distinct medial and prefrontal regions. Science 325: 52–58, 2009.
Burton AC, Kashtelyan V, Bryden DW, Roesch MR. Increased firing to
cues that predict low-value reward in the medial orbitofrontal cortex. Cereb
Cortex 24: 3310 –3321, 2014.
Cardinal RN. Neural systems implicated in delayed and probabilistic reinforcement. Neural Netwk 19: 1277–1301, 2006.
Chandler DJ, Lamperski CS, Waterhouse BD. Identification and distribution of projections from monoaminergic and cholinergic nuclei to functionally differentiated subregions of prefrontal cortex. Brain Res 1522: 38 –58,
2013.
Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal
and infralimbic cortex to Pavlovian autoshaping and discrimination reversal
learning: further evidence for the functional heterogeneity of the rodent
frontal cortex. J Neurosci 23: 8771– 8780, 2003.
Connors BW, Gutnick MJ. Intrinsic firing patterns of diverse neocortical
neurons. Trends Neurosci 13: 99 –104, 1990.
Corbit LH, Balleine BW. The role of prelimbic cortex in instrumental
conditioning. Behav Brain Res 146: 145–157, 2003.
Deco G, Ledberg A, Almeida R, Fuster J. Neural dynamics of cross-modal
and cross-temporal associations. Exp Brain Res 166: 325–336, 2005.
Fellows LK. Orbitofrontal contributions to value-based decision making:
evidence from humans with frontal lobe damage. Ann NY Acad Sci 1239:
51–58, 2011.
Fellows LK, Farah MJ. Is anterior cingulate cortex necessary for cognitive
control? Brain 128: 788 –796, 2005.
Floresco SB, St. Onge JR, Ghods-Sharifi S, Winstanley CA. Cortico-limbicstriatal circuits subserving different forms of cost-benefit decision-making.
Cogn Affect Behav Neurosci 8: 375–389, 2008.
Funahashi S, Bruce CJ, Goldman-Rakic PS. Mnemonic coding of visual
space in the monkey’s dorsolateral prefrontal cortex. J Neurophysiol 61:
331–349, 1989.
Funahashi S, Chafee MV, Goldman-Rakic PS. Prefrontal neuronal activity
in rhesus monkeys performing a delayed anti-saccade task. Nature 365:
753–756, 1993.
Furuyashiki T, Holland PC, Gallagher M. Rat orbitofrontal cortex separately encodes response and outcome information during performance of
goal-directed behavior. J Neurosci 28: 5127–5138, 2008.
Fuster JM. The prefrontal cortex—an update: time is of the essence. Neuron
30: 319 –333, 2001.
Fuster JM, Alexander GE. Neuron activity related to short-term memory.
Science 173: 652– 654, 1971.
Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and
representation of incentive value in associative learning. J Neurosci 19:
6610 – 6614, 1999.
Glascher J, Adolphs R, Damasio H, Bechara A, Rudrauf D, Calamia M,
Paul LK, Tranel D. Lesion mapping of cognitive control and value-based
decision making in the prefrontal cortex. Proc Natl Acad Sci USA 109:
14681–14686, 2012.
Goldman-Rakic PS. Cellular basis of working memory. Neuron 14: 477– 485,
1995.
Gourley SL, Lee AS, Howell JL, Pittenger C, Taylor JR. Dissociable
regulation of instrumental action within mouse prefrontal cortex. Eur J
Neurosci 32: 1726 –1734, 2010.
Histed MH, Pasupathy A, Miller EK. Learning substrates in the primate
prefrontal cortex and striatum: Sustained activity related to successful
actions. Neuron 63: 244 –253, 2009.
PREFRONTAL CORTEX ENCODES ACTION-OUTCOME RELATIONSHIPS
Schoenbaum G, Setlow B, Nugent SL, Saddoris MP, Gallagher M. Lesions
of orbitofrontal cortex and basolateral amygdala complex disrupt acquisition
of odor-guided discriminations and reversals. Learn Mem 10: 129 –140,
2003.
Schultz W. Multiple reward signals in the brain. Nat Rev Neurosci 1:
199 –207, 2000.
Schwartz MF. The cognitive neuropsychology of everyday action and planning. Cogn Neuropsychol 23: 202–221, 2006.
Sellitto M, Ciaramelli E, di Pellegrino G. Myopic discounting of future
rewards after medial orbitofrontal damage in humans. J Neurosci 30:
16429 –16436, 2010.
Simmons JM, Richmond BJ. Dynamic changes in representations of preceding and upcoming reward in monkey orbitofrontal cortex. Cereb Cortex 18:
93–103, 2008.
Sturman DA, Moghaddam B. Reduced neuronal inhibition and coordination
of adolescent prefrontal cortex during motivated behavior. J Neurosci 31:
1471–1478, 2011.
Sul JH, Kim H, Huh N, Lee D, Jung MW. Distinct roles of rodent
orbitofrontal and medial prefrontal cortex in decision making. Neuron 66:
449 – 460, 2010.
Tierney PL, Dégenètais E, Thierry AM, Glowinski J, Gioanni Y. Influence
of the hippocampus on interneurons of the rat prefrontal cortex. Eur J
Neurosci 20: 514 –524, 2004.
Totah NK, Kim YB, Homayoun H, Moghaddam B. Anterior cingulate
neurons represent errors and preparatory attention within the same behavioral sequence. J Neurosci 29: 6418 – 6426, 2009.
Tran-Tu-Yen DAS, Marchand AR, Pape J-R, Di Scala G, Coutureau E.
Transient role of the rat prelimbic cortex in goal-directed behaviour. Eur J
Neurosci 30: 464 – 471, 2009.
Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal
cortex. Nature 398: 704 –708, 1999.
van Duuren E, Escamez FA, Joosten RN, Visser R, Mulder AB, Pennartz
CMA. Neural coding of reward magnitude in the orbitofrontal cortex of the
rat during a five-odor olfactory discrimination task. Learn Mem 14: 446 –
456, 2007.
Wallis JD, Miller EK. Neuronal activity in primate dorsolateral and orbital
prefrontal cortex during performance of a reward preference task. Eur J
Neurosci 18: 2069 –2081, 2003.
Walton ME, Rudebeck PH, Bannerman DM, Rushworth MFS. Calculating
the cost of acting in frontal cortex. Ann NY Acad Sci 1104: 340 –356, 2007.
Young JJ, Shapiro ML. Dynamic coding of goal-directed paths by orbital
prefrontal cortex. J Neurosci 31: 5989 – 6000, 2011.
J Neurophysiol • doi:10.1152/jn.00884.2015 • www.jn.org
Downloaded from http://jn.physiology.org/ by 10.220.33.2 on June 15, 2017
Ostlund SB, Balleine BW. Lesions of medial prefrontal cortex disrupt the
acquisition but not the expression of goal-directed learning. J Neurosci 25:
7763–7770, 2005.
Ostlund SB, Balleine BW. The contribution of orbitofrontal cortex to action
selection. Ann NY Acad Sci 1121: 174 –192, 2007.
Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode
economic value. Nature 441: 223–226, 2006.
Parent MA, Amarante LM, Liu B, Weikum D, Laubach M. The medial
prefrontal cortex is crucial for the maintenance of persistent licking and the
expression of incentive contrast. Front Integr Neurosci 9: 23, 2015.
Peters YM, O’Donnell P, Carelli RM. Prefrontal cortical cell firing during
maintenance, extinction, and reinstatement of goal-directed behavior for
natural reward. Synapse 56: 74 – 83, 2005.
Rainer G, Asaad WF, Miller EK. Selective representation of relevant
information by neurons in the primate prefrontal cortex. Nature 393:
577–579, 1998.
Rhodes SE, Murray EA. Differential effects of amygdala, orbital prefrontal
cortex, and prelimbic cortex lesions on goal-directed behavior in rhesus
macaques. J Neurosci 33: 3380 –3389, 2013.
Roesch MR, Olson CR. Neuronal activity related to reward value and
motivation in primate frontal cortex. Science 304: 307–310, 2004.
Roesch MR, Taylor AR, Schoenbaum G. Encoding of time-discounted
rewards in orbitofrontal cortex is independent of value representation.
Neuron 51: 509 –520, 2006.
Rogers RD, Owen AM, Middleton HC, Williams EJ, Pickard JD, Sahakian BJ, Robbins TW. Choosing between small, likely rewards and large,
unlikely rewards activates inferior and orbital prefrontal cortex. J Neurosci
19: 9029 –9038, 1999.
Rolls E, Baylis L. Gustatory, olfactory, and visual convergence within the
primate orbitofrontal cortex. J Neurosci 14: 5437–5452, 1994.
Rolls ET. The functions of the orbitofrontal cortex. Brain Cogn 55: 11–29,
2004.
Rudebeck PH, Behrens TE, Kennerley SW, Baxter MG, Buckley MJ,
Walton ME, Rushworth MFS. Frontal cortex subregions play distinct roles
in choices between actions and stimuli. J Neurosci 28: 13775–13785, 2008.
Rudebeck PH, Walton ME, Smyth AN, Bannerman DM, Rushworth MFS.
Separate neural pathways process different decision costs. Nat Neurosci 9:
1161–1168, 2006.
Schoenbaum G, Chiba AA, Gallagher M. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci 1:
155–159, 1998.
Schoenbaum G, Setlow B. Lesions of nucleus accumbens disrupt learning
about aversive outcomes. J Neurosci 23: 9833–9841, 2003.
3385