International Journal of Psychophysiology 107 (2016) 37–43
journal homepage: www.elsevier.com/locate/ijpsycho

Oscillatory profiles of positive, negative and neutral feedback stimuli during adaptive decision making

Peng Li a, Travis E. Baker b, Chris Warren c, Hong Li a,⁎

a Brain Function and Psychological Science Research Center, Shenzhen University, Shenzhen, China
b Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montreal, Canada
c Department of Psychology, Leiden University, Leiden, The Netherlands

Article history: Received 20 October 2015; received in revised form 22 June 2016; accepted 30 June 2016; available online 01 July 2016

Keywords: Theta; Beta-gamma; Feedback-related negativity; Neutral feedback; Reinforcement learning

Abstract

The electrophysiological response to positive and negative feedback during reinforcement learning has been well documented over the past two decades, yet little is known about the neural response to the uninformative events that often follow our actions. To address this issue, we recorded the electroencephalogram (EEG) during a time-estimation task using both informative (positive and negative) and uninformative (neutral) feedback. In the time-frequency domain, uninformative feedback elicited significantly less induced beta-gamma activity than informative feedback. This result suggests that beta-gamma activity is particularly sensitive to feedback that can guide behavioral adjustments, consistent with other work. In contrast, neither theta nor delta activity was sensitive to the difference between negative and neutral feedback, though both frequencies discriminated between positive and non-positive (neutral or negative) feedback.
Interestingly, in the time domain, we observed a linear relationship in the amplitude of the feedback-related negativity (neutral > negative > positive), a component of the event-related brain potential thought to index a specific kind of reinforcement learning signal called a reward prediction error. Taken together, these results suggest that the reinforcement learning system treats neutral feedback as a special case, providing valuable information about the electrophysiological measures used to index the cognitive function of frontal midline cortex.

© 2016 Published by Elsevier B.V.

⁎ Corresponding author at: No 3688, Nanhai Road, Nanshan District, Shenzhen 518060, China. E-mail address: [email protected] (H. Li).

http://dx.doi.org/10.1016/j.ijpsycho.2016.06.018
0167-8760/© 2016 Published by Elsevier B.V.

1. Introduction

Our ability to predict and evaluate the consequences of our actions is fundamental to adaptive decision making. Reinforcement learning (RL) theory holds that if an action is followed by positive feedback then that action will have a greater probability of being performed again, whereas if an action is followed by negative feedback then that action will have a lesser probability of being performed again (i.e., Thorndike's Law of Effect; Catania, 1999). In everyday life, however, not all of our actions are followed by such binary consequences; many are followed instead by uninformative events. In fact, the term neutral operants has long been used by RL theorists to describe responses from the environment that neither increase nor decrease the probability of a behavior being repeated (Skinner, 1938). While observations of electrophysiological activity over frontal midline cortex have motivated a wealth of experimental and theoretical analyses of RL, it remains unclear how uninformative feedback is ultimately processed during trial-and-error learning tasks.

Over the last decade, both time domain and time-frequency domain analyses of electrophysiological recordings have been increasingly used in research concerned with neural processes that differentiate performance feedback indicating positive outcomes (e.g., monetary gain, correct feedback) from negative outcomes (e.g., monetary loss, error feedback) (Weinberg et al., 2014). In the time domain, event-related brain potential (ERP) studies have revealed a negative-going deflection in the ERP that peaks over frontal-central recording sites approximately 250 ms following feedback presentation. This feedback-locked ERP component, termed the feedback-related negativity (FRN), is typically enhanced following unexpected task-relevant events (e.g., negative feedback, errors) and is reduced or absent following positive feedback.¹ Interestingly, the few FRN studies examining neutral feedback have produced largely mixed results (Holroyd et al., 2006; Kujawa et al., 2013; Huang and Yu, 2014; Yu and Zhou, 2006). In particular, studies report either larger FRNs to neutral feedback compared to negative and positive feedback (Müller et al., 2005; Kujawa et al., 2013; Huang and Yu, 2014) or comparable FRN amplitudes between neutral and negative feedback (Holroyd et al., 2006).²

¹ Recent evidence suggests that the difference in FRN amplitude between reward and error trials results from a positive-going deflection, the reward positivity (Rew-P), elicited by reward feedback (see Holroyd et al., 2008; Warren and Holroyd, 2012; Baker and Holroyd, 2011; Proudfit, 2015). Because the Rew-P typically occurs during the time-range of the FRN and P300, the difference-wave method is commonly used to isolate the reward positivity from other ERP components by taking the difference between the ERPs to positive and negative feedback. For the purpose of this study, we focused our analysis on condition-specific ERP effects by measuring the amplitudes of the FRN elicited by neutral, negative, and positive feedback.

Although neutral feedback has yet to be investigated in the time-frequency domain, electroencephalogram (EEG) oscillations in the theta frequency range (4–8 Hz) recorded over frontal midline areas of the scalp have been associated with outcome processing (Cavanagh and Frank, 2014; Cavanagh et al., 2012), as well as other cognitive processes related to effort, attention and motivation (for reviews, see Hsieh and Ranganath, 2014; Mitchell et al., 2008). Notably, more frontal midline theta power is observed following negative feedback compared to positive feedback, suggesting that this signal reflects an error-driven learning mechanism consistent with principles of reinforcement learning (Cavanagh et al., 2010). However, others have argued that frontal midline theta reflects sensitivity to important cognitive events in general rather than to errors in particular (Cavanagh et al., 2012), and signals the deployment of control (Cavanagh and Frank, 2014). Furthermore, power in the delta (1–4 Hz) and beta-gamma (20–30 Hz) frequency ranges has been shown to increase following positive feedback compared to negative feedback (Bernat et al., 2011; Cohen et al., 2007; HajiHosseini et al., 2012; Marco-Pallares et al., 2008). In particular, recent work has indicated that feedback-locked delta band activity related to the reward positivity appears to be specific to surprising rewards, but does not predict associated behavioral adjustments (Cavanagh, 2015). By contrast, feedback-locked beta-gamma activity is thought to reflect a salience signal and has been associated with ongoing adjustments of behavior (HajiHosseini and Holroyd, 2015) and cognitive demand (Chen et al., 2012; Gilbert and Sigman, 2007; Lee et al., 2003).
While decomposing feedback-related EEG and ERPs into spectral quantities has provided a thorough understanding of the cognitive processes underlying RL, much of the existing research on these time-frequency components characterizes feedback-locked oscillatory activity (delta, theta, beta-gamma) as total spectral power. Unfortunately, this approach does not capture all the information available in these signals, because total power within a given frequency band consists of the stimulus phase-locked part of the EEG that gives rise to the ERP, called the “evoked” power, and the non-phase-locked part of the EEG that is invisible in the ERP, called the “induced” power (Tallon-Baudry and Bertrand, 1999). Importantly, current thinking holds that these activities reflect different cognitive processes, such that evoked activity reflects bottom-up neural activity, whereas induced activity is thought to reflect top-down modulation (Tallon-Baudry and Bertrand, 1999). Indeed, it was recently demonstrated that total theta power is equally sensitive to outcome valence and outcome probability; however, evoked theta power was mainly sensitive to outcome valence, whereas induced theta power was mainly sensitive to outcome probability (Hajihosseini and Holroyd, 2013). The role of delta and beta-gamma band phase dynamics in feedback processing remains unknown. The difference in dominant frequencies (delta, theta, and beta-gamma) between negative and positive feedback could provide a deeper understanding of the phase dynamics (evoked vs induced) at play during trial-by-trial RL.

Furthermore, it is also important to consider the relationship between time domain and time-frequency domain measures. For example, a relationship between feedback-related delta activity and the amplitude of the P300 component has been demonstrated (e.g.
Bernat et al., 2007; Cavanagh and Frank, 2014), possibly suggesting that the evoked portion of delta following feedback may contribute to feedback-related differences observed in the amplitude of the P300. However, this relationship has never been formally tested. Further, theta oscillations and the FRN have been extensively studied in parallel, decades-long literatures. Feedback-induced theta power and the FRN occur at about the same time (200–400 ms post feedback) and share the same scalp location (over the frontal midline), suggesting a functional relationship between these two phenomena. In particular, converging evidence across multiple methodologies indicates that the anterior cingulate cortex (ACC) is the source of both frontal midline theta oscillations (Cavanagh and Frank, 2014) and the FRN (Holroyd and Yeung, 2012). Importantly, recent examinations of theta power and the FRN have provided a nuanced account of their relationship (Hajihosseini and Holroyd, 2013). Under this account, unexpected, task-relevant events elicit an ACC-dependent control process that manifests in the frequency domain as theta oscillations over frontal-central areas of the scalp (Cavanagh and Frank, 2014). In the time domain, the “evoked” portion of this theta activity that is consistent in phase across trials gives rise to the FRN (Hajihosseini and Holroyd, 2013; see also Yeung et al., 2004). Although both measures provide valuable information about the cognitive function of frontal midline cortex, it has recently been argued that FRN amplitude is specifically sensitive to dopamine reinforcement learning signals, whereas evoked theta power reflects the ACC response to unexpected events.

² It is interesting to note that across the 5 experiments reported in Holroyd et al. (2006), the authors detailed that neutral feedback stimuli elicited larger FRNs than did negative feedback stimuli, but this difference was not statistically significant (see Figs. 1 and 2, Holroyd et al., 2006).
Given the relationship between theta oscillations and the FRN, it is perhaps surprising that neutral feedback has not yet been investigated in the time-frequency domain. Furthermore, because of the inconsistency in FRN studies examining neutral operants, our understanding of the functional role of the ACC in the cognitive processes that underlie reinforcement learning remains incomplete. Thus, in order to further our understanding of the cognitive processes underlying informative and non-informative feedback during RL, the electrophysiological response to the good, the bad, and the neutral needs to be further characterized. In the present study, we present a harmonious application of both ERP (i.e., FRN) and time-frequency (i.e., evoked and induced delta, theta, and beta-gamma power) approaches in an attempt to elucidate the discrete aspects of the electrophysiological dynamics between positive, negative, and neutral feedback (Holroyd et al., 2012).

2. Methods

2.1. Participants and procedure

Nineteen undergraduate students (eight males) aged 18–23 years participated in the experiment for monetary compensation. All participants had normal or corrected-to-normal vision, were right-handed and had no neurological or psychological disorders. Two subjects were excluded from the final analysis due to poor behavioral performance. The study was approved by the local ethics committee.

Participants were asked to perform a time estimation task (e.g., Miltner et al., 1997) that included neutral feedback. They were required to press the spacebar following a cue (a 1500 Hz sound that lasted 50 ms) to indicate that their estimate of 1 s had elapsed. Following their response, a feedback stimulus appeared on the screen indicating that their estimate was correct (positive feedback, win 5 cents; a circle with a check mark), that it was incorrect (negative feedback, 0 cents; a circle with a cross mark), or giving no outcome information (neutral feedback, either 5 or 0 cents; a circle with nothing inside).
Note that participants did not know immediately whether they had earned money on neutral-feedback trials, but they received money for correct responses (in total, 50% of the trials) at the end of the experiment. Participants were told that if their reaction time was within the time window from 900 ms to 1100 ms, they would receive positive feedback; otherwise they would get negative feedback. However, this time window narrowed by 10 ms if they responded correctly on the previous trial and widened by 10 ms if they responded incorrectly on the previous trial. For 1/3 of negative-feedback trials and 1/3 of positive-feedback trials, the appropriate feedback was randomly replaced with neutral feedback. Of the 288 trials, participants received 28% neutral feedback, 35% positive feedback and 37% negative feedback in total.

2.2. EEG acquisition

The electroencephalogram (EEG) was recorded at 64 scalp sites using tin electrodes mounted in an elastic cap (Brain Product, Munchen, Germany), with a ground electrode placed on the frontal midline and references placed on the left and right mastoids. Vertical electrooculograms (EOGs) were recorded supra-orbitally and infra-orbitally relative to the left eye. The horizontal EOG was recorded as the difference in activity from the right versus the left orbital rim. The impedance of all electrodes was kept below 10 kΩ. The EEG and EOG were amplified using a 0.05–100 Hz bandpass and continuously digitized at 500 Hz/channel for offline analysis. The EEG data were further filtered offline (0.1–40 Hz bandwidth) for ERP analysis. Then, ocular artifacts were corrected using the eye movement correction algorithm described by Gratton et al. (1983). Trials with EOG artifacts (mean EOG voltage exceeding 80 μV) and trials contaminated by artifacts due to amplifier clipping or by peak-to-peak deflections exceeding 80 μV were excluded from averaging. Fewer than 5% of trials were rejected after preprocessing in each of the three conditions.

2.3. Time-frequency analysis

To extract time-frequency information from EEG data associated with feedback stimulus presentation, 2-s epochs centered on feedback onset were extracted from the single-trial data. The EEG epochs were convolved with a complex 7-cycle Morlet wavelet using custom-written Matlab routines (see Marco-Pallares et al., 2008; Hajihosseini and Holroyd, 2013) that implement the method described by Lachaux et al. (1999). Changes in power over time (squared amplitude of the convolution between the signal and the wavelet) in the 1 to 40 Hz frequency range were computed for each single trial and averaged for each subject and condition before creating grand averages across subjects. The relative change in power for each condition was determined by averaging the baseline activity (100 ms prestimulus) across time for each frequency and then subtracting this average from each data point following stimulus presentation for the corresponding frequency. For each subject, the total power was calculated as the average value of time-frequency power across single trials, the evoked power was determined directly from the averaged ERPs, and the induced power was then obtained by subtracting the evoked power from the total power (Hajihosseini and Holroyd, 2013; Behroozmand et al., 2015). For statistical purposes, the mean induced and evoked power was obtained within a 200 ms window following the onset of the feedback stimulus (delta [1–3 Hz], 300–500 ms; theta [4–8 Hz], 200–400 ms; beta-gamma [20–40 Hz], 350–550 ms; cf. Cunillera et al., 2012; HajiHosseini and Holroyd, 2015). All our analyses were restricted to channel FCz.

2.4. ERP analysis

The FRN was quantified by first segmenting the EEG into 800 ms epochs time-locked to the feedback stimulus, including a 200 ms baseline preceding the feedback. After baseline correction, FRN amplitude was evaluated for each participant and feedback condition (positive, negative, neutral) using a base-to-peak algorithm (Holroyd et al., 2006; see also Holroyd et al., 2003; Holroyd et al., 2008), as follows. First, the most positive voltage within a 160 to 260 ms window following feedback presentation was taken as the base of the FRN. Then, the sample with the most negative value within a time window starting from the base of the FRN to 400 ms after feedback presentation was taken as the peak. FRN amplitude was calculated as the difference between these base and peak amplitudes. The algorithm assigned 0 μV where no N200 was detected. FRN values at FCz were analyzed with a one-way analysis of variance (ANOVA).

3. Results

3.1. Trial-to-trial behavioral adjustment

The raw reaction time (RT) does not provide any more information than accuracy in this kind of time estimation task. However, the absolute value of the change in RT (ΔRT) between trial N and trial N + 1 gives a measure of the behavioral adjustment following each of the three types of feedback, and controls for slow changes in performance over time (Dutilh et al., 2012). A one-way ANOVA was conducted on these data with feedback type (three levels) as the independent variable. As shown in Fig. 1A, the main effect of feedback valence was significant, F(2, 34) = 90.9, p < 0.001, η² = 0.85. Pairwise comparisons revealed that ΔRT following negative feedback (230 ± 37 ms) was significantly larger than ΔRT following neutral feedback (178 ± 43 ms), t(16) = 8.95, p < 0.001, and that ΔRT following neutral feedback was significantly larger than ΔRT following positive feedback (144 ± 27 ms), t(16) = 5.05, p < 0.001.

3.2. ERP results

Fig. 1B illustrates the ERPs elicited by the neutral, negative, and positive feedback.
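As a rough illustration of the base-to-peak measure described in Section 2.4, the sketch below applies the same two steps to a single averaged waveform using NumPy. It is not the authors' code: the function name `base_to_peak_frn`, the peak-minus-base sign convention (chosen so that values are negative, as in the reported means), and the choice to return 0 μV when the waveform never drops below its base are assumptions made for illustration.

```python
import numpy as np


def base_to_peak_frn(erp_uv, times_ms, base_window=(160, 260), peak_end_ms=400):
    """Base-to-peak FRN amplitude (in microvolts, <= 0) for one averaged ERP.

    erp_uv   : baseline-corrected voltage at one channel (e.g., FCz)
    times_ms : sample times relative to feedback onset, same length as erp_uv
    """
    erp_uv = np.asarray(erp_uv, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)

    # Step 1: the base is the most positive sample 160-260 ms after feedback.
    in_base = np.flatnonzero(
        (times_ms >= base_window[0]) & (times_ms <= base_window[1])
    )
    base_idx = in_base[np.argmax(erp_uv[in_base])]

    # Step 2: the peak is the most negative sample between the base and 400 ms.
    in_peak = np.flatnonzero(
        (times_ms >= times_ms[base_idx]) & (times_ms <= peak_end_ms)
    )
    peak_idx = in_peak[np.argmin(erp_uv[in_peak])]

    # Assumed sign convention: peak minus base, so a larger FRN is more negative.
    # If the waveform never falls below the base, the result is 0 (no N200).
    return float(min(erp_uv[peak_idx] - erp_uv[base_idx], 0.0))


# Toy waveform sampled every 2 ms (500 Hz), 200 ms baseline, 800 ms epoch.
times = np.arange(-200, 600, 2)
erp = np.zeros(times.size)
erp[times == 200] = 4.0    # positive "base" deflection
erp[times == 280] = -3.0   # negative "peak" deflection
print(base_to_peak_frn(erp, times))  # -7.0
```

Because the peak search starts at the base sample itself, the returned value can never be positive, which is one simple way to reproduce the 0 μV assignment mentioned in the text; the original algorithm may detect a missing N200 differently.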
A one-way ANOVA on FRN amplitude with feedback condition (negative, neutral, positive) as a factor revealed a main effect of feedback,³ F(2, 32) = 21.09, p < 0.001, η² = 0.57. Pairwise comparisons revealed that the FRN following neutral feedback (M = −7.80 μV, SEM = 1.04) was significantly larger than the FRN following both negative feedback (M = −6.23 μV, SEM = 0.94), t(16) = 2.12, p < 0.05, and positive feedback (M = −1.76 μV, SEM = 0.39), t(16) = 5.36, p < 0.001. Consistent with previous research, the FRN following negative feedback was larger than the FRN following positive feedback, t(16) = 4.52, p < 0.001. The difference waves between negative and positive feedback and between neutral and positive feedback, together with the corresponding scalp distributions, are shown in Fig. 1C and Fig. 1D.

³ Note that the pattern of results is the same if the FRN is measured by a mean amplitude approach from 220 to 320 ms: the main effect of valence is significant, F(1.2, 19.9) = 24.08, p < 0.001, η² = 0.60. Pairwise comparisons further showed that neutral feedback elicited a significantly larger FRN (M = 8.63 μV, SEM = 1.4) than negative feedback (M = 10.88 μV, SEM = 1.62, p < 0.01). Negative feedback also elicited a larger FRN than positive feedback (M = 16.46 μV, SEM = 2.21, p < 0.001).

Fig. 1. (A) Mean absolute change in RT across trials for each condition; error bars represent standard error. (B) Grand average ERP at FCz associated with neutral (green line), negative (red line), and positive (blue line) feedback. (C) Difference wave between the neutral and positive conditions (Neutral-Positive DW) and difference wave between the negative and positive conditions (Negative-Positive DW). (D) Scalp distributions of the difference waves. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

3.3. Time-frequency results

3.3.1. Delta (1–3 Hz)

Fig. 2 shows the time-frequency results for both induced and evoked power across each feedback condition. Fig. 3 highlights the effect of feedback on delta activity. A two-way ANOVA on delta activity with Power (induced, evoked) and Feedback (positive, negative, neutral) revealed a main effect of Feedback, F(2, 32) = 8.81, p = 0.001, η² = 0.36. Post-hoc analyses indicated that delta activity overall was characterized by greater power for positive feedback (M = 0.18 dB, SEM = 0.02) than for negative feedback (M = 0.11 dB, SEM = 0.02), t(16) = 3.96, p < 0.001, and neutral feedback (M = 0.10 dB, SEM = 0.02), t(16) = 3.97, p < 0.001. No difference in delta power was observed between negative and neutral feedback, t(16) = 0.55, p = 0.59. The main effect of Power did not reach significance, F(1, 16) < 1, p = 0.38, η² = 0.05. There was no significant interaction between Power and Feedback, F(2, 32) = 2.2, p = 0.13, η² = 0.12.

3.3.2. Theta (4–8 Hz)

As shown in Fig. 2, a two-way ANOVA on theta activity with Power (induced, evoked) and Feedback (positive, negative, neutral) revealed a main effect of Power, F(1, 16) = 9.08, p < 0.01, η² = 0.36, a main effect of Feedback, F(2, 32) = 4.48, p < 0.05, η² = 0.22, and an interaction between Power and Feedback, F(2, 32) = 9.94, p < 0.005, η² = 0.38 (Fig. 2A). Post-hoc analyses indicated that theta was characterized by greater induced power (M = 0.60 dB, SEM = 0.09) than evoked power (M = 0.36 dB, SEM = 0.06), t(16) = 3.01, p < 0.01. With regard to Feedback, theta activity overall was characterized by reduced power for positive feedback (M = 0.41 dB, SEM = 0.07) relative to negative feedback (M = 0.55 dB, SEM = 0.09), t(16) = 2.68, p < 0.02, and neutral feedback (M = 0.48 dB, SEM = 0.07), t(16) = 2.56, p < 0.03.
Theta power following negative feedback did not differ from that following neutral feedback, t(16) = 1.26, p = 0.23. More importantly, there was a significant interaction between Power and Feedback: post-hoc tests indicated that evoked power following positive feedback (M = 0.16 dB, SEM = 0.03) was significantly reduced compared to negative (M = 0.49 dB, SEM = 0.10; t(16) = 3.82, p = 0.002) and neutral (M = 0.42 dB, SEM = 0.06; t(16) = 5.20, p < 0.001) feedback. No difference was observed between evoked power following neutral and negative feedback, t(16) = 1.32, p = 0.21. By contrast, induced power following positive feedback (M = 0.65 dB, SEM = 0.12) was marginally larger than that following neutral feedback (M = 0.54 dB, SEM = 0.09), t(16) = 1.96, p = 0.07. No other differences were detected (p > 0.05).

3.3.3. Beta-gamma (20–40 Hz)

A two-way ANOVA on beta-gamma activity with Power (induced, evoked) and Feedback (positive, negative, neutral) revealed a main effect of Feedback, F(2, 32) = 8.38, p = 0.001, η² = 0.34, and an interaction between Power and Feedback, F(2, 32) = 8.41, p = 0.001, η² = 0.35. The interaction (Fig. 3A and Fig. 3B) indicates that induced power following neutral feedback (M = −0.043 dB, SEM = 0.039) was significantly reduced compared to negative (M = 0.064 dB, SEM = 0.040), t(16) = 2.74, p < 0.02, and positive (M = 0.179 dB, SEM = 0.069), t(16) = 3.47, p = 0.003, feedback. Induced beta-gamma power following positive feedback was larger than that following negative feedback, but the difference was only marginally significant, t(16) = 2.05, p = 0.057. No significant difference was observed between the three feedback conditions for evoked power (all p > 0.05). In addition, the main effect of Power was not significant, F(1, 16) = 2.65, p = 0.12, η² = 0.14. Post-hoc analyses indicated that beta-gamma activity overall was characterized by greater power for positive feedback (M = 0.18 dB, SEM = 0.07) than for negative (M = 0.06 dB, SEM = 0.04), t(16) = 2.11, p = 0.051, and neutral (M = −0.04 dB, SEM = 0.04), t(16) = 3.19, p < 0.005, feedback. Finally, negative feedback elicited a stronger beta-gamma response than neutral feedback, t(16) = 2.65, p < 0.02.

Fig. 2. Time-frequency representations. (A) Induced power following positive (left panel), negative (middle panel) and neutral (right panel) feedback. (B) Evoked power following positive (left panel), negative (middle panel) and neutral (right panel) feedback. All data were recorded at channel FCz.

Fig. 3. The time course of power change in the delta, theta and beta-gamma bands for positive, negative and neutral feedback. (A) The time course of the change in induced delta (left panel), induced theta (middle panel), and induced beta-gamma (right panel) power associated with neutral (green solid line), negative (red solid line), and positive (blue solid line) feedback. (B) The time course of the change in evoked delta (left panel), evoked beta-gamma (middle panel) and evoked theta (right panel) power associated with neutral (green dashed line), negative (red dashed line), and positive (blue dashed line) feedback. All data were recorded at channel FCz.

4. Discussion

The purpose of this study was to apply both a time-frequency and an ERP analysis to the electrophysiological activity elicited by neutral feedback during an RL paradigm.
Our ERP analysis revealed that neutral feedback elicited a larger FRN than negative feedback, replicating some previous work (Huang and Yu, 2014; Yu and Zhou, 2006), but inconsistent with other work (Holroyd et al., 2006). Our time-frequency analysis revealed three oscillatory frequencies that were sensitive to our feedback manipulation: theta, delta, and beta-gamma. The theta and delta bands exhibited a similar pattern of sensitivity to feedback, whereby both frequency bands distinguished positive feedback from each of negative and neutral feedback, but did not distinguish between negative and neutral feedback. This similarity is in line with work from Bernat and colleagues demonstrating that the time-domain FRN is produced by differences in both theta and delta power (Bernat et al., 2011; Bernat et al., 2015). However, we observed differences in the time-domain FRN between neutral and negative feedback, suggesting an additional layer to the FRN that cannot be explained solely by theta and delta power. In contrast to theta and delta activity, beta-gamma activity differentiated neutral feedback from informative feedback, and thus we speculate that the neutral-feedback FRN is produced by an interaction of two effects on EEG oscillations: valence effects (theta and delta), and informative versus uninformative feedback effects (beta-gamma).

Perhaps our most important finding is that induced beta-gamma power dissociated uninformative feedback (i.e., neutral feedback) from informative feedback. Interestingly, the lack of induced beta-gamma to neutral feedback coincided with the pattern of behavioral adjustments following feedback. In particular, subjects did not adjust their behavior systematically following neutral feedback, but tended to keep their response pattern following positive feedback, and made large adjustments in response time following negative feedback.
The beta-gamma range has been associated with behavioral adjustments in other work (e.g., Cunillera et al., 2012; Van de Vijver et al., 2011), and has been shown to be more sensitive to valence in trial-and-error learning tasks that require ongoing adjustments of behavior (such as the time-estimation task) (HajiHosseini and Holroyd, 2015). Notably, it has been suggested that (total) beta-gamma activity represents a “motivational value signal” (HajiHosseini et al., 2012, p. 1683), and that beta-gamma increases are representative of increased attentional resources being applied to particularly important events. In this context, our findings support the idea that induced beta-gamma reflects a manifestation of a motivational value signal that energizes behavioral adjustments when feedback is meaningful, but that is absent when feedback provides no task-relevant information. Therefore, based on previous literature and on our beta-gamma effects, we speculate that induced beta-gamma activity reflects a motivational response to feedback (HajiHosseini and Holroyd, 2015) and active inhibition/disinhibition of motor commands (Cavanagh and Frank, 2014), possibly highlighting a top-down modulatory role during RL.

The nature of induced delta and beta-gamma may provide further insight into the neural computations supporting RL. An influential theory holds that the FRN is produced by the impact on the ACC of phasic increases and decreases in dopamine activity coding for positive and negative reward prediction error (RPE) signals (RPE-ACC theory; Holroyd and Coles, 2002; Holroyd et al., 2008). RPEs constitute the learning term in powerful reinforcement learning algorithms, indicating when events are “better” or “worse” than expected (Sutton and Barto, 1998), and substantial evidence over the past decade has confirmed that the FRN reflects an RPE signal (for reviews see Walsh and Anderson, 2012; Sambrook and Goslin, 2015). The RPE-ACC theory holds that FRN amplitude is regulated up and down by dopamine RPE signals conveyed to the ACC. In particular, positive dopamine RPE signals conveyed to the ACC following unexpected positive feedback suppress the production of the FRN, whereas negative dopamine RPE signals enhance FRN amplitude (Holroyd et al., 2008; see Proudfit, 2015 for review). The FRN has been shown to categorically distinguish between positive RPEs and negative RPEs, showing a more negative voltage for the latter. The RPE-ACC theory requires the FRN to show two further properties beyond this categorical distinction. That is, the FRN should be sensitive to both the prior likelihood of reward (likely, unlikely) and outcome magnitude (how much better or worse than the expected value an outcome is). With this in mind, the larger FRN following neutral feedback suggests that the size of the negative RPE varied as a function of outcome magnitude, outcome likelihood, or possibly both. With regard to an effect of magnitude, it is possible that participants came to view neutral feedback as “much” worse than expected because neutral feedback gave no useable information. Alternatively, it is also possible that feedback likelihood modulated the negative RPE size following neutral feedback, resulting in a larger FRN amplitude. Because of the dynamics of trial-and-error learning tasks, subjects may have come to expect positive or negative feedback to follow their responses and categorized uninformative feedback as a relatively rare event (1/3 of trials vs. 2/3 of trials giving informative feedback).

We note two potential limitations of our study. First, our analysis was based on data from 17 subjects. Though this sample size is on par with, and even exceeds, that of most ERP studies, it is small relative to the recent trend of increasing sample sizes in ERP research.
Second, our neutral feedback stimulus was visually distinctive compared to the feedback in the positive and negative conditions, which could provoke a low-level salience reaction in our subjects that would not be seen in response to negative or positive feedback. That being said, the effects we report were exhibited later in the EEG than components typically associated with low-level physical features of stimuli, such as the N1 (Luck et al., 1990).
5. Conclusion
To the best of our knowledge, time-frequency and time-domain analyses have yet to be used together to investigate neutral operants during RL. Using neutral feedback during RL, we present further evidence of the role of induced beta-gamma oscillations as a motivational signal for behavioral adjustment (HajiHosseini et al., 2012), and argue that the FRN cannot be explained as an effect of theta oscillatory activity alone (Cavanagh and Frank, 2014). Our results suggest that the FRN also contains power from other frequency bands, including delta (Bernat et al., 2011; Bernat et al., 2015) and beta-gamma. Lastly, these findings motivate further study of the role of electrophysiological responses to neutral feedback in understanding individual differences (e.g. Hirsh and Inzlicht, 2008; Gu et al., 2010; Li et al., 2015) and psychopathology (e.g. Proudfit, 2015; Baker et al., 2011; Morris et al., 2008) associated with RL. For instance, according to the “aberrant salience” hypothesis, schizophrenia patients attribute salience to otherwise neutral environmental stimuli, and those stimuli may ultimately appear meaningful and evoke delusional mood in patients (for review, see Deserno et al., 2013). These issues are ripe for future investigation.
Acknowledgment
This work was supported by the National Natural Science Foundation of China (NSFC31300872&81171289), and the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (13YJC190013).
References
Baker, T.E., Stockwell, T., Barnes, G., Holroyd, C.B., 2011. Individual differences in substance dependence: at the intersection of brain, behaviour and cognition. Addict. Biol. 16, 458–466.
Behroozmand, R., Ibrahim, N., Korzyukov, O., Robin, D.A., Larson, C.R., 2015. Functional role of delta and theta band oscillations for auditory feedback processing during vocal pitch motor control. Front. Neurosci. 9.
Bernat, E.M., Malone, S.M., Williams, W.J., Patrick, C.J., Iacono, W.G., 2007. Decomposing delta, theta, and alpha time–frequency ERP activity from a visual oddball task using PCA. Int. J. Psychophysiol. 64 (1), 62–74.
Bernat, E.M., Nelson, L.D., Steele, V.R., Gehring, W.J., Patrick, C.J., 2011. Externalizing psychopathology and gain–loss feedback in a simulated gambling task: dissociable components of brain response revealed by time-frequency analysis. J. Abnorm. Psychol. 120 (2), 352–364.
Bernat, E.M., Nelson, L.D., Baskin-Sommers, A.R., 2015. Time-frequency theta and delta measures index separable components of feedback processing in a gambling task. Psychophysiology 52 (5), 626–637.
Catania, A.C., 1999. Thorndike's legacy: learning, selection, and the law of effect. J. Exp. Anal. Behav. 72 (3), 425–428.
Cavanagh, J.F., 2015. Cortical delta activity reflects reward prediction error and related behavioral adjustments, but at different times. NeuroImage 110, 205–216.
Cavanagh, J.F., Frank, M.J., 2014. Frontal theta as a mechanism for cognitive control. Trends Cogn. Sci. 18 (8), 414–421.
Cavanagh, J.F., Frank, M.J., Klein, T.J., Allen, J.J., 2010. Frontal theta links prediction errors to behavioral adaptation in reinforcement learning. NeuroImage 49 (4), 3198–3209.
Cavanagh, J.F., Figueroa, C.M., Cohen, M.X., Frank, M.J., 2012. Frontal theta reflects uncertainty and unexpectedness during exploration and exploitation. Cereb. Cortex 22, 2575–2586.
Chen, C.C., Kiebel, S.J., Kilner, J.M., Ward, N.S., Stephan, K.E., Wang, W.J., Friston, K.J., 2012. A dynamic causal model for evoked and induced responses. NeuroImage 59 (1), 340–348.
Cohen, M.X., Elger, C.E., Ranganath, C., 2007. Reward expectation modulates feedback-related negativity and EEG spectra. NeuroImage 35, 968–978.
Cunillera, T., Fuentemilla, L., Periañez, J., Marco-Pallarès, J., Krämer, U.M., Càmara, E., ... Rodríguez-Fornells, A., 2012. Brain oscillatory activity associated with task switching and feedback processing. Cogn. Affect. Behav. Neurosci. 12 (1), 16–33.
Deserno, L., Boehme, R., Heinz, A., Schlagenhauf, F., 2013. Reinforcement learning and dopamine in schizophrenia: dimensions of symptoms or specific features of a disease group? Front. Psychiatry 4, 1–16.
Dutilh, G., Ravenzwaaij, D.V., Nieuwenhuis, S., Han, L.J.V.D.M., Forstmann, B.U., Wagenmakers, E.J., 2012. How to measure post-error slowing: a confound and a simple solution. J. Math. Psychol. 56 (3), 208–216.
Gilbert, C.D., Sigman, M., 2007. Brain states: top-down influences in sensory processing. Neuron 54 (5), 677–696.
Gratton, G., Coles, M.G., Donchin, E., 1983. A new method for offline removal of ocular artifact. Electroencephalogr. Clin. Neurophysiol. 55, 468–484.
Gu, R., Ge, Y., Jiang, Y., Luo, Y.J., 2010. Anxiety and outcome evaluation: the good, the bad and the ambiguous. Biol. Psychol. 85, 200–206.
HajiHosseini, A., Holroyd, C.B., 2013. Frontal midline theta and N200 amplitude reflect complementary information about expectancy and outcome evaluation. Psychophysiology 50, 550–562.
HajiHosseini, A., Holroyd, C.B., 2015. Sensitivity of frontal beta oscillations to reward valence but not probability. Neurosci. Lett. 602, 99–103.
HajiHosseini, A., Rodríguez-Fornells, A., Marco-Pallarés, J., 2012. The role of beta-gamma oscillations in unexpected rewards processing. NeuroImage 60 (3), 1678–1685.
Hirsh, J.B., Inzlicht, M., 2008. The devil you know: neuroticism predicts neural response to uncertainty. Psychol. Sci. 19, 962–967.
Holroyd, C.B., Coles, M.G.H., 2002. The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychol. Rev. 109, 679–709.
Holroyd, C.B., Yeung, N., 2012. Motivation of extended behaviors by anterior cingulate cortex. Trends Cogn. Sci. 16 (2), 122–128.
Holroyd, C.B., Nieuwenhuis, S., Yeung, N., Cohen, J.D., 2003. Errors in reward prediction are reflected in the event-related brain potential. Neuroreport 14 (18), 2481–2484.
Holroyd, C.B., Hajcak, G., Larsen, J.T., 2006. The good, the bad and the neutral: electrophysiological responses to feedback stimuli. Brain Res. 1105, 93–101.
Holroyd, C.B., Pakzad-Vaezi, K.L., Krigolson, O.E., 2008. The feedback correct-related positivity: sensitivity of the event-related brain potential to unexpected positive feedback. Psychophysiology 45 (5), 688–697.
Holroyd, C.B., HajiHosseini, A., Baker, T.E., 2012. ERPs and EEG oscillations, best friends forever: comment on Cohen et al. Trends Cogn. Sci. 16 (4), 192.
Hsieh, L.T., Ranganath, C., 2014. Frontal midline theta oscillations during working memory maintenance and episodic encoding and retrieval. NeuroImage 85, 721–729.
Huang, Y., Yu, R., 2014. The feedback-related negativity reflects “more or less” prediction error in appetitive and aversive conditions. Front. Neurosci. 8.
Kujawa, A., Smith, E., Luhmann, C., Hajcak, G., 2013. The feedback negativity reflects favorable compared to nonfavorable outcomes based on global, not local, alternatives. Psychophysiology 50 (2), 134–138.
Lachaux, J.P., Rodriguez, E., Martinerie, J., Varela, F.J., 1999. Measuring phase synchrony in brain signals. Hum. Brain Mapp. 8, 194–208.
Lee, K.H., Williams, L.M., Breakspear, M., Gordon, E., 2003. Synchronous gamma activity: a review and contribution to an integrative neuroscience model of schizophrenia. Brain Res. Rev. 41 (1), 57–78.
Li, P., Song, X., Wang, J., Zhou, X., Li, J., Lin, F., et al., 2015. Reduced sensitivity to neutral feedback versus negative feedback in subjects with mild depression: evidence from event-related potentials study. Brain Cogn. 100, 15–20.
Luck, S.J., Heinze, H.J., Mangun, G.R., Hillyard, S.A., 1990. Visual event-related potentials index focused attention within bilateral stimulus arrays. II. Functional dissociation of P1 and N1 components. Electroencephalogr. Clin. Neurophysiol. 75 (6), 528–542.
Marco-Pallares, J., Cucurell, D., Cunillera, T., García, R., Andrés-Pueyo, A., Münte, T.F., Rodríguez-Fornells, A., 2008. Human oscillatory activity associated to reward processing in a gambling task. Neuropsychologia 46, 241–248.
Miltner, W.H.R., Braun, C.H., Coles, M.G.H., 1997. Event-related brain potentials following incorrect feedback in a time-estimation task: evidence for a “generic” neural system for error detection. J. Cogn. Neurosci. 9, 788–798.
Mitchell, D.J., McNaughton, N., Flanagan, D., Kirk, I.J., 2008. Frontal-midline theta from the perspective of hippocampal “theta”. Prog. Neurobiol. 86 (3), 156–185.
Morris, S.E., Heerey, E.A., Gold, J.M., Holroyd, C.B., 2008. Learning-related changes in brain activity following errors and performance feedback in schizophrenia. Schizophr. Res. 99, 274–285.
Müller, S.V., Möller, J., Rodriguez-Fornells, A., Münte, T.F., 2005. Brain potentials related to self-generated and external information used for performance monitoring. Clin. Neurophysiol. 116, 63–74.
Proudfit, G.H., 2015. The Reward Positivity: From Basic Research on Reward to a Biomarker for Depression (Psychophysiology).
Sambrook, T.D., Goslin, J., 2015. A neural reward prediction error revealed by a meta-analysis of ERPs using great grand averages. Psychol. Bull. 141 (1), 213–235.
Skinner, B.F., 1938. The Behavior of Organisms: An Experimental Analysis. Appleton-Century, New York.
Sutton, R.S., Barto, A.G., 1998. Reinforcement Learning: An Introduction (Vol. 1, No. 1). MIT Press, Cambridge.
Tallon-Baudry, C., Bertrand, O., 1999. Oscillatory gamma activity in humans and its role in object representation. Trends Cogn. Sci. 3, 151–162.
Van de Vijver, I., Ridderinkhof, K.R., Cohen, M.X., 2011. Frontal oscillatory dynamics predict feedback learning and action adjustment. J. Cogn. Neurosci. 23 (12), 4106–4121.
Walsh, M.M., Anderson, J.R., 2012. Learning from experience: event-related potential correlates of reward processing, neural adaptation, and behavioral choice. Neurosci. Biobehav. Rev. 36 (8), 1870–1884.
Warren, C.M., Holroyd, C.B., 2012. The impact of deliberative strategy dissociates ERP components related to conflict processing vs. reinforcement learning. Front. Neurosci. 6 (43), 1–17.
Weinberg, A., Riesel, A., Proudfit, G.H., 2014. Show me the money: the impact of actual rewards and losses on the feedback negativity. Brain Cogn. 87, 134–139.
Yeung, N., Bogacz, R., Holroyd, C.B., Cohen, J.D., 2004. Detection of synchronized oscillations in the electroencephalogram: an evaluation of methods. Psychophysiology 41 (6), 822–832.
Yu, R., Zhou, X., 2006. Brain potentials associated with outcome expectation and outcome evaluation. Neuroreport 17, 1649.