Behavioural Brain Research 291 (2015) 147–154
Contents lists available at ScienceDirect. Behavioural Brain Research journal homepage: www.elsevier.com/locate/bbr

Research report

Drift diffusion model of reward and punishment learning in schizophrenia: Modeling and experimental data

Ahmed A. Moustafa a,∗, Szabolcs Kéri b,c,d, Zsuzsanna Somlai e, Tarryn Balsdon a, Dorota Frydecka f, Blazej Misiak f,g, Corey White h

a School of Social Sciences and Psychology, Marcs Institute for Brain and Behaviour, University of Western Sydney, Penrith, NSW, Australia
b Nyírő Gyula Hospital–National Institute of Psychiatry and Addictions, Budapest, Hungary
c University of Szeged, Faculty of Medicine, Department of Physiology, Szeged, Hungary
d Budapest University of Technology and Economics, Department of Cognitive Science, Hungary
e Semmelweis University, Department of Psychiatry and Psychotherapy, Budapest, Hungary
f Wroclaw Medical University, Department and Clinic of Psychiatry, Wroclaw, Poland
g Wroclaw Medical University, Department of Genetics, Wroclaw, Poland
h Department of Psychology, Syracuse University, Syracuse, NY, USA

Highlights
• It is the first drift diffusion model of behavioral data from schizophrenia patients.
• Unlike controls, schizophrenia patients show punishment learning deficits.
• Schizophrenia patients show slow motor/encoding time.
• Unlike controls, schizophrenia patients use a strategy favoring accuracy over speed.

Article history: Received 22 January 2015; Received in revised form 5 May 2015; Accepted 13 May 2015; Available online 22 May 2015

Keywords: Schizophrenia; Reinforcement learning; Decision making; Reward; Punishment; Drift diffusion model (DDM)

Abstract

In this study, we tested reward- and punishment-learning performance using a probabilistic classification learning task in patients with schizophrenia (n = 37) and healthy controls (n = 48).
We also fit subjects’ data using a drift diffusion model (DDM) of simple decisions to investigate which components of the decision process differ between patients and controls. Modeling results show between-group differences in multiple components of the decision process. Specifically, patients had slower motor/encoding time, higher response caution (favoring accuracy over speed), and a deficit in classification learning for punishment, but not reward, trials. The results suggest that patients with schizophrenia adopt a compensatory strategy of favoring accuracy over speed to improve performance, yet still show signs of a deficit in learning based on negative feedback. Our data highlight the importance of fitting models (particularly drift diffusion models) to behavioral data. The implications of these findings are discussed relative to theories of schizophrenia and cognitive processing.
© 2015 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Department of Veterans Affairs, New Jersey Health Care System, East Orange, New Jersey, United States of America. E-mail address: [email protected] (A.A. Moustafa). http://dx.doi.org/10.1016/j.bbr.2015.05.024

1. Introduction

International diagnostic systems classify schizophrenia (SZ) as a psychotic disorder, with several positive symptoms such as delusions and hallucinations, as well as negative symptoms, such as affective flattening, alogia, or avolition. However, cognitive deficits are also increasingly recognized as the core component of SZ symptomatology. These deficits occur in multiple domains of cognitive functioning, with moderate to large effect sizes for impairments across memory, motor performance, attention, IQ, executive function, and working or verbal memory, compared to controls [1]. Notably, these deficits precede the onset of overt psychosis and are a risk factor for the onset of SZ [2].
Several lines of evidence also indicate that cognitive impairment may predict functional outcomes, such as self-care, community functioning, and social problem solving, and, furthermore, that cognitive impairment may be a better predictor of these outcomes than psychotic symptoms [3,4]. Research has indicated that SZ patients show learning and decision-making deficits, especially in the context of rewards and punishment. A deficit in updating the expected value of choices, especially loss, and disruption in the associative learning underlying the representation of expectancies have been shown in the Iowa Gambling Task (IGT) [5–7], the Monetary Incentive Delay task [8], the Wisconsin Card Sorting Task (WCST) [9], delayed reward discounting, and reinforcement learning paradigms (Gold et al. [19]). However, the deficit does not present in the same manner as the loss insensitivity seen after orbitofrontal cortex lesions: in the WCST, patients do not always select significantly less from advantageous decks (as seen in patients with orbitofrontal cortex lesions [10]). Rather, patients make more perseveration errors, indicating a role of learning or, as Shurman et al. suggest, working memory (see also [11,12]). Furthermore, imaging studies have revealed reduced activity in the ventral striatum during the anticipation of gain or loss compared with normal controls [13,14], and reduced error-related negativity (ERN) in probabilistic learning tasks (Morris and co-workers), indicating an underlying deficit in signaling prediction errors for value-based learning and decision making. However, there are also inconsistent results. For instance, Morris et al. [15] later found that although response-related ERN was reduced in SZ patients compared to controls, their feedback-related ERN was intact. Furthermore, whilst Polli et al.
[16] found that SZ patients could immediately correct their errors in an antisaccade task, a later replication of this result accompanied by fMRI [16] showed reduced error-related activity in both dorsal and rostral anterior cingulate cortex (ACC) (even once medication was taken into account), which has been associated with perseveration errors. Some studies have found little to no effect of reward. For instance, Waltz et al. [17] found that whilst SZ patients showed reduced activity for reward in a passive conditioning task, they showed intact responses to unexpected reward omissions, a finding supported by a further conditioning experiment by Dowd and Barch [18]. Of note, these experiments did not require participants to make value-based decisions. The differences among these results, and even among behavioral studies, emphasize the need for a more nuanced understanding: one cannot simply compare a saccade task with a reward learning task or a conditioning task. Clearly there are instances in which SZ patients show significantly different behavior and neural responses from normal controls, and it is important to understand the nuances of what might be driving these differences in some cases, and why they are absent in others. In an attempt to account for abnormalities in reward learning, Gold et al. [19] compared the performance of SZ patients and normal controls on a variety of tasks, including International Affective Picture System ratings, delayed reward discounting, the Wisconsin Card Sorting Task, rapid reversal learning, and reinforcement learning paradigms. Their results indicated that processing deficits may be explained by an inability to fully represent value. Gold et al. [19] linked this internal representational difference to differences between patients’ pleasantness ratings when asked to imagine a scene and when shown the scene as a picture.
Patients displayed normal positive emotion when presented with visual stimuli, but gave poor pleasantness ratings when asked to imagine a scene, further suggesting that reports of anhedonia in SZ patients may come down to testing procedures that require a level of value representation that is impaired in patients. A further avenue for discerning the nature of cognitive impairments in SZ is to deconstruct the decision-making process and dissociate at what stage SZ patients differ from normal controls. In this study, we applied a drift diffusion model (DDM) to behavioral data from patients with SZ to understand the information-processing mechanisms of impaired learning and decision making. We hypothesized that SZ patients would be impaired at learning from negative feedback, as suggested by prior studies, although here we use a DDM that takes into account both accuracy and reaction time. Further, given prior studies and observations of general slowness and motor impairments in SZ patients (see [49]), we hypothesized that patients would show a combination of increased response caution (favoring accuracy over speed) and slower motor execution time in comparison to controls.

1.1. Drift diffusion models

When comparing task performance between groups, it is important to note that multiple decision components could differ among participants. For example, slower responses in the SZ group could reflect a difference in response caution rather than a deficit in reward learning. In these situations, reaction time models like the drift diffusion model (DDM) [20] can be fitted to data to circumvent this problem. Notably, the DDM includes parameters that can be mapped onto psychological constructs, allowing researchers to compare the intactness or disruption of different decision components in ways not possible with behavioral data alone.
Because the DDM is mathematically specified, it makes precise predictions about how the different components relate to reaction time and accuracy. Importantly, this process can be inverted, whereby observed behavioral data are fitted with the DDM to estimate the values of the decision components driving the behavior. This technique has been widely applied to investigate processing differences across a range of domains [21,22] (White et al.). By estimating the values of the decision components for each participant, researchers can make group comparisons of these psychologically meaningful parameters. There are two main advantages of a DDM analysis over traditional RT or accuracy comparisons: increased specificity and increased sensitivity. For specificity, the DDM allows identification of which decision components account for behavioral performance in the task. For example, slower RT could be due to slow motor response (non-decision time), increased caution (boundary separation), and/or poorer task performance (drift rates). The model allows these components to be separately estimated to disentangle how they contribute to the observed behaviour. For sensitivity, past work has shown that DDM parameters are more sensitive to small differences than RT or accuracy; several studies have shown that DDM parameters can detect differences that are not significant in the behavioural data (see [23,24]). This is because the DDM controls for the effects of each decision component, meaning that any differences in response caution or bias are controlled for when estimating task performance (i.e., drift rates; see [24,25]). For example, imagine that Participant A has poorer learning ability than Participant B, but is more cautious when responding. This could lead to equivalent accuracy between them, as the lower accuracy from poor learning is offset by the higher accuracy from increased caution.
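This confound can be illustrated with a toy accumulation simulation. The sketch below is purely illustrative: the step probabilities and decision thresholds are invented for the example and are not parameters from this study.

```python
import random

def simulate(p_step, threshold, n_trials, rng):
    """Biased random walk toward a decision: each step moves +1 toward the
    correct response with probability p_step, otherwise -1; a response is
    made when the tally reaches +threshold (correct) or -threshold (error).
    A higher threshold models caution; a higher p_step models better
    learning. Returns (accuracy, mean steps per decision)."""
    n_correct, total_steps = 0, 0
    for _ in range(n_trials):
        x = 0
        while abs(x) < threshold:
            x += 1 if rng.random() < p_step else -1
            total_steps += 1
        if x >= threshold:
            n_correct += 1
    return n_correct / n_trials, total_steps / n_trials

rng = random.Random(42)
# Participant A: weaker evidence per step, but a higher (cautious) threshold
acc_a, steps_a = simulate(0.55, 15, 2000, rng)
# Participant B: stronger evidence per step, but a lower (hasty) threshold
acc_b, steps_b = simulate(0.60, 8, 2000, rng)
```

With these settings both simulated participants land near 95% accuracy, yet the cautious, poorer learner needs roughly three to four times as many evidence samples per decision, so accuracy alone hides the underlying difference.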
Thus, comparing accuracy values alone is insufficient to detect processing differences between them. In contrast, the DDM approach circumvents this problem because it estimates multiple components of the process simultaneously, allowing the conclusion that the participants differ in both caution and learning. In this regard, the DDM provides a principled method for comparing different aspects of the decision process between SZ patients and controls. This DDM approach was employed in the present study to investigate which components differ between SZ patients and controls in the reward and punishment learning task.

Table 1
Demographic and clinical characteristics of the participants. Data are expressed as mean (SD). The last row shows the antipsychotic regimen of the patients with schizophrenia in our sample; the numbers in parentheses are the numbers of patients administered the corresponding antipsychotic medication.

                              SZ (n = 37)    CONT (n = 48)   Statistical difference
M/F                           15/22          18/30           chi-square = 0.04, d.f. = 1 (p = 0.85)
Age (years)                   36.8 (10.2)    37.4 (11.0)     t = 0.34, d.f. = 83 (p = 0.74)
Education (years)             12.6 (2.5)     12.8 (3.0)      t = 0.27, d.f. = 83 (p = 0.79)
Duration of illness (years)   8.3 (6.6)
Number of episodes            5.4 (4.7)
GAF                           55.0 (19.3)
PANSS positive                12.6 (4.8)
PANSS negative                15.9 (6.9)
PANSS general                 27.3 (8.1)
Medications                   clozapine (3), olanzapine (11), risperidone (8), aripiprazole (2), amisulpride (2), haloperidol (2), flupenthixol (4), clozapine + haloperidol (2), quetiapine + risperidone (3)

2. Methods

The study was approved by the local ethics board. After a complete description of the study, written informed consent was obtained. Participants were 37 patients with schizophrenia and 48 healthy control volunteers with no psychiatric history. The patients were recruited at the Department of Psychiatry and Psychotherapy (Semmelweis University, Budapest, Hungary).
The patients participated in a psychosocial rehabilitation program and were not in an acute psychotic state at the time of testing. The control volunteers were hospital or university employees and their acquaintances, matched to the patients for age, gender, and education (Table 1, all p’s > 0.05). A diagnosis of schizophrenia was based on the DSM-IV criteria [26]. All participants received the Mini International Neuropsychiatric Interview Plus (MINI-Plus) [27]. Detailed medical records were available for all patients. Subjects with substance or alcohol use disorders were excluded from the study. General functioning was assessed with the Global Assessment of Functioning (GAF) scale [26]. Clinical symptoms were evaluated with the Positive and Negative Syndrome Scale (PANSS) [28] (Table 1). These scales were administered by trained clinicians (Z.S. and S.K.) who were blind to the reward- and punishment-learning data at the time of clinical assessment (inter-rater reliability: Cohen’s kappa and inter-rater correlation r > 0.7). Assessment of the patients was based on individual interviews with the patients and with one of their family members. Patients and controls were matched for tobacco smoking (30% of participants were heavy smokers in both groups) because smoking may influence reward learning [29]. The antipsychotic medications used by the patients are shown in Table 1. The average daily chlorpromazine-equivalent antipsychotic dose was 378.6 mg (S.D. = 236.0) [30].

2.1. Reward vs. punishment learning task

We used the same task as employed in prior studies [31–35]. On each trial, participants viewed one of four images (S1–S4) (Fig. 1) and were asked to guess whether it belonged to category A or category B. Stimuli S1 and S3 belonged to category A with 80% probability and to category B with 20% probability, while stimuli S2 and S4 belonged to category B with 80% probability and to category A with 20% probability.
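These category contingencies, together with the feedback rules described next, can be sketched in a few lines. This is a minimal illustration of the task logic, not the original SuperCard implementation; `None` is used here to encode the ambiguous no-feedback outcome.

```python
import random

# P(true category is "A") for each stimulus: S1/S3 are "A" with 80%
# probability, S2/S4 with 20% (i.e., "B" with 80%).
CATEGORY_A_PROB = {"S1": 0.8, "S2": 0.2, "S3": 0.8, "S4": 0.2}

def sample_category(stimulus, rng=random):
    """Draw the trial's true category for the given stimulus."""
    return "A" if rng.random() < CATEGORY_A_PROB[stimulus] else "B"

def feedback(stimulus, guess, category):
    """Reward stimuli (S1, S2): +25 points for a correct guess, no feedback
    otherwise. Punishment stimuli (S3, S4): -25 points for an incorrect
    guess, no feedback otherwise."""
    correct = guess == category
    if stimulus in ("S1", "S2"):
        return 25 if correct else None
    return None if correct else -25
```

Note that `feedback` returns `None` both for an unrewarded guess on S1/S2 and for an unpunished guess on S3/S4, mirroring the ambiguity of the no-feedback outcome in the task.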
Stimuli S1 and S2 were used in the reward-learning task: if the participant correctly guessed category membership on a trial with either of these stimuli, a reward of +25 points was received; if the participant guessed incorrectly, no feedback appeared. Stimuli S3 and S4 were used in the punishment-learning task: if the participant guessed incorrectly on a trial with either of these stimuli, a punishment of −25 points was received; correct guesses received no feedback. The experiment was conducted on a Macintosh iBook, programmed in the SuperCard language (Allegiant Technologies, San Diego, CA). The participant was seated in a quiet testing room at a comfortable viewing distance from the screen. The keyboard was masked except for two keys, labelled “A” and “B”, which the participant could use to enter responses. Before the experiment, the participant received the following instructions: “In this experiment, you will be shown pictures, and you will guess whether those pictures belong to category “A” or category “B”. A picture does not always belong to the same category each time you see it. If you guess correctly, you may win points. If you guess wrong, you may lose points. You will see a running total of your points as you play. We will start you off with a few points now. Press the mouse button to begin practice.” In the practice phase, the participant received sample trials from the punishment- and reward-learning tasks. The practice stimuli were not included in the experiment. The participant saw a practice image, with a prompt to choose category A or B, and a running tally of points at the lower right corner of the screen. The tally was initialized to 500 points at the start of practice. The participant was first instructed to press the “A” key, which resulted in a punishment of −25 points and an updated point tally, and then the “B” key, which resulted in no feedback.
The participant then saw a second practice figure and was instructed first to press the “B” key, which resulted in a reward of +25 points and an updated point tally, and then the “A” key, which resulted in no feedback. After these two practice trials, a summary of instructions appeared: “So . . . For some pictures, if you guess CORRECTLY, you WIN points (but, if you guess incorrectly, you win nothing). For other pictures, if you guess INCORRECTLY, you LOSE points (but, if you guess correctly, you lose nothing). Your job is to win all the points you can – and lose as few as you can. Remember that the same picture does not always belong to the same category. Press the mouse button to begin the experiment.” From here, the experiment began. On each trial, the participant saw one of the four stimuli (S1, S2, S3, or S4) and was prompted to guess whether it was an “A” or a “B”. On trials in the reward-learning task (with stimuli S1 or S2), correct answers were rewarded with positive feedback and a gain of 25 points; incorrect answers received no feedback. On trials in the punishment-learning task (with stimuli S3 or S4), incorrect answers were punished with negative feedback and a loss of 25 points; correct answers received no feedback. The no-feedback outcome was thus ambiguous, as it could signal lack of reward (if received during a trial with S1 or S2) or lack of punishment (if received during a trial with S3 or S4). The task contained 160 trials, divided into 4 blocks of 40 trials each. Within a block, trial order was randomized. Training on the reward-learning task (S1 and S2) and the punishment-learning task (S3 and S4) was intermixed. Within each block, each stimulus appeared 10 times: 8 times with the more common outcome (e.g. category “A” for S1 and S3 and “B” for S2 and S4) and 2 times with the less common outcome. At the end of the 160 trials, if the participant’s running tally of points was less than 525 (i.e. no more than the 500 points awarded
at the start of the experiment), additional trials were added on which the participant’s response was always taken as correct, until the tally was at least 525. This was done to minimize frustration by ensuring that all participants ended the experiment with more points than they had started with. Data from any such additional trials were not analyzed. The following timing parameters were used: the response window remained open until the participant responded, which allowed reaction time differences between the groups and task conditions to be investigated, as is important for DDM analyses; trials were separated by 1-s intervals; and feedback was displayed for 1 s after the response. The task duration was about 12–15 min. On each trial, the computer recorded whether the participant made the optimal response (i.e. category A for S1 and S3, and category B for S2 and S4), regardless of actual outcome.

Fig. 1. Screen shot of the reward-punishment task used in the present study. (A) On each trial, the participant saw one of four abstract shapes and was asked whether this shape belonged to category A or B. (B) For some stimuli (S1 and S2), correct responses were rewarded with visual feedback and a gain of 25 points in 80% of the trials, whereas for other stimuli (S3 and S4), incorrect responses were punished with a loss of 25 points in 80% of the trials (see text for more details).

2.2. Drift diffusion model

The DDM belongs to a class of evidence accumulation models that posit that simple decisions involve the gradual accumulation of noisy evidence until a criterial amount is reached. In the model, the decision process starts between two boundaries that correspond to the response options (Fig. 2). Over time, noisy evidence from the stimulus is sampled and accumulated until the process reaches a boundary, signaling the commitment to that response.
The time taken to reach the boundary corresponds to the decision time, and the overall response time is given by the decision time plus residual non-decision time. Non-decision time in the model (Ter) accounts for the duration of processes outside the decision itself, namely encoding of the stimulus and execution of the motor response. In addition to the non-decision time component, the DDM has three primary components that affect decision processing. The distance between the two boundaries (a) gives an index of response caution, or speed/accuracy settings. A wide boundary separation means that more evidence must be sampled to reach a boundary, so responses will be slower; but the decision process is also less likely to reach the wrong boundary due to noisy evidence, so responses are simultaneously more accurate. Thus, boundary separation indicates how much evidence is required before committing to a response and provides a measure of the speed/accuracy trade-off. The starting point of evidence accumulation (z) indicates a response bias for one option over the other. If the starting point is closer to one boundary, less evidence is required to reach that decision than the alternative. Thus, if the starting point is closer to boundary A, responses for Option A will be more probable and faster than for Option B. Finally, the drift rate (v) gives an index of the direction and strength of the stimulus evidence driving the accumulation process. Positive values of drift rate indicate evidence for Option A and negative values indicate evidence for Option B. Further, a large absolute value of drift rate indicates very strong evidence for that option, which results in fast responses and a high probability of choosing that option. The drift rate is tied to the task at hand; in this case, it indicates how well the participant has learned to correctly classify the stimuli after learning the reward and punishment contingencies.
A DDM was fitted to each participant’s behavioral data using the chi-square method [36]. The .1, .3, .5, .7, and .9 quantiles of the reaction time distribution were calculated for both correct and error responses to represent the shape of the distributions. These quantiles were entered into the fitting routine along with the choice probabilities. The fitting routine then uses a simplex algorithm [37] to adjust the parameter values and find those that provide the closest match to the observed data (by minimizing the chi-square value). This process allows the different decision components of the DDM to be estimated. A standard DDM was used in this study, with the following parameters estimated in the fitting routine: boundary separation (a), non-decision time (Ter), starting point (z), and drift rates (v) for each condition. The DDM often also includes variability parameters to account for the fact that these values can vary from trial to trial (see [38]). However, these variability parameters are not well estimated when there are limited observations in the data (here, 160), so they were excluded when fitting the model. The relatively low number of observations in each learning block (40) also precludes fitting the model to the learning blocks separately to assess how the parameters change over the course of learning. That is, the 40 observations (10 per condition) in a block are not sufficient to accurately estimate DDM parameters. Thus, the model was fit to the overall data to investigate broad-level differences in performance between patients and controls. To provide a thorough account of the data and statistical results, each between-group comparison is presented with the t-value, 95% confidence interval, and Bayes Factor (BF). The latter was derived using an online package for a Bayesian t-test [50], with the effect size set at .5 (small to moderate effect). Calculating BF
from these tests provides the relative evidence for or against the null hypothesis (i.e., no difference between groups). For example, a BF of 3 in favor of the alternative hypothesis indicates that the alternative hypothesis (effect size of .5) is 3 times more likely than the null hypothesis given the data. In general, a BF of less than 3 indicates weak evidence, a BF of 3–8 indicates moderate evidence, and a BF > 10 indicates strong evidence. It should be noted that the BF has the advantages of quantifying the strength of evidence and permitting evidence for the null hypothesis, the latter of which is not possible with traditional frequentist statistics.

Fig. 2. Schematic of the drift diffusion model (DDM). See text for details.

3. Results

3.1. Behavioral results

The choice probabilities and reaction time quantiles, averaged across all four blocks of the experiment, are shown in Fig. 3. In the left panel, the choice probabilities show that both groups learned to differentiate the stimuli, as S1 and S3 were more likely to be categorized as Response A than S2 and S4. In brief, there were no significant differences in response proportions between patients and controls (all p’s > .4). The right panel shows the reaction time quantiles collapsed across all responses. As the figure shows, SZ patients were significantly slower than controls (median reaction time: t(83) = 2.73, p = .008, 95% CI: [85.4, 542.3], Cohen’s d = .599). The resulting BF was 5.8 in favor of the alternative, indicating moderate evidence for a between-group difference in RT. These behavioral data are interpreted below through the DDM parameters.

3.2. Computational results

An important consideration when using the DDM to decompose data is that the estimated parameter values are only interpretable if the model successfully fits the data. To assess this, the predicted data from the best-fitting parameters are plotted against the observed data in Fig. 4.
The primary criterion for assessing model fit in this situation is whether the model predictions match up with the observed behavioral data. The figure shows that the model captured the choice probabilities and reaction time quantiles well, supporting the interpretation of the estimated parameters. Further, the best-fitting chi-square values did not differ between SZ patients and controls (p = .27) and were in the range of fitting values from similar studies (e.g., [24]). The DDM parameters are shown in Fig. 4 for SZ patients and controls. All parameter comparisons used a simple t-test to assess differences between patients and controls. The results show that multiple decision components differed between the groups. Non-decision time was significantly slower for SZ patients (t(83) = 2.94, p = .004, 95% CI: [.035, .180], Cohen’s d = .645), indicating slower encoding and/or motor execution time. The resulting BF was 9.24 in favor of the alternative, indicating moderate to strong evidence for a between-group difference in non-decision time. Boundary separation was likewise larger for SZ patients (t(83) = 3.27, p = .002, 95% CI: [.013, .054], Cohen’s d = .718), indicating more cautious speed/accuracy settings for the patients. The resulting BF was 20.5 in favor of the alternative, indicating very strong evidence for a between-group difference in boundary separation. For the starting point measure (z/a), higher values indicate a response bias for Response A. There was no significant difference in starting point between the groups (t(83) = .446, p = .657, 95% CI: [−.018, .028], Cohen’s d = .097). The resulting BF was 3.1 in favor of the null, indicating weak evidence for no between-group difference in response bias. The drift rates, which index how well participants learned to select the correct response, showed stronger evidence (better performance) for the control group than for SZ patients. The drift rate measure in Fig.
4 is given as a discriminability measure separately for reward and punishment trials: larger values indicate a better ability to correctly match the stimulus with the response (S1 goes with response A, S2 goes with response B, etc.). A mixed ANOVA was conducted with group (SZ, control) as the between-subject factor and condition (reward, punishment) as the within-subject factor. The ANOVA showed a trend for the main effect of group, with lower drift rates in the SZ patients (F(1,164) = 3.62, p = .059), but no main effect of condition (F(1,164) = .018, p = .89) nor an interaction (F(1,164) = 1.27, p = .26). Planned comparisons of the drift rates showed significantly lower drift rates for SZ patients than controls for punishment trials (t(83) = 2.19, p = .032, 95% CI: [.005, .101], Cohen’s d = .482). The resulting BF was 2.02 for the alternative, indicating weak evidence for a between-group difference in drift rates for punishment trials. Conversely, drift rates on reward trials showed no reliable difference between SZ patients and controls (t(83) = .635, p = .527, 95% CI: [−.054, .104], Cohen’s d = .139). The resulting BF was 2.82 in favor of the null, indicating weak evidence for no between-group difference in drift rates for reward trials.

Fig. 3. Behavioral data from SZ patients and controls. Left panel shows response proportions for each of the four stimulus conditions. Right panel shows reaction time quantiles for all responses. Error bars represent 95% confidence intervals.

4. Discussion

Overall, the DDM analysis shows that multiple decision components differ between patients and controls. Patients with SZ had slower encoding/motor time, more cautious speed/accuracy settings, and a relative deficit in learning to avoid the worse choice on punishment trials.
However, it should be noted that the evidence for the learning deficit in punishment trials was weak, and further studies will be needed to determine how robust it is. Moreover, the group differences were strongest (as assessed by BF) for the non-decision time component, suggesting that slower encoding and motor time is the primary determinant of the slower RTs in SZ patients. Overall, these results suggest a multi-faceted profile of differences driving performance in this reward/punishment learning task. A notable finding is that although SZ patients did not have different response proportions compared to controls, they had a significantly weaker drift rate from the DDM analysis for punishment trials. This discrepancy is likely driven by the group difference in response caution: SZ patients were significantly more cautious in their responding, which leads to slower reaction times but also higher accuracy. Thus, our results suggest that the increase in caution might be a compensatory strategy to improve accuracy at the expense of response speed. Importantly, once this difference in response caution was controlled for by the DDM analysis, a specific deficit in learning for punishment trials was observed in the drift rates. These findings point to the importance of controlling for differences in decision components that affect the behavioral measures of choice probabilities and reaction time (see [25]).

Fig. 4. DDM parameters averaged across participants. Error bars represent 95% confidence intervals. z/a refers to the starting point measure.

This study is the first to apply the DDM to behavioral data from SZ patients in a probabilistic learning task. The model-based analysis has several advantages over traditional comparisons of reaction time and accuracy.
First, the model accounts for all of the behavioral data simultaneously, including accuracy values and reaction time distributions for correct and error responses. Thus, the full set of behavioral data is taken into consideration when estimating the DDM parameters. Second, the model provides more specificity in the analyses, as the values of the different decision components can be compared separately: response caution (speed/accuracy trade-off), response bias for one of the options (starting point), the average rate at which information accumulates (drift rate), and non-decision time (encoding/motor time). Finally, the model provides more sensitivity to detect processing differences, as extraneous effects of the other parameters are controlled for (e.g., response caution is controlled for when estimating the drift rates). Using the DDM, we found that SZ patients had slower encoding/motor time, were more cautious in responding, and had a specific deficit in learning to avoid the worse option on punishment trials in comparison to healthy control subjects. As mentioned above, increased response caution in patients could be indicative of a compensatory strategy whereby accuracy is improved at the expense of response speed. This compensatory strategy allowed the patients to perform comparably to controls in terms of choice accuracy. However, once this difference in caution is taken into account through the DDM, a deficit in learning to select the appropriate response on punishment trials (but not reward trials) remains. This suggests a relative deficit in punishment learning associated with SZ. The punishment learning deficit revealed in this study has been reported previously [42]; however, one study did not confirm these findings (Waltz et al. [41]). Although the deficit in drift rates for punishment trials was significant in the traditional frequentist analysis, the evidence for this deficit in punishment learning was relatively weak based on the Bayesian analysis (BF = 2.0).
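The four decision components listed above map directly onto a random-walk simulation of a single trial: evidence starts at a biased point between two boundaries and accumulates noisily at the drift rate until one boundary is crossed, with non-decision time added on top. The following sketch is illustrative only; the parameter names follow standard DDM convention, and any values passed in are hypothetical rather than the fitted values from this study.

```python
import math
import random

def simulate_ddm_trial(drift, boundary, start_frac, ndt,
                       dt=0.001, noise=1.0, rng=random):
    """Simulate one drift-diffusion trial.

    drift      -- mean rate of evidence accumulation (drift rate, v)
    boundary   -- separation between the two decision bounds (caution, a)
    start_frac -- starting point as a fraction of the boundary (bias, z/a)
    ndt        -- non-decision time for encoding/motor processes (Ter)

    Returns (choice, rt): choice is 1 if the upper bound was hit, else 0.
    """
    evidence = start_frac * boundary      # starting point z
    t = 0.0
    step_sd = noise * math.sqrt(dt)       # per-step noise scales with sqrt(dt)
    while 0.0 < evidence < boundary:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return (1 if evidence >= boundary else 0), ndt + t
```

Averaging many such simulated trials recovers the familiar pattern exploited by the analysis above: a higher drift rate yields faster and more accurate responses, whereas a wider boundary yields slower but more accurate responses.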
This weak effect might partially account for why punishment learning deficits are detected in some studies but not others. Future studies will be important for determining how reliable and robust this effect is. Interestingly, the model revealed that this deficit is not necessarily caused by abnormal value estimation. Summerfield and Koechlin [39] showed that reward and punishment valences bias the decision starting point of normal participants. Similarly, in this study participants' starting point was biased by reward, but SZ patients' starting point did not differ significantly from controls. An important difference between the two paradigms is that in the paradigm developed by Summerfield and Koechlin, participants were explicitly told the value of each stimulus before the decision, whereas in the current study it had to be learned. At least the implicit learning of value, then, did not appear to bias SZ patients' decisions differently from healthy controls in this study. It is possible that our results are due to medication effects. It is difficult to dissociate medication from disease effects [40], as patients in most cognitive studies are medicated, and the medications used vary from one study to another; this may partly explain conflicting results in the literature. For example, Waltz et al. [41] found that schizophrenia patients showed diminished reward learning compared to controls, whereas other studies have reported intact learning from reward but impaired learning from punishment in schizophrenia patients [42]. Thus, it is possible that our results are due to the medications used rather than schizophrenia itself. We found that longer non-decision time, larger boundaries and slower drift rate all contribute to significantly slower reaction times in SZ patients. These slower reaction times have been reported across a number of different studies in SZ, suggesting a more general deficit [43,44]. Baving et al.
[45] suggested that patients with SZ show decision-making impairment because they do not retrieve information about the potential options, which would perhaps account for the longer non-decision time but might also explain the slower drift rate. Given the evidence for reduced error-related negativity and response negativity in SZ patients [44], there is a theme within the literature of reduced evidence accumulation, or at least a reduced "automatic retrieval" of the evidence to be accumulated, in agreement with the suggestion made by Gold et al. [19] that decision-making deficits in SZ patients are a result of being unable to 'fully represent' the value of an outcome. Thus, SZ patients may adopt a compensatory strategy of widening their decision boundaries to allow more evidence to accumulate, resulting in slower reaction times but improved accuracy. To our knowledge, this is the first study to apply the DDM to behavioral data from schizophrenia patients. Applying the DDM may provide a more detailed explanation of the nature of decision-making processes in SZ. In this study, SZ patients did not differ significantly from normal controls in the way their starting point was biased by the reward associated with the decision; rather, the deficit appeared to manifest in accessing this information, at the evidence accumulation phase and the action planning phase. The results of this study emphasize the need for further research in this area and an examination of how these cognitive deficits may interact with the positive and negative symptoms of SZ. In particular, it has been shown that sensitivity to feedback valence in the striatum is predictive of negative symptom severity (such as avolition or anhedonia) rather than of a diagnosis of schizophrenia itself [46,47]. Moreover, an important modulator of decision speed is motivational salience; thus, negative symptom severity may additionally contribute to the slower reaction times observed in schizophrenia patients [48].
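The compensatory account sketched above, in which wider boundaries buy accuracy at the cost of time, can be illustrated with the standard closed-form DDM expressions for accuracy and mean decision time, assuming an unbiased starting point (z = a/2), unit diffusion noise, and no trial-to-trial parameter variability. The parameter values below are arbitrary, chosen only to show the direction of the trade-off.

```python
import math

def ddm_accuracy_and_mean_dt(v, a, s=1.0):
    """Closed-form accuracy and mean decision time for an unbiased DDM.

    v -- drift rate, a -- boundary separation, s -- diffusion noise.
    Assumes starting point z = a/2 and no trial-to-trial variability.
    """
    p_correct = 1.0 / (1.0 + math.exp(-v * a / s**2))
    mean_dt = (a / (2.0 * v)) * math.tanh(v * a / (2.0 * s**2))
    return p_correct, mean_dt

narrow = ddm_accuracy_and_mean_dt(v=1.0, a=1.0)  # less cautious responder
wide = ddm_accuracy_and_mean_dt(v=1.0, a=2.0)    # more cautious responder
# Widening the boundary raises accuracy (about 0.73 -> 0.88)
# but also lengthens the mean decision time (about 0.23 s -> 0.76 s).
```

Holding drift rate fixed, any increase in the boundary separation moves both accuracy and mean decision time upward, which is exactly the pattern attributed to the SZ group here.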
Future studies could benefit from employing the DDM approach used in the present study. For example, it is unclear to what extent SZ patients show processing deficits in non-learning tasks; testing SZ patients in a perceptual discrimination task and using the DDM to analyze the data could shed more light on the profile of cognitive processes associated with SZ.

References

[1] Heinrichs RW, Zakzanis KK. Neurocognitive deficit in schizophrenia: a quantitative review of the evidence. Neuropsychology 1998;12(3):426–45.
[2] Kahn RS, Keefe RS. Schizophrenia is a cognitive illness: time for a change in focus. JAMA Psychiatr 2013;70(10):1107–12.
[3] Green MF, Kern RS, Braff DL, Mintz J. Neurocognitive deficits and functional outcome in schizophrenia: are we measuring the right stuff? Schizophr Bull 2000;26(1):119–36.
[4] Velligan DI, Bow-Thomas CC, Mahurin RK, Miller AL, Halgunseth LC. Do specific neurocognitive deficits predict specific domains of community function in schizophrenia? J Nerv Ment Dis 2000;188(8):518–24.
[5] Brambilla P, Perlini C, Bellani M, Tomelleri L, Ferro A, Cerruti S, et al. Increased salience of gains versus decreased associative learning differentiate bipolar disorder from schizophrenia during incentive decision making. Psychol Med 2013;43(3):571–80.
[6] Kester HM, Sevy S, Yechiam E, Burdick KE, Cervellione KL, Kumra S. Decision-making impairments in adolescents with early-onset schizophrenia. Schizophr Res 2006;85(1-3):113–23.
[7] Lee Y, Kim YT, Seo E, Park O, Jeong SH, Kim SH, et al. Dissociation of emotional decision-making from cognitive decision-making in chronic schizophrenia. Psychiatr Res 2007;152(2-3):113–20.
[8] Simon JJ, Biller A, Walther S, Roesch-Ely D, Stippich C, Weisbrod M, et al.
Neural correlates of reward processing in schizophrenia—relationship to apathy and depression. Schizophr Res 2010;118(1-3):154–61.
[9] Shurman B, Horan WP, Nuechterlein KH. Schizophrenia patients demonstrate a distinctive pattern of decision-making impairment on the Iowa Gambling Task. Schizophr Res 2005;72(2-3):215–24.
[10] Bechara A, Damasio AR, Damasio H, Anderson SW. Insensitivity to future consequences following damage to human prefrontal cortex. Cognition 1994;50(1-3):7–15.
[11] Beninger RJ, Wasserman J, Zanibbi K, Charbonneau D, Mangels J, Beninger BV. Typical and atypical antipsychotic medications differentially affect two nondeclarative memory tasks in schizophrenic patients: a double dissociation. Schizophr Res 2003;61(2-3):281–92.
[12] Ritter LM, Meador-Woodruff JH, Dalack GW. Neurocognitive measures of prefrontal cortical dysfunction in schizophrenia. Schizophr Res 2004;68(1):65–73.
[13] Juckel G, Schlagenhauf F, Koslowski M, Filonov D, Wustenberg T, Villringer A, et al. Dysfunction of ventral striatal reward prediction in schizophrenic patients treated with typical, not atypical, neuroleptics. Psychopharmacology (Berl) 2006;187(2):222–8.
[14] Juckel G, Schlagenhauf F, Koslowski M, Wustenberg T, Villringer A, Knutson B, et al. Dysfunction of ventral striatal reward prediction in schizophrenia. Neuroimage 2006;29(2):409–16.
[15] Morris SE, Holroyd CB, Mann-Wrobel MC, Gold JM. Dissociation of response and feedback negativity in schizophrenia: electrophysiological and computational evidence for a deficit in the representation of value. Front Hum Neurosci 2011;5:123.
[16] Polli FE, Barton JJ, Thakkar KN, Greve DN, Goff DC, Rauch SL, et al. Reduced error-related activation in two anterior cingulate circuits is related to impaired performance in schizophrenia. Brain 2008;131(Pt 4):971–86.
[17] Waltz JA, Schweitzer JB, Gold JM, Kurup PK, Ross TJ, Salmeron BJ, et al.
Patients with schizophrenia have a reduced neural response to both unpredictable and predictable primary reinforcers. Neuropsychopharmacology 2009;34(6):1567–77.
[18] Dowd EC, Barch DM. Pavlovian reward prediction and receipt in schizophrenia: relationship to anhedonia. PLoS One 2012;7(5):e35622.
[19] Gold JM, Waltz JA, Prentice KJ, Morris SE, Heerey EA. Reward processing in schizophrenia: a deficit in the representation of value. Schizophr Bull 2008;34(5):835–47.
[20] Ratcliff R. A theory of memory retrieval. Psychol Rev 1978;85:59–108.
[21] Gomez P, Perea M. Decomposing encoding and decisional components in visual-word recognition: a diffusion model analysis. Q J Exp Psychol (Hove) 2014;67(12):2455–66.
[22] Petrov AA, Van Horn NM, Ratcliff R. Dissociable perceptual-learning mechanisms revealed by diffusion-model analysis. Psychon Bull Rev 2011;18(3):490–7.
[23] Pe ML, Vandekerckhove J, Kuppens P. A diffusion model account of the relationship between the emotional flanker task and rumination and depression. Emotion 2013;13(4):739–47.
[24] White CN, Ratcliff R, Vasey MW, McKoon G. Anxiety enhances threat processing without competition among multiple inputs: a diffusion model analysis. Emotion 2010;10(5):662–77.
[25] White CN, Ratcliff R, Vasey MW, McKoon G. Using diffusion models to understand clinical disorders. J Math Psychol 2010;54(1):39–52.
[26] American Psychiatric Association. DSM-IV: Diagnostic and Statistical Manual of Mental Disorders. 4th ed. Washington, DC: American Psychiatric Association; 1994.
[27] Sheehan DV, Lecrubier Y, Sheehan KH, Amorim P, Janavs J, Weiller E, et al. The Mini-International Neuropsychiatric Interview (M.I.N.I.): the development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. J Clin Psychiatr 1998;59(Suppl 20):22–33, quiz 34-57.
[28] Kay SR, Fiszbein A, Opler LA. The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophr Bull 1987;13(2):261–76.
[29] Yip SW, Sacco KA, George TP, Potenza MN. Risk/reward decision-making in schizophrenia: a preliminary examination of the influence of tobacco smoking and relationship to Wisconsin Card Sorting Task performance. Schizophr Res 2009;110(1-3):156–64.
[30] Woods SW. Chlorpromazine equivalent doses for the newer atypical antipsychotics. J Clin Psychiatr 2003;64(6):663–7.
[31] Bodi N, Keri S, Nagy H, Moustafa A, Myers CE, Daw N, et al. Reward-learning and the novelty-seeking personality: a between- and within-subjects study of the effects of dopamine agonists on young Parkinson's patients. Brain 2009;132(Pt 9):2385–95.
[32] Keri S, Moustafa AA, Myers CE, Benedek G, Gluck MA. α-Synuclein gene duplication impairs reward learning. Proc Natl Acad Sci 2010;107(36):15992–4.
[33] Moustafa AA, Krishna R, Eissa AM, Hewedi DH. Factors underlying probabilistic and deterministic stimulus-response learning performance in medicated and unmedicated patients with Parkinson's disease. Neuropsychology 2013;27(4):498–510.
[34] Myers CE, Moustafa AA, Sheynin J, Vanmeenen KM, Gilbertson MW, Orr SP, et al. Learning to obtain reward, but not avoid punishment, is affected by presence of PTSD symptoms in male veterans: empirical data and computational model. PLoS One 2013;8(8):e72508.
[35] Somlai Z, Moustafa AA, Keri S, Myers CE, Gluck MA. General functioning predicts reward and punishment learning in schizophrenia. Schizophr Res 2011.
[36] Ratcliff R, Tuerlinckx F. Estimation of the parameters of the diffusion model: approaches to dealing with contaminant reaction times and parameter variability. Psychon Bull Rev 2002;9:438–81.
[37] Nelder JA, Mead R. A simplex method for function minimization. Comput J 1965;7:308–13.
[38] Ratcliff R, McKoon G. The diffusion decision model: theory and data for two-choice decision tasks. Neural Comput 2008;20(4):873–922.
[39] Summerfield C, Koechlin E. Economic value biases uncertain perceptual choices in the parietal and prefrontal cortices.
Front Hum Neurosci 2010;4:208.
[40] Foerde K, Poldrack RA, Khan BJ, Sabb FW, Bookheimer SY, Bilder RM, et al. Selective corticostriatal dysfunction in schizophrenia: examination of motor and cognitive skill learning. Neuropsychology 2008;22(1):100–9.
[41] Waltz JA, Frank MJ, Wiecki TV, Gold JM. Altered probabilistic learning and response biases in schizophrenia: behavioral evidence and neurocomputational modeling. Neuropsychology 2011;25(1):86–97.
[42] Fervaha G, Agid O, Foussias G, Remington G. Impairments in both reward and punishment guided reinforcement learning in schizophrenia. Schizophr Res 2013;150(2-3):592–3.
[43] Hutton SB, Murphy FC, Joyce EM, Rogers RD, Cuthbert I, Barnes TR, et al. Decision making deficits in patients with first-episode and chronic schizophrenia. Schizophr Res 2002;55(3):249–57.
[44] Morris SE, Heerey EA, Gold JM, Holroyd CB. Learning-related changes in brain activity following errors and performance feedback in schizophrenia. Schizophr Res 2008;99(1-3):274–85.
[45] Baving L, Wagner M, Cohen R, Rockstroh B. Increased semantic and repetition priming in schizophrenic patients. J Abnorm Psychol 2001;110(1):67–75.
[46] Waltz JA, Kasanova Z, Ross TJ, Salmeron BJ, McMahon RP, Gold JM, et al. The roles of reward, default, and executive control networks in set-shifting impairments in schizophrenia. PLoS One 2013;8(2):e57257.
[47] Waltz JA, Schweitzer JB, Ross TJ, Kurup PK, Salmeron BJ, Rose EJ, et al. Abnormal responses to monetary outcomes in cortex, but not in the basal ganglia, in schizophrenia. Neuropsychopharmacology 2010;35(12):2427–39.
[48] Avila I, Lin SC. Motivational salience signal in the basal forebrain is coupled with faster and more precise decision speed. PLoS Biol 2014;12(3):e1001811.
[49] Midorikawa A, Hashimoto R, Noguchi H. Impairment of motor dexterity in schizophrenia assessed by a novel finger movement test. Psychiatr Res 2008;159(3):281–9, http://dx.doi.org/10.1016/j.psychres.2007.04.004.
[50] Morey RD, Rouder JN. Bayes factor approaches for testing interval null hypotheses. Psychol Methods 2011;16:406–19.