Dynamics of Reward and Stimulus Information in Human Decision Making
Juan Gao, Rebecca Tortell &
James L. McClelland
With inspiration from
Bill Newsome and Phil Holmes
Questions
• Can we track the time course of reward bias as
stimulus information is accumulated over time?
• How well can human participants adjust their
bias to optimize reward when optimal bias varies
over time?
• How do humans achieve the observed bias
effects?
• Can we distinguish between alternative
accounts of the data?
Design follows Rorie et al., but
there is a variable delay between
stimulus onset and go cue.
Reward cue signals which alternative
is worth 2 points 750 msec before
stimulus onset.
Stimulus is a rectangle 1, 3, or 5
pixels longer to the left or right.
Participant must respond within
250 msec of go cue.
4 of 5 Participants Show Reward
Bias Effect
Accuracy Analysis
Individual Differences
in Accuracy and Time
Parameters
Leak- and Inhibition-Dominant LCA:
Both can fit the d’ data
2-D inhibition-dominant LCA
can fit the data too
Final time slice
Optimal Criterion Placement
Optimal vs. Observed Bias Effects
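As a point of reference for the optimal criterion, here is a minimal sketch under a standard equal-variance Gaussian signal detection model. It is not taken from the slides. It assumes equal stimulus probabilities, 2 points for a correct high-reward choice, 1 point for a correct low-reward choice, and 0 points for errors. The names `expected_reward` and `optimal_criterion` are hypothetical. Under these assumptions the reward-maximizing criterion lies ln(2)/d' away from the neutral point, toward the low-reward side, so the optimal bias is largest when d' is small (short lags) and shrinks as stimulus information accumulates.

```python
import math

def phi_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_reward(c, d_prime, high=2.0, low=1.0):
    """Expected points per trial when choosing 'high' whenever evidence x > c.

    Assumed model: x ~ N(+d'/2, 1) when the high-reward alternative is correct,
    x ~ N(-d'/2, 1) when the low-reward alternative is correct, equal priors,
    and zero points for an error.
    """
    p_correct_high = 1.0 - phi_cdf(c - d_prime / 2.0)
    p_correct_low = phi_cdf(c + d_prime / 2.0)
    return 0.5 * high * p_correct_high + 0.5 * low * p_correct_low

def optimal_criterion(d_prime, high=2.0, low=1.0):
    """Criterion that maximizes expected_reward; 0 is the unbiased midpoint."""
    return -math.log(high / low) / d_prime

for d in (0.5, 1.0, 2.0):
    c = optimal_criterion(d)
    print(f"d'={d}: criterion={c:.3f}, reward={expected_reward(c, d):.3f}")
```

Comparing `optimal_criterion(d)` across values of d' gives the kind of time-varying optimal bias against which observed bias can be compared.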
Reward harvest rates
For short lags:
Reward bias in leak-dominant LCA
Reward as input
to the accumulators
Reward as offset to
initial conditions
Reward as constant
shift or shifted criterion
Like the data!
Excellent fits are obtained under leak dominance with
reward as a constant offset
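The three reward hypotheses can be compared in a linear one-dimensional sketch. This is an illustrative assumption, not the fitted model. Let x be the difference between the two accumulators, with mean dynamics dm/dt = -lam * m + b, where lam is leak minus inhibition (positive for leak dominance) and b is a hypothetical reward strength. The function `mean_shift` and the choice to start the reward input at cue onset, 0.75 s before the stimulus, are also assumptions.

```python
import math

def mean_shift(t, lam, b, hypothesis, cue_lead=0.75):
    """Mean reward-induced shift of x at time t (seconds after stimulus onset).

    hypothesis:
      "input"    - reward is a constant input to the accumulators,
                   assumed to start at cue onset (cue_lead s before stimulus)
      "initial"  - reward sets the initial condition at stimulus onset
      "constant" - reward is a fixed shift (equivalently a shifted criterion)
    """
    if hypothesis == "input":
        elapsed = t + cue_lead
        if lam == 0.0:
            return b * elapsed
        return (b / lam) * (1.0 - math.exp(-lam * elapsed))
    if hypothesis == "initial":
        return b * math.exp(-lam * t)
    if hypothesis == "constant":
        return b
    raise ValueError(f"unknown hypothesis: {hypothesis}")

# Leak dominance (lam > 0): print how each shift evolves with time.
for t in (0.0, 0.5, 1.0, 2.0):
    row = [round(mean_shift(t, 2.0, 1.0, h), 3)
           for h in ("input", "initial", "constant")]
    print(t, row)
```

With lam > 0, an initial offset decays, an input saturates at b/lam, and a constant shift stays fixed. With lam < 0, both the input and the initial offset are amplified over time.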
But there are drawbacks to the
leak-dominant model
• Leak dominance produces equivocal
decision states, while inhibition dominance
produces more categorical activations.
– Categorical states may leave the participant better
prepared to respond when the go cue comes.
• Evidence Juan will present later favors
inhibition dominance in similar paradigms
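The contrast between equivocal and categorical states can be illustrated with a minimal simulation of a linear one-dimensional reduction, dx = (-lam * x + drift) dt + noise dW, integrated with the Euler-Maruyama method. This is a sketch with made-up parameter values, not the fitted model. Here lam is assumed to be leak minus inhibition, so inhibition dominance corresponds to a negative value, and `simulate_final_states` is a hypothetical helper.

```python
import math
import random

def simulate_final_states(lam, drift=0.5, noise=1.0, duration=2.0,
                          dt=0.01, n_trials=500, seed=0):
    """Return the final value of x on each simulated trial.

    x is the difference between the two accumulators. lam > 0 (leak
    dominance) pulls x back toward drift/lam; lam < 0 (inhibition
    dominance) pushes x away from the unstable point.
    """
    rng = random.Random(seed)
    n_steps = int(round(duration / dt))
    finals = []
    for _ in range(n_trials):
        x = 0.0
        for _ in range(n_steps):
            x += (-lam * x + drift) * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        finals.append(x)
    return finals

def mean_abs(values):
    """Average distance of the final states from the neutral point x = 0."""
    return sum(abs(v) for v in values) / len(values)

leak = simulate_final_states(lam=2.0)
inhib = simulate_final_states(lam=-2.0)
print("leak-dominant mean |x|:", round(mean_abs(leak), 2))
print("inhibition-dominant mean |x|:", round(mean_abs(inhib), 2))
```

In this linear sketch the inhibition-dominant states grow without bound. A full two-accumulator LCA with a nonlinearity would keep the activations bounded while still driving them apart.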
Reward Bias in Inhibition-dominant LCA (λ < 0)
Reward as input
to the accumulators
Reward as offset to
initial conditions
Like the data!
Reward as constant
shift or shifted criterion
Simulation of Inhibition-Dominant LCA using
Parameters Derived from 1-D Reduction
Relationship between response
speed and choice accuracy
Different levels of activation of correct and
incorrect responses in Inhibition-dominant LCA
Final time slice
errors
correct
Preliminary Simulation of High-Threshold LCA
Conclusion and Future Directions
• Two viable models remain, though we favor the inhibition-dominant LCA
model.
– Juan Gao will present other evidence relevant to this later.
• There is evidence that the decision state remains continuous until
the response is made, consistent with the high-threshold model
– Further tests of the details of this model are necessary.
• We plan to examine whether the same approach can fit the primate
physiology data
• We look forward to seeing the activations of the accumulators in
human MEG/EEG data