Active Bayesian Associative Learning John K. Kruschke Indiana University April, 2008 J. K. Kruschke: Active Learning Background: A typical associative learning task, and a phenomenon from associative learning: “Backward Blocking”. A traditional (non-Bayesian) model: The Rescorla-Wagner model. J. K. Kruschke: Active Learning Typical Learning Task Learn which foods produce or prevent nausea: ? J. K. Kruschke: Active Learning Backward Blocking Phase Frequency Cue 1 Cue 2 Outcome I 10 1 1 1 II 10 1 0 1 Note: Cell with a 1 indicates presence of cue or outcome. Cell with a 0 indicates absence of cue or outcome. J. K. Kruschke: Active Learning Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Actual result: J. K. Kruschke: Active Learning Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Actual result: J. K. Kruschke: Active Learning Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Actual result: J. K. Kruschke: Active Learning Interim Test (no corrective feedback): Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Interim Test (no corrective feedback): Please predict nausea or no nausea: Usual response is middling prediction of nausea. ? J. K. Kruschke: Active Learning Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Actual result: J. K. Kruschke: Active Learning Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Actual result: J. K. Kruschke: Active Learning Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Actual result: J. K. Kruschke: Active Learning Final Test (no corrective feedback): Please predict nausea or no nausea: ? J. K. Kruschke: Active Learning Final Test (no corrective feedback): Please predict nausea or no nausea: Usual response is reduced prediction of nausea. This result (with its training structure) is called “backward blocking”. ? J. K. Kruschke: Active Learning Active Learning after Backward Blocking Phase Frequency Cue 1 Cue 2 Outcome I 10 1 1 1 II 10 1 0 1 After being trained on backward blocking, your goal is to better determine which cues produce or prevent nausea. Which probe below do you prefer to test? J. K. Kruschke: Active Learning Associative Models: t w1 c1 J. K. Kruschke: Active Learning Outcome: Modeled as a combination of weighted cues. w2 c12 Associative Weights Cues A classic non-Bayesian approach: The Rescorla-Wagner model. T a i wi ci w c w1 c1 J. K. Kruschke: Active Learning w2 c12 Learning in the Rescorla-Wagner model. T a i wi ci w c w1 c1 J. K. Kruschke: Active Learning w2 c2 T w t w c c Weight Space: Learning is change from one position to another. J. K. Kruschke: Active Learning w2 -1.5 +1.5 -1.0 +1.5 -0.5 +1.5 0.0 +1.5 +0.5 +1.5 +1.0 +1.5 +1.5 +1.5 -1.5 +1.0 -1.0 +1.0 -0.5 +1.0 0.0 +1.0 +0.5 +1.0 +1.0 +1.0 +1.5 +1.0 -1.5 +0.5 -1.0 +0.5 -0.5 +0.5 0.0 +0.5 +0.5 +0.5 +1.0 +0.5 +1.5 +0.5 -1.5 0.0 -1.0 0.0 -0.5 0.0 0.0 0.0 +0.5 0.0 +1.0 0.0 +1.5 0.0 -1.5 -0.5 -1.0 -0.5 -0.5 -0.5 0.0 -0.5 +0.5 -0.5 +1.0 -0.5 +1.5 -0.5 -1.5 -1.0 -1.0 -1.0 -0.5 -1.0 0.0 -1.0 +0.5 -1.0 +1.0 -1.0 +1.5 -1.0 -1.5 -1.5 -1.0 -1.5 -0.5 -1.5 0.0 -1.5 +0.5 -1.5 +1.0 -1.5 +1.5 -1.5 w1 Backward Blocking Phase Frequency Cue 1 Cue 2 Outcome I 10 1 1 1 II 10 1 0 1 Note: Cell with a 1 indicates presence of cue or outcome. Cell with a 0 indicates absence of cue or outcome. J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning End of Phase I. J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning Rescorla-Wagner model does not show backward blocking. J. K. Kruschke: Active Learning Rescorla-Wagner model does not show backward blocking. And, has no way to predict active learning preferences. J. K. Kruschke: Active Learning Design of the (rest of the) Talk: 2 x 2 x 3 factorial Bayesian models of associative learning: Kalman filter. Noisy-logic gate. Goals for active learning: Minimize expected uncertainty. Maximize expected maximum believability. Training structures: Backward blocking. Blocking and reduced overshadowing. Ambiguous cue. J. K. Kruschke: Active Learning The Bayesian Ontology Multiple hypotheses entertained simultaneously. Not just a single state as in traditional models. Graded degree of belief in each hypothesis. Hypotheses specify the probability of each possible stimulus/outcome value. Not deterministic as in traditional models. Learning is shifting of beliefs from less likely to more likely hypotheses. Bayes’ rule specifies how. J. K. Kruschke: Active Learning Bayesian Learning In the Mind Start with: p(t | c , w) p(w) A model that specifies the probability of outcome value t, given cues c and hypothetical weights w. Prior believabilities of every possible hypothetical weight combination w. J. K. Kruschke: Active Learning Bayesian Learning In the Mind Observe: p(t | c , w) p(w) c J. K. Kruschke: Active Learning Cues c. Actual outcome t. t Bayesian Learning In the Mind Learn: p(t | c , w) p(w) c J. K. Kruschke: Active Learning Update beliefs by Bayes’ rule. p(t | c , w) p( w) p( w | t , c ) dw p(t | c , w) p(w) t Bayesian Learning In the Mind Learn: p(t | c , w) p(w) c Update beliefs by Bayes’ rule. p(t | c , w) p( w) p( w | t , c ) dw p(t | c , w) p(w) t Everyday Bayesian reasoning: Logic of exoneration. Holmesian deduction. J. K. Kruschke: Active Learning Kalman Filter T pt | c , w ~ N t | w c , v w1 c1 J. K. Kruschke: Active Learning w2 c2 pw ~ N w | , C Kalman Filter Updating: Step 1. Linear Dynamics w1 w2 pw ~ N w | , C * D c1 J. K. Kruschke: Active Learning c2 C DCD U * T Kalman Filter Updating: Step 2. Bayesian Learning w1 w2 c1 J. K. Kruschke: Active Learning * * pw ~ N w | , C * T * 1 *T * ' v c C c t c C c T * 1 * T * * c2 C ' C v c C c C c c C Backward Blocking Phase Frequency Cue 1 Cue 2 Outcome I 10 1 1 1 II 10 1 0 1 Note: Cell with a 1 indicates presence of cue or outcome. Cell with a 0 indicates absence of cue or outcome. J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning Kalman filter does show backward blocking. J. K. Kruschke: Active Learning Noisy-logic gate Outcome is present/absent. Positive weight is probability that outcome occurs if that cue alone is present. w1 c1 J. K. Kruschke: Active Learning w2 c2 Negative weight is probability that outcome does not occur, when otherwise it would have, if that cue alone is present. Noisy-logic gate cj ci pt 1 | c , w 1 (1 wi ) (1 w j ) .t . wi 0 j s .t . w j 0 i s noisy OR noisy NOT AND w1 c1 J. K. Kruschke: Active Learning w2 c2 pw ~ ... Backward Blocking Phase Frequency Cue 1 Cue 2 Outcome I 10 1 1 1 II 10 1 0 1 Note: Cell with a 1 indicates presence of cue or outcome. Cell with a 0 indicates absence of cue or outcome. J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning Noisy logic gate does show backward blocking. J. K. Kruschke: Active Learning Design of the (rest of the) Talk: 2 x 2 x 3 factorial Bayesian models of associative learning: Kalman filter. Noisy-logic gate. Goals for active learning: Minimize expected uncertainty. Maximize expected maximum believability. Training structures: Backward blocking. Blocking and reduced overshadowing. Ambiguous cue. J. K. Kruschke: Active Learning Active Learning after Backward Blocking Phase Frequency Cue 1 Cue 2 Outcome I 10 1 1 1 II 10 1 0 1 After being trained on backward blocking, your goal is to better determine which cues produce or prevent nausea. Which probe below do you prefer to test? J. K. Kruschke: Active Learning Uncertainty of a Belief Distribution (a.k.a. entropy) Spread out: high uncertainty. J. K. Kruschke: Active Learning Piled up: low uncertainty. Uncertainty of a Belief Distribution (a.k.a. entropy) Spread out: high uncertainty. Piled up: low uncertainty. A candidate goal for active learning: Belief distribution with low uncertainty. J. K. Kruschke: Active Learning Uncertainty of a Belief Distribution (a.k.a. entropy) Spread out: high uncertainty. Piled up: low uncertainty. U w p( w) log p( w) w J. K. Kruschke: Active Learning Current Uncertainty is computed from current belief distribution. J. K. Kruschke: Active Learning For a candidate probe, we compute the expected uncertainty: EU w (c ) da p(a | c ) U w|a ,c J. K. Kruschke: Active Learning where p(a | c ) dw p(a | c , w) p( w) = .96*0.566 + .04*.624 = .40*0.559 + .60*.532 Probe: 1 0 p = .96 outcome = 1 J. K. Kruschke: Active Learning Probe: 0 1 p = .04 outcome = 0 p = .40 p = .60 outcome = 1 outcome = 0 Active Learning after Backward Blocking Both Kalman filter and noisy-logic gate match human intuition when minimizing Expected Uncertainty. J. K. Kruschke: Active Learning Maximum Probability of a Belief Distribution Small max p. Larger max p. A candidate goal for active learning: Belief distribution with high max p. J. K. Kruschke: Active Learning Max P and Minimal Entropy do not always correspond: Lower uncertainty. J. K. Kruschke: Active Learning Higher Max p. For a candidate probe, we compute the = .96*53 + .04*27.2 = .40*49.7 + .60*84.9 expected max p: Probe: 1 0 p = .96 outcome = 1 J. K. Kruschke: Active Learning Probe: 0 1 p = .04 outcome = 0 p = .40 p = .60 outcome = 1 outcome = 0 Active Learning after Backward Blocking Both Kalman filter and noisy-logic gate match human intuition when maximizing Expected Max P: J. K. Kruschke: Active Learning Design of the (rest of the) Talk: 2 x 2 x 3 factorial Bayesian models of associative learning: Kalman filter. Noisy-logic gate. Goals for active learning: Minimize expected uncertainty. Maximize expected maximum believability. Training structures: Backward blocking. Blocking and reduced overshadowing. Ambiguous cue. J. K. Kruschke: Active Learning Blocking and Reduced Overshadowing Phase Freq. Cue 1 Cue 2 Cue 3 Cue 4 Outc. I II J. K. Kruschke: Active Learning 6 6 1 0 0 0 1 0 0 1 0 0 1 1 0 0 1 0 0 1 1 1 Active Learning after Blocking and Reduced Overshadowing Phase Freq. Cue 1 Cue 2 Cue 3 Cue 4 Outc. I II 6 6 1 0 0 0 1 0 0 1 0 0 1 1 0 0 1 0 0 1 1 1 After being trained, your goal is to better determine which cues produce or prevent nausea. Which probe below do you prefer to test? J. K. Kruschke: Active Learning Kalman filter predictions for Expected Uncertainty… J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning ? ? Kalman filter predictions: No preference! Same for Max P. J. K. Kruschke: Active Learning Noisy-logic gate predictions for Expected Uncertainty… J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning ! Noisy logic gate predictions: Match human preference! Same for Max P. J. K. Kruschke: Active Learning Ambiguous Cue Remember: Foods can be anti-emetic, i.e., prevent nausea. Freq. Cue 1 Cue 2 Cue 3 Outcome 10 1 1 0 1 10 0 1 1 0 After being trained, your goal is to better determine which cues produce or prevent nausea. Which probe below do you prefer to test? J. K. Kruschke: Active Learning Kalman filter predictions for Expected Uncertainty… J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning Cue 2 is least preferred! J. K. Kruschke: Active Learning Noisy logic gate predictions for Expected Uncertainty… J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning J. K. Kruschke: Active Learning Cue 2 is most preferred! J. K. Kruschke: Active Learning Kalman filter Tied preference for 1 & 3 Noisy logic gate Preference for 2 Minimize Expected Uncertainty Tied preference for 1 & 3 Maximize Expected Max P J. K. Kruschke: Active Learning Tied preference for 1 & 2 Summary of Active Learning Predictions and Match to Human Preference Bayesian model of (passive) associative learning Kalman filter Goal for active learning Minimize Expected Uncertainty J. K. Kruschke: Active Learning Maximize Expected Max P Noisy logic gate (Backward) Blocking Blocking & Reduced Overshadow. Ambiguous Cue (Backward) Blocking Blocking & Reduced Overshadow. Ambiguous Cue (Backward) Blocking Blocking & Reduced Overshadow. Ambiguous Cue (Backward) Blocking Blocking & Reduced Overshadow. Ambiguous Cue The Future of (Associative) Learning Theory Bayesian model of (passive) associative learning Goal for active learning Kalman filter Noisy logic gate New Models … Minimize Expected Uncertainty New Training Structures New Training Structures New Training Structures Maximize Expected Max P New Training Structures New Training Structures New Training Structures New Goals … New Training Structures New Training Structures New Training Structures J. K. Kruschke: Active Learning The Future of (Associative) Learning Theory Bayesian model of (passive) associative learning Kalman filter Goal for active learning Noisy logic gate New Models … Rich hierarchical models. Other levels of analysis: New Training Structures New Training Structures neuron, functional module within individual, corporation. Minimize Expected Uncertainty New Training Structures Maximize Expected Max P New Training Structures New Training Structures New Training Structures New Goals … New Training Structures New Training Structures New Training Structures J. K. Kruschke: Active Learning The Future of (Associative) Learning Theory Bayesian model of (passive) associative learning Kalman filter Goal for active learning Minimize Expected Uncertainty New Training Structures Maximize Expected Max P New Training Structures Noisy logic gate New Models … Rich hierarchical models. Other levels of analysis: New Training Structures New Training Structures neuron, functional module within individual, corporation. New Training Structures New Training Structures Different information objectives. Incorporate utilities of expected outcomes. New Training Structures New Training Structures New Training Structures New Goals … Incorporate costs of different probes. Consider multiple steps ahead. J. K. Kruschke: Active Learning
© Copyright 2026 Paperzz