Sensor Fusion
Rosalyn Moran
Short Course on Bayesian Inference, Virginia Tech, 26th January 2012

Outline
- Bayes Rule for Gaussians
- Sensory Integration
- Perception as Statistical Inference
- Updating over time
- A Brief Introduction to the Free Energy Principle

Gaussian Probability Density
Bayes theorem combines new data, the likelihood p(y | θ), with prior knowledge, p(θ):

  p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \qquad \text{posterior} \propto \text{likelihood} \times \text{prior}

Bayes theorem allows one to formally incorporate prior knowledge into computing statistical probabilities. The "posterior" probability of the parameters given the data is an optimal combination of prior knowledge and new data, weighted by their relative precision.

Bayes Rule for Gaussians
Parametric techniques posit a specific functional form for a probability density function. The central limit theorem states that, under general conditions, the mean of M random variables tends to a normal distribution as M tends to infinity. The form is specified in terms of adjustable parameters: the mean μ and variance σ².

Univariate case, x ~ N(μ, σ²):

  p(x) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

Multivariate case: the parameters are the mean μ (a vector) and covariance Σ (a matrix), with X ~ N(μ, Σ), Σ = E[(X − μ)(X − μ)^T] and x = [x_1, …, x_n]^T:

  p(x) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)

[Figure: a 40 × 40 coherence matrix (normalised covariance), colour scale from −1 to 1.]

Bayes Rule for Gaussians
Theorem: the product of two Gaussian PDFs is itself (proportional to) a Gaussian. When the posterior formed from a likelihood stays in the same family as the prior, this property is called conjugacy.
[Figure: two Gaussian densities P(x1) and P(x2) and their product P(x3).]

Example: combining multi-subject probabilities via Bayesian Parameter Averaging for a fixed-effects analysis of DCM parameters. Assuming the likelihood distributions from different subjects are independent, the group posterior covariance and mean are built from the individual posterior covariances C_{θ|y_i} and means η_{θ|y_i}:

  C_{\theta \mid y_1,\dots,y_N} = \left(\sum_{i=1}^{N} C_{\theta \mid y_i}^{-1}\right)^{-1}

  \eta_{\theta \mid y_1,\dots,y_N} = C_{\theta \mid y_1,\dots,y_N} \sum_{i=1}^{N} C_{\theta \mid y_i}^{-1}\, \eta_{\theta \mid y_i}

Bayes Rule for Gaussians
The moments of the posterior can be found analytically. Precision is inverse variance, e.g. a variance of 0.1 is a precision of 10. For a Gaussian prior with mean μ₀ and precision λ₀, and a Gaussian likelihood with mean μ_D and precision λ_D, the posterior is Gaussian with

  \lambda = \lambda_0 + \lambda_D
  \mu = \frac{\lambda_0}{\lambda}\mu_0 + \frac{\lambda_D}{\lambda}\mu_D

So, (1) precisions add, and (2) the posterior mean is the sum of the prior and data means, each weighted by its relative precision.
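As a quick illustration of this update, here is a minimal Python sketch (not part of the original slides; the function name and example numbers are illustrative) that combines a Gaussian prior and a Gaussian likelihood by their precisions:

```python
# Minimal sketch (not from the slides): combining a Gaussian prior and a
# Gaussian likelihood via the precision-weighted update described above.
def gaussian_posterior(mu0, lam0, muD, lamD):
    """Posterior mean and precision for prior N(mu0, 1/lam0) and a
    Gaussian likelihood with mean muD and precision lamD."""
    lam = lam0 + lamD                                # precisions add
    mu = (lam0 / lam) * mu0 + (lamD / lam) * muD     # precision-weighted mean
    return mu, lam

# Example (hypothetical numbers): a vague prior and relatively precise data
mu, lam = gaussian_posterior(mu0=0.0, lam0=1.0, muD=2.0, lamD=10.0)
print(mu, lam)   # posterior mean ~1.82, pulled strongly toward the data
```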
Bayes Rule for Gaussians
Derivation. Since

  p(x) = \frac{1}{(2\pi\sigma_0^2)^{1/2}} \exp\!\left(-\frac{(x-\mu_0)^2}{2\sigma_0^2}\right), \qquad
  p(D \mid x) = \frac{1}{(2\pi\sigma_D^2)^{1/2}} \exp\!\left(-\frac{(x-\mu_D)^2}{2\sigma_D^2}\right)

and, by Bayes rule,

  p(x \mid D) = \frac{p(D \mid x)\, p(x)}{p(D)}

Taking logs:

  \log p(x \mid D) = \log p(D \mid x) + \log p(x) - \log p(D)
                   = -\frac{1}{2\sigma_D^2}\left(x^2 + \mu_D^2 - 2x\mu_D\right) - \frac{1}{2\sigma_0^2}\left(x^2 + \mu_0^2 - 2x\mu_0\right) + \text{const}

Some algebra gives

  \log p(x \mid D) = -\frac{x^2}{2}\left(\frac{1}{\sigma_0^2} + \frac{1}{\sigma_D^2}\right) + x\left(\frac{\mu_0}{\sigma_0^2} + \frac{\mu_D}{\sigma_D^2}\right) + \text{const}
                   = -\frac{1}{2\sigma^2}(x - \mu)^2 + \text{const}

Matching moments: the coefficients of x² give

  \frac{1}{\sigma^2} = \frac{1}{\sigma_0^2} + \frac{1}{\sigma_D^2} \quad\Rightarrow\quad \lambda = \lambda_0 + \lambda_D

and the coefficients of x give

  \frac{\mu}{\sigma^2} = \frac{\mu_0}{\sigma_0^2} + \frac{\mu_D}{\sigma_D^2} \quad\Rightarrow\quad \mu = \frac{\lambda_0}{\lambda}\mu_0 + \frac{\lambda_D}{\lambda}\mu_D

Outline
Sensory Integration

Visual and Haptic Sensors
Ernst and Banks (Nature, 2002) asked subjects which of two sequentially presented blocks was the taller. Subjects used either vision alone, touch alone, or a combination of the two. If the visual information v and the haptic information h are independent given the height x, then

  p(v, h, x) = p(v \mid x)\, p(h \mid x)\, p(x)

Bayesian fusion of the sensory information produces a posterior density

  p(x \mid v, h) = \frac{p(v \mid x)\, p(h \mid x)\, p(x)}{p(v, h)} \propto p(v \mid x)\, p(h \mid x)\, p(x)

Then, under the uniform prior p(x) = const, the MAP estimate equals the maximum-likelihood estimate:

  \hat{x}_{MAP} = \hat{x}_{ML} = \arg\max_x\, p(x \mid v, h)

Visual and Haptic Sensors
Fusion through different noisy sensory channels. The Bayes-optimal fusion (the lowest-variance estimate) adds the sensor estimates weighted by their normalised reciprocal variances,

  \hat{x} = \sum_i w_i \hat{s}_i = w_v v + w_h h, \qquad w_i = \frac{1/\sigma_i^2}{\sum_j 1/\sigma_j^2}

and the inverse variances add, producing a more precise posterior estimate:

  \sigma_{vh}^2 = \frac{\sigma_v^2\,\sigma_h^2}{\sigma_v^2 + \sigma_h^2}

Is this true for human agents?

Visual and Haptic Sensors
Testing fusion through different noisy sensory channels:
1. Measure the unimodal discrimination thresholds. The 84% threshold is T_{v/h} = \sqrt{2}\,\sigma_{v/h}, and the psychometric curve is centred on the point of subjective equality. [Figure: psychometric function, probability "taller" versus ∆Height.]
2. Predict the bimodal weighting:

  w_v = \frac{T_H^2}{T_V^2 + T_H^2}, \qquad w_H = \frac{T_V^2}{T_V^2 + T_H^2}

3. Predict the bimodal threshold:

  T_{VH}^2 = \frac{T_V^2\, T_H^2}{T_V^2 + T_H^2}

4. Present bimodal stimuli in which the visually and haptically specified heights differ by ∆.
5. Measure T_{VH}.
6. Repeat for different levels of visual noise.
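To make the prediction steps concrete, here is a minimal sketch (not Ernst and Banks' analysis code; the threshold and height values are hypothetical) of the precision-weighted fusion and the predicted bimodal threshold:

```python
# Minimal sketch (hypothetical numbers): predicting the bimodal percept and
# threshold from unimodal visual and haptic thresholds, as in steps 2-3 above.
import numpy as np

def fuse(estimates, variances):
    """Precision-weighted fusion of independent noisy estimates."""
    w = 1.0 / np.asarray(variances)
    w = w / w.sum()                                   # normalised reciprocal variances
    fused = np.dot(w, estimates)
    fused_var = 1.0 / np.sum(1.0 / np.asarray(variances))
    return fused, fused_var

T_V, T_H = 6.0, 4.0                                   # hypothetical unimodal 84% thresholds (mm)
sigma_V, sigma_H = T_V / np.sqrt(2), T_H / np.sqrt(2) # since T = sqrt(2) * sigma

# Visual channel reports 55 mm, haptic reports 53 mm (hypothetical conflict of 2 mm)
x_hat, var_vh = fuse([55.0, 53.0], [sigma_V**2, sigma_H**2])
T_VH = np.sqrt(2 * var_vh)                            # predicted bimodal threshold
print(x_hat, T_VH)                                    # fused height leans toward the more precise (haptic) cue
```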
[Figure: measured bimodal threshold T_{VH} versus visual noise level (%), compared with the prediction T_{VH}^2 = T_V^2 T_H^2 / (T_V^2 + T_H^2) and the weights w_v = T_H^2/(T_V^2 + T_H^2), w_H = T_V^2/(T_V^2 + T_H^2).]

Outline
Perception as Statistical Inference

Helmholtz: Perception as Inference
In Helmholtz's view, our percepts are our best guess as to what is in the world, given both sensory data and prior experience. He proposed that perception is unconscious inference. See, for example, the Helmholtz Machine (Dayan, Hinton, Neal & Zemel, 1995) and the Free Energy Principle (Friston).

Perception as Inference
Visual demonstrations: Laboratory of Dale Purves MD, http://purveslab.net/seeforyourself/

Perception as Inference: Bumps and Holes
Bayes rule in perception, action and cognition (Wolpert and Ghahramani, 2005). Flip the image and the bumps become holes:

  P(\text{state} \mid \text{sensory input}) = \frac{P(\text{sensory input} \mid \text{state})\, P(\text{state})}{P(\text{sensory input})}

Here state 1 is the light source and state 2 is the bumps and holes. This implies a hierarchy of causes (see later).

Outline
Updating over time

Updating Prior Beliefs: Decision-Making Dynamics (Yu, Dayan & Cohen, 2009)
In the Eriksen flanker task, subjects have to implement the following stimulus-response mappings:

  1: HHH → Right
  2: SHS → Right
  3: SSS → Left
  4: HSH → Left

The subject should press the right button if the central cue is H and the left button if it is S. On trial types one and three the flankers are compatible (M = C); on two and four they are incompatible (M = I).

Decision-Making Dynamics
Generative model: the subject receives a noisy pattern of visual inputs from the three-letter stimulus s = [s_1, s_2, s_3] (with s_1 = s_3). On each trial, assume a Gaussian neuronal response from three populations, x_t = [x_1(t), x_2(t), x_3(t)]:

  p(x \mid s) = p(x_1 \mid s_1)\, p(x_2 \mid s_2)\, p(x_3 \mid s_3)
             = N(\mu(s_1), \sigma^2)\, N(\mu(s_2), \sigma^2)\, N(\mu(s_3), \sigma^2)

Some numbers for the sensory response: assume that for flanker stimulus H (s_1 = s_3), μ(s_1) = μ(s_3) = −1, and for flanker stimulus S, μ(s_1) = μ(s_3) = 1. Assume independence of successive observations while the stimulus is on-screen, i.e. evidence accumulates during perception:

  p(x_{t=1}, x_{t=2}, \dots, x_{t=T} \mid s) = p(x_{t=1} \mid s)\, p(x_{t=2} \mid s) \cdots p(x_{t=T} \mid s)

Decision-Making Dynamics
For a stream of inputs, the ideal observer's belief about the identity of the target s_2 and the compatibility M at time t is a function of the belief at the previous time point and the latest input:

  P(s_2, M \mid X_t) = \frac{p(x_t \mid s_2, M)\, P(s_2, M \mid X_{t-1})}{\sum_{s_2', M'} p(x_t \mid s_2', M')\, P(s_2', M' \mid X_{t-1})}

  P(s_2 = H \mid X_t) = P(s_2 = H, M = C \mid X_t) + P(s_2 = H, M = I \mid X_t)

Initialise with P(s_2, M \mid X_0) = 0.5 × 0.5. The agent then updates its beliefs with each sensory input and makes a response, e.g. when P(s_2 = H \mid X_t) > q (some threshold probability). (Yu, Dayan & Cohen, 2009)
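A minimal simulation sketch of this recursive update is given below. This is not the authors' code; the noise level, trial length and response threshold q are hypothetical choices.

```python
# Minimal sketch of the recursive Bayesian update for the flanker task above.
# Stimulus means follow the slides (H -> -1, S -> +1); noise and q are hypothetical.
import numpy as np
from scipy.stats import norm

MU = {'H': -1.0, 'S': 1.0}        # mean neuronal response per letter (as on the slides)
SIGMA = 2.0                       # hypothetical sensory noise (standard deviation)
rng = np.random.default_rng(0)

def stimulus(s2, M):
    """Three-letter stimulus implied by target identity s2 and compatibility M."""
    flank = s2 if M == 'C' else ('S' if s2 == 'H' else 'H')
    return [flank, s2, flank]

hypotheses = [(s2, M) for s2 in 'HS' for M in 'CI']

def run_trial(true_s2='H', true_M='I', q=0.9, max_T=100):
    belief = {h: 0.25 for h in hypotheses}            # P(s2, M | X_0) = 0.5 * 0.5
    true_mu = [MU[letter] for letter in stimulus(true_s2, true_M)]
    for t in range(1, max_T + 1):
        x = rng.normal(true_mu, SIGMA)                # noisy population responses x_t
        # Bayes update: posterior over (s2, M) proportional to likelihood * previous belief
        for h in hypotheses:
            mu_h = [MU[letter] for letter in stimulus(*h)]
            belief[h] *= np.prod(norm.pdf(x, mu_h, SIGMA))
        Z = sum(belief.values())
        belief = {h: p / Z for h, p in belief.items()}
        p_H = belief[('H', 'C')] + belief[('H', 'I')]  # marginal belief that the target is H
        if p_H > q or (1 - p_H) > q:                   # respond once a threshold is crossed
            return ('Right' if p_H > q else 'Left'), t
    return None, max_T

print(run_trial('H', 'I'))   # incompatible trial: typically slower and more error-prone
```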
Compatibility Bias Hypothesis
What if there were a compatibility bias, P(M = C) > 0.5? Initialise instead with

  P(s_2, M \mid X_0) = 0.5 \times p(M), \qquad p(M = C) = 0.9

and run the same updates with these new priors for the four trial types (1: HHH Right, 2: SHS Right, 3: SSS Left, 4: HSH Left). [Figure: P(s_2 = H \mid X_t) over time for each trial type.] Is there a compatibility bias in this task in human agents? (Yu, Dayan & Cohen, 2009)

Spatial Uncertainty Hypothesis
What if each population x also responded to neighbouring stimuli?

Before: p(x \mid s) = N(\mu(s_1), \sigma^2)\, N(\mu(s_2), \sigma^2)\, N(\mu(s_3), \sigma^2)

Now:

  x_1(t) \sim N(a_1\mu_1 + a_2\mu_2,\ \sigma_1^2 + \sigma_2^2)
  x_2(t) \sim N(a_1\mu_2 + a_2\mu_1 + a_2\mu_3,\ \sigma_1^2 + 2\sigma_2^2)
  x_3(t) \sim N(a_1\mu_3 + a_2\mu_2,\ \sigma_1^2 + \sigma_2^2)

where a_1 is the primary stimulus signal, a_2 the neighbouring stimulus signal, σ_1 the primary stimulus noise and σ_2 the neighbouring stimulus noise. If each x were driven equally by all stimuli (a_1 = a_2), there would be no spatial discrimination: the H/S response would be based on a majority vote, i.e. on the flankers. With some discrimination ability, the same update

  P(s_2, M \mid X_t) = \frac{p(x_t \mid s_2, M)\, P(s_2, M \mid X_{t-1})}{\sum_{s_2', M'} p(x_t \mid s_2', M')\, P(s_2', M' \mid X_{t-1})}

was initialised with a_1 = 1.7, a_2 = 0.3, σ_1 = 6, σ_2 = 3.5 (together with two further constants, 0.5 and 0.03) and q = 0.9. [Figure: belief trajectories with spatial uncertainty versus no spatial uncertainty.] (Yu, Dayan & Cohen, 2009)
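The two hypotheses only change the initial belief and the likelihood in the recursion above. A hypothetical sketch of both pieces (not the authors' implementation) follows; the helper names are illustrative and the functions drop into the earlier simulation sketch.

```python
# Hypothetical sketch of the two model variants above (not the authors' code):
# (i) a compatibility-bias prior and (ii) a spatial-uncertainty likelihood.
import numpy as np
from scipy.stats import norm

def biased_prior(p_compatible=0.9):
    """P(s2, M | X_0) = 0.5 * p(M), with p(M = C) = p_compatible."""
    return {(s2, M): 0.5 * (p_compatible if M == 'C' else 1.0 - p_compatible)
            for s2 in 'HS' for M in 'CI'}

def spatial_likelihood(x, mu, a1=1.7, a2=0.3, s1=6.0, s2=3.5):
    """p(x_t | s2, M) when each population also pools neighbouring stimuli.
    x = [x1, x2, x3] observed responses; mu = [mu1, mu2, mu3] for the hypothesised stimulus."""
    means = [a1 * mu[0] + a2 * mu[1],
             a1 * mu[1] + a2 * (mu[0] + mu[2]),
             a1 * mu[2] + a2 * mu[1]]
    sds = [np.sqrt(s1**2 + s2**2),
           np.sqrt(s1**2 + 2 * s2**2),
           np.sqrt(s1**2 + s2**2)]
    return float(np.prod(norm.pdf(x, means, sds)))

# In the earlier sketch, initialise `belief` with biased_prior() and/or replace the
# isotropic Gaussian likelihood with spatial_likelihood() to compare the hypotheses.
print(spatial_likelihood([0.5, -1.0, 0.2], [-1, -1, -1]))
```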
Outline
A Brief Introduction to the Free Energy Principle

Review: Nature Reviews Neuroscience 11, 127-138 (February 2010). "The free-energy principle: a unified brain theory?" Karl Friston.
A free-energy principle has been proposed recently that accounts for action, perception and learning. This Review looks at some key brain theories in the biological (for example, neural Darwinism) and physical (for example, information theory and optimal control theory) sciences from the free-energy perspective. Crucially, one key theme runs through each of these theories: optimization. Furthermore, if we look closely at what is optimized, the same quantity keeps emerging, namely value (expected reward, expected utility) or its complement, surprise (prediction error, expected cost). This is the quantity that is optimized under the free-energy principle, which suggests that several global brain theories might be unified within a free-energy framework.

… in the sensorium
There was a particular sound. The sound has dynamics determined by its properties, frequency x_1 and amplitude x_2, which cause the sensations ỹ.

The tautology: biological agents act and perceive to preclude phase transitions, i.e. they minimise the entropy of their sensations ỹ. Why do we not walk into fire? Since entropy is average surprise, this amounts to minimising

  -\ln p(\tilde{y}) \rightarrow \min

Surprise itself cannot be evaluated directly, but it is bounded by a free energy that the agent can evaluate:

  F \ge -\ln p(\tilde{y} \mid m)
  F = -\ln p(\tilde{y} \mid m) + D\!\left(q(\vartheta)\,\big\|\,p(\vartheta \mid \tilde{y})\right)

What does this tell us about the brain? It implies:
1. A generative model, m.
2. An ensemble/recognition density, q: the probability of an environmental state given the brain state.
3. Perception: minimising the divergence between the recognition density and the posterior probability of the causes, moving towards the bound.
4. Action: resampling the environment.

Predictions from this scheme (DEM inversion of a hierarchical dynamical model) can be compared with EEG responses using DCM for ERPs.
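The bound can be checked numerically in a toy setting. The sketch below is an assumption-laden illustration, not from the slides: it uses a one-level Gaussian model (prior N(0, σ₀²) over a hidden cause, Gaussian sensory noise) and shows that the free energy of any Gaussian recognition density q upper-bounds surprise, with equality when q is the true posterior.

```python
# Minimal sketch (hypothetical one-parameter Gaussian model, not from the slides)
# of the variational bound F >= -ln p(y|m), with equality when q equals the posterior.
import numpy as np

sigma0, sigmay = 1.0, 0.5        # prior std of the hidden cause x; sensory noise std
y = 1.3                          # a single sensation

def free_energy(mq, sq):
    """F = E_q[-ln p(y, x)] - H[q] for a Gaussian recognition density q(x) = N(mq, sq^2)."""
    e_lik   = 0.5 * np.log(2*np.pi*sigmay**2) + ((y - mq)**2 + sq**2) / (2*sigmay**2)
    e_prior = 0.5 * np.log(2*np.pi*sigma0**2) + (mq**2 + sq**2) / (2*sigma0**2)
    entropy = 0.5 * np.log(2*np.pi*np.e*sq**2)
    return e_lik + e_prior - entropy

# Surprise (negative log evidence): p(y|m) = N(0, sigma0^2 + sigmay^2)
surprise = 0.5*np.log(2*np.pi*(sigma0**2 + sigmay**2)) + y**2 / (2*(sigma0**2 + sigmay**2))

# True posterior, using the conjugate Gaussian update derived earlier in the lecture
post_prec = 1/sigma0**2 + 1/sigmay**2
post_mean = (y / sigmay**2) / post_prec

print(free_energy(0.0, 1.0) >= surprise)                      # True: F bounds surprise
print(np.isclose(free_energy(post_mean, post_prec**-0.5),     # equality at q = posterior
                 surprise))
```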
… in the sensorium: Hierarchies and Dynamics
1. Generative model: a Hierarchical Dynamical Model (HDM) in generalised coordinates of motion,

  \dot{\tilde{x}}^{(i)} = \tilde{f}(\tilde{x}^{(i)}, \tilde{v}^{(i)}) + \tilde{w}^{(i)}
  \tilde{v}^{(i-1)} = \tilde{g}(\tilde{x}^{(i)}, \tilde{v}^{(i)}) + \tilde{z}^{(i)}
  \vdots
  \dot{\tilde{x}}^{(1)} = \tilde{f}(\tilde{x}^{(1)}, \tilde{v}^{(1)}) + \tilde{w}^{(1)}
  \tilde{y} = \tilde{g}(\tilde{x}^{(1)}, \tilde{v}^{(1)}) + \tilde{z}^{(1)}

The prediction at the ith level is a prior on the level below. These causes are probabilistic: they are many-to-one mappings ("four candles" vs "fork handles").

HDM: Probabilistic form

  p(\tilde{y}, \tilde{x}^{(1)}, \tilde{v}^{(1)}) = p(\tilde{y} \mid \tilde{x}^{(1)}, \tilde{v}^{(1)})\, p(\tilde{x}^{(1)}, \tilde{v}^{(1)})
  p(\tilde{y} \mid \tilde{x}^{(1)}, \tilde{v}^{(1)}) = N(\tilde{y} : \tilde{g}, \Pi(\lambda)_z^{-1})
  p(\tilde{x}^{(1)}, \tilde{v}^{(1)}) = p(\tilde{x}^{(1)} \mid \tilde{v}^{(1)})\, p(\tilde{v}^{(1)})
  p(\tilde{x}^{(1)} \mid \tilde{v}^{(1)}) = N(D\tilde{x} : \tilde{f}, \Pi(\lambda)_w^{-1})
  p(\tilde{v}^{(1)}) = N(\tilde{v}^{(1)} : \tilde{\eta}, \Pi_v^{-1})

The Gaussian noise terms, expressed in generalised coordinates, account for higher-order correlations within levels. (A small simulation sketch of a two-level model of this form is given after the summary below.)

Hypothesis: model inversion in brain dynamics (tomorrow).

Summary
- Conjugacy of a Gaussian prior and a Gaussian likelihood allows simple updates of the sufficient statistics.
- Maximum likelihood is the equivalent estimate under uniform priors.
- Sensory fusion through ML estimates is observed experimentally.
- Evidence-accumulation models are available through Bayes rule.
- Parameter priors can dramatically change updating behaviour.
- Similar behaviours can be encapsulated by different models per se: coming up…
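As a closing illustration of the hierarchical dynamical model above, here is a minimal simulation sketch (not Friston's DEM code): a two-level model with linear f and g, ignoring generalised coordinates for simplicity, in which the slow dynamics at level 2 supply the cause that drives the faster dynamics at level 1.

```python
# Closing sketch (assumptions: linear f and g, no generalised coordinates):
# generating sensations y from a two-level hierarchical dynamical model.
import numpy as np

rng = np.random.default_rng(1)
dt, T = 0.01, 1000                       # integration step and number of samples

def simulate():
    x2, x1 = 0.0, 0.0                    # hidden states at levels 2 and 1
    v2 = 1.0                             # top-level cause (a constant input here)
    y = np.zeros(T)
    for t in range(T):
        # Level 2: slow dynamics driven by the top-level cause, plus state noise w2
        x2 += dt * (-0.5 * x2 + v2) + np.sqrt(dt) * 0.05 * rng.normal()
        v1 = x2 + 0.05 * rng.normal()    # g(x2, v2) + z2 becomes the cause for level 1
        # Level 1: faster dynamics driven by the cause from above, plus state noise w1
        x1 += dt * (-2.0 * x1 + v1) + np.sqrt(dt) * 0.05 * rng.normal()
        y[t] = x1 + 0.1 * rng.normal()   # sensations y = g(x1, v1) + z1
    return y

y = simulate()
print(y[:5])   # the slow level-2 dynamics are expressed through the faster level-1 states
```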