MBI-Machiraju-lecture9 - Ohio State Computer Science and

Semi-Supervised
State Space Models
A Big Thanks To 
Istavan (Pisti) Morocz, Firdaus Janoos,
Prof. Jason Bohland
OSU/Harvard,MIT/Exxon
Harvard,
MNI
Quantitative Neuroscience Laboratory
Boston University
Sources
NIPS 2011
http://neufo.org/lecture_events
A Running Example
Dyscalculia
Difficulty in learning arithmetic that
cannot be explained by mental
retardation, inappropriate schooling, or
poor social environment
Core conceptual deficit dealing with
numbers
Very common: 3-6% of school-age children
Heterogeneous
Dyslexia
Selective inability to build a visual representation of a
word, used in subsequent language processing, in the
absence of general visual impairment or speech
disorders
Affects 5-10% of the population
Spelling, phonological processing, word retrieval
Disorder of the visual word form system
Multiple varieties
Occipital, temporal, frontal, cerebellum
Experimental protocols
Event-related designs
- single stimuli/“events” at any time
point
- Periodic or spread across
frequencies
- Require rapidly acquired data (small TR)
- Rapid events (less than ~20s apart)
give rise to temporal summation of
BOLD response
- Summation is close to linear, but
non-linearities are evident for small
ISIs.
Stimulus function (s(t))
Mental Arithmetic Paradigm
Mental Arithmetic
Involves basic manipulation of number and
quantities
Magnitude based system – bilateral IPS
Verbal based system – left AG
Attentional system – posterior superior parietal lobule
Other systems – SMA, primary visual cortex, liPFC,
insula, etc
Cascadic Recruitment
Classical fMRI Pipeline
State-of-the-Art - ROI
Janoos et al., EuroVis2009
Another Way ?
Multi-voxel pattern analysis
Traditional analysis has focused on the relationship
between task and individual brain voxels (or regions)
MVPA uses patterns of observed activation across sets of
voxels to decode represented information
– Relies on machine learning / pattern classification algorithms
– Claim: more sensitive detection of cognitive states (Mind
Reading)
– Does not employ spatial smoothing
– Typically conducted within individual subjects
Inter-voxel differences
contain information!
http://www.mrc-cbu.cam.ac.uk/people/nikolaus.kriegeskorte/infonotacti.html
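The idea that inter-voxel differences carry information can be sketched with a correlation-based nearest-centroid decoder. The voxel values, the condition names and the choice of classifier are all assumptions made for illustration; the two conditions below have the same mean activation, so only the pattern across voxels separates them.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def train_centroids(patterns, labels):
    """Average the voxel patterns belonging to each condition."""
    grouped = {}
    for pattern, label in zip(patterns, labels):
        grouped.setdefault(label, []).append(pattern)
    return {
        label: [sum(column) / len(column) for column in zip(*members)]
        for label, members in grouped.items()
    }


def decode(pattern, centroids):
    """Return the condition whose mean pattern correlates best with `pattern`."""
    return max(centroids, key=lambda label: pearson(pattern, centroids[label]))


# Hypothetical training data: four voxels, two conditions, equal mean activation.
train_patterns = [
    [1.0, 0.1, 0.9, 0.0],
    [0.8, 0.0, 1.1, 0.2],
    [0.1, 1.0, 0.0, 0.9],
    [0.0, 0.9, 0.2, 1.0],
]
train_labels = ["condition_a", "condition_a", "condition_b", "condition_b"]

centroids = train_centroids(train_patterns, train_labels)
decoded_first = decode([0.7, 0.2, 0.8, 0.1], centroids)
decoded_second = decode([0.2, 0.8, 0.1, 0.7], centroids)
```

Spatial smoothing would blur exactly the voxel-to-voxel contrast this decoder relies on, which is one reading of why MVPA does not employ it.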
Brain States
Inspiration
Haxby, 2001
Mitchell, 2008
Functional Networks
Functional / Effective Connectivity
Standard analysis of fMRI data conforms to a functional
segregation approach to brain function
i.e. brain regions are active for a stimulus type
Assumes the inputs have access to all brain regions
Pertinent Question: How do active brain regions interact with
one another? [ functional integration ]
Effective Connectivity = the functional strength of a specific
anatomical connection during a particular cognitive task; i.e. the
influence that one region has on another. ( Inferred )
Functional Connectivity = the temporal correlation between signal
from two brain regions during a cognitive task ( Measured )
[ But these are exceptionally fuzzy terms ]
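Functional connectivity in the measured sense above, the temporal correlation between signals from two regions, can be sketched as a pairwise correlation table. The region names and time series are synthetic and hypothetical; nothing here estimates effective connectivity, which has to be inferred from a model.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length time series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def functional_connectivity(region_series):
    """Correlation between every pair of region time series, keyed by name pair."""
    names = sorted(region_series)
    return {
        (p, q): pearson(region_series[p], region_series[q])
        for p in names
        for q in names
    }


n_scans = 40
slow_wave = [math.sin(2 * math.pi * t / 20) for t in range(n_scans)]
region_series = {
    "region_a": slow_wave,
    "region_b": [2.0 * v + 1.0 for v in slow_wave],   # scaled copy of region_a
    "region_c": [math.cos(2 * math.pi * t / 20) for t in range(n_scans)],  # quarter cycle out of phase
}
fc = functional_connectivity(region_series)
```

Note that `region_c` is a shifted copy of `region_a` yet shows near-zero correlation, a small reminder of how much the measure depends on timing.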
A Solution – State Space Models
Functional Distance ?
Is Zt1 < Zt2, or Zt2 < Zt3? Sort Zt1, Zt2, Zt3.
[Figure: activation patterns Zt1, Zt2, Zt3]
State Space Model
Comprehensive Model
State-Space Model
Janoos et al., MICCAI 2010
Computational Workflow
Feature Space Estimation
Functional Distance
Transportation Distance
Zt – activation patterns
f – transportation
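A transportation distance scores two activation patterns by the cheapest way of moving activation "mass" from one into the other. The sketch below swaps in a deliberately simplified setting: voxels sit on a line with unit spacing, so the optimum has a closed form. The lecture's distance is posed over functional relationships between voxels and needs an optimization solver; the patterns z1, z2, z3 are hypothetical.

```python
import math


def transportation_distance_1d(z_a, z_b):
    """Minimal cost of moving normalized activation z_a onto z_b along a line of voxels."""
    if len(z_a) != len(z_b):
        raise ValueError("patterns must cover the same voxels")
    total_a, total_b = sum(z_a), sum(z_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("patterns must carry positive total activation")
    carried = 0.0   # mass that still has to move past the current voxel
    cost = 0.0
    for a, b in zip(z_a, z_b):
        carried += a / total_a - b / total_b
        cost += abs(carried)
    return cost


def euclidean(z_a, z_b):
    """Ordinary voxel-wise distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z_a, z_b)))


z1 = [1.0, 0.0, 0.0, 0.0, 0.0]
z2 = [0.0, 1.0, 0.0, 0.0, 0.0]   # activation shifted by one voxel
z3 = [0.0, 0.0, 0.0, 0.0, 1.0]   # activation shifted by four voxels

d12 = transportation_distance_1d(z1, z2)
d13 = transportation_distance_1d(z1, z3)
```

Euclidean distance treats z2 and z3 as equally far from z1, while the transportation distance orders them, which is what sorting activation patterns requires.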
Functional Connectivity Estimation
Gaussian smoothing
HAC until ≈0.25N
Cluster-wise Correlation Estimation and Shrinkage
Voxel-wise Correlation Estimation
Clustering in Functional Space
[Figure: brain state label (0 to 10) against time (0 s, 4 s, 8 s), two panels]
Critique
No neurophysiologic model
Point estimates
Hemodynamic uncertainty
Temporal structure
Functional distance – an optimization problem
No metric structure
Expensive!
Embeddings
A Solution
Distortion minimizing
Feature Space Φ
Orthogonal Bases
Graph Partitioning
Normalized graph Laplacian of F
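For reference, the standard symmetric normalized graph Laplacian and its orthonormal eigenbasis can be written as below. This assumes F is a symmetric, non-negative functional connectivity (affinity) matrix between voxels; the slides name the Laplacian but do not spell out its form, so the exact variant used in the lecture may differ.

```latex
\mathcal{L} = I - D^{-1/2} F D^{-1/2},
\qquad D_{ii} = \sum_{j} F_{ij},
\qquad \mathcal{L}\,\phi_k = \lambda_k\,\phi_k,
\qquad \phi_k^{\top}\phi_l = \delta_{kl}
```

Because this form of the Laplacian is symmetric, its eigenvectors can be chosen orthonormal, which is where an orthogonal basis for the feature space Φ would come from under this reading.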
Working in Feature Space Φ
Feature Selection
[Flowchart, repeated R times: Y → Resampling with Replacement → Functional Network Estimation → Basis Vector φ(l,m) Computation → Bootstrap Distribution of Correlations ρ(l,m)]
Feature Selection: retain φ(l,m) if Pr[ρ(l,m) ≥ τΦ] ≥ 0.75, giving Φ
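The retention rule Pr[ρ ≥ τ] ≥ 0.75 can be sketched with a plain bootstrap. What ρ(l,m) is computed between is not spelled out in the slides, so this sketch assumes a stand-in: the correlation between a candidate feature's time course and a hypothetical reference series. The threshold 0.5, the 200 resamples and all data are illustrative.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation (0.0 if either resample is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def bootstrap_retain(feature, reference, threshold, n_resamples=200, min_prob=0.75, seed=0):
    """Estimate Pr[rho >= threshold] over resamples and apply the retention rule."""
    rng = random.Random(seed)
    n = len(feature)
    hits = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]   # resampling with replacement
        rho = pearson([feature[i] for i in idx], [reference[i] for i in idx])
        if rho >= threshold:
            hits += 1
    prob = hits / n_resamples
    return prob >= min_prob, prob


reference = [float(t) for t in range(30)]
stable_feature = [2.0 * t + 1.0 for t in range(30)]       # tracks the reference
alternating_feature = [(-1.0) ** t for t in range(30)]    # unrelated to the reference

keep_stable, prob_stable = bootstrap_retain(stable_feature, reference, threshold=0.5)
keep_alternating, prob_alternating = bootstrap_retain(alternating_feature, reference, threshold=0.5)
```

Thresholding the bootstrap probability rather than a single correlation estimate is what makes the selection robust to sampling noise.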
Model Size Selection
- Strike a balance between model complexity and model fit
- Information-theoretic or Bayesian criteria
– Notion of model complexity
- Cross-validation
– IID assumption
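As a reminder of what an information-theoretic or Bayesian criterion looks like, two common examples are shown below. The slides do not say which criteria were considered, so these are illustrative only. Here k is the number of free parameters, n the number of observations and \hat{L} the maximized likelihood.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Both need a count k as their notion of model complexity, which may be hard to define for a given model. Cross-validation avoids that, but standard fold splitting treats samples as independent and identically distributed, which temporally correlated fMRI scans are not.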
Estimation
Chosen Method
[Flowchart: fMRI Data Y → Feature-Space Transformation (feature-space basis Φ) → y; Hyperparameter Selection of K, λW by Error Rate → hyperparameters]
Model Estimation
Repeat until convergence:
E-step: compute $q^{(n)}(x, z)$ from $p(y, z, x \mid \theta^{(n)})$
M-step: estimate $\theta^{(n+1)}$ such that $L(q^{(n)}, \theta^{(n+1)}) > L(q^{(n)}, \theta^{(n)})$
[Diagram labels: stimulus s, parameters θ]
State Sequence Estimation
Repeat until convergence:
E-step: compute $q^{(n)}(z)$ from $p(z \mid y, x^{(n)}, \theta)$
M-step: $x^{(n+1)} = \arg\max_{x} L(q^{(n)}, x)$
Premise - EM Algorithm
Generalized EM Algorithm
http://mplab.ucsd.edu/tutorials/EM.pdf
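The alternation between an E-step and an M-step can be seen on a toy problem. The sketch below fits a two-component one-dimensional Gaussian mixture, which is a stand-in chosen for brevity and not the lecture's state-space model; the data and starting values are made up. It records the log-likelihood at each iteration, which EM never decreases.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, mean_a, mean_b, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture; returns parameters and log-likelihoods."""
    weight_a, var_a, var_b = 0.5, 1.0, 1.0
    log_likelihoods = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component a.
        resp = []
        log_lik = 0.0
        for x in data:
            p_a = weight_a * normal_pdf(x, mean_a, var_a)
            p_b = (1.0 - weight_a) * normal_pdf(x, mean_b, var_b)
            resp.append(p_a / (p_a + p_b))
            log_lik += math.log(p_a + p_b)
        log_likelihoods.append(log_lik)
        # M-step: re-estimate parameters from the responsibility-weighted data.
        n_a = sum(resp)
        n_b = len(data) - n_a
        mean_a = sum(r * x for r, x in zip(resp, data)) / n_a
        mean_b = sum((1.0 - r) * x for r, x in zip(resp, data)) / n_b
        var_a = max(sum(r * (x - mean_a) ** 2 for r, x in zip(resp, data)) / n_a, 1e-6)
        var_b = max(sum((1.0 - r) * (x - mean_b) ** 2 for r, x in zip(resp, data)) / n_b, 1e-6)
        weight_a = n_a / len(data)
    return (mean_a, mean_b, weight_a), log_likelihoods


data = [-2.3, -2.1, -1.9, -2.0, -1.7, -2.2, 1.8, 2.0, 2.2, 2.1, 1.9, 2.4]
(est_mean_a, est_mean_b, est_weight_a), log_likelihoods = em_two_gaussians(data, -1.0, 1.0)
```

A generalized EM algorithm relaxes the M-step: it only has to increase the bound L(q, θ), not maximize it, which matters when the exact maximizer has no closed form.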
Mean Field Approximation
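In general terms, a mean field approximation restricts q to a factorized family so that the variational lower bound stays tractable. The expressions below use the symbols y, x, z and θ from the E-step and M-step definitions. The particular factorization between x and z is an assumption; the slides only name the approximation.

```latex
\ln p(y \mid \theta) \;\ge\; L(q, \theta)
  = \mathbb{E}_{q(x,z)}\!\left[\ln p(y, z, x \mid \theta)\right] + H[q],
\qquad
q(x, z) \approx q_x(x)\, q_z(z)
```

Here H[q] is the entropy of q, and the bound is tight when q equals the exact posterior p(x, z | y, θ).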
Experimental Conditions
Comprehensive Model
Comparisons
HRFs
Optimal States
Spatial Maps
Population Studies (sort of)
Interpretation
Dyscalculic
Control
Dyslexic
Janoos et al., NeuroImage, 2011
MDS Plots
Control Male
Control Female
Dyslexic Male
Dyslexic Female
Dyscalculic Male
Dyscalculic Female
Stage-wise Error Plots
Stage-wise MDS Plots
Phase 1
Phase 1: Product Size
Phase 2
Phase 2: Problem Difficulty
What Else ?
Maximally Predictive Criteria
Multiple spatio-temporal patterns in fMRI
- Neurophysiological: task-related vs. default networks
- Extraneous: breathing, pulsatile, scanner drift
Select a model that is maximally predictive with respect to task
- Predictability of optimal state-sequence from stimulus, s
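One way to make "maximally predictive" concrete is to score each candidate model by how well its optimal state sequence can be predicted from the stimulus on held-out scans, then keep the model with the lowest error rate. The predictor below (most frequent state per stimulus value) is a deliberately simple stand-in, and the model names, stimulus labels and state sequences are hypothetical.

```python
from collections import Counter, defaultdict


def fit_stimulus_to_state(stimuli, states):
    """Map each stimulus value to the state it most often co-occurs with."""
    counts = defaultdict(Counter)
    for s, z in zip(stimuli, states):
        counts[s][z] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def error_rate(mapping, stimuli, states):
    """Fraction of scans whose state is not the one predicted from the stimulus."""
    wrong = sum(1 for s, z in zip(stimuli, states) if mapping.get(s) != z)
    return wrong / len(states)


train_stimuli = ["rest", "task", "rest", "task", "rest", "task"]
test_stimuli = ["task", "rest", "task", "rest"]

# Hypothetical optimal state sequences produced by two candidate models.
candidates = {
    "model_a": {"train": [0, 1, 0, 1, 0, 1], "test": [1, 0, 1, 0]},
    "model_b": {"train": [0, 1, 0, 2, 2, 1], "test": [2, 0, 1, 2]},
}

errors = {}
for name, sequences in candidates.items():
    mapping = fit_stimulus_to_state(train_stimuli, sequences["train"])
    errors[name] = error_rate(mapping, test_stimuli, sequences["test"])

best_model = min(errors, key=errors.get)
```

A model whose states track breathing or scanner drift would score poorly here, since those patterns are not predictable from the stimulus.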
“Resting State”
Rather than evoked responses, rs-fMRI looks at random, low-frequency fluctuations of BOLD activity (Biswal, 1995)
- “Industry standard” filters data at ~0.01 < f < 0.08 Hz
“Default mode” network (Raichle et al., 2001)
- Set of regions with correlated BOLD activity
- Activation decreases when subjects perform an explicit task
- Ventromedial PFC, precuneus, temporal-parietal junction…
But the default mode is only one network that emerges from the
correlation structure of resting state networks
- Smith et al. (2009) showed various task-active networks emerge
from ICA-based interrogation of rs-fMRI data
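The ~0.01 < f < 0.08 Hz filtering step can be sketched as an ideal band-pass in the frequency domain. This is a minimal illustration, not a recommended pipeline: the brick-wall filter is a stand-in for whatever filter design a given toolbox uses, the naive DFT is only practical for short series, and the TR of 2 s and the synthetic signal are assumptions.

```python
import cmath
import math


def bandpass(signal, tr, low_hz=0.01, high_hz=0.08):
    """Zero every DFT bin outside (low_hz, high_hz) and transform back."""
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) / (n * tr)   # frequency of bin k in Hz
        if not (low_hz < freq < high_hz):
            spectrum[k] = 0
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


tr = 2.0          # seconds per scan (assumed)
n_scans = 100
slow = [math.sin(2 * math.pi * 0.05 * t * tr) for t in range(n_scans)]   # 0.05 Hz, inside the band
fast = [math.sin(2 * math.pi * 0.20 * t * tr) for t in range(n_scans)]   # 0.20 Hz, outside the band
baseline = 3.0                                                           # constant offset at 0 Hz

raw = [s + 0.5 * f + baseline for s, f in zip(slow, fast)]
filtered = bandpass(raw, tr)
max_error = max(abs(a - b) for a, b in zip(filtered, slow))
```

With these settings the filtered series matches the 0.05 Hz component, while the offset and the faster oscillation are removed.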
Summary
- Process model for fMRI
– Spatial patterns and the temporal structure
– Identification of internal mental processes
- Neurophysiologically plausible
– Test for the effects of experimental variables
– Parameter interpretation
- Comparison of mental processes
– Abstract representation of patterns
Thank You for Putting Up
with me for 9 Lectures