Hierarchical Models Stefan Kiebel Wellcome Dept. of Imaging Neuroscience University College London Overview Why hierarchical models? The model Estimation Expectation-Maximization Summary Statistics approach Variance components Examples Overview Why hierarchical models? The model Estimation Expectation-Maximization Summary Statistics approach Variance components Examples Data fMRI, single subject fMRI, multi-subject EEG/MEG, single subject ERP/ERF, multi-subject Hierarchical model for all imaging data! Reminder: voxel by voxel model specification Time parameter estimation hypothesis statistic Intensity single voxel time series SPM Overview Why hierarchical models? The model Estimation Expectation-Maximization Summary Statistics approach Variance components Examples General Linear Model p 1 y X 1 1 y N = X N N: number of scans p: number of regressors p + N Model is specified by 1. Design matrix X 2. Assumptions about Linear hierarchical model Hierarchical model (1 ) (2) y X (1 ) X (2) (1 ) (1 ) (2) C ( n 1 ) X (n) (n) Multiple variance components at each level (i) Q (n) At each level, distribution of parameters is given by level above. What we don’t know: distribution of parameters and variance parameters. (i) k k (i) k Example: Two level model 1 1 yX 1 X 2 2 (1) y = X1 1 2 1 2 + 1 (1) X2 1 = X 2 + 2 (1) X3 Second level First level Algorithmic Equivalence Hierarchical model (1 ) (2) (2) (n) (n) y X (1 ) X (2) (1 ) (1 ) ( n 1 ) X (n) Parametric Empirical Bayes (PEB) EM = PEB = ReML Single-level model y X (1 ) X ... (1 ) ( n 1 ) (2) (1 ) (n) (n) X X (1 ) X (n) Restricted Maximum Likelihood (ReML) Overview Why hierarchical models? The model Estimation Expectation-Maximization Summary Statistics approach Variance components Examples Estimation y X N 1 Np p 1 EM-algorithm N 1 1 C |y ( X C X ) T |y C |y X C y T maximise L ln p(y| λ ) g J d M-step d L d 2 1 J g k Assume, at voxel j: E-step dL 2 C k Qk 1 1 jk j k Friston et al. 2002, Neuroimage And in practice? Most 2-level models are just too big to compute. And even if, it takes a long time! Moreover, sometimes we‘re only interested in one specific effect and don‘t want to model all the data. Is there a fast approximation? Summary Statistics approach First level Data Design Matrix ˆ1 ˆ12 Second level Contrast Images t T c ˆ T Vaˆr ( c ˆ ) SPM(t) ˆ 2 ˆ 22 ˆ11 ˆ11 2 ˆ12 ˆ12 2 One-sample t-test @ 2nd level Validity of approach The summary stats approach is exact under i.i.d. assumptions at the 2nd level, if for each session: Within-session covariance the same First-level design the same One contrast per session All other cases: Summary stats approach seems to be robust against typical violations. Mixed-effects y X [X (0) (1 ) X data X [X ] Q {Q V I Summary statistics (1 ) 1 X (1 ) , , X X (1 ) (2) Q ] (2) 1 X (1 ) T Step 1 ˆ (1 ) (X V T 1 X) 1 T X V 1 y REML { yy Y ˆ V T n , X , Q} (1 ) X X (2) (1 ) i X (1 ) (1 ) Qi X (1 ) T i EM approach (0) (2) j (2) Qj j Step 2 ˆ (2) ˆ (X V T (2) 1 X) 1 T X V 1 y Friston et al. (2004) Mixed effects and fMRI studies, Neuroimage , } Overview Why hierarchical models? The model Estimation Expectation-Maximization Summary Statistics approach Variance components Examples Sphericity C Cov( ) E ( ) T y X Scans ‚sphericity‘ means: Cov( ) I 2 Var ( i ) 1 2 Scans i.e. 2 2nd level: Non-sphericity Error covariance Errors are independent but not identical Errors are not independent and not identical 1st level fMRI: serial correlations t a t 1 t with t ~ N (0, 2 ) autoregressive process of order 1 (AR-1) N Cov( ) autocovariancefunction N Restricted Maximum Likelihood y X Cov( ) ? observed Q1 yj yj T voxel j ReML estimated Q2 ˆ1Q1 ˆ2Q2 Estimation in SPM5 y j X j j Cˆ Coˆv( ) ReML( y j y j , X , Q) T voxel j ReML (pooled estimate) ˆ j ,OLS X yj ˆ j , ML ( X V X ) X V y j Ordinary least-squares •optional in SPM5 •one pass through data •statistic using effective degrees of freedom T 1 1 T 1 Maximum Likelihood •2 passes (first pass for selection of voxels) •more accurate estimate of V 2nd level: Non-sphericity Errors can be non-independent and non-identical when… Several parameters per subject, e.g. repeated measures design Complete characterization of the hemodynamic response, e.g. taking more than 1 HRF parameter to 2nd level Example data set (R Henson) on: http://www.fil.ion.ucl.ac.uk/spm/data/ Overview Why hierarchical models? The model Estimation Expectation-Maximization Summary Statistics approach Variance components Examples Example I Stimuli: Auditory Presentation (SOA = 4 secs) of (i) words and (ii) words spoken backwards e.g. “Book” and “Koob” Subjects: Scanning: (i) 12 control subjects (ii) 11 blind subjects fMRI, 250 scans per subject, block design U. Noppeney et al. Population differences 1st level: Controls Blinds 2nd level: V c [1 1] T X Example II Stimuli: Auditory Presentation (SOA = 4 secs) of words Motion Sound Visual Action “jump” “click” “pink” “turn” Subjects: (i) 12 control subjects Scanning: fMRI, 250 scans per subject, block design Question: What regions are affected by the semantic content of the words? U. Noppeney et al. Repeated measures Anova 1st level: Motion Sound ? Visual ? ? = Action = = X 2nd level: 1 T c 0 0 V X 1 0 1 1 0 1 0 0 1 Example 3: 2-level ERP model Single subject Multiple subjects Trial type 1 Subject j ... ... Trial type n ... ... Trial type i Subject 1 Subject m Test for difference in N170 Example: difference between trial types ERP-specific contrasts: Average between 150 and 190 ms 2nd level contrast ... y = X 1 Identity matrix 1 -1 1 2 c T 1 = + X 2 second level first level 2 Some practical points RFX: If not using multi-dimensional contrasts at 2nd level (Ftests), use a series of 1-sample t-tests at the 2nd level. Use mixed-effects model only, if seriously in doubt about validity of summary statistics approach. If using variance components at 2nd level: Always check your variance components (explore design). Conclusion Linear hierarchical models are general enough for typical multi-subject imaging data (PET, fMRI, EEG/MEG). Summary statistics are robust approximation to mixed-effects analysis. To minimize number of variance components to be estimated at 2nd level, compute relevant contrasts at 1st level and use simple test at 2nd level.
© Copyright 2024 Paperzz