DYNAMIC ADJUSTMENT OF STIMULI IN REAL-TIME
FUNCTIONAL MAGNETIC RESONANCE IMAGING
by
I JUNG FENG
Submitted in partial fulfillment of the requirements
For the degree of Doctor of Philosophy
Dissertation Adviser: Dr. Curtis Tatsuoka
Department of Epidemiology and Biostatistics
Division of Biostatistics
CASE WESTERN RESERVE UNIVERSITY
August, 2013
CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES
We hereby approve the thesis/dissertation of
I JUNG FENG
candidate for the Ph.D. degree.
(signed) Ralph O'Brien
(chair of the committee)
Curtis Tatsuoka
Mark Schluchter
Kenneth A. Loparo
(date) June 17, 2013
We also certify that written approval has been obtained for any
proprietary material contained therein.
To my dear and loving husband, Keng-Chu Lin
Table of Contents
Table of Contents
List of Figures
List of Tables
Acknowledgments
Abstract
Chapter 1 Introduction
Chapter 2 Functional Magnetic Resonance Imaging
2.1 Introduction of Functional Magnetic Resonance Imaging
2.1.1 Introduction of fMRI
2.1.2 Experimental Design
2.1.3 Noise
2.1.4 Pre-processing
Spatial Smoothing
2.1.5 Statistical Modeling
BOLD Signal Modeling
Voxel-wise General Linear Model
2.1.6 Multiple Comparisons Issue in Voxel-wise GLM
2.1.7 Summary
2.2 Real Time fMRI
2.2.1 Introduction of Real Time fMRI
2.2.2 Architecture System of rt-fMRI
2.2.3 Real Time Pre-processing Steps
2.2.4 Statistical Analysis of rt-fMRI
Chapter 3 Sequential Analysis Methods
3.1 Wald’s Sequential Probability Ratio Test
One-sided Voxel-wise Wald’s SPRT
Two-sided Voxel-wise Wald’s SPRT
Truncated SPRT
Modified SPRT with Nuisance Parameters: Bartlett and Cox
3.2 Sequential Estimation Method
3.3 Multiple Comparisons Issue in Sequential Method
Sequential Bonferroni Method
Chapter 4 Dynamic Localization and Stopping Using an SPRT Approach
4.1 Overview of Statistical Issues
4.2 Pre-processing Step
4.2.1 Normalized Drift Correction
4.3 Voxel-wise SPRT
4.3.1 Voxel-wise General Linear Model in fMRI
General Linear Model
One-sided Voxel-wise SPRT
Two-sided Voxel-wise SPRT
Voxel-wise SPRT with Temporal Correlation
Multiple Comparison Correction
Global Stopping Rule
4.3.2 Summary of Voxel-wise SPRT Procedures
4.4 Simulation Studies
4.4.1 Simulation Model
4.4.2 Results
Efficiency of One-sided Voxel-wise SPRT on Activation Detection
Efficiency of Two-sided Voxel-wise SPRT on Activation Detection
Adjustment of Experimental Design of One-sided Voxel-wise SPRT on Activation Detection
Efficiency of One-sided Voxel-wise SPRT on Differential Activation Detection
4.5 Real fMRI Studies
4.5.1 Analysis Model
4.5.2 fMRI Data Analysis
4.5.3 Real fMRI Studies Results
4.6 Discussion
Limitations
Conclusion
Chapter 5 Cognitive Reserve
5.1 Introduction of Cognitive Reserve
5.2 CR in Normal Aging and Alzheimer’s Disease
5.2.1 Normal Aging
5.2.2 Alzheimer’s Disease
5.2.3 CR and AD
Chapter 6 Detection of CR in AD using rt-fMRI
6.1 Voxel-wise Sequential Estimation on Detecting Efficiency and Capacity of Neural Reserve
6.1.1 Voxel-wise Sequential Estimation Approach
6.1.2 Simulation Studies
6.1.2.1 Simulation Data Analysis Including One Task
6.1.2.1.1 Simulation Model
6.1.2.1.2 Simulation Results
6.1.2.2 Simulation Data Analysis Including Three Tasks
6.1.2.2.1 Simulation Model
6.1.2.2.2 Simulation Results
6.1.2.3 Investigation of Two Sequential Estimation Approaches
6.1.2.3.1 Methods
6.1.2.3.2 Results
6.1.3 Discussion and Conclusion
6.2 Halving Algorithm and Voxel-wise SPRT on Detecting Neural Compensation
6.2.1 Simulation Study
6.2.1.1 Method
6.2.1.2 Simulation Results
6.3 Discussion and Conclusion
Chapter 7 Conclusion
Chapter 8 Bibliography
List of Figures
Figure 2.1 The shape and timing of the HDR
Figure 2.2 One-dimensional Gaussian kernel weighting structure over 3, 6, and 8 FWHM
Figure 4.1 Amounts of smoothing weighting applied on time point 100 by different values of τ
Figure 4.2 Activation pattern of simulated fMRI image
Figure 4.3 The activation strength structure of dataset (SNR=0.1)
Figure 4.4 Voxels classified as active based on one-sided voxel-wise SPRT (SNR=0.1)
Figure 4.5 Voxels classified as active based on two-sided voxel-wise SPRT (SNR=0.1)
Figure 4.6 Voxels classified as active based on one-sided voxel-wise SPRT (SNR=0.3)
Figure 4.7 Differential active strength structure of dataset with 0.1 SNR
Figure 4.8 Voxels classified as differential active based on one-sided voxel-wise SPRT (SNR=0.1)
Figure 4.9 Adult face activity detection results
Figure 4.10 Differential activity detection classification results
Figure 5.1 Models of task related neural activity versus task
Figure 5.2 Hypothesized relationship between task demand and activation in old and young
Figure 6.1 Activation pattern of simulated fMRI image
Figure 6.2 Task activation estimates of six simulated images according to the voxel-wise sequential estimation approach, d=0.5
Figure 6.3 Accuracy plots of six simulated images
Figure 6.4 Activation pattern and strength structure of simulated fMRI image
Figure 6.5 Halving algorithm including five difficulty level tasks
Figure 6.6 The hypothesized activation curves for subjects starting to respond to the given task at different loadings
List of Tables
Table 4.1 Detection accuracies among the 4 simulated activation areas; one-sided SPRT applied on dataset (0.1 SNR)
Table 4.2 Detection accuracies among the 4 simulated activation areas; two-sided SPRT applied on dataset (0.1 SNR)
Table 4.3 Detection accuracies among the 4 simulated activation areas; one-sided SPRT applied on dataset (0.3 SNR)
Table 4.4 Detection accuracies for the simulated differential activation areas; one-sided SPRT applied on dataset (0.1 SNR)
Table 5.1 Cognitive reserve findings in people at early stage of AD
Table 6.1 Simulated activation magnitudes and corresponding variance of noise
Table 6.2 Required scan units of six different simulated images
Table 6.3 Estimated means of activation magnitudes of active region under six simulated images, two d values and two analysis approaches
Table 6.4 Accuracy percentages (%) of six simulated images, two d values and two analysis approaches
Table 6.5 Required scan units of two simulated images
Table 6.6 Estimated means of activation magnitudes of eight active regions under two simulated images, two d values and two analysis approaches
Table 6.7 (Differential) activation estimations accuracy percentages (%) of two simulated images
Table 6.8 The values of comparable component 1 in CI generation involving the two sequential estimation approaches, as a function of number of task stimuli
Table 6.9 The values of comparable component 2 in CI generation involving the two sequential estimation approaches, as a function of number of task stimuli
Table 6.10 The values of comparable products of component 1 and component 2 in CI generation involving the two sequential estimation approaches, as a function of number of task stimuli
Table 6.11 Global stopping time points of six simulated images analyzed by voxel-wise sequential estimation approach, d=0.5 employed
Table 6.12 Hypothesized activation magnitudes over task loadings
Table 6.13 Results from traditional GLM analysis
Table 6.14 Results from proposed halving algorithms and voxel-wise SPRT
Acknowledgments
First of all, I would like to express my deepest appreciation to my research advisor,
Professor Curtis Tatsuoka, for his continuous support of my Ph.D. research and for his
patience, motivation, and immense knowledge. He has kept alive in me a spirit of adventure
in regard to research. This dissertation would not have been possible without his guidance
and persistence.
Furthermore, I would like to thank all the professors on my thesis committee: my
academic advisor, Prof. Ralph O'Brien, who passed on his professional wisdom to me
and always provided help when I needed it, and Prof. Mark Schluchter and Prof. Kenneth
A. Loparo, for their encouragement, insightful comments, and probing questions.
I would like to thank Sarah Carr, who was always willing to lend a helping hand and
to give the best suggestions.
I would like to acknowledge the important support for this research from the Spitz
Brain Health Innovation Fund.
I would like to thank my family for their support and for encouraging me with their best
wishes. Deepest thanks to Chia-Wei Soong, who is perfect family to me even without blood relations.
Finally, I would like to thank my beloved husband, Keng-chu Lin, who was always
there to cheer me up and who stood by me through the good times and the bad.
Dynamic Adjustment of Stimuli in Real-Time Functional Magnetic
Resonance Imaging
Abstract
by
I JUNG FENG
Conventional fMRI image analysis is performed by carrying out a massive number of
parallel regression analyses. The fMRI signal is known for its low signal-to-noise ratio and
for its complexity, as reflected in its spatial and temporal autocorrelation. In order to ensure
accurate localization of brain activity, stimulus administration in an fMRI session is often
lengthy and repetitive. In real time fMRI, signal processing is carried out while the signal
is being observed. This method allows for the dynamic adjustment of stimuli through
sequential experimental designs. We have developed a voxel-wise sequential probability
ratio test (voxel-wise SPRT) approach for dynamically localizing activation associated
with stimuli, as well as decision rules for the stopping of experimentation. Stopping is
dynamically determined when sufficient statistical evidence is collected to assess the
activation status of voxels across regions of interest. Simulation studies show that the
number of scan units can be reduced substantially compared to standard fMRI
experimental designs that are fixed and predetermined, while still achieving comparably
high levels of classification accuracy. An analysis based on actual brain imaging
confirms the promise of this approach.
An interesting application of dynamic adjustment of fMRI stimuli is in the area of
Alzheimer’s disease (AD). It is clear that there is a fair amount of heterogeneity in the
cognitive course of the disease. This has led to the development of theories related to the
notion of cognitive reserve, which posits that neural capacity, efficiency, and plasticity
play a role in this heterogeneity. It has been further hypothesized that cognitive reserve
levels at the pre-symptomatic stage of AD will manifest as specific neural activation patterns
under carefully designed fMRI experimentation that systematically varies the difficulty
level of a targeted task. A sequential testing approach is proposed for efficiently and
accurately identifying and classifying such patterns. Two approaches for characterizing
cognitive reserve are studied here. The first is sequential
estimation through monitoring confidence interval lengths over a range of experimental
conditions to assess efficiency and capacity. The other is sequential selection of difficulty
levels, to detect neural compensation, which is a reflection of plasticity. Both approaches
show high efficiencies and high detection accuracies in our fMRI simulation studies.
These two approaches open up new possibilities for studying and characterizing cognitive
reserve, which will in turn lead to a better understanding of processes in AD.
Chapter 1
Introduction
Functional magnetic resonance imaging (fMRI) provides neural images with high
spatial resolution by non-invasively detecting task-related blood oxygen level dependent
(BOLD) signal changes associated with neural activity in the brain. Compared with other
neuroimaging technologies such as positron emission tomography (PET), fMRI provides
neural images with the highest spatial resolution, allowing for precise localization of brain
functioning. fMRI can enhance many clinical applications, including disease diagnosis
and treatment and the measurement of change in neurological functioning. Used
pre-surgically, it can also help prevent damage to cognitive and motor functions 1. However,
fMRI signals present analytical challenges because they are abundant, noisy, and highly
correlated both spatially and temporally 1. The task-related signal that investigators want to
detect is only a 0.5 to 2% change in the measured BOLD signal 1. Ensuring accurate
spatial localization usually requires a pre-determined, redundant, and lengthy fMRI
session 2, 3. This not only leads to high costs for fMRI implementation, but also exposes
the signal data to fatigue and learning effects.
Progress in fMRI acquisition and computational processing makes it feasible to
observe brain activity during experimentation 4-10. This is known as real-time fMRI
(rt-fMRI). Real-time processing of BOLD signals creates an important opportunity: the
ability to adapt experimental stimuli in real time according to individual response and
variability. Through real-time signal processing, experimentation can be terminated early,
when it becomes clear that voxels within a region of interest (ROI) have been activated.
As a result, a precise detection of neural activity can be obtained in fewer fMRI scan time
units than required in conventional fMRI experimentation. Two innovative statistical
methods for dynamic adjustment of stimuli via sequential analysis of real-time fMRI
signals are proposed. One is a hypothesis-testing approach, and the other is a sequential
estimation approach. Which method to use depends on the research goal.
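The early-stopping idea behind the hypothesis-testing approach can be illustrated with a minimal sketch of Wald's sequential probability ratio test for a single stream of observations. This is not the voxel-wise SPRT developed in Chapter 4: it assumes independent Gaussian observations with known standard deviation, two simple hypotheses about the mean, and Wald's approximate stopping boundaries. The function name and the default error rates are illustrative assumptions.

```python
import math

def sprt_normal_mean(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Returns (decision, n_used), where decision is "H0", "H1", or
    "undecided" if the samples run out before a boundary is crossed.
    """
    # Wald's approximate boundaries on the log-likelihood-ratio scale.
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood-ratio increment of one Gaussian observation.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", n

# Sampling stops as soon as the accumulated evidence is sufficient,
# rather than after a number of observations fixed in advance.
decision, n_used = sprt_normal_mean([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
```

In this toy call the test stops well before the 20 available observations are used, which is the behavior that motivates terminating stimulus administration early.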
Alzheimer's disease (AD) affected more than 25 million people worldwide in 2006,
and 5.2 million Americans suffer from this disease in 2013 11, 12. Currently, there is no
cure for AD; however, research is ongoing on treatments to slow or stop the progression
of AD 12, 13. These treatments will be most effective when implemented during the
early stages of AD. People at the pre-clinical stage or with Mild Cognitive Impairment
(MCI) have different brain activation patterns compared with healthy people 14, 15, a
phenomenon known as cognitive reserve (CR) 16. CR can be characterized by the change
in activation magnitude over task difficulties 16. Measuring this in an fMRI paradigm
that includes multiple task loadings requires a long fMRI session, which makes developing
an AD-onset prediction biomarker for clinical administration difficult. A voxel-wise
sequential estimation approach is proposed to shorten the fMRI sessions needed to estimate
neural activity strength. This approach allows dynamic adjustment according to the
collected signal.
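As a rough illustration of sequential estimation, the sketch below keeps sampling until a confidence interval for a mean is narrower than a target half-width d. It is a generic fixed-width stopping rule based on a normal approximation, not Srivastava's procedure or the voxel-wise method of Chapter 6; the function name, the critical value z, and the minimum sample size n_min are assumptions made for the example.

```python
import math
import statistics

def sequential_mean_estimate(stream, d, z=1.96, n_min=5):
    """Sample until the approximate CI half-width for the mean is <= d.

    Returns (mean, half_width, n_used). If the stream ends first, the
    last estimate is returned with whatever half-width was reached.
    """
    values = []
    half_width = math.inf
    for x in stream:
        values.append(x)
        n = len(values)
        if n < n_min:
            continue  # too few observations for a stable variance estimate
        half_width = z * statistics.stdev(values) / math.sqrt(n)
        if half_width <= d:
            break  # the estimate is precise enough, so stop sampling
    mean = statistics.fmean(values) if values else math.nan
    return mean, half_width, len(values)

# Toy data alternating between 0 and 2 (true mean 1).
toy_stream = [0.0, 2.0] * 50
mean, half_width, n_used = sequential_mean_estimate(toy_stream, d=0.5)
```

The number of observations used is driven by the variability of the data and the requested precision d, so noisier signals take longer and cleaner signals stop earlier.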
This thesis presents the application of powerful sequential statistical approaches to
dynamically adjusting fMRI sessions based on real-time responses. The hypothesis-testing
approach shortens the time needed to detect the activation region, while the
estimation approach decreases the time needed to estimate activation magnitudes. In sum,
the goal of our research is to develop methods that can utilize the characteristics of
observed BOLD signals to modify fMRI experimental design in real time. The practical
consequence of this work will be shorter scanning times, minimized fatigue and learning
effects, enhanced replicability of localization findings, and less expensive yet more
comprehensive fMRI testing, all of which can broaden the applicability of fMRI usage.
This dissertation is organized as follows:
Chapter 2 presents basic notions about the nature of fMRI and rt-fMRI signal analysis
problems. It also introduces conventional fMRI analysis procedures, including
pre-processing steps and general linear model (GLM) analysis.
Chapter 3 presents sequential hypothesis testing, namely Wald’s sequential probability
ratio test (Wald’s SPRT), and Srivastava’s sequential estimation, together with their
application to rt-fMRI signal analysis.
Chapter 4 proposes a voxel-wise SPRT, obtained by combining Wald’s SPRT and GLM
approaches, that can be used to identify stimulus-related activity in ROIs. Stimulus
administration can then be dynamically stopped when sufficient statistical evidence has
been collected to determine activation status across ROIs. Chapter 4 also presents a
method for dynamically determining localization for each individual, along with decision
rules for stopping administration. Simulation analysis and real-time fMRI signal analysis
help demonstrate the effectiveness of the proposed voxel-wise SPRT.
Chapter 5 introduces the concept of CR and its relationship with aging and AD.
Clinical findings of differences in neural activation between healthy people and people in
the early stage of AD provide evidence for developing a biomarker predictive of cognitive
decline in AD.
Chapter 6 discusses how to locate well-defined brain regions of CR function for
early-stage AD detection using stimuli with multiple task loadings. Sequential estimation
methods in general, and Srivastava’s sequential estimation method in particular, are used
for estimating neural activity, with the goal of understanding the characteristics of
pre-clinical AD-related CR activation patterns. Moreover, a voxel-wise SPRT used in
conjunction with halving-algorithm selection rules, which determine the task loading of
stimulus administration in real time, is developed to provide further information on CR,
in terms of efficiently identifying at what difficulty levels neural compensation is induced.
Chapter 7 summarizes the major findings of this dissertation.
Chapter 2
Functional Magnetic Resonance Imaging
2.1 Introduction of Functional Magnetic Resonance Imaging
Over past centuries, physicians and philosophers have tried to localize distinct brain
regions corresponding to particular human brain functions, such as mental processes,
perceptual functions, and motor functions. With improvements in neuroimaging techniques
over the past decades, faster and more accurate approaches have been developed for
acquisition mechanisms and computational algorithms 1, 17, 18.
Functional magnetic resonance imaging (fMRI) safely and non-invasively provides
images of neural activity. Compared with other neuroimaging techniques, such as PET, fMRI
can provide images with higher spatial resolution and relatively good temporal resolution.
Because of these desirable capabilities, the number of published papers and
mentions of fMRI in the PubMed database has grown rapidly since 1992 19. The
nature of the fMRI signal and the corresponding analysis approaches are introduced in the
following sections.
2.1.1 Introduction of fMRI
fMRI indirectly measures the electrical activity of neurons through correlated
physiological changes. Active neural cells require extra energy to communicate with
other related neurons. Increased blood flow brings more glucose and oxygen, the latter
carried by hemoglobin molecules, to the active region(s). Glucose and oxygen are converted
to energy to meet the increased metabolic requirements. PET detects a glucose analogue
tagged with a detectable marker to produce brain function images. fMRI, on the other hand,
takes advantage of the different magnetic properties of oxygenated hemoglobin (oxy-Hb)
and deoxygenated hemoglobin (deoxy-Hb). Deoxy-Hb suppresses the magnetic resonance
(MR) signal, so when an increased concentration of oxy-Hb leads to a relatively smaller
concentration of deoxy-Hb, a relatively higher MR signal intensity is captured during fMRI
acquisition. The signal measured by fMRI depends on the change in oxygen
concentration and is called the blood-oxygen-level-dependent (BOLD) signal contrast.
Although fMRI does not directly measure neural activity, a growing body of evidence
suggests that there is a correlation between the task-induced BOLD signal and neuronal
electrical activity 20.
The change in MR signal following the onset of a neural stimulus is called the hemodynamic
response (HDR). Unlike neural activity, which lasts only milliseconds, the HDR takes
around 5 seconds to reach its peak and then returns to baseline after a
long below-baseline period, as shown in Figure 2.1. This specific HDR curve is
used to generate the expected BOLD signal in fMRI signal modeling, and the function used to
model the HDR is called the hemodynamic response function (HRF).
Figure 2.1 The shape and timing of the HDR. The HDR is composed of a series of phases: an
initial dip, a rising period, a signal peak, a decreasing period, and then an undershoot period 21.
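One common way to write down an HRF is as the difference of two gamma densities. The sketch below uses that form with illustrative shape parameters (6 for the main response, 16 for the undershoot, and an undershoot weight of 1/6); these values are assumptions chosen for the example rather than the parameters used later in this dissertation. The curve reproduces the peak near 5 seconds and the below-baseline undershoot, but it does not model the initial dip.

```python
import math
import numpy as np

def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, under_ratio=1.0 / 6.0):
    """Difference of two unit-scale gamma densities evaluated at times t (seconds)."""
    t = np.asarray(t, dtype=float)
    # Main positive response; a gamma density with shape a peaks at t = a - 1.
    peak = t ** (peak_shape - 1) * np.exp(-t) / math.gamma(peak_shape)
    # Later, smaller lobe that is subtracted to create the undershoot.
    under = t ** (under_shape - 1) * np.exp(-t) / math.gamma(under_shape)
    return peak - under_ratio * under

t = np.arange(0.0, 30.0, 0.1)          # time axis in seconds
hrf = double_gamma_hrf(t)
peak_time = t[np.argmax(hrf)]          # time of the maximum response
undershoot_time = t[np.argmin(hrf)]    # time of the deepest undershoot
```

Convolving a stimulus time course with such a function gives the expected BOLD signal that enters the statistical model.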
One whole-brain 3-dimensional (3-D) image is constructed from a number of
rectangular prisms, called voxels. For each voxel, with a size of around 3-5 mm³, one
BOLD signal time series is recorded 1. The number of voxels in one whole-brain image
varies by fMRI machine. For example, with a voxel size of 3 mm by 3 mm by 3 mm, one
whole-brain image may contain 64 by 48 by 64 voxels. With more than
one hundred thousand voxels, fMRI is able to provide the highest spatial
resolution for localizing brain function.
The temporal resolution of fMRI is determined by the repetition time (TR), defined as the
time between successive excitation pulses. The value of TR is usually between
0.5 and 3 seconds 18. A shorter TR gives better temporal resolution. However,
TR is limited by the interference that can occur when two successive excitation
pulses are too close to each other. Because of the relatively poor temporal resolution of
fMRI images, high-frequency noise may be captured as low-frequency noise (less than
1/128 Hertz).
Because the recorded signal is carried by blood flow, fMRI data are
correlated both spatially and temporally.
2.1.2 Experimental Design
fMRI has two main experimental designs: block design and event-related design.
A block design is composed of blocks: tasks under experimental and control conditions are
clustered separately into blocks of fixed duration. This is the simplest and most basic
fMRI experimental design, and the easiest to implement. However, selecting an appropriate
block length is very important. When the rest period is shorter than 10 seconds, the
HDR will not return to baseline 18. Therefore, too short a control block
will result in no difference between task-active and task-inactive signals. Conversely, too
long a task period makes the task-related signal difficult to distinguish from low-frequency
noise. Event-related designs randomize both the order and the
timing of each task, and are therefore more flexible. It is difficult for
subjects to predict which task will appear and when it will start, so event-related designs
are expected to show less of the learning effect caused by regularly repeated stimuli in an
fMRI session.
Moreover, one important rule for designing an fMRI paradigm is to include enough
stimulus repetitions to maintain high detection power.
Traditionally, it is suggested that this number be chosen based on the minimum signal-to-noise
ratio (SNR), to ensure that the collected signal is sufficient for analysis 2, 3.
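The two designs can be sketched as stimulus indicator vectors with one entry per scan. The helper names, the choice to start with a rest block, and the use of purely random onsets with no minimum spacing are simplifying assumptions for illustration; the last helper only encodes the 10-second rest guideline quoted above.

```python
import numpy as np

def block_design(n_scans, block_len):
    """0/1 boxcar: alternating rest (0) and task (1) blocks, starting with rest."""
    scans = np.arange(n_scans)
    return ((scans // block_len) % 2).astype(int)

def event_related_design(n_scans, n_events, seed=0):
    """0/1 vector with n_events task onsets placed at random scan indices."""
    rng = np.random.default_rng(seed)
    onsets = rng.choice(n_scans, size=n_events, replace=False)
    design = np.zeros(n_scans, dtype=int)
    design[onsets] = 1
    return design

def rest_block_long_enough(block_len, tr, min_rest_s=10.0):
    """Check that a rest block of block_len scans lasts at least min_rest_s seconds."""
    return block_len * tr >= min_rest_s

blocks = block_design(n_scans=8, block_len=2)          # rest, rest, task, task, ...
events = event_related_design(n_scans=100, n_events=12)
ok = rest_block_long_enough(block_len=5, tr=2.0)       # 5 scans x 2 s = 10 s of rest
```

In practice an event-related design would usually also constrain the spacing between onsets, which this sketch leaves out.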
2.1.3 Noise
fMRI noise comes from two major sources: the scanner system and the subject.
Unstable hardware and thermal motion of electrons within the scanner result, respectively,
in non-stationary low-frequency drift and Gaussian-distributed noise over time.
However, the major source of noise is the subject him- or herself. Physical movements,
such as head motion and swallowing, occur during the fMRI session and cause
non-task-related signals. Sudden movement leads to unexpected spikes in the BOLD signal.
Although heartbeat (normal rate around 1.17 Hz) and respiration (normal rate
around 0.2 Hz) 22 are high-frequency, poor temporal resolution may lead to these
physiological noises being recorded as low-frequency noise. Aside from these observable
and measurable movements, the main source of noise is spontaneous neural activity and
metabolic reactions, which create non-stationary noise.
All of these kinds of noise are present in the fMRI system. The task-related signal
that investigators want to detect is only a 0.5 to 5% change in the measured BOLD signal 1.
Nevertheless, because of its fundamentally uncertain character, noise cannot be
completely eliminated from the BOLD signal.
2.1.4 Pre-processing
To increase the accuracy of statistical analysis, BOLD signals are passed through a series of computational procedures called pre-processing. The goal of pre-processing is to remove non-task-related signals as far as possible and to prepare the data for the subsequent statistical analysis.
Head motion correction, spatial smoothing, and temporal filtering are used to eliminate signal of no interest. Co-registration of the functional and structural data and distortion correction were developed to improve spatial localization. Finally, spatial smoothing and time correction prepare the data for the statistical analysis that follows. Additionally, spatial normalization is applied to control for the variability in brain structure between subjects. Among all pre-processing procedures, one important and commonly used step, spatial smoothing, is described in detail below.
Spatial Smoothing
The purpose of spatial smoothing is to remove high spatial frequencies by replacing each original signal intensity with a spatially weighted average. Take two-dimensional (2-D) spatial smoothing as an example. Let $I(x_i, y_j)$ denote the BOLD response at spatial coordinates $(x_i, y_j)$, and replace this intensity by
$$\hat{I}_S(x_i, y_j) = \frac{\sum_q \sum_r s_x(x_q - x_i)\, s_y(y_r - y_j)\, I(x_q, y_r)}{\sum_q \sum_r s_x(x_q - x_i)\, s_y(y_r - y_j)}$$
$\hat{I}_S(x_i, y_j)$ is the modified signal intensity. $s_x(x_q - x_i)$ and $s_y(y_r - y_j)$ are the smoothing kernels along the x direction and the y direction, respectively. $s_x(x_q - x_i)\, s_y(y_r - y_j)$ is the weight applied to voxel $(x_q, y_r)$. The denominator ensures that the weights applied in computing $\hat{I}_S(x_i, y_j)$ sum to 1. In fMRI data analysis, the most commonly used smoothing function is the Gaussian kernel. The to-be-smoothed voxel is located at the center of the Gaussian kernel,
$$s_x(x - x_i) = \exp\left(-\frac{(x - x_i)^2}{2\sigma_x^2}\right) \quad \text{and} \quad s_y(y - y_j) = \exp\left(-\frac{(y - y_j)^2}{2\sigma_y^2}\right)$$
The variances $\sigma_x^2$ and $\sigma_y^2$ control the degree of smoothing in each direction. A greater $\sigma^2$ leads to a heavier contribution from the voxel’s neighbors. Full width at half maximum (FWHM) is used to represent the strength of the weighting: $\mathrm{FWHM} = \sigma\sqrt{8 \ln 2}$. $(x - x_i)^2$ and $(y - y_j)^2$ are the squared distances between the to-be-smoothed voxel, $(x_i, y_j)$, and its neighbors, $(x, y)$. The longer the distance, the smaller the contributed weight.
FWHM and the distance between a voxel and its neighbors are the two parameters of the spatial smoothing process. Figure 2.2 displays the weighting over distance for three FWHM values: 3, 6 and 8. FWHMs of 6 and 8 are the values most commonly used in real fMRI analysis. However, the appropriate value of FWHM is related to the size of the ROIs. For a small region, too much smoothing may blur the desired ROI out and merge it with nearby regions declared active.
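The weighted average above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in this study: the function names `gaussian_weights` and `smooth_2d` are hypothetical, distances are assumed to be measured in voxel units, the same FWHM is assumed in both directions, and weights are renormalized at the image border (an assumption; the text does not specify edge handling).

```python
import math

def gaussian_weights(fwhm, radius):
    """Normalized 1-D Gaussian weights for offsets -radius..radius (voxel units)."""
    sigma = fwhm / math.sqrt(8.0 * math.log(2.0))  # FWHM = sigma * sqrt(8 ln 2)
    raw = [math.exp(-(k * k) / (2.0 * sigma ** 2)) for k in range(-radius, radius + 1)]
    total = sum(raw)  # denominator: makes the weights sum to 1
    return [w / total for w in raw]

def smooth_2d(image, fwhm, radius):
    """Replace each intensity by a Gaussian-weighted average of its neighbors."""
    w = gaussian_weights(fwhm, radius)
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            num = den = 0.0
            for dq in range(-radius, radius + 1):
                for dr in range(-radius, radius + 1):
                    q, r = i + dq, j + dr
                    if 0 <= q < rows and 0 <= r < cols:  # skip neighbors outside the image
                        weight = w[dq + radius] * w[dr + radius]  # s_x * s_y
                        num += weight * image[q][r]
                        den += weight
            out[i][j] = num / den  # renormalize by the weights actually used
    return out
```

With `radius=1` the average runs over a 3 by 3 neighborhood; a larger FWHM flattens the weights, so neighbors contribute more.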
[Figure: one-dimensional Gaussian kernel weighting. Weighting value (0 to 0.35) plotted against the distance between a voxel and its weighting neighbors (-10 to 10) for FWHM = 3, 6, and 8.]
Figure 2.2 One-dimensional Gaussian kernel weighting structure over 3, 6, and 8 FWHM.
2.1.5 Statistical Modeling
BOLD Signal Modeling
fMRI data consist of a 3-D matrix of voxel time series. The most common analysis is voxel-wise analysis, in which each voxel is assumed to represent the same brain region over time.
The relationship between neural activity, $s(t)$, and the BOLD response, $f_p(t)$, exhibits a linear time invariant (LTI) characteristic 1. This means that the transformation from neural activity (input signal) to BOLD response (output signal) satisfies both linearity and time invariance. Linearity means that neural activity k times larger leads to a BOLD signal amplitude k times larger. Linearity also implies additivity. For example, suppose two independent stimuli each produce a response of known amplitude. When the two stimuli occur together in time, the resulting signal equals the sum of the two known response amplitudes. Since observed BOLD responses generally agree with the LTI property, an expected BOLD signal can be generated from given neural activity through the convolution operation 19. The experimental stimulus function over time is denoted by $s(\cdot)$, and the function that describes the response to a stimulus impulse, the HRF, is denoted by $h(\cdot)$.
Then an expected BOLD signal, $f_p(t)$, can be generated by
$$f_p(t) = \int_0^t h(u)\, s(t - u)\, du$$
The HRF, $h(\cdot)$, is modeled by the double-gamma function with specific parameter values:
$$h(t) = \left(\frac{t}{d_1}\right)^{a_1} \exp\left(-\frac{t - d_1}{b_1}\right) - c \left(\frac{t}{d_2}\right)^{a_2} \exp\left(-\frac{t - d_2}{b_2}\right)$$
where $a_1 = 6$, $a_2 = 12$, $b_1 = b_2 = 0.9$, $c = 0.35$ and $d_i = a_i b_i$ $(i = 1, 2)$ 23.
This double-gamma HRF model is one of the most common canonical HRFs and is good at modeling the late undershoot of the BOLD signal 24.
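As an illustration of the convolution model, the following Python sketch evaluates the double-gamma HRF with the parameter values given above and approximates the convolution integral by a Riemann sum. The function names `hrf` and `expected_bold`, the sampling step `dt`, and the Riemann-sum discretization are assumptions made for this sketch; time is assumed to be in seconds.

```python
import math

# Parameter values as given in the text
A1, A2, B1, B2, C = 6.0, 12.0, 0.9, 0.9, 0.35
D1, D2 = A1 * B1, A2 * B2  # d_i = a_i * b_i

def hrf(t):
    """Double-gamma HRF h(t); taken to be zero before stimulus onset."""
    if t < 0:
        return 0.0
    peak = (t / D1) ** A1 * math.exp(-(t - D1) / B1)
    undershoot = C * (t / D2) ** A2 * math.exp(-(t - D2) / B2)
    return peak - undershoot

def expected_bold(stimulus, dt):
    """Riemann-sum approximation of f(t) = integral_0^t h(u) s(t - u) du.

    stimulus[k] is the stimulus function s sampled at time k * dt.
    """
    n = len(stimulus)
    kernel = [hrf(k * dt) for k in range(n)]
    return [sum(kernel[u] * stimulus[k - u] for u in range(k + 1)) * dt
            for k in range(n)]
```

For example, `expected_bold([1.0] * 10 + [0.0] * 20, dt=1.0)` gives the expected response to a 10-sample task block followed by rest. Doubling the stimulus doubles the output, which is the linearity property described above.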
Voxel-wise General Linear Model
Because of its flexibility and interpretability, the general linear model (GLM) has become the most popular of the current statistical analysis approaches. It assesses how well the observed BOLD response is associated with the BOLD response expected from the stimuli.
At the level of one given voxel, the time series is modeled as a linear combination of functions of time, such as the experimental design function and confounding effect functions. The intensity value of one voxel at time t, $y_t$, is expressed as follows:
$$y_t = b_1 f_1(t) + \cdots + b_p f_p(t) + \cdots + b_P f_P(t) + e_t$$
where $f_p(t)$ represents, at time t, either a regressor of interest (a function of the task-related expected BOLD signal) or a nuisance regressor (any other term researchers would like to put into the model, such as an intercept). $b_p$ is the regression coefficient corresponding to the pth regressor. $e_t$ is the error term with normal distribution $N(0, \sigma_t^2)$. For general modeling, the fMRI GLM of the given voxel is expressed in matrix form as follows:
$$Y_t = XB + E; \quad E \sim N(0, \Sigma)$$
$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \\ \vdots \\ y_t \end{bmatrix}_{t \times 1} = \begin{bmatrix} f_1 & \cdots & f_p & \cdots & f_P \end{bmatrix}_{t \times P} \begin{bmatrix} b_1 \\ \vdots \\ b_p \\ \vdots \\ b_P \end{bmatrix}_{P \times 1} + \begin{bmatrix} e_1 \\ \vdots \\ e_n \\ \vdots \\ e_t \end{bmatrix}_{t \times 1}$$
where $\Sigma$ is a general, unstructured $t \times t$ variance-covariance matrix:
$$\Sigma = \begin{bmatrix} \sigma_{1,1} & \sigma_{1,2} & \cdots & \sigma_{1,t} \\ \sigma_{2,1} & \sigma_{2,2} & & \vdots \\ \vdots & & \ddots & \sigma_{(t-1),t} \\ \sigma_{t,1} & \cdots & \sigma_{t,(t-1)} & \sigma_{t,t} \end{bmatrix}_{t \times t}$$
Y is assumed to follow a multivariate normal probability distribution:
$$f(Y_t, B, \sigma^2 V) = \frac{1}{(2\pi)^{t/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (Y_t - XB)' \Sigma^{-1} (Y_t - XB)\right)$$
Assume that $\Sigma$ equals $\sigma_t^2 V_t$, where $\sigma_t^2$ is unknown and $V_t$ is known. Then the generalized least squares (GLS) estimates of the regression coefficients and the corresponding variances are the best linear unbiased estimates (BLUE) by the Gauss-Markov theorem. Under the hypotheses $H_0: b_p = 0$ versus $H_a: b_p \neq 0$, each regressor of interest has a corresponding statistic, a t statistic or an F statistic, and a corresponding p-value. By comparing the p-value with a given significance threshold, the task-active status of each voxel can be determined. By displaying the t statistics or F statistics of the voxels with significant results in a 3-D image, a statistical parametric map (SPM) is built 25. The SPM is the neural activation image that people see in fMRI results.
When the temporal autocorrelation structure is known, accurate inference is possible. The intrinsic temporal autocorrelation structure, $V_t$, enters the linear model for a given voxel as follows:
$$Y_t = XB + E$$
$$E \sim N(0, \sigma_t^2 V_t)$$
Based on the temporal autocorrelation structure in the linear model, the regression coefficient estimators and corresponding variances are computed by GLS with the following equations 26:
$$\hat{B} = (X' V_t^{-1} X)^{-1} X' V_t^{-1} Y_t$$
$$\hat{\sigma}_t^2 = Y_t' \left[ V_t^{-1} - V_t^{-1} X (X' V_t^{-1} X)^{-1} X' V_t^{-1} \right] Y_t / (t - K)$$
$$\mathrm{Var}(\hat{B}) = (X' V_t^{-1} X)^{-1} \hat{\sigma}_t^2$$
where t is the number of observations and K is the number of regressors in X.
Bullmore proposed the first-order autoregressive model (AR(1)) as the model of the intrinsic temporal autocorrelation structure 27. The AR(1) approach has been well investigated and models the characteristics of fMRI time series effectively 28-31.
The AR(1) model applied to the temporal variance-covariance structure is expressed as follows:
$$\sigma_{ij}^2 = \sigma^2 \rho^{|j - i|}; \quad 1 \le i, j \le t$$
$$\Sigma = \begin{bmatrix} \sigma_{1,1}^2 & \sigma_{1,2}^2 & \cdots & \sigma_{1,t}^2 \\ \sigma_{2,1}^2 & \sigma_{2,2}^2 & & \vdots \\ \vdots & & \ddots & \sigma_{(t-1),t}^2 \\ \sigma_{t,1}^2 & \cdots & \sigma_{t,(t-1)}^2 & \sigma_{t,t}^2 \end{bmatrix}_{t \times t} = \sigma^2 \begin{bmatrix} 1 & \rho & \cdots & \rho^{t-1} \\ \rho & 1 & & \vdots \\ \vdots & & \ddots & \rho \\ \rho^{t-1} & \cdots & \rho & 1 \end{bmatrix}_{t \times t}$$
where $0 \le \rho < 1$ represents the strength of correlation.
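To make the AR(1) structure and its use in GLS concrete, the sketch below builds the correlation matrix with entries $\rho^{|j-i|}$ and computes GLS coefficients for a known $\rho$. Rather than inverting the $t \times t$ matrix, it uses the Prais-Winsten whitening transform, which for AR(1) errors yields the same coefficients as the GLS formula; this substitution, the helper names, and the small Gaussian elimination solver are choices made for this illustration, not the implementation used in this study.

```python
import math

def ar1_matrix(rho, t):
    """Correlation matrix with entries rho**|i - j| (Sigma = sigma^2 times this)."""
    return [[rho ** abs(i - j) for j in range(t)] for i in range(t)]

def whiten(series, rho):
    """Prais-Winsten transform: AR(1)-correlated errors become uncorrelated."""
    first = math.sqrt(1.0 - rho ** 2) * series[0]
    return [first] + [series[k] - rho * series[k - 1] for k in range(1, len(series))]

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][c] * z[c] for c in range(i + 1, n))) / m[i][i]
    return z

def gls_ar1(columns, y, rho):
    """GLS coefficients for y = X B + E with AR(1) errors.

    columns holds the regressors f_1..f_P, each a list of length t.
    """
    xs = [whiten(col, rho) for col in columns]
    ys = whiten(y, rho)
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in xs] for ci in xs]
    xty = [sum(u * v for u, v in zip(ci, ys)) for ci in xs]
    return solve(xtx, xty)  # ordinary least squares on the whitened data
```

In practice $\rho$ is not known and must be estimated; the sketch takes it as given.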
2.1.6 Multiple Comparisons Issue in Voxel-wise GLM
For a single statistical test with significance threshold α, 0 < α < 1, a statistically significant result has probability α of being a false positive when the null hypothesis is true. In voxel-wise analysis, one fMRI brain image volume (64×48×64 = 196,608 voxels) involves 196,608 hypotheses. At the 0.01 significance threshold, about 1966 voxels are declared active by chance. To correct this problem, known as the “multiple comparisons issue,” methods for correcting the significance threshold were developed. First, the most straightforward approach to controlling the probability of one or more false positive conclusions, called the “family-wise error rate” (FWER), is the Bonferroni correction. The new significance threshold is $\alpha_B = \alpha / N$, where N is the number of tests, which is the number of voxels in voxel-wise fMRI analysis. However, for a 0.01 significance threshold and a correction over a single image slice of 64×48 = 3072 voxels, $\alpha_B$ is an extremely small value, 0.01/3072 = 0.000003255. The corrected threshold is thus too conservative.
Therefore, another approach, which controls the false discovery rate (FDR), has been proposed for fMRI data correction 32-34. This approach controls the proportion of tests incorrectly declared significant among all tests declared significant. The FDR process consists of the following steps:
Step 1: Rank the p-values of all voxels from smallest to largest:
$$p_{(1)} \le p_{(2)} \le \cdots \le p_{(N)}$$
where N is the total number of voxels and $p_{(n)}$ represents the nth smallest p-value.
Step 2: Find the largest n that satisfies
$$p_{(n)} \le \frac{nq}{N}$$
and declare the voxels with p-values $p_{(1)}, \ldots, p_{(n)}$ significant, where q is the proportion of false positives among all declared positive results. Values of q from 0.01 to 0.2 are suggested in fMRI data analysis 17. The FDR process guarantees that $E(\mathrm{FDR}) \le q$ by controlling falsely declared tests. The FDR correction approach is easy to implement, and its threshold adapts to the strength of the collected signal.
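The two corrections can be compared with a short Python sketch. It is illustrative only: the function names are hypothetical, and the step-up rule shown is the standard Benjamini-Hochberg procedure, which matches the two steps described above.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold alpha_B = alpha / N."""
    return alpha / n_tests

def fdr_reject(p_values, q):
    """Indices of tests declared significant by the step-up FDR rule.

    Finds the largest rank n with p_(n) <= n * q / N and rejects the
    hypotheses with the n smallest p-values.
    """
    n_tests = len(p_values)
    order = sorted(range(n_tests), key=lambda i: p_values[i])  # Step 1: rank
    cutoff = 0
    for rank, idx in enumerate(order, start=1):                # Step 2: threshold
        if p_values[idx] <= rank * q / n_tests:
            cutoff = rank
    return sorted(order[:cutoff])

# Expected number of false positives in one uncorrected volume at alpha = 0.01
expected_false_positives = 0.01 * 64 * 48 * 64
```

With ten p-values of which the two smallest are 0.001 and 0.008, and q = 0.05, the FDR rule declares both significant, whereas the Bonferroni threshold 0.05/10 = 0.005 retains only the first.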
2.1.7 Summary
fMRI data analysis consists of a series of computational operations. After fMRI data stimulated by an appropriate experimental design are collected, the reconstructed fMRI images go through a sequence of pre-processing steps that prepare the data for the subsequent statistical analysis. Finally, statistical inference yields the active status of each voxel, and a map of task-active brain regions is generated.
One important challenge is how to define an appropriate experimental design. Because of the large variance caused by the various noise sources, fMRI SNR is usually relatively low. Therefore, to keep activation detection accuracy high, a lengthy fMRI paradigm is usually implemented. Moreover, the experimental design for traditional statistical analysis approaches has to be defined before implementation. The length of the paradigm is usually chosen based on the lowest SNR, which may lead to redundant sampling for subjects with higher SNR.
2.2 Real Time fMRI
2.2.1 Introduction of Real Time fMRI
Cox proposed the idea of real time fMRI (rt-fMRI) in 1995 35. In recent years, the speed of fMRI acquisition, computational processing, and analysis approaches has improved 4-10, so that the idea of rt-fMRI can be put into practice. An rt-fMRI system allows investigators to observe brain activity as it occurs during an ongoing experiment. rt-fMRI extends the applications of fMRI to further and wider fields. Two important applications are brain-computer interfaces (BCI) and pre-surgical planning. In BCI systems, by looking at their own brain activity in real time, subjects can learn through training to self-regulate their own cognitive behavior, such as the rating of pain intensity caused by fire on a fingertip 36, 37, unpleasant feelings due to odious drawings 38 and contamination anxiety 39. For pre-surgical planning, surgeons need to know whether surgery to resect damaged brain tissue will have a destructive effect on patients with brain tumors, epilepsy, or impaired vascular brain regions. A detailed cortical map of the functions that affect a patient’s quality of life, such as language, motor and primary sensory functions, is necessary. rt-fMRI is able to provide this functional map 18. In addition, during rt-fMRI sessions, spikes, spontaneous noise and head motion can be monitored, so that the quality of the fMRI data can be controlled by removing them as they occur.
2.2.2 Architecture System of rt-fMRI
The rt-fMRI system is very similar to the fMRI system, as the two share two required components. One is the MRI machine for BOLD signal acquisition and the other is an image processing computer for pre-processing and statistical analysis. To build a BCI system, another component, a stimulus/feedback computer, is required for displaying real time neurofeedback to investigators and subjects. Computers and machines are connected by the TCP/IP protocol.
A fully real-time fMRI system is defined as one whose total processing time, from signal acquisition through processing and data analysis to the display of neurofeedback, is shorter than one TR, which is 0.5 to 3 seconds 40.
2.2.3 Real Time Pre-processing Steps
Unlike traditional fMRI data, rt-fMRI has time series data only from the beginning of the experiment to the current time point. Therefore, pre-processing steps applied to temporal data, such as drift correction, differ from the traditional algorithms. In addition, to run in real time, the applied pre-processing steps must significantly improve the accuracy of active signal detection within a very short time.
Common pre-processing steps have already been investigated for use in rt-fMRI, such as distortion correction, motion correction, temporal filtering, spatial smoothing, and spatial normalization 41. Further, an optical tracking device was developed for correcting head motion, a serious problem for fMRI data analysis, in real time 42, 43. In Magland’s study, the head motion correction method 44 did not significantly improve active status classification accuracy in real data analysis 5. Nevertheless, the drift correction method showed a significant increase in classification accuracy 5.
2.2.4 Statistical Analysis of rt-fMRI
rt-fMRI was first proposed by Cox in 1995 35. In the recursive algorithm for computing correlation coefficients, Cholesky decomposition was applied to the matrix inversion, which is the most time consuming step. Because Cholesky decomposition factors the matrix to be inverted into a lower triangular matrix and its conjugate transpose, the calculation becomes simpler and easier.
Following this idea, more flexible and general models for recursive and cumulative calculation have been presented. With the aim of minimizing computational cost and memory footprint, the sliding window GLM 45, multiple regression 46, and the incremental GLM 40 were proposed in the past ten years. The sliding window approach (SWA) analyzes data within a time window of fixed length to limit the influence of drift and spike noise. The incremental approach (IA), on the other hand, uses all the data from the beginning to the latest time point. Although SWA can avoid non-stationary noise, IA gives higher power of statistical inference because it includes the largest sample size. In both SWA and IA, the Gram-Schmidt process is applied to orthogonalize the regressor functions, which also speeds up the calculation 40.
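The difference between the incremental and sliding window approaches can be illustrated with running sums. The sketch below tracks a Pearson correlation between a reference function and a voxel signal, updated one sample at a time. It is a simplified stand-in chosen for brevity, not Cox's recursive algorithm and not the cited GLM methods; the class name `RunningCorrelation` and its interface are hypothetical.

```python
from collections import deque

class RunningCorrelation:
    """Pearson correlation from running sums.

    window=None keeps every sample (incremental approach); an integer
    window keeps only the most recent samples (sliding window approach).
    """

    def __init__(self, window=None):
        self.window = window
        self.buffer = deque()
        self.n = 0
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def _accumulate(self, x, y, sign):
        # sign = +1 adds a sample to the sums, sign = -1 removes it
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.syy += sign * y * y
        self.sxy += sign * x * y

    def update(self, x, y):
        """Add one (reference, signal) pair and return the current correlation."""
        self._accumulate(x, y, +1)
        if self.window is not None:
            self.buffer.append((x, y))
            if len(self.buffer) > self.window:
                old_x, old_y = self.buffer.popleft()
                self._accumulate(old_x, old_y, -1)
        return self.value()

    def value(self):
        cov = self.sxy - self.sx * self.sy / self.n
        var_x = self.sxx - self.sx ** 2 / self.n
        var_y = self.syy - self.sy ** 2 / self.n
        if var_x <= 0.0 or var_y <= 0.0:
            return 0.0  # correlation undefined for a constant series
        return cov / (var_x * var_y) ** 0.5
```

Each update costs a fixed number of operations regardless of how many samples have been collected, which is the property that makes such recursive schemes attractive in real time.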
Chapter 3
Sequential Analysis Methods
The distinguishing feature of sequential analytic procedures is that, in contrast to other statistical procedures, the sample size is not fixed in advance. In a sequential procedure, the data are evaluated as each new observation is obtained. The main difficulty in the design and analysis of fMRI signals is the large variance caused by the different sources of noise. To ensure sufficient statistical power, a relatively large variance is usually assumed when choosing the number of task repetitions, such as in a standard block design. However, this variance differs among subjects, implemented tasks and scanners. Implemented experimental designs are thus often either inadequate or redundant. As we propose, sequential procedures are able to adjust the stopping point for sampling according to the signals observed in real time.
Two sequential approaches are described here: the one-sided and two-sided Wald’s sequential probability ratio test (Wald’s SPRT) 47 and Srivastava’s sequential estimation approach 48. Wald’s SPRT approach involves sequential hypothesis tests, and the procedure is terminated when there is sufficient evidence for the null hypothesis to be rejected or accepted in comparison with an alternative hypothesis. Srivastava’s sequential estimation procedure directly estimates the activation magnitude associated with a task, and sampling terminates when the length of the confidence interval (CI) is smaller than a given value. An interesting feature of Srivastava’s approach is that it provides simultaneous coverage probabilities for multiple comparisons. These two approaches are described in detail here.
3.1 Wald’s Sequential Probability Ratio Test
One of the most widely implemented sequential tests in the literature is Wald’s Sequential Probability Ratio Test (Wald’s SPRT), which was initially developed during World War II. In Wald’s SPRT, sampling stops as soon as the accumulated data provide sufficient information to make a decision under pre-defined type I and type II errors. In most cases, sequential analysis reaches a decision faster than fixed sample size analysis under the same type I and type II errors. For simple hypothesis testing, Wald and Wolfowitz 49 demonstrated that Wald’s SPRT requires the minimal expected sample size among all fixed sample and sequential tests with the same error probabilities.
One-sided Voxel-wise Wald’s SPRT
The goal of Wald’s SPRT is to test simple hypotheses. The general form of the one-sided hypothesis is
H0 : b – θ0 = 0 versus Ha : b – θ0 ≥ δ     (3.1)
where b is the parameter of interest and δ is the difference from θ0 considered to be of practical importance.
Let α and β be the specified nominal type I and type II errors, respectively:
α = P(reject H0|H0 is true)
β = P(accept H0|Ha is true)
Let the random variable y have distribution f(y, b), where b is a vector of regression parameter values. Thus y ~ f(y, b| H0) when H0 is true and y ~ f(y, b| Ha) when Ha is true. Let $\{y_t\}_{t=1}^{T}$ denote independent and identically distributed (i.i.d.) random variables with probability function f(yt, b), where t is a positive integer representing the number of samples collected over time. The likelihood of t successive observations, $Y_t = \{y_n\}_{n=1}^{t}$, is given by f(y1, b| H0)×…×f(yt, b| H0) when H0 is true, and by f(y1, b| Ha)×…×f(yt, b| Ha) when Ha is true. The test statistic of Wald’s SPRT, $\Lambda_t$, compares these two likelihoods through their ratio:
$$\Lambda_t = \log\left(\frac{f(Y_t, b \mid H_a)}{f(Y_t, b \mid H_0)}\right) = \log\left(\prod_{n=1}^{t} \frac{f(y_n, b \mid H_a)}{f(y_n, b \mid H_0)}\right) = \sum_{n=1}^{t} \log\left(\frac{f(y_n, b \mid H_a)}{f(y_n, b \mid H_0)}\right)$$
Wald’s acceptance/rejection rule is defined as follows:
1. Continue sampling when B < Λt < A
2. Stop sampling and accept H0 when Λt ≤ B
3. Stop sampling and reject H0 when Λt ≥ A
The stopping boundaries (A, B) are defined as ( log((1-β)/α) , log(β/(1-α)) ).
Note that taking the logarithm turns the statistic into a cumulative sum of log likelihood ratios. When new data are collected, only the incremental log likelihood ratio, $\log\left(f(y_n, b \mid H_a) / f(y_n, b \mid H_0)\right)$, needs to be added to the running sum. This shortens the computing time.
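The rule and the incremental update can be sketched as follows for the simplest case: i.i.d. normal observations with known standard deviation, testing a mean of θ0 against θ0 + δ. The normal model with known variance, the function names, and the default error levels are assumptions made for this illustration; the voxel-wise statistic developed in this dissertation involves regression parameters and estimated variances.

```python
import math

def sprt_boundaries(alpha, beta):
    """Stopping boundaries (A, B) = (log((1-beta)/alpha), log(beta/(1-alpha)))."""
    return math.log((1.0 - beta) / alpha), math.log(beta / (1.0 - alpha))

def sprt_one_sided(samples, theta0, delta, sigma, alpha=0.05, beta=0.05):
    """One-sided SPRT for a normal mean with known sigma.

    Returns (decision, number of samples used).
    """
    upper, lower = sprt_boundaries(alpha, beta)
    stat = 0.0
    for t, y in enumerate(samples, start=1):
        # Incremental log likelihood ratio:
        # log f(y | theta0 + delta) - log f(y | theta0) for a normal density
        stat += (delta / sigma ** 2) * (y - theta0 - delta / 2.0)
        if stat >= upper:
            return "reject H0", t
        if stat <= lower:
            return "accept H0", t
    return "continue", len(samples)
```

Only one addition per new observation is needed, so the cost per time point does not grow with the length of the series.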
For non-i.i.d. data, Wald’s SPRT can be expressed through the log ratio of the joint probability densities of the observed samples (y1, y2, …, yt) under Ha and under H0. The more general form of the Wald’s SPRT statistic is as follows:
$$\Lambda_t = \log\left(\frac{f(Y_t, b \mid H_a)}{f(Y_t, b \mid H_0)}\right) = \log\left(\frac{f(y_1, \ldots, y_t, b \mid H_a)}{f(y_1, \ldots, y_t, b \mid H_0)}\right)$$
The idea of Wald’s SPRT is to rely on the likelihood ratio of the collected data (y1, y2, …, yt) to determine whether the data more strongly support the given null hypothesis or the given alternative hypothesis. Moreover, when the true parameter value is greater than θ0 + δ, and given, for instance, a monotone likelihood ratio property, a higher likelihood ratio in favor of Ha will be computed than if b – θ0 = δ. Conversely, if b – θ0 = 0, the likelihood ratio is largest for Ha: b – θ0 = δ. This implies that the alternative hypothesis can be extended from the simple hypothesis Ha: b – θ0 = δ to Ha: b – θ0 ≥ δ without affecting the error bounds.
Two-sided Voxel-wise Wald’s SPRT
In this section, the use of Wald’s SPRT for a two-sided hypothesis is considered, for identifying whether or not the true parameter is equal to a given value.
The two-sided hypothesis that we consider for Wald’s SPRT is as follows:
H0 : |b - θ0| = 0 versus Ha : |b - θ0| ≥ δ.
Again, a difference between b and θ0 of at least δ > 0 is considered practically important. The likelihood under the alternative hypothesis is a weighted average of f(b = θ0 - δ) and f(b = θ0 + δ), for which 1/2 has been proved to be the optimum weight 47, 50. The SPRT statistic for the two-sided hypothesis in voxel-level analysis is:
$$\Lambda_t = \log\left(\frac{\frac{1}{2}\left[f(Y_t, b \mid \theta_0 - \delta) + f(Y_t, b \mid \theta_0 + \delta)\right]}{f(Y_t, b \mid \theta_0)}\right)$$
The SPRT statistic, including the statistical estimates that comprise it, is computed after each new fMRI image is collected (in other words, in real time). The decision is made by the same rejection/acceptance rules as for one-sided hypothesis tests:
1. Continue sampling when B < Λt < A
2. Stop sampling and accept H0 when Λt < B
3. Stop sampling and reject H0 when A < Λt
where the stopping boundaries (A, B) = ( log((1-β)/α) , log(β/(1-α)) ).
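A minimal Python sketch of the two-sided statistic is given below, again under the simplifying assumption of i.i.d. normal observations with known standard deviation (an assumption of this illustration, not of the method). The function name is hypothetical. The equally weighted mixture is evaluated on the log scale to avoid overflow for long series.

```python
import math

def sprt_two_sided(samples, theta0, delta, sigma, alpha=0.05, beta=0.05):
    """Two-sided SPRT for a normal mean with known sigma.

    Returns (decision, number of samples used).
    """
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    s_plus = 0.0   # log likelihood ratio of theta0 + delta against theta0
    s_minus = 0.0  # log likelihood ratio of theta0 - delta against theta0
    for t, y in enumerate(samples, start=1):
        s_plus += (delta / sigma ** 2) * (y - theta0 - delta / 2.0)
        s_minus += (-delta / sigma ** 2) * (y - theta0 + delta / 2.0)
        # log of the 1/2-1/2 mixture of the two likelihood ratios
        m = max(s_plus, s_minus)
        stat = m + math.log(0.5 * (math.exp(s_plus - m) + math.exp(s_minus - m)))
        if stat > upper:
            return "reject H0", t
        if stat < lower:
            return "accept H0", t
    return "continue", len(samples)
```

Because of the factor 1/2 in the mixture, a shift of a given size takes slightly longer to detect than in the one-sided test with the same boundaries.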
Wald’s SPRT has several attractive properties. First, the boundaries of Wald’s SPRT depend only on the desired levels of the type I and type II errors. Therefore, the boundaries can be computed without knowledge of the distribution. Second, the conclusion reached by Wald’s SPRT is designed to satisfy not only the given type I error threshold (= α) but also the given statistical power (= 1-β). Most fixed sample design tests can constrain only one of the type I and type II errors. Third, Wald’s SPRT is an economical test: it usually requires a smaller sample size than fixed sample size tests to make a decision under the same type I and type II errors. However, Wald’s SPRT has two major disadvantages. The first is that no maximum sample size is specified. This implies that an unsatisfactorily large sample size may be required to reach a conclusion in the Wald’s SPRT procedure. For instance, when the true value of the parameter of interest is around half the value of δ, the expected sample size of the SPRT can turn out to be large, especially under a small error probability. Ironically, such a scenario occurs when the parameter value is the most indifferent to the selection of either hypothesis. The second is that Wald’s SPRT allows only one unknown parameter, the parameter of interest in the hypothesis. Wald’s SPRT does not apply directly when there are one or more nuisance parameters. A truncated SPRT and a modified SPRT, which address these two concerns respectively, are described below.
Truncated SPRT
For the first concern, a truncated SPRT has been proposed that places an upper limit on sampling and thereby resolves the issue of a possibly large sample size 47, 51. The acceptance/rejection rule is that of the original SPRT while the current sample number $t < T_0$, where $T_0$ is the truncation time point. If no decision has been made before $T_0$, then
1. accept H0 when $\Lambda_{T_0} \le 0$
2. reject H0 when $0 < \Lambda_{T_0}$
where (A, B) is defined as ( log((1-β)/α) , log(β/(1-α)) ).
The truncated SPRT is a practical remedy for Wald’s SPRT when the error probabilities are small and/or the value of the parameter of interest lies between the two hypotheses, conditions that lead to long sampling sequences 51. However, truncating the sequential process increases the probabilities of type I and type II errors 50.
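The truncation rule can be sketched as a thin wrapper around any sequence of SPRT statistics. The sketch assumes that the statistics $\Lambda_1, \Lambda_2, \ldots$ have already been computed, that Wald's rule applies strictly before $T_0$, and that the sign rule applies at $T_0$; the function name and interface are hypothetical.

```python
import math

def truncated_sprt(statistics, t0, alpha=0.05, beta=0.05):
    """Apply Wald's rule for t < t0 and the sign rule at the truncation point t0.

    statistics[t - 1] holds Lambda_t. Returns (decision, stopping time).
    """
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    for t, stat in enumerate(statistics[:t0], start=1):
        if t < t0:
            if stat >= upper:
                return "reject H0", t
            if stat <= lower:
                return "accept H0", t
        else:
            # Truncation point reached without a decision: decide by the sign
            return ("reject H0" if stat > 0.0 else "accept H0"), t
    return "continue", min(len(statistics), t0)
```

A decision is therefore guaranteed by time `t0` at the latest, at the price of the increased error probabilities noted above.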
Modified SPRT with Nuisance Parameters: Bartlett and Cox
For the nuisance parameter issue, several extensions of Wald’s SPRT have been proposed. Bartlett used conditional maximum likelihood estimates (MLEs) of the unknown parameters, computed separately under H0 and Ha 52. Cox directly used MLEs as estimators of the nuisance parameters 53. Both extensions employ exactly the same stopping boundaries as Wald’s SPRT. Although their expected sample sizes are larger than that of Wald’s SPRT without nuisance parameters, the expected sample size of the modified SPRT is still smaller than that of the fixed sample design when the true parameter value is not located midway between the hypothesized null and alternative values 54.
3.2 Sequential Estimation Method
In some applications, estimation is more appropriate for answering the question at hand than hypothesis testing, which may be artificial. There are two approaches to sequential estimation: one builds the minimization of a formulated loss function (which includes a cost of observation) into the sequential estimation process, and the other establishes a sequential stopping rule that stops when an appropriate CI length is attained. In this study, we focus on sequential estimation of regression parameters by the latter approach. The following method was proposed by Srivastava in 1967 48.
For $Y_t$ comprising a sequence of observations $y_n$ $(n = 1, 2, \ldots, t)$, suppose
$$Y_t = XB + E_t$$
where $Y_t$ is a t × 1 vector of the t observed components, X is a t × P known design matrix, P is the number of regressors, $B = (b_1 \cdots b_p \cdots b_P)^T$ is a P × 1 vector of regression coefficients, and $E_t = (e_1 \cdots e_n \cdots e_t)^T$ is a t × 1 error vector with an unknown distribution function that has mean 0 and a finite but unknown variance-covariance matrix $\sigma_t^2 V_t$. $V_t$ equals $I_t$, the t × t identity matrix, when independent observations are assumed, as in Srivastava’s work.
The goal of Srivastava’s approach is to identify a confidence region $R_t$ in P-dimensional Euclidean space for the unknown B. This confidence region has a prescribed width 2d and a prescribed coverage probability 1-α. This implies that $R_T$ conforms to two criteria at Srivastava’s sequential estimation stopping time point, T 48: (1) $\lim_{d \to 0} P(B \in R_T) = 1 - \alpha$; (2) the maximum diameter of $R_T$ ≤ 2d. Moreover, the CI of any normalized linear combination cB, where c is a 1×P contrast vector and cc' = 1, also meets these two criteria. Hence, simultaneous coverage is attained for the family of linear contrasts.
The procedure of Srivastava’s sequential estimation is described as follows. Step 2 is applied recursively until the stopping point.
Step 1: Define the specific length of the CI as 2d and the acceptable type I error value α.
Step 2: Start sampling by taking $t_0$ observations $y_1, y_2, \ldots, y_{t_0}$, where $t_0 \ge P$. Verify the condition that $\lim_{t \to \infty} t^{-1}(X_t' X_t) = \Sigma$ is a P×P positive definite matrix. Then compute the estimates of the unknown parameters $B_t$ and $\sigma_t^2$ with the following equations:
$$\hat{B}_t = (X_t' V_t^{-1} X_t)^{-1} X_t' V_t^{-1} Y_t$$
$$\hat{\sigma}_t^2 = Y_t' \left[ V_t^{-1} - V_t^{-1} X_t (X_t' V_t^{-1} X_t)^{-1} X_t' V_t^{-1} \right] Y_t / (t - P)$$
Below, and as in Srivastava, we will assume $V_t$ equals $I_t$, a t×t identity matrix.
Then, collect one extra sample at each time point and stop sampling when
$$\hat{\sigma}_t^2 + t^{-1} \le \frac{d^2 t}{a_t^2 \lambda_t}$$
Then T is the smallest $t \ge t_0$ that satisfies the stopping criterion.
Here, $\{a_t\}$ is any sequence of positive constants converging to the number a that satisfies $P(\chi_P^2 \le a^2) = 1 - \alpha$, where $\chi_P^2$ represents a random variable following the chi-square distribution with P degrees of freedom. $\lambda_t$ is the maximum eigenvalue of $t (X_t' X_t)^{-1}$.
Final step: At the stopping time point T, the region $R_T$ is constructed as follows:
$$R_T = \left[ Z : T^{-1} \{Z - \hat{B}_T\}' (X_T' X_T) \{Z - \hat{B}_T\} \le \frac{d^2}{\lambda_T} \right]$$
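The stopping rule can be illustrated in its simplest special case, P = 1 with a design matrix consisting of a single column of ones, so that B is a mean. In that case $X_t' X_t = t$, hence $\lambda_t = 1$, and $a^2$ is the chi-square quantile with one degree of freedom, which equals the squared standard normal quantile. The sketch below makes these simplifications explicit; it also assumes $V_t = I_t$, the constant sequence $a_t = a$, and $t_0 \ge 2$ so that the variance estimate exists. The function name is hypothetical.

```python
from statistics import NormalDist, mean, variance

def sequential_mean_estimate(stream, d, alpha=0.05, t0=2):
    """Fixed-width sequential interval for a mean (P = 1 special case).

    Returns (T, (lower, upper)) at the stopping time, or None if the
    stream ends before the stopping criterion is met.
    """
    # chi-square(1) quantile a^2 = (standard normal quantile)^2
    a_squared = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    lam = 1.0  # maximum eigenvalue of t * (X'X)^{-1} = t * (1 / t)
    observed = []
    for y in stream:
        observed.append(y)
        t = len(observed)
        if t < t0:
            continue
        sigma2_hat = variance(observed)  # residual variance with divisor t - P
        if sigma2_hat + 1.0 / t <= d ** 2 * t / (a_squared * lam):
            center = mean(observed)
            return t, (center - d, center + d)  # region R_T has half-width d
    return None
```

A noisier series delays stopping, and a quieter one stops earlier, which is the adaptive behavior that motivates sequential estimation here.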
3.3 Multiple Comparisons Issue in Sequential Method
Sequential Bonferroni Method
One fMRI three-dimensional whole brain image volume can contain 64×48×64 = 196,608 voxels. Voxel-wise analysis can thus involve a very large number of hypotheses if each voxel is considered separately. This presents what is known as the “multiple comparisons” issue, and care is needed in determining the decision rules for the hypotheses in order to preserve the simultaneous Type I and Type II errors. The Bonferroni method is introduced here, as its corrections are easily reflected in the stopping boundary specifications 55. The method strictly controls the family-wise error rates and the family-wise power. The family-wise Type I error rate, called FWER_I, is the probability of rejecting one or more true null hypotheses. FWER_II is the probability of accepting one or more false null hypotheses. Family-wise power is the probability of detecting all true significant differences. The Bonferroni procedure guarantees that FWER_I < α and FWER_II < β based on the following modification. The Type I and Type II errors for a single voxel’s hypothesis are modified to αn = α/N and βn = β/N, where N is the total number of hypotheses (voxels) being considered at once 55, which may involve only a specific region of interest.
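The effect of this modification on the SPRT stopping boundaries can be seen in a short sketch. It simply substitutes αn = α/N and βn = β/N into the boundary formulas ( log((1-β)/α) , log(β/(1-α)) ); the function name is hypothetical.

```python
import math

def bonferroni_sprt_boundaries(alpha, beta, n_voxels):
    """Stopping boundaries (A, B) with per-voxel errors alpha/N and beta/N."""
    alpha_n = alpha / n_voxels
    beta_n = beta / n_voxels
    upper = math.log((1.0 - beta_n) / alpha_n)
    lower = math.log(beta_n / (1.0 - alpha_n))
    return upper, lower
```

For α = β = 0.05, the boundaries widen from about ±2.94 for a single voxel to about ±11.03 for one slice of 3072 voxels, so more observations are needed before any voxel reaches a decision.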
Chapter 4
Dynamic Localization and Stopping Using an SPRT Approach
4.1 Overview of Statistical Issues
The steps of the dynamic stimulus adjustment methods are introduced in this section. Real time pre-processing steps are first applied to the original data. Then the sequentially adaptive experimental design is carried out by the dynamic stimulus adjustment methods. The issues faced in implementing voxel-wise SPRT methods in rt-fMRI include the handling of spatial and temporal autocorrelation, drift in the fMRI signal over time, nuisance parameters, and multiple comparison issues arising from the large number of voxels being analyzed, even in restricted ROIs. Because of the concern for computational speed in real time, spatial autocorrelation is for now not modeled explicitly; instead, spatial smoothing of the pre-processed data is conducted. Temporal autocorrelation estimation will be considered. Also, a computationally simple drift correction will be employed. A voxel-wise SPRT will be proposed, as well as a stopping rule based on the Bonferroni correction. In the next sections, simulation results will be presented.
4.2 Pre-processing Step
Two real time pre-processing steps are applied in this study: spatial smoothing and normalized drift correction. Spatial smoothing was introduced in the previous chapter (2.1.4.1); 2-dimensional spatial Gaussian smoothing with a FWHM of 6 was applied to 3 by 3 matrices of voxels in the simulated data analysis.
4.2.1 Normalized Drift Correction
One potentially serious issue with fMRI signals is temporal low frequency drift,
which is mainly caused by instability of scanner mechanisms. Low frequency drift with
large amplitude changes the baseline of the signal appreciably. Appropriate correction
has been shown to significantly increase the accuracy of activation status classification
in rt-fMRI 5. One approach is to add a set of discrete cosine transform (DCT) functions to
the design matrix of the GLM. However, this enlarges the design matrix, and a bigger design
matrix increases the computational time of the matrix inversion step.
Therefore, for computational efficiency, another drift correction approach, a low
frequency filter, is applied at the pre-processing stage.
Let $y_t$ denote the BOLD response of one voxel at time point $t$. Then in the normalized
drift correction step $y_t$ is replaced by

\[ \tilde{y}_t = y_t - d_t \]
\[ d_t = d_{t-1} + \tau\,(y_t - d_{t-1}) = \tau y_t + \tau(1-\tau)\, y_{t-1} + \cdots + \tau(1-\tau)^{t-2}\, y_2, \qquad t > 1; \quad d_1 = 0 \]

where $\tilde{y}_t$ is the corrected signal intensity at time $t$, and $d_t$ is a weighted average of
previous time series data. The drift correction parameter, τ, is limited between 0 and 1. τ
decides how quickly the baseline is corrected. According to Figure 4.1, a faster
correction is employed when a bigger value of τ is applied.
[Figure 4.1: line plot titled "Drift correction amount of weighting", with curves for tau = 0.1, tau = 0.5 and tau = 0.9; horizontal axis from 0 to 100, vertical axis from 0 to 1.]
Figure 4.1 Amounts of smoothing weighting applied on time point 100 by different values of τ.
A data-specific low frequency drift line, $\{d_t\}_{t=1}^{T}$, is generated by weighting temporal
neighbors’ values. Then, by subtracting $\{d_t\}_{t=1}^{T}$, low frequency noise is ideally removed
from the recorded signal, $\{y_t\}_{t=1}^{T}$, giving the modified signal $\{\tilde{y}_t\}_{t=1}^{T}$.
In order to put the weights of the drift correction approach on a comparable scale,
the coefficients of the linear combination defining $d_t$ are normalized. The sum of the
coefficients equals:

\[ \tau + \tau(1-\tau) + \cdots + \tau(1-\tau)^{t-2} = \tau\left[1 + (1-\tau) + \cdots + (1-\tau)^{t-2}\right] = \tau \cdot \frac{1-(1-\tau)^{t-1}}{\tau} = 1-(1-\tau)^{t-1} \]

Therefore, $d_t$ in the drift correction method is divided by $1-(1-\tau)^{t-1}$ to bring
the linear combination defining $d_t$ to a comparable scale. The recursive
normalized drift correction algorithm is as follows:

\[ \tilde{y}_t = y_t - \frac{d_t}{1-(1-\tau)^{t-1}} \]
\[ d_t = d_{t-1} + \tau\,(y_t - d_{t-1}), \qquad t > 1 \]
\[ d_1 = 0 \]

The denominator, $1-(1-\tau)^{t-1}$, guarantees that the sum of all weights applied on $d_t$ equals
1.
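The recursion above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the study: the function name is hypothetical, and leaving the first sample unchanged is an assumption, since the normalized rule is only defined for t > 1.

```python
def normalized_drift_correction(y, tau):
    """Apply the recursive normalized drift correction to one voxel's series.

    y[0] is the sample at t = 1. The first sample is returned unchanged,
    which is an assumption: the normalized rule is only defined for t > 1.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if not y:
        return []
    corrected = [y[0]]
    d = 0.0  # d_1 = 0
    for t in range(2, len(y) + 1):
        y_t = y[t - 1]
        d = d + tau * (y_t - d)               # d_t = d_{t-1} + tau (y_t - d_{t-1})
        norm = 1.0 - (1.0 - tau) ** (t - 1)   # sum of the weights inside d_t
        corrected.append(y_t - d / norm)
    return corrected


# Hypothetical input: a flat baseline of 100 is removed for every t > 1.
flat = normalized_drift_correction([100.0] * 6, tau=0.1)
```

Because the normalized weights sum to 1, a constant baseline is cancelled from the second scan onward, which is the property the denominator is meant to guarantee.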
4.3 Voxel-wise SPRT
4.3.1 Voxel-wise General Linear Model in fMRI
General Linear Model
The most common fMRI statistical analysis approach is the voxel-wise GLM, which
assesses how well the observed BOLD response is associated with the BOLD response
expected from the stimuli. Voxels that are activated by a task are identified through
statistical inference on a task-related regression parameter. For a given voxel, the GLM is:

\[ Y_t = XB + E_t \]

where $Y_t$ is a t × 1 vector of measured BOLD signal intensities of the voxel over time,
and $E_t$, a t × 1 vector, represents the error components. X is a t × P design matrix
including the expected BOLD signal change generated by convolving the HRF with the
stimulus function of the implemented tasks. $B = (b_1, \ldots, b_p, \ldots, b_P)^T$ is a P × 1 vector
of regression coefficients that includes the task-related parameters. The t × 1 error
vector, $E_t$, is assumed to be normally distributed with mean zero and variance σ2Vt,
where σ2 is the error variance and Vt is a t × t matrix that may represent a temporal
autocorrelation structure. $Y_t$ is assumed to follow a multivariate normal probability
distribution:

\[ f(Y_t; B, \sigma^2 V_t) = \frac{1}{(2\pi)^{t/2}\, |\sigma^2 V_t|^{1/2}} \exp\left\{ -\frac{1}{2}\, (Y_t - XB)'\, (\sigma^2 V_t)^{-1}\, (Y_t - XB) \right\} \]

where $|\sigma^2 V_t|$ is the determinant of $\sigma^2 V_t$. Using generalized least squares (GLS)
estimation of the regression parameters and the corresponding variance values, one is able to
make inferences about a voxel’s activation status through hypothesis testing. This is
described next, in the context of a sequential testing framework based on SPRT.
One-sided Voxel-wise SPRT
We first consider the SPRT for one-sided hypothesis tests on contrasts of voxel-specific
regression parameters from GLM models. Two-sided analogues are similar, as will be
seen later on. The general form of the one-sided hypotheses is
H0: cB - θ0 = 0 versus Ha: cB - θ0 ≥ δ
(4.1)
where c = (c1, …, cp, …, cP) is a 1 × P contrast vector, so that cB is a linear
combination of the corresponding coefficients, and δ is a difference from θ0 that is
considered practically important. For instance, the hypothesis test in (4.1) can be of the
form H0: bp = 0 against Ha: bp ≥ 1, which tests whether or not a voxel is activated in
association with task p, assuming that the regressor corresponding to bp represents the
magnitude of expected HRF activation associated with task p, and that a magnitude of 1 is a
practical difference from 0. Other comparisons of task activations can be represented by linear
combinations of regression parameters across several tasks, which we
illustrate in the simulations.
$c\hat{B}$, the least squares (or maximum likelihood) estimate, is assumed to be normally
distributed with mean $cB$ and variance $\mathrm{Var}(c\hat{B})$. It takes the form:

\[ c\hat{B} = c\,(X'V^{-1}X)^{-1} X'V^{-1}Y \qquad (4.2) \]

The statistic that is the basis for a one-sided SPRT is the log of the ratio of the
likelihood of cB = θ1, given the observations collected up to time point t, to the likelihood
of cB = θ0 given the same observations, where θ1 denotes the value of cB specified under
the alternative (θ0 + δ in (4.1)). The formula of the likelihood ratio 54, 56 is:

\[ \Lambda_t = \log \frac{f(c\hat{B} \mid \theta_1)}{f(c\hat{B} \mid \theta_0)} = \log \frac{(2\pi)^{-1/2}\, \mathrm{Var}(c\hat{B})^{-1/2} \exp\left\{ -\frac{1}{2} (c\hat{B} - \theta_1)'\, \mathrm{Var}(c\hat{B})^{-1} (c\hat{B} - \theta_1) \right\}}{(2\pi)^{-1/2}\, \mathrm{Var}(c\hat{B})^{-1/2} \exp\left\{ -\frac{1}{2} (c\hat{B} - \theta_0)'\, \mathrm{Var}(c\hat{B})^{-1} (c\hat{B} - \theta_0) \right\}} \]
\[ = \frac{1}{2} \left[ (c\hat{B} - \theta_0)'\, \mathrm{Var}(c\hat{B})^{-1} (c\hat{B} - \theta_0) - (c\hat{B} - \theta_1)'\, \mathrm{Var}(c\hat{B})^{-1} (c\hat{B} - \theta_1) \right] \qquad (4.3) \]

However, Wald’s original formulation of SPRT doesn’t account for unknown
“nuisance” parameters that are not the main focus of inference, in this case $\mathrm{Var}(c\hat{B})$.
Estimates of such values will consequently be treated as true values in the calculation of
the SPRT, following Cox 53. These unknown parameters can vary between subjects.
For example, the variance, $\mathrm{Var}(c\hat{B})$, may differ from person to person and even change
across various fMRI experimental designs. Therefore, following Cox’s work 53,
$\mathrm{Var}(c\hat{B})$ is replaced by the corresponding maximum-likelihood estimate (MLE),
computed by the following equation:

\[ \widehat{\mathrm{Var}}(c\hat{B})_t = \hat{\sigma}_t^2\; c\,(X'V^{-1}X)^{-1} c' \qquad (4.4) \]
For the scope of this investigation, the temporal autocorrelation structure, V, will be
assumed known, for computational simplicity.
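To make the estimation step concrete, the sketch below computes the contrast estimate and its estimated variance for the special case in which V is the identity matrix, so that (4.2) reduces to ordinary least squares. The helper names `invert` and `contrast_estimate` are hypothetical, and the small Gauss-Jordan inverse is only meant for the handful of regressors used here; a general V would require a proper GLS routine.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def contrast_estimate(X, y, c):
    """Return (c B-hat, estimated Var(c B-hat)), assuming V is the identity."""
    t, P = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(t)) for j in range(P)]
           for i in range(P)]
    xty = [sum(X[k][i] * y[k] for k in range(t)) for i in range(P)]
    xtx_inv = invert(xtx)
    b_hat = [sum(xtx_inv[i][j] * xty[j] for j in range(P)) for i in range(P)]
    rss = sum((y[k] - sum(X[k][j] * b_hat[j] for j in range(P))) ** 2
              for k in range(t))
    sigma2_hat = rss / t  # maximum-likelihood estimate: divide by t, not t - P
    cb_hat = sum(ci * bi for ci, bi in zip(c, b_hat))
    var_cb = sigma2_hat * sum(c[i] * xtx_inv[i][j] * c[j]
                              for i in range(P) for j in range(P))
    return cb_hat, var_cb
```

In a real-time setting these quantities would be recomputed after each new scan; recursive updating of X'X and X'Y is one possible refinement that is not shown here.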
Given this SPRT, stopping occurs for a single voxel as follows. Asymptotically, due to
consistency of the estimators of unknown parameters used in Cox’s SPRT, the same
horizontal stopping boundaries as for Wald’s SPRT can be employed 56. Based on
user-specified values of Type I error α and Type II error β, the decision is made by the following
rejection/acceptance rules after collecting one scan image at scan time point t. These rules
are defined as:
1. Continue sampling when B < Λt < A
2. Stop sampling and accept H0 when Λt < B
3. Stop sampling and accept Ha when A < Λt
where stopping boundaries (A,B) = ( log((1-β)/α) , log(β/(1-α)) ).
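The one-sided statistic and the three rules can be sketched as follows for a scalar contrast. The function names are hypothetical, and the example inputs are made-up values rather than results from the study.

```python
import math


def sprt_boundaries(alpha, beta):
    """Return (A, B) = (log((1 - beta) / alpha), log(beta / (1 - alpha)))."""
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    return upper, lower


def one_sided_statistic(cb_hat, var_cb, theta0, theta1):
    """Log likelihood ratio of theta1 versus theta0 for a scalar contrast."""
    return ((cb_hat - theta0) ** 2 - (cb_hat - theta1) ** 2) / (2.0 * var_cb)


def sprt_decision(stat, upper, lower):
    """Apply the three rejection/acceptance rules to one voxel."""
    if stat > upper:
        return "accept Ha"
    if stat < lower:
        return "accept H0"
    return "continue"


# Hypothetical inputs: alpha = 0.01, beta = 0.1, estimate halfway between
# theta0 = 0 and theta1 = 1, so the statistic is 0 and sampling continues.
A, B = sprt_boundaries(0.01, 0.1)
status = sprt_decision(one_sided_statistic(0.5, 0.05, 0.0, 1.0), A, B)
```

An estimate exactly halfway between the two hypothesized values gives a statistic of zero, which is the case that takes longest to resolve.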
In practice, multiple voxels that comprise the regions of interest will be
analyzed simultaneously. The determination of whether to stop fMRI scanning for an
experiment should involve jointly considering all these voxels. A global stopping rule
will be proposed below that takes into account how many of the voxel-level analyses call
for stopping, so these single-voxel stopping rules are still pertinent to the overall goal of
developing sequential methods for fMRI analyses.
Two-sided Voxel-wise SPRT
It may also be of interest to test two-sided hypotheses about values of cB, of
the form:
H0: | cB - θ0| = 0 versus Ha: | cB - θ0| ≥ δ
(4.5)
where a difference between cB and θ0 greater than δ is considered practically
important.
As an example, suppose cB = b1 – b2, where b1 and b2 respectively reflect activation
levels for task 1 and task 2. It may not be known a priori for which task the activation is
higher in the voxel, and hence a two-sided hypothesis would be appropriate. Given
normality of the error terms, the estimate $c\hat{B}$ is assumed normally distributed with mean
cB and variance $\mathrm{Var}(c\hat{B})$. Let δ > 0 be as in (4.5). The likelihood under the alternative
hypothesis is a weighted average of f(Yt | cB = θ0 - δ) and f(Yt | cB = θ0 + δ),
where 1/2 has been proved to be the optimum weighting amount 47, 50. The SPRT statistic
for two-sided hypotheses for voxel-level analysis is:

\[ \Lambda_t = \log \frac{\frac{1}{2}\left[ f(c\hat{B} \mid \theta_0 - \delta) + f(c\hat{B} \mid \theta_0 + \delta) \right]}{f(c\hat{B} \mid \theta_0)} \]
\[ = \log \frac{\sum_{s \in \{-1, +1\}} \frac{1}{2}\,(2\pi)^{-1/2}\, \mathrm{Var}(c\hat{B})^{-1/2} \exp\left\{ -\frac{1}{2} \left(c\hat{B} - (\theta_0 + s\delta)\right)'\, \mathrm{Var}(c\hat{B})^{-1} \left(c\hat{B} - (\theta_0 + s\delta)\right) \right\}}{(2\pi)^{-1/2}\, \mathrm{Var}(c\hat{B})^{-1/2} \exp\left\{ -\frac{1}{2} (c\hat{B} - \theta_0)'\, \mathrm{Var}(c\hat{B})^{-1} (c\hat{B} - \theta_0) \right\}} \qquad (4.6) \]
Again, the SPRT statistic, including the statistical estimates that comprise it, is
computed after each new fMRI image is collected (in other words, in real time). $\mathrm{Var}(c\hat{B})$
is replaced by the corresponding MLE computed by equation (4.4). The
decision is made by the same rejection/acceptance rules as for one-sided hypothesis tests,
together with a truncation rule at a pre-specified upper bound on the number of scan units:
1. Continue sampling when B < Λt < A and t is less than a pre-specified upper
bound for scan units
2. Stop sampling and accept H0 when Λt < B
3. Stop sampling and accept Ha when A < Λt
4. Stop sampling if the upper bound for scan units is reached, and accept H0 if ΛT
≤ 0; otherwise accept Ha.
Recall stopping boundaries (A, B) = ( log((1-β)/α), log(β/(1-α)) ).
(4.7)
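A minimal sketch of the two-sided statistic and the four rules in (4.7), for a scalar contrast, is given below. Function names are hypothetical. The normalizing constants of the three normal densities are identical and cancel, so only the exponents are evaluated, and the mixture is computed with a log-sum-exp step to avoid underflow when the variance is small.

```python
import math


def two_sided_statistic(cb_hat, var_cb, theta0, delta):
    """Log ratio of the equal-weight mixture likelihood to the null likelihood."""
    def log_kernel(mean):
        # Exponent of the normal density; shared constants cancel in the ratio.
        return -((cb_hat - mean) ** 2) / (2.0 * var_cb)

    low = log_kernel(theta0 - delta)
    high = log_kernel(theta0 + delta)
    null = log_kernel(theta0)
    peak = max(low, high)
    log_mixture = peak + math.log(0.5 * math.exp(low - peak)
                                  + 0.5 * math.exp(high - peak))
    return log_mixture - null


def truncated_decision(stat, upper, lower, t, max_scans):
    """Apply rules 1 to 4: boundaries first, then the upper bound on scans."""
    if stat > upper:
        return "accept Ha"
    if stat < lower:
        return "accept H0"
    if t >= max_scans:
        return "accept H0" if stat <= 0.0 else "accept Ha"
    return "continue"
```

The statistic is symmetric around θ0, so estimates equally far above or below the null value produce the same evidence for the alternative.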
Voxel-wise SPRT with Temporal Correlation
Wald’s SPRT can be implemented on i.i.d. data or on non-i.i.d. data 54. However,
one of the disadvantages of Wald’s SPRT is that only one unknown parameter can be
accommodated in the test. Nuisance parameters, such as the variance parameter of a
normal distribution or the parameters of a variance-covariance structure in non-i.i.d. data,
are not accounted for.
The major noise sources in fMRI data, spontaneous fluctuations in brain metabolism and
physiology, result in temporal autocorrelation 57. Accordingly, a temporal
autocorrelation structure is included in the model.
Multiple Comparison Correction
A single three-dimensional whole-brain fMRI image volume can contain hundreds of
thousands of voxels. Conducting voxel-wise analysis can thus involve a very large
number of hypotheses if each voxel is considered statistically separately. This presents
what is known as the “multiple comparisons” issue, and care is needed in determining
decision rules for hypotheses, in order to preserve simultaneous Type I and Type II errors.
Bonferroni correction is used in this study, as these corrections are easily reflected in the
stopping boundary specifications. The Type I and Type II errors for a single voxel’s
hypothesis are modified as αn = α/N and βn = β/N, where N is the total number of
hypotheses (voxels) being considered at once 55, which may involve only a specific
region of interest.
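The effect of this adjustment on the stopping boundaries can be sketched as follows. The function name is hypothetical, and N = 2304 in the usage line is only an illustrative count (a full 48 × 48 image); the number of voxels actually entering the correction depends on the region of interest.

```python
import math


def bonferroni_boundaries(alpha, beta, n_tests):
    """Stopping boundaries (A, B) after replacing alpha, beta by alpha/N, beta/N."""
    alpha_n = alpha / n_tests
    beta_n = beta / n_tests
    upper = math.log((1.0 - beta_n) / alpha_n)
    lower = math.log(beta_n / (1.0 - alpha_n))
    return upper, lower


# Illustrative comparison: one test versus N = 2304 simultaneous tests.
single = bonferroni_boundaries(0.01, 0.1, 1)
many = bonferroni_boundaries(0.01, 0.1, 2304)
```

As N grows, the upper boundary rises and the lower boundary falls, so each voxel needs stronger evidence, and typically more scans, before it is classified.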
Global Stopping Rule
Since we are possibly testing a large number of voxels at once, a “global” decision is
needed on when to stop, based on the aggregate performance of the SPRTs across the tests. A
well-known irony (or drawback) associated with SPRTs is that the largest number of
observations required for stopping occurs precisely when we may be most indifferent to
the actual parameter value being tested 50. For instance, suppose that the true parameter
value is halfway between the null and alternative hypothesis values. This presents the
most difficult problem in terms of the number of scan units that will be required to stop in a
classical SPRT. Yet, it is also the case where indifference to the classification results may
be greatest, as the value truly is “in-between” the null and alternative. For efficient
stopping, we suggest a rule that stops when a pre-determined, user-defined percentage of
voxel-level tests satisfy the stopping criteria of (4.7). This circumvents
waiting for voxel tests that are “stragglers” when the more clear-cut cases have already
been decisively classified as active or clearly non-active. Therefore, a global stopping rule
is proposed that allows for identification of the most distinctive voxel activation levels,
but leaves some uncertainty for “borderline” cases that require relatively much more
testing, and for which the consequences of misclassification are less severe. In return, as we will
demonstrate, this allows for the chance to obtain great overall reductions in the number of
scan units needed for activation determinations across an ROI. A key to success for this
approach is to choose a pre-determined percentage that is reflective of the number of
voxels with activation levels that don’t lie between the null and alternative hypotheses.
Alternatively, this percentage should be at least large enough to include all significantly
activated voxels that would be of interest to the study at hand. The procedure for
implementing a global stopping rule is:
1. Predetermine a targeted percentage level, G%, that is acceptable in terms of
voxels that will have decisive classification
2. Stop fMRI scanning if at least G% of voxels satisfy (4.7) after multiple
comparisons adjustment via Bonferroni correction. Otherwise, continue
scanning.
3. At the stopping time point, T, the final activation decisions for all the voxels are
made according to the following rules:
1) Accept H0 when ΛT ≤ 0
2) Accept Ha when ΛT > 0
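The three-step procedure above can be sketched as two small functions operating on the current voxel-level statistics. The names are hypothetical, and the boundaries passed in are assumed to be already Bonferroni-adjusted.

```python
def global_stop(stats, upper, lower, target_percent):
    """True when at least target_percent of voxels have crossed a boundary."""
    decided = sum(1 for s in stats if s > upper or s < lower)
    return 100.0 * decided / len(stats) >= target_percent


def final_decisions(stats):
    """At the stopping time T: accept Ha when the statistic is > 0, else H0."""
    return ["Ha" if s > 0.0 else "H0" for s in stats]


# Hypothetical statistics for five voxels; two of them (40%) are decided.
example_stats = [6.0, -5.0, 0.4, -0.2, 1.0]
stop_now = global_stop(example_stats, upper=4.5, lower=-2.3, target_percent=30)
```

Voxels that have not crossed a boundary at time T are still classified, by the sign of their statistic, which is where the remaining uncertainty for “borderline” cases enters.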
4.3.2 Summary of Voxel-wise SPRT Procedures
In sum, the procedures of the voxel-wise SPRT are applied recursively until a globally
determined stopping point for experimental administration is reached. The steps are as
follows:
Step 1: Collect one new fMRI image.
Step 2: Apply real-time pre-processing procedures, such as spatial smoothing and
normalized drift correction 5.
Step 3: Compute the MLEs of cB and $\mathrm{Var}(c\hat{B})$ based on equations (4.2) and (4.4).
Step 4: Compute the SPRT statistic Λt, based on equation (4.3) or (4.6), depending on the
form of the hypothesis test, and the MLEs from Step 3.
Step 5: Determine if stopping would be invoked for each voxel-level test based on the
rejection/acceptance rules as in (4.7), incorporating Bonferroni correction as needed.
Step 6: Assess the global stopping criterion for the pre-determined target G%, to
determine if fMRI-wide stopping should be invoked. If not, repeat from Step 1.
Final Step: If the specified global stopping rule is satisfied, at each voxel all the fMRI
signal data that have been collected up to stopping for that voxel are used to make a final
determination as to activation status. The likelihood ratio as in (4.3) for a one-sided
hypothesis or (4.6) for a two-sided hypothesis is computed, and the rule for deciding
between hypotheses is to select the hypothesized parameter value with the
largest corresponding likelihood value.
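The steps can be strung together in a compact sketch. It is deliberately simplified and rests on several assumptions that are not part of the procedure above: a single task regressor with no intercept, V equal to the identity, pre-processing (Step 2) already applied to the input series, a minimum of five scans before testing starts, and a small floor on the variance estimate to avoid division by zero. All names are hypothetical.

```python
import math


def run_sequential_scan(voxel_series, regressor, alpha, beta, target_percent,
                        theta0=0.0, theta1=1.0, min_scans=5):
    """Return (stopping scan T, list of 'Ha'/'H0' decisions per voxel)."""
    n_vox = len(voxel_series)
    max_scans = len(regressor)
    if max_scans < min_scans:
        raise ValueError("need at least min_scans scans")
    # Bonferroni-adjusted boundaries A and B.
    upper = math.log((1.0 - beta / n_vox) / (alpha / n_vox))
    lower = math.log((beta / n_vox) / (1.0 - alpha / n_vox))
    stats = [0.0] * n_vox
    for t in range(min_scans, max_scans + 1):          # Step 1: one more image
        x = regressor[:t]
        sxx = sum(v * v for v in x)
        if sxx == 0.0:
            continue                                    # no task signal seen yet
        for i, series in enumerate(voxel_series):
            y = series[:t]
            b_hat = sum(a * b for a, b in zip(x, y)) / sxx             # Step 3
            rss = sum((b - b_hat * a) ** 2 for a, b in zip(x, y))
            var_b = max(rss / t, 1e-12) / sxx           # floored MLE variance
            stats[i] = ((b_hat - theta0) ** 2
                        - (b_hat - theta1) ** 2) / (2.0 * var_b)       # Step 4
        decided = sum(1 for s in stats if s > upper or s < lower)      # Step 5
        if 100.0 * decided / n_vox >= target_percent:                  # Step 6
            break
    return t, ["Ha" if s > 0.0 else "H0" for s in stats]   # Final step
```

Recomputing every sum from scratch at each scan keeps the sketch short; an actual real-time implementation would more likely update the sufficient statistics incrementally.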
4.4 Simulation Studies
In simulations we explore whether the proposed approach of conducting simultaneous
voxel-wise SPRTs is able to achieve accuracy similar to that of a fixed, predetermined
experimental design analyzed with the conventional GLM-based fMRI method, while at the
same time significantly reducing scanning times. The R package “neuRosim” 58 was used
to generate simulated fMRI images, and the simulated datasets were analyzed within the
Matlab environment (64-bit version R2012a, The Mathworks, Natick, MA).
In the simulations, the fixed experimental design includes two tasks, task A and task B.
Each task block is presented for 4 seconds, followed by a rest block of 20 seconds. The two
tasks are presented in alternation. This cycle is repeated 60 times: 30 times for task A and
30 times for task B. One rest block is applied at the beginning of the
experimental design. The block paradigm follows the sequence:
R|A|R|B|R|A|R|B|R|…
where R represents a rest block and A and B represent a task A block and a
task B block, respectively. With a TR of 2 seconds, this gives a total of 730 image scans
(10 for the initial rest block plus 60 × 12). Each voxel has
its own simulated fMRI signal. The simulated fMRI signals were generated by combining
time series associated with activation activity and noise. The signal intensity of one voxel at
time point t, y(t), is given by

\[ y(t) = \sum_{p=1}^{P} \beta_p f_p(t) + e(t) \qquad (4.8) \]
where fp(t) is the task p activation time series, βp is the corresponding activation
strength, and e(t) is the error at time t. fp(t) is obtained by convolving a commonly used
canonical HRF, the double gamma HRF, with the task p stimulus time series, xp(t). P is the
total number of tasks performed in the simulated experimental design, and P equals 2 in
this simulation study. The term e(t) is generated by taking into account white noise, low
frequency drift, physiological noise, temporal correlation and spatial correlation. White
noise is assumed to be normally distributed. Low frequency drift is generated by a basis
of discrete cosine functions. Heart beat and respiratory rates were set to 1.17
Hertz and 0.2 Hertz, respectively. Temporally correlated noise is generated based on an
autoregressive model of order one (AR(1)) with a ρ value of 0.3, which is suggested for
TR = 2 seconds 59. Spatial correlation is also modeled by an AR(1) process, with a ρ value
of 0.7. The noise for all voxels is generated with the same settings, as described above.
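As a check on the block paradigm and the scan count, the sketch below builds the stimulus indicator series for the two tasks and a simple AR(1) noise series. It is not the neuRosim procedure: the function names are hypothetical, the HRF convolution, drift, physiological and spatial components are omitted, and starting the AR(1) series from a plain Gaussian draw is an assumption.

```python
import random

TR = 2.0          # seconds per scan
TASK_SEC = 4.0    # duration of each task block
REST_SEC = 20.0   # duration of each rest block
N_CYCLES = 60     # 30 task A blocks and 30 task B blocks, alternating


def build_stimulus_series():
    """Return (x_a, x_b): 0/1 stimulus indicators per scan for tasks A and B."""
    rest_scans = int(REST_SEC / TR)
    task_scans = int(TASK_SEC / TR)
    x_a = [0] * rest_scans   # the design opens with one rest block
    x_b = [0] * rest_scans
    for cycle in range(N_CYCLES):
        is_a = cycle % 2 == 0
        x_a += [1 if is_a else 0] * task_scans + [0] * rest_scans
        x_b += [0 if is_a else 1] * task_scans + [0] * rest_scans
    return x_a, x_b


def ar1_noise(n, rho, sigma, seed=0):
    """AR(1) series e_t = rho * e_{t-1} + w_t with Gaussian innovations w_t."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        e.append(rho * e[-1] + rng.gauss(0.0, sigma))
    return e


x_a, x_b = build_stimulus_series()
noise = ar1_noise(len(x_a), rho=0.3, sigma=1.0)
```

The indicator series have 730 entries, matching the scan count stated for the design; convolving them with an HRF would give the regressors fp(t) of (4.8).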
The quality of data over time can be indicated by a voxel-wise calculation of the time
series SNR, which is defined here as follows 60:
SNR = standard deviation (std) of the task-related signal intensity / std of the
non-task-related signal, over time for one voxel
(4.9)
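Definition (4.9) amounts to a ratio of two standard deviations. A minimal sketch follows; the function name and the toy series are hypothetical, and the use of the population standard deviation is an assumption (for two series of equal length the choice between population and sample formulas cancels in the ratio).

```python
import statistics


def time_series_snr(task_signal, non_task_signal):
    """Ratio of the std of the task-related part to the std of the rest."""
    return statistics.pstdev(task_signal) / statistics.pstdev(non_task_signal)


# Toy series: task-related part with std 0.1, non-task part with std 1.
snr = time_series_snr([0.0, 0.2, 0.0, 0.2], [-1.0, 1.0, -1.0, 1.0])
```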
Our simulations are structured as follows. Two fMRI image scenarios are
simulated. Each image has three activation regions, as shown in Figure 4.2. One is
simulated with a relatively low SNR of 0.1, and the other with an SNR of 0.3.
Both simulated images have a size of 48 × 48 voxels. The activation pattern includes
three circular regions, 1, 2 and 3, with exponentially decaying activation responses,
which are activated by task A only (β1=β, β2=0), task B only (β1=0, β2=β), or
both task A and task B (β1=β, β2=β), respectively, as shown in Figure 4.2. Region 1
includes 377 voxels, region 2 includes 113 voxels and region 3 includes 377 voxels. These
are respectively 16.36%, 4.90% and 16.36% of the total number of voxels (2304) in one
simulated image. Region 1 is designed to explore detection in a larger region of interest
(ROI), and region 2 detection in a smaller ROI. Region 3 is used for investigating detection
when activation occurs for both tasks, including differential activation. The simulated image
with 0.1 SNR is generated by setting the maximum β values of the activation regions
equal to 1 (β1=β2=1), and the image with 0.3 SNR by setting the
maximum β values of the activation regions equal to 3 (β1=β2=3). The remaining β
values of voxels in the activation regions decay exponentially to zero from the peak
value. The labeled 0.1 and 0.3 SNR values are computed according to the maximum β
value of the simulated image.
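The region proportions quoted above follow directly from the voxel counts. The short check below uses only numbers stated in the text; the variable and function names are hypothetical.

```python
TOTAL_VOXELS = 48 * 48
REGION_SIZES = {"region 1": 377, "region 2": 113, "region 3": 377}


def region_percentages():
    """Share of the image covered by each activation region, in percent."""
    return {name: round(100.0 * size / TOTAL_VOXELS, 2)
            for name, size in REGION_SIZES.items()}


# Voxels outside the three activation regions (region 4, no activation).
inactive_voxels = TOTAL_VOXELS - sum(REGION_SIZES.values())
```

The remaining 1437 voxels form the inactive region 4, which is the denominator that appears for region 4 in the accuracy tables of Section 4.4.2.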
[Figure 4.2: map titled "Simulated activated status" on the 48 × 48 voxel grid (both axes ticked from 0 to 45); legend: Region 1: Task A, Region 2: Task B, Region 3: Both, Region 4: None.]
Figure 4.2 Activation pattern of simulated fMRI image
Before the simulated images were analyzed by voxel-wise SPRT and GLM, two
pre-processing steps, spatial smoothing and normalized drift correction, were
applied. Spatial smoothing used a Gaussian kernel with 6 full width at half maximum (FWHM)
weighting on 3 by 3 voxels, and the normalized drift correction used parameter τ = 0.1.
Real-time pre-processing with spatial smoothing 61 and drift correction 5 has been
investigated and is feasible.
Four simulation experiments were performed in this research. 1) In the first
simulation analysis, simulated images with 0.1 SNR were analyzed by one-sided voxel-wise
SPRT for task A active regions. 2) In the second simulation analysis, the same simulated
images and the same task A only activation were analyzed by two-sided voxel-wise SPRT.
3) In the third simulation analysis, simulated images with an SNR of 0.3 were analyzed for
task A activation by one-sided voxel-wise SPRT. 4) In the last simulation, 0.1 SNR
simulated images were analyzed by one-sided voxel-wise SPRT to detect regions of
differential activation between task A and task B.
4.4.1 Simulation Model
For one voxel with t scan units represented in time series data, the linear model in the
simulation analysis, including an intercept term and two task-related regressors, is
described as follows:

\[ Y_t = XB + E; \qquad E \sim N(0, \sigma^2 I) \]
\[ \begin{pmatrix} y_1 \\ \vdots \\ y_n \\ \vdots \\ y_t \end{pmatrix} = \begin{pmatrix} 1 & f_1(1) & f_2(1) \\ \vdots & \vdots & \vdots \\ 1 & f_1(n) & f_2(n) \\ \vdots & \vdots & \vdots \\ 1 & f_1(t) & f_2(t) \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix} + \begin{pmatrix} e_1 \\ \vdots \\ e_n \\ \vdots \\ e_t \end{pmatrix} \]

where Yt includes the observed fMRI signal intensities from time point 1 to t (1 ≤ n ≤ t).
f1(·) and f2(·) respectively represent the expected task A and task B BOLD signals, which
are generated by convolving a double-gamma HRF with the corresponding experimental
stimulus function. Type I error and Type II error are set to 0.01 and 0.1, respectively.
The stopping boundaries in the SPRT method are corrected for the multiple comparisons
of voxels by the Bonferroni approach. In GLM analysis, the Bonferroni approach is too
conservative to detect the active regions. Therefore, the multiple comparison issue is
corrected by controlling the false discovery rate (FDR) in the following investigations 32,
33.
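For readers unfamiliar with FDR control, the sketch below shows the Benjamini-Hochberg step-up procedure, one common way to control the FDR at level q. Whether this exact procedure was the one applied in the GLM analyses here is an assumption; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank   # largest rank meeting the step-up criterion
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical voxel-level p-values tested at q = 0.05.
flags = benjamini_hochberg([0.001, 0.8, 0.012, 0.04, 0.3], q=0.05)
```

Unlike the Bonferroni threshold, the cutoff adapts to how many small p-values are present, which is why it is less conservative when many voxels are active.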
4.4.2 Results
Efficiency of One-sided Voxel-wise SPRT on Activation Detection
The goal of the first simulation experiment is to explore whether the proposed
one-sided voxel-wise SPRT can efficiently achieve high detection accuracy on an fMRI image
with a weak task-related signal. The hypothesis associated with task A activation is H0: cB = 0
against Ha: cB ≥ 1, where c equals [0 1 0], and the hypothesis for task B is H0: cB = 0
against Ha: cB ≥ 1, where c equals [0 0 1]. Here, an activation strength of 1 or greater is
considered practically significantly different from zero. A dataset with 0.1
SNR was generated and conventional fMRI analysis methods were also employed. The simulated
activation strength structure of the analyzed dataset is displayed in Figure 4.3. The
maximum strength of activation is 1, and the remaining active voxels have magnitudes
decreasing from that of the strongest voxel. Here we focus on the ability to localize the
peaks of the activation regions. Voxels with activation strength greater than 0.8 are considered
to be located in the target areas, as they are the ones with values closest to 1. Therefore,
detection accuracy is presented as the percentage of voxels correctly detected
among the voxels truly having magnitudes greater than 0.8. For instance, the detection
accuracy in region 3 is defined as the number of voxels correctly claimed as both task A
and task B active divided by the number of voxels with activation magnitudes greater
than 0.8 for both task A and task B. In region 4, the percentage of voxels correctly identified
as inactive among truly inactive voxels is calculated as the detection accuracy. The
detection accuracy values from both statistical methods and the declared activation
status of each voxel are presented in Table 4.1 and Figure 4.4. One-sided voxel-wise
SPRT achieves around 95% detection accuracy across all four regions at the 30%
global stopping time point, requiring 222 scan units in total. Voxel-wise GLM
achieves a similar accuracy when 360 scan units are employed. This means that the
proposed method needs a scan period around 40% shorter than conventional fixed design
analyses to achieve comparably high accuracies. A fixed length of 360 scan units
was used as the benchmark sample size for traditional GLM in the following
simulation analyses. This value was selected because task activation detection at 360
scan units had accuracy similar to that of voxel-wise SPRT. For reference, we adopt this
fixed length for the other analyses as well.
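As a quick check of the saving quoted above, using only the two scan counts stated in the text:

```latex
\[ \frac{360 - 222}{360} = \frac{138}{360} \approx 0.383 \]
```

That is, a reduction of roughly 38% in scan units, consistent with the "around 40%" figure.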
Figure 4.3 The activation strength structure of dataset (SNR=0.1)
[Figure 4.4: two panels titled "222 scan units; 30 percentage STOP" and "360 scan units, FDR correction, 0.01 q value"; X axis and Y axis from 0 to 50; activation strength contours labeled 0.5 to 0.9; legend: claimed task A active, claimed task B active, claimed task A and B active.]
Figure 4.4 Voxels classified as active based on one-sided voxel-wise SPRT (SNR=0.1). The
classification results of activation status by one-sided voxel-wise SPRT (left) and GLM (right)
are shown separately. Red, green and yellow x signs stand for voxels classified
respectively as task A active, task B active and both tasks active. Activation strength contours are
also labeled in these two plots, from 0.9 to 0.5. The thresholds for the voxel-wise SPRT are set
through 0.01 Type I error and 0.1 Type II error and adjusted by the Bonferroni correction approach.
For voxel-wise GLM, the FDR correction approach was adopted, with q = 0.01.
A. Detection accuracies in one-sided voxel-wise SPRT approach
(30% global stopping, 222 scan units)
True activation strength | Region 1 | Region 2 | Region 3 | Region 4
b=0.0 | - | - | - | 99.37% (=1428/1437)
1.0>b≥0.8 | 97.30% (=36/37) | 88.89% (=8/9) | 94.59% (=35/37) | -
B. Detection accuracies in one-sided voxel-wise GLM approach
(360 scan units, FDR correction)
True activation strength | Region 1 | Region 2 | Region 3 | Region 4
b=0.0 | - | - | - | 99.93% (=1436/1437)
1.0>b≥0.8 | 94.59% (=35/37) | 100% (=9/9) | 100% (=37/37) | -
Table 4.1 Detection accuracies among the 4 simulated activation areas; one-sided SPRT applied
on dataset (0.1 SNR). Detection accuracies of the activation results declared by one-sided
voxel-wise SPRT and voxel-wise GLM are displayed. Error percentages are around 5% or lower,
except when 8 of 9 voxels within the activation range are correctly classified, leading to
88.89% accuracy.
Efficiency of Two-sided Voxel-wise SPRT on Activation Detection
A one-sided hypothesis can detect whether the activation level of a voxel is greater
(or smaller) than the specified null hypothesis value, θ0, in one specified direction. However,
researchers might also need two-sided hypotheses, such as of the form H0: cB = 0
against Ha: cB ≠ 0. Note that c equals [0 1 0] for task A activation detection and [0
0 1] for task B activation detection. Suppose |bk| > 1, k = 1 and 2, is considered
practically important here. The simulated dataset with 0.1 SNR was again analyzed by the
sequential and fixed approaches. For two-sided hypotheses, voxel-wise SPRT achieves
detection accuracies comparable to those of voxel-wise GLM by 256 scan units, a saving of
around 30%. The plots in Figure 4.5
show the voxels identified as task A active, task B active or both tasks active by the
two approaches. The corresponding detection accuracies of the four regions are presented in
Table 4.2.
[Figure 4.5: two panels titled "360 scan units, FDR correction, 0.01 q value" and "256 scan units, 20 percentage global STOP"; X axis and Y axis from 0 to 50; activation strength contours labeled 0.5 to 0.9; legend: claimed task A active, claimed task B active, claimed task A and B active.]
Figure 4.5 Voxels classified as active based on two-sided voxel-wise SPRT (SNR=0.1). The
plot on the right side shows the classified activations by voxel-wise SPRT. The testing thresholds
are 0.01 Type I error and 0.1 Type II error, adjusted by the Bonferroni correction approach in
terms of the number of voxels in the ROI. The other plot shows the results from voxel-wise GLM,
with FDR correction applied with q = 0.01.
A. Detection accuracies using a two-sided voxel-wise SPRT approach
(20% global stopping, 256 scan units)
True activation strength | Region 1 | Region 2 | Region 3 | Region 4
b=0.0 | - | - | - | 99.58% (=1431/1437)
1.0>b≥0.8 | 94.59% (=35/37) | 100% (=9/9) | 91.89% (=34/37) | -
B. Detection accuracies using a two-sided voxel-wise GLM approach (360 scan units)
True activation strength | Region 1 | Region 2 | Region 3 | Region 4
b=0.0 | - | - | - | 99.93% (=1436/1437)
1.0>b≥0.8 | 94.59% (=35/37) | 100% (=9/9) | 94.59% (=35/37) | -
Table 4.2 Detection accuracies among the 4 simulated activation areas; two-sided SPRT applied
on dataset (0.1 SNR). Two-sided voxel-wise SPRT and GLM related
methods give comparable detection accuracies among the four activation conditions. The proposed
sequential method required around 30% fewer scan units (256 versus 360).
Adjustment of Experimental Design of One-sided Voxel-wise SPRT on Activation Detection
In this section, a dataset with a stronger task-related signal, 0.3 SNR, is employed to
investigate the performance of SPRT when activation is actually stronger than the
threshold specified in the alternative hypothesis. It is expected that the required sample
size will be smaller. The hypothesis is H0: b1 = 0 against Ha: b1 ≥ 1 for detecting
task A activation and H0: b2 = 0 against Ha: b2 ≥ 1 for detecting task B activation.
The activation levels that were previously considered in the simulation with 0.1 SNR are
now assumed to have task-related signals three times stronger. The results show that only
180 scan units are needed to identify with near 100% accuracy the voxels with activation
magnitudes of 0.8 or greater in regions 1, 2 and 3. Region 4 has around 92% inactivation
detection accuracy. Voxel-wise GLM also achieves high accuracy in the four
regions when the same fixed design of 360 scan units is adopted, as in the first simulation.
However, traditional GLM doesn’t allow for adaptive adjustment according to the collected
data. Compared with traditional GLM, 50% of the scan units (180 of 360) are saved by applying
voxel-wise SPRTs. The results are presented in Figure 4.6 and Table 4.3.
[Figure 4.6: two panels titled "180 scan units; 20 percentage STOP" and "360 scan units, FDR correction, 0.01 q value"; X axis and Y axis from 0 to 50; activation strength contours labeled 0.5 to 0.9; legend: claimed task A active, claimed task B active, claimed task A and B active.]
Figure 4.6 Voxels classified as active based on one-sided voxel-wise SPRT (SNR=0.3). The plot
on the right side shows the voxels classified as active by voxel-wise SPRT. Boundaries are
derived for 0.01 Type I error and 0.1 Type II error, and corrected by the Bonferroni approach.
The plot on the left side shows the classified activation status generated by voxel-wise GLM, with
q= 0.01 FDR correction applied.
A. Detection accuracies in one-sided voxel-wise SPRT approach
(20% global stopping, 180 scan units)

True activation strength | Region 1        | Region 2    | Region 3      | Region 4
b = 0.0                  |                 |             |               | 92.69% (=1332/1437)
1.0 > b ≥ 0.8            | 97.30% (=36/37) | 100% (=9/9) | 100% (=37/37) |

B. Detection accuracies in one-sided voxel-wise GLM approach
(360 scan units, FDR correction)

True activation strength | Region 1        | Region 2    | Region 3      | Region 4
b = 0.0                  |                 |             |               | 91.09% (=1309/1437)
1.0 > b ≥ 0.8            | 100% (=37/37)   | 100% (=9/9) | 100% (=37/37) |
Table 4.3 Detection accuracies among the 4 simulated activation areas; one-sided SPRT applied
on dataset (0.3 SNR). When voxels have greater activation magnitude, their activation status is
easier to identify, and the required time period is shortened by applying SPRT.
Efficiency of One-sided Voxel-wise SPRT on Differential Activation Detection
Voxel-wise SPRT is able to detect not only activated regions but also regions with
differential activation. This is achieved by testing a contrast of the task-related
parameters. For instance, to detect the areas with stronger task A activation
than task B activation, a one-sided hypothesis can be set as H0: cB = 0 against Ha:
cB > δ for some δ > 0, where c equals [0 1 -1]. Suppose a difference greater than 1 is
considered practically important here, so that Ha: cB > 1. The dataset with 0.1 SNR
was analyzed by voxel-wise SPRT and voxel-wise GLM. The true differential
activation structure is displayed in Figure 4.7. Only region 1 has a positive difference,
showing greater task A activity. Region 2, on the other hand, has a negative difference.
There is no activation difference between task A and task B in regions 3 and 4. For the
sequential and fixed design approaches, the accuracies of identifying the voxels that truly
have differential activation magnitudes greater than 0.8 were computed, as shown in Table 4.4.
In addition, plots of the voxels classified as differentially active are shown in Figure
4.8. Voxel-wise SPRT needs 295 scan units to achieve around 95% differential
activation accuracy in region 1.
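As an illustration of the contrast c = [0 1 -1], the following minimal sketch estimates cB and its variance for a single voxel by ordinary least squares. The toy data, the boxcar regressors (not convolved with an HRF, for brevity), the noise level and the helper names `solve` and `contrast_estimate` are all assumptions made for this example and are not taken from the simulations described here.

```python
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def contrast_estimate(X, y, c):
    """Return (c'b_hat, estimated variance of c'b_hat) from an OLS fit of y on X."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    b_hat = solve(xtx, xty)
    resid = [yi - sum(bj * xj for bj, xj in zip(b_hat, row)) for row, yi in zip(X, y)]
    sigma2 = sum(r * r for r in resid) / (len(y) - p)
    # (X'X)^{-1} c is obtained by solving (X'X) v = c, so var(c'b_hat) = sigma2 * c'v.
    v = solve(xtx, c)
    estimate = sum(ci * bi for ci, bi in zip(c, b_hat))
    variance = sigma2 * sum(ci * vi for ci, vi in zip(c, v))
    return estimate, variance


# Hypothetical voxel: task A effect 2.0, task B effect 0.5, so the true contrast is 1.5.
rng = random.Random(0)
n_scans = 200
task_a = [1.0 if (t // 10) % 4 == 1 else 0.0 for t in range(n_scans)]
task_b = [1.0 if (t // 10) % 4 == 3 else 0.0 for t in range(n_scans)]
X = [[1.0, a, b] for a, b in zip(task_a, task_b)]
y = [2.0 * a + 0.5 * b + rng.gauss(0.0, 0.5) for a, b in zip(task_a, task_b)]

c = [0.0, 1.0, -1.0]  # intercept ignored, task A minus task B
est, var = contrast_estimate(X, y, c)
print(est, var)
```

The estimate and its variance are the two quantities a contrast-based test, sequential or fixed, would be built on; how they enter the SPRT statistic depends on the likelihood specification used.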
Figure 4.7 Differential active strength structure of dataset with 0.1 SNR
[Figure 4.8: two panels. Panel titles: "295 scan units, 20 percentage STOP" and "360 scan
units, FDR correction, 0.01 q value". Legend in both panels: claimed differential activation.]
Figure 4.8 Voxels classified as differentially active based on one-sided voxel-wise SPRT
(SNR=0.1). The plot on the right side shows the voxels with classified differential activation by
voxel-wise SPRT. The thresholds are derived for 0.01 Type I error and 0.1 Type II error and
corrected by the Bonferroni approach. The left side plot shows the result from voxel-wise GLM,
q= 0.01 FDR correction applied.
A. Detection accuracies in one-sided voxel-wise SPRT approach
(20% global stopping, 295 scan units)

True activation strength | Region 1        | Region 2          | Region 3           | Region 4
b = 0.0                  |                 | 100% (=113/113)   | 97.35% (=367/377)  | 99.10% (=1424/1437)
1.0 > b ≥ 0.8            | 94.60% (=35/37) |                   |                    |

B. Detection accuracies in one-sided voxel-wise GLM approach
(360 scan units, FDR correction)

True activation strength | Region 1        | Region 2          | Region 3           | Region 4
b = 0.0                  |                 | 61.95% (=70/113)  | 64.19% (=242/377)  | 71.82% (=1032/1437)
1.0 > b ≥ 0.8            | 100% (=37/37)   |                   |                    |
Table 4.4 Detection accuracies for the simulated differential activation areas; one-sided SPRT
applied on dataset (0.1 SNR).
4.5 Real fMRI Studies
To demonstrate the potential impact of our proposed approaches with actual fMRI data,
we consider the problem of identifying a well-known and repeatedly described distinct
region in the human ventral visual pathway, the fusiform face area (FFA), which is
associated with viewing faces, a distinct class of visual stimuli. The objective was to
identify specific regions that respond to adult face images but not to house images. Subjects
completed a 600-scan-unit experiment comprising 8 different stimuli: adult faces, computer GUI,
computer robot, computer humanoid, juvenile animals, houses, children’s faces and adult animals.
These 8 stimuli were presented on the screen in random order. Among them, 24
adult face stimuli and 24 house images were presented. The recorded data were spatially
smoothed during pre-processing, and then 717 voxels located in the FFA were
selected for analysis by voxel-wise SPRT and GLM.
4.5.1 Analysis Model
For one voxel whose time series covers t scan units, the linear model, analogous to the one
in the simulation analysis and including an intercept term and 8 task-related regressors, is
described as follows:
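\[
Y_t = XB + E; \qquad E \sim N(0, \sigma^2 I)
\]

\[
\begin{bmatrix} y_1 \\ \vdots \\ y_n \\ \vdots \\ y_t \end{bmatrix}
=
\begin{bmatrix}
1 & f_1(1) & f_2(1) & f_3(1) & f_4(1) & f_5(1) & f_6(1) & f_7(1) & f_8(1) \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & f_1(n) & f_2(n) & f_3(n) & f_4(n) & f_5(n) & f_6(n) & f_7(n) & f_8(n) \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & f_1(t) & f_2(t) & f_3(t) & f_4(t) & f_5(t) & f_6(t) & f_7(t) & f_8(t)
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \\ b_8 \end{bmatrix}
+
\begin{bmatrix} e_1 \\ \vdots \\ e_n \\ \vdots \\ e_t \end{bmatrix}
\]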
where Y_t contains the observed fMRI signal intensities from time point 1 to t (1 ≤ n ≤ t), and
f_1(·), f_2(·), f_3(·), f_4(·), f_5(·), f_6(·), f_7(·) and f_8(·) respectively represent the
expected BOLD signals for the image stimuli of adult faces, computer GUI, computer robot,
computer humanoid, juvenile animals, houses, children’s faces and adult animals. The expected
BOLD signals are generated by convolving a double-gamma HRF with the corresponding experimental
stimulus function. The Type I and Type II error rates are set to 0.01 and 0.1, respectively.
The stopping boundaries in the SPRT method are corrected for the multiple comparisons across
voxels by a Bonferroni approach. In GLM analysis, the Bonferroni approach is too conservative,
making it difficult to detect the active regions. Therefore, the multiple comparison issue is
addressed by controlling the FDR in the following investigations.
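The construction of the design matrix X can be sketched as follows. This is only an illustration under stated assumptions: the double-gamma shape parameters (6 and 16, with an undershoot ratio of 1/6), the 2-second TR, the 32-second kernel length and the function names are hypothetical choices for this example, since the HRF parameters actually used are not given in this section. Two stimulus types are shown instead of eight to keep the example short.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, under_ratio=1.0 / 6.0):
    """Double-gamma HRF at time t in seconds (parameter values are assumed)."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - under_ratio * under


def expected_bold(stimulus, tr=2.0, hrf_length=32.0):
    """Convolve a 0/1 stimulus function (one value per scan) with the sampled HRF."""
    kernel = [double_gamma_hrf(k * tr) for k in range(int(hrf_length / tr) + 1)]
    return [
        sum(stimulus[i - k] * kernel[k] for k in range(len(kernel)) if i - k >= 0)
        for i in range(len(stimulus))
    ]


def design_matrix(stimuli, tr=2.0):
    """Stack an intercept column and one convolved regressor per stimulus type."""
    regressors = [expected_bold(s, tr) for s in stimuli]
    n_scans = len(stimuli[0])
    return [[1.0] + [r[i] for r in regressors] for i in range(n_scans)]


# Hypothetical 20-scan run with one adult face event and one house event.
n_scans = 20
face = [1 if i == 2 else 0 for i in range(n_scans)]
house = [1 if i == 10 else 0 for i in range(n_scans)]
X = design_matrix([face, house])
face_column = [row[1] for row in X]
print(face_column.index(max(face_column)))  # scan at which the face regressor peaks
```

Each row of X corresponds to one scan unit, so in a real-time setting the matrix simply grows by one row as each new volume arrives.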
4.5.2 fMRI Data Analysis
In this experiment, adult face and house images were displayed in random order with variable
delay. The goal is to identify the regions activated by adult faces and the differential
activation regions that activate when an adult face stimulus is shown but not when a house
stimulus is shown. We performed voxel-wise SPRTs of linear contrasts of GLM regression
parameters. The experimental design matrix, X, reflects one intercept term and all
conditions, including the conditions listed above that are not of interest for this report.
Each BOLD stack was spatially smoothed with a 3D Gaussian filter with a FWHM of
2 voxels (6 mm). After applying the appropriate contrast matrix, the hypotheses can be built
for the corresponding regions. The hypothesis for adult face detection is H0: b_adult_face = 0
against Ha: b_adult_face > 1. The hypothesis for differential activation detection is H0:
b_adult_face - b_house = 0 against Ha: b_adult_face - b_house > 1. Spatial smoothing was
conducted as a pre-processing step, and then 717 voxels were selected as an ROI to focus the
analysis by voxel-wise SPRT and GLM on the identification of the FFA.
4.5.3 Real fMRI Studies Results
In a real-data example from a subject found to have successful (differential) activity
localization, the dynamic methods were applied to the existing data and compared with the more
traditional, GLM-based complete-data findings of activation in a masked region
consisting of 717 voxels. Dramatic savings in scan time were found with comparable
localization findings. In practice, 200 scan units per session over 3 sessions were used
(600 in total, including 24 adult face stimuli and 24 house stimuli). With our proposed
sequential methods, this was reduced to 160 scan units, including 2 adult face stimuli, for
activity detection and to 209 scan units, including 3 adult face stimuli and 5 house image
stimuli, for differential activity detection. In both detections, overall stopping of
administration was determined when more than 60% of the voxel-level SPRTs called for
stopping individually. The complete-data results determined that 328 voxels were active
for the adult face stimuli, while the presented dynamic, SPRT-based approach found 399
such active voxels. A total of 303 voxels were found active by both
approaches. For differential activation detection, the traditional GLM approach identified 88
voxels that are active for adult faces but not house images, and voxel-wise SPRT
detected 123 such voxels; 68 voxels were claimed active by both methods. The results are
presented in Figures 4.9 and 4.10, respectively. For activity and differential activity
detection, the reductions in the sample size required by the SPRT method are
around 73% and 65%, respectively, from 600 scan units. For differential activity detection,
the proposed method requires around 87% and 80% fewer adult face and
house image stimuli, respectively. These large savings were obtained together with a high
overlap with the results of the GLM-based method. Bonferroni correction was also
applied to the traditional GLM; however, this led to overly conservative results. The results
corrected by FDR are therefore displayed here for comparison with the results from the proposed
voxel-wise SPRT.
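The percentages quoted above follow directly from the reported counts. The short check below recomputes them; the function and variable names are illustrative only, and the overlap percentages (the share of GLM-detected voxels also detected by SPRT) are an additional summary derived from the same counts rather than figures stated in the text.

```python
def percent_reduction(used, total):
    """Percentage of `total` that is saved when only `used` units are needed."""
    return 100.0 * (total - used) / total


# Scan units: 600 in the fixed design versus 160 (activity) and 209 (differential).
activity_saving = percent_reduction(160, 600)
differential_saving = percent_reduction(209, 600)

# Stimuli needed for differential detection: 3 of 24 adult faces, 5 of 24 houses.
face_saving = percent_reduction(3, 24)
house_saving = percent_reduction(5, 24)

# Share of the GLM-detected voxels that were also detected by voxel-wise SPRT.
activity_overlap = 100.0 * 303 / 328
differential_overlap = 100.0 * 68 / 88

for name, value in [
    ("scan saving, activity", activity_saving),
    ("scan saving, differential", differential_saving),
    ("adult face stimuli saving", face_saving),
    ("house stimuli saving", house_saving),
    ("overlap with GLM, activity", activity_overlap),
    ("overlap with GLM, differential", differential_overlap),
]:
    print(f"{name}: {value:.1f}%")
```

This gives 73.3% and 65.2% for the scan-unit savings, 87.5% and 79.2% for the stimulus savings, and overlaps of 92.4% and 77.3% with the GLM-detected voxels.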
[Figure 4.9: 3D scatter plot with X, Y and Z axes. Title: "Result of voxel-wise GLM (600 scan
units) and SPRT (60 percent STOP, 160 scan units)", FFA (717 voxels). Legend: claim active by
voxel-wise SPRT; claim active by voxel-wise GLM, 0.01q FDR.]
Figure 4.9 Adult face activity detection results. The voxels identified as activated by the adult
face stimulus by voxel-wise GLM methods are labeled by a red square. The voxels detected by
voxel-wise SPRT are labeled as blue cross signs. q=0.01 FDR correction was applied on GLM
related methods and SPRT bounds were derived for 0.01 Type I error and 0.1 Type II error,
corrected by Bonferroni adjustment for number of voxels in the ROI.
[Figure 4.10: 3D scatter plot with X, Y and Z axes. Title: "Results from fixed GLM (600 time
units) and voxel-wise SPRT (60 percent STOP, 209 time units)", FFA (717 voxels). Legend: claimed
active by voxel-wise SPRT; claimed active by GLM.]
Figure 4.10 Differential activity detection classification results. The voxels claimed
differentially active by voxel-wise GLM methods are labeled as red squares and the voxels
detected by voxel-wise SPRT are labeled as blue crosses.
4.6 Discussion
Voxel-wise SPRT can be a helpful tool for efficiently detecting (differential) brain
activation. In the first simulation, the ability of one-sided voxel-wise SPRT to locate the
peak of activation regions shows that less than 40% of the scan units are needed to
achieve higher than 90% detection accuracies compared with the conventional analytic
approach through GLM. These high accuracies are observed in all three activation
regions, which respectively represent bigger regions, smaller regions and regions involving
activation from both tasks. This implies that the proposed method is
appropriate for different activation scenarios. Due to variation among people, different
structures of task-specific activation strength are observed among different subjects.
Distinguishing regions with near-zero activation levels may not be as important as
establishing the existence of “peaks”, the regions with the strongest task-related
activation, and identifying their locations. The presented approach,
which can detect the peak of activation with a more than 40% shorter scan time period,
can provide great savings in cost, as it decreases the time of the fMRI session.
The second prominent advantage is that, in the presented approach, the experimental design
can be adjusted according to the subject’s real-time responses. A lengthy fMRI design is in
danger of being affected by learning effects or fatigue, while an fMRI design that is too
brief risks inaccurate results due to insufficient data. To ensure precision of
detection, a long, pre-determined experimental design is usually used. This is
conservative in that less favorable variance and activation level values must be assumed
in the design, to reflect the range of values that may be observed in a target population.
An important characteristic of voxel-wise SPRT is its ability to adaptively determine,
under pre-specified testing error thresholds, when to stop administration of
experimentation, based on the collected data. Classification of activation status is made
once sufficient evidence is collected in a collective manner across ROIs. For instance, in
the third simulation, the required scan period for SPRT is shortened from 256 to 180 scan
units when stronger signals (0.3 SNR) are analyzed. At the same time, 360 scan
units were needed to attain certain accuracy levels in this experimental design based on
the more conservative fMRI signal intensity assumption of the first simulation, which was
used as the basis for justifying the fixed design. In this case, a substantial scan unit
savings of 50% is found by applying the proposed approach. The proposed voxel-wise SPRT
allows the experimental design to be adjusted according to the BOLD signals observed in real
time when variances differ among subjects.
The proposed approach is appropriate not only for one-sided hypotheses but also for
two-sided hypotheses. In the two-sided case, the numerator of the SPRT statistic, the
likelihood given that the alternative hypothesis is true, is represented by a weighted average
of two possible likelihoods. Because the numerator of the statistic is diluted by considering
two directions of departure from the null hypothesis, less decisive evidence is
provided by the voxel-wise SPRT statistics. Therefore, a larger number of scan units is
naturally expected. In the second simulation, the proposed sequential method is still able to
detect the regions with two-sided hypothesis tests, and a saving of around 30% in scan
units is still observed.
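The dilution effect can be illustrated with a simplified statistic. The sketch below assumes, purely for illustration, that the estimated effect of a voxel is normally distributed around the true effect with known variance, and that the two alternatives (+δ and -δ) receive equal weights; neither the exact likelihood nor the weights used in this work are specified in this section, and the function names are hypothetical.

```python
import math


def log_lr_one_sided(theta_hat, var, delta):
    """log LR of effect = delta versus effect = 0 when theta_hat ~ N(effect, var)."""
    return delta * (theta_hat - delta / 2.0) / var


def log_lr_two_sided(theta_hat, var, delta, weight=0.5):
    """log of the weighted average of the LRs for effect = +delta and effect = -delta."""
    up = log_lr_one_sided(theta_hat, var, delta)
    down = log_lr_one_sided(theta_hat, var, -delta)
    top = max(up, down)  # log-sum-exp trick for numerical stability
    mixture = weight * math.exp(up - top) + (1.0 - weight) * math.exp(down - top)
    return top + math.log(mixture)


# Hypothetical voxel with an estimated effect of 0.8 and estimator variance 0.05.
one_sided = log_lr_one_sided(0.8, 0.05, 1.0)
two_sided = log_lr_two_sided(0.8, 0.05, 1.0)
print(one_sided, two_sided)
```

For a clearly positive effect, the two-sided statistic is smaller than the one-sided statistic by nearly log 2 under equal weights, so more scans are needed before the same upper boundary is crossed.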
In the fourth simulation, differential activation detection does not
save as many scan units as activation detection does in the other scenarios.
Nevertheless, one-sided voxel-wise SPRT performs better in differential inactivation
detection accuracy in regions 2, 3 and 4 (as shown in Figure 4.8). Further, the accuracy
levels in inactive regions (Type II error levels) are not improved by extending fMRI scan
sessions in GLM. The proposed sequential approach, on the other hand, provides high
accuracy in detecting both differential activation and inactivation.
In the real-data example, the objective is to identify regions that activate when
an adult face stimulus is shown but not when a house stimulus is shown. This involved analysis
of a contrast of regression parameters. Again, large savings in scan time of over 65% were
observed using the sequential approach as compared with the length of the original
design.
Limitations
From a sequential testing point of view, Bonferroni correction can be a very
conservative approach to controlling the simultaneous Type I and Type II errors,
especially in fMRI analysis, where a large number of voxel-level tests are
conducted at once. Such adjustment can lead to very large or very small threshold
values and, consequently, a large gap between the two stopping boundaries for each
voxel-level test. This leads to a need for a large number of scan units, even with
sequential testing. These methods may be more successful when focused ROIs are being
studied, although the breadth of the simulations indicates that it may be possible to
consider several moderately sized ROIs simultaneously and still achieve efficiency gains
while preserving statistical accuracy.
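The widening of the gap between the stopping boundaries can be made concrete with Wald's classical approximate boundaries. The sketch below is an assumption-laden illustration: it uses Wald's approximations and divides both error rates by the number of tests, which may differ in detail from the correction applied in this work, and the function name is hypothetical.

```python
import math


def wald_log_boundaries(alpha, beta, n_tests=1):
    """Wald's approximate SPRT boundaries on the log scale, with both error
    rates divided by the number of tests (Bonferroni-style; an assumption)."""
    a = alpha / n_tests
    b = beta / n_tests
    lower = math.log(b / (1.0 - a))  # accept H0 at or below this value
    upper = math.log((1.0 - b) / a)  # accept Ha at or above this value
    return lower, upper


single = wald_log_boundaries(0.01, 0.1)            # one test
roi = wald_log_boundaries(0.01, 0.1, n_tests=717)  # an ROI of 717 voxels
print(single, roi)
print(roi[1] - roi[0], single[1] - single[0])      # gap between the boundaries
```

Because each boundary moves outward roughly by the logarithm of the number of tests, the gap grows slowly but steadily with ROI size, which is consistent with the suggestion that focused ROIs are the most favorable setting.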
Further, temporally autocorrelated noise is generated by an AR(1) model with a ρ value of 0.3
in the simulated fMRI images. The signal intensity can be modified by pre-processing steps,
such as spatial smoothing and normalized drift correction. Taking the
0.1 SNR simulated data as an example, the given ρ value of 0.3 is reduced to 0.06 after the two
pre-processing steps. Therefore, even when temporal autocorrelation is ignored, one is
still able to get high accuracy levels in the analysis. Further investigation showed that
ρ values of up to 0.7 in the noise terms decreased to
a maximum of 0.12 after the two pre-processing steps. However, the strength of temporal
correlation differs across TRs, machines, experimental designs, and subjects [28].
Voxel-wise SPRT is able to include the temporal autocorrelation structure, which is a
possible source of noise in fMRI data [57].
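For reference, AR(1) noise with a given ρ and a check of its lag-1 autocorrelation can be sketched as follows. The series length, seed, unit marginal variance and function names are assumptions for this example, and the pre-processing steps that reduce ρ in the simulations are not reproduced here.

```python
import random


def simulate_ar1(n, rho, sigma=1.0, seed=0):
    """Simulate n points of AR(1) noise: x[t] = rho * x[t-1] + innovation."""
    rng = random.Random(seed)
    innovation_sd = sigma * (1.0 - rho ** 2) ** 0.5  # keeps the marginal variance at sigma^2
    x = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, innovation_sd))
    return x


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a series."""
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    return sum(a * b for a, b in zip(dev[:-1], dev[1:])) / sum(d * d for d in dev)


noise = simulate_ar1(20000, 0.3, seed=1)
print(lag1_autocorrelation(noise))  # close to the nominal value of 0.3
```

Applying the same estimator to residuals before and after pre-processing is one simple way to check how much autocorrelation remains in a given dataset.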
The global stopping rule relies on expert opinion for selecting the
percentage of voxels that must satisfy the SPRT stopping criteria before fMRI
experiment administration is stopped. As a future direction, a percentage selection rule that
depends on real-time data could be developed to optimize testing objectives when determining
when to stop.
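A minimal sketch of such a global stopping rule is given below. It assumes that each voxel carries a log-scale SPRT statistic and that a voxel stays stopped once it has crossed either boundary; the boundary values, the statistics and the function names are hypothetical and serve only to show the bookkeeping.

```python
def update_stopped(stopped, log_lrs, lower, upper):
    """Mark voxels whose statistic has crossed a boundary; earlier decisions are kept."""
    return [s or v <= lower or v >= upper for s, v in zip(stopped, log_lrs)]


def global_stop(stopped, stop_fraction):
    """True once more than `stop_fraction` of the voxel-level SPRTs have stopped."""
    return sum(stopped) / len(stopped) > stop_fraction


lower, upper = -2.29, 4.50  # illustrative log-scale boundaries
stopped = [False] * 5       # five hypothetical voxels

# After one scan unit, 3 of 5 voxels have stopped: not more than 60%.
stopped = update_stopped(stopped, [5.0, -3.0, 0.1, 4.8, 0.0], lower, upper)
first = global_stop(stopped, 0.60)

# After the next scan unit a fourth voxel crosses the upper boundary.
stopped = update_stopped(stopped, [1.0, 0.0, 4.6, 2.0, 0.0], lower, upper)
second = global_stop(stopped, 0.60)
print(first, second)
```

A data-driven percentage rule, as suggested above, would replace the fixed `stop_fraction` argument with a value computed from the statistics observed so far.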
Real-time pre-processing procedures for fMRI, such as drift correction, motion
correction, temporal filtering, and spatial smoothing, have been previously investigated
[5, 41]. In this study, a modified version of Magland’s drift correction method was
implemented [5]. The subtracted value was normalized to keep the weights equal to 1.
Spatial smoothing decreases the activation magnitudes of voxels,
especially those located at the boundary of active regions. When the decreased activation
is lower than half of the δ value, the voxel will display a higher likelihood of inactivation
status and is then likely to be misclassified as inactive. This
reduces the activation detection accuracy for the voxels located at the margins of active
regions.
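The half-δ threshold can be seen from a simplified likelihood ratio. As an illustrative assumption (not necessarily the exact statistic used in this work), suppose the estimated activation $\hat{b}$ of a voxel is normally distributed around the true activation $b$ with known variance $v$, and write $\phi(\cdot;\, \mu,\, v)$ for the corresponding normal density. Comparing $b = \delta$ against $b = 0$ then gives

```latex
\log \Lambda
  = \log \frac{\phi(\hat{b};\, \delta,\, v)}{\phi(\hat{b};\, 0,\, v)}
  = \frac{\hat{b}^{2} - (\hat{b} - \delta)^{2}}{2v}
  = \frac{\delta}{v}\left(\hat{b} - \frac{\delta}{2}\right)
```

so the evidence favors inactivation ($\log \Lambda < 0$) exactly when $\hat{b} < \delta/2$. With δ = 1, a boundary voxel whose smoothed activation estimate falls below 0.5 therefore accumulates evidence toward the inactive classification.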
Conclusion
In conclusion, voxel-wise SPRT with a stopping rule shows promise for dramatically
reducing scan time units in detecting activation compared to conventional fixed-sample
analysis. This is achieved through individualized experimental designs, which adapt to a
subject’s response to tasks. Immediate practical implications include reduced costs for
fMRI sessions while ensuring acceptable levels of statistical accuracy in activation
determinations. Importantly, this work also serves as a basis for the development of more
complex and dynamic fMRI experimentation.
Chapter 5
Cognitive Reserve
5.1 Introduction of Cognitive Reserve
Repeated clinical observations show that people with the same degree of brain
pathology may differ in neuropsychological performance (NP) or clinical outcomes
[62-67]. In 2002, Yaakov Stern proposed the concept of reserve to explain this inconsistency
[68]. Stern postulated that differences among individuals in the flexibility and adaptability
of the cognitive processes or brain networks underlying task performance lead some people to
have better clinical performance than others [69]. This implies that people with more
reserve will display less clinical manifestation than other people with the same
severity of brain damage.
Reserve comprises two elements: brain reserve and cognitive reserve (CR) [69, 70].
Brain reserve, a quantitative measure, is derived from brain size and/or neuronal count
[16, 71, 72]. Because ample neural substrate remains to maintain normal function, people with
a greater number of neurons can sustain more brain damage and show clinical deficits at a
later time [63, 73]. According to the hypothesized threshold model summarized by Satz [71],
people with greater brain reserve capacity (BRC), which might simply mean people with
a bigger brain, have a greater capacity to tolerate brain pathology.
In contrast, CR explains individual differences in tolerance to brain damage as the result
of the brain actively coping with brain lesions by using more
pre-existing cognitive networks or by enlisting compensatory processes. Individuals with
more CR have better clinical performance than people with lower CR when both have the
same amount of brain pathology. Yaakov Stern suggests that CR might be implemented
in two forms: neural reserve and neural compensation [69, 74]. The concept of neural reserve
is that among individuals there is functional variability in the primary cognitive networks
corresponding to task performance. This inter-individual variability can arise
from different degrees of efficiency and/or capacity in these brain networks. Efficiency
and capacity can be explained in terms of the relationship between brain functional activity
and task demands. Efficiency refers to the change in brain functional activity with increasing
task demands. When the task gets harder, more neural activity is required to
accomplish it. Compared to people with lower efficiency, people with greater
efficiency show a smaller increase in neural activity as the task is raised to a higher
difficulty level. Figure 5.1(A) shows that people with low efficiency display a steeper
slope as the task difficulty increases [75]. In one study investigating aging, healthy young
people were compared to healthy elders. The elder cohort demonstrated a greater
increase in neural activity when the task demand increased, but less improvement in the
underlying working memory performance [76]. This demonstrates that efficiency is limited
by age-related brain networks in working memory. Importantly, people with high
efficiency are considered to have higher CR [75].
Capacity represents the maximum brain functional activity in the primary network as
task demands increase, under the premise that the tested participants successfully perform
the task. The concept of capacity is illustrated in Figure 5.1(B). Furthermore, Figure
5.1(C) shows that the changes in efficiency and capacity occur jointly.
Figure 5.1 Models of task-related neural activity versus task [75]
Neural compensation refers to the alternative cognitive networks that are engaged
to cope with impaired brain networks. When brain activity reaches the capacity
level in the primary network, task performance cannot be adequately supported, so a
compensatory network is recruited. Neural compensation is usually engaged when the
damaged brain is no longer able to sustain the task as task demand becomes excessive.
Compensatory networks are typically found in the contralateral hemisphere [77], but other
brain locations have also been reported [78-80]. Efficiency and capacity in the primary
network and neural compensation are important for understanding the mechanisms of CR.
Recent studies show that CR is relevant not only in normal aging and Alzheimer’s
disease (AD) [16] but also in patients with vascular injury [81-83], Parkinson’s disease [84],
traumatic brain injury [85], human immunodeficiency virus (HIV) [86], neuropsychiatric
disorder [87], and multiple sclerosis [88]. Therefore, investigation of CR characteristics
allows extensive research into diseases related to cognitive decline.
Tucker and Stern (2011) posit that brain reserve and cognitive reserve contribute to the
maintenance of normal function both independently and interactively [89]. However, the
details of the interaction between these two components of reserve are not clear to date [16].
5.2 CR in Normal Aging and Alzheimer’s Disease
It is becoming increasingly clear that there is heterogeneity in cognitive trajectories
during the course of normal aging and AD [89-92]. The notion of CR gives a compelling
explanation for the lack of correspondence between brain pathology and cognitive
manifestations of changes in the brain. The CR hypothesis posits that differences in
capacity, efficiency, and adaptability of brain networks may be the underlying reason
why some individuals can perform better cognitively than others with similar underlying
pathology in AD or normal aging. The following sections introduce the current findings
of CR in normal aging and AD.
5.2.1 Normal Aging
Education, IQ, occupational complexity, literacy, leisure activity and cohesion of
social network are measures used as proxies to estimate CR in current CR research [75]. A
major issue is that direct measurement of CR is not readily available. Still, fMRI
has shown great promise for detecting the two main components of CR: neural reserve (i.e.,
capacity and efficiency) and neural compensation (adaptability). The key idea, as
developed by Stern and his colleagues, is to measure neural activity across difficulty
levels of a task [91]. They have proposed a hypothetical model, shown in Figure 5.2, that
illustrates how variable neural activation patterns across different difficulty levels of a
task can provide a way to assess CR [93]. For a young brain, activation in a primary network is
relatively low for easier tasks. On the other hand, for an older brain, which has less CR,
higher levels of activation are needed. This indicates that the younger brain has greater
neural efficiency. As depicted in Figure 5.2, the young brain also is able to call on
greater levels of activation as needed, as the difficulty level increases. In contrast, the
older brain is not able to draw upon CR to meet the needs of more difficult tasks. This
indicates that the older brain has less neural capacity. In terms of neural compensation,
cognitive reserve can be detected through fMRI when regions outside of a primary
network are activated and such activity is part of a compensatory network. Use of
difficulty levels is important for this aspect as well, as compensation may arise at
different levels, depending on the individual’s ability.
Figure 5.2 Hypothesized relationship between task demand and activation in old and young [69, 93].
Delayed Item Recognition Task Paradigm and Clinically Significant Findings
Stern et al (2012) reported a study where young and elder participants performed a
delayed item recognition (DIR) task to identify neural reserve and neural compensation
of working memory in normal aging. Each DIR trial included four phases: encoding,
retention, probe, and retrieval. Seventy 2-second blank intervals were also inserted into
the experimental paradigm at random. During the 3-second encoding phase, 2 non-verbal
stimuli (computer-generated, complex, closed-curve shapes) were presented, and participants
were required to memorize them. After a 5-second retention time, participants were
shown one non-verbal stimulus during the probe phase. In the retrieval phase, subjects
were asked to indicate whether the presented stimulus matched one of the previously
memorized stimuli. Stern designed 5 different durations of the probe phase, 0.125, 0.250,
0.5, 1 and 2 seconds, to create 5 task difficulty levels. A shorter presentation period
during the probe phase represented a higher difficulty level. Two spatial patterns were
discovered from the resulting functional images. The degree of expression of the brain
network during the probe phase was recorded as a function of task demand. In pattern 1,
elders expressed higher task-related activation at the long probe phase (the easiest task) and
lower task-related activation at the short probe phase (the hardest task). This spatial
pattern was expressed in the large inferior frontal cluster, including Brodmann area 6 and
Brodmann area 9, areas associated with working memory network regions [94]. This
suggests that elders have reduced efficiency and less capacity in this existing network
compared with the young. In pattern 2, elder participants had increased task-related
activation (more areas recruited) with increasing task demand compared to the young
group. Moreover, midline Brodmann area 10, one major component of the second spatial
pattern, is not related to working memory. This second spatial pattern may reflect
neural compensation, which represents an alternative cognitive network and allows
subjects greater adaptation for working memory performance. In sum, three
characteristics of CR (efficiency, capacity and adaptation) are identified by performing
an experimental paradigm with different task loadings. Task difficulty can also be
manipulated through the number of letters or non-verbal shapes and the duration of the
retention phase [76, 91].
5.2.2 Alzheimer’s Disease
More than one third of individuals aged 80 and older are diagnosed with dementia, and
more than 60% of these dementia cases are caused by AD [95, 96]. More than 25 million
people worldwide were suffering from this disease in 2006, and this number is expected to
triple by 2050 [11, 97]. AD is a neurodegenerative disorder. Patients progressively lose
their cognitive, functional and behavioral abilities. Increasing evidence suggests that the
brains of AD patients have pathological damage several years before patients show clinical
symptoms. Although currently no treatment is able to reverse neuronal damage to cure
AD, a new treatment has been developed that can slow the course of the disease. It is
most effective when patients receive treatment at an early stage of AD [13]. Therefore,
there is a strong need to identify reliable and accurate cognitive markers that, along with
other biomarkers, predict the onset of AD. Currently, the criteria for mild cognitive
impairment (MCI), which follows the preclinical asymptomatic phase of AD, are not precise
enough for clinical trial use. Indeed, there is marked heterogeneity within MCI, with
only 10% to 15% of MCI subjects converting to AD yearly and some never converting [98].
5.2.3 CR and AD
Twenty-five percent to 67% of subjects who were considered cognitively normal in
longitudinal studies were found at autopsy to have pathological brain damage that meets
diagnostic criteria for dementia [62-64, 66, 67]. These early observations show that
cognitively healthy people may have reserve that helps them retain normal NP. From later
longitudinal studies, epidemiological evidence suggests that people with high reserve
demonstrate clinical symptoms of AD later than people with low reserve [16, 99]. However,
after the onset of clinical symptoms, people with high reserve decline at a faster rate in
cognitive performance. Generally speaking, people with the same clinical severity may
have divergent brain pathology. Identifying the individual characteristics of
CR can become an important diagnostic criterion for early AD detection. Moreover,
monitoring the change in CR would help clarify the prognosis and progression of disease
over time.
Lately, several functional imaging investigations have found disparities in CR among
healthy people, MCI and AD [14, 15, 100-105]. The following section introduces one such
experimental paradigm and its clinical findings.
Face-Name Paradigm
People at an early stage of AD usually report losing the ability to learn and retain new
information, such as difficulty in remembering new names. This symptom may be caused
by advancing age or AD. The neural impairment in AD is located specifically in the
perforant pathway of the medial temporal lobe and prefrontal cortex, in particular the CA2,
CA3 and DG regions of the hippocampus and the entorhinal cortex [15, 106]. Identifying the
change in brain function during an explicit task is expected to demonstrate the different
activation patterns of AD patients and healthy elders. A face-name association task
elicits widespread neural activation and is widely accepted as a sensitive task for detecting
AD because the areas it recruits span pathologically affected areas. The task mainly
comprises three phases: an associative encoding phase, a distraction phase and a recognition
phase. The details vary by paradigm design; however, the main structure
is the same.
By applying face-name paired associative learning tasks, Pariente et al (2005) used
fMRI to identify the reserve regions in AD 15. Subjects were instructed to memorize 12
face-name pairs in the associative encoding phase. Each pair was presented for 6.4 seconds,
with a 0.1-second interstimulus interval between stimuli. After 3.7 minutes of a
blank screen presentation in the distraction task, subjects were presented with 1 face and
4 different names. The face and names were all selected from previous stimuli. Subjects
were asked to select the one name that they thought matched the given face. This sequence
comprised one run. In total, 4 runs were performed, taking 20.4 minutes. Only data
from subjects who correctly matched the pair were included in the analysis.
Pariente et al found both hypo- and hyperactivation in patients with early stage AD
compared to healthy elders who were matched by age, gender and number of years of
education. Decreased activation was found in AD subjects in the right hippocampus
during the encoding and recognition phases. The hippocampus is well known as a brain
region affected by AD. Hyperactivation was detected in AD subjects bilaterally in the
parietal and frontal lobes, especially in the right medial frontal gyrus and right inferior
parietal lobule. The recruitment of additional cognitive resources is considered to allow
AD patients to maximize task performance. Lower activation indicates that AD patients have
an impaired normal network and weaker neural reserve. Alternative neural activity suggests
the development of neural compensation over the course of AD. Since 2003, large
functional imaging studies have reported findings that agree with the presence of neural
reserve and compensation in people with an AD-related gene mutation, SCI (subjective
cognitive impairment), MCI and AD. These papers are summarized in Table 5.1.
According to this evidence, subjects at the pre-symptomatic stage show
early hyperactivity in medial temporal lobe regions. After the onset of
dementia, hippocampal activation starts to fail in its normal function and alternative
cognitive networks may start to activate. This evidence allows us to develop accurate
cognitive markers to predict the onset of AD and, further, to determine a patient’s
prognosis and progression over time by quantifying the patient’s neural expression of CR.
Reference: Sperling [2002] 14
  Neuroimaging: fMRI
  Subjects: 10 healthy elders; 7 mild AD*
  Task: Face-name association encoding task. Activation patterns during the encoding
    phase were identified.
  Hypoactivation [Healthy > AD]: hippocampal formation
  Hyperactivation [Healthy < AD]: [neural compensation] bilateral medial parietal cortex;
    right posterior cingulate regions
  Performance: AD exhibited significantly worse performance than the healthy elders on
    novel face recognition and on recall of the names†.

Reference: Pariente [2005] 15
  Neuroimaging: fMRI
  Subjects: 17 healthy elders; 12 probable AD patients‡
  Task: Face-name paired associative learning task. Activation patterns during the
    encoding and recognition phases were identified.
  Hypoactivation [Healthy > AD]: bilateral hippocampus
  Hyperactivation [Healthy < AD]: [neural compensation] bilateral parietal; bilateral
    frontal lobes
  Performance: Only signals from subjects who correctly recognized face-name pairs were
    analyzed.

Reference: Celone 101 [2006]
  Neuroimaging: fMRI
  Subjects: 15 normal controls (NC); 15 low sum of box score (low-SB) MCI; 12 high-SB MCI;
    10 AD§
  Task: Face-name paired-associate learning task. Activation patterns during the encoding
    phase were identified.
  Hypoactivation [Healthy > AD]: NC > AD: hippocampus. NC ≈ low-SB MCI > high-SB MCI, AD:
    fusiform. NC > high-SB MCI: hippocampus.
  Hyperactivation [Healthy < AD]: [neural reserve] NC < low-SB MCI: bilateral hippocampus;
    bilateral inferior frontal
  Performance: NC, low-SB MCI and high-SB MCI performed relatively well on the memory
    test, but AD performed significantly worse than the other three groups (p<0.05).
* Mild AD was diagnosed using the National Institute of Neurological and Communicative
Disorders and Stroke-AD and Related Disorders Association (NINCDS-ADRDA) criteria. The
mean [std] mini-mental state examination (MMSE) score was 22.6 [2.2].
† Healthy elders correctly identified 78% of novel faces and correctly recalled 40% of novel
faces, compared with 60% (p=0.059) and 12% (p<0.005) for mild AD.
‡ Probable AD patients were identified using the NINCDS-ADRDA criteria and the Diagnostic
and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV). The mean [range] MMSE
score was 25.1 [23, 28].
§ Subjects were classified on the basis of their clinical dementia rating (CDR) scale: normal
controls (CDR = 0.0); low SB-MCI (CDR = 0.5 and sum-of-box score (SB) 0.5~1.5); high SB-MCI
(CDR = 0.5 and SB 2.0~3.5); AD (CDR = 1.0).
Reference
Rodda 102
[2009]
108
Quiroz
[2010]
Neuroimaging
Subjects
Task
fMRI
▪ 10 controls
▪ 10 subjective
cognitive impairment
(SCI) **
Verbal episodic memory
encoding task.
Activation pattern during
encoding phase were
identified.
fMRI
▪ 19 young presenilin
1(PSEN1) E280A noncarriers
▪ 20 young PSEN1
E280A carriers
Face-Name PairedAssociate Learning Task
Activation pattern during
encoding phase were
identified.
Hypoactivation
[Healthy>AD]
Hyperactivation
[Healthy<AD]
[neural reserve]
▪ Left prefrontal cortex
[neural compensation]
▪ Left medial temporal
▪ Occipitoparietal
▪ Medial frontal cortex
Reiman 103
[2012]
fMRI
fMRI
▪ 18 healthy elders
▪ 16 MCI ††
▪ 24 young PSEN1
E280A non-carriers
▪ 20 young PSEN1
E280A mutation
carriers
Face-Name PairedAssociate Learning Task
Activation pattern during
encoding phase were
identified.
[neural reserve]
▪ Right anterior
hippocampus
[neural reserve]
▪ Hippocampus
MCI groups performed
significantly worse memory
and recall face-name pairs
compared with healthy
elders (two-sample t test,
p<0.05)
[neural reserve]
▪ Right hippocampus
▪ Parahippocampus
There is not enough
evidence to say that there is
a recognition memory
performance difference
between carriers and noncarriers (two-sample t test,
p=0.6).
**
††
There is not enough
evidence to say that there is
a recognition rate difference
between SCI and healthy
controls (two-sample t test,
p=ns).
There is not enough
evidence to say that there is
a recognition memory
performance difference
between carriers and noncarriers (two-sample t test,
p=0.59).
Putcha 104
[2011]
Face-Name PairedAssociate Learning Task.
Activation pattern during
encoding phase were
identified.
Performance
** SCI is proposed as a clinical stage before MCI.
†† MCI is identified by the global CDR score: MCI (global CDR = 0.5) and cognitively normal (global CDR = 0.0).
Reference: Yaakov Stern 105 [2000]
  Neuroimaging: H2(15)O PET
  Subjects: 11 healthy elders; 14 probable AD 9
  Task: Verbal recognition task
  Hypoactivation [Healthy > AD]: left anterior cingulate; anterior insula
  Hyperactivation [Healthy < AD]: [neural compensation] left posterior temporal cortex;
    calcarine cortex; posterior cingulate; vermis
  Performance: The task demand was titrated until each subject had 75% word recognition
    accuracy.

Reference: Grady CL 106 [2003]
  Neuroimaging: H2(15)O PET
  Subjects: 12 healthy elders; 12 mildly demented AD
  Task: Face-name association encoding task. Activation patterns during the semantic task
    and the episodic memory task were analyzed.
  Hyperactivation [Healthy < AD]: [neural compensation] bilateral dorsolateral prefrontal
    cortex; posterior cortex
  Performance: AD patients showed lower semantic and recognition accuracy than controls
    (repeated measures ANOVA; p<0.001). AD patients who engaged the bilateral dorsolateral
    prefrontal cortex and posterior cortex network displayed better semantic and
    recognition performance.
Table 5.1 Cognitive reserve findings in people at an early stage of AD. This table summarizes the functional neuroimaging modality,
subjects, task and activation pattern findings (hypoactivation and hyperactivation) of each study.
Chapter 6
Detection of CR in AD using rt-fMRI
Current fMRI methods present substantial challenges because signal data are very
noisy, meaning that it is not always straightforward to detect activation of a voxel or ROI
after administration of a stimulus. fMRI experimental designs, such as block designs,
attempt to overcome noisy data by repeatedly administering a stimulus a predetermined
number of times. However, such designs may be inefficient in terms of the amount of
stimuli that are administered because they do not take into account how an individual is
responding. These designs are also vulnerable to learning and fatigue effects.
Rt-fMRI involves the immediate processing of BOLD signals in order to determine
regions of activation from stimuli. Significantly, this makes it possible to develop
dynamic stimuli that can be adjusted based on how an individual is responding. For
instance, redundant administration of stimuli can be reduced, by stopping further
administration once activation levels are determined. If an unforeseen event, such as
unexpected head movement, occurs, administration of stimuli can be continued longer
than planned until clear statistical evidence about activation is obtained.
The goal here is to develop a statistical system for multiple-task testing with
standardized methods that can quickly and efficiently identify people at risk for AD by
assessing the CR of those not yet manifesting functional deficits. The proposed method uses
novel real-time analogues of well-validated fMRI paradigms for CR that allow for
dynamic adjustment. These paradigms are the DIR 76, 91 and the face-name paired
associative learning tasks 14, 15. The DIR task has been used to differentiate neural reserve
between younger and older brain activation patterns across difficulty levels, while the
face-name task has been shown to induce neural compensation in the early stages of AD.
Both paradigms allow for testing of a range of difficulty levels that can be adjusted
in real time. For DIR, this can be the number of letters presented and/or the time of the probe
after the encoding presentation 76, 91. For the face-name task, stimulus length, number of
multiple choices 15 and/or face similarity can be manipulated.
Brain regions with a relatively low level of activation might be overlooked when
a relatively severe threshold is applied, so that the task-related activation patterns become
heavily influenced by the threshold selection 76. In addition, various cognitive activation
networks are used across individuals at different task-loading levels 91. This means that
subjects may demonstrate different activation patterns at similar levels of task loading.
However, in most brain function investigations, only one difficulty level is administered.
This not only makes results in the literature difficult to compare but also leads to problems
identifying decisive neural processing regions. Moreover, in order to identify the
characteristics of CR, the change in cognitive activation pattern over multiple task
loadings must be specified. Therefore, understanding the BOLD activation pattern as a
function of task demand level is crucial. The change in the activation curve can lead to an
appropriate explanation of activation patterns. Studying the changes in task-related
activation over task demands can offer a measure of the efficiency, capacity and
compensatory process of each individual.
In order to detect the two main characteristics of CR, neural reserve and neural
compensation, we utilized two approaches to conducting statistical inference about the
activation status of a voxel in the context of a GLM framework. The first approach uses
sequential estimation through monitoring confidence interval lengths to sequentially
estimate activation magnitudes over task loadings in a neural reserve network. This
allows for some characterization of how a brain region responds to changes in task
demand, as reflected by different difficulty levels. The second approach uses sequential
decision-making in the selection of difficulty levels within a hypothesis-testing
framework to identify neural compensation. Since compensation may arise at different
difficulty levels in different individuals, sequential adjustment of difficulty level helps
identify if and when compensation arises much more efficiently than fixed experimental
designs do. In fMRI analysis, the most widely adopted approach to assess activation is
the use of GLM. It involves univariate voxel-level modeling, and allows for temporal
correlation and multivariate analysis of associations between tasks and activations. Task
activation magnitudes are represented by regression parameters that indicate the strength of
association between observed BOLD responses and an expected HRF convolved with
indicator variables denoting when task administration occurs.
First approach: sequential estimation through monitoring confidence interval
lengths over a range of experimental conditions to assess efficiency and capacity in
neural reserve. For efficiency and capacity measurement of neural reserve, it is helpful
to estimate actual activation magnitudes, as opposed to just determining whether or not
activation exceeds a certain level. This allows for more precise assessment of differences
in magnitudes across difficulty levels, which is necessary because clinical differences in
such analyses may be more nuanced than in the hypothesized relationship shown in
Figure 5.2. A natural approach involves sequentially monitoring the standard confidence
interval (CI) widths in the estimation of the regression parameters associated with task
activation. Stopping can be invoked when all CI widths are within their respective target
lengths. These CI widths rely on estimates of the standard deviation of the relevant
regression parameters. This approach would require Bonferroni correction if there were
more than one regression parameter, or linear contrast of parameters, being considered
simultaneously. Srivastava (1967) proposed a more general sequential estimation
approach 48 that also involves simultaneous coverage. The details of these two sequential
estimation approaches for estimating fMRI activation magnitudes, and the corresponding
simulation analysis, are described below.
Second approach: sequential decision making in a hypothesis-testing framework
to detect neural compensation. We have studied a hypothesis-testing framework, the voxel-wise
SPRT, in which we test for the presence of task-associated activation through
hypothesis tests of the corresponding regression parameter (β) being zero versus
exceeding a certain value. Such inference is useful in the detection of neural
compensation. However, in detecting neural compensation, it may be that compensation
is induced and observed at different difficulty levels in different individuals. It is therefore not
necessary to start administration at the easiest difficulty level. We propose a halving
algorithm that can lead to substantial savings in identifying the minimum difficulty level
at which activation occurs for a given ROI. The proposed halving algorithm and
corresponding simulations will be presented in the following section.
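The idea of not starting at the easiest level can be illustrated with a generic bisection search. This is only a sketch under stated assumptions, not the halving algorithm developed later in the thesis: it assumes that activation is monotone in difficulty (once a level induces activation, all harder levels do too), and the callable `is_active` is a hypothetical stand-in for a voxel-wise SPRT decision at one difficulty level.

```python
def find_min_active_level(levels, is_active):
    """Bisection over ordered difficulty levels.

    Returns (lowest level judged active or None, number of levels administered).
    Assumes monotone activation: if a level is active, every harder level is too.
    `is_active` is a hypothetical callable standing in for a sequential test.
    """
    lo, hi = 0, len(levels) - 1
    found = None
    administered = 0
    while lo <= hi:
        mid = (lo + hi) // 2          # start in the middle, not at the easiest level
        administered += 1
        if is_active(levels[mid]):
            found = levels[mid]       # candidate; look for an easier active level
            hi = mid - 1
        else:
            lo = mid + 1              # activation, if any, needs a harder level
    return found, administered


# Toy example: eight difficulty levels, activation first appears at level 6.
result = find_min_active_level(list(range(1, 9)), lambda level: level >= 6)
```

In this toy case only three of the eight levels are administered, whereas stepping up from the easiest level would administer six.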
6.1 Voxel-wise Sequential Estimation on Detecting Efficiency and
Capacity of Neural Reserve
The objective of this section is to develop a method for sequentially estimating
activation magnitudes that allows for efficiently establishing an activation curve over
increased task demands within a network that reflects neural reserve. Then, the efficiency
and capacity features in a normal cognitive network can be characterized, enhancing
further understanding of how cognitive heterogeneity arises in AD.
First, we will introduce the processes of two proposed voxel-wise sequential
estimation approaches. Then, the efficiency and accuracy of the proposed methods in
establishing (differential) activation will be demonstrated in two simulation studies.
Further, the components of the two proposed sequential estimation approaches will be
explored in a third simulation study.
6.1.1 Voxel-wise Sequential Estimation Approach
Two sequential estimation approaches involving confidence interval estimation are proposed
here for estimating voxel-level activation magnitude. First, voxel-wise sequential estimation
focuses directly on the estimation of confidence intervals for the regression coefficients of interest.
Second, voxel-wise Srivastava’s sequential estimation keeps the CIs of all possible linear combinations of
regression coefficients within the target CI widths. These two methods are introduced as follows.
The GLM with an intercept term for the given voxel, with observations up to a time point t, is
expressed as

\[
Y_t = XB + E, \qquad E \sim N(0, \sigma^2 V)
\]

\[
\underbrace{\begin{bmatrix} y_1 \\ \vdots \\ y_n \\ \vdots \\ y_t \end{bmatrix}}_{t \times 1}
=
\underbrace{\begin{bmatrix}
1 & f_1(1) & f_2(1) & \cdots & f_P(1) \\
\vdots & \vdots & \vdots & & \vdots \\
1 & f_1(n) & f_2(n) & \cdots & f_P(n) \\
\vdots & \vdots & \vdots & & \vdots \\
1 & f_1(t) & f_2(t) & \cdots & f_P(t)
\end{bmatrix}}_{t \times (P+1)}
\underbrace{\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ \vdots \\ b_P \end{bmatrix}}_{(P+1) \times 1}
+
\underbrace{\begin{bmatrix} e_1 \\ \vdots \\ e_n \\ \vdots \\ e_t \end{bmatrix}}_{t \times 1}
\]

where $Y_t$ is a t × 1 vector of measured BOLD signal intensities of the given voxel from
time point 1 to time point t, P is the number of task-related regressors, $f_p(t)$
represents the expected BOLD signal of the pth task and $b_p$ is the corresponding task
activation magnitude, p = 1, …, P. The variance-covariance matrix, V, is a t × t matrix
including a temporal autocorrelation structure.
Voxel-wise Sequential Estimation Approach
The CI for each regression coefficient or linear contrast of regression coefficients with
sample size t can be computed using the following equation:

\[
c\hat{B} \pm t_{t-P-1,\,1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(c\hat{B})}, \qquad
c\hat{B} = c\,(X'VX)^{-1}X'VY_t, \qquad
\widehat{\mathrm{Var}}(c\hat{B}) = \widehat{\mathrm{Var}}(Y_t)\; c\,(X'VX)^{-1}c'
\tag{6.1}
\]

where $t_{t-P-1,\,1-\alpha/2}$ denotes the $(1-\alpha/2) \times 100$ percentile of the t distribution with t-P-1 degrees of
freedom. The half-length of the CI is:

\[
t_{t-P-1,\,1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(c\hat{B})}
\tag{6.2}
\]
Voxel-wise sequential estimation can be applied recursively until the
stopping point as follows:
Step 1: Define the confidence level, 1-α, and the acceptable half-width of the
CI, d.
Step 2: Start sampling by taking t0 signal observations y1, y2, ... yt0, where t0 ≥ P+1
and P is the number of tasks presented in the fMRI session. Compute the half-width of the
CI at each time point t, t ≥ t0, using

\[
t_{t-P-1,\,1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(c\hat{B})}, \qquad
c\hat{B} = c\,(X'VX)^{-1}X'VY_t, \qquad
\widehat{\mathrm{Var}}(c\hat{B}) = \hat{\sigma}_t^{2}\; c\,(X'VX)^{-1}c'
\tag{6.3}
\]

where $Y_t$ includes the collected signal from time point 1 to time point t, X is the t × (P+1)
design matrix up to time point t and c is a 1 × (P+1) contrast vector that specifies the linear
combination of regression coefficients of interest.
Step 3: Stop sampling at time point T, when

\[
t_{T-P-1,\,1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(c\hat{B})} \le d.
\tag{6.4}
\]

Step 4: The CI of the linear contrast of regression coefficients at stopping time point
T, shown in equation (6.5), has the specified length with confidence 1-α.

\[
c\hat{B} \pm t_{T-P-1,\,1-\alpha/2}\sqrt{\widehat{\mathrm{Var}}(c\hat{B})}
\tag{6.5}
\]
Most of the time, researchers are interested in more than one linear contrast of
regression coefficients. Then, Bonferroni correction would be required to correct the
confidence percentage by dividing α by the number of estimations of interest. For a given
voxel, the final stopping time point is the time point at which all the estimations of
interest satisfy equation (6.4).
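The four steps can be sketched for the simplest case of an intercept plus one task regressor. This is a minimal illustration rather than the analysis code used in the thesis (which was run in Matlab): it assumes V = I, so ordinary least squares applies, and it substitutes a standard normal quantile for the t quantile because the Python standard library has no t distribution. The names `slope_and_half_width` and `sequential_estimate`, the toy on/off regressor and the noise level are all hypothetical.

```python
import random
from statistics import NormalDist


def slope_and_half_width(x, y, alpha):
    """OLS fit of y = b0 + b1*x + e with V = I; returns (b1_hat, CI half-width)."""
    t = len(y)
    mean_x = sum(x) / t
    mean_y = sum(y) / t
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        return 0.0, float("inf")      # regressor has not varied yet
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (t - 2)        # t - P - 1 degrees of freedom, P = 1
    var_b1 = sigma2_hat / sxx         # sigma2_hat * c (X'X)^{-1} c' for c = [0 1]
    # Assumption: normal quantile used as a large-sample stand-in for the t quantile.
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return b1, q * var_b1 ** 0.5


def sequential_estimate(x, y, d, alpha=0.05, n_contrasts=1, t0=10):
    """First t >= t0 whose half-width is <= d, as (T, b1_hat, half_width), else None."""
    alpha_adj = alpha / n_contrasts   # Bonferroni adjustment for several contrasts
    for t in range(t0, len(y) + 1):
        b1, half_width = slope_and_half_width(x[:t], y[:t], alpha_adj)
        if half_width <= d:
            return t, b1, half_width
    return None


random.seed(0)
x = [1.0 if (i // 5) % 2 else 0.0 for i in range(1000)]      # toy on/off regressor
y = [10.0 + 1.5 * xi + random.gauss(0.0, 2.0) for xi in x]   # toy voxel signal
stop_wide = sequential_estimate(x, y, d=1.0)
stop_narrow = sequential_estimate(x, y, d=0.5)
```

Because any half-width below 0.5 is also below 1, the stopping point for d=0.5 can never precede the one for d=1 on the same series, which mirrors the ordering of the two d columns reported in the simulations.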
Voxel-wise Srivastava’s Sequential Estimation Approach
Srivastava’s sequential estimation approach ensures that the CI widths of all the linear
combinations of regression coefficients, $c\hat{B}$, are limited to within 2d under 1-α confidence
at the stopping time point. The values of d and α are specified by the researcher. The
process of the voxel-wise Srivastava’s sequential estimation approach for a given voxel is
described as follows:
Step 1: Define the simultaneous confidence percentage, 100*(1-α)%, and the
acceptable half-width of the CI, d.
Step 2: Start sampling by taking t0 signal observations y1, y2, ... yt0, where t0 ≥ P+1
and P is the number of tasks designed in the fMRI session. At each following sampling
time point t, let $Y_t$ denote a t × 1 vector of the observations, and $X_t$ be a t × (P+1)
design matrix, t ≥ P+1, where $X_t$ is of full rank. The estimators of the unknown parameter
vector $B_t$ and of $\sigma_t^2$ are computed using the following equations, assuming V equals I:

\[
\hat{B}_t = (X_t'X_t)^{-1}X_t'Y_t, \qquad
\hat{\sigma}_t^{2} = Y_t'\left[I_t - X_t(X_t'X_t)^{-1}X_t'\right]Y_t \,/\, \left(t-(P+1)\right)
\tag{6.6}
\]

where $I_t$ is a t × t identity matrix.
Continue collecting observations, one at a time, while updating the estimates. Stop
sampling when:

\[
\hat{\sigma}_t^{2} + t^{-1} \le \frac{d^{2}t}{a_t^{2}\lambda_t}.
\]

Then, T is the smallest t value greater than t0 that satisfies this inequality. Note that
$\{a_t\}$ is any sequence of
positive constants converging to the number a, satisfying $p(\chi^2_{P+1} \le a^2) = 1-\alpha$, where $\chi^2_{P+1}$
represents a random variable distributed as a chi-square with P+1 degrees of freedom. $\lambda_t$
is the maximum eigenvalue of $t(X_t'X_t)^{-1}$, where $\lim_{t\to\infty} t^{-1}(X_t'X_t) = \Sigma$ is a positive
definite matrix.
Step 3: At stop time point T=t, the region $R_t$ is constructed as follows:

\[
R_t = \left[\, Z : t^{-1}\{Z-\hat{B}_t\}'(X_t'X_t)\{Z-\hat{B}_t\} \le \frac{d^{2}}{\lambda_t} \,\right]
\]

Any CI of a linear combination of regression coefficients of interest that satisfies cc’=1
will have coverage of at least 1-α.
Global stopping
The above approaches are described at the voxel level. However, clearly in fMRI
applications, focus will be on the multiple voxels that comprise an ROI or network. As in
Chapter 4, we can apply a global stopping criteria that jointly considers performance and
accuracy in estimation across multiple voxels before deciding when to stop collection of
imaging data. It is possible that a small collection of voxels may have high estimated
variances, such as might be caused by a physical motion. Rather than sample for the
worst case scenario, as in Chapter 4, we may be willing to accept some estimation
uncertainty for a small, predetermined portion of the voxels, in order to gain efficiency in
data accrual durations. One approach could be to stop when a certain percentage of
voxels satisfy voxel-level stopping criteria, as described for the two sequential estimation
methods. While we do not systematically consider this in the simulations to follow, it is a
methodological consideration that should be considered in practical implementation.
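One way to express such a percentage-based rule is sketched below. The function name `global_stop` and the default fraction of 0.95 are hypothetical; the text above only states that the fraction would be predetermined.

```python
def global_stop(half_widths, d, min_fraction=0.95):
    """True when at least `min_fraction` of voxels meet the voxel-level rule.

    `half_widths` holds the current CI half-width of each voxel in the ROI;
    a voxel meets its stopping criterion when its half-width is <= d.
    The default fraction is an illustrative assumption.
    """
    n_met = sum(1 for hw in half_widths if hw <= d)
    return n_met / len(half_widths) >= min_fraction


# Toy ROI of four voxels, one of which is still noisy (for example after motion).
roi_half_widths = [0.40, 0.45, 0.50, 2.00]
stop_at_75 = global_stop(roi_half_widths, d=0.5, min_fraction=0.75)
stop_at_90 = global_stop(roi_half_widths, d=0.5, min_fraction=0.90)
```

With a 75% requirement the noisy voxel no longer delays stopping, while a 90% requirement keeps sampling until it also converges.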
86
6.1.2 Simulation Studies
The simulation studies explore the efficiency and (differential) activation estimation
accuracy of the voxel-wise sequential estimation approach and the voxel-wise Srivastava’s
sequential estimation approach. Three simulations were conducted. The first simulation
includes one task in the experimental design. Six different conditions are constructed by
using three different activation magnitudes, each with two different noise variances.
In the second simulation, the simulated image is composed of four active areas, with each
area activated by a different combination of three task stimuli. The accuracy and
efficiency of the two analysis methods in estimating task (differential) activation magnitudes
were explored. The third simulation study compares the components of the two proposed
sequential estimation methods to understand how they differ. The advantages and
disadvantages will be discussed.
The R package “neuRosim” 58 was used to generate simulated fMRI images and the
simulated datasets were analyzed within the Matlab environment (64-bit version R2012a;
The MathWorks, Natick, MA).
6.1.2.1 Simulation Data Analysis Including One Task
In this simulation, a fixed experimental design including one task was defined. After an
8-second rest block, the alternating cycle of a 3-second task block and an 8-second rest
block was repeated 500 times. This yielded a total of 2754 image scans given a 2-second
TR. Each voxel had its own simulated fMRI signal generated using equation (4.8). The error term
comprised white noise; low-frequency drift; physiological noise; temporal
correlation, modeled by AR(1) with a ρ value of 0.1; and spatial correlation, modeled by
AR(1) with a ρ value of 0.75. The variance of the overall noise varied across simulated
images, as shown in Table 6.1. The SNR values ranged between 0.16 and 0.26, which is
within the range of practical fMRI values.
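As a quick arithmetic check of the scan count, using only the block lengths and TR stated for this design:

```latex
\[
8\,\mathrm{s} + 500 \times (3\,\mathrm{s} + 8\,\mathrm{s}) = 5508\,\mathrm{s},
\qquad
\frac{5508\,\mathrm{s}}{2\,\mathrm{s\ per\ scan}} = 2754\ \text{scans}.
\]
```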
Six simulated images were structured as follows. Each image had one activation region,
as shown in Figure 6.1. The activation magnitude β values were defined as listed in
Table 6.1. The maximum β values of the six images were 0, 0, 1.5, 1.5, 3 and 3, respectively. The
surrounding voxels were given β values that decay exponentially from the peak value of
the simulated image. Three activation magnitudes were generated to model an
activation curve of neural reserve over three task demands. For each activation magnitude,
two simulated images were generated: one with a lower noise variance and
one with a higher noise variance. This allowed investigation of the relative
efficiency of the proposed sequential analysis approaches when the noise variance
is smaller. For images with larger activation magnitudes, a larger noise variance
was used, in agreement with observed fMRI signal characteristics over increased task demand 91.
[Figure 6.1 image: "Task magnitudes activation regions", a 30 × 30 voxel grid (x axis and y axis from 0 to 30) marking the task inactive region and the task active region.]
Figure 6.1 Activation pattern of simulated fMRI image. Activation region includes 377 voxels
and inactivation region includes 523 voxels. In total, one simulated image is constructed of 900
voxels.
Simulated image    Range of simulated task-related β values    Variance of noise
Image 1            0                                           1.51
Image 2            0                                           3.50
Image 3            [0.97, 1.5]                                 3.50
Image 4            [0.97, 1.5]                                 9.49
Image 5            [1.95, 3]                                   9.49
Image 6            [1.95, 3]                                   14.82
Table 6.1 Simulated activation magnitudes and corresponding variance of noise.
Before the simulated images were analyzed by the two sequential estimation approaches, two
pre-processing steps, spatial smoothing and normalized drift correction,
were applied. Spatial smoothing used a Gaussian kernel (6 FWHM) weighting over 3
by 3 voxels, and the normalized drift correction method used a parameter of 0.1.
In this simulation study, the efficiency and accuracy of activation estimation by the
two voxel-wise sequential estimation approaches are investigated for images with 3 different
activation levels.
6.1.2.1.1 Simulation Model
For one voxel with t scan units represented in the time series data, the linear model in
the simulation analysis, including an intercept term and one task-related regressor, is
described as follows:

\[
Y_t = XB + E, \qquad E \sim N(0, \sigma^2 I)
\]

\[
\begin{bmatrix} y_1 \\ \vdots \\ y_n \\ \vdots \\ y_t \end{bmatrix}
=
\begin{bmatrix} 1 & f_1(1) \\ \vdots & \vdots \\ 1 & f_1(n) \\ \vdots & \vdots \\ 1 & f_1(t) \end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \end{bmatrix}
+
\begin{bmatrix} e_1 \\ \vdots \\ e_n \\ \vdots \\ e_t \end{bmatrix}
\]

where $Y_t$ includes the observed fMRI signal intensities from time point 1 to t (1 ≤ n ≤ t)
and $f_1(\cdot)$ represents the expected task BOLD signal, which is generated by convolving a
double-gamma HRF with the corresponding experimental stimulus function. The estimation is
conducted at a 95% confidence level. The target half-width of the CI is set to d=1
and d=0.5.
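The construction of the regressor $f_1$ can be sketched as follows. The thesis does not state the HRF parameters at this point, so the gamma shapes 6 and 16 and the undershoot ratio 1/6 are assumptions taken from the commonly used canonical double-gamma form; the 1-second convolution grid, the 32-second HRF support and the function names are likewise illustrative choices.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=1.0 / 6.0):
    """Double-gamma HRF at time t in seconds (parameter values are assumptions)."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    undershoot = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - ratio * undershoot


def expected_bold(n_cycles, rest_s=8, task_s=3, tr_s=2, hrf_len_s=32):
    """Expected task BOLD regressor f1, one value per scan."""
    stim = [0] * rest_s                      # initial rest block, 1-second grid
    for _ in range(n_cycles):
        stim += [1] * task_s + [0] * rest_s  # task block followed by rest block
    hrf = [double_gamma_hrf(s) for s in range(hrf_len_s)]
    # Discrete convolution of the stimulus function with the HRF.
    conv = [
        sum(stim[i - k] * hrf[k] for k in range(min(i + 1, hrf_len_s)))
        for i in range(len(stim))
    ]
    return conv[::tr_s]                      # sample once per TR


f1 = expected_bold(n_cycles=3)
```

With `n_cycles=500` the same function returns 2754 values, matching the number of scans in the simulated design.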
Voxel-wise Sequential Estimation Approach
The task activation magnitude, b1, is estimated using T scan units, where T is the smallest
t value that satisfies the following inequality:

\[
t_{t-2,\,1-0.05/2}\sqrt{\widehat{\mathrm{Var}}(\hat{b}_1)} \le d
\tag{6.7}
\]

$\widehat{\mathrm{Var}}(\hat{b}_1)$ is computed using equation (6.3) where c = [ 0 1 ].
The task-activation magnitude, b1, is estimated using T scan units, where T is the smallest
t value that satisfies the following inequality:

\[
\hat{\sigma}_t^{2} + t^{-1} \le \frac{d^{2}t}{a_t^{2}\lambda_t}
\tag{6.8}
\]

where $p(\chi^2_2 \le a^2) = 0.95$, $\lambda_t$ is the maximum eigenvalue of $t(X_t'X_t)^{-1}$, and
$\lim_{t\to\infty} t^{-1}(X_t'X_t) = \Sigma$ is a positive definite matrix. $\hat{\sigma}_t^{2}$ is computed using equation (6.6).
The estimation targets are the task activation magnitudes, b1. b1 is estimated using T
scan units, where T is the smallest time point that satisfies inequality (6.7) for the voxel-wise
sequential estimation approach and inequality (6.8) for the voxel-wise Srivastava’s sequential
estimation approach. Both d=1 and d=0.5 were used here. The value of d is selected
so that the voxel-level task activation magnitudes of the two nearest task demands can be
distinguished. Here, the ranges of the three activation levels are 0, from 0.97 to 1.5,
and from 1.95 to 3. Therefore, d=1 and d=0.5 were selected.
The stop-sampling time points of the six simulated images under two d values and two
analysis approaches are presented in Table 6.2. The stopping time point
depends mainly on the variance of the noise as opposed to the activation magnitude. Results
show that images with greater variance require a larger sample size. This is
consistent with the form of the stopping rules, inequalities (6.7) and (6.8). Among the
stopping rules, the only components that reflect this feature are $\widehat{\mathrm{Var}}(\hat{b}_1)$ from the voxel-wise
sequential estimation approach and $\lambda_T\hat{\sigma}_T^{2}/T$ from the voxel-wise Srivastava’s
sequential estimation approach. $\widehat{\mathrm{Var}}(\hat{b}_1)$ is the variance of the estimated target task
activation magnitude. $\lambda_T\hat{\sigma}_T^{2}/T$ represents the largest variance of any linear combination
estimate, $c\hat{B}$ 109. Larger values of $\lambda_T\hat{\sigma}_T^{2}/T$ lead to wider CIs, and require longer scan
times to attain target widths. However, Srivastava’s method establishes a stopping rule that
ensures simultaneous coverage for a fixed confidence width 2d > 0 for all linear
combinations of regression parameter estimates, $c\hat{B}$, where c is a contrast vector with
$cc' = 1$. The condition required for the design matrix $X_t$ is that $t^{-1}X_t'X_t$ converges to a
positive definite matrix, which we believe is a general enough condition to broadly hold
for fMRI applications. Empirical checks of positive definiteness from our simulations via
Matlab support this view.
                   Voxel-wise sequential          Voxel-wise Srivastava’s
                   estimation approach            sequential estimation approach
                   (95% confidence)               (95% confidence)
Simulated image    d=1          d=0.5             d=1          d=0.5
Image 1            30           252               162          352
Image 2            126          330               197          495
Image 3            129          323               188          486
Image 4            194          602               294          1022
Image 5            212          643               328          1090
Image 6            283          938               462          1628
Table 6.2 Required scan units of six different simulated images. Image 1 and image 2 each have
a true mean activation region magnitude of 0, with noise standard deviations of 1.23 and 1.87,
respectively. Image 3 and image 4 each have a true mean activation region magnitude of 1.1867,
with noise standard deviations of 1.87 and 3.08, respectively. Image 5 and image 6 each have
a true mean activation region magnitude of 2.3735, with noise standard deviations of 3.08 and 3.85,
respectively.
The estimated activation magnitudes were also computed here. In clinical fMRI
studies, the mean activation magnitude of a specific ROI is usually used to represent the
overall activation strength. The mean activation magnitudes of the active regions of the six
simulated images are given in Table 6.3. All the estimated values are slightly smaller than the
simulated mean activation values. The analyzed signals have been processed by spatial
smoothing and drift correction. These pre-processing steps not only remove non-task-related
signals but also lessen the intensities of task-related signals. The true activation
magnitudes of the simulated signals after the same pre-processing steps were also
computed and are displayed in Table 6.3. The estimated values are closer to the true
activation values after pre-processing. This implies that the estimated
signals will be slightly smaller than the simulated activation values because of the
pre-processing steps. Estimates computed from larger sample sizes are slightly more accurate
relative to the simulated activation mean values after pre-processing. The estimated
voxel-level activation magnitudes of the six simulated images are also compared with the
simulated voxel-level activation values and are displayed in Figure 6.2.
Accuracy is presented as the percentage of voxels whose simulated activation
magnitudes are correctly included in the corresponding CIs, as seen in Table 6.4. Most
accuracies are as high as 90%. However, accuracy decreased as the stopping time
point increased. Given a larger sample size, the task activation estimate is
expected to be more accurate when compared to the true magnitudes after the pre-processing
steps. The lower accuracy arises from the comparison with the simulated magnitudes
before the pre-processing steps. The voxels that are correctly and incorrectly included in the
CIs at the voxel-wise sequential estimation approach’s stopping time point, d=0.5, are
displayed in Figure 6.3. The misclassified voxels are mainly located on the boundary of
active areas, where spatial smoothing has its greatest influence. This confirms that the lower
accuracy results from the pre-processing steps.
We also analyzed the results after applying a Bonferroni correction for multiple
comparisons. The α value is modified as α_Bon = α/N, where N is the total number of
hypotheses (voxels) being considered at once. More accurate task activation estimates
and higher accuracies were both achieved. However, there is a trade-off with the greater
sample size required.
Simulated   Mean of        Mean of simulated β1    Sequential estimation    Srivastava’s sequential estimation
image       simulated β1   after pre-processing    d=1        d=0.5         d=1        d=0.5
Image 1     0              0                       0.6521     0.0336        0.0581     0.0232
Image 2     0              0                       0.1228     0.0643        0.0916     0.0589
Image 3     1.1867         1.0546                  1.0106     1.0167        1.0605     1.0265
Image 4     1.1867         1.0546                  1.0570     1.0712        1.0641     1.0431
Image 5     2.3735         2.1091                  2.2365     2.1237        2.1613     2.0943
Image 6     2.3735         2.1091                  2.0537     2.0141        2.1240     2.0440
Table 6.3 Estimated means of activation magnitudes of the active region for six simulated images, two d values and two analysis approaches (95% confidence throughout).
[Figure 6.2: for each simulated image, a pair of surface plots titled "True Task magnitudes activation strength" and "Estimated Task magnitudes activation strength" (axes: X axis, Y axis, Activation magnitude). Panels: (A) image 1, STOP point = 252; (B) image 2, STOP point = 330; (C) image 3, STOP point = 323; (D) image 4, STOP point = 602; (E) image 5, STOP point = 643; (F) image 6, STOP point = 938.]
Figure 6.2 Task activation estimates for the six simulated images from the voxel-wise sequential estimation approach, d=0.5. Two plots are
displayed in each panel: the left one is the simulated activation pattern and the right one is the estimated activation pattern. The x and y axes are
the coordinates of the simulated image, and the z axis is the activation magnitude. Panels (A) to (F) show the activation structures of images 1 to 6,
respectively.
Simulated   Sequential estimation    Srivastava’s sequential estimation
image       d=1        d=0.5         d=1        d=0.5
Image 1     83.67      100.00        100.00     100.00
Image 2     100.00     99.78         100.00     98.56
Image 3     100.00     95.44         99.56      91.44
Image 4     99.33      89.89         97.89      83.22
Image 5     92.67      82.67         88.89      77.78
Image 6     90.89      74.78         87.11      66.22
Table 6.4 Accuracy percentages (%) of six simulated images, two d values and two analysis
approaches (95% confidence throughout).
[Figure 6.3: six voxel maps (x axis, y axis) titled "Task magnitudes estimation of image k: CI includes the true beta". Panels: (A) image 1, STOP point = 252; (B) image 2, STOP point = 330; (C) image 3, STOP point = 323; (D) image 4, STOP point = 602; (E) image 5, STOP point = 643; (F) image 6, STOP point = 938. Legend entries: correctly include true non-zero beta value; not include true non-zero beta value; correctly include true zero beta value; not include true zero beta value.]
Figure 6.3 Accuracy plots of six simulated images. According to the sequential estimation
method and d=0.5, the voxels correctly included by corresponding CIs are shown as dark color: o
for true non-zero magnitudes and x for true zero magnitudes. Voxels with true magnitudes that
aren’t included by corresponding CIs are displayed as light color: o for true non-zero magnitudes
and x for true zero magnitudes.
6.1.2.2 Simulation Data Analysis Including Three Tasks
The second simulation included three tasks—A, B, and C—in the experimental design
to investigate the efficiency and accuracy of (differential) activation estimation under a
scenario involving multiple tasks. Each task block is presented for 3 seconds, followed by
an 8-second rest block. The three tasks are presented in turn, separated by a rest period R.
This alternating cycle, A|R|B|R|C|R, is repeated 200 times. One 8-second rest block is
placed at the beginning of the experimental design. Given a TR of 2 seconds, this yields a
simulated BOLD signal that is 3304 scan units long. Each voxel in the simulated image is
generated using equation (4.8). The simulated image constructed of 1600 voxels involves
four activation regions which are separately activated by task A only, by task B only, by
task C only and by both task A and task C. Each activation region includes 197 voxels.
The task-related β values are defined by the value in Figure 6.4. The activation
magnitude ranges of tasks A, B and C are separately from 0.76 to 1, from 1.53 to 2 and
from 2.29 to 3. The voxels are given β values with exponential decay from the maximum
value of each active region. Error terms of each voxel are constructed by white noise; low
frequency drift; physiological noise; temporal correlation, modeled by AR(1) with a 0.1 ρ
value; and spatial correlation, modeled by AR(1) with a 0.75 ρ value. Two simulated
images were generated with the same activation structure but different variance of error
terms. The first image has a lower variance, 1.51, which leads to 0.32 to 0.95 SNR values.
The second image has a higher variance, 4.08, which leads to 0.19 to 0.58 SNR values.
The SNR values are selected in order to be within the range of those observed with
practical fMRI.
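The length of the simulated signal follows directly from the block timings given above. The short sketch below re-derives it; the function name and argument names are illustrative only and are not part of the original analysis.

```python
def scan_units(n_tasks, cycles, task_s=3, rest_s=8, tr_s=2, lead_rest_s=8):
    """Scans in a block design: one lead-in rest block, then `cycles`
    repetitions of (task block, rest block) for each of `n_tasks` tasks."""
    total_seconds = lead_rest_s + cycles * n_tasks * (task_s + rest_s)
    scans, remainder = divmod(total_seconds, tr_s)
    if remainder:
        raise ValueError("session length is not a whole number of TRs")
    return scans


# Three tasks, cycle A|R|B|R|C|R repeated 200 times, TR = 2 s.
three_task_scans = scan_units(n_tasks=3, cycles=200)
print(three_task_scans)  # 3304
```

With 8 + 200 × 3 × (3 + 8) = 6608 seconds of signal and a 2-second TR, the result matches the 3304 scan units stated in the text.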
[Figure 6.4: four surface plots of the simulated fMRI image (axes: X axis, Y axis, Activation magnitude). (A) Task A activation pattern, with Area A1 and Area A2; (B) Task B activation pattern, with Area B; (C) Task C activation pattern, with Area C1 and Area C2; (D) Task C - Task A activation pattern, with Area C-A1, Area C-A2 and Area C-A3.]
Figure 6.4 Activation pattern and strength structure of the simulated fMRI image. Fig. (A) shows the
two task A active areas: A1 and A2. Fig. (B) shows the one task B active area: B. Fig. (C) shows
the two task C active areas: C1 and C2. Fig. (D) shows the three task C and task A differential
active areas: C-A1, C-A2 and C-A3.
Before the simulated images were analyzed using the two sequential estimation
approaches, they underwent two pre-processing steps: spatial smoothing
and normalized drift correction. Spatial smoothing used a Gaussian kernel with 6 FWHM
weighting on 3 by 3 voxels, and the normalized drift correction method used a parameter of 0.1.
This simulation study investigates the efficiency and accuracy of (differential)
activation estimation analyzed by the two voxel-wise sequential estimation approaches.
6.1.2.2.1 Simulation Model
For one voxel with t scan units represented in the time series data, the linear model in
the simulation analysis, including an intercept term and three task-related regressors, is as
follows:
\[
Y_t = XB + E; \qquad E \sim N(0, \sigma^2 I)
\]
\[
\begin{pmatrix} y_1 \\ \vdots \\ y_n \\ \vdots \\ y_t \end{pmatrix}
=
\begin{pmatrix}
1 & f_A(1) & f_B(1) & f_C(1) \\
\vdots & \vdots & \vdots & \vdots \\
1 & f_A(n) & f_B(n) & f_C(n) \\
\vdots & \vdots & \vdots & \vdots \\
1 & f_A(t) & f_B(t) & f_C(t)
\end{pmatrix}
\begin{pmatrix} b_0 \\ b_A \\ b_B \\ b_C \end{pmatrix}
+
\begin{pmatrix} e_1 \\ \vdots \\ e_n \\ \vdots \\ e_t \end{pmatrix}
\]
where Y_t includes the observed fMRI signal intensities from time point 1 to t (1 ≤ n ≤ t),
and f_A(·), f_B(·) and f_C(·) respectively represent the expected BOLD signals of tasks A,
B and C, which are generated by convolving the double-gamma HRF with the corresponding
experimental stimulus function. The estimation is conducted with a 95% confidence level.
The half-width of each CI is limited by d=1 or d=0.5.
Voxel-wise Sequential Estimation Approach
The task activation magnitudes, bA, bB, bC and the differential activation magnitude
between bC and bA , are estimated with T scan units where T is the greatest stopping time
point among TA, TB, TC and TC-A. TA, TB, TC and TC-A are separately the smallest tA, tB, tC,
and tC-A values that fit the following inequalities:
\[
\begin{aligned}
t_{t_A-4,\;1-\frac{0.05}{2\times 4}} \times \sqrt{\mathrm{Var}(b_A)} &\le d \\
t_{t_B-4,\;1-\frac{0.05}{2\times 4}} \times \sqrt{\mathrm{Var}(b_B)} &\le d \\
t_{t_C-4,\;1-\frac{0.05}{2\times 4}} \times \sqrt{\mathrm{Var}(b_C)} &\le d \\
t_{t_{C-A}-4,\;1-\frac{0.05}{2\times 4}} \times \sqrt{\mathrm{Var}\!\left(\tfrac{1}{\sqrt{2}}(b_C-b_A)\right)} &\le d
\end{aligned}
\tag{6.9}
\]
$\mathrm{Var}(b_A)$, $\mathrm{Var}(b_B)$, $\mathrm{Var}(b_C)$ and $\mathrm{Var}\!\left(\tfrac{1}{\sqrt{2}}(b_C-b_A)\right)$ are computed by following
equation (6.3), where the contrast vectors respectively equal $(0\;\,1\;\,0\;\,0)$, $(0\;\,0\;\,1\;\,0)$,
$(0\;\,0\;\,0\;\,1)$ and $\left(0\;\,-\tfrac{1}{\sqrt{2}}\;\,0\;\,\tfrac{1}{\sqrt{2}}\right)$.
In order for coverage levels using Srivastava’s
sequential estimation to be applicable, difference contrasts are modified by multiplying
differential activation magnitudes between two tasks by $1/\sqrt{2}$, satisfying the condition
cc’=1.
Voxel-wise Srivastava’s Sequential Estimation Approach
The task activation magnitudes, bA, bB, bC and the differential activation magnitude
between bC and bA, are estimated with T scan units, where T is the smallest t value that satisfies
the following inequality:
\[
\hat{\sigma}_t^2 + t^{-1} \le \frac{d^2\, t}{a_t^2\, \lambda_t}
\tag{6.10}
\]
where $P(\chi^2_4 \le a_t^2) = 0.95$, $\lambda_t$ is the maximum eigenvalue of $t\,(X_t' X_t)^{-1}$,
$\lim_{t\to\infty} t^{-1}(X_t' X_t) = \Sigma$ is a positive definite matrix, and $\hat{\sigma}_t^2$ is computed using equation (6.6).
6.1.2.2.2 Simulation Results
The goal of this simulation is to explore the efficiency and accuracy of the proposed
voxel-wise sequential estimation approaches in (differential) activation estimation. The
estimated targets here are the task (differential) activation magnitudes, bA, bB, bC and
bC - bA. Estimates were computed at the stopping time points that satisfy inequalities
(6.9) for the voxel-wise sequential estimation approach and (6.10) for the voxel-wise
Srivastava’s sequential estimation approach. We analyzed two simulated images with
different variances of noise using the two sequential approaches, with d=1 and d=0.5. The
stopping time points are described in Table 6.5. The GLM includes four regression
parameters. Increasing the number of normal contrast combinations leads to a higher $\lambda_T \hat{\sigma}_T^2$
value. The stopping time point of the voxel-wise Srivastava’s sequential estimation
approach is close to twice the stopping time of the voxel-wise sequential
estimation approach. The relationship between these two sequential
estimation approaches is investigated in the next section.
Simulated   Sequential estimation    Srivastava’s sequential estimation
image       d=1        d=0.5         d=1        d=0.5
Image 1     225        479           388        875
Image 2     291        743           564        1702
Table 6.5 Required scan units of two simulated images (95% confidence throughout). Both images are constructed based on the
same four activation areas pattern, but with different variances of noise. Images 1 and 2 have 1.23
and 2.02 standard deviations of noise, respectively.
The means of voxel-level estimates of the eight active areas were separately computed
and compared to the means of simulated activation magnitudes of the corresponding active
areas. The means of simulated magnitudes after the two pre-processing steps are
also presented. The results are presented in Table 6.6. Compared to the simulated means
after pre-processing, the estimated means are accurate. The estimation
accuracies of the four (differential) activation values, bA, bB, bC and bC-bA, were separately
computed and are described in Table 6.7. Most accuracy values are higher than 80%.
Compared with accuracies under d=1, the accuracies under d=0.5 are slightly lower.
This phenomenon may be caused by changes in magnitudes introduced by the pre-processing steps.
Image 1
Estimated    Mean of        Mean of simulated β       Sequential estimation    Srivastava’s sequential estimation
area         simulated β    (after pre-processing)    d=1        d=0.5         d=1        d=0.5
A1            0.8713         0.7333                    0.7886     0.7644        0.7607     0.7613
A2            0.8713         0.6463                    0.6532     0.6511        0.6414     0.6432
B             1.7427         1.4672                    1.4836     1.4790        1.4682     1.4770
C1            2.6140         2.2004                    2.2576     2.2304        2.2496     2.2269
C2            2.6140         2.2486                    2.3177     2.2882        2.3030     2.2681
C-A1         -0.8713        -0.6851                   -0.6325    -0.6543       -0.6349    -0.6682
C-A2          2.1784         2.2874                    2.3093     2.3186        2.3367     2.3049
C-A3          1.7427         1.6022                    1.6645     1.6371        1.6616     1.6249

Image 2
Estimated    Mean of        Mean of simulated β       Sequential estimation    Srivastava’s sequential estimation
area         simulated β    (after pre-processing)    d=1        d=0.5         d=1        d=0.5
A1            0.8713         0.7333                    0.7225     0.6845        0.6957     0.6914
A2            0.8713         0.6463                    0.7012     0.6920        0.6667     0.6668
B             1.7427         1.4672                    1.4586     1.4424        1.4230     1.4807
C1            2.6140         2.2004                    2.1620     2.1620        2.1450     2.1960
C2            2.6140         2.2486                    2.3742     2.3522        2.3721     2.2828
C-A1         -0.8713        -0.6851                   -0.5862    -0.5968       -0.5968    -0.6509
C-A2          2.1784         2.2874                    2.2793     2.2778        2.2411     2.2841
C-A3          1.7427         1.6022                    1.6730     1.6601        1.7054     1.6160
Table 6.6 Estimated means of activation magnitudes of eight active regions for two simulated images, two d values and two analysis approaches (95% confidence throughout).
           Estimated    Sequential estimation    Srivastava’s sequential estimation
Image      target       d=1        d=0.5         d=1        d=0.5
Image 1    bA           100.00     99.38         99.88      91.69
           bB           100.00     96.44         97.56      93.50
           bC           96.19      88.56         90.75      85.19
           bC - bA      97.25      89.06         91.13      81.50
Image 2    bA           100.00     97.12         98.81      87.19
           bB           99.50      95.44         96.75      88.62
           bC           94.63      87.00         88.88      79.06
           bC - bA      95.50      86.19         88.62      70.56
Table 6.7 (Differential) activation estimates accuracy percentages (%) of two simulated images (95% confidence throughout).
6.1.2.3 Investigation of Two Sequential Estimation Approaches
Voxel-wise Srivastava’s sequential estimation consistently requires a greater sample
size than the voxel-wise sequential estimation approach in the previous two simulation
studies. The objective in this section is to explore the efficiency of these two methods
under different numbers of task stimuli and different task orders in the experimental
design.
6.1.2.3.1 Methods
Experimental Design
Two types of experimental design were studied here, one in which multiple stimuli
are systematically presented and another in which they are randomly presented.
In the three-stimuli scenario, the experimental design with systematic order
presents the task stimuli as follows:
A|R|B|R|C|R
where A represents task A stimuli, B represents task B stimuli, C represents task C
stimuli and R represents a rest period.
In the experimental design with random order, task stimuli are presented randomly. One rest period is inserted between any two task stimuli. Systematic stimuli order may lead to
a higher correlation among task stimuli, which will decrease the variance of estimated
differential activation. Therefore, the difference obtained by applying the two different types
of stimuli presentation order is explored here.
The Components of Two Sequential Estimation Approaches
Voxel-wise Sequential Estimation Approach
The stopping inequality in the voxel-wise sequential estimation approach is
\[
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(c\hat{B})} \le d
\tag{6.11}
\]
Component 1: $t_{T-P-1,\;1-\alpha/2K}$, where $P\!\left(t \le t_{T-P-1,\;1-\alpha/2K}\right) = 1 - \alpha/2K$
Component 2: $\mathrm{Var}(c\hat{B}) = \hat{\sigma}_T^2\, c\,(X_T' X_T)^{-1} c'$
where T is the stopping time point, P is the number of task-related regression
parameters, 1-α is the confidence level, c is the contrast vector and B is the regression
coefficients vector. K is the number of estimations estimated simultaneously, d is the
limited half CI width, $\hat{\sigma}_T^2$ is the estimated variance at sample size T and X_T is a T×(P+1)
experimental design matrix (including intercept term).
Voxel-wise Srivastava’s Sequential Estimation
\[
\frac{\left(\hat{\sigma}_T^2 + T^{-1}\right) a_T^2\, \lambda_T}{T} \le d^2
\tag{6.12}
\]
Component 1: $a_T^2$, where $P\!\left(\chi^2_{P+1} \le a_T^2\right) = 1 - \alpha$
Component 2: $\lambda_T\left(\hat{\sigma}_T^2 + T^{-1}\right) / T$
where $\hat{\sigma}_T^2$ is the estimated variance at sample size T, T is the stopping time point, P is
the number of task-related regression parameters, and $\lambda_T$ is the maximum characteristic
root of $T\,(X_T' X_T)^{-1}$. X_T is a T×(P+1) experimental design matrix (including intercept
term), d is half the length of the CI, and 1-α is the confidence level.
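To make the two stopping rules concrete, the following minimal sketch evaluates inequalities (6.11) and (6.12) for one voxel at a single time point. It assumes the critical values and design quantities have already been computed elsewhere; the function names and the numeric inputs are hypothetical and chosen only for illustration, not taken from the simulations.

```python
import math


def sequential_stops(t_crit, sigma2_hat, c_xtx_inv_c, d):
    """Inequality (6.11): t_crit * sqrt(Var(c B_hat)) <= d,
    with Var(c B_hat) = sigma2_hat * c (X'X)^{-1} c'."""
    contrast_var = sigma2_hat * c_xtx_inv_c
    return t_crit * math.sqrt(contrast_var) <= d


def srivastava_stops(a2, sigma2_hat, lam, T, d):
    """Inequality (6.12): (sigma2_hat + 1/T) * a2 * lam / T <= d**2,
    where lam is the largest eigenvalue of T (X'X)^{-1}."""
    return (sigma2_hat + 1.0 / T) * a2 * lam / T <= d ** 2


# Hypothetical inputs for a three-task design observed for T = 334 scans.
T = 334
sigma2_hat = 1.5            # assumed residual variance estimate
stop_seq = sequential_stops(t_crit=2.65, sigma2_hat=sigma2_hat,
                            c_xtx_inv_c=0.04, d=1.0)
stop_sriv = srivastava_stops(a2=9.49, sigma2_hat=sigma2_hat,
                             lam=0.085 * T, T=T, d=1.0)
print(stop_seq, stop_sriv)
```

With these illustrative inputs the contrast-specific rule is already satisfied while the simultaneous rule is not, which is the qualitative pattern this section investigates.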
Target Estimations
For the voxel-wise sequential estimation approach, the target estimations have to be
listed before the statistical approach is applied. The most common target estimations for
one, two and three task stimuli in fMRI analysis are considered here. When there is one
task stimulus in the experimental design, task activation is the only target estimation. The
stopping rule is
\[
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(b_1)} \le d,
\]
where b_1 is the estimate of the task 1 activation magnitude. The total number of estimations, K, equals 1.
When there are two task stimuli
in the experimental design, task 1 activation, b_1; task 2 activation, b_2; and the differential
activation between these two tasks, $\tfrac{1}{\sqrt{2}}(b_1 - b_2)$, are the target values to be estimated.
The stopping rules are, respectively,
\[
\begin{aligned}
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(b_1)} &\le d, \\
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(b_2)} &\le d, \\
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}\!\left(\tfrac{1}{\sqrt{2}}(b_1 - b_2)\right)} &\le d.
\end{aligned}
\]
The total number of
estimations, K, equals 3. When the number of task stimuli increases to three, the
total number of estimations, K, equals 6. These six target estimated values are task 1
activation, b_1; task 2 activation, b_2; task 3 activation, b_3; and the differential activations
between task 1 and task 2, $\tfrac{1}{\sqrt{2}}(b_1 - b_2)$; task 1 and task 3, $\tfrac{1}{\sqrt{2}}(b_1 - b_3)$; and task 2 and
task 3, $\tfrac{1}{\sqrt{2}}(b_2 - b_3)$. The stopping rules are, respectively:
\[
\begin{aligned}
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(b_1)} &\le d, \\
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(b_2)} &\le d, \\
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(b_3)} &\le d, \\
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}\!\left(\tfrac{1}{\sqrt{2}}(b_1 - b_2)\right)} &\le d, \\
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}\!\left(\tfrac{1}{\sqrt{2}}(b_1 - b_3)\right)} &\le d, \\
t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}\!\left(\tfrac{1}{\sqrt{2}}(b_2 - b_3)\right)} &\le d.
\end{aligned}
\]
Six simulated signals were generated to investigate the difference between comparable
components of the two proposed estimation methods. Among these six signals, three
signals were generated according to systematic stimuli presentation order and the other
three were generated according to random order. For each order of stimulus presentation,
three signals separately included one, two or three task stimuli. For all the signals, each
rest period lasts for 8 seconds and task period presentation lasts for 3 seconds. Each task
is presented 20 times. In total, 334 image scans were collected given a 2-second TR.
6.1.2.3.2 Results
The right-hand side of inequality (6.11), squared, and the right-hand side of inequality (6.12) are the same.
The corresponding left-hand sides of the two inequalities are compared and discussed here.
A 95% confidence level is used.
Component 1 in the two sequential estimation approaches is, respectively,
$\left(t_{T-P-1,\;1-\alpha/2K}\right)^2$, where $P\!\left(t \le t_{T-P-1,\;1-\alpha/2K}\right) = 1 - \alpha/2K$, or $a_T^2$, where $P\!\left(\chi^2_{P+1} \le a_T^2\right) = 1 - \alpha$. T
equals 334 here. Both components depend on the confidence level, 1-α, but
not on the order of stimulus presentation. The α applied to the t-value is modified by Bonferroni
correction when there is more than one estimation of interest in inference. The respective
values of component 1 under the scenarios of one, two or three task stimuli included in an
experimental design are described below. As can be seen, $a_{334}^2$ is consistently greater
than $\left(t_{334-P-1,\;1-0.05/2K}\right)^2$ under the different task designs.
[Figure: "Components 1 in voxel-wise sequential estimation and voxel-wise Srivastava approach"; the square of the t value and the chi-square value plotted against the number of tasks (1 to 3).]

Comparable component 1                                      Number of tasks in X_T
                                                            1 (K=1)    2 (K=3)    3 (K=6)
(t_{334-P-1, 1-0.05/2K})^2                                  3.8696     5.7898     7.0451
a_334^2, where P(χ²_{P+1} ≤ a_334^2) = 0.95                 5.9915     7.8147     9.4877
Table 6.8 The values of comparable component 1 in CI generation involving the two sequential
estimation approaches, as a function of number of task stimuli. The above figure displays the
trends. K equals the number of estimations being conducted simultaneously.
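The chi-square row of Table 6.8 can be reproduced with the standard library alone. The sketch below is one way to do so, assuming the series expansion of the regularized lower incomplete gamma function for the CDF and plain bisection for the quantile; the function names are illustrative, and a statistics package would normally be used instead.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the power series of the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(p, df, lo=0.0, hi=60.0):
    """Quantile by bisection on the monotone CDF."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# a_T^2 for P = 1, 2, 3 task regressors uses P + 1 degrees of freedom.
a2_values = [round(chi2_quantile(0.95, P + 1), 4) for P in (1, 2, 3)]
print(a2_values)  # [5.9915, 7.8147, 9.4877]
```

Because $a_T^2$ depends only on the degrees of freedom P+1 and the confidence level, these values do not change with T.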
Comparable component 2 in CI generation for the two sequential estimation
approaches is, respectively, the variance of the contrast of regression parameters,
$\mathrm{Var}(c\hat{B}) = \hat{\sigma}_T^2\, c\,(X_T' X_T)^{-1} c'$, and $\lambda_T\left(\hat{\sigma}_T^2 + T^{-1}\right)/T$. The maximum contrast variance,
$\mathrm{Var}(c\hat{B})$, among the K estimations in the first approach is selected for comparison with
$\lambda_T\left(\hat{\sigma}_T^2 + T^{-1}\right)/T$, which represents the largest variance among any linear combination of
the original regression parameters. $\hat{\sigma}_T^2$ in the first approach and $\hat{\sigma}_T^2 + T^{-1}$ in the
second cancel each other out when T is big enough. Accordingly, the maximum of the K
$c\,(X_T' X_T)^{-1} c'$ values and $\lambda_T / T$ were computed for comparison, as seen in Table 6.9.
Comparisons involving experimental design with systematic task stimulus order and with
random task stimulus order are displayed separately. The values of comparable
component 2 obtained with the random order experimental design are greater than the
values obtained with the systematic order experimental design. The values of comparable
component 2 in CI generation involving the voxel-wise Srivastava’s sequential
estimation under one, two or three task stimuli are consistently greater than the
corresponding values of voxel-wise sequential estimation.
[Figure: panels (A) and (B), "Components 2 in voxel-wise sequential estimation and voxel-wise Srivastava approach"; the sequential estimation contrast variance and the Srivastava maximum eigenvalue variance plotted against the number of tasks (1 to 3).]

Presented order     Comparable component 2                   Number of tasks in X_T
of stimuli                                                   1 (K=1)    2 (K=3)    3 (K=6)
Systematic order    maximum of K c(X_T'X_T)^{-1}c' values    0.0229     0.0264     0.0383
                    λ_T / T                                  0.0233     0.0371     0.0845
Random order        maximum of K c(X_T'X_T)^{-1}c' values    0.0283     0.0323     0.0420
                    λ_T / T                                  0.0288     0.0410     0.0845
Table 6.9 The values of comparable component 2 in CI generation involving the two sequential
estimation approaches, as a function of number of task stimuli. The above figures display the
trends. K equals the number of estimations being conducted simultaneously.
Finally, the comparable products of component 1 and component 2, $\left(t_{T-P-1,\;1-\alpha/2K}\right)^2$ ×
(maximum of the K $c\,(X_T' X_T)^{-1} c'$ values) and $a_T^2 \times \lambda_T / T$, are explored here. The results
in Table 6.10 show that the comparable products of component 1 and component 2 in CI
generation involving the voxel-wise Srivastava’s sequential estimation are consistently
greater than the corresponding values in the voxel-wise sequential estimation. The
difference quickly increases when three tasks are included in the experimental design.
Experimental designs with systematic and random order show similar values of
comparable products of component 1 and component 2.
[Figure: panels (A) and (B), "Components in voxel-wise sequential estimation and voxel-wise Srivastava approach"; sequential estimation and Srivastava sequential estimation products plotted against the number of tasks (1 to 3).]

Presented order     Comparable products of component 1                                Number of tasks in X_T
of stimuli          and component 2                                                   1 (K=1)    2 (K=3)    3 (K=6)
Systematic order    (t_{T-P-1, 1-α/2K})^2 × (maximum of K c(X_T'X_T)^{-1}c' values)   0.0887     0.1527     0.2699
                    a_T^2 × λ_T / T                                                   0.1399     0.2903     0.8017
Random order        (t_{T-P-1, 1-α/2K})^2 × (maximum of K c(X_T'X_T)^{-1}c' values)   0.1095     0.1871     0.2957
                    a_T^2 × λ_T / T                                                   0.1726     0.3208     0.8017
Table 6.10 The values of comparable products of component 1 and component 2 in CI generation
involving the two sequential estimation approaches, as a function of number of task stimuli. The
above figures display the trends. K equals the number of estimations being conducted
simultaneously.
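As a consistency check, the entries of Table 6.10 can be recomputed as the product of the component 1 values in Table 6.8 and the component 2 values in Table 6.9. The sketch below does this with the tabulated (rounded) numbers; the variable names are illustrative, and small gaps in the fourth decimal place are expected because the inputs are rounded.

```python
# Values transcribed from Tables 6.8 and 6.9 (columns: 1, 2 and 3 tasks).
component1 = {
    "sequential": [3.8696, 5.7898, 7.0451],   # squared t critical values
    "srivastava": [5.9915, 7.8147, 9.4877],   # a_T^2 (chi-square quantiles)
}
component2 = {
    ("systematic", "sequential"): [0.0229, 0.0264, 0.0383],
    ("systematic", "srivastava"): [0.0233, 0.0371, 0.0845],
    ("random", "sequential"): [0.0283, 0.0323, 0.0420],
    ("random", "srivastava"): [0.0288, 0.0410, 0.0845],
}
# Values transcribed from Table 6.10.
reported = {
    ("systematic", "sequential"): [0.0887, 0.1527, 0.2699],
    ("systematic", "srivastava"): [0.1399, 0.2903, 0.8017],
    ("random", "sequential"): [0.1095, 0.1871, 0.2957],
    ("random", "srivastava"): [0.1726, 0.3208, 0.8017],
}

products = {
    key: [c1 * c2 for c1, c2 in zip(component1[key[1]], values)]
    for key, values in component2.items()
}
max_gap = max(
    abs(p - r)
    for key in products
    for p, r in zip(products[key], reported[key])
)
# How much larger the Srivastava product is with three tasks (systematic order).
ratio_three_tasks = (products[("systematic", "srivastava")][2]
                     / products[("systematic", "sequential")][2])
print(f"largest gap to Table 6.10: {max_gap:.4f}")
print(f"Srivastava / sequential, three tasks: {ratio_three_tasks:.2f}")
```

The recomputed products agree with Table 6.10 to within rounding, and with three tasks the Srivastava product is roughly three times the sequential one.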
6.1.3 Discussion and Conclusion
The objective here is to develop a voxel-wise sequential estimation approach for
efficiently estimating the activation magnitudes over task demands. This involves
estimating activation levels with a certain level of precision, as measured by targeted CI
length, across a range of difficulty levels. In the first simulation study, voxel-wise
sequential estimation and voxel-wise Srivastava’s sequential estimation both show high
accuracies on three activation magnitude level estimations. Take voxel-wise sequential
estimation with d=0.5 as an example. In estimation over three task demand levels, the
conservative approach of assuming a worst-case variance will require an
fMRI scanning session lasting 1870 (=330+602+938) scan units. When the BOLD
signals with smaller variance of noise are detected, the proposed method is able to
shorten the experimental design to require only 1218 (=252+323+643) scan units. In
other words, around 35% saving on the scan units can be achieved when variances
decrease from 3.50 to 1.51 (for images with low activation magnitudes), from 9.49 to 3.5
(for images with medium activation magnitudes) and from 14.82 to 9.49 (for images with
high activation magnitudes). These savings will be even larger when the difference
between upper-bound variance of noise and actual variance is larger. Moreover, the
required sample sizes at each respective global stopping time point are described in
Table 6.11. Consider a case in which a 90% global stopping is acceptable. The required
sample size decreases to 1097 (=227+291+579) scan units, a 40% savings. Thus, the
employment of a global stopping rule can help limit the maximum required sample size
among the voxels of interest.
Simulated                       Global stopping percentage
image                           30%      40%      50%      60%      70%      80%      90%      100%
Image 1   No. of scan units     75.6     100.8    126      151.2    176.4    201.6    226.8    252
          No. of voxels         0        0        0        0        0        0        893      7
Image 2   No. of scan units     99       132      165      198      231      264      297      330
          No. of voxels         0        0        0        0        55       798      46       1
Image 3   No. of scan units     96.9     129.2    161.5    193.8    226.1    258.4    290.7    323
          No. of voxels         0        0        0        0        21       794      82       3
Image 4   No. of scan units     180.6    240.8    301      361.2    421.4    481.6    541.8    602
          No. of voxels         0        0        15       607      174      81       21       2
Image 5   No. of scan units     192.9    257.2    321.5    385.8    450.1    514.4    578.7    643
          No. of voxels         0        0        276      471      100      49       1        3
Image 6   No. of scan units     281.4    375.2    469      562.8    656.6    750.4    844.2    938
          No. of voxels         0        30       642      92       96       32       5        3
Table 6.11 Global stopping time points of 6 simulated images analyzed by voxel-wise sequential estimation approach, d=0.5 employed. Global
stopping percentage presents the percentage of number of voxels satisfying the stopping rules, equation (6.7). The corresponding stopping time
point and voxels satisfying stopping rules between previous and labeled percentages are displayed.
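The scan-unit savings quoted in the discussion of the first simulation study can be re-derived from the stopping points. The sketch below uses the stopping points reported for d=0.5 and the rounded 90% entries of Table 6.11; the helper name `saving` is illustrative.

```python
# Stopping points (scan units), voxel-wise sequential estimation, d = 0.5.
worst_case = [330, 602, 938]   # images 2, 4, 6 (larger noise variance)
observed = [252, 323, 643]     # images 1, 3, 5 (smaller noise variance)
global_90 = [227, 291, 579]    # images 1, 3, 5 at the 90% column, rounded


def saving(baseline, actual):
    """Fraction of scan units saved relative to the baseline session."""
    return 1.0 - sum(actual) / sum(baseline)


full_saving = saving(worst_case, observed)
global_saving = saving(worst_case, global_90)
print(sum(worst_case), sum(observed), sum(global_90))  # 1870 1218 1097
print(f"{full_saving:.1%} {global_saving:.1%}")
```

The first figure is about 35% and the second about 41%, in line with the "around 35%" and "40%" savings described in the text.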
The second simulation study explored the efficiency and accuracy of (differential)
activation estimations for an experimental design including three task stimuli. Both
sequential estimation approaches show high estimation accuracies on both activation and
differential activation magnitudes. When applying the voxel-wise sequential estimation
approach with d=0.5, the required sample size reduces from 743 when signals have a
larger variance of noise, to 479 when signals have a smaller variance of noise. More than
35.5% of scan units are saved when BOLD signals with smaller variance of noise are
detected.
In the last simulation study, Srivastava’s sequential approach required greater sample
sizes than the sequential CI approach under one, two or three tasks. In the stopping rule,
Srivastava applied $\lambda_T\left(\hat{\sigma}_T^2 + T^{-1}\right)/T$ to ensure simultaneous coverage for a fixed
confidence width 2d > 0 for all linear combinations of regression parameter estimates.
However, this also requires a greater sample size, especially when greater numbers of
task stimuli are included in the experimental design. When researchers would like to
focus on estimation of specific (differential) activations, the sequential CI approach is
recommended to save scan units. Srivastava’s sequential approach is only suggested for
researchers who would like to explore all possible linear combinations of regression
parameters after the fMRI implementation.
The order in which stimuli are presented (systematic versus random) does not have a
large influence on the performance of the two proposed approaches.
In order to have estimated values comparable with Srivastava’s sequential approach,
the scaled differential activation, as in $t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}\!\left(\tfrac{1}{\sqrt{2}}(b_1 - b_2)\right)}$, is estimated
by the sequential estimation approach. However, there is no limitation on the estimated
target for the sequential estimation method. The estimated target can instead be unscaled, as in
$t_{T-P-1,\;1-\alpha/2K} \times \sqrt{\mathrm{Var}(b_1 - b_2)}$. The increased variance is expected to lead to an increased
value in component 2 of the first approach and, thus, an increased sample size. However,
a similar overall pattern of component change is still observed.
6.2 Halving Algorithm and Voxel-wise SPRT on Detecting Neural
Compensation
The goal of the halving algorithm is to determine the minimum task difficulty level
that will activate the neural compensation network. This approach starts from a mid-range
difficulty level. Then, depending on whether activation is detected or not,
a lower or higher difficulty level is administered. The assumption is that if a subject is not
activating at a higher level, then that subject will not be active at a lower one either. There
is no need, therefore, to test at relatively lower levels if activation is not observed. On the
other hand, if activation is seen at the selected level, a lower level is administered next.
The halving algorithm is shown in tree representation in Figure 6.5 for a task with five
difficulty levels, and can be used to adaptively select a difficulty level for administration,
based on BOLD responses observed in real time. As shown in Figure 6.5, the
administered stimulus starts from a task with difficulty 3, where the difficulty levels are
ranked from the easiest to the hardest. Then, a harder or easier task is
administered according to the task activation status at difficulty 3. Applying the halving
algorithm reduces the number of tasks that must be administered from five to at most
three. Voxel-wise SPRT is applied to determine activation status at each task difficulty
level.
Figure 6.5 Halving algorithm for a task with five difficulty levels. Determinations of the
minimum difficulty level at which the network starts to be active are shown in green.
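The selection strategy described above can be sketched as a binary search over difficulty levels. This is a minimal illustration, not the implementation used in this work: the function names and the noise-free `make_subject` helper are hypothetical, the midpoint rule (rounding down) is an assumption that may differ from the branches drawn in Figure 6.5, and the activation decision at each level is passed in as a callback where the voxel-wise SPRT would be used.

```python
from typing import Callable, List, Optional, Tuple


def halving_search(
    is_active: Callable[[int], bool], n_levels: int = 5
) -> Tuple[Optional[int], List[int]]:
    """Find the lowest difficulty level at which activation is detected.

    Assumes monotonicity: a subject inactive at one level is also inactive
    at every easier level. Returns (lowest active level or None, levels tested).
    """
    low, high = 1, n_levels
    lowest_active: Optional[int] = None
    tested: List[int] = []
    while low <= high:
        level = (low + high) // 2  # midpoint; level 3 when there are five levels
        tested.append(level)
        if is_active(level):
            lowest_active = level  # activation seen: try an easier level next
            high = level - 1
        else:
            low = level + 1  # no activation: only harder levels can be active
    return lowest_active, tested


def make_subject(start_level: Optional[int]) -> Callable[[int], bool]:
    """Hypothetical noise-free subject who activates from start_level upward."""
    return lambda level: start_level is not None and level >= start_level


if __name__ == "__main__":
    for start in [1, 2, 3, 4, 5, None]:
        found, tested = halving_search(make_subject(start))
        print(f"true start={start}, classified={found}, levels tested={tested}")
```

With five levels, every path in this sketch tests at most three levels, which matches the reduction from five administered tasks to at most three.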
6.2.1 Simulation Study
In these simulations we explore how much fMRI session time could be saved by
applying the combination of the halving algorithm and voxel-wise SPRT in place of
traditional GLM analysis. We built the activation response curves over task difficulty
according to the hypothesized youth activation curve from Stern's investigation [93], shown
in Figure 5.2, using five task difficulties to construct the curve. The activation curve of
the compensation network is assumed to be equal to the activation curve of youth in primary
networks. The corresponding activation magnitudes over task demands are shown in
Table 6.12. The simulation included subjects who start to use the compensation network
at task difficulty level 1, level 2, level 3, level 4, or level 5, or who are inactive at task
difficulty level 5. In total, six possible response curves (activation level patterns) were
assumed for the subsequent simulation, as shown in Figure 6.6.
Difficulty level             Activation amplitude (value of β)
Inactive level               0
Initial activation level     0.38
One higher level             0.62
Two higher level             0.82
Three higher level           0.92
Four higher level            0.97
Table 6.12 Hypothesized activation magnitudes over task loadings.
[Figure 6.6: plot titled "Hypothesized activation curve over task demands". x-axis: difficulty level(s), 0 to 5; y-axis: activation magnitude(s), 0 to 1. Curves: active start at difficulty level 1, 2, 3, 4 and 5, and inactive for task difficulty level 5.]
Figure 6.6 The hypothesized activation curves for subjects starting to respond to the given task at
different loadings.
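As a minimal sketch of how the six response curves follow from the amplitudes in Table 6.12, the code below shifts the same amplitude sequence so that it begins at each group's initial activation level. The names `AMPLITUDES` and `activation_curve` are hypothetical, and the shift rule is an interpretation of the table's "initial activation level" and "higher level" rows.

```python
from typing import List, Optional

# Amplitudes from Table 6.12, starting at the initial activation level
# and moving one difficulty level higher at each step.
AMPLITUDES = [0.38, 0.62, 0.82, 0.92, 0.97]


def activation_curve(start_level: Optional[int], n_levels: int = 5) -> List[float]:
    """Return beta at difficulty levels 1..n_levels for one group.

    start_level is the level at which the network first activates;
    None stands for the group that is still inactive at the hardest level.
    """
    curve = []
    for level in range(1, n_levels + 1):
        if start_level is None or level < start_level:
            curve.append(0.0)  # inactive level
        else:
            curve.append(AMPLITUDES[level - start_level])
    return curve


if __name__ == "__main__":
    for start in [1, 2, 3, 4, 5, None]:
        print(start, activation_curve(start))
```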
6.2.1.1 Method
The R package “neuRosim” [58] was used to generate simulated fMRI images, and the
simulated datasets were analyzed within the Matlab environment (64-bit version R2012a;
The MathWorks, Natick, MA).
In order to model realistic BOLD signals, a nonlinear Balloon model was used [110]. In
this simulation, the implemented experimental design included one task. This design
started with an 8-second rest period followed by a 3-second task stimulus. This
alternating cycle was repeated 30 times and ended with one 8-second rest period, for a
total of 338 seconds, or 169 scan units with a 2-second TR. Each group with a different
initial activation level has a distinct activation curve over the five task difficulties, as
shown in Figure 6.6. For each group, five BOLD signals at five levels of task difficulty
were generated. The simulated BOLD signals were produced by combining time series
associated with activation activity and noise, as shown in equation (4.8). The
corresponding activation amplitudes were defined as the regression coefficient values
associated with the experimental design in GLM, as shown in Table 6.12. Thermal
noise, modeled by a normal distribution, was also added to the simulated BOLD signal. The
SNR was kept below 0.5 to reflect clinically observed fMRI signals. One hundred individual
signals were generated for each group.
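The timing of this design can be checked with a short sketch that builds the rest/task indicator second by second and samples it at each TR. It only reproduces the block timing; it does not model the Balloon-model hemodynamics or the noise, and the function and variable names are hypothetical.

```python
REST_S = 8      # seconds of rest per cycle
TASK_S = 3      # seconds of task stimulus per cycle
N_CYCLES = 30   # number of rest + task cycles
TR_S = 2        # repetition time in seconds


def block_design(rest_s=REST_S, task_s=TASK_S, n_cycles=N_CYCLES):
    """Per-second stimulus indicator: (rest, task) repeated, then a final rest."""
    one_cycle = [0] * rest_s + [1] * task_s
    return one_cycle * n_cycles + [0] * rest_s


design = block_design()
total_seconds = len(design)        # 30 * (8 + 3) + 8
n_scans = total_seconds // TR_S    # scan units at a 2-second TR
stimulus_at_scan = design[::TR_S]  # indicator sampled at each scan onset

if __name__ == "__main__":
    print(total_seconds, n_scans, len(stimulus_at_scan))
```

The totals agree with the 338 seconds and 169 scan units stated above.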
6.2.1.2 Simulation Results
The goal here is to identify the difficulty level at which an individual starts to activate
specific ROIs, such as ROIs implicated in a neural compensation network. Given a
task with five difficulty levels, a traditional fixed design will require 150
alternating cycles, each involving one rest block followed by one task block. This can be
inefficient if measurable activation does not occur until a certain difficulty level is tested,
which may vary by individual. At each difficulty level, a hypothesis test of H0: b = 0
versus Ha: b > 0 is conducted in the context of a GLM. In a fixed design, a decision of
activation status must be made at each difficulty level. If, however, one is mainly
interested in identifying the lowest level at which the null hypothesis is rejected at a
given significance level, it may instead be possible to sequentially “jump around” in the
testing of difficulty levels, and not necessarily test at all levels. Under certain conditions,
there can be great savings in the number of administered scan units by doing this.
First, we consider fixed design results, as displayed in Table 6.13. Each column
represents the true initial activity difficulty level, from 1 to 5, plus the inactive case. Each
row represents the classification decision made after observing the BOLD signals from
the 150 alternating cycles administered, in a GLM framework (t-test) at significance level
α=0.05. The last row shows the percentage of minimal initial activity difficulty levels
correctly classified among 100 simulations per actual difficulty level. The accuracies
range from 72% to 83%.
Classified         True level
level              1     2     3     4     5     Inactive
1                  83    7     6     3     6     2
2                  17    72    6     4     4     4
3                  0     21    81    5     3     4
4                  0     0     7     74    2     7
5                  0     0     0     14    72    1
Inactive           0     0     0     0     13    82
Accuracy (%)       83    72    81    74    72    82
Table 6.13 Results from traditional GLM analysis. The numbers on the diagonal
show the number of individuals whose initial activity level was correctly determined. In sum, 150
blocks were administered to each subject.
For the combination of halving algorithm and voxel-wise SPRT, the decision rule can
be denoted by the terminal node of the “path” taken in the tree representation of the
difficulty level selection strategy shown in Figure 6.5. At each non-terminal node, the
path direction of the next stage is determined by the result of the voxel-wise
SPRT of H0: b = 0 versus Ha: b > 0. δ = 0.38 is considered to be clinically active.
The same 100 individuals' signals at each initial activation level were analyzed by
the combination of the halving algorithm and voxel-wise SPRT. The results are displayed in Table
6.14. Columns represent the true difficulty level associated with initial activation, from 1
to 5, plus the inactive case. The average total number of blocks administered per
individual is reported for each cell as well. Each row represents the classification decision
made after dynamically observing the BOLD signal using SPRT and the halving
algorithm, at a significance level of 0.05 and with a Type II error of 0.10. The last row shows
the percentage of correctly classified minimum difficulty levels among 100 simulations
per actual difficulty level. The accuracies range from 72% to 92%.
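To illustrate how a sequential test of this kind reaches a decision, the sketch below implements Wald's textbook SPRT for a normal mean with known variance, testing 0 against δ with the error rates 0.05 and 0.10 used here. It is a simplified stand-in, not the voxel-wise GLM-based SPRT of this dissertation: the known-variance normal model, the function names and the decision labels are all assumptions made for illustration.

```python
import math


def wald_boundaries(alpha, beta):
    """Wald's approximate log-likelihood-ratio boundaries (lower, upper)."""
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    return lower, upper


def sprt_normal_mean(samples, delta, sigma, alpha=0.05, beta=0.10):
    """Sequentially test H0: mean = 0 against Ha: mean = delta.

    Assumes independent normal observations with known standard deviation
    sigma. Returns (decision, number of observations used).
    """
    lower, upper = wald_boundaries(alpha, beta)
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood-ratio increment of N(delta, sigma^2) versus N(0, sigma^2).
        llr += (delta / sigma ** 2) * (x - delta / 2)
        if llr >= upper:
            return "active", n
        if llr <= lower:
            return "inactive", n
    return "undecided", n


if __name__ == "__main__":
    print(wald_boundaries(0.05, 0.10))
    print(sprt_normal_mean([1.0] * 50, delta=0.38, sigma=1.0))
    print(sprt_normal_mean([0.0] * 50, delta=0.38, sigma=1.0))
```

In the halving scheme, a decision of "active" at a node would send the path to an easier level and "inactive" to a harder one.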
Our simulations show that the halving algorithm used in conjunction with voxel-wise
SPRT methods can save 72-83% of scan time, depending on a subject's
minimum level, relative to a fixed design that requires all difficulty levels to be
administered, while achieving similar accuracy. This work would help fMRI researchers gain the
widest range of the most pertinent information possible within given time constraints.
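The savings range can be reproduced from the mean number of alternating cycles, relative to the 150 cycles of the fixed design. The sketch below uses the largest and smallest means among correctly classified subjects as read from Table 6.14 (41.61 and 25.18); the function name is hypothetical.

```python
FIXED_CYCLES = 150  # cycles administered to every subject in the fixed design


def percent_saved(mean_cycles, fixed_cycles=FIXED_CYCLES):
    """Percentage of alternating cycles saved relative to the fixed design."""
    return 100.0 * (1 - mean_cycles / fixed_cycles)


if __name__ == "__main__":
    for mean_cycles in (41.61, 25.18):
        print(mean_cycles, round(percent_saved(mean_cycles), 1))
```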
Number of individuals, by classified initially active level (rows) and true level (columns)
Classified   True level
level        1               2               3               4               5               Inactive
1            83              2               0               0               0               0
2            17              92              4               0               0               0
3            0               6               83              8               5               1
4            0               0               12              82              5               4
5            0               0               1               10              72              3
Inactive     0               0               0               0               18              92
Accuracy     83%             92%             83%             82%             72%             92%

Mean (Std) of required no. of alternating cycles, same layout (--: no individuals in the cell)
Classified   True level
level        1               2               3               4               5               Inactive
1            30.27 (9.34)    56 (5.83)       --              --              --              --
2            33.61 (11.70)   37.99 (11.13)   39.63 (21.69)   --              --              --
3            --              27.63 (12.29)   29.05 (11.50)   29.12 (13.03)   25.28 (7.00)    47.38 (-)
4            --              --              34.27 (9.20)    37.28 (15.41)   35.15 (15.06)   36.09 (12.59)
5            --              --              30.88 (0)       41.98 (15.82)   41.61 (13.76)   47.83 (4.53)
Inactive     --              --              --              --              32.79 (13.61)   25.18 (11.55)
Table 6.14 Results from the proposed halving algorithm and voxel-wise SPRT. The numbers on the diagonal show the number of individuals
whose initial activation level was correctly declared, together with the total required number of blocks. The accuracies among the 6 initial activation groups were around 70%
to 90%. The second panel gives the mean of the total required number of blocks for each cell. Std represents standard deviation.
6.3 Discussion and Conclusion
The goal of this chapter is to develop dynamic sequential statistical methods for
efficiently understanding three characteristics of CR: efficiency, capacity and
compensation. Two approaches are proposed. The first, voxel-wise sequential estimation,
offers a means to monitor the lengths of CIs over a range of experimental conditions, in
order to assess efficiency and capacity in neural reserve. The voxel-wise sequential
estimation approach is shown to have high efficiency and accuracy for (differential)
activation magnitude estimation. With an increased number of task demands, greater
differences between the assumed largest variance and the variance of the observed BOLD
signals lead to relatively greater savings in the number of scan units required. The
second approach, sequential selection of difficulty level, implemented as a combination of
a halving algorithm and voxel-wise SPRT, is used to dynamically detect if and when
neural compensation occurs. The savings in the number of scan units needed to make
such identifications in the simulation study are dramatic, at 72-83%.
In sum, the proposed sequential methods can improve the fMRI experimental design
used to detect three characteristics of CR. This systematic approach should quickly and
efficiently provide information that can aid in the identification of pre-symptomatic AD.
Chapter 7
Conclusion
Voxel-wise sequential hypothesis-testing and estimation methods are proposed in this
dissertation for efficiently locating brain activation regions and estimating activation
magnitudes.
Four simulation studies were performed in the investigation of voxel-wise SPRT, the
proposed sequential analysis method for identifying activation status. The
simulation conditions were selected to reflect realistic conditions with respect to expected
signal-to-noise ratios (SNR). Compared with the traditional fMRI analysis method, GLM,
the SPRT approach required at least 40% less fMRI scan time
for locating the peaks of regions with high activation values in the first simulation
study. Further, a prominent advantage of voxel-wise SPRT, which stops sampling
when sufficient evidence has been collected to make a voxel-level activation status decision, is
that the fMRI experimental design can be adjusted according to the subject's real-time responses.
Relatively high activation levels, or relatively low error variance, can lead to shorter
required administrations for attaining targeted statistical precision levels. In the third
simulation study, the saving increased to 50% when BOLD signals with greater SNR
were recorded. Voxel-wise SPRT is appropriate for choosing not only between one-sided
hypotheses but also between two-sided hypotheses. In the two-sided case, the proposed
approach still saved 30% of fMRI scan units with higher than 90% detection
accuracy, as shown in the second simulation study. In the fourth simulation, one-sided
voxel-wise SPRT showed greater accuracy in detecting differential inactivation compared
to a fixed design. Traditional GLM analysis actually showed increased Type II error for
inactive regions when the fMRI scan session was extended. In the real-data example, which aimed at
identifying regions that are activated by adult faces but not by house stimuli, a
session saving of more than 65% was observed when using the proposed one-sided voxel-wise
SPRT, as compared with the length of the original design.
When Bonferroni correction is applied in order to correct for the multiple comparison
issue in voxel-wise SPRT, the upper and lower stopping boundaries become very large and
very small, respectively. This leads to a larger number of required scan units, so focusing on specific
regions of interest is helpful for reducing the number of voxels being analyzed. Perhaps a
bigger concern is that parameter values that lie in between the null and alternative hypothesis value
sets are difficult to distinguish. Ironically, while such parameter values are costly in
terms of expected required sample size, they also are the values for which indifference to
classification error may be greatest. To combat this phenomenon, a global stopping rule
is proposed. Based on expert opinion and empirical rules, a predetermined
percentage (less than 100%) is selected so that once that percentage of voxel-wise analyses
satisfies the SPRT stopping criteria, fMRI experimentation is stopped. This approach allows
for savings in scanning to be achieved, while still allowing for desired statistical accuracy
for identifying the most strongly activated and least activated voxels.
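The effect of the correction on the boundaries, and the global stopping rule, can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used here: it assumes Wald's approximate boundaries, assumes that both error rates are divided by the number of voxels, and uses hypothetical function names and an arbitrary example threshold.

```python
import math


def sprt_log_boundaries(alpha, beta, n_voxels=1):
    """Wald log-boundaries with both error rates divided by n_voxels.

    Dividing beta as well as alpha is an assumption of this sketch.
    """
    a = alpha / n_voxels
    b = beta / n_voxels
    lower = math.log(b / (1 - a))
    upper = math.log((1 - b) / a)
    return lower, upper


def global_stop(decided_flags, required_fraction):
    """True once at least required_fraction of voxels have crossed a boundary."""
    return sum(decided_flags) / len(decided_flags) >= required_fraction


if __name__ == "__main__":
    for n_voxels in (1, 100, 10000):
        print(n_voxels, sprt_log_boundaries(0.05, 0.10, n_voxels))
    # Example: 8 of 10 voxels have stopped; a 75% threshold ends the session.
    print(global_stop([True] * 8 + [False] * 2, required_fraction=0.75))
```

As the number of voxels grows, the upper log-boundary rises and the lower one falls, so more scan units are needed before either is crossed; restricting the analysis to regions of interest keeps the divisor small.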
As opposed to SPRT, which involves hypothesis testing, it may also be of interest to
directly estimate activation magnitude in the context of a GLM. Two voxel-wise
sequential estimation methods are proposed here. The first voxel-wise sequential
estimation method focuses directly on the regression parameters of interest (including
contrasts). When the lengths of the estimated CIs of the regression parameters of interest are
smaller than a specified value 2d under the prescribed confidence level 1-α, the sequential
method stops sampling and computes the estimates of interest from the recorded
signals. Voxel-wise Srivastava's sequential estimation method, on the other hand, can
ensure simultaneous coverage for a fixed confidence width 2d > 0 for all linear
combinations of regression parameter estimates, $c\hat{B}$, where c is a contrast vector with
cc'=1 and $\hat{B}$ is the vector of regression coefficient estimates. Both methods showed
high efficiency and accurate (differential) activation estimates in the simulation
studies.
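The fixed-width stopping idea of the first method can be illustrated on the simplest case, the mean of a stream of observations, where sampling stops once the CI half-width falls to d or below. This sketch is a simplification and not the GLM-based procedure of this dissertation: it estimates a mean rather than regression parameters, replaces the t critical value with a normal quantile, and uses hypothetical names and a hypothetical minimum sample size.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def sequential_fixed_width_mean(stream, d, alpha=0.05, n_min=10):
    """Sample until the (1 - alpha) CI for the mean has half-width <= d.

    Uses a normal quantile as an approximation to the t critical value.
    Returns (estimate, half_width, n), or None if the stream runs out first.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    data = []
    for x in stream:
        data.append(x)
        n = len(data)
        if n < n_min:
            continue  # collect a minimum number of observations first
        half_width = z * stdev(data) / math.sqrt(n)
        if half_width <= d:
            return mean(data), half_width, n
    return None


if __name__ == "__main__":
    random.seed(1)
    noisy = (random.gauss(0.6, 1.0) for _ in range(10000))
    print(sequential_fixed_width_mean(noisy, d=0.2))
```

A smaller noise variance shrinks the half-width faster, so the rule stops earlier, which is the mechanism behind the savings reported for signals with lower noise variance.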
Srivastava's method ensures that all linear combinations of regression coefficients
are covered by a confidence region with a specified maximum width. However, a greater
sample size is traded for this general coverage. In the simulation studies, when task
activations and differential activations between two tasks were of interest in
experimental designs including 1, 2 and 3 task stimuli, the voxel-wise
sequential estimation method stopped sampling earlier than Srivastava's general coverage
method. Further, the contrast vector c is constrained by cc'=1 in Srivastava's
method, whereas there is no such constraint in the first sequential estimation method. Therefore, the
voxel-wise sequential estimation method is recommended, as it gives greater savings, when
researchers are interested in the task activations and differential activations between two
tasks.
An application of this work is in assessing and characterizing cognitive reserve (CR).
CR is used to explain the heterogeneity in cognitive functioning among individuals with
Alzheimer’s disease. Neural reserve and neural compensation are two main features of
CR. Neural reserve relates to neural efficiency and capacity of the brain. Neural
compensation reflects plasticity, and the ability of the brain to develop alternative
networks to overcome damage.
It is hypothesized that people with high CR have different activation patterns at the
early stage of AD from those with low CR. High CR is of particular interest clinically, as
it can mask symptoms and lead to short periods of cognitive morbidity. Study of CR
can lead to a better understanding of the Alzheimer's disease process, and help in
identifying individuals with very early brain changes, which could facilitate early
intervention. In order to identify the efficiency, capacity and compensation of CR,
activation patterns across varying difficulty levels of a task can be used. Sequential
estimation approaches can be used to identify and differentiate these patterns. Another
form of sequential decision-making combines a halving algorithm with the SPRT hypothesis-testing
framework to detect neural compensation that may occur at different difficulty levels,
depending on the individual. The saving in the simulation study of this second approach is
dramatic, at 72-83%.
In conclusion, voxel-wise SPRT for detecting activation status, and voxel-wise
sequential estimation to estimate activation magnitudes, are proposed here. A sequential
selection of difficulty levels within a task also is developed. These methods are shown to
achieve high efficiency with high detection accuracies. These methods can be applied to
help in understanding the characteristics of CR, which will provide important insight into
the Alzheimer’s disease process.
Chapter 8
Bibliography
1.
Ashby FG. Statistical Analysis of fMRI Data: MIT press, 2011.
2.
Murphy K, Bodurka J, Bandettini PA. How long to scan? The relationship
between fMRI temporal signal to noise ratio and necessary scan duration. Neuroimage
2007;34:565-574.
3.
Parrish TB, Gitelman DR, LaBar KS, Mesulam MM. Impact of signal-to-noise on
functional MRI. Magn Reson Med 2000;44:925-932.
4.
Bagarinao E, Nakai T, Tanaka Y. Real-time functional MRI: development and
emerging applications. Magn Reson Med Sci 2006;5:157-165.
5.
Magland JF, Tjoa CW, Childress AR. Spatio-temporal activity in real time
(STAR): optimization of regional fMRI feedback. Neuroimage 2011;55:1044-1053.
6.
Cohen MS. Real-time functional magnetic resonance imaging. Methods
2001;25:201-220.
7.
Weiskopf N, Sitaram R, Josephs O, et al. Real-time functional magnetic
resonance imaging: methods and applications. Magn Reson Imaging 2007;25:989-1003.
8.
Sitaram R, Caria A, Veit R, et al. FMRI brain-computer interface: a tool for
neuroscientific research and treatment. Comput Intell Neurosci 2007:25487.
9.
LaConte SM. Decoding fMRI brain states in real-time. Neuroimage 2011;56:440-
454.
10.
Weiskopf N. Real-time fMRI and its application to neurofeedback. Neuroimage
2011.
11.
2010 Alzheimer's disease facts and figures. Alzheimers Dement 2010;6:158-194.
12.
2013 Alzheimer's disease facts and figures. Alzheimers Dement 2013;9:208-245.
13.
Shaffer JL, Petrella JR, Sheldon FC, et al. Predicting cognitive decline in subjects
at risk for Alzheimer disease by using combined cerebrospinal fluid, MR imaging, and
PET biomarkers. Radiology 2013;266:583-591.
14.
Sperling RA, Bates JF, Chua EF, et al. fMRI studies of associative encoding in
young and elderly controls and mild Alzheimer's disease. J Neurol Neurosurg Psychiatry
2003;74:44-50.
15.
Pariente J, Cole S, Henson R, et al. Alzheimer's patients engage an alternative
network during a memory task. Ann Neurol 2005;58:870-879.
16.
Stern Y. Cognitive reserve in ageing and Alzheimer's disease. Lancet Neurol
2012;11:1006-1012.
17.
Lazar NA. The Statistical Analysis of Functional MRI Data: Springer, 2008.
18.
Huettel SA, Song AW, McCarthy G. Functional Magnetic Resonance
Imaging: Sinauer Associates, 2009.
19.
Poldrack RA, Mumford JA, Nichols TE. Handbook of Functional MRI Data
Analysis: Cambridge University Press, 2011.
20.
Logothetis NK. What we can do and what we cannot do with fMRI. Nature
2008;453:869-878.
21.
Radiopaedia. BOLD imaging. In, 2009.
22.
Biswal B, DeYoe AE, Hyde JS. Reduction of physiological fluctuations in fMRI
using digital filters. Magn Reson Med 1996;35:107-113.
23.
Glover GH. Deconvolution of impulse response in event-related BOLD fMRI.
Neuroimage 1999;9:416-429.
24.
Friston KJ, Fletcher P, Josephs O, Holmes A, Rugg MD, Turner R. Event-related
fMRI: characterizing differential responses. Neuroimage 1998;7:30-40.
25.
Friston K, Ashburner J, Kiebel S, Nichols T, Penny W. Statistical
Parametric Mapping: The Analysis of Functional Brain Images, 2007.
26.
Watson GS. Serial Correlation in Regression Analysis. Biometrika 1955;42:327-
341.
27.
Bullmore E, Brammer M, Williams SC, et al. Statistical methods of estimation
and inference for functional MR image analysis. Magn Reson Med 1996;35:261-277.
28.
Purdon PL, Weisskoff RM. Effect of temporal autocorrelation due to
physiological noise and stimulus paradigm on voxel-level false-positive rates in fMRI.
Hum Brain Mapp 1998;6:239-249.
29.
Kruggel F, von Cramon DY. Modeling the hemodynamic response in single-trial
functional MRI experiments. Magn Reson Med 1999;42:787-797.
30.
Purdon PL, Solo V, Weisskoff RM, Brown EN. Locally regularized
spatiotemporal modeling and model comparison for functional MRI. Neuroimage
2001;14:912-923.
31.
Harrison L, Penny WD, Friston K. Multivariate autoregressive modeling of fMRI
time series. Neuroimage 2003;19:1477-1491.
32.
Genovese CR, Lazar NA, Nichols T. Thresholding of statistical maps in
functional neuroimaging using the false discovery rate. Neuroimage 2002;15:870-878.
33.
Benjamini Y, Hochberg Y. Controlling the False Discovery Rate - a Practical and
Powerful Approach to Multiple Testing. J Roy Stat Soc B Met 1995;57:289-300.
34.
Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple
testing under dependency. Ann Stat 2001;29:1165-1188.
35.
Cox RW, Jesmanowicz A, Hyde JS. Real-time functional magnetic resonance
imaging. Magn Reson Med 1995;33:230-236.
36.
deCharms RC, Christoff K, Glover GH, Pauly JM, Whitfield S, Gabrieli JD.
Learned regulation of spatially localized brain activation using real-time fMRI.
Neuroimage 2004;21:436-443.
37.
deCharms RC, Maeda F, Glover GH, et al. Control over brain activation and pain
learned by using real-time functional MRI. Proc Natl Acad Sci U S A 2005;102:18626-18631.
38.
Caria A, Veit R, Sitaram R, et al. Regulation of anterior insular cortex activity
using real-time fMRI. Neuroimage 2007;35:1238-1246.
39.
Hampson M, Stoica T, Saksa J, et al. Real-time fMRI biofeedback targeting the
orbitofrontal cortex for contamination anxiety. J Vis Exp 2012.
40.
Bagarinao E, Matsuo K, Nakai T, Sato S. Estimation of general linear model
coefficients for real-time application. Neuroimage 2003;19:422-429.
41.
Caria A, Sitaram R, Birbaumer N. Real-Time fMRI: A Tool for Local Brain
Regulation. Neuroscientist 2011.
42.
Zaitsev M, Dold C, Sakas G, Hennig J, Speck O. Magnetic resonance imaging of
freely moving objects: prospective real-time motion correction using an external optical
motion tracking system. Neuroimage 2006;31:1038-1050.
43.
Speck O, Hennig J, Zaitsev M. Prospective real-time slice-by-slice motion
correction for fMRI in freely moving subjects. MAGMA 2006;19:55-61.
44.
Jenkinson M, Bannister P, Brady M, Smith S. Improved optimization for the
robust and accurate linear registration and motion correction of brain images.
Neuroimage 2002;17:825-841.
45.
Gembris D, Taylor JG, Schor S, Frings W, Suter D, Posse S. Functional magnetic
resonance imaging in real time (FIRE): sliding-window correlation analysis and
reference-vector optimization. Magn Reson Med 2000;43:259-268.
46.
Smyser C, Grabowski TJ, Frank RJ, Haller JW, Bolinger L. Real-time multiple
linear regression for fMRI supported by time-aware acquisition and processing. Magn
Reson Med 2001;45:289-298.
47.
Wald A. Sequential Analysis. New York: Wiley, 1947.
48.
Srivastava MS. On Fixed-Width Confidence Bounds for Regression Parameters
and Mean Vector. Journal of the Royal Statistical Society Series B (Methodological)
1967:p.132-140.
49.
Wald A, Wolfowitz J. Optimum character of the sequential probability ratio test. The
Annals of Mathematical Statistics 1948;19:326-339.
50.
Govindarajulu Z. Sequential Statistics: World Scientific Pub Co Inc 2004.
51.
Tantaratana S, Thomas JB. Truncated sequential probability ratio test. Inform
Sciences 1977;13:283-300.
52.
Bartlett MS. The large-sample theory of sequential tests. Mathematical
Proceedings of the Cambridge Philosophical Society 1946;42:239-244.
53.
Cox DR. Large Sample Sequential Tests for Composite Hypotheses. Sankhyā:
The Indian Journal of Statistics, Series A 1963:5-12.
54.
Li JX. Sequential probability ratio tests for generalized linear mixed models,
doctoral dissertation, University of California Riverside, 2010.
55.
De SK, Baron M. Step-up and step-down methods for testing multiple
hypotheses in sequential experiments. Journal of Statistical Planning and Inference
2012;142:2059-2070.
56.
Shah PK, Jeske DR, Luck RF. Sequential hypothesis testing techniques for pest
count models with nuisance parameters. J Econ Entomol 2009;102:1970-1976.
57.
Yan C, Liu D, He Y, et al. Spontaneous brain activity in the default mode network
is sensitive to different resting-state conditions with limited cognitive load. PLoS One
2009;4:e5743.
58.
Welvaert M, Durnez J, Moerkerke B, Verdoolaege G, Rosseel Y.
neuRosim: An R Package for Generating fMRI Data. Journal of Statistical Software
2011;44:1-18.
59.
Maus B, van Breukelen GJ, Goebel R, Berger MP. Optimal design of multi-
subject blocked fMRI experiments. Neuroimage 2011;56:1338-1352.
60.
Meyer FG, Shen X. Classification of fMRI time series in a low-dimensional
subspace with a spatial prior. IEEE Trans Med Imaging 2008;27:87-98.
61.
Posse S, Fitzgerald D, Gao K, et al. Real-time fMRI of temporolimbic regions
detects amygdala activation during single-trial self-induced sadness. Neuroimage
2003;18:760-768.
62.
Pathological correlates of late-onset dementia in a multicentre, community-based
population in England and Wales. Neuropathology Group of the Medical Research
Council Cognitive Function and Ageing Study (MRC CFAS). Lancet 2001;357:169-175.
63.
Katzman R, Terry R, DeTeresa R, et al. Clinical, pathological, and neurochemical
changes in dementia: a subgroup with preserved mental status and numerous neocortical
plaques. Ann Neurol 1988;23:138-144.
64.
Price JL, Morris JC. Tangles and plaques in nondemented aging and "preclinical"
Alzheimer's disease. Ann Neurol 1999;45:358-368.
65.
Crystal H, Dickson D, Fuld P, et al. Clinico-pathologic studies in dementia:
nondemented subjects with pathologically confirmed Alzheimer's disease. Neurology
1988;38:1682-1687.
66.
Morris JC, Storandt M, McKeel DW, Jr., et al. Cerebral amyloid deposition and
diffuse plaques in "normal" aging: Evidence for presymptomatic and very mild
Alzheimer's disease. Neurology 1996;46:707-719.
67.
Mortimer JA, Snowdon DA, Markesbery WR. Head circumference, education and
risk of dementia: findings from the Nun Study. J Clin Exp Neuropsychol 2003;25:671-679.
68.
Stern Y. What is cognitive reserve? Theory and research application of the
reserve concept. J Int Neuropsychol Soc 2002;8:448-460.
69.
Stern Y. Cognitive reserve. Neuropsychologia 2009;47:2015-2028.
70.
Stern Y. Cognitive reserve and Alzheimer disease. Alzheimer Dis Assoc Disord
2006;20:112-117.
71.
Satz P. Brain Reserve Capacity on Symptom Onset After Brain Injury: A
Formulation and Review of Evidence for Threshold Theory. Neuropsychology
1993;7:273-295.
72.
Katzman R. Education and the prevalence of dementia and Alzheimer's disease.
Neurology 1993;43:13-20.
73.
Schofield PW, Logroscino G, Andrews HF, Albert S, Stern Y. An association
between head circumference and Alzheimer's disease in a population-based study of
aging and dementia. Neurology 1997;49:30-37.
74.
Stern Y, Habeck C, Moeller J, et al. Brain networks associated with cognitive
reserve in healthy young and old adults. Cereb Cortex 2005;15:394-402.
75.
Steffener J, Stern Y. Exploring the neural basis of cognitive reserve in aging.
Biochim Biophys Acta 2012;1822:467-473.
76.
Zarahn E, Rakitin B, Abela D, Flynn J, Stern Y. Age-related changes in brain
activation during a delayed item recognition task. Neurobiol Aging 2007;28:784-798.
77.
Cabeza R. Hemispheric asymmetry reduction in older adults: the HAROLD
model. Psychol Aging 2002;17:85-100.
78.
Grady CL, Maisog JM, Horwitz B, et al. Age-related changes in cortical blood
flow activation during visual processing of faces and location. J Neurosci 1994;14:1450-1462.
79.
Reuter-Lorenz P. New visions of the aging mind and brain. Trends Cogn Sci
2002;6:394.
80.
Madden DJ, Turkington TG, Provenzale JM, et al. Adult age differences in the
functional neuroanatomy of verbal recognition memory. Hum Brain Mapp 1999;7:115-135.
81.
Dufouil C, Alperovitch A, Tzourio C. Influence of education on the relationship
between white matter lesions and cognition. Neurology 2003;60:831-836.
82.
Dufouil C, Alperovitch A, Ducros V, Tzourio C. Homocysteine, white matter
hyperintensities, and cognition in healthy elderly people. Ann Neurol 2003;53:214-221.
83.
Elkins JS, Longstreth WT, Jr., Manolio TA, Newman AB, Bhadelia RA, Johnston
SC. Education and the cognitive decline associated with MRI-defined brain infarct.
Neurology 2006;67:435-440.
84.
Glatt SL, Hubble JP, Lyons K, et al. Risk factors for dementia in Parkinson's
disease: effect of education. Neuroepidemiology 1996;15:20-25.
85.
Kesler SR, Adams HF, Blasey CM, Bigler ED. Premorbid intellectual functioning,
education, and brain size in traumatic brain injury: an investigation of the cognitive
reserve hypothesis. Appl Neuropsychol 2003;10:153-162.
86.
Farinpour R, Miller EN, Satz P, et al. Psychosocial risk factors of HIV morbidity
and mortality: findings from the Multicenter AIDS Cohort Study (MACS). J Clin Exp
Neuropsychol 2003;25:654-670.
87.
Barnett JH, Salmond CH, Jones PB, Sahakian BJ. Cognitive reserve in
neuropsychiatry. Psychol Med 2006;36:1053-1064.
88.
Sumowski JF, Chiaravalloti N, DeLuca J. Cognitive reserve protects against
cognitive dysfunction in multiple sclerosis. J Clin Exp Neuropsychol 2009;31:913-926.
89.
Tucker AM, Stern Y. Cognitive reserve in aging. Curr Alzheimer Res
2011;8:354-360.
90.
Jack CR, Jr., Knopman DS, Jagust WJ, et al. Tracking pathophysiological
processes in Alzheimer's disease: an updated hypothetical model of dynamic biomarkers.
Lancet Neurol 2013;12:207-216.
91.
Stern Y, Rakitin BC, Habeck C, et al. Task difficulty modulates young-old
differences in network expression. Brain Res 2012;1435:130-145.
92.
Curtis T. Modeling the heterogeneity in risk of progression to Alzheimer's disease
across cognitive profiles in mild cognitive impairment. Alzheimer’s Research & Therapy
in press.
93.
Stern Y. Cognitive reserve : theory and applications. New York: Taylor and
Francis, 2007.
94.
Wager TD, Smith EE. Neuroimaging studies of working memory: a meta-analysis.
Cogn Affect Behav Neurosci 2003;3:255-274.
95.
de Leon MJ, Mosconi L, Blennow K, et al. Imaging and CSF studies in the
preclinical diagnosis of Alzheimer's disease. Ann N Y Acad Sci 2007;1097:114-145.
96.
Mahieux F, Onen F, Berr C, et al. Early detection of patients in the pre demented
stage of Alzheimer's disease: the Pre-Al Study. J Nutr Health Aging 2009;13:21-26.
97.
Weiner MW, Aisen PS, Jack CR, Jr., et al. The Alzheimer's disease neuroimaging
initiative: progress report and future plans. Alzheimers Dement 2010;6:202-211 e207.
98.
Davatzikos C, Bhatt P, Shaw LM, Batmanghelich KN, Trojanowski JQ.
Prediction of MCI to AD conversion, via MRI, CSF biomarkers, and pattern
classification. Neurobiol Aging 2011;32:2322 e2319-2327.
99.
Stern Y, Gurland B, Tatemichi TK, Tang MX, Wilder D, Mayeux R. Influence of
education and occupation on the incidence of Alzheimer's disease. JAMA
1994;271:1004-1010.
100.
Celone KA, Calhoun VD, Dickerson BC, et al. Alterations in memory networks in
mild cognitive impairment and Alzheimer's disease: an independent component analysis.
J Neurosci 2006;26:10222-10231.
101.
Rodda JE, Dannhauser TM, Cutinha DJ, Shergill SS, Walker Z. Subjective
cognitive impairment: increased prefrontal cortex activation compared to controls during
an encoding task. Int J Geriatr Psychiatry 2009;24:865-874.
102.
Reiman EM, Quiroz YT, Fleisher AS, et al. Brain imaging and fluid biomarker
analysis in young adults at genetic risk for autosomal dominant Alzheimer's disease in the
presenilin 1 E280A kindred: a case-control study. Lancet Neurol 2012;11:1048-1056.
103.
Putcha D, Brickhouse M, O'Keefe K, et al. Hippocampal hyperactivation
associated with cortical thinning in Alzheimer's disease signature regions in
non-demented elderly adults. J Neurosci 2011;31:17680-17688.
104.
Stern Y, Moeller JR, Anderson KE, et al. Different brain networks mediate task
performance in normal aging and AD: defining compensation. Neurology 2000;55:1291-1297.
105.
Grady CL, McIntosh AR, Beig S, Keightley ML, Burian H, Black SE. Evidence
from functional neuroimaging of a compensatory prefrontal network in Alzheimer's
disease. J Neurosci 2003;23:986-993.
106.
Braak H, Braak E. Neuropathological stageing of Alzheimer-related changes.
Acta Neuropathol 1991;82:239-259.
107.
Quiroz YT, Budson AE, Celone K, et al. Hippocampal hyperactivation in
presymptomatic familial Alzheimer's disease. Ann Neurol 2010;68:865-875.
108.
McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical
diagnosis of Alzheimer's disease: report of the NINCDS-ADRDA Work Group under the
auspices of Department of Health and Human Services Task Force on Alzheimer's
Disease. Neurology 1984;34:939-944.
109.
Muller K. Regression and ANOVA: An Integrated Approach Using SAS
Software. SAS Institute, 2002.
110.
Buxton RB, Wong EC, Frank LR. Dynamics of blood flow and oxygenation
changes during brain activation: the balloon model. Magn Reson Med 1998;39:855-864.