presentation_v01 - The Institute for Signal and Information

Automated Identification
of Abnormal Adult EEGs
A Thesis Proposal by:
Silvia López de Diego
Neural Engineering Data Consortium
College of Engineering
Temple University
Philadelphia, Pennsylvania, USA
Introduction
Electroencephalography (EEG)
• Electroencephalography (EEG)
refers to the recording of
electrical activity along the scalp
• It is used to diagnose conditions
such as sleep disorders and
epilepsy
• Because EEG is noninvasive and
relatively cheap, it is still used
despite the emergence of
technologies such as Magnetic
Resonance Imaging (MRI)
S. López de Diego: Abnormal EEGs
December 8, 2016
4
Manual Interpretation of EEGs
• Manual interpretation of an EEG is performed by a board-certified
neurologist. It takes several years to receive this certification.
• Interrater agreement is low: the interpretation of an EEG depends
somewhat on the training and subjective judgement of the examiner.
• Increasing the interrater agreement for EEG interpretation is one of the
advantages of an automated technique.
Patient Preparation
• Patients are prepared for the test
EEG Recording
• EEG ranging from 22 minutes to several days is recorded
EEG is Interpreted
• Certified physicians interpret the EEG
EEG Report is Produced
• A report of findings (e.g. abnormality) is prepared
Manual Interpretation of EEGs
• The EEG interpretation task can be broken down into:
• Recognition of transients: Events that include pathological and
physiological waveforms, such as spike and sharp wave
discharges
• Analysis of background: General characteristics present in all
EEG recordings that are usually observed when making a
normal/abnormal classification
Example of EEG Background
Example of EEG Transient
Normal EEG Characteristics
• The main characteristics of a normal EEG are the following:
• Reactivity: Response to certain physiological changes or
provocations.
• Alpha Rhythm: Waves originating predominantly in the occipital lobe,
between 8-13 Hz and 15 to 45 μV.
• Mu Rhythm: Central rhythm of alpha activity, commonly
between 8-10 Hz, visible in 17% to 19% of adults.
• Beta Activity: Activity in the frequency bands of 18-25 Hz, 14-16 Hz and 35-40 Hz.
• Theta Activity: Traces of 6-7 Hz activity present in the frontal or
frontocentral regions of the brain.
The normal/abnormal classification depends heavily on the frequency,
presence or distortion of the alpha rhythm. Its emergence during the closed-eyes period is known as the Posterior Dominant Rhythm (PDR).
We decided to focus on this characteristic.
Abnormal EEG Classification
Automatic Abnormal EEG Classification
• A general method for the classification of normal and abnormal
EEGs has not yet been explored
• Previous studies have focused on the classification of very specific
conditions, such as the classification of athletes with residual
functional deficits after a concussion
• Most of these studies have not been conducted with clinical EEGs
• This study proposes the establishment of a general method for the
classification of normal and abnormal EEGs, which would be more
useful in a clinical setting, where patients are evaluated for a wide
range of conditions
• To do this, the focus of the study will be the automatic analysis of the
background EEG
Input → Features → Model → Output (Normal or Abnormal)
Background
The System
Input → Features → Model → Output (Normal or Abnormal)
Pilot Studies: kNN, RF
Baseline System: HMM
Classification of Sequential Data
• EEGs, like speech signals, are the
product of a physiological process that
unfolds in time
• Machine learning approaches that treat
the observations as i.i.d. would fail to
exploit the sequential nature of the
data
• This, together with the success that Hidden
Markov Models (HMMs) have shown in
speech recognition, motivated the
selection of this model for the
baseline system
Hidden Markov Models (HMMs)
• HMMs are a class of doubly stochastic processes in which discrete state
sequences are modeled as Markov chains. The state sequence itself is hidden, and the system emits a
visible observation, or symbol, in every state.
• If $\mathbf{Y}$ represents a sequence of feature vectors $\mathbf{Y} = \mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_T$, where $\mathbf{y}_t$ is the
vector observed at time $t$, and $w_i$ is the $i$-th event in a dictionary, the problem
becomes finding the most probable event:
$$\hat{w} = \arg\max_{w_i} \{ P(\mathbf{Y} | w_i) \, P(w_i) \}$$
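The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name, the event names and the scores are hypothetical, and the product P(Y|w)P(w) is evaluated in the log domain (a common practice, assumed here) to avoid numerical underflow on long sequences.

```python
import math

def most_probable_event(log_likelihoods, priors):
    """Return the event w that maximizes P(Y|w) * P(w), using logarithms.

    log_likelihoods: dict mapping event name -> log P(Y | w)
    priors:          dict mapping event name -> P(w)
    """
    return max(
        log_likelihoods,
        key=lambda w: log_likelihoods[w] + math.log(priors[w]),
    )

# Hypothetical model scores for one recording (illustrative values only).
scores = {"normal": -1050.0, "abnormal": -1042.5}
priors = {"normal": 0.5, "abnormal": 0.5}
decision = most_probable_event(scores, priors)
```

With equal priors the rule reduces to picking the model with the highest likelihood; unequal priors shift the decision toward the more frequent class.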
Hidden Markov Models (HMMs)
If the system has N states, an L-component Gaussian Mixture Model (GMM),
forward probability $\alpha(i,t)$, backward probability $\beta(j,t)$, and $P(Y|M)$ the probability that the
model $M$ generates the symbol series $Y$, the transition probability from $i$ to $j$ at
time $t$ is:
$$\gamma_t(i,j) = \frac{\alpha(i,t-1) \, a_{ij} \, b_{ij}(Y_t, \mu_{ij}, \Sigma_{ij}) \, \beta(j,t)}{P(Y|M)}$$
The estimation formulas for the transition and output probabilities are:
$$a_{ij} = \frac{\sum_t \gamma_t(i,j)}{\sum_t \sum_j \gamma_t(i,j)}$$
$$b_{ij}(k) = \frac{\sum_{t:\,Y_t = k} \gamma_t(i,j)}{\sum_t \gamma_t(i,j)}$$
If $Y$ follows an $n$-dimensional normal distribution, then:
$$b_{ij}(Y_t, \mu_{ij}, \Sigma_{ij}) = \frac{1}{(2\pi)^{n/2} |\Sigma_{ij}|^{1/2}} \exp\{ -(Y_t - \mu_{ij})^{t} \, \Sigma_{ij}^{-1} \, (Y_t - \mu_{ij}) / 2 \}$$
where $\mu_{ij}$ is the average of the output
vector, $\Sigma_{ij}$ is the covariance, and the superscripts $t$ and
$-1$ represent the transpose and the
inverse, respectively.
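The probability P(Y|M) that normalizes the transition probability can be computed with the forward recursion. The sketch below is a minimal illustration, not the system's implementation: it assumes discrete symbols rather than Gaussian outputs, stores the output probabilities per transition as `b[i][j][k]` to mirror the b_ij notation, and introduces an initial state distribution `init` that the slides do not mention.

```python
def forward_likelihood(init, a, b, symbols):
    """Return P(Y | M) for a discrete HMM that emits on transitions.

    init[i]    : probability of starting in state i (assumed, not in the slides)
    a[i][j]    : transition probability from state i to state j
    b[i][j][k] : probability of emitting symbol k on the transition i -> j
    symbols    : observed sequence Y as a list of integer symbol indices
    """
    n = len(init)
    alpha = list(init)  # alpha(i, 0)
    for k in symbols:
        # alpha(j, t) = sum_i alpha(i, t-1) * a_ij * b_ij(Y_t)
        alpha = [
            sum(alpha[i] * a[i][j] * b[i][j][k] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)

# Toy two-state model with two symbols (illustrative values only).
init = [0.6, 0.4]
a = [[0.7, 0.3], [0.2, 0.8]]
b = [
    [[0.9, 0.1], [0.5, 0.5]],
    [[0.4, 0.6], [0.2, 0.8]],
]
likelihood = forward_likelihood(init, a, b, [0, 1, 1])
```

A useful sanity check for such a routine is that the likelihoods of all possible symbol sequences of a fixed length sum to one.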
HMMs and Deep Neural Networks (DNNs)
• Advances in computer hardware and deep learning/machine learning
algorithms have facilitated the faster training of Deep Neural Networks
(DNNs)
• There have been a series of breakthroughs in the area of automatic speech
recognition. Deep Learning has surpassed the performance of HMMs in
several speech recognition tasks, such as Switchboard, in which the error
rate was decreased to 6.9%
• With sufficient data, deep learning systems can significantly improve
performance
• Long Short Term
Corpus            Training Speech    SGMM WER    DNN WER
BABEL Pashto      10 hours           69.2%       67.6%
BABEL Pashto      80 hours           50.2%       42.3%
Fisher English    2000 hours         15.4%       10.3%
Experimental Setup
Data
• The data used was a demographically balanced subset of the TUH EEG
Corpus. The data was divided as follows:
Set           Normal     Abnormal
Training      82 EEGs    80 EEGs
Evaluation    51 EEGs    55 EEGs
Experimental Design
The first 60 seconds of each EEG recording were used.
Signal features were extracted:
• MFCC-like features (8 cepstral coefficients)
• Differential energy
• First and second derivatives
Vectors for the selected channel were concatenated into a
supervector.
PCA was used to reduce the dimensionality of the feature
matrix.
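The derivative and supervector steps can be sketched as follows. This is a simplified illustration under stated assumptions: the central-difference derivative, the edge handling and the function names are hypothetical choices rather than the extraction code used in the study, and the PCA step is omitted.

```python
def deltas(frames):
    """Central-difference derivative of each feature along time.

    frames: list of equal-length feature vectors, one per analysis frame.
    Edge frames reuse their nearest neighbour (an assumed convention).
    """
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev, nxt = frames[max(t - 1, 0)], frames[min(t + 1, last)]
        out.append([(n - p) / 2.0 for p, n in zip(prev, nxt)])
    return out

def supervector(frames):
    """Append first and second derivatives to every frame, then
    concatenate all frames of the channel into one flat vector."""
    d1 = deltas(frames)
    d2 = deltas(d1)
    vector = []
    for base, first, second in zip(frames, d1, d2):
        vector.extend(base + first + second)
    return vector

# Toy input: 4 frames of 2 base features each (illustrative values only).
frames = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [6.0, 1.0]]
sv = supervector(frames)
```

Each frame triples in size once both derivatives are appended, so the supervector length grows with the number of frames, which is one reason a dimensionality reduction step such as PCA is applied afterwards.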
Random Forest and the Number of Trees
• Systems with more than 20 trees perform comparably to
each other.
• Taking both performance and classification time into
account, 50 trees were chosen for the rest of the experiments.
kNN: Tuning the System
• The lowest k for the best operating interval was chosen.
• This point corresponds to k = 20.
• The best error rate achieved by the system is 41.79% for PCA = 86.
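For reference, a kNN classifier with a tunable k can be sketched as below. This is a minimal illustration, not the pilot system: the Euclidean distance, the function name and the toy data are assumptions, and in the study the inputs would be the PCA-reduced feature vectors.

```python
import math
from collections import Counter

def knn_classify(train_vectors, train_labels, query, k=20):
    """Label `query` by majority vote among its k nearest training vectors
    (Euclidean distance; the metric is an assumption, not from the slides)."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data (illustrative values only).
train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
labels = ["normal"] * 3 + ["abnormal"] * 3
prediction = knn_classify(train, labels, [0.95, 1.0], k=3)
```

Tuning then amounts to repeating the evaluation over a range of k values and PCA dimensions and recording the error rate at each operating point.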
Channel Comparison
• The system was evaluated for
the highlighted channels
• The performance for the T5-O1
channel was better for all
operating points with PCA
dimensions higher than 20.
• This correlates with the
information learned from
neurologists about their
reliance on occipital
channels for the
classification of EEGs.
Summary of Pilot Studies
Error Rates for the systems described so far:
No.    System Description    Error
1      kNN (k = 20)          41.79%
3      RF (Ntrees = 50)      31.66%
Confusion Matrix for kNN:
Ref/Hyp     Normal    Abnormal
Normal      50.49%    49.50%
Abnormal    34.00%    66.00%
GMM-HMM Experiments
• This set of experiments was conducted with the full set of features
• The optimized system was then tested with the same feature input as the
pilot experiments for comparison
• The experiments can be summarized as follows:
• Gaussian Mixture/HMM State Analysis
• Signal Input Analysis
• Channel Analysis
GMM-HMM Experiments
Gaussian Mixture/HMM State Analysis Results:
# Gaussian Mixtures    # HMM States    Correct Detection (%)
1                      1               69.81%
1                      2               65.09%
1                      3               65.09%
2                      1               76.42%
2                      2               80.19%
2                      3               77.36%
3                      1               76.42%
3                      2               82.08%
3                      3               83.02%
4                      1               82.08%
4                      2               64.15%
4                      3               77.36%
GMM-HMM Experiments: GM/HMM Analysis
Signal Input Analysis Results:
Input (min)    #Gaussians/#HMM States    Correct Detection (%)
5              3/3                       80.19%
10             3/3                       83.02%
15             3/3                       80.19%
20             3/3                       79.25%
25             3/3                       76.42%
Channel Analysis Results:
#Gaussians/#HMM States    Channel    Correct Detection (%)
3/3                       Fp1-F7     80.19%
3/3                       T5-O1      83.02%
3/3                       F7-T3      80.19%
3/3                       C3-Cz      79.25%
3/3                       P3-O1      76.42%
Summary of Results
• The table below shows a summary of the results obtained through the
systems implemented so far:
System Description                    Error (%)
kNN (k=20)                            41.80%
RF (Nt=50)                            31.70%
PCA-HMM (#GM = 3 #HMM States = 3)     25.64%
GMM-HMM (#GM = 3 #HMM States = 3)     16.98%
kNN Confusion Matrix
Ref/Hyp     Normal    Abnormal
Normal      50.49%    49.50%
Abnormal    34.00%    66.00%
GMM-HMM Confusion Matrix
Ref/Hyp     Normal    Abnormal
Normal      78.18%    21.82%
Abnormal    11.76%    88.24%
• The GMM-HMM baseline system showed a significant decrease in
the false alarm rate in comparison with the kNN system
• The best GMM-HMM system will serve as a baseline for the
normal/abnormal classification problem
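The confusion matrices are reported as row-normalized percentages, so the overall error rate also depends on how many files each reference class contains. The sketch below shows how both quantities follow from a matrix of raw counts; the function names and the counts are hypothetical and are not the study's data.

```python
def error_rate(counts):
    """Overall error rate (%) from a confusion matrix of raw counts,
    where counts[ref][hyp] is the number of files with reference class
    `ref` that were classified as `hyp`."""
    total = sum(sum(row) for row in counts)
    correct = sum(counts[i][i] for i in range(len(counts)))
    return 100.0 * (total - correct) / total

def row_percentages(counts):
    """Normalize each reference row so that it sums to 100%."""
    return [[100.0 * c / sum(row) for c in row] for row in counts]

# Hypothetical counts: rows = reference (normal, abnormal), columns = hypothesis.
counts = [[40, 10], [5, 45]]
overall = error_rate(counts)
per_class = row_percentages(counts)
```

If abnormal is treated as the target class, the Normal row's Abnormal entry (`per_class[0][1]` here) plays the role of the false alarm rate.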
Timeline of Future Work
GMM-HMM Experiments
December-January
• Set up a deep learning system for a second pass of processing after the
GMM-HMM stage:
• Implement and optimize a Stacked Denoising Autoencoder (SdA)
system for the classification and increase the number of channels that
are taken into account for the classification decision.
• Expand and evaluate the normal/abnormal TUH database subset:
• Generate simple natural language processing (NLP) scripts to obtain
EEG sessions that have been evaluated and classified by neurologists
and form a larger, demographically balanced, subset of the data.
February
• Implement a Long Short-Term Memory (LSTM) system for the normal/abnormal
classification of EEGs.
• This system will be implemented with the Theano Python library for
deep learning and evaluated on the expanded dataset.
• Evaluate the SdA implementation on the expanded dataset.
March-May
• Complete the writing of the thesis and work on publications.
• Defend this thesis.