Active Deep Learning-Based Annotation of
Electroencephalography Reports for Cohort
Identification
Ramon Maldonado, Travis Goodwin,
Sanda M. Harabagiu
The University of Texas at Dallas
Human Language Technology Research Institute
http://www.hlt.utdallas.edu/~{ramon, travis, sanda}
Conflicts
There are no conflicts of interest
Outline
1. Introduction
2. The data
3. Multi-task Active Deep Learning
4. Deep Learning Architectures
5. Sampling Method
6. Experimental Results
7. Conclusion
Introduction
• Clinical electroencephalography (EEG) is the most
important investigation in the diagnosis and
management of epilepsies.
• As more clinical EEG data becomes available, the
interpretation of EEG signals can be improved by
providing neurologists with search results for
patients who exhibit similar EEG characteristics.
• MERCuRY (Multi-modal ElectroencephalogRam
patient Cohort discoveRY), Goodwin &
Harabagiu (2016) [1], was introduced for cohort identification
Introduction
QUERY: Patients taking topiramate (Topomax) with a
diagnosis of headache and EEGs demonstrating sharp waves,
spikes or spike/polyspike and wave activity
EXAMPLE RECORD:
CLINICAL HISTORY: Recently [seizure]PROB-free but with
[episodes of light flashing in her peripheral
vision]PROB followed by [blurry vision]PROB and
[headaches]PROB
MEDICATIONS: [Topomax]TR
DESCRIPTION OF THE RECORD: There are also bursts of
irregular, frontally predominant [sharply contoured
delta activity]ACT, some of which seem to have an
underlying [spike complex]ACT from the left midtemporal region.
Introduction
Active learning reduces the amount of human annotation and validation
needed, provided an efficient sampling mechanism is used: it selects,
for validation, the instances whose annotation will have the most impact
on learning quality. Hahn et al. (2012) [2] used active-learning-based
annotation of MEDLINE abstracts to identify medical concepts.
However, in our work, in addition to annotating medical concepts in
biomedical text, we
1. annotate attributes of those concepts and
2. annotate non-contiguous mentions of one type of medical
concept (EEG Activities) using an annotation schema that captures
the semantic richness of attributes of EEG Activities.
The Data
EEG reports from Temple University Hospital (TUH)
– 25,000 reports from 15,000 patients collected over 12 years
Sections:
1. Clinical History: Lists past and current medical problems, symptoms, signs, and
treatments as well as significant medical events.
2. Medications
3. Introduction: depiction of the techniques used for the EEG
4. Description: a complete and objective description of the EEG, noting all observed
activity, patterns, and events
5. Impression: states whether the EEG test is normal or abnormal and, if abnormal,
lists the abnormalities in order of importance
6. Clinical Correlation: explains what the EEG findings mean in terms of clinical
interpretation
Multi-task Active Deep Learning
The goal of the Multi-task Active Deep Learning (MTADL) paradigm is to
concurrently perform multiple annotation tasks corresponding to the
identification of:
1. EEG Activities
2. EEG Events
3. Medical Problems
4. Medical Treatments
5. Medical Tests
6. The relevant attributes for each medical concept type
   1. The Modality [3] of each of the above medical concepts
   2. The Polarity [3] of each of the above medical concepts
   3. Medical Concept Type
   4. EEG Activity attributes
Multi-task Active Deep Learning
The MTADL Paradigm consists of 5 steps:
STEP 1: The development of an annotation schema
STEP 2: Annotation of initial training data
STEP 3: Design of deep learning methods capable of learning from the data
STEP 4: Development of sampling methods for MTADL
STEP 5: Usage of the Active Learning system involving:
STEP 5.a: Accepting/Editing annotations of sampled examples
STEP 5.b: Re-training the deep learning methods
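Steps 5.a and 5.b repeat as a loop. The following is a minimal sketch of that loop, not the authors' implementation: `train`, `select`, and `validate` are hypothetical stand-ins for the deep learning methods, the sampling method, and the human annotator, and the report and annotation types are simplified to strings and dictionaries.

```python
from typing import Callable, Dict, List

Report = str
Annotation = Dict[str, str]
Model = Callable[[Report], Annotation]


def active_learning_loop(
    seed: Dict[Report, Annotation],
    pool: List[Report],
    train: Callable[[Dict[Report, Annotation]], Model],
    select: Callable[[Dict[Report, Annotation], int], List[Report]],
    validate: Callable[[Report, Annotation], Annotation],
    batch_size: int,
    rounds: int,
) -> Dict[Report, Annotation]:
    """Grow the training set by repeatedly validating sampled annotations."""
    labeled = dict(seed)
    unlabeled = [r for r in pool if r not in labeled]
    for _ in range(rounds):
        if not unlabeled:
            break
        # First pass: training on the seed data; later passes: STEP 5.b.
        model = train(labeled)
        # Automatically annotate every report that is not yet validated.
        predicted = {r: model(r) for r in unlabeled}
        # Sampling (STEP 4) picks the reports a human should look at.
        for report in select(predicted, batch_size):
            # STEP 5.a: the annotator accepts or edits the annotation.
            labeled[report] = validate(report, predicted[report])
            unlabeled.remove(report)
    return labeled
```

The loop stops when the pool is exhausted or after a fixed number of rounds; a real system might instead stop when the learning curve flattens.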
MTADL – Annotation Schema
Medical Concept Annotation Schema
1. Type
   1. Medical Problem
   2. Medical Treatment
   3. Medical Test
   4. EEG Event
   5. EEG Activity
2. Modality
   1. Factual
   2. Possible
   3. Proposed
3. Polarity
   1. Positive
   2. Negative
MTADL – Annotation Schema
EEG Activity Attributes
1. Morphology: represents the type or “form” of EEG waves
   1. Rhythm
   2. Transient
      1. Single Wave
         1. Spike
         2. Sharp Wave
         3. …
      2. Complex
         1. K-complex
         2. Polyspike complex
         3. …
   3. Pattern
      1. PLED
      2. Suppression
      3. …
2. Frequency Band: alpha, beta, delta, theta, gamma
3. Background: whether the EEG activity is in the background
4. Magnitude: describes the amplitude of the EEG activity if it is emphasized
5. Recurrence: describes how often the EEG activity occurs
6. Dispersal: describes the spread of the activity over regions of the brain
7. Hemisphere: describes which hemisphere of the brain the activity occurs in
8. Brain Location: the region of the brain in which the activity occurs
MTADL – Annotation Schema
When the patient relaxes and the eye blinks stop, there are frontally
predominant generalized spike and wave discharges as well as polyspike and
wave discharges at 4 to 4.5 Hz.

Attribute        “spike and wave discharges”     “polyspike and wave discharges”
Morphology       Spike and Slow Wave Complex     Polyspike and Slow Wave Complex
Freq. Band       Theta                           Theta
Background       No                              No
Magnitude        Normal                          Normal
Recurrence       Repeated                        Repeated
Dispersal        Generalized                     Generalized
Hemisphere       n/a                             n/a
Brain Location   Frontal                         Frontal
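One way to hold such an annotation in code is a small record type. The sketch below is illustrative only: the class name and field names are assumptions rather than the authors' data model, and the values are copied from the example above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EEGActivity:
    """Hypothetical record for one annotated EEG Activity mention."""
    mention: str
    morphology: str
    frequency_band: str
    background: bool
    magnitude: str
    recurrence: str
    dispersal: str
    hemisphere: Optional[str]  # None stands for "n/a"
    brain_location: str


spike_wave = EEGActivity(
    mention="spike and wave discharges",
    morphology="Spike and Slow Wave Complex",
    frequency_band="Theta",
    background=False,
    magnitude="Normal",
    recurrence="Repeated",
    dispersal="Generalized",
    hemisphere=None,
    brain_location="Frontal",
)

# The second mention differs only in its text and morphology.
polyspike_wave = replace(
    spike_wave,
    mention="polyspike and wave discharges",
    morphology="Polyspike and Slow Wave Complex",
)
```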
Multi-task Active Deep Learning
[Figure: Active Learning Loop. EEG Reports with Seed Annotations form the Initial Training Data. EEG Report Annotation has two deep learning stages: (1) identification of anchors of EEG Activity and of boundaries of expressions of EEG Events, Medical Problems, Medical Treatments, and Medical Tests; (2) recognition of attributes of EEG Activities and of concept type, modality, and polarity. Sampling draws from the Automatically Annotated EEG Reports; the sampled annotations are validated or edited manually (EEG Activity attributes, EEG Events, Medical Problems, Medical Treatments, Medical Tests, plus Modality and Polarity) and become Re-Training Data.]
Deep Learning Architectures
• Stacked Long Short-Term Memory (LSTM) network [6]
– EEG Activity Anchors
– Medical Concept Boundaries
• Deep Rectified Linear Network (DRLN) [7]
– EEG Activity attributes including modality and polarity
– Medical Concept type (EEG Event, medical problem, medical treatment,
medical test), modality, and polarity
Deep Learning Architectures – Stacked LSTM
• Operates at the sentence level
• Assigns a label {I, O, B} to each token in the sentence
• Example: “occasional left anterior temporal sharp and slow wave
complexes”
• Token Features:
   • Lemma of the token and previous/next tokens
   • PoS of the token and previous/next tokens
   • Phrase chunk of the token and the previous/next tokens
   • Brown cluster [5] of the token
   • UMLS Concept Unique Identifier (CUI) of UMLS concepts
     containing the token
   • Title of the section containing the token
• Two Models: one for EEG Activity anchors, one for the boundaries of
the other medical concepts
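The {I, O, B} labels mark tokens that are Inside, Outside, or at the Beginning of a mention. A minimal sketch of how token spans map to these labels follows; the function name is hypothetical, and the span chosen for the example sentence is an assumption made for illustration, since the slide does not state which tokens are labeled.

```python
from typing import List, Tuple


def spans_to_iob(num_tokens: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Label tokens with B/I/O given half-open [start, end) token spans."""
    labels = ["O"] * num_tokens
    for start, end in spans:
        if not 0 <= start < end <= num_tokens:
            raise ValueError(f"span out of range: {(start, end)}")
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels


tokens = "occasional left anterior temporal sharp and slow wave complexes".split()
# Assumed for illustration: tokens 4..8 ("sharp and slow wave complexes")
# form a single mention.
labels = spans_to_iob(len(tokens), [(4, 9)])
for token, label in zip(tokens, labels):
    print(f"{label}\t{token}")
```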
Deep Learning Architectures – Stacked LSTM
[Figure: Stacked LSTM. Input tokens t1, t2, t3, …, tN feed stacked LSTM layers; a softmax at each token position outputs the labels b1, b2, b3, …, bN.]
Deep Learning Architectures – DRLN
Deep Rectified Linear Network for Attribute Classification
Traditionally, attribute classification is performed by training a separate
classifier, such as an SVM, for each attribute. That approach would
require training 18 separate attribute classifiers for EEG Activities and 3
classifiers for all other medical concepts.
Instead, we learn one multi-task embedding, a low-dimensional vector
representation of a medical concept, and use this representation to
determine every attribute simultaneously with the same deep learning
network.
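The idea of one shared representation feeding several attribute decisions can be sketched in a few lines. This is an illustration under stated assumptions, not the DRLN itself: the class `MultiTaskNet`, the layer sizes, the random untrained weights, and the choice of a softmax head per attribute are all hypothetical.

```python
import math
import random
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def relu_layer(x: Vector, w: Matrix, b: Vector) -> Vector:
    """Fully connected layer followed by a rectified linear unit."""
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultiTaskNet:
    """Shared ReLU layers feeding one softmax head per attribute (untrained)."""

    def __init__(self, n_features: int, hidden: List[int],
                 heads: Dict[str, List[str]], seed: int = 0) -> None:
        rng = random.Random(seed)
        sizes = [n_features] + hidden
        self.shared = [(random_matrix(o, i, rng), [0.0] * o)
                       for i, o in zip(sizes, sizes[1:])]
        self.heads = {name: (values, random_matrix(len(values), sizes[-1], rng))
                      for name, values in heads.items()}

    def embed(self, x: Vector) -> Vector:
        """Map input features to the shared low-dimensional embedding."""
        for w, b in self.shared:
            x = relu_layer(x, w, b)
        return x

    def predict(self, x: Vector) -> Dict[str, str]:
        e = self.embed(x)  # computed once, reused by every head
        out = {}
        for name, (values, w) in self.heads.items():
            probs = softmax([sum(wi * ei for wi, ei in zip(row, e)) for row in w])
            out[name] = values[probs.index(max(probs))]
        return out


net = MultiTaskNet(
    n_features=4,
    hidden=[8, 3],
    heads={"modality": ["Factual", "Possible", "Proposed"],
           "polarity": ["Positive", "Negative"]},
)
prediction = net.predict([1.0, 0.0, 0.5, 0.2])
```

Because the weights are random, the predicted values are meaningless; the point is only that adding an attribute adds a head, not a whole new classifier.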
Deep Learning Architectures – DRLN
DRLN Features
1. The text of the medical concept mention itself
2. The lemmas of each token in the medical concept mention
3. The PoS of each token in the medical concept mention
4. The lemmas of 3 tokens before/after the medical concept
mention
5. The title of the containing section
Context Features: For each token, t, in the sentence:
6. The syntactic dependency path to t.
7. The number of words between the medical concept mention and
t
8. The number of “hops” in the syntactic dependency path from the
head of the medical concept mention to t
9. The number of medical concepts between the medical concept
mention and t
Sampling Method
• Rank Combination Protocol [4]: combine several single-task active
learning selection decisions into one
• Usefulness rank
– The usefulness score $s_{X_j}(d)$ of each un-validated EEG report $d$ is
calculated with respect to each annotation task $X_j$
– Each score is translated into a rank $r_{X_j}(d)$, where higher usefulness
means lower rank
– For each EEG report, we sum the ranks over all annotation tasks to get
the overall rank, $r(d) = \sum_j r_{X_j}(d)$
– All reports are sorted by this rank and the reports with the lowest rank
are selected for validation
• By combining the individual ranks for each annotation task, we are
able to choose the documents that are most useful for all
the tasks as a whole.
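The rank combination described above can be sketched as follows. The function names, the task names, the report identifiers, and the usefulness scores are all made up for illustration; how each usefulness score is computed is not specified here, and breaking ties by report identifier is an assumption. The sketch also assumes every task scores the same set of reports.

```python
from typing import Dict, List


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 goes to the most useful report (highest score)."""
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return {d: i + 1 for i, d in enumerate(ordered)}


def select_reports(task_scores: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Sum the per-task ranks and return the k reports with the lowest total."""
    totals: Dict[str, int] = {}
    for scores in task_scores.values():
        for d, r in ranks_from_scores(scores).items():
            totals[d] = totals.get(d, 0) + r
    return sorted(totals, key=lambda d: (totals[d], d))[:k]


# Hypothetical usefulness scores for three reports on three tasks.
task_scores = {
    "anchors":    {"d1": 0.9, "d2": 0.4, "d3": 0.7},
    "boundaries": {"d1": 0.2, "d2": 0.3, "d3": 0.8},
    "attributes": {"d1": 0.6, "d2": 0.1, "d3": 0.5},
}
selected = select_reports(task_scores, k=2)
print(selected)  # ['d1', 'd3']
```

Here d1 and d3 both have an overall rank of 5 (1 + 3 + 1 and 2 + 1 + 2), while d2 has 8, so d2 is left for a later round.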
Experimental Results
• Boundary Detection
– EEG Activity Anchors
– Other Medical Concepts
– Precision, Recall, F1
• Attribute Classification
– 10 attribute classes for EEG Activities
– 3 attribute classes for other medical concepts
– Precision, Recall, F1, Accuracy
• Active Learning
– Learning curve as active learning progresses
– F1 by active learning iteration
Experimental Results – Boundary Detection
The performance of the stacked LSTM models when automatically detecting
anchors and boundaries

             EEG Activity Anchors     Other Medical Concept Boundaries
Measure      Exact      Partial       Exact      Partial
Precision    .8949      .9591         .9169      .9469
Recall       .8125      .8228         .8797      .8831
F1           .8517      .8857         .8975      .9139
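F1 is the harmonic mean of precision and recall. As a quick worked check, the sketch below recomputes the two F1 values for EEG Activity anchors from the precision and recall reported above; the function name is an assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall for EEG Activity anchors, from the table above.
anchors_exact = round(f1_score(0.8949, 0.8125), 4)
anchors_partial = round(f1_score(0.9591, 0.8228), 4)
print(anchors_exact, anchors_partial)  # 0.8517 0.8857
```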
Experimental Results – Attribute Classification
Attribute     Accuracy   Precision   Recall   F1
EEG Activity attributes
Morphology    0.990      0.757       0.704    0.724
Hemisphere    0.924      0.775       0.754    0.762
Magnitude     0.909      0.806       0.710    0.750
Recurrence    0.831      0.739       0.724    0.731
Dispersal     0.871      0.775       0.733    0.751
Freq. Band    0.982      0.664       0.620    0.640
Background    0.960      0.890       0.820    0.854
Location      0.970      0.653       0.560    0.602
Modality      0.977      0.527       0.397    0.426
Polarity      0.970      0.909       0.741    0.816
Attributes of the other medical concepts
Type          0.970      0.943       0.936    0.939
Modality      0.973      0.742       0.605    0.659
Polarity      0.978      0.829       0.719    0.770

The performance of the DRLN models when automatically detecting attributes. The first ten
rows correspond to EEG Activity attributes; the last three rows are attributes of the other
four medical concept types.
Experimental Results – Active Learning
[Figure: learning curves for Anchors, Boundaries, Activity Attributes, and Other Attributes; vertical axis from 0.5 to 1.]
Learning curves shown for the first 100 EEG reports annotated, evaluated
with the F1 measure.
Experimental Results - Discussion
• Rare attribute values
– F1 score for morphology: 0.724
– F1 score for morphology for classes with >=10 instances: 0.875
– Future work may benefit from incorporating domain knowledge (neurological ontologies, general
knowledge representations)
• Ungrammatical sentences
– “There are rare sharp transients noted in the record but without after going slow waves as would be
expected in epileptiform sharp waves.”
• The annotations produced by MTADL enable the generation of EEG-specific
qualified medical knowledge
– Graphical Representations
– Embedded knowledge graphs
Conclusion
In this paper, we described a novel active learning annotation framework that
operates on a large corpus of EEG Reports using two deep learning
architectures.
• We devised an annotation schema capable of capturing the complexity
and semantic richness of EEG activity mentions in the reports
• We designed two deep learning architectures to
1. Discover the textual boundaries of medical concepts in the reports
2. Perform multi-task attribute detection
• We used a sampling method that allows the MTADL system to incorporate
information about each task into one active learning sampling decision
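The experimental evaluations yielded promising results: exact-match F1 of .8517 for
EEG Activity anchors and .8975 for the boundaries of other medical concepts.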
Acknowledgements
Research reported in this publication was
supported by the National Human Genome
Research Institute of the National Institutes of
Health under award number 1U01HG008468.
The content is solely the responsibility of the
authors and does not necessarily represent the
official views of the National Institutes of
Health.
References
1. Goodwin TR, Harabagiu SM. Multimodal Patient Cohort Identification from EEG Report and
Signal Data. In: AMIA Annual Symposium Proceedings. American Medical Informatics
Association; 2016.
2. Hahn U, Beisswanger E, Buyko E, Faessler E. Active Learning-Based Corpus Annotation—The
PathoJen Experience. In: AMIA Annual Symposium Proceedings [Internet]. American Medical
Informatics Association; 2012 [cited 2016 Sep 23]. p. 301. Available from:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3540513/
3. Sun W, Rumshisky A, Uzuner O. Evaluating temporal relations in clinical text: 2012 i2b2
Challenge. J Am Med Inform Assoc JAMIA. 2013 Sep;20(5):806–13.
4. Reichart R, Tomanek K, Hahn U, Rappoport A. Multi-Task Active Learning for Linguistic
Annotations. In: ACL [Internet]. 2008 [cited 2016 Sep 22]. p. 861–9. Available from:
http://www.anthology.aclweb.org/P/P08/P08-1.pdf#page=905
5. Brown PF, Desouza PV, Mercer RL, Pietra VJD, Lai JC. Class-based n-gram models of natural
language. Comput Linguist. 1992;18(4):467–79.
6. Pascanu R, Gulcehre C, Cho K, Bengio Y. How to construct deep recurrent neural networks.
arXiv preprint arXiv:1312.6026 [Internet]. 2013 [cited 2016 Sep 22]. Available from:
http://arxiv.org/abs/1312.6026
7. Glorot X, Bordes A, Bengio Y. Deep Sparse Rectifier Neural Networks. In: Aistats [Internet].
2011 [cited 2016 Sep 22]. p. 275. Available from:
http://www.jmlr.org/proceedings/papers/v15/glorot11a/glorot11a.pdf
Questions