USING TEXT MINING IN A QUALITATIVE SYSTEMATIC
REVIEW OF DIGITAL HEALTH ENGAGEMENT AND
RECRUITMENT – HOW TO SEARCH AND PRIORITISE
LARGE TEXT DATASETS
Sonia Garcia Gonzalez-Moral1, Steven Brewer2, Siobhán O’Connor3,4, Frances S Mair3,
Julie Glanville1
1 York Health Economics Consortium, University of York, UK; 2 Text Mining Solutions Ltd, York,
UK; 3 General Practice and Primary Care, University of Glasgow, UK; 4 School of Nursing,
Midwifery and Social Work, University of Manchester, UK
THE CHALLENGE
• Qualitative systematic reviews are challenging when a topic is broad, there is a large
volume of publications and the research question is complex (Gallacher et al, 2013).
• Balancing the need for adequate sensitivity against reasonably precise results is difficult.
• A qualitative systematic review on engagement and recruitment to person-centred
eHealth interventions was taxing given the range of technologies used as well as the vast
and diverse literature on eHealth and recruitment (Garcia et al, 2016).
AIMS
• To explore the use of text mining techniques to search for and prioritise the eHealth
literature for this review.
METHODS
Ten highly relevant papers identified through scoping searches were used to build an initial sensitive search strategy.
This returned 147,734 records from PubMed, which were loaded into text mining software (VOSviewer).
Heat maps and cluster view diagrams were created to help identify and prioritise relevant search terms.
Fig 1: Heat map
Fig 2: Cluster view diagram
Two more iterations were conducted to produce a final search strategy.
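The term-mapping step described above can be illustrated with a minimal sketch. It counts how often terms appear across records and how often pairs of terms appear together, which is the kind of information a term map visualises. This is not VOSviewer's algorithm; the stopword list, the simple tokeniser and the three sample titles are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations
import re

# Assumed, deliberately tiny stopword list for the example.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    words = re.findall(r"[a-z][a-z\-]+", text.lower())
    return {w for w in words if w not in STOPWORDS}

def term_statistics(records):
    """Count document frequency per term and co-occurrence per term pair."""
    doc_freq = Counter()
    cooccurrence = Counter()
    for text in records:
        terms = sorted(tokenize(text))
        doc_freq.update(terms)
        # Each unordered pair of terms in the same record co-occurs once.
        cooccurrence.update(combinations(terms, 2))
    return doc_freq, cooccurrence

# Hypothetical record titles, not taken from the review dataset.
records = [
    "Recruitment of patients to a telehealth intervention",
    "Barriers to engagement with telehealth in primary care",
    "Patient engagement and recruitment in mobile health",
]
doc_freq, cooccurrence = term_statistics(records)

# Candidate search terms are those appearing in more than one record.
candidates = sorted(term for term, n in doc_freq.items() if n > 1)
```

Terms that are frequent and that co-occur with terms already known to be relevant are natural candidates to test in the next iteration of the search strategy.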
RESULTS
85,423 records were retrieved from 6 online bibliographic databases (PubMed, Medline, CINAHL, Embase, Scopus and the ACM Digital Library).
Deduplication and removal of studies related to clinical trials reduced the dataset to 57,367 records.
These were loaded into GATE 8.0 text analysis software.
Record prioritisation rules based on gazetteers (lists of prioritised search terms relevant to the topic) were applied to the corpus.
1,423 prioritised records were exported to Excel.
A MIMIR index was used to store and query the processed documents.
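The deduplication and gazetteer-based prioritisation steps described above can be sketched in plain Python. This is not the GATE 8.0 pipeline or its rule language; the gazetteer contents, the title-based duplicate key and the rule "a record must match every gazetteer" are all assumptions chosen to keep the example small.

```python
import re

# Hypothetical gazetteers: illustrative terms only, not the lists used in the review.
GAZETTEERS = {
    "technology": ["ehealth", "telehealth", "mobile health", "digital health"],
    "process": ["recruitment", "engagement", "enrolment"],
}

def normalise(text):
    """Reduce text to lowercase alphanumerics so near-identical copies match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def deduplicate(records):
    """Keep the first record seen for each normalised title."""
    seen, unique = set(), []
    for record in records:
        key = normalise(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

def matched_lists(record):
    """Return the names of gazetteers with at least one term in the record."""
    text = normalise(record["title"] + " " + record.get("abstract", ""))
    return {
        name
        for name, terms in GAZETTEERS.items()
        if any(re.search(r"\b" + re.escape(term) + r"\b", text) for term in terms)
    }

def prioritise(records):
    """Assumed rule: prioritise records that match every gazetteer."""
    return [r for r in records if matched_lists(r) == set(GAZETTEERS)]

# Hypothetical records, not taken from the review dataset.
records = [
    {"title": "Recruitment to a telehealth programme", "abstract": ""},
    {"title": "Recruitment to a Telehealth Programme.", "abstract": ""},
    {"title": "Telehealth costs in hospitals", "abstract": ""},
]
unique = deduplicate(records)
priority = prioritise(unique)
```

Requiring a match in each concept list trades some sensitivity for a much smaller screening set, which mirrors the balance between sensitivity and precision that the review had to strike.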
LESSONS LEARNED
• Although there is a slight risk of missing important papers, which is
inherent to any search strategy design, record prioritisation using
rules designed in GATE 8.0 proved to be a pragmatic, low-risk approach
to undertaking record selection in a systematic review (Garcia et al,
2016).
• Text mining technology can be used to discover and identify relevant
concepts and search terms to build search strategies and to prioritise
large volumes of literature for a systematic review (Thomas et al, 2011).
REFERENCES
• Gallacher, K., Jani, B., Morrison, D., et al. (2013) “Qualitative systematic reviews of treatment burden in stroke, heart failure
and diabetes - Methodological challenges and solutions.” BMC Medical Research Methodology, 13(1), 10.
• Garcia, S., Brewer, S., O’Connor, S., Mair, F.S., Glanville, J. (2016) “Using text mining for search strategy development and record
prioritization in a qualitative systematic review of the barriers and facilitators to recruitment in person-centred digital health
interventions.” Research Synthesis Methods, in press.
• Thomas, J., et al. (2011) “Applications of text mining within systematic reviews.” Research Synthesis Methods, 2(1), 1-14.
CONTACT
Siobhán O’Connor, Lecturer,
School of Nursing, Midwifery & Social
Work, University of Manchester, UK
[email protected]
@shivoconnor