Economic information resources - York Health Economics Consortium

Improving search efficiency
for economic evaluations in
major databases using
semantic technology
Julie Glanville, Carol Lefebvre, Pamela
Negosanti, Bill Porter
Oct 2010
Overview
 Why are we interested in economic evaluations?
 Can economic evaluations be identified efficiently at
present?
 This research project
 Methods
 Results
 Discussion
 Next steps
Why are we interested in
economic evaluations?
 Systematic reviews and technology assessments frequently
consider cost-effectiveness as well as effectiveness outcomes
 This information is published in economic evaluations
 Cost-effectiveness analyses
 Cost-utility analyses
 Cost-benefit analyses
 Issues in identifying reports of economic evaluations
 Poor reporting
 abstracts may contain terms which signal an economic evaluation but not an
explicit term
 Economics is often mentioned in passing in abstracts
 Increases number of irrelevant records retrieved
Can economic evaluations be
identified efficiently?
 In healthcare databases
 Yes and No
 Specific economic evaluation databases are available (NHS
EED and HEED)
 BUT may need to carry out top up/supplementary searches
in large bibliographic databases
 Beyond healthcare
 Seem to be no economic evaluation databases
 Need to search large bibliographic databases such as ERIC
and Criminal Justice Abstracts
What about search filters?
Can search filters help?
 In healthcare databases
 Many search filters
 search filters to find economic evaluations in EMBASE and
MEDLINE achieve high sensitivity (100%) (1)
 BUT they have poor precision (less than 4%): very high proportion
of irrelevant studies are retrieved (1)
 Beyond health
 Few filters available
 Issues of precision likely to be similar to health
(1) Glanville J, Kaunelis D, Mensinkai S. How well do search filters perform in identifying economic evaluations in MEDLINE and EMBASE. Int J Technol Assess Health Care 2009;25:522-529.
This research project
 How can we improve efficiency of retrieval of
economic evaluations in large bibliographic
databases?
 Traditional Boolean approaches don’t seem to be
helping
 Indexing isn’t very helpful at present
 Can semantic analysis software help?
 Collaboration with Expert System to explore potential
for identifying economic evaluations using their
Cogito software
Semantic Net
Semantic analysis
Analysis that assigns a meaning, a sense, to a syntactic
structure and consequently to a linguistic unit, according
to the knowledge contained in the semantic network.
Methods
 Gold standard set of 1950 economic evaluation records
(published 2000, 2003, 2006)
 identified from NHS EED and then downloaded from MEDLINE.
 Comparator set of 4136 matching MEDLINE records for the 3
years (2000, 2003, 2006)
 not economic evaluations
 But identified using the NHS EED filter
 Loaded into Cogito
 Divided randomly into test sets and validation sets
 Used in-built semantic analysis and also created new rules to
categorise records as economic evaluations or non-economic
evaluations
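The random division into test and validation sets can be sketched as follows. This is a minimal illustration, not the project's actual procedure: it assumes the gold standard and comparator sets were each halved separately (which matches the 975/975 and 2068/2068 counts reported), and the record identifiers, the `split_in_half` helper and the seeds are hypothetical.

```python
import random


def split_in_half(records, seed=0):
    """Shuffle a copy of the records and return two halves (test, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Placeholder identifiers standing in for the MEDLINE records (hypothetical).
gold_standard = [f"GS{i}" for i in range(1950)]
comparators = [f"CMP{i}" for i in range(4136)]

gs_test, gs_validation = split_in_half(gold_standard, seed=1)
cmp_test, cmp_validation = split_in_half(comparators, seed=2)

print(len(gs_test), len(gs_validation))
print(len(cmp_test), len(cmp_validation))
```

Fixing the seed makes the split reproducible, so the same records stay in the validation set while rules are developed on the test set.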
Testing and validation
 Test set: 975 economic evaluations, 2068 comparator records
 Validation set: 975 economic evaluations, 2068 comparator records
Results
                                                  Test set    Validation set
Gold standard (GS) records                        975         975
Comparator records                                2068        2068
Number of GS records retrieved                    975         975
Number of comparator records retrieved            203         385
Sensitivity                                       100%        100%
Precision                                         82.77%      71.69%
Sensitivity = number of GS retrieved / number of GS records
Precision = number of GS retrieved / number of records retrieved
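As a check on the definitions, the reported sensitivity and precision can be recomputed from the counts on this slide. The sketch below assumes that "records retrieved" means gold standard records retrieved plus comparator records retrieved; the function and variable names are illustrative only.

```python
def sensitivity(gs_retrieved, gs_total):
    """Proportion of gold standard records that were retrieved."""
    return gs_retrieved / gs_total


def precision(gs_retrieved, comparators_retrieved):
    """Proportion of all retrieved records that are gold standard records."""
    return gs_retrieved / (gs_retrieved + comparators_retrieved)


# Counts taken from the Results slide.
results = {
    "Test set": {"gs_total": 975, "gs_retrieved": 975, "comparators_retrieved": 203},
    "Validation set": {"gs_total": 975, "gs_retrieved": 975, "comparators_retrieved": 385},
}

for name, r in results.items():
    s = sensitivity(r["gs_retrieved"], r["gs_total"])
    p = precision(r["gs_retrieved"], r["comparators_retrieved"])
    print(f"{name}: sensitivity {s:.2%}, precision {p:.2%}")
```

Under that assumption the calculation gives 975/1178 = 82.77% for the test set and 975/1360 = 71.69% for the validation set, matching the figures shown.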
Results, 2
(combined Test and Validation sets)
                                                    Precision   Sensitivity
Using Cogito in-built semantic rules (no filter)    77.23%      100%
Using filter with records scoring 50                78%         90%
Using filter with records scoring 100               80%         85%
Using filter with records scoring 200               81%         83%
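The trade-off in this table, where a higher score cut-off raises precision and lowers sensitivity, can be illustrated with a small threshold sweep. The sketch assumes that "records scoring 50" means records with a score of at least 50, which the slides do not state. The scores and labels below are toy values invented for illustration and are not data from the study.

```python
def evaluate(records, threshold):
    """Return (precision, sensitivity) when keeping records scoring >= threshold.

    records is a list of (score, is_economic_evaluation) pairs.
    """
    retrieved = [is_ee for score, is_ee in records if score >= threshold]
    total_relevant = sum(1 for _, is_ee in records if is_ee)
    true_positives = sum(retrieved)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    sensitivity = true_positives / total_relevant if total_relevant else 0.0
    return precision, sensitivity


# Toy (score, is_economic_evaluation) pairs: hypothetical, NOT study data.
toy_records = [
    (250, True), (220, True), (180, True), (120, True), (60, True), (30, True),
    (150, False), (90, False), (40, False), (10, False),
]

for threshold in (0, 50, 100, 200):
    p, s = evaluate(toy_records, threshold)
    print(f"threshold {threshold}: precision {p:.0%}, sensitivity {s:.0%}")
```

With these toy values, raising the threshold removes irrelevant records faster than relevant ones, which is the same pattern the combined results show.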
Discussion
 Cogito performs as well as Boolean searching in terms of
sensitivity
 Cogito has a much improved precision score compared to
performance of Boolean filters
 Over 70% (Cogito) compared to under 10% (Glanville et al)
 Cogito performs well ‘out of the box’
 Although early training efforts did not improve precision, further
exploration might yield improved results
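One way to read the precision gap is as screening workload: the reciprocal of precision is the average number of retrieved records a searcher must look at to find one economic evaluation. The sketch below applies that arithmetic to the round figures quoted in these slides (4% and 10% for Boolean filters, 70% for Cogito); the function name is illustrative.

```python
def records_per_relevant(precision):
    """Average number of retrieved records screened per relevant record found."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in the interval (0, 1]")
    return 1 / precision


scenarios = [
    ("Boolean filter at 4% precision", 0.04),
    ("Boolean filter at 10% precision", 0.10),
    ("Cogito at 70% precision", 0.70),
]

for label, p in scenarios:
    print(f"{label}: about {records_per_relevant(p):.1f} records screened per economic evaluation")
```

On these figures a searcher screens roughly 25 or 10 records per economic evaluation with a Boolean filter, against about 1.4 with Cogito.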
Next steps
 Identifying funding to carry out further exploration
 Exploring economic evaluation identification optimisation further
 Exploring the effects of importing results from a range of databases into
Cogito
 Exploring whether semantic analysis has potential to achieve
improvements in retrieval of other hard to find research where filters do
not perform well
 diagnostic test accuracy studies and quality of life research
 Exploring the potential of semantic analysis for analysing records by
study design obtained from a range of databases in healthcare, social
care, education and criminal justice contexts
 in-built rules are database independent.
For further information
Julie Glanville, York Health Economics
Consortium
Bill Porter at Expert System
http://www.expertsystem.net/