Searching the research literature

The Pearl Harvesting Information Retrieval
Theory







Developing a comprehensive search
Developing a comprehensive search filter
Internal validation of the search filter by using
the Boolean Subtraction Procedure
Developing precision into the search
External validation of a search filter
Choosing databases to search
The Pearl Harvesting Search Thesaurus




A search term (also called a keyword) is a set of
characters (i.e., a word or phrase) which
represents a topic and therefore can be used to
retrieve articles on that topic
A search filter is a set of search terms, all of
which refer to the same topic
There are usually a number of search terms
which all represent the same topic.
The database search engine looks for any
occurrence of the letters and characters that are
in the terms of the filter.
4. Layers of different search filters
(is a search strategy)
3. Collections of search terms which all pertain
to the same topic (is a search filter)
2. Words or phrases
(are search terms)
1. Text and symbol characters



Articles are coded by subject heading, author,
abstract, title, date, journal title, etc.
Subject headings are terms that are
predefined as ones that are capable of
retrieving citations on a topic
These predefined terms are located in a
database thesaurus


One important goal of searching the research
literature is to find as many relevant citations
as possible
There are many purposes to being
comprehensive, for example
 Systematic reviews
 Student theses
 Coursework or text development





One easy way to gather more citations is to
use the truncation function *
Any letters following a root word will be
recognized by the search engine
For example, if you enter only the search term
teacher, that is likely all that the search engine
will retrieve; no more, no less
It will miss citations with the plural teachers
But if you enter teacher* it will retrieve both
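
The effect of truncation can be sketched in Python. This is an illustration only: real database engines implement truncation internally, and the helper name matches_term and the sample titles are hypothetical.

```python
import re

def matches_term(term: str, text: str) -> bool:
    """Return True if `text` contains `term` as a whole word.

    A trailing * is treated as truncation: any run of word characters
    may follow the root. This mimics a search engine; it does not
    reproduce any particular one.
    """
    if term.endswith("*"):
        pattern = r"\b" + re.escape(term[:-1]) + r"\w*\b"
    else:
        pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Hypothetical article titles
titles = [
    "The teacher as researcher",
    "How teachers use feedback",
]

exact = [t for t in titles if matches_term("teacher", t)]
truncated = [t for t in titles if matches_term("teacher*", t)]
print(len(exact), len(truncated))
```

Here the plain term finds only the first title, while the truncated term finds both.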


There are usually many terms that can be
used to represent the same topic
Entering many unique and relevant topic
related search terms will increase the number
of topic related citations that are retrieved

Searching is an act of communication
between the information seeker and the
authors of articles in a field. Database
indexers are also part of this conversation in
that they provide subject headings to help
people locate information



There are many terms to denote the same
topic
If the different terms used can be found then
they can be harvested to be used in search
filters.
The largest collection of terms that provide
unique and relevant citations makes up a
search filter that is maximally comprehensive
for a specific topic
Potentially relevant search terms can be found
where there are conversations about the topic
 Therefore texts with conversations about the
topic can be analyzed to identify potential terms
 Since it will be unmanageable to review all
relevant texts, a purposive sampling of texts rich
in the conversations can be used
 It is important to sample a wide variety of
sources in order to identify the range of terms
used to denote a topic


Relevant systematic reviews and meta-analyses (see the Pearl Harvesting Search
Thesaurus for search filters on systematic
reviews)
 The search terms used in the reviews are one
source where potential terms might be found
 The language found in the titles and abstracts of
the original source references is another source
to review

Edited textbooks
 Chapters in edited textbooks that discuss the
terminology used to describe a topic can be
reviewed
▪ The chapter itself and the references of the chapter are
two sources of information

Crowdsourcing: a form of sampling
 Gather a group of individuals who are familiar with a
topic
 Have individuals each enter keywords they know
about the topic into a database
▪ No restrictions are placed so that the sampling is done
independently and therefore randomly
 Review relevant articles to locate potential search
terms
 Gather the potential terms and develop one list of
potential search terms


Truncation ensures the variants of a term are
accounted for
Double quotation marks ensure precision





Searching with phrases is prone to a problem
of word adjacency
E.g., special education -> special programs in
higher education
An easy solution is to use double quotation marks “”
Enter “special education” and only words side by
side will be retrieved
Also helpful with names in Google, where it retrieves
only the exact spelling
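
The adjacency problem can be sketched in Python. This is a simplified model, not how any particular search engine works: the two function names are hypothetical, and the unquoted behaviour (each word may match anywhere) is an assumption based on the description above.

```python
def contains_all_words(query: str, text: str) -> bool:
    """Unquoted search (assumed): every word may appear anywhere."""
    words = text.lower().split()
    return all(word in words for word in query.lower().split())

def contains_phrase(phrase: str, text: str) -> bool:
    """Quoted search (assumed): words must be side by side, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

title = "Special programs in higher education"

unquoted_hit = contains_all_words("special education", title)
quoted_hit = contains_phrase("special education", title)
print(unquoted_hit, quoted_hit)
```

The unquoted search matches the off-topic title because both words occur somewhere in it; the quoted search does not.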


How to determine which terms are essential
to the search filter
Boolean Subtraction Procedure


The purpose of this procedure is to determine
whether a search term contributes to finding
citations on a topic that no other search term
will find
If a search term fulfills this requirement it is
deemed essential to the search filter

Method:
Potential search term A
NOT (all other potential search terms, e.g., B OR
C OR D, etc.)
 Review the results
 See if any citations exist; if not, the term is not
necessary
 If citations exist, check to see if any are relevant
 If unique and relevant, then that term is
necessary
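
The procedure can be sketched with Python sets, assuming the citations retrieved by each term have already been exported as sets of record IDs. The names unique_citations and is_essential, the terms, the record IDs, and the relevant set (which stands in for a person judging relevance) are all hypothetical.

```python
def unique_citations(term, results):
    """Citations found by `term` NOT (all other terms joined with OR)."""
    others = set()
    for other_term, ids in results.items():
        if other_term != term:
            others |= ids
    return results[term] - others

def is_essential(term, results, relevant):
    """A term is essential if at least one unique citation is relevant."""
    return bool(unique_citations(term, results) & relevant)

# Hypothetical record IDs retrieved by each potential search term
results = {
    "A": {1, 2, 3},
    "B": {2, 3, 4},
    "C": {3, 5},
}
# Hypothetical outcome of a human relevance review
relevant = {1, 2, 3, 4}

essential = {term: is_essential(term, results, relevant) for term in results}
print(essential)
```

In this example term C retrieves one unique citation (record 5), but it is not relevant, so C is not needed in the filter.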





Combine all validated search terms using the
Boolean OR function
A OR B OR C OR D OR etc.
Copy and paste the filter into the search
engine
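
As a rough sketch, the validated terms can be joined programmatically before pasting. The function name build_filter and the example terms are hypothetical, and the quoting rule (wrap multi-word phrases in double quotes) is an assumption drawn from the advice on phrases in this presentation.

```python
def build_filter(terms):
    """Join validated search terms with the Boolean OR operator.

    Multi-word phrases are wrapped in double quotes unless they are
    already quoted.
    """
    parts = []
    for term in terms:
        term = term.strip()
        if " " in term and not term.startswith('"'):
            term = f'"{term}"'
        parts.append(term)
    return " OR ".join(parts)

# Hypothetical validated terms
search_filter = build_filter(["teacher*", "educator*", "school staff"])
print(search_filter)
```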






Most search engines have various fields that can be used
to search, e.g., title, abstract, author, journal title
There are also multiple field operators which allow for
searching across multiple fields all at once
The most powerful operator is one that searches all of
the fields
In the ProQuest databases this is the default
“everywhere”
In MEDLINE it is all fields, or .af
Using all fields is often not recommended; however,
if the search filter has been developed to be precise
then extra information will be gathered with minimal
loss of precision
1. Phrases entered with double quotation marks tell
the search engine to retrieve only citations that
contain the words in the phrase in the specific
order specified
 Failure to use double quotes may result in the
search engine looking for the words
separately anywhere in the bibliographic
information, which is very imprecise
2. Word sense disambiguation
 Some words have numerous more specific
versions and therefore can be further
specified to achieve better precision
 These are referred to as polysemic terms
 For example, review is very general including
such meanings as “book review”
 Therefore it can be further developed to be
“systematic review” OR “quantitative review”
OR “evidence-based review”






Enter the general term (polysemic term) into
a search box and review the citations
Note any more specific version phrases
See if they meet the criteria of the search
intent
Create a list of these more specific phrases
Recursively enter the polysemic term NOT
the list of specific phrases
Review the outcome to see if there are any
more specific phrases



Repeat the search using the polysemic term
ensuring that there are no citations that only
can be found using the polysemic term
If any citations do not have a more specific
expression and only the polysemic term can
locate them, see if there is something unique in
the grammatical expression which can be
used as a unique identifier
For example, “search for a review” OR
“conduct a review”

Use the Boolean Subtraction Procedure to
see if each of the disambiguated search terms
is necessary
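
The disambiguation loop can be sketched in Python as "polysemic term NOT the list of specific phrases", repeated as the list grows. This is a simplification: the function name residual is hypothetical, the citations are invented examples, matching is plain substring matching, and deciding which new phrases fit the search intent remains a human judgment.

```python
def residual(polysemic, specific_phrases, citations):
    """Citations matching the polysemic term NOT any specific phrase."""
    left_over = []
    for text in citations:
        lowered = text.lower()
        has_specific = any(p in lowered for p in specific_phrases)
        if polysemic in lowered and not has_specific:
            left_over.append(text)
    return left_over

# Hypothetical citation titles
citations = [
    "A systematic review of reading interventions",
    "Book review: Teaching in context",
    "A quantitative review of tutoring outcomes",
    "How to conduct a review of classroom studies",
]

specific = ["systematic review", "quantitative review"]
first_pass = residual("review", specific, citations)

# A reviewer notes "conduct a review" in the left-over citations,
# adds it to the list, and runs the subtraction again.
specific.append("conduct a review")
second_pass = residual("review", specific, citations)
print(first_pass, second_pass)
```

After the second pass only the book review remains, which is outside the search intent, so the polysemic term itself is no longer needed.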




3. Use of the Boolean operator NOT
Acronyms in particular are prone to multiple
meanings
A way to eliminate non-relevant citations is
to subtract the non-relevant ones out
For example
 ASD refers to autism spectrum disorder but also
many other conditions such as Alzheimer
Dementia
 So use ASD NOT “Alzheimer Dementia”








4. Enter a series of search filters
Multiple search filters are common in a real world
search
For example, in looking for systematic review articles
on applied behavior analysis with children with autism,
a number of filters are used:
Autism
Children
Applied behavior analysis
Systematic review
These are all connected in a search with AND, i.e.,
autism AND children AND “applied behavior analysis”
AND “systematic review”

Each time a search filter is added to the list
the search will become more focused, i.e.,
more precise according to the search intent
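
A minimal sketch of layering filters with AND follows. The function name build_strategy is hypothetical, wrapping each filter in parentheses is an assumption about database syntax, and the record IDs are invented to show how each added filter narrows the result set.

```python
def build_strategy(filters):
    """Join search filters with AND; each filter is a list of OR-ed terms."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in filters]
    return " AND ".join(groups)

strategy = build_strategy([
    ["autism"],
    ["children"],
    ['"applied behavior analysis"'],
    ['"systematic review"'],
])
print(strategy)

# Hypothetical record IDs retrieved by each filter on its own
hits = [{1, 2, 3, 4, 5}, {2, 3, 4, 5}, {3, 4, 5}, {4, 5}]

remaining = []   # size of the result set after each filter is added
current = None
for ids in hits:
    current = ids if current is None else current & ids
    remaining.append(len(current))
print(remaining)
```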


It is important to see if the search does what
it is supposed to – find as many relevant
citations as possible
Pearl Harvesting uses what is termed a quasi-gold standard reference base as a check to
see if it can locate a variety of relevant known
citations




Quasi-gold standard reference base
Choose one or more systematic reviews or
meta-analyses on the topic being investigated;
preferably ones that have not been used in the
search term development process
1. Analyze the search terms and subject headings
used to see if there are any terms not already
used
If so, test these terms using the Boolean
Subtraction Procedure





Locate the original source studies of the systematic
reviews in one of the databases being used for the
investigation
Check to see if the Pearl Harvesting Search Filter could
locate each of these studies by analyzing their title,
abstract and keywords/subject headings
If unable to locate an article look for unique and relevant
identifying linguistic markers in the bibliographic
information
Test those linguistic markers as to their capability to find
unique and relevant citations using the Boolean
Subtraction Procedure
If the newly found linguistic marker locates unique and
relevant citations add it to the search filter
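
The check against the quasi-gold standard can be sketched as follows. This is an assumption-laden illustration: check_filter is a hypothetical name, each record stands for the combined title, abstract and subject headings of one source study, the records and terms are invented, and matching is plain substring matching. The proportion found is an added convenience, not part of the procedure as described.

```python
def check_filter(gold_records, filter_terms):
    """Split gold-standard records into those the filter finds and misses."""
    found, missed = [], []
    for record in gold_records:
        text = record.lower()
        if any(term.lower() in text for term in filter_terms):
            found.append(record)
        else:
            missed.append(record)
    return found, missed

# Hypothetical bibliographic text of the original source studies
gold_records = [
    "Autism and early intervention: a trial",
    "Outcomes for autistic pupils in mainstream schools",
    "Supporting pupils on the spectrum",
]
filter_terms = ["autism", "autistic"]

found, missed = check_filter(gold_records, filter_terms)
proportion_found = len(found) / len(gold_records)
print(missed, round(proportion_found, 2))
```

Each missed record is then read for a linguistic marker (here, perhaps "on the spectrum") that can be tested with the Boolean Subtraction Procedure.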


It is well established that a comprehensive
search requires searching in multiple
databases - not just one
However, there is no theory or systematic
approach at the present time for determining
which databases to use ju



There are numerous databases and no theory
as to which ones to choose in doing a
comprehensive review.
Some examples are: ERIC, PsycINFO, Web of
Science, Google Scholar, Summan





A Pearl Harvesting suggestion
Locate systematic reviews in the area being
investigated
Review the journals used in the original
source studies
Locate the databases that these journals are
indexed in
Use these as the sources to search in
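
One way to organize this suggestion is to tally how many of the source-study journals each database indexes. The sketch below assumes the indexing information has been looked up by hand; the function name rank_databases, the journal names and the indexing table are hypothetical, and ranking the databases by count is an added convenience rather than part of the suggestion itself.

```python
from collections import Counter

def rank_databases(source_journals, indexed_in):
    """Count, per database, how many source-study journals it indexes."""
    counts = Counter()
    for journal in source_journals:
        for database in indexed_in.get(journal, []):
            counts[database] += 1
    return counts.most_common()

# Hypothetical journals from the original source studies
source_journals = ["Journal A", "Journal B", "Journal C"]

# Hypothetical lookup of where each journal is indexed
indexed_in = {
    "Journal A": ["ERIC", "PsycINFO"],
    "Journal B": ["PsycINFO"],
    "Journal C": ["ERIC", "PsycINFO"],
}

ranked = rank_databases(source_journals, indexed_in)
print(ranked)
```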



Note that each database will have some
different journals therefore different articles
The number of citations found for each term
will therefore vary across databases
Also note, different databases have different
search engines so merely copying and pasting
the Pearl Harvested search filter into various
databases will not work efficiently

Polysemic terms are general with many possible
connotations
 For example, review, analysis, synthesis
They are very general and not precise when used in a
search
 Google and Google Scholar only allow a limited
number of search terms in one search (32), so the full
extent of a Pearl Harvested filter is not possible
 Polysemic terms if used in combination with a series
of search filters may prove to be easier to implement
without too much loss of precision in these databases


Lists examples of search filters
 The search filter can be copied and pasted into
appropriate databases

Lists published articles on Pearl Harvesting
Locate the thesaurus by typing into Google:
Pearl Harvesting Search Thesaurus
 or use the URL below
http://pearlharvestingsearchthesaurus.wikispaces.com/
