Annals of Library and Information Studies
Vol. 57, June 2010, pp. 87-97
Evaluating the searching capabilities of search engines and metasearch engines:
a comparative study
B.T. Sampath Kumar1 and S.M. Pavithra2
1Assistant Professor, Department of Library and Information Science, Kuvempu University, Jnana Sahyadri-577 451, Shivamogga, Karnataka, Email: [email protected]
2Department of Library and Information Science, Kuvempu University, Jnana Sahyadri-577 451, Shivamogga, Karnataka
Compares the searching capabilities of two search engines (Google and Yahoo) and two metasearch engines (Metacrawler and
Dogpile) on the basis of the precision value and relative recall. Fifteen queries which represented a broad range of library and
information science topics were selected and each query was submitted to the search engines and metasearch engines. The first 100
results in each scenario were evaluated and it was found that search engines did not achieve higher precision than the metasearch
engines. It was also found that despite the theoretical advantage of searching the databases of several individual search engines,
metasearch engines did not achieve higher recall. The results of the study offer guidance for internet surfers to choose appropriate
search tools for information retrieval. It also provides some inputs to search engine designers to make search engines’ search
capabilities more efficient.
Introduction
Finding the required information quickly and easily on
the Web remains a major challenge and more so if the
searcher has little prior knowledge of search strategies
and search techniques of search engines. The
exponential growth of web resources since the early
1990s has compounded the problem. Another reason is the inherent ambiguity of human language: most words have more than one possible meaning, and several different words can usually express the same concept1. These difficulties persist despite significant improvements in search engine technology in recent times, and users nevertheless depend on search engines to seek online information. Several sources report that more than 80% of Web visitors use a search engine as a starting point2-3.
SearchEngineWatch.com reports that the top ten search engines execute well over half a billion searches per day for U.S. traffic alone. Web searching services such as Google, Yahoo and AltaVista are now the tools that people use every day to find information4. Even though these engines search an enormous volume of information at impressive speed, they have been widely criticized for retrieving irrelevant sites, pages consisting mainly of irrelevant links, duplicates and non-scholarly information. One reason is that their comprehensive databases hold information of many different kinds: media, marketing, entertainment, advertising and so on. In this context, this paper evaluates search engines and metasearch engines on the basis of their precision and relative recall.
Review of literature
There is a growing body of research examining the use
of Web search engines. Web research is now a major
interdisciplinary area of study, including the modeling of
user behavior and Web search engine performance.
Studies on Web search engine crawling and retrieving
have evolved as an important area of Web research since
the mid-1990s. Many search tools have been developed
and commercially implemented, but very little research
has investigated the usage and performance of Web
search engines.
Jansen, Spink and Saracevic5 conducted an in-depth
analysis of the user interactions with the Excite search
engine, and reported that user sessions are short and that
Web queries are also short. Hölscher and Strube6 examined European searchers on the Fireball search engine, a predominantly German search engine, reporting on the use of Boolean and other query operators. They note that experts exhibit different searching patterns than
novices. Jansen and Pooch7 reviewed the Web-searching
literature, comparing Web searchers with searchers of
traditional information retrieval systems and online public
access catalogues. The researchers report that Web
searchers exhibit different search characteristics than
searchers of other information systems, and they call
for uniformity in terminology and metrics for Web studies.
Montgomery and Faloutsos8 analyzed data from a
commercial research service, also noting short sessions
and queries. This stream of research provides useful
snapshots of Web searching. One limitation of these
studies, however, is that they are snapshots with no
temporal analysis comparing Web search engine usage
over time.
Another study by Chowdhury and Soboroff9 focuses on
a method for comparing search engine performance
automatically based on how they rank the known item
search result. In their study, initial query-document pairs
are constructed randomly. Then, for each search engine, the mean reciprocal rank is computed over all query-document pairs. If the query-document pairs are reasonable and unbiased, this method can be useful. However, the construction of query-document pairs requires a given directory, which may not always be possible.
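As a brief illustration of the mean reciprocal rank measure used in such automatic evaluations (a Python sketch with hypothetical ranks, not Chowdhury and Soboroff's actual implementation):

```python
# Minimal sketch of mean reciprocal rank (MRR) for known-item evaluation.
# Each entry is the rank (1-based) at which the known target document
# appeared in an engine's result list; None means it was not retrieved.
ranks = [1, 3, None, 2]  # hypothetical outcomes for four query-document pairs

# The reciprocal rank of a miss is taken as 0.
mrr = sum(1.0 / r for r in ranks if r is not None) / len(ranks)
print(f"MRR = {mrr:.3f}")  # (1 + 1/3 + 0 + 1/2) / 4 = 0.458
```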
Spink et al10 provided a four-year analysis of searching on the Excite search engine using three snapshots. They report that Web-searching sessions and query length remained relatively stable over time, although they noted a shift from entertainment to commercial searching. The researchers show that on the Excite search engine, Web-searching sessions are very short as measured by the number of queries, and the majority of Web searchers, approximately 80%, view no more than 10 to 20 Web documents. These characteristics have remained fairly constant across the multiple studies. Can et al11 attempted an automatic performance evaluation of Web search engines. Their experiments, based on eight Web search engines, 25 queries and binary user relevance judgments, show that the method provides results consistent with human-based evaluations, and that the observed consistencies are statistically significant.
There are studies that examine searching on specific Web sites rather than Web search engines. For example, Wang et al12 analyzed 48 consecutive months of data from a university Web site. The analysis was at the query and term level; the researchers did not collect session-level data. The results of the query analysis were similar to those reported in studies of Web search engines, while the term analysis results were specific to the university domain rather than the more general searching environment of Web search engines.
Jansen and Spink13 conducted a two-year study of AlltheWeb.com users. The researchers noted even shorter sessions in this temporal analysis of searchers, and a near-total intolerance of viewing more than one results page. There has been little analysis of the page-viewing characteristics of Web searchers at any finer level of granularity, although the authors report that Web searchers on AlltheWeb.com view about five actual Web documents. The researchers also noted a shift toward commercial searching on AlltheWeb.com, although there is less of it than on the Excite search engine.
Shafi and Rather14 present the results of a study of five search engines (AltaVista, Google, HotBot, Scirus and BioWeb) for retrieving scholarly information using biotechnology-related search terms. The search engines were evaluated by taking the first ten results pertaining to scholarly information for the estimation of precision and recall. The study shows that Scirus is the most comprehensive in retrieving 'scholarly information', followed by Google and HotBot. Koshman et al15 found that results overlap and lack uniqueness among the major Web search engines. Singh's16 study reveals that the search engines (except BioWeb) perform well on structured queries while BioWeb performs better on unstructured queries; as far as the currency of web pages in environmental science is concerned, Google provided the largest share of output (32.5%) posted or updated in 2005-2006, followed by Teoma with 30%. Another study found that AltaVista retrieved the largest number of sites while Excite retrieved the fewest17. As regards relevancy, the majority of relevant sites were found by Google (28%), followed by Yahoo (26%) and AltaVista (20%); further analysis showed that larger proportions of irrelevant sites were returned by HotBot (61.6%), Lycos (59.6%) and AltaVista (54.8%).
Jansen and Molina18 evaluated the effectiveness of different types of Web search engines in providing relevant content for Web e-commerce queries. The researchers examined the most popular search engines of five types (general-purpose, paid-for-inclusion, directory, e-commerce and metasearch engines) and submitted Web e-commerce queries to each. They collected the results, conducted relevance evaluations, and reported little difference among the five search engine types in the relevance of either non-sponsored or sponsored links; they also reported non-sponsored links to be more relevant than sponsored links. However, neither of these studies made an in-depth examination of sponsored links from the major search engines. Jansen19 discusses the issue of click fraud in sponsored search and examines several thousand sponsored and non-sponsored links from the three major search engines in response to more than 100 e-commerce queries. The major finding is that sponsored links are more relevant than non-sponsored links in response to e-commerce queries.
Lewandowski et al20 measured the frequency with which search engines update their indices. Thirty-eight websites that are updated on a daily basis were analysed over a time span of six weeks. The authors found that Google performs best overall, with the most pages updated on a daily basis, but only MSN was able to update all pages within a time span of less than 20 days. In terms of indexing patterns, MSN shows clear update patterns, Google shows some outliers, and the update process of the Yahoo index seems to be quite chaotic. In another study, Lewandowski21 analysed the update strategies of the major web search engines Google, Yahoo and MSN/Live.com. The study found that the best search engine in terms of up-to-dateness changes over the years and that none of the engines has an ideal solution for index freshness. A major problem identified in the research is the delay in making crawled pages available for searching, which differs from one engine to another.
Thelwall22 compared the application programming interfaces of Google, Yahoo! and Live Search for 1,587 single-word searches. The hit count estimates were broadly consistent, but Yahoo! and Google reported 5-6 times more hits than Live Search. Yahoo! tended to return slightly more matching URLs than Google, with Live Search returning significantly fewer. The URLs retrieved by Yahoo! included a significantly wider range of domains and sites than the other two engines, and there was little consistency between the three engines in the number of different domains. Google is recommended for hit count estimates, but Yahoo! is recommended for all other Webometric purposes. Höchstötter and Lewandowski23 investigated the composition of search engine results pages. Their findings include that search engines use quite different approaches to results page composition and, therefore, users see quite different result sets depending on the search engine and search query used.
Uyar24 investigated the accuracy of search engine hit counts for search queries in Google, Yahoo and Microsoft Live Search, for both single-term and multi-term queries. The results show that the number of words in a query significantly affects the accuracy of the estimates: the percentage of accurate hit count estimates drops almost by half when going from single-word to two-word queries in all three search engines, and as the number of query words increases further, the estimation error grows and the number of accurate estimates decreases.
From the above discussion, it can be seen that the findings reported by the various studies do not agree with one another, and their methodologies and evaluation criteria differed as well. In this study, the authors have tried to evaluate the searching capabilities and performance of four search tools: two search engines and two metasearch engines.
Methodology
Two search engines (Google and Yahoo) and two
metasearch engines (Metacrawler and Dogpile) were
randomly selected for evaluating the search capabilities.
Fifteen queries which represented a broad range of
library and information science topics (Appendix 1) were
submitted to Google and Yahoo which retrieved a large
number of results but only the first 100 results were
evaluated to limit the study. In case of metasearch
engines (Metacrawler and Dogpile) all the retrieved sites
are selected for evaluation since less than 100 sites are
retrieved. Each query was executed in the two search
engine and metasearch engines on the same day in order
to avoid temporal variations. In order to retrieve relevant
data from each search engine and metasearch engine,
the advance search features of search engines and
metasearch engines were used.
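The protocol can be summarized in a short sketch (Python, added here for illustration); fetch_results() is a hypothetical stand-in for manually submitting each query through an engine's advanced-search interface:

```python
# A minimal sketch of the evaluation protocol (illustrative only).
# fetch_results() is hypothetical: in the study the queries were submitted
# by hand through each engine's advanced-search interface on the same day,
# restricted to English pages with the query in the page title.
QUERIES = ["Encyclopedia", "Computer", "Multimedia"]  # 3 of the 15 queries (Appendix 1)
SEARCH_ENGINES = ["Google", "Yahoo"]
METASEARCH_ENGINES = ["Metacrawler", "Dogpile"]

def fetch_results(engine: str, query: str) -> list:
    """Hypothetical: return the rank-ordered result URLs for a query."""
    return []  # placeholder; in the study the results were collected manually

evaluation_sets = {}
for engine in SEARCH_ENGINES + METASEARCH_ENGINES:
    for query in QUERIES:
        urls = fetch_results(engine, query)
        if engine in SEARCH_ENGINES:
            urls = urls[:100]  # search engines: evaluate only the first 100 hits
        # metasearch engines returned fewer than 100 hits, so all are kept
        evaluation_sets[(engine, query)] = urls
```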
When a search is carried out in response to a search query, the user is often unable to retrieve the relevant information. The ability to retrieve the right information accurately is expressed as the precision value of the search engine25. In the present study, the search results retrieved by the search engines and metasearch engines were categorized as 'more relevant', 'less relevant', 'irrelevant', 'links' and 'sites cannot be accessed' on the basis of the following criteria26:
• If the content of the web page closely matched the subject matter of the search query, it was categorized as 'more relevant' and given a score of 2.
• If the content of the web page was not closely related to the subject matter but contained some aspects relevant to the search query, it was categorized as 'less relevant' and given a score of 1.
• If the content of the web page was not related to the subject matter of the search query, it was categorized as 'irrelevant' and given a score of 0.
• If the content of the web page consisted of a whole series of links rather than the information required, it was categorized as 'links' and given a score of 0.5 where inspection of one or two of the links proved them useful.
• If a site could not be accessed at a particular URL, the page was checked again later; if access repeatedly failed, the page was categorized as 'site cannot be accessed' and given a score of 0.
Use of these criteria enabled the precision and relative recall of the search engines and metasearch engines to be calculated for each query, using the following formulae27:

Precision = Sum of the scores of the sites retrieved by a search engine / Total number of sites retrieved

Relative recall = Total number of sites retrieved by a search engine / Sum of the sites retrieved by the two search engines
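As a worked illustration of the scoring scheme and the precision formula (a Python sketch written for this rewrite, not part of the original study), the Q#1 counts reported for Google in Table 1 reproduce the tabulated precision of 0.80:

```python
# Scores attached to the five judgement categories defined above.
SCORES = {
    "more relevant": 2.0,
    "less relevant": 1.0,
    "irrelevant": 0.0,
    "links": 0.5,   # counted as 0.5 when one or two inspected links proved useful
    "cannot be accessed": 0.0,
}

def precision(counts: dict) -> float:
    """Sum of the scores of the retrieved sites / total number of sites."""
    total = sum(counts.values())
    return sum(SCORES[category] * n for category, n in counts.items()) / total

# Google, query Q#1 (Table 1): 14 more relevant, 26 less relevant,
# 7 irrelevant, 52 links and 1 inaccessible site out of 100 evaluated.
google_q1 = {"more relevant": 14, "less relevant": 26,
             "irrelevant": 7, "links": 52, "cannot be accessed": 1}
print(precision(google_q1))  # 0.8
```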
Precision of Google

Google is the most popular search engine because it focuses on the link structure of the Web to determine relevant results for its users. In the present study, the advanced search options of Google were used for retrieving information. Foreign-language pages were often difficult to assess for relevance, so only English pages were searched for each query. The search was restricted to sites where the search query appears in the title of the web page. Since a large number of search results were retrieved, only 100 sites were selected for each query for further analysis.

Of the 1,156,733,010 sites retrieved in total, only 1,500 sites were selected for the 15 queries (100 sites for each query). Table 1 shows the total numbers of 'more relevant sites', 'less relevant sites', 'irrelevant sites', 'links' and 'sites cannot be accessed'. It is clear from the table that 33.86% of the sites are less relevant and only 18.46% are more relevant. The mean precision of Google is found to be 0.80.

Precision of Yahoo

The data regarding the information relevancy of Yahoo are given in Table 2. A total of 99,394,341 sites were retrieved for the 15 queries. Yahoo, too, retrieved mostly 'less relevant' sites (32.2%), followed by 'irrelevant' sites (25.53%); only 15.9% of the sites are 'more relevant'. Thus the mean precision of Yahoo is 0.75. The comparative precision of Google and Yahoo is shown in Fig. 1.

Precision of metasearch engines
Unlike single-source Web search engines, metasearch engines do not crawl the internet themselves to build an index of Web documents. Instead, a metasearch engine sends a query simultaneously to multiple search engines, retrieves the results from each, and combines them into a single result listing while avoiding redundancy. In effect, Web metasearch engine users are not using just one engine but many search engines at once. The ultimate purpose of a metasearch engine is to diversify query results by exploiting the innate differences between single-source Web search engines, and to provide Web searchers with the highest-ranked results from the whole collection of engines. Although one could certainly query multiple search engines by hand, a metasearch engine distils these top results automatically, giving the searcher a comprehensive set of search results within a single listing, all in real time.
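As a minimal sketch of this dispatch-merge-deduplicate step (not MetaCrawler's or Dogpile's actual algorithm; the engine names and result URLs below are hypothetical placeholders, and real metasearch engines use proprietary ranking rather than this simple round-robin interleave):

```python
from itertools import zip_longest

# Hypothetical per-engine result lists (rank-ordered URLs); a real
# metasearch engine would obtain these by querying each engine live.
results_by_engine = {
    "EngineA": ["u1", "u2", "u3"],
    "EngineB": ["u2", "u4"],
    "EngineC": ["u1", "u5", "u6"],
}

def metasearch_merge(results_by_engine: dict) -> list:
    """Interleave the engines' rankings round-robin and drop duplicates,
    so the single merged listing keeps each engine's top hits first."""
    merged, seen = [], set()
    for tier in zip_longest(*results_by_engine.values()):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(metasearch_merge(results_by_engine))
# ['u1', 'u2', 'u4', 'u5', 'u3', 'u6']
```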
Table 1 — Precision of Google

Query | Total no. of sites | Selected sites | More relevant sites | Less relevant sites | Irrelevant sites | Links | Sites cannot be accessed | Precision
Q#1 | 81,100,000 | 100 | 14 (14) | 26 (26) | 7 (7) | 52 (52) | 1 (1) | 0.80
Q#2 | 411,000,000 | 100 | 14 (14) | 28 (28) | 34 (34) | 18 (18) | 6 (6) | 0.65
Q#3 | 279,000,000 | 100 | 9 (9) | 29 (29) | 22 (22) | 40 (40) | 0 (0) | 0.67
Q#4 | 13,600,000 | 100 | 20 (20) | 28 (28) | 30 (30) | 11 (11) | 11 (11) | 0.73
Q#5 | 366,000,000 | 100 | 14 (14) | 42 (42) | 18 (18) | 24 (24) | 2 (2) | 0.82
Q#6 | 691,000 | 100 | 55 (55) | 35 (35) | 6 (6) | 0 (0) | 4 (4) | 1.45
Q#7 | 24,200 | 100 | 12 (12) | 35 (35) | 33 (33) | 12 (12) | 8 (8) | 0.65
Q#8 | 296,000 | 100 | 23 (23) | 42 (42) | 19 (19) | 14 (14) | 2 (2) | 0.95
Q#9 | 83,300 | 100 | 26 (26) | 49 (49) | 12 (12) | 10 (10) | 3 (3) | 1.06
Q#10 | 2,510 | 100 | 11 (11) | 47 (47) | 24 (24) | 10 (10) | 8 (8) | 0.74
Q#11 | 499,000 | 100 | 12 (12) | 17 (17) | 26 (26) | 37 (37) | 8 (8) | 0.59
Q#12 | 961,000 | 100 | 20 (20) | 31 (31) | 25 (25) | 22 (22) | 2 (2) | 0.82
Q#13 | 1,520,000 | 100 | 25 (25) | 31 (31) | 13 (13) | 27 (27) | 4 (4) | 0.94
Q#14 | 916,000 | 100 | 13 (13) | 23 (23) | 41 (41) | 13 (13) | 10 (10) | 0.55
Q#15 | 1,040,000 | 100 | 9 (9) | 45 (45) | 38 (38) | 3 (3) | 5 (5) | 0.64
Total | 1,156,733,010 | 1500 | 277 (18.46) | 508 (33.86) | 348 (23.2) | 293 (19.53) | 74 (4.93) | 0.80*

Note: numbers given in parentheses represent percentages. * Mean precision
Table 2 — Precision of Yahoo

Query | Total no. of sites | Selected sites | More relevant sites | Less relevant sites | Irrelevant sites | Links | Sites cannot be accessed | Precision
Q#1 | 33,100,000 | 100 | 15 (15) | 17 (17) | 18 (18) | 49 (49) | 1 (1) | 0.71
Q#2 | 31,200,000 | 100 | 10 (10) | 26 (26) | 30 (30) | 32 (32) | 2 (2) | 0.62
Q#3 | 5,840,000 | 100 | 13 (13) | 17 (17) | 25 (25) | 43 (43) | 2 (2) | 0.64
Q#4 | 139,000 | 100 | 18 (18) | 48 (48) | 9 (9) | 23 (23) | 2 (2) | 0.95
Q#5 | 20,500,000 | 100 | 11 (11) | 31 (31) | 22 (22) | 35 (35) | 1 (1) | 0.70
Q#6 | 459,000 | 100 | 13 (13) | 37 (37) | 39 (39) | 4 (4) | 7 (7) | 0.65
Q#7 | 8,100 | 100 | 20 (20) | 28 (28) | 30 (30) | 8 (8) | 14 (14) | 0.72
Q#8 | 328,000 | 100 | 15 (15) | 23 (23) | 25 (25) | 35 (35) | 2 (2) | 0.70
Q#9 | 55,500 | 100 | 25 (25) | 40 (40) | 25 (25) | 2 (2) | 8 (8) | 0.91
Q#10 | 741 | 100 | 16 (16) | 43 (43) | 26 (26) | 8 (8) | 7 (7) | 0.79
Q#11 | 263,000 | 100 | 18 (18) | 30 (30) | 29 (29) | 18 (18) | 5 (5) | 0.75
Q#12 | 422,000 | 100 | 18 (18) | 41 (41) | 17 (17) | 22 (22) | 2 (2) | 0.88
Q#13 | 6,020,000 | 100 | 19 (19) | 30 (30) | 13 (13) | 30 (30) | 8 (8) | 0.83
Q#14 | 432,000 | 100 | 12 (12) | 22 (22) | 54 (54) | 10 (10) | 2 (2) | 0.51
Q#15 | 627,000 | 100 | 16 (16) | 50 (50) | 21 (21) | 9 (9) | 4 (4) | 0.86
Total | 99,394,341 | 1500 | 239 (15.9) | 483 (32.2) | 383 (25.53) | 328 (21.86) | 67 (4.46) | 0.75*

Note: numbers given in parentheses represent percentages. * Mean precision
Fig. 1 — Precision of Google and Yahoo
Table 3 — Precision of Metacrawler

Query | Total no. of sites | More relevant sites | Less relevant sites | Irrelevant sites | Links | Sites cannot be accessed | Precision
Q#1 | 53 | 4 (7.54) | 15 (28.30) | 8 (15.09) | 26 (49.05) | 0 (0) | 0.67
Q#2 | 67 | 5 (7.46) | 13 (19.40) | 22 (32.83) | 24 (35.82) | 3 (4.47) | 0.52
Q#3 | 68 | 16 (23.52) | 42 (61.76) | 2 (2.94) | 5 (7.35) | 3 (4.41) | 1.12
Q#4 | 62 | 10 (16.12) | 27 (43.54) | 6 (9.67) | 17 (27.41) | 2 (3.22) | 0.89
Q#5 | 65 | 14 (21.53) | 35 (53.84) | 8 (12.30) | 7 (10.76) | 1 (1.53) | 1.02
Q#6 | 56 | 5 (8.92) | 21 (37.5) | 10 (17.85) | 17 (30.35) | 3 (5.35) | 0.70
Q#7 | 80 | 16 (20) | 42 (52.5) | 12 (15) | 10 (12.5) | 0 (0) | 0.98
Q#8 | 85 | 10 (11.76) | 21 (24.70) | 26 (30.58) | 28 (32.94) | 0 (0) | 0.64
Q#9 | 105 | 29 (27.61) | 25 (23.8) | 19 (18.09) | 30 (28.57) | 2 (1.9) | 0.93
Q#10 | 77 | 8 (10.38) | 9 (11.68) | 45 (58.44) | 15 (19.48) | 0 (0) | 0.42
Q#11 | 72 | 10 (13.88) | 22 (30.55) | 18 (25) | 15 (20.83) | 7 (9.72) | 0.68
Q#12 | 55 | 9 (16.36) | 23 (41.81) | 5 (9.09) | 15 (27.27) | 3 (5.45) | 0.88
Q#13 | 66 | 23 (34.84) | 29 (43.93) | 3 (4.54) | 11 (16.66) | 0 (0) | 1.21
Q#14 | 49 | 8 (16.32) | 25 (51.02) | 7 (14.28) | 7 (14.28) | 2 (4.08) | 0.90
Q#15 | 64 | 15 (23.43) | 24 (37.5) | 21 (32.81) | 4 (6.25) | 0 (0) | 0.87
Total | 1,024 | 182 (17.71) | 373 (36.42) | 212 (20.7) | 231 (22.55) | 26 (2.54) | 0.83*

Note: numbers given in parentheses represent percentages. * Mean precision
Table 4 — Precision of Dogpile

Query | Total no. of sites | More relevant sites | Less relevant sites | Irrelevant sites | Links | Sites cannot be accessed | Precision
Q#1 | 53 | 11 (20.75) | 25 (47.16) | 2 (3.72) | 15 (28.30) | 0 (0) | 1.02
Q#2 | 67 | 16 (23.88) | 36 (53.73) | 8 (11.94) | 7 (10.44) | 0 (0) | 1.06
Q#3 | 67 | 15 (22.38) | 34 (50.74) | 9 (13.43) | 6 (8.95) | 3 (4.47) | 1.00
Q#4 | 62 | 14 (22.58) | 35 (56.45) | 10 (16.12) | 2 (3.22) | 1 (1.61) | 1.03
Q#5 | 67 | 17 (25.37) | 42 (62.68) | 7 (10.44) | 1 (1.49) | 0 (0) | 1.14
Q#6 | 40 | 6 (15) | 12 (30) | 9 (22.5) | 10 (25) | 3 (7.5) | 0.72
Q#7 | 78 | 10 (12.82) | 39 (50) | 9 (11.53) | 15 (19.23) | 5 (6.41) | 0.85
Q#8 | 68 | 5 (7.35) | 20 (29.41) | 24 (35.29) | 19 (27.94) | 0 (0) | 0.58
Q#9 | 106 | 23 (21.69) | 32 (30.18) | 21 (19.81) | 27 (25.47) | 3 (2.83) | 0.86
Q#10 | 64 | 11 (17.18) | 18 (28.12) | 15 (23.43) | 19 (29.68) | 1 (1.56) | 0.77
Q#11 | 72 | 9 (12.5) | 22 (30.55) | 21 (29.16) | 18 (25) | 2 (2.77) | 0.68
Q#12 | 53 | 6 (11.32) | 19 (35.84) | 4 (7.54) | 20 (37.73) | 4 (7.54) | 0.77
Q#13 | 65 | 12 (18.46) | 26 (40) | 18 (27.69) | 9 (13.84) | 0 (0) | 0.83
Q#14 | 74 | 17 (22.97) | 39 (52.70) | 7 (9.45) | 9 (12.16) | 2 (2.70) | 1.04
Q#15 | 63 | 10 (15.87) | 26 (41.26) | 8 (12.69) | 10 (15.87) | 9 (14.28) | 0.80
Total | 999 | 182 (18.21) | 425 (42.54) | 172 (17.21) | 187 (18.71) | 33 (3.3) | 0.88*

Note: numbers given in parentheses represent percentages. * Mean precision
In the present study two metasearch engines, viz. Metacrawler and Dogpile, were used to study their recall and precision. Since Metacrawler and Dogpile retrieved very few sites for the 15 search queries, all retrieved sites were selected for the study.
Precision of Metacrawler

MetaCrawler was originally developed in 1994 at the University of Washington by the then graduate student Erik Selberg and Associate Professor Oren Etzioni. The site joined the InfoSpace Network in 2000 and is owned and operated by InfoSpace, Inc. MetaCrawler uses several of the Internet's search engines, including Google, Yahoo! Search, MSN Search, Ask Jeeves, About, MIVA, LookSmart and more. With a single click, MetaCrawler draws the best results from the combined pool of the world's leading search engines, instead of results from only one single search engine. Table 3 shows the search results of Metacrawler. In total 1,024 sites were retrieved, of which 36.42% are 'less relevant', followed by 'links' (22.55%); only 17.71% of the sites are 'more relevant', and thus the mean precision of Metacrawler is 0.83.
Precision of Dogpile
Dogpile is a relatively new metasearch engine which searches Web sites, images, audio and video files, yellow pages, etc. It also brings together the results from some of the Internet's popular search engines, including Google, Yahoo! Search, Live Search, Ask.com, About, MIVA, LookSmart, and more. The search results of Dogpile are presented in Table 4. It is clear from the results of the study that Dogpile retrieved 42.54% 'less relevant' sites and 18.71% 'links'; only 18.21% of the sites are 'more relevant'. The mean precision of Dogpile is 0.88.
Table 5 — Relative recall of Google and Yahoo

Query | Google: total no. of sites | Google: relative recall | Yahoo: total no. of sites | Yahoo: relative recall
Q#1 | 81,100,000 | 0.71 | 33,100,000 | 0.28
Q#2 | 411,000,000 | 0.92 | 31,200,000 | 0.07
Q#3 | 279,000,000 | 0.97 | 5,840,000 | 0.02
Q#4 | 13,600,000 | 0.98 | 139,000 | 0.01
Q#5 | 366,000,000 | 0.94 | 20,500,000 | 0.05
Q#6 | 691,000 | 0.60 | 459,000 | 0.39
Q#7 | 24,200 | 0.74 | 8,100 | 0.25
Q#8 | 296,000 | 0.47 | 328,000 | 0.52
Q#9 | 83,300 | 0.60 | 55,500 | 0.39
Q#10 | 2,510 | 0.77 | 741 | 0.22
Q#11 | 499,000 | 0.65 | 263,000 | 0.34
Q#12 | 961,000 | 0.69 | 422,000 | 0.30
Q#13 | 1,520,000 | 0.20 | 6,020,000 | 0.79
Q#14 | 916,000 | 0.67 | 432,000 | 0.32
Q#15 | 1,040,000 | 0.62 | 627,000 | 0.37
Total | 1,156,733,010 | 0.92* | 99,394,341 | 0.07*

* Mean relative recall
Table 6 — Relative recall of Metacrawler and Dogpile

Query | Metacrawler: total no. of sites | Metacrawler: relative recall | Dogpile: total no. of sites | Dogpile: relative recall
Q#1 | 53 | 0.50 | 53 | 0.50
Q#2 | 67 | 0.50 | 67 | 0.50
Q#3 | 68 | 0.50 | 67 | 0.49
Q#4 | 62 | 0.50 | 62 | 0.50
Q#5 | 65 | 0.49 | 67 | 0.50
Q#6 | 56 | 0.58 | 40 | 0.41
Q#7 | 80 | 0.50 | 78 | 0.49
Q#8 | 85 | 0.55 | 68 | 0.44
Q#9 | 105 | 0.49 | 106 | 0.50
Q#10 | 77 | 0.54 | 64 | 0.45
Q#11 | 72 | 0.50 | 72 | 0.50
Q#12 | 55 | 0.50 | 53 | 0.49
Q#13 | 66 | 0.50 | 65 | 0.49
Q#14 | 49 | 0.39 | 74 | 0.60
Q#15 | 64 | 0.50 | 63 | 0.49
Total | 1,024 | 0.50* | 999 | 0.49*

* Mean relative recall
Fig. 2 — Precision of Metacrawler and Dogpile

Fig. 3 — Relative recall of Google and Yahoo
Relative recall of search engines

The term 'recall' refers to a measure of whether or not a particular item is retrieved, or the extent to which the retrieval of wanted items occurs. Recall is thus the ability of a retrieval system to obtain all or most of the relevant documents in the collection.
The relative recall can be calculated using the following formula28:

Relative recall = Total number of sites retrieved by a search engine / Sum of the sites retrieved by the two search engines
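A quick check of this formula against the Q#1 figures in Table 5 (a Python sketch added for illustration) reproduces the tabulated values:

```python
def relative_recall(own_hits: int, other_hits: int) -> float:
    """Sites retrieved by one engine / sum of sites retrieved by both engines."""
    return own_hits / (own_hits + other_hits)

# Q#1 (Table 5): Google retrieved 81,100,000 sites and Yahoo 33,100,000.
print(round(relative_recall(81_100_000, 33_100_000), 2))  # 0.71 -> Google
print(round(relative_recall(33_100_000, 81_100_000), 2))  # 0.29 (Table 5 truncates to 0.28)
```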
Relative recall of Google and Yahoo

The relative recall of Google and Yahoo is calculated and presented in Table 5. It is evident from the table that the overall relative recall of Google is 0.92 while that of Yahoo is 0.07. In the case of Google, search query 4 has the highest recall value (0.98), followed by search query 3 (0.97), and the lowest recall is for search query 1 (0.71). In the case of Yahoo, the highest recall is for search query 13 (0.79) and the lowest is for search query 4 (0.01).

Relative recall of metasearch engines

The relative recall of Metacrawler and Dogpile is also calculated and presented in Table 6. The table clearly shows that the overall relative recall of Metacrawler is 0.50 and that of Dogpile is 0.49.
In the case of Metacrawler, search query 6 has the highest recall value (0.58) and the lowest is for search query 14 (0.39); in the case of Dogpile, the highest recall is for search query 14 (0.60) and the lowest is for search query 6 (0.41).
Conclusion
Today, search engines are the most effective searching tools for millions of users throughout the world to access information on various topics and to keep up with the latest news. Even though search engines retrieve an enormous volume of information at impressive speed, the results they retrieve may not be relevant. In this context, the results of this study clearly show that neither the search engines nor the metasearch engines retrieved predominantly 'more relevant' information from the World Wide Web. Even though the metasearch engines retrieved fewer sites for all search queries, their mean precision is comparatively high compared with that of the search engines; the search engines thus did not achieve higher precision than the metasearch engines. However, despite the theoretical advantage of searching the databases of several individual search engines, the metasearch engines did not achieve higher recall.
References
1. Johnson D, Malhotra V and Vamplew P, More effective web
search using bigrams and trigrams, Webology, 3 (4) (2006),
Available at: http://www.webology.ir/2006/v3n4/a35.html
(Accessed on 17 May 2010).
2. Cole J I, Suman M, Schramm P, Lunn R and Aquino J S, The
UCLA internet report surveying the digital future year three,
(2003) Available at: http://www.digitalcenter.org/pdf/
InternetReportYearThree.pdf (Accessed on 17 May 2010).
3. Sullivan D, Nielsen NetRatings search engine ratings, (2006) Available at: http://searchenginewatch.com/2156451 (Accessed on 17 May 2010).
4. Jansen B J and Spink A, An analysis of Web searching by
European AlltheWeb.com users, Information Processing and
Management, 41 (6) (2004) 361–381.
5. Jansen B J, Spink A and Saracevic T, Real life, real users, and real
needs: A study and analysis of user queries on the Web,
Information Processing and Management, 36 (2) (2000) 207–
227.
6. Hölscher C and Strube G, Web search behavior of Internet experts
and newbies, International Journal of Computer and
Telecommunications Networking, 33 (1-6) (2000) 337–346.
7. Jansen B J and Pooch U, Web user studies: A review and
framework for future work, Journal of the American Society of
Information Science and Technology, 52 (3) (2001) 235–246.
8. Montgomery A and Faloutsos C, Identifying Web browsing
trends and patterns, IEEE Computer, 34 (7) (2001) 94–95.
9. Chowdhury A and Soboroff I, Automatic evaluation of World
Wide Web search services, In Proceedings of the 25th annual
international ACM SIGIR conference, Finland, 11-15 August
(2002) 421-422.
10. Spink A, Ozmutlu S, Ozmutlu H C and Jansen B J, U.S. versus
European Web searching trends, SIGIR Forum, 32 (1) (2002)
30–37.
11. Can F, Nuray R and Sevdik B A, Automatic performance
evaluation of web search engines, Information Processing and
Management, 40 (3) (2004) 495-514.
12. Wang P, Berry M and Yang Y, Mining longitudinal Web queries:
Trends and patterns, Journal of the American Society for
Information Science and Technology, 54 (8) (2003) 743–758.
13. Jansen B J and Spink A, An analysis of Web information seeking
and use: documents retrieved versus documents viewed. In
IC’03: Proceedings of the 4th International Conference on
Internet Computing, Las Vegas, Nevada, 23-26 June 2003, 65-69.
14. Shafi S M and Rather R A, Precision and recall of five search
engines for retrieval of scholarly information in the field of
biotechnology, Webology, 2 (2) (2005), Available at: http://
www.webology.ir/2005/v2n2/a12.html (Accessed on 17 May
2010).
15. Koshman S, Spink A and Jansen B J, Web searching on the
Vivisimo search engine, Journal of the American Society for
Information Science and Technology, 57 (14) (2006) 1875–1887.
16. Singh R, Performance of World Wide Web search engines: A
comparative study, Library Herald, 44 (4) (2006) 337.
17. Biradar B S, and Sampath Kumar B T, Internet search engines:
A comparative study and evaluation methodology, SRELS
Journal of Information Management, 43 (3) (2006) 231-241.
18. Jansen B J and Molina P R, The effectiveness of web search
engines for retrieving relevant ecommerce links, Information
Processing and Management, 42 (2006) 1075–1098.
19. Jansen B J, Adversarial information retrieval aspects of
sponsored search. In 2nd International Workshop on Adversarial
Information Retrieval on the Web (AIRWeb06). The 29th Annual
International ACM SIGIR Conference on Research and
Development on Information Retrieval (SIGIR06) (2006).
20. Lewandowski D, Wahlig H and Meyer-Bautor G, The freshness
of web search engine databases, Journal of Information Science,
32 (2) (2006) 131-148.
21. Lewandowski D, A three-year study on the freshness of web
search engine databases, Journal of Information Science, 34 (6)
(2008) 817-831
22. Thelwall M, Quantitative comparisons of search engine results,
Journal of the American Society for Information Science and
Technology, 59 (11) (2008) 1702-1710
23. Höchstötter N and Lewandowski D, What users see - Structures
in search engine results pages, Information Sciences: an
International Journal, 179 (12) (2009) 1796-1812
24. Uyar A, Investigation of the accuracy of search engine hit counts,
Journal of Information Science, 35 (4) (2009) 469-480
25. Shafi S M and Rather R A (2005), Op cit
26. Clarke S and Willett P, Estimating the recall performance of
search engines, ASLIB Proceedings, 49 (7) (1997) 184-189.
27. Ibid, 184-189
28. Ibid, 184-189
Appendix 1 — Search queries

i) Simple one-word queries
Q#1: Encyclopedia
Q#2: Computer
Q#3: Multimedia
Q#4: Hypothesis
Q#5: Database

ii) Simple multi-word queries
Q#6: Digital library
Q#7: Library automation
Q#8: Internet resources
Q#9: Intellectual property rights
Q#10: What is search engine

iii) Complex multi-word queries
Q#11: Designing of Library building
Q#12: Policies of Collection development
Q#13: Evaluation of Web sites
Q#14: Internet and Web designing
Q#15: Evaluation of Digital library