Evaluation of a Semantic Search Engine
against a Keyword Search Engine Using First
20 Precision
1Martin O. Andago, 2Teh P.L. Phoebe, 3Bassam A.M. Thanoun
Faculty of Management and Information Technology, I.T Department, UCSI University
[email protected]
Abstract
The growth of the Semantic Web has resulted in the development of Semantic applications such
as search engines. The continuous development of these applications has prompted many
researchers to declare that the Semantic Web is a solution to the many problems faced by the
current World Wide Web. However, there is a lack of proper evidence to prove that the
Semantic Web is actually superior to the WWW. In our research, we collected queries from 30
university students and entered these queries into two search engines: Google – the most widely
used search engine and Hakia – an upcoming Semantic search engine. Precision was thereafter
calculated using a pre-determined formula. Our calculations revealed that Google outperforms
Hakia, with a higher mean precision of 0.64 compared to Hakia's 0.54. Google also has a
lower standard deviation of 0.14 compared to Hakia's 0.25. The results show that Google,
which is a keyword search engine, is superior to Hakia, a Semantic search engine, in terms of
the first 20 precision. While this may change in the future given the current rate of
advancement of the Semantic Web, it holds true for now.
Keywords: Semantic Web, First 20 Precision, Search Engine Evaluation
1. INTRODUCTION
The presence of large amounts of information on the World Wide Web and the problems
associated with searching for information prompted researchers and software developers to come
up with a new form of Web technology in order to keep up with the changing times and ensure
that the numerous problems witnessed such as query formation and information overload become
a thing of the past. Thus, the Semantic Web, also known as Web 3.0, was developed.
According to Ding et al. (2005), the Semantic Web provides a way to encode information and
knowledge on web pages in a manner that is easier for computers to understand and process.
This implies that the Semantic Web simplifies the process of searching for and finding the
information users need on the World Wide Web.
It is important to note that in order for human beings to retrieve information from the World
Wide Web, the computers or other devices they are using must not only contain the
information but also be able to understand it and relate it to similar topics. This is the only
way to ensure that people searching for information will find exactly what they are looking
for. Accurate search has remained difficult because of the vast amount of information and
resources on the Web. This shows that although conventional search engines work and are
used by millions of people daily, they still have flaws that can potentially be addressed by
Semantic search engines.
We are slowly but steadily shifting away from first-generation Semantic Web applications
towards a new generation of applications designed to exploit the large amounts of semantic
markup that are increasingly becoming available. In other words, the applications currently
being built aim to utilize the vast amounts of semantic information available on the World
Wide Web.
Semantic search engines, an example of Semantic Web applications, are used by people from all
walks of life to gather information ranging from health issues to social and political concerns.
Anderson and Whitelock (2004) predicted that the concepts and theories behind the Semantic
Web could provide an opportunity to expand the scope of, and the ability to provide, learning
opportunities “unbounded by geographic, temporal, or economic distance.”
2. PROBLEM STATEMENT
According to Albertoni et al. (2004), the Semantic Web was proposed as a solution to
problems such as information overload and info-smog, which make web content inaccessible
to some users. Over the years, it has been developed as an alternative to the traditional World
Wide Web. Those promoting it have gone as far as suggesting that it solves the current
problems facing the WWW, which Plessers and Troyer (2004) identified as restricted query
possibilities, query refinement and information overload.
However, it must be noted with concern that the Semantic Web has, to date, never been
directly compared with the World Wide Web. By extension, applications developed using the
Semantic Web have also not been compared with applications developed before it. Therefore,
the notion that the Semantic Web solves the many problems faced by the World Wide Web
remains a claim that has yet to be verified.
The overall aim of this project is to provide an authoritative point of view with regard to the
user effort required to obtain hits using a web-based Semantic search engine. The project
compares Hakia, a Semantic search engine, with Google, a keyword search engine, by
calculating the first 20 precision and determining which of the two scores higher.
3. METHODOLOGY
While carrying out an evaluation of web-based search engines using user-effort measures,
Tang and Sun (2003) devised a method that was quick and convenient yet still provided
useful information. They applied three user-effort-sensitive evaluation measures, namely
“first 20 full precision”, “search length” and “rank correlation”. In the authors’ view, these
measures were better alternatives to precision and recall in Web search situations, as some
characteristics of web searching require performance criteria other than the traditional
methods employed (Clarke and Willett, 1997).
In the study mentioned above, the authors collected queries from the users and submitted
them to the search engines. A common environment was provided to ensure that computers
with the same specifications and within the same LAN were used. The results were copied
into a Microsoft Word file, and the people who submitted the queries were allowed to
examine them by clicking on the URLs presented. We therefore decided to adopt the strategy
followed by Tang and Sun (2003) in our attempt to evaluate Semantic search engines using
user-effort measures.
Hakia was selected as the Semantic search engine to be used for the experiment. A study
commissioned by Hakia (company.hakia.com) found that, even in its beta state, seventeen
percent (17%) of users said hakia.com was better overall than their favorite search engine;
twenty-three percent (23%) would use hakia.com exclusively or most of the time; and
fifty-eight percent (58%) said they would recommend hakia.com to friends. In our view,
Hakia has positioned itself strategically and will be competing with those in the ‘big league’
such as Google and Yahoo.
Google was selected as the keyword search engine because it is generally considered the most
popular search engine in the world. A global search survey conducted by comScore showed
that, in 2007, Google was the most popular search engine worldwide, handling roughly 60%
of searches (www.comscore.com).
Figure 1: The formula used to calculate First 20 Precision.
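The original formula image could not be recovered from the source. Based on the description in this section and on the standard definition of the first 20 (full) precision measure in Tang and Sun (2003), the formula is presumably:

```latex
\text{First 20 Precision} = \frac{\text{number of hits among the first 20 results judged relevant}}{20}
```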
Queries were collected from 30 university students who were asked to rank the top 20 hits
according to what they felt was relevant to the information they were searching for. The first 20
precision was thereafter calculated using the formula shown above.
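As an illustration, the following is a minimal sketch, not the authors' actual procedure, of how the first 20 precision could be computed from one respondent's binary relevance judgements; the judgement values below are hypothetical:

```python
# A minimal sketch of computing first 20 precision from one
# respondent's binary relevance judgements
# (1 = relevant, 0 = irrelevant or dead link).

def first_20_precision(judgements):
    """Return the fraction of the first 20 hits judged relevant."""
    top_20 = judgements[:20]
    return sum(top_20) / 20

# Hypothetical judgements for one query: 13 of 20 hits were relevant.
judgements = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0,
              1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
print(first_20_precision(judgements))  # 0.65
```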
4. ANALYSIS
From the table below, we see that the minimum precision for Google was 0.25 and the
maximum was 0.90. For Hakia, the minimum precision was 0.02 and the maximum was 1.00.
Table 1: Mean, standard deviation and variance of the first 20 precision.

                              Min    Max    Mean    Std. Dev.   Var
First_20_Precision_Google     0.25   0.90   0.6380  0.14471     0.021
First_20_Precision_Hakia      0.02   1.00   0.5380  0.25022     0.063
The mean column presents some interesting information. Notice that the mean first 20
precision for Google (0.638) is higher than that of Hakia (0.538).

Equally important is the standard deviation column, which shows that Hakia has a wider
standard deviation (0.25022) than Google (0.14471). This implies that Hakia's precisions are
spread over a wider range of values around the mean than Google's are. Statistically, Google's
lower standard deviation tells us that its precisions are tightly clustered around the mean,
which is not the case with Hakia.
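For readers wishing to reproduce the summary statistics, the following minimal sketch (with hypothetical per-query scores, not the study's raw data) shows how the mean, sample standard deviation and variance in Table 1 relate; the variance is simply the square of the standard deviation (e.g. 0.14471² ≈ 0.021 for Google):

```python
# A minimal sketch of computing the Table 1 statistics.
import statistics

# Hypothetical per-query precision scores (30 would be used in the study).
scores = [0.65, 0.70, 0.55, 0.90, 0.25, 0.60, 0.75, 0.70]

mean = statistics.mean(scores)
std_dev = statistics.stdev(scores)      # sample standard deviation
variance = statistics.variance(scores)  # equal to std_dev ** 2

print(f"mean={mean:.4f}  std_dev={std_dev:.4f}  var={variance:.4f}")
```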
The results clearly show that Google, a keyword search engine, is superior to Hakia, a
Semantic search engine, in terms of the first 20 precision. While this may change in the
future, it holds true for now, as evidenced by the results shown above.
Figure 2: Visual Representation of Google First 20 Precision.
The diagram above presents a graphic summary of the precisions from all 30 cases that were
analyzed. Notice that while the precision fluctuates from case to case, the fluctuations are not
extreme, save for one or two instances.

The minimum precision is seen to be above 0.25 while the maximum is about 0.9. Also note
that the majority of the precision values lie between 0.65 and 0.75, suggesting that the mean
precision lies in that neighbourhood.
Figure 3: Visual Representation of Hakia First 20 Precision.
The diagram above presents a graphic summary of the precisions from all 30 Hakia cases that
were analyzed. There is a clear and visible difference between this diagram and the
equivalent Google one in that the fluctuations here are greater. Hakia's precisions are
extreme, ranging from 1.00 all the way down to 0.02. This highlights the point that even
though in some cases Hakia returns exactly what the searcher is looking for, there are a few
instances where the hits are totally irrelevant and do not correspond with what the user was
looking for.
Figure 4: Google and Hakia First 20 Precision superimposed.

The diagram above shows the Google and Hakia graphs superimposed on one another rather
than shown separately as before. Different colours are used to bring out the disparity between
the two search engines.
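A minimal sketch of how such a superimposed plot could be produced, assuming matplotlib is available and using hypothetical placeholder scores rather than the study's data, is shown below:

```python
# A minimal sketch of superimposing the two precision series
# in different colours, as in the combined figure.
import matplotlib.pyplot as plt

google_scores = [0.65, 0.70, 0.55, 0.90, 0.25, 0.60]  # hypothetical per-case P@20
hakia_scores = [1.00, 0.40, 0.02, 0.75, 0.55, 0.60]   # hypothetical per-case P@20

cases = range(1, len(google_scores) + 1)
plt.plot(cases, google_scores, color="blue", label="Google First 20 Precision")
plt.plot(cases, hakia_scores, color="red", label="Hakia First 20 Precision")
plt.xlabel("Case")
plt.ylabel("First 20 Precision")
plt.legend()
plt.show()
```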
5. EVALUATION OF RESULTS
There were two main concerns raised during the experiment that affect the results obtained:
paid-for ranking and dead links.
Paid-for ranking enables a company's website to be ranked higher among a search engine's
hits than it otherwise would be. This means it is possible for a website to be included in the
first 20 hits not because it is highly relevant and many people have visited it, but because the
organization was willing to part with a sum of money. While this can affect the precision
calculation, there is often no way of knowing why a particular website is ranked highly, an
obvious limitation of the experiment. However, we took the view that the possibility of one
or two websites appearing in the top twenty hits because of paid-for ranking rather than
relevant content does not materially affect the calculation of precision. In short, the inclusion
of paid-for ranking hits in the top 20 was not considered a critical factor in the calculation of
precision.
The second concern raised was dead links. These are links that, when clicked, return an error
message stating that the website or webpage is unavailable for one reason or another. While it
is rare to come across dead links within the first 20 hits, it did occur in one or two instances.
Our approach to this problem was simple: respondents were instructed to automatically
assign a score of zero to any dead link. While there might be a plausible reason why a link is
dead, the fact remains that a dead link is not what the respondent was looking for, and a score
of zero therefore seemed appropriate.
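Although respondents judged links manually in this study, dead links could in principle be detected automatically and scored zero. The following minimal sketch, not part of the original study and using only standard-library calls with a hypothetical scoring helper, illustrates the idea:

```python
# A minimal sketch of detecting dead links automatically and
# assigning them a relevance score of zero.
import urllib.error
import urllib.request

def is_dead(url, timeout=10):
    """Return True if the URL cannot be fetched successfully."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return False
    except (urllib.error.URLError, ValueError, TimeoutError):
        # HTTP errors, unreachable hosts, malformed URLs and
        # timeouts are all treated as dead links.
        return True

def score_hit(url, user_judgement):
    """Dead links automatically receive a relevance score of zero."""
    return 0 if is_dead(url) else user_judgement
```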
6. CONCLUSION
Years ago, Tim Berners-Lee envisioned a universal medium for data, information and
knowledge exchange known as the Semantic Web. To him, it was possible to create a Web
that was readable and understandable by both humans and machines. Needless to say, his
vision became a reality and he became known to many as the father of the Semantic Web.

The growth of the Semantic Web has greatly impacted information technology and changed
the face of e-commerce and web-based research. Numerous applications incorporating
semantic technology have been developed over the years. One key example is Semantic
search engines, which were built in response to the problems associated with searching for
information on the World Wide Web using traditional keyword search engines. With the
emphasis placed on the Semantic Web, and with analysts predicting that it is the future of the
World Wide Web, the question is: just how good and reliable is it?
The results of the experiment showed that Google is superior, with a mean precision of 0.638
and a standard deviation of 0.145, compared to Hakia, whose mean precision was 0.538 with
a standard deviation of 0.25. Hakia's precisions are spread out, with a maximum of 1.0 and a
minimum of 0.02, a very wide range that explains the higher standard deviation from the
mean. Google's precisions are generally clustered together, with a maximum of 0.9 and a
minimum of 0.25.
While it is possible that, given time, Hakia's Semantic search will pick up and fare better than
Google in terms of the first 20 precision, it remains clear for now that Google has the upper
hand. Researchers and analysts should therefore not jump to conclusions by claiming that
semantic technology is the technology of the future, but should wait until this has been
proven before promoting it.
7. FUTURE WORK
While this research has been carried out to standards deemed acceptable and reliable, there
are a few ways in which it can be improved in the future:
i) Selection of respondents – while respondents were selected at random in this study, we
recommend that future studies select them by major, i.e. the course they study. Respondents
would be grouped into, say, social sciences, health sciences and technology. The researcher
would then be able to determine whether there is a difference between the keywords used by
students who are more technology-savvy and whether this affects the calculation of precision.
We foresee a situation where information technology students use unique keywords and thus
obtain a higher precision rating than students from other faculties. This data could thereafter
be used to calculate a more accurate mean precision.
ii) Widening the scope – instead of evaluating just two search engines, we recommend
evaluating several, perhaps one Semantic search engine against three keyword search engines
and vice versa, and widening the number of subjects in the study to over 100. The
information derived from the analysis of such an experiment would be rich and very helpful
for comparing the different types of search engines.
8. REFERENCES
[1] Albertoni, R., Bertone, A., & De Martino, M. (2004). “Semantic Web and Information
Visualization”, Proceedings of the 1st Italian Workshop on Semantic Web Applications
and Perspectives, DEIT, pp. 108-114, Ancona, Italy, December 10, 2004.
[2] Anderson, T., & Whitelock, D. (2004). The Educational Semantic Web: Visioning and
Practicing the Future of Education. Volume 1.
[3] Clarke, S., & Willett, P. (1997). “Estimating the recall performance of search engines”.
ASLIB Proceedings, 49(7), 184-189.
[4] Cleverdon, C.W., Mills, J., & Keen, E.M. (1966). “An Inquiry in Testing of Information
Retrieval Systems”, Aslib Cranfield Research Project, College of Aeronautics, Cranfield,
United Kingdom.
[5] Ding, L., Finin, T., Joshi, A., Peng, Y., Pan, R., Reddivari, P., & Kolari, P. (2005).
“Finding and Ranking Knowledge on the Semantic Web”, 4th International Semantic
Web Conference, November 6-10, 2005, Galway, Ireland.
[6] Plessers, P., & Troyer, O. (2004). “Web Design for the Semantic Web”, Workshop on
Application Design, Development and Implementation Issues in the Semantic Web, May
18, 2004, New York, USA.
[7] Tang, M.-C., & Sun, Y. (2003). “Evaluation of Web-Based Search Engines Using User
Effort Measures”. Library and Information Science Research Electronic Journal, 13(2).
9. APPENDIX
Table 2: A summary of the analysis phase showing the most important information.