Proceedings of the 32nd Hawaii International Conference on System Sciences - 1999
“ARCHIMIDES”: An Intelligent Agent
for Adaptive - Personalized Navigation within a WEB Server
Nikos Bogonikolos
ZEUS Consulting S.A.
Trade Center - Georgiou A’
Square & Riga Feraiou 93
Patra 26221 GREECE
+30 61 622655
[email protected]
Dimitris Fragoudis*
Computer Engineering &
Informatics Department
University of Patras
26500 GREECE
+30 61 622655
[email protected]
Spiros Likothanassis
Computer Engineering &
Informatics Department
University of Patras
26500 GREECE
+30 61 997755
[email protected]
* Contact Person: Dimitris Fragoudis
Abstract
With the explosive growth of the Internet and of the volume of
information published on it, searching for and retrieving
desired information has become practically impossible if
its source is not known in advance. This is why search
engines have emerged, aiming to relieve the user of the
“lost in hyperspace” feeling and of information overload.
Imagine, however, cases where a query to a search engine
returns hundreds of thousands of URLs (Uniform Resource
Locators). With such a number of URLs, search engines
become inefficient in practice, considering that navigating
through even a few dozen URLs is tiring and time
consuming. Thus, instead of trying to address the
information overload problem with search engines and
robots (spiders), we believe that each server should itself
facilitate the retrieval of the information published on its
own domain. In this paper we present “Archimides”, an
intelligent agent that aims to provide intelligent, adaptive
and personalized navigation within a WEB server. Given a
subset of the set of keywords that characterize the server’s
contents, Archimides performs an intelligent information
retrieval and then constructs a personalized version of the
server in the form of an index to the pages that are of
interest to the user. This index does not resemble what
search engines return as the result of a query; it can rather
be regarded as a much shorter version of the WEB server,
with links that are dynamically inserted or deleted
according to the user’s interests, preferences and behavior,
which gives Archimides its adaptivity. As a result the user
navigates in a WEB server whose entire content may be of
interest to him or her, and is thus relieved of undesired
information overload.
1. Introduction
With the explosive growth of the Internet (it is estimated
that there are currently 13 million hosts in operation, with
this number doubling every year [2]) and of the volume of
information published on it, searching for and retrieving
desired information has become practically impossible if
its source is not known in advance. This is why search
engines have emerged, aiming to help people find the
information they want. Given some keywords that
characterize the information sought, they present a set of
URLs that possibly contain it. To this end they use a kind
of smart indexing, where every URL is characterized by a
set of keywords, together with sophisticated information
retrieval methods. Imagine, however, cases where a query
to a search engine returns hundreds of thousands of URLs.
With such a number of URLs, search engines become
inefficient in practice, considering that navigating through
even a few dozen URLs is tiring and time consuming. At
the same time, after a proposed hyperlink has been
followed, the dominant method of searching and exploring
all this information is today the “direct manipulation
method”. The current interface structure of many WEB
browsers, as well as the manner in which documents are
organized within a WEB server, encourages depth-first
search, since every time one descends a level, the choices
for the next lower level are immediately displayed. To
explore other choices at the same level, the user first has to
return to the previous level, which is a two-step process in
the interface. Thus, since users usually explore in a
relatively indirect fashion, they tend to follow links
downwards, depth first. After a while this tendency leads
the user to a very deep stack of previously chosen
documents and to a “lost in hyperspace” feeling. At this
point intelligent agent technology is introduced, aiming to
improve the quality of the services provided, to save some
precious user time and probably to preserve his or her
mental sanity.
0-7695-0001-3/99 $10.00 (c) 1999 IEEE
2. Related Work
It is not within the scope of this paper to examine the
notion of intelligent agents. Many researchers have
already dealt with this matter, from Wooldridge and
Jennings [7] to Maes [4], and each has given a definition
based on his or her own point of view. One thing, however,
is certain: intelligent agents have already been widely used
to help people in their “battle” against information
overload.
Most similar to our work is SiteHelper [6], which
follows the same philosophy, “let servers do their own
housekeeping”. SiteHelper, however, acts much more like
a sophisticated search engine within the domain of a Web
server. It incorporates incremental machine learning
capabilities to help the user explore the Web. It first learns
the user’s areas of interest by analyzing the user’s visit
records, and then assists the user in retrieving information
by providing updated information about the Web site.
WebWatcher [1] and Letizia [3] belong to another
category of agents, which recommend to the user which
links to follow. Letizia is a user interface agent that assists
a user browsing the Web. While the user navigates, Letizia
tracks his or her behavior, extracts the user’s preferences,
explores autonomously with a best-first, breadth-first
strategy and makes recommendations upon request.
WebWatcher, on the other hand, is a goal-driven
Web-interface agent that is incorporated into Web pages
and recommends hyperlinks that should be followed in
order to achieve a preset goal. Both Letizia and
WebWatcher incorporate incremental learning, although
in very different fashions.
3. Archimides
In this paper we present “Archimides”, an intelligent
agent that aims to provide intelligent, adaptive and
personalized navigation within a WEB server. Given a
subset of the set of keywords that characterize the server’s
contents, Archimides performs an intelligent information
retrieval and then constructs a personalized version of the
server in the form of an index to the pages that are of
interest to the user. This index does not resemble what
search engines produce as the result of a query; it can
rather be regarded as a much shorter version of the WEB
server, with links that are dynamically inserted or deleted
according to the user’s interests, preferences and behavior,
which gives Archimides its adaptivity. As a result the user
navigates in a WEB server whose entire content may be of
interest to him or her, and is thus relieved of undesired
information overload.
3.1. Setting preferences
One of the main problems we had to address was how a
user should set his or her preferences. There are currently
many techniques for expressing documents as sets of
keywords (the Vector Space Model is just one of them) and
many others for weighting these keywords, such as the
TFIDF measure [5]. However, all these techniques are used
mainly when we face unknown documents and have to
extract their meaning. In our case we deal with known
documents, and we believe that in such a case it is more
accurate to define each page’s keywords ourselves. Thus
some keywords are assigned to every page, and the set of
all these keywords represents the server’s content area.
When the user decides to use Archimides, he or she has to
express interest in some of these keywords in order to
proceed to what we call “server customization”.
3.2. Server customization
Let us assume the following structure of the WEB server:
Image 1. Structure of the server
The shaded nodes contain information the user is
interested in, with regard to the preferences he or she has
submitted.
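As an illustration only, the marking of nodes could be sketched as follows. The paper does not state the exact matching rule, so this sketch assumes that a page is of interest when it shares at least one keyword with the user's selection; the page table, the keywords and the name interesting_nodes are hypothetical.

```python
# Hypothetical page table: node code -> keywords assigned by the site maintainers.
PAGE_KEYWORDS = {
    "1": {"overview"},
    "2": {"partners"},
    "21": {"partners", "contact"},
    "3": {"technology"},
}


def interesting_nodes(page_keywords, chosen):
    """Return the codes of pages sharing at least one keyword with `chosen`.

    Assumption of this sketch: one common keyword is enough to mark a page.
    """
    return {code for code, kws in page_keywords.items() if kws & chosen}


shaded = interesting_nodes(PAGE_KEYWORDS, {"overview", "contact"})
print(sorted(shaded))
```

Under this assumed rule, nodes 1 and 21 would be the shaded ones for a user who selected "overview" and "contact".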
A unique code number is assigned to every node, with a
length determined by the node’s distance from the root of
the tree that represents the structure of the WEB server.
The encoding process is described below:
• The root (the WEB server home page) is assigned 0.
• Consider a node with code k (the length of k equals the
depth of the node). We fix an ordering of the children
of this node. Each child’s code number consists of the
prefix k followed by the child’s position in this
ordering. Thus, for node 11, the first child is encoded
as 111, the second as 112, etc.
This type of encoding presupposes that every node of
the tree has fewer than 10 children; otherwise we simply
assign two digits to the position of each child, increasing
in that way the maximum legal number of children to 100.
With this type of encoding the ancestors and the
descendants of any node can be found easily and quickly.
For a node with a k-digit code, the first k-1 digits are the
code of its father, while all of its children have
(k+1)-digit codes whose k-digit prefix is the code of this
node.
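A minimal sketch of these prefix rules, assuming the single-digit variant (at most 9 children per node, numbered from 1 as in the example of node 11) and treating the root code 0 as a special case whose children carry one-character codes. The function names are assumptions of this sketch, not part of Archimides.

```python
ROOT = "0"  # the home page; its children carry one-character codes ("1", "2", ...)


def parent(code):
    """Code of the father: drop the last digit (depth-1 nodes map to ROOT)."""
    if code == ROOT:
        return None
    return code[:-1] or ROOT


def child(code, position):
    """Code of the child at 1-based `position` (single-digit variant)."""
    if not 1 <= position <= 9:
        raise ValueError("single-digit variant supports positions 1..9")
    prefix = "" if code == ROOT else code
    return prefix + str(position)


def is_descendant(code, ancestor):
    """True if `code` lies strictly below `ancestor` in the tree."""
    if code == ROOT:
        return False
    if ancestor == ROOT:
        return True
    return len(code) > len(ancestor) and code.startswith(ancestor)


def ancestors(code):
    """All ancestors of `code`, nearest first, ending with ROOT."""
    result = []
    p = parent(code)
    while p is not None:
        result.append(p)
        p = parent(p)
    return result


print(child("11", 1), child("11", 2))  # the two children named in the text
print(ancestors("112"))
```

Both lookups are plain string operations, which is why no explicit parent or child pointers are needed for the server tree.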
After applying the proposed algorithm to the tree that
represents the structure of the WEB server, we obtain the
new personalized version of the WEB server, which in our
case is:
Image 2. Personalized version of the server
This structure is stored in the database that contains all
the necessary information for the specific user, while at
the same time a new HTML representation of this
structure is created. This HTML representation constitutes
the personalized version of the WEB server and has the
following attributes:
• Every page is a collection of links.
• The first link (presented separately from the others) is
the root of the page; following it leads to the actual
corresponding URL of the WEB server.
• The remaining links lead to pages of the same type, of
which they are the roots.
The URL of a page is the concatenation of the user code
number, a 5-digit string, and the code (index) of the
page’s root. If only decimal digits are used, we can have a
maximum of 10^5 users and 10^3 pages at the WEB
server; if we add the 26 Latin letters to the encoding,
these numbers increase to 36^5 ≅ 60000000 users and
36^3 ≅ 46500 pages respectively.
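The arithmetic behind these limits can be checked with a short sketch. It assumes fixed-width codes, 5 characters for the user and 3 for the page index (the latter is only implied by the 10^3 figure), and an alphabet of the 10 digits plus 26 letters; the names capacity and page_name are hypothetical.

```python
import string

DIGITS = string.digits                             # 10 symbols
ALPHANUM = string.digits + string.ascii_lowercase  # 10 + 26 = 36 symbols


def capacity(alphabet, width):
    """Number of distinct fixed-width strings over `alphabet`."""
    return len(alphabet) ** width


def page_name(user_code, page_index):
    """Concatenate a 5-character user code and a page index into a page name."""
    if len(user_code) != 5:
        raise ValueError("user code must be 5 characters long")
    return user_code + page_index


print(capacity(DIGITS, 5), capacity(DIGITS, 3))      # 100000 1000
print(capacity(ALPHANUM, 5), capacity(ALPHANUM, 3))  # 60466176 46656
print(page_name("00042", "017"))                     # 00042017
```

The exact values 60466176 and 46656 are what the rounded figures in the text approximate.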
In the current example the start page has the home page
of the WEB server as its root, and links to the sub-trees
rooted at nodes 1, 212, 22 and 3.
The link to the home page leads to the home page of
the WEB server, while following link 3, for example, leads
to a new page with node 3 as root and links to the
sub-trees rooted at nodes 31 and 32. In this new page,
following the root link (number 3 in this case) leads to
page 3, while links 31 and 32 lead to a repetition of the
process described above.
The algorithms responsible for constructing the
personalized version of the WEB server are described
below:
Construct(v): Given a node v, it finds all nodes that
must be placed as children of v.
Construct_Tree(x): Given a node x, it constructs the
personalized version of the WEB server below node x.
Obviously, calling this function with the root of the WEB
server as argument (Construct_Tree(root)) constructs the
personalized version of the entire WEB server.
In more detail, we have:
Construct(v){
    S = { };
    If v is a leaf then
        return S;
    For every child x of v do
        If x has some interest to the user
            S = S + x
        otherwise
            S = S + Construct(x)
    Construct an HTML page, with the attributes of
    the HTML representation described above,
    where v is the root and the links are the
    elements of S
    return S
}
Construct_Tree(x){
    S = Construct(x)
    For every element y of S
        Construct_Tree(y)
}
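The two procedures can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the tree and the set of interesting nodes are hypothetical (chosen to be consistent with the start page of the running example), a page is represented simply as a list of link codes instead of HTML, and pages are created only for the root and for interesting nodes, including interesting leaves, which then get an empty link list.

```python
ROOT = "0"

# Hypothetical site tree: node code -> ordered list of child codes.
CHILDREN = {
    "0": ["1", "2", "3"],
    "2": ["21", "22"],
    "21": ["211", "212"],
    "22": ["221"],
    "3": ["31", "32"],
}
# Hypothetical set of nodes that match the user's preferences.
INTERESTING = {"1", "212", "22", "3", "31", "32"}


def collect_links(v, children, interesting):
    """Nearest interesting descendants of v (the set S of Construct)."""
    links = []
    for x in children.get(v, []):
        if x in interesting:
            links.append(x)
        else:
            # Skip the uninteresting node and promote what lies below it.
            links.extend(collect_links(x, children, interesting))
    return links


def construct_tree(x, children, interesting, pages=None):
    """Build {page root: links} for the personalized version below x."""
    if pages is None:
        pages = {}
    pages[x] = collect_links(x, children, interesting)
    for y in pages[x]:
        construct_tree(y, children, interesting, pages)
    return pages


pages = construct_tree(ROOT, CHILDREN, INTERESTING)
for root, links in pages.items():
    print(root, "->", links)
```

With this hypothetical input the start page links to 1, 212, 22 and 3, and page 3 links to 31 and 32, as in the example described in the text.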
The current realization permits an easy and automated
reconstruction of the personalized version of the WEB
server as the habits and preferences of the user change
over time.
Thus, in order to remove node x from the structure, we
perform the following procedure:
Delete(x){
    S = the set of links of the page that has x as root
    Find the page that contains x as a link and
    replace x there with S
    Delete the page that has x as root
}
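A minimal Python sketch of Delete, together with the Add procedure that the paper gives next, is shown here for illustration. It assumes that the personalized version is held as a dictionary from page root to list of links, that single-digit codes are used (so sorting codes lexicographically reproduces tree order), and that the root always has a page. All names and the sample data are hypothetical.

```python
ROOT = "0"


def parent(code):
    """Father of a node: its code without the last digit (ROOT for depth 1)."""
    return code[:-1] or ROOT


def in_subtree(code, top):
    """True if `code` lies strictly below `top`."""
    return len(code) > len(top) and code.startswith(top)


def delete(pages, x):
    """Remove page x and hand its links over to the page that linked to x."""
    links = pages.pop(x)
    for root, page_links in pages.items():
        if x in page_links:
            merged = [l for l in page_links if l != x] + links
            pages[root] = sorted(merged)  # lexicographic order = tree order here
            break


def add(pages, x):
    """Insert node x below its nearest ancestor that already has a page."""
    v = parent(x)
    while v not in pages:  # ROOT is assumed to have a page, so this terminates
        v = parent(v)
    moved = [l for l in pages[v] if in_subtree(l, x)]  # S'
    kept = [l for l in pages[v] if l not in moved]
    pages[v] = sorted(kept + [x])                      # S - S' + x
    pages[x] = moved


def add_all(pages, new_nodes):
    """Add several nodes, shallowest first."""
    for x in sorted(new_nodes, key=len):
        add(pages, x)


# Hypothetical personalized version: page root -> links.
pages = {"0": ["1", "212", "22", "3"], "1": [], "212": [], "22": [],
         "3": ["31", "32"], "31": [], "32": []}
add_all(pages, ["221", "2"])
delete(pages, "3")
delete(pages, "22")
print(pages)
```

In this sketch, adding node 2 pulls links 212 and 22 out of the start page into the new page for 2, and deleting node 3 promotes links 31 and 32 to the start page. Each operation touches at most two pages.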
In order to add a new node x to the structure, we
perform the following procedure:
Add(x){
    v = x
    Repeat
        v = father(v)
    until v has some interest to the user
    S = the set of links of the page that has v as root
    S’ = the links in S that also belong to the
    sub-tree with node x as root
    For the current page set S = S - S’ + x
    Rebuild this page
    Create a new page with x as root and, as links,
    the elements of S’
}
To add a set S_new of new nodes to the structure, we
first sort the elements of S_new by depth and then call
Add() for each element of S_new. This guarantees the
minimum possible reconstruction cost for the personalized
version of the WEB server.
Thus, if we add nodes 2 and 221, the structure
changes to:
Image 3. Structure of the server after adding nodes 2 and
221
and the structure of the personalized version of the WEB
server becomes:
Image 4. Structure of the personalized version of the server
after the addition of nodes 2 and 221
If we now remove nodes 3 and 22, for example, we will
have:
Image 5. Structure of the server after removing nodes 3
and 22
and the structure of the personalized version of the WEB
server becomes:
Image 6. Structure of the personalized version of the server
after removing nodes 3 and 22
3.3. Adapting to user preferences
So far we have shown, using simple but effective
algorithms, how a WEB server may be personalized
according to the preferences and habits of its users. This
personalization is not static; it changes dynamically over
time as the user’s preferences and habits change, which
gives Archimides its adaptivity.
There are three possible reasons for reconstructing the
structure of the personalized version of the WEB server:
• The user may explicitly alter his or her interests
through the provided preference options;
• Nodes that have been marked as interesting, according
to the user preferences, are not actually of interest to
the user and are therefore deleted from the structure
that represents the personalized version of the WEB
server;
• Nodes that have not been marked as interesting, again
according to the user preferences, are actually of some
interest to the user and are therefore inserted into the
structure that represents the personalized version of
the WEB server.
In the first case, the structure that represents the
personalized version of the WEB server is reconstructed by
using the algorithms described above.
To handle the other two cases, we need a method for
determining the user’s interest in a page of the WEB
server. In the current implementation we use an interest
parameter for each page, which is initialized to 1 and
changes every time the user accesses the WEB server. How
much the value of this parameter changes depends on the
time the user spends on the page; to determine this change
we use a modification function f(t), with t representing
time, that has the following features:
• If the user does not spend any time on the page during
his or her visit, the function returns 0 (f(0) = 0).
• Otherwise the value it returns does not decrease as the
amount of time spent increases (f(t1) ≥ f(t2) if t1 ≥ t2).
• The function has a maximum returned value (∃ t0:
f(t) = c if t ≥ t0, where c is the maximum value of f(t)).
Thus, after each navigation in the WEB server, provided
the user has logged on and regardless of whether the
services provided by Archimides have been used, the value
of this parameter is modified according to the well-known
relation W_{k+1} = λ*W_k + (1-λ)*f(t), where λ is a
parameter that determines the memory of the whole
system, that is, how significant the previous value of the
interest parameter is. Obviously, if a Web page is not
visited then W_{k+1} < W_k (for W_k > 0), because
0 < λ < 1. We usually select λ to be slightly less than 1.
If, after the new values of the interest parameter have been
calculated for each page, there is a page whose interest
parameter is below a predetermined limit, that page is
removed from the personalized version of the WEB server.
However, if the user visits pages outside the set of pages
that constitute his or her personalized version of the WEB
server, those pages receive an interest parameter that is
initialized to 0 and modified according to the above
relation. If the interest parameter of such a page rises
above a predetermined limit, the page is inserted into the
personalized version of the WEB server and its interest
parameter is set to 1. The process of inserting a new page
into the personalized version of the WEB server has
already been described.
4. Architecture
In brief, the operations performed by Archimides are
summarized in image 7.
As shown in image 7, after the user has connected to
the WEB server, he or she can alter his or her preferences,
which results in the reconstruction of the structure of the
personalized version of the WEB server according to the
user’s new preferences. Afterwards the user navigates
through the WEB server, with or without using
Archimides.
Image 7. Operational diagram of Archimides
Archimides, however, observes the behavior of the user
during his or her navigation, and when the user
disconnects it updates the user’s preferences. If the update
reveals some significant change (a decrease of the interest
parameter for some pages or an increase for others), the
personalized version of the WEB server is reconstructed.
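The update performed at disconnection can be sketched as follows, using the relation W_{k+1} = λ*W_k + (1-λ)*f(t) from Section 3.3. The paper does not give a concrete f(t), λ or limits, so this sketch assumes a linear ramp that saturates after T0 seconds and picks illustrative constants; every name and value below is an assumption.

```python
LAMBDA = 0.9        # assumed memory factor, "slightly less than 1"
T0 = 60.0           # assumed saturation time in seconds
C = 1.0             # assumed maximum value of f(t)
REMOVE_BELOW = 0.2  # assumed lower limit
INSERT_ABOVE = 0.5  # assumed upper limit


def f(t):
    """Modification function: f(0) = 0, non-decreasing, equal to C for t >= T0."""
    return C * min(max(t, 0.0), T0) / T0


def update(weights, personalized, seconds_spent):
    """Apply W_{k+1} = LAMBDA*W_k + (1-LAMBDA)*f(t) after one session.

    weights: page -> interest parameter (modified in place)
    personalized: set of pages in the personalized version
    seconds_spent: page -> time spent on it during this session
    Returns (pages to remove, pages to insert).
    """
    for page in set(weights) | set(seconds_spent) | set(personalized):
        # Personalized pages start at 1, pages seen for the first time at 0.
        start = 1.0 if page in personalized else 0.0
        w = weights.get(page, start)
        weights[page] = LAMBDA * w + (1 - LAMBDA) * f(seconds_spent.get(page, 0.0))
    to_remove = {p for p in personalized if weights[p] < REMOVE_BELOW}
    to_insert = {p for p in weights
                 if p not in personalized and weights[p] > INSERT_ABOVE}
    for p in to_insert:
        weights[p] = 1.0  # newly inserted pages restart at 1
    return to_remove, to_insert


weights = {"3": 1.0}
removed, inserted = update(weights, {"3"}, {"5": 45.0})
print(weights, removed, inserted)
```

The caller would then apply the Delete and Add procedures of Section 3.2 to the returned sets. With λ close to 1 a single long or missed visit changes little, so pages enter or leave the personalized version only after a consistent pattern over several sessions.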
4.1. Data structures
In order to perform all the above operations,
Archimides needs knowledge of the structure of the WEB
server it operates on, as well as of the preferences of its
users. For this reason it uses a database where it stores all
the necessary information. Information is stored about:
• The structure of the WEB server.
We use a table that stores information about every
page of the WEB server. For each page we store its
codification and its code (index), as defined above, its
URL and its keywords. The parent-child relationships
among the pages of the WEB server are stored
implicitly, thanks to the naming scheme used. As
already mentioned, if the name of a page consists of k
characters, its first k-1 characters are the name of its
parent, while all the children of that page have names
of k+1 characters whose prefix is the name of this page.
Furthermore, for each page we store a short summary
of what the page contains. This summary is
incorporated into the WEB pages that are created
during the personalization process.
• Archimides’ registered users.
Every registered user has an associated record
where all the necessary information is stored.
Archimides uses this information to provide its
personalized and adaptive services. It consists of:
1. The user’s personal information.
This information includes data such as name,
address, city, phone, country, occupation, email,
etc. The user is not obliged to provide it, and it is
not actually used by the personalization process;
however, the more we know about the user, the
better we can serve him or her.
2. The user’s preferences.
It should be noted that the preferences depend
exclusively on the structure and the contents of the
WEB server, and obviously differ from server to
server. In a WEB server with technical content, for
example, the user’s technological level may be a
characteristic preference.
3. Additional auxiliary elements.
These are the user code, which identifies the user
and is used to construct the pages that constitute
the personalized version of the WEB server, and the
nickname, which enables the system to recognize
the user, to observe his or her behavior and
therefore to provide its adaptive and personalized
services.
• The personalized version of the WEB server.
As mentioned in the analysis of the operations
performed by Archimides, the personalized version of
the WEB server essentially consists of a subset of the
WEB server’s pages, derived from the user’s interests
and preferences. Archimides obviously needs to know
this subset. Thus, for every user we create a data
structure that contains all the pages the user prefers.
We also assign to each page an interest parameter,
whose function we have analyzed above. Furthermore,
as time passes, new pages are inserted into this data
structure. These pages do not belong to the initial
personalized version of the WEB server, but the user
may have expressed some interest in them (by
spending some time examining their content). Storing
the data in this way lets us insert pages into, or delete
pages from, the personalized version of the WEB
server easily. Within this structure we also store the
parent-child relationships, because in this case they
are not expressed implicitly. Thus, for every page we
store the code of its parent page.
The above data structures allow Archimides to offer its
adaptive and personalized services to its users in an easy
and consistent manner, increasing the chance of reducing
the user’s information overload.
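For illustration, the three kinds of records described above might be modeled as follows. The paper stores them in a database and does not give a schema, so the class and field names, and the sample values, are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    """One row of the server-structure table."""
    codification: str   # tree code; parent and children follow from prefixes
    index: str          # page code used when building personalized page URLs
    url: str
    keywords: set
    summary: str = ""


@dataclass
class PersonalizedEntry:
    """One page of a user's personalized version."""
    page_index: str
    interest: float = 1.0
    parent_index: Optional[str] = None  # stored explicitly, unlike in Page


@dataclass
class User:
    user_code: str      # 5-character code used in generated page names
    nickname: str
    personal_info: dict = field(default_factory=dict)  # optional data
    preferences: set = field(default_factory=set)      # chosen keywords
    entries: dict = field(default_factory=dict)        # page_index -> entry


def children_of(user, page_index):
    """Links of a personalized page, found through the stored parent codes."""
    return sorted(e.page_index for e in user.entries.values()
                  if e.parent_index == page_index)


user = User(user_code="00042", nickname="demo", preferences={"overview"})
user.entries["001"] = PersonalizedEntry("001")
user.entries["017"] = PersonalizedEntry("017", parent_index="001")
print(children_of(user, "001"))
```

Keeping the parent code in each personalized entry is what allows a page to hang below a node that is not its father in the server tree, which the prefix rule alone could not express.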
5. In Operation
At the labs of ZEUS S.A. we have implemented a pilot
version of Archimides, which has been incorporated into
the server of the SMARTMEC project.
When the user invokes Archimides, the first contact
takes place through the WWW page shown in image 8. The
user must enter his or her nickname in order to use the
services provided by Archimides. We could guess the
user’s identity by checking the IP address of the browser,
but since a host may be used by many users, this is not
sufficient to determine the user’s identity. So we use the
nickname which, in conjunction with the IP address of the
browser, allows us to fully identify the user. The IP
address is necessary because WWW servers produce log
files based on the IP addresses of their clients, and we
need to keep track of the user’s behavior.
Image 8. Logging in.
Image 9. Setting User Information
Image 10. Setting Preferences
Image 11. Just before navigation
Image 12. Initial WEB page after personalization
Image 13. Following the link “What is Smartmec”
If the user has already registered, he or she enters the
nickname; otherwise, to register, all the user has to do is
press the submit button.
When a user visits Archimides for the first time, he or
she is asked for some personal information, as shown in
image 9; apart from the nickname, none of it is
mandatory. There is also the option to log in as a different
user. The user preferences are entered through the form
presented in image 10. After the user has submitted his or
her personal information and preferences and the
dedicated personalized version of the WEB server has been
created, the user can either update his or her preferences
or navigate within the server by pressing the newly
displayed ‘navigate’ button (see image 11). Navigation is
performed as described above. Thus, according to the
preferences expressed in image 10, if the user presses the
‘navigate’ button, navigation begins from the Web page
shown in image 12.
The Web pages that are generated by Archimides
contain collections of links in a manner similar to what
search engines return as a result of some query. However,
as they are free from advertisements and provide sufficient
description of what is “hidden” behind the links, the user
may navigate very fast and find desired information very
quickly.
Thus, if the user follows the link “HOME” in image 12,
he or she is taken to the home page of the WEB server. If,
instead, the link “What is Smartmec” is chosen, the result
is the page shown in image 13. There, choosing the link
“What is Smartmec” leads to a page with a general
description of the Smartmec project, while choosing, for
example, the link “Partnership” leads to the page in
image 14. There, choosing the link “Partnership” leads to
the page where the partnership of the Smartmec project is
briefly described.
Image 14. Following the link “Partnership”
Image 15. The “Partnership” page
Thus, we have presented a simple example of
navigation with the help of Archimides, which implements
all the algorithms analyzed above and thereby provides
adaptive personalization.
It should be mentioned that at any time the user may
“discard” Archimides and continue navigating without its
assistance. Archimides will nevertheless keep tracking his
or her behavior, provided that registration has occurred,
and keep adapting to his or her changing habits.
You may try this pilot version of Archimides at
http://www.smartmec.com.
6. Conclusions
The current interface structure of many WEB browsers,
as well as the manner in which documents are organized
within WEB servers, does not allow users to use their time
efficiently and effectively. It is very difficult for them to
locate desired information even within a single server, and
the use of search engines does not help much. There is a
great need to introduce intelligent, adaptive and
personalized features into the way WEB servers interact
with their users if we want to help them become more
productive.
Archimides is an intelligent agent that can operate on
any WEB server and offer services with the desired
features named above, personalization and adaptivity. By
incorporating new algorithms that enable adaptive
personalization, it gives users access only to the
information that is likely to interest them. In this way it
helps its users to use their time more effectively and to
increase their productivity.
Archimides is also not restrictive at all. Users are not
prevented from navigating freely, since the potential for
adaptivity lies mainly in this free navigation. Archimides
will nevertheless keep tracking their behavior, provided
that registration has occurred, and keep adapting to their
changing habits.
We intend to continue our work on Archimides and to
add features such as collaborative filtering, in order to
introduce the user to new areas of interest by exploiting
other users’ experience.
7. References
1. Armstrong R., Freitag D., Joachims T., Mitchell T.,
“WebWatcher: A Learning Apprentice for the World
Wide Web”, AAAI Spring Symposium on Information
Gathering, Stanford, CA, March 1995.
2. Caglayan A., Harrison C., “Agent Sourcebook”, John
Wiley & Sons, Inc, 1997.
3. Lieberman H., “Letizia: An Agent That Assists Web
Browsing”, Proceedings of the 1995 International
Joint Conference on Artificial Intelligence, Montreal,
Canada, August 1995
4. Maes P., “Agents That Reduce Work and Information
Overload”, Communications of the ACM, July 1994.
5. Salton G., and McGill M.J., Introduction to Modern
Information Retrieval, McGraw-Hill, Inc., 1983
6. Ngu D. S. W., and Wu X., “SiteHelper: Agent that
helps Incremental Exploration of the World Wide
Web”, Proceedings of the Sixth International WWW
Conference, Santa Clara, California, USA, April 1997.
7. Wooldridge M., and Jennings N. R., “Intelligent
Agents: Theory and Practice”, Knowledge Engineering
Review, Vol. 10, 1995.