Proceedings of the 32nd Hawaii International Conference on System Sciences - 1999

“ARCHIMIDES”: An Intelligent Agent for Adaptive - Personalized Navigation within a WEB Server

Nikos Bogonikolos
ZEUS Consulting S.A., Trade Center - Georgiou A’ Square & Riga Feraiou 93, Patra 26221, GREECE
+30 61 622655
[email protected]

Dimitris Fragoudis*
Computer Engineering & Informatics Department, University of Patras, 26500 GREECE
+30 61 622655
[email protected]

Spiros Likothanassis
Computer Engineering & Informatics Department, University of Patras, 26500 GREECE
+30 61 997755
[email protected]

* Contact Person: Dimitris Fragoudis

Abstract

With the explosive growth of the Internet and the volume of information published on it, the search and retrieval of desired information has become practically impossible if its source is not known in advance. This is the reason why search engines have emerged, aiming to relieve the user from the “lost in hyperspace” feeling and from information overload. Imagine, however, cases where the result of a query to a search engine contains hundreds of thousands of URLs (Uniform Resource Locators). With such a number of URLs, search engines become inefficient in practice, considering that navigating through even a few dozen URLs is very tiring and time consuming. Thus, instead of trying to address the information overload problem with search engines and robots (spiders), we believe that each server should itself facilitate the retrieval of the information published on its own domain. In this paper we present “Archimides”, an intelligent agent that aims to provide intelligent, adaptive and personalized navigation within a WEB server. Given a subset of the set of keywords that characterize the server’s contents, Archimides performs an intelligent information retrieval and afterwards constructs a personalized version of the server, in the form of an index to pages that are of interest to the user. This index does not resemble what search engines return as the result of a query; it could rather be regarded as a much shorter version of the WEB server, with links that are dynamically inserted or deleted according to the user’s interests, preferences and behavior, which gives Archimides the feature of adaptivity. As a result the user navigates in a WEB server whose content may be entirely of interest to him or her, and is thus relieved from undesired information overload.

1. Introduction

With the explosive growth of the Internet (it is estimated that there are currently 13 million hosts in operation, with this number doubling every year [2]) and the volume of information published on it, the search and retrieval of desired information has become practically impossible if its source is not known in advance. This is the reason why search engines have emerged, aiming to help people find desired information. Given some keywords that characterize the information sought, they present a set of URLs that possibly contain it. For this purpose they use a kind of smart indexing, in which every URL is characterized by a set of keywords, together with sophisticated information retrieval methods. Imagine, however, cases where the result of a query to a search engine contains hundreds of thousands of URLs. With such a number of URLs, search engines become inefficient in practice, considering that navigating through even a few dozen URLs is very tiring and time consuming. At the same time, after following a proposed hyperlink, the dominant method of searching and exploring all this information is today the “direct manipulation method”.
The current interface structure of many WEB browsers, as well as the manner in which documents are organized within a WEB server, encourages depth-first search, since every time one descends a level, the choices for the next lower level are immediately displayed. The user has to return to the previous level in order to explore the other choices at the same level, which is a two-step process in the interface. Thus, since users usually explore in a relatively undirected fashion, they tend to follow links downwards in a depth-first fashion. After a while this tendency leads the user to a very deep stack of previously chosen documents and a “lost in hyperspace” feeling. At this point intelligent agent technology is introduced, aiming to improve the quality of the services provided, to save some precious user time and probably to preserve his or her mental sanity.

0-7695-0001-3/99 $10.00 (c) 1999 IEEE

2. Related Work

It is not within the scope of the current paper to examine the notion of intelligent agents. Many researchers have already dealt with this matter, from Wooldridge and Jennings in [7] to Maes [4], and everyone has given a definition based on his or her point of view. One thing, however, is certain: intelligent agents have already been widely used to help people in their “battle” against information overload.

Most similar to our work is SiteHelper [6], which follows the same philosophy, “let servers do their own housekeeping”; however, SiteHelper acts much more like a sophisticated search engine within the domain of a Web server. It incorporates incremental machine learning capabilities to help the user explore the Web. It first learns about the user’s areas of interest by analyzing the user’s visit records and then assists the user in retrieving information by providing updated information about the Web site.

WebWatcher [1] and Letizia [3] belong to another category of agents, which provide the user with recommendations about which links to follow. Letizia is a user interface agent that assists a user browsing the Web. While the user navigates, Letizia tracks his or her behavior, extracts the user’s preferences, explores autonomously with a best-first, breadth-first strategy and makes recommendations upon request. WebWatcher, on the other hand, is a goal-driven Web-interface agent that is incorporated into Web pages and recommends hyperlinks that should be followed in order to achieve the preset goal. Both Letizia and WebWatcher incorporate incremental learning, although in very different fashions.

3. Archimides

In this paper we present “Archimides”, an intelligent agent that aims to provide intelligent, adaptive and personalized navigation within a WEB server. Given a subset of the set of keywords that characterize the server’s contents, Archimides performs an intelligent information retrieval and afterwards constructs a personalized version of the server, in the form of an index to pages that are of interest to the user. This index does not resemble what search engines produce as the result of a query; it could rather be regarded as a much shorter version of the WEB server, with links that are dynamically inserted or deleted according to the user’s interests, preferences and behavior, which gives Archimides the feature of adaptivity. As a result the user navigates in a WEB server whose content may be entirely of interest to him or her, and is thus relieved from undesired information overload.

3.1. Setting preferences

One of the main problems we had to address was how a user should set his or her preferences. There are currently many techniques for expressing documents as sets of keywords (the Vector Space Model is just one of them) and many others for weighting these keywords, such as the TFIDF measure [5]. However, all these techniques are used mainly in cases where we face unknown documents and have to extract their meaning. In our case we deal with known documents, and we believe that in such a case it is more accurate to define each page’s keywords ourselves. Thus some keywords are assigned to every page, and the set of all these keywords represents the server’s content area. When the user decides to use Archimides, he or she has to express interest in some of these keywords in order to proceed to what is called “server customization”.

3.2. Server customization

Let us assume the following structure of the WEB server:

Image 1. Structure of the server

The shaded nodes contain information the user is interested in, always with regard to the submitted information about his or her preferences. A unique code number has been assigned to every node, based on the node’s distance from the root of the tree that represents the structure of the WEB server. The encoding process is described below:
• The root (the WEB server home page) is assigned 0.
• Consider a node with code k (the length of k equals the depth of the node). We place the children of this node in some order. Each child’s code number consists of the prefix k followed by the child’s position in that order. Thus, for node 11, its first child is coded 111, its second 112, etc. In the example that follows, the children of the root carry the one-digit codes 1, 2 and 3, so the root’s code 0 does not appear as a prefix.
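The codification above can be illustrated with a minimal Python sketch. It is not part of the original implementation: the function names are hypothetical, and it assumes, as the paper's later example suggests, that the children of the root receive one-digit codes (1, 2, 3, ...) rather than codes prefixed with the root's 0.

```python
from typing import Optional

ROOT = "0"  # code of the WEB server home page


def child_code(parent: str, position: int) -> str:
    """Code of the child at 1-based `position` among the children of `parent`."""
    if not 1 <= position <= 9:
        raise ValueError("the one-digit scheme allows at most 9 children per node")
    # Assumption: the root's own code is not used as a prefix.
    prefix = "" if parent == ROOT else parent
    return prefix + str(position)


def parent_code(code: str) -> Optional[str]:
    """Drop the last digit; one-digit codes are children of the root."""
    if code == ROOT:
        return None
    return code[:-1] or ROOT


def is_descendant(code: str, ancestor: str) -> bool:
    """True if `code` lies strictly below `ancestor` in the tree."""
    if code == ROOT or code == ancestor:
        return False
    return ancestor == ROOT or code.startswith(ancestor)


print(child_code("11", 1), child_code("11", 2))  # 111 112
print(parent_code("212"), parent_code("3"))      # 21 0
```

Because parent and descendant tests reduce to string prefix operations, no explicit parent pointers are needed for the server tree.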
This type of codification presupposes that every node of the tree has fewer than 10 children; otherwise we simply assign two digits to the position of each child, raising the maximum legal number of children to 100. By using this type of codification we are able to perform an easy and fast search of the ancestors and the descendants of any node. Given a node with a k-digit code, the first k-1 digits are the code of its parent, while all of its children have (k+1)-digit codes whose k-digit prefix is the code of this node.

After the application of the proposed algorithm to the tree that represents the structure of the WEB server, we obtain the new personalized version of the WEB server, which in our case is:

Image 2. Personalized version of the server

This structure is stored in the database that contains all the necessary information for the specified user, while, at the same time, a new HTML representation of this structure is created. This HTML representation constitutes the personalized version of the WEB server and has the following attributes:
• Every page is a collection of links.
• The first link (presented separately from the others) is the root of the page; by following it anyone can reach the actual corresponding URL of the WEB server.
• The remaining links lead to pages of the same type, of which they are the root.

The URL of the page is the concatenation of the user code number, a 5-digit string, and the code (index) of its root.
If only decimal digits are used, we can have a maximum of $10^5$ users and $10^3$ pages at the WEB server; if we add the 26 Latin letters to the encoding, these numbers increase to $36^5 \approx 60{,}000{,}000$ users and $36^3 \approx 46{,}500$ pages respectively.

In the current example the start page has as its root the home page of the WEB server, and links to the sub-trees rooted at the nodes 1, 212, 22 and 3. The link to the home page leads to the home page of the WEB server, while by following link 3, for example, we are led to a new page with node 3 as its root and links to the sub-trees rooted at the nodes 31 and 32. In this new page, if we follow the root link (number 3 in this case) we get to page 3, while the links 31 and 32 lead to a repetition of the process described above.

The proposed algorithms that are responsible for the construction of the personalized version of the WEB server are described below:

Construct(v): Given a node v, it finds all the nodes that must be placed as children of v.

Construct_Tree(x): Given a node x, it constructs the personalized version of the WEB server below the node x. It is obvious that by calling this function with the root of the WEB server as argument (Construct_Tree(root)), we construct the personalized version of the entire WEB server.

In further detail, we have:

Construct(v){
    S = { };
    If v is a leaf then return S;
    For every child x of v do
        If x has some interest to the user
            S = S + x
        otherwise
            S = S + Construct(x)
    Construct an HTML page, with the attributes of the HTML representation described above, where v is the root and the links are the elements of S
    return S
}

Construct_Tree(x){
    S = Construct(x)
    For every element y of S
        Construct_Tree(y)
}

The current realization permits an easy and automated reconstruction of the personalized version of the WEB server as the habits and preferences of the user change over time.
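The two procedures can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the server tree `CHILDREN` and the set `INTERESTING` are hypothetical, chosen only so that the start page reproduces the links 1, 212, 22 and 3 mentioned in the text; the user code `"00042"` is invented; and, unlike the pseudocode, the sketch separates link collection from page creation, so that pages are created only for the root and for interesting nodes (a leaf simply gets a page with no further links).

```python
from typing import Dict, List, Optional, Set

# Hypothetical server tree (node code -> ordered child codes).
CHILDREN: Dict[str, List[str]] = {
    "0": ["1", "2", "3"],
    "1": ["11", "12"],
    "2": ["21", "22"],
    "21": ["211", "212"],
    "22": ["221"],
    "3": ["31", "32"],
}
# Hypothetical set of nodes that interest the user.
INTERESTING: Set[str] = {"1", "212", "22", "3", "31", "32"}


def collect_links(v: str, children: Dict[str, List[str]],
                  interesting: Set[str]) -> List[str]:
    """Nearest interesting descendants of v (the set S of Construct)."""
    links: List[str] = []
    for x in children.get(v, []):
        if x in interesting:
            links.append(x)
        else:
            # Skip the uninteresting node and lift its interesting descendants.
            links.extend(collect_links(x, children, interesting))
    return links


def construct_tree(x: str, children: Dict[str, List[str]],
                   interesting: Set[str], user_code: str,
                   pages: Optional[Dict[str, List[str]]] = None
                   ) -> Dict[str, List[str]]:
    """Build the personalized pages below x, keyed by user code + root code."""
    if pages is None:
        pages = {}
    links = collect_links(x, children, interesting)
    pages[user_code + x] = links
    for y in links:
        construct_tree(y, children, interesting, user_code, pages)
    return pages


pages = construct_tree("0", CHILDREN, INTERESTING, user_code="00042")
print(pages["000420"])  # ['1', '212', '22', '3']
print(pages["000423"])  # ['31', '32']
```

Each dictionary key plays the role of the page name described above (5-digit user code followed by the code of the page's root); rendering the entries as HTML is left out of the sketch.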
Thus, in order to remove node x from the structure, we have to perform the following procedure:

Delete(x){
    S = the set of links of the page that has x as root
    Find the page where x is contained as a link and replace x with S
    Delete the page that has x as root
}

while, in order to add a new node x to the structure, we have to perform the following procedure:

Add(x){
    v = x
    Repeat v = father(v) until v has some interest to the user
    S = the set of links of the page that has v as root
    S’ = the set of links of the page that has v as root that also belong to the sub-tree with the node x as root
    For the current page set S = S - S’ + x
    Rebuild this page
    Create a new page with x as root and links the elements of S’
}

To add a set S_new of new nodes to the structure, we first sort the elements of S_new by depth and then call Add() for each element of S_new. This guarantees the minimum possible reconstruction cost of the personalized version of the WEB server.

Thus, if we add the nodes 2 and 221, the structure changes to:

Image 3. Structure of the server after adding nodes 2 and 221

and the structure of the personalized version of the WEB server becomes:

Image 4. Structure of the personalized version of the server after the addition of nodes 2 and 221

If we now remove nodes 3 and 22, for example, we will have:

Image 5. Structure of the server after removing nodes 3 and 22

and the structure of the personalized version of the WEB server becomes:

Image 6. Structure of the personalized version of the server after removing nodes 3 and 22

3.3. Adapting to user preferences

Until now we have presented, by using simple but effective algorithms, how a WEB server may be personalized according to the preferences and habits of its users.
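The Delete and Add procedures of Section 3.2 can be sketched in Python as below. The sketch is illustrative only and rests on several assumptions: the personalized structure is held as a hypothetical dictionary `pages` mapping each root code to its list of links, the root page "0" is always present, a node "has some interest to the user" exactly when it owns a page, and links are kept sorted by code (which, for one-digit positions, matches document order). The starting structure is the hypothetical one implied by the links 1, 212, 22, 3, 31 and 32 in the text.

```python
from typing import Dict, Iterable, List

ROOT = "0"


def parent(code: str) -> str:
    """Parent code by prefix; one-digit codes hang off the root."""
    return code[:-1] or ROOT


def in_subtree(code: str, root: str) -> bool:
    """True if `code` lies strictly below the non-root node `root`."""
    return code != root and code.startswith(root)


def delete(pages: Dict[str, List[str]], x: str) -> None:
    """Delete(x): splice x's links into the page that links to x."""
    if x == ROOT or x not in pages:
        raise KeyError(x)
    promoted = pages.pop(x)
    for links in pages.values():
        if x in links:
            i = links.index(x)
            links[i:i + 1] = promoted  # replace the link x with the set S
            break


def add(pages: Dict[str, List[str]], x: str) -> None:
    """Add(x): hang x under its nearest ancestor that already has a page."""
    if ROOT not in pages:
        raise ValueError("the root page must exist")
    if x in pages:
        return
    v = parent(x)
    while v not in pages:
        v = parent(v)
    moved = [y for y in pages[v] if in_subtree(y, x)]   # S'
    kept = [y for y in pages[v] if y not in moved]      # S - S'
    pages[v] = sorted(kept + [x])
    pages[x] = moved


def add_all(pages: Dict[str, List[str]], new_nodes: Iterable[str]) -> None:
    """Add a set of nodes, shallowest first, as the text prescribes."""
    for x in sorted(new_nodes, key=len):
        add(pages, x)


pages = {"0": ["1", "212", "22", "3"], "1": [], "212": [], "22": [],
         "3": ["31", "32"], "31": [], "32": []}
add_all(pages, ["221", "2"])
print(pages["0"], pages["2"], pages["22"])  # ['1', '2', '3'] ['212', '22'] ['221']
delete(pages, "3")
delete(pages, "22")
print(pages["0"], pages["2"])  # ['1', '2', '31', '32'] ['212', '221']
```

Only the page that loses or gains a link and the page of the affected node are touched, which is the locality that keeps reconstruction cheap.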
This personalization is not static; it changes dynamically over time as the user’s preferences and habits change, giving Archimides the feature of adaptivity. There are three possible reasons that may cause the reconstruction of the structure of the personalized version of the WEB server:
• The user may explicitly alter his or her interests through the provided preference options;
• Nodes that have been marked as interesting, according to the user preferences, are not actually of interest to the user and are therefore deleted from the structure that represents the personalized version of the WEB server;
• Nodes that have not been marked as interesting, again according to the user preferences, are actually of interest to the user and are therefore inserted into the structure that represents the personalized version of the WEB server.

In the first case, the structure that represents the personalized version of the WEB server is reconstructed by using the algorithms described above. In order to realize the other two cases, a methodology for determining the user’s interest in a page of the WEB server is necessary. In the current implementation we examine the use of an interest parameter, which is initialized with the value 1 and changes every time the user accesses the WEB server. How much the value of this parameter changes depends on the time the user spends on the page; in order to determine this change we use a modification function f(t), with t representing time, that has the following features:
• If the user does not spend any time on the page during his or her visit, the function returns 0 ($f(0) = 0$).
• Otherwise the value it returns increases as the amount of time spent increases ($f(t_1) \geq f(t_2)$ if $t_1 \geq t_2$).
• The function has a maximum returned value ($\exists\, t_0 : f(t) = c$ if $t \geq t_0$, where $c$ is the maximum value of $f(t)$).

Thus, after each navigation in the WEB server, provided the user has logged on and regardless of whether the services provided by Archimides have been used, the value of this parameter is modified, and this modification is based on the well-known relation:

$W_{k+1} = \lambda W_k + (1 - \lambda) f(t)$

where $\lambda$ is a parameter that determines the memory of the whole system, that is, how significant the previous value of the interest parameter is. It is obvious that if a Web page is not visited then $W_{k+1} < W_k$, because $f(0) = 0$ and $0 < \lambda < 1$. We usually select $\lambda$ to be slightly less than 1.

If, after the calculation of the new values of the interest parameter of each page, there exists a page whose interest parameter is below a predetermined limit, that page is removed from the personalized version of the WEB server. However, if the user visits pages outside the set of pages that constitute his or her personalized version of the WEB server, those pages receive an interest parameter whose value is initialized with 0 and modified according to the above relation. If the interest parameter of such a page rises above a predetermined limit, the page is inserted into the personalized version of the WEB server and the value of its interest parameter is set to 1. The process of inserting a new page into the personalized version of the WEB server has already been analyzed.

4. Architecture

In brief, the operations performed by Archimides are summarized in image 7. As shown there, after the user has connected to the WEB server, he or she may alter his or her preferences, which results in the reconstruction of the structure of the personalized version of the WEB server according to the new preferences. Afterwards the user navigates through the WEB server, with or without the use of Archimides. However, Archimides observes the behavior of the user during navigation and, when the user disconnects, it updates the user’s preferences. If after the update some notable change is observed (a decrease of the interest parameter for some pages or an increase for others), the personalized version of the WEB server is reconstructed.

Image 7. Operational diagram of Archimides

4.1. Data structures

In order to perform all the above operations, Archimides needs knowledge of the structure of the WEB server it operates on, as well as of the preferences of its users. For this reason it uses a database where it stores all the necessary information. Information is stored about:
• The structure of the WEB server. We use a table where we store information about every page of the WEB server. For each page we store its codification and its code (index), as these have been defined above, its URL and its keywords. The parent-child relationships among the pages of the WEB server are stored implicitly thanks to the naming scheme used. As already mentioned, if the name of a page consists of k characters, its first k-1 characters constitute the name of its parent, while all the children of that page have names of k+1 characters whose prefix is the name of this page. Furthermore, for each page we store a short summary that presents in brief what the page contains. This summary is incorporated into the WEB pages that are created during the personalization process.
• Archimides’ registered users. Every registered user has an associated record where all the necessary information is stored. This information is used by Archimides in order to provide its personalized and adaptive services and it consists of:
  1. The user’s personal information. This information includes data such as name, address, city, phone, country, occupation, email, etc. The user is not obliged to provide this information, which is not actually used by the personalization process; however, the more we know about the user, the better we may serve him or her.
  2. The user’s preferences. It has to be noted that the preferences are elements that depend exclusively on the structure and the contents of the WEB server, and obviously they differ from server to server. The technological level of the user may be a characteristic example of a preference in a WEB server with technical contents.
  3. Additional auxiliary elements. These elements are the user code, which characterizes the user and is used for the construction of the pages that constitute the personalized version of the WEB server, and his or her nickname, which enables the system to recognize the user, to observe his or her behavior and therefore to provide its adaptive and personalized services.
• We have mentioned, while analyzing the operations performed by Archimides, that the personalized version of the WEB server essentially consists of a subset of the WEB server’s pages, derived from the user’s interests and preferences. It is obvious that Archimides must have knowledge of this subset. Thus, for every user a data structure is created that contains all the pages the user prefers. We also assign an interest parameter to each page, whose functionality we have analyzed above. Furthermore, as time passes, new pages are inserted into this data structure. These pages do not belong to the initial personalized version of the WEB server; however, the user may have expressed some interest in them (by spending some time examining their content). By storing the data in this way we are able to insert pages into or delete pages from the personalized version of the WEB server in an easy manner.
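The per-user page set with its interest parameters, together with the end-of-session update of Section 3.3, can be sketched in Python as follows. Everything numeric here is an assumption made for illustration: the paper does not give the shape of f(t), so a linear ramp that saturates at c = 1 after t0 = 60 seconds is used, and the value λ = 0.9 and the two limits are likewise hypothetical. The sketch also assumes that pages outside the personalized version keep decaying in sessions where they are not visited.

```python
from typing import Dict, List, Tuple

LAMBDA = 0.9          # assumed memory factor, "slightly less than 1"
T0 = 60.0             # assumed saturation time in seconds
C = 1.0               # assumed maximum value of f(t)
REMOVE_BELOW = 0.2    # assumed removal limit
INSERT_ABOVE = 0.5    # assumed insertion limit


def f(t: float) -> float:
    """Assumed modification function: f(0) = 0, non-decreasing, capped at C."""
    if t <= 0:
        return 0.0
    return C * min(t, T0) / T0


def update(w: float, t: float) -> float:
    """W_{k+1} = lambda * W_k + (1 - lambda) * f(t)."""
    return LAMBDA * w + (1.0 - LAMBDA) * f(t)


def end_of_session(personal: Dict[str, float],
                   candidates: Dict[str, float],
                   time_spent: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Update all interest parameters; return (removed, inserted) page codes."""
    # Pages visited outside the personalized version start at 0.
    for page in time_spent:
        if page not in personal:
            candidates.setdefault(page, 0.0)
    removed: List[str] = []
    inserted: List[str] = []
    for page in list(personal):
        personal[page] = update(personal[page], time_spent.get(page, 0.0))
        if personal[page] < REMOVE_BELOW:
            del personal[page]
            removed.append(page)
    for page in list(candidates):
        candidates[page] = update(candidates[page], time_spent.get(page, 0.0))
        if candidates[page] > INSERT_ABOVE:
            del candidates[page]
            personal[page] = 1.0  # inserted pages restart at 1
            inserted.append(page)
    return removed, inserted


personal = {"1": 1.0, "3": 0.21}
candidates = {"2": 0.46}
removed, inserted = end_of_session(personal, candidates,
                                   {"1": 60.0, "2": 120.0, "221": 30.0})
print(removed, inserted)  # ['3'] ['2']
```

In a full system the returned lists would drive the Delete and Add procedures of Section 3.2, so that the HTML pages follow the updated page set.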
Within this structure we also store the parent-child relationships, because in this case they are not expressed implicitly. Thus for every page we store the code of its parent page.

The above data structures allow Archimides to offer its adaptive and personalized services to its users in an easy and consistent manner, increasing the chance of reducing the user’s information overload.

5. In Operation

At the labs of ZEUS S.A. we have implemented a pilot version of Archimides, which has been incorporated in the server for the SMARTMEC project. When the user invokes Archimides, the first contact is realized through the WWW page shown in image 8. The user must enter his or her nickname in order to use the services provided by Archimides. We could infer the user’s identity by checking the IP address of the browser, but since a host may be used by many users this is not a sufficient method for determining the user’s identity. So we use the nickname which, in conjunction with the IP address of the browser, allows us to fully identify the user. The IP address is necessary because WWW servers produce log files based on the IP addresses of their clients and we need to keep track of the user’s behavior.

Image 8. Logging in.
Image 9. Setting User Information
Image 10. Setting Preferences
Image 11. Just before navigation
Image 12. Initial WEB page after personalization
Image 13. Following the link “What is Smartmec”

If the user has already registered, he or she enters the nickname; otherwise, if the user wants to register, all that he or she has to do is press the submit button. When a user visits Archimides for the first time he or she is asked to provide some personal information, as shown in image 9, none of which is mandatory except the nickname. There is also the option to log in as a different user. The user preferences are entered through the form presented in image 10.
After the user has submitted his or her personal information and preferences and the dedicated personalized version of the WEB server has been created, the user can update his or her preferences or navigate within the server by pressing the newly emerged navigate button (see image 11). The navigation is performed as described above. Thus, according to the preferences expressed in image 10, if the user presses the ‘navigate’ button the navigation begins from the Web page shown in image 12. The Web pages generated by Archimides contain collections of links in a manner similar to what search engines return as the result of a query. However, as they are free from advertisements and provide a sufficient description of what is “hidden” behind the links, the user may navigate very fast and find the desired information very quickly. Thus, if the user follows the link “HOME” in image 12, he or she will be transferred to the home page of the WEB server. If, instead, the link “What is Smartmec” is chosen, the result is shown in image 13. There, choosing the link “What is Smartmec” takes us to a page with a general description of the Smartmec project, while choosing, for example, the link “Partnership” leads to the page in image 14. There, if we choose the link “Partnership” we move to the page where the partnership of the Smartmec project is briefly described.

Image 14. Following the link “Partnership”
Image 15. The “Partnership” page

Thus, we have presented a simple example of navigation with the help of Archimides, in which we implement all the algorithms analyzed above, providing in this way an adaptive personalization. It has to be mentioned that at any time the user may “discard” Archimides and continue navigating without its assistance. However, Archimides will keep tracking his or her behavior, provided that registration has occurred, and adapting to his or her changing habits.
You may experience this pilot version of Archimides at http://www.smartmec.com.

6. Conclusions

The current interface structure of many WEB browsers, as well as the manner in which documents are organized within WEB servers, does not allow users to exploit their time efficiently and effectively. It is very difficult for them to locate desired information even within a single server, and the use of search engines does not help much. There is a great need to introduce intelligent, adaptive and personalized features in the way WEB servers interact with their users if we want to help them become more productive.

Archimides is an intelligent agent that can operate on any WEB server and offer services with the desired features of personalization and adaptivity. By incorporating new algorithms that enable adaptive personalization, it provides users with access only to the information that has a significant chance of being interesting. In this way it helps its users exploit their time more effectively and increase their productivity.

Archimides is also not restrictive at all. Users are not prohibited from navigating freely, since the potential for adaptivity lies mainly behind this free navigation. Archimides will keep tracking their behavior, provided that registration has occurred, and adapting to their changing habits.

We intend to continue our work on Archimides and to introduce features such as collaborative filtering, in order to introduce the user to new areas of interest by exploiting other users’ experience.

7. References

1. Armstrong R., Freitag D., Joachims T., Mitchell T., “WebWatcher: A Learning Apprentice for the World Wide Web”, AAAI Spring Symposium on Information Gathering, Stanford, CA, March 1995.
2. Caglayan A., Harrison C., “Agent Sourcebook”, John Wiley & Sons, Inc., 1997.
3. Lieberman H., “Letizia: An Agent That Assists Web Browsing”, Proceedings of the 1995 International Joint Conference on Artificial Intelligence, Montreal, Canada, August 1995.
4. Maes P., “Agents That Reduce Work and Information Overload”, Communications of the ACM, July 1994.
5. Salton G., and McGill M.J., “Introduction to Modern Information Retrieval”, McGraw-Hill, Inc., 1983.
6. Siaw D., Ngu W., and Wu X., “SiteHelper: Agent that helps Incremental Exploration of the World Wide Web”, Proceedings of the Sixth International WWW Conference, Santa Clara, California, USA, April 1997.
7. Wooldridge M., and Jennings N. R., “Intelligent Agents: Theory and Practice”, Knowledge Engineering Review, Vol. 10, 1995.