An Agent for Semi-automatic Management of Emails

Fangfang Xia (a) and Liu Wenyin (b)

(a) Dept. of Computer Science & Technology, Tsinghua University, Beijing 100084, China
(b) Dept. of Computer Science, City University of Hong Kong, Hong Kong SAR, China

[email protected]; [email protected]

ABSTRACT

Recent growth in the use of emails for communication and the corresponding growth in the volume of emails have made automatic processing of emails desirable. However, most existing systems have failed to work in practice due to low classification accuracy and inconvenient user interfaces. In this paper, we present an adaptive Personal Email Agent (PEA) which can learn the mail handling preferences of its user and automatically categorize and manage its user's emails. One of the key ideas in this approach is extracting both high-level semantic features (e.g., concept information) from the body text and other low-level email features (e.g., sender, time, importance, etc.) from the entire email message for similarity assessment based on the standard Information Retrieval (IR) approach. Another main contribution of our work is establishing both global and local information space models for building relevance categories based on the user's folders. Besides, a query refinement strategy is incorporated to make the agent act as an incremental learner. That is, it can adjust its working strategy based only on the new examples and avoid a total re-training using all previous examples. To test the effectiveness of our system, we ran experiments on its two main functions, email retrieval and relevance categorization, and obtained promising preliminary results.

Keywords: Email Overload, Email Management, Example-based Learning, Information Retrieval, Content-based Retrieval, Relevance Categories, Query Refinement, Personal Email Agent (PEA)

1. INTRODUCTION

The explosion in electronic communication is dramatically changing the way people interact with one another. Email overload [1,2] has become a growing problem as more and more users have embraced online technologies in recent years. According to Forrester Research, 7 trillion emails were sent per day in 2002, and an estimated 81 percent of organizations that introduced email to improve their efficiency now complain that email is becoming a victim of its own success. IDC estimates that in 2002 the average business user spent over 2.4 hours a day dealing with an average of 30 work-related messages [2]. These numbers are still growing.

To address the problem of email overload, many researchers have evaluated common manual management strategies for emails, including prioritizers and archivers [3], no filers, spring cleaners, and frequent filers [2], and folderless cleaners [4]. Whittaker and Sidner [2] found that a major aim of filing is to reduce the huge number of undifferentiated inbox items to a relatively small set of folders, each containing multiple related messages. Bälter [5] developed a mathematical model showing that storage time is the major time consumer for users with more than a thousand stored messages, and that the best long-term strategy is to use folders sparsely (4 to 20) in combination with the search functionality. He suggested that users who want to use folders employ agents that can automatically suggest folders for archiving, since such agents could drastically reduce storage time, and a larger number of folders may help reduce the time needed to retrieve a message.
Hence, early research focused on a variety of machine learning techniques to classify emails into folders. Among the well-known prototypes, SwiftFile used shortcut buttons to archive messages into folders, but only when initiated by the user [6]. Mock used a nearest-neighbor classifier to group inbox emails into categories in his experimental framework [7]. Some products, such as Enfish Onespace and Metastorm's infowise, use information retrieval techniques to measure similarities among folders or individual messages [8]. Other companies, such as Abridge, Plumtree, and Tacit, use rules or user-supplied categories to group emails. There are also flexible email organizers. For example, the Gnus news and mail reading system [9], distributed with recent versions of GNU Emacs, has hooks that allow installation of arbitrary programs for filtering and foldering news and mail. Furthermore, several open-source email readers could be modified to include a hook for arbitrary classifiers [10].

With the vast amount of interest and research devoted to automatic email categorization, why hasn't the concept been incorporated into existing email readers? The current difficulties with automatic email organization lie in the following aspects. First, the user's folders are usually not well organized, and they change over time as new messages are received; this inbox irregularity poses hurdles to accurate classification. Second, most of the learning algorithms are based on statistics, and for these algorithms to perform well, a large amount of data must be on hand; the training time is usually considerable. Third, many of the current algorithms do not learn incrementally: they update by requiring a complete re-training on all data, including the original training messages. Fourth, most existing systems provide limited user-oriented functions; they do not allow classification into multiple categories, and they use implicit rules that users cannot adjust.

In this paper, we focus on the issue of automatic categorization to save archiving time (when there are a large number of folders) and present an example-based semi-automatic learning approach for this purpose. A prototype system, the Personal Email Agent (PEA), is built on this approach; it can adapt to an individual user by learning his/her email management preferences from the interaction examples between the user and the email system. Based on the user's preferences, PEA can automatically categorize and manage his/her incoming and/or stored emails. One of the key ideas in this approach is extracting both high-level semantic features (e.g., concept information) from the text and other low-level email features (e.g., sender, time, importance, etc.) from the entire email message for similarity assessment. Another main contribution of our work is establishing both global and local information space models for building relevance categories based on the user's folders. Besides, a query refinement strategy is incorporated to make the agent act as an incremental learner. Experiments have shown the effectiveness of the proposed approach.

The remainder of this paper is structured as follows. In Section 2, we present our solution of the Personal Email Agent and describe its system architecture and user interface. We then present the core algorithms and other implementation details in Section 3. We show the preliminary experimental results of the agent in Section 4.
Finally, we conclude and present some directions for future work.

2. SOLUTIONS

Many of the classification difficulties described above may be alleviated through better classifiers; another way to resolve them is to sidestep the entire problem with an alternative technology. We adopt one such alternative, Relevance Categories [8], which addresses some of the same information management issues as automatic classification while avoiding many of the problems discussed in the previous section. In order to utilize as much detailed information as possible, we extract all useful features from an email message, including sender, recipients, time, subject, body, etc. Different methods are then employed to compute the similarity for each feature. The overall similarity between two messages is the weighted sum of these feature similarities. Note that different sets of weights are assigned to the features in different folders. Learning from the user's feedback, the weights can be adjusted automatically to represent more exactly the user's preferences among the diverse features within one folder and thus refine the query of that folder.

2.1 Architecture

The architecture of our agent system is shown in Figure 1. The system consists of two components: the user interface and the core component of the Personal Email Agent.

The user interface is divided into four parts: two functional parts and two peripheral ones. The functional parts include an email retrieval interface and an email classification interface, both of which provide user feedback interfaces. The system configuration part is where the user can set the parameters and manually adjust part of the folder space coefficients. The non-feedback function part consists of auxiliary functions such as event logging and message filing according to category.

The core component contains three spaces, i.e., the weights space, the local information space, and the global information space; five modules, namely, the feature extractor, the nearest-neighbor similarity evaluator, the inverted indexer, the email matcher, and the relevance categorizer; and finally two databases, which store low-level features and high-level semantic features, respectively. They work together to perform both function and feedback routines.

A typical scenario of the system is as follows. Upon installation of the agent, the feature extractor scans all the emails in the user's personal folders; both low-level and high-level features of the emails are extracted and the corresponding databases are constructed. Then, the nearest-neighbor similarity evaluator and the inverted indexer work simultaneously. The indexer builds the global information space for each folder according to the existing inbox structure; the evaluator compares emails within each folder to set up the local information space and decide the initial weights for the features. Once the three space models are available, the matcher compares the user's query with the local space model of the emails to yield the retrieval results, which are given in the form of a ranked list. The user can mark irrelevant emails that are ranked improperly high, thus providing negative feedback. The relevance categorizer is triggered when a new message comes in or when the user adjusts the inbox structure, e.g., by moving emails from one folder to another or creating new folders. On these occasions, the agent first updates its databases and space models and then refreshes its classification.
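As a rough illustration, the following Python sketch wires the described modules together. The paper does not publish its implementation, so every class, method, and parameter name below is hypothetical; the skeleton only mirrors the control flow of the scenario above.

```python
# Hypothetical orchestration of the scenario described above; all names
# are invented for illustration and do not come from the actual system.

class PersonalEmailAgent:
    def __init__(self, extractor, evaluator, indexer, matcher, categorizer):
        self.extractor = extractor      # low-/high-level feature extractor
        self.evaluator = evaluator      # nearest-neighbor similarity evaluator
        self.indexer = indexer          # inverted indexer (global space)
        self.matcher = matcher          # email matcher (retrieval)
        self.categorizer = categorizer  # relevance categorizer

    def install(self, folders):
        """Initial scan: extract features and build the three space models."""
        features = {name: [self.extractor.extract(m) for m in msgs]
                    for name, msgs in folders.items()}
        self.global_space = self.indexer.build(features)
        self.local_space, self.weights = self.evaluator.build(features)

    def retrieve(self, query_examples):
        """Rank stored emails by similarity to the query examples."""
        return self.matcher.rank(query_examples, self.local_space, self.weights)

    def on_new_message(self, msg):
        """Classify an incoming email, then refresh databases and models."""
        feats = self.extractor.extract(msg)
        category = self.categorizer.assign(
            feats, self.global_space, self.local_space, self.weights)
        self.indexer.update(category, feats)
        return category
```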
The agent learns from user feedback by refining its internal space models to yield more accurate results in the future.

Figure 1. Architecture of the PEA

2.2 User Interface

We implement our Personal Email Agent as an add-in to Microsoft Outlook 2002 on Windows XP. The basic interface is a supplemental command bar, indicated within the red (or gray) rectangle (containing the "Email Retrieval", "Archive", and "Settings" buttons) in the upper-right part of Figure 2. Upon first startup, a scanning process automatically creates a category out of every folder the user maintains. The messages in each folder are then associated with that category. While the agent is enabled, new emails are automatically classified into the best-matching folder. They are only grouped together, not moved immediately. The user can view inbox emails grouped into categories and have the mails actually go to their assigned folders simply by clicking the "Archive" button. When the user manually adjusts the categorization result in the inbox or moves mail from one folder to another, relevance feedback is provided and the learning process is triggered. On these occasions, the agent automatically shows the accompanying changes it made, and the user can cancel some of them.

The "Email Retrieval" button aids users who wish to search emails. This function can quickly display a list of messages ranked by relevance (using the similarity metrics) to the selected messages. In this manner, other messages in the same thread or on the same topic are displayed at the top of the list. The feedback mechanism is also provided for the email retrieval function. Finally, the "Settings" button lets users access and change the agent's parameters such as constants and feature weights. Users can also enable or disable some non-feedback functions and change the running modes there.

Figure 2. User Interface of the PEA

3. ALGORITHMS AND IMPLEMENTATIONS

3.1 Feature Extraction and Similarity Assessment

Two kinds of features are used in our agent. One kind is low-level features, such as sender, time, importance, etc. The other is high-level semantic features extracted from the subject and body of an email. We first compute the similarity between two emails at each level and then calculate the weighted sum as the overall similarity. We implement the relevant email retrieval functionality of our agent by similarity assessment: all the emails are compared with the query email and then sorted in descending order of their similarities. A high rank usually indicates significant relevancy.

3.1.1 Low-level features

We extract eight basic low-level features in our agent: sender, recipients, creation time, importance, body format, and three Boolean variables (IsRead, IsReplied, and IsWithAttachment). To compute the similarity, we also incorporate an additional feature, "sender-recipients", which is useful on some particular occasions. This is not another independent feature; we add it mostly because of the following concern. Quite frequently, a user wants to keep all of his correspondence with a person in the same folder. However, neither the sender feature nor the recipients feature alone can help him. For example, two emails, one from A to B and the other from B to A, are obviously related, but the similarities calculated based on sender and recipients are both 0. In such a case, the sender-recipients feature merges the sender and recipients into one set, and the similarity calculated on it is 1. This feature is also useful for work groups. The similarities corresponding to the individual features are computed differently, and their detailed calculation methods will be presented in an extended version of this paper.
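Since the per-feature formulas are deferred to the extended version of the paper, the following Python sketch only illustrates one plausible set-based comparison for the address features; the use of Jaccard overlap here is our assumption, not necessarily the authors' exact formula.

```python
# Sketch of a set-based comparison for the sender, recipients, and
# sender-recipients features. Jaccard overlap is assumed for illustration.

def jaccard(a: set, b: set) -> float:
    """Overlap between two address sets (0 when either set is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def sender_recipients_similarity(msg1: dict, msg2: dict) -> float:
    """Merge sender and recipients into one undirected set of
    correspondents, so a message from A to B matches a reply from B to A."""
    set1 = {msg1["sender"]} | set(msg1["recipients"])
    set2 = {msg2["sender"]} | set(msg2["recipients"])
    return jaccard(set1, set2)

m1 = {"sender": "A", "recipients": ["B"]}
m2 = {"sender": "B", "recipients": ["A"]}
print(jaccard({m1["sender"]}, {m2["sender"]}))  # 0.0: senders differ
print(sender_recipients_similarity(m1, m2))     # 1.0: same correspondents
```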
3.1.2 High-level features

We extract two high-level features in our agent: subject and body. Since both are text features, we use the same method to compare them. Our implementation is based upon an inverted index with integrated TF/IDF [11] values. The detailed algorithm will be presented in an extended version of this paper.

3.1.3 The overall similarity

Although there are many sophisticated similarity assessment methods, we use the simplest similarity model to obtain the overall similarity. With the high-level and low-level similarities calculated separately, the overall similarity is simply their linear combination. Note that different folders are assigned different sets of weights, which are consistently refined by the user's feedback. This is the key point for our agent to gain intelligence and will be further discussed in the following sections.

3.2 Folder Space and Relevance Categories

A key function of our agent is to classify emails according to existing folders. Section 3.1 gives an algorithm for computing the similarity between two individual email messages. In order to assess the similarity between a message and a folder, we must also build a user folder space model through which the nature of different folders can be well characterized. Many existing systems achieve this goal by assigning each folder a vector compatible with the email vector. Since such a vector is usually the average of all the emails in the folder, its weakness in classification is obvious, as described in Section 1. To utilize as much detailed information as possible, we explore both global and local properties of a folder in establishing its space model. (More exactly, "folder" here should be read as "relevance category", a concept that will be discussed soon.)

Global information: The global information of a folder is the semantic information of all the messages in that folder. (As we shall introduce the relevance categories concept shortly, the messages in the category linked with this folder are also included.) The messages are concatenated and treated as a single document. The N most frequent terms (from either the body or the subject field) and their term frequencies are extracted. (In our agent, N was set to 50 by default.) The resulting terms comprise part of the query for the category that the folder represents. Note that as the set of messages changes, the queries are simple to update: all that is required is to re-compute the term frequencies.

Local information: The local information of a folder is obtained by the simple nearest-neighbor method. Given a target message to classify, its features are extracted and compared to all messages in the folder using the algorithm introduced in Section 3.1. The top M matches are averaged as the local measure for the category. M was set to 3 by default in our agent. The introduction of local information should be helpful, since some users maintain overly generic folders (e.g., "Projects") encompassing multiple unrelated sub-categories. It is also useful in dealing with cases of topic drift.
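A minimal sketch of this folder space model is given below, combining the linear feature combination of Section 3.1.3 with the global and local measures just described. The exact term weighting and matching functions are deferred to the extended paper, so the concrete scoring used here (raw term-frequency products and whole-word counting) is an assumption.

```python
# Minimal sketch of the folder (relevance category) space model.
# Term selection follows the paper (top-N terms, N = 50; top-M neighbors,
# M = 3); the scoring functions themselves are assumed for illustration.
from collections import Counter
from typing import Callable, List

N_TERMS = 50  # default size of the global query (paper default)
M_LOCAL = 3   # default number of nearest neighbors (paper default)

def overall_similarity(weights: List[float], feature_sims: List[float]) -> float:
    """Linear combination of per-feature similarities (Section 3.1.3);
    each category keeps its own weight vector."""
    return sum(w * s for w, s in zip(weights, feature_sims))

def global_query(folder_texts: List[str], n: int = N_TERMS) -> Counter:
    """Concatenate all messages of a folder into one document and keep
    the n most frequent terms with their frequencies."""
    counts = Counter(" ".join(folder_texts).lower().split())
    return Counter(dict(counts.most_common(n)))

def global_score(msg_text: str, query: Counter) -> float:
    """Match a message against the category query. Terms carrying negative
    frequencies (after negative training, Section 3.3) lower the score."""
    msg_terms = Counter(msg_text.lower().split())
    return float(sum(freq * msg_terms[t] for t, freq in query.items()))

def local_score(msg, folder_msgs: list, sim: Callable, m: int = M_LOCAL) -> float:
    """Nearest-neighbor local measure: the average of the top-m pairwise
    similarities between the target message and the folder's messages."""
    top = sorted((sim(msg, other) for other in folder_msgs), reverse=True)[:m]
    return sum(top) / len(top) if top else 0.0
```

Updating the global query after the folder changes then amounts to re-running global_query over the new message set, which matches the observation above that only the term frequencies need re-computation.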
The basic concept of Relevance Categories [8] is to provide the same functionality as regular folders or categories. Users can assign emails to categories or remove them from categories just as they normally would. Relevance Categories are initially built from the existing folders in the user's inbox. When new emails come in, each is automatically assigned to one category by our agent. The user can manually correct wrong classifications or assign one email to multiple categories. On these occasions, our agent refines the queries based on the feedback, trying to approach the user's subjective intention more precisely. Otherwise, the newly assigned emails are regarded as members of their categories from then on, even though they are not actually moved to the destination folders until the user explicitly performs the "Archive" function of our agent.

In the computation of the email-category similarity, a unique weight vector indicating the user's preferences among the different features is assigned to each category to obtain the weighted feature sum. Apart from the global and local information, this weight vector is another important part of the folder space model, and it alone builds up the weights space. How to compute the weight vector and adjust it based on user feedback thus becomes the central problem in our query refinement strategy.

3.3 Query Refinement Strategy

Queries are created for each relevance category. Corresponding to the folder space model, the query refinement strategy of our agent is also divided into two parts: global query refinement and local query refinement.

Global query refinement is an approach to the precise representation of the global semantic feature of a category. Negative training can be employed for emails the user explicitly denotes as not belonging to the category. These might arise in the agent's email retrieval function if the user wishes to apply corrective action to highly ranked messages so that they are displayed toward the bottom of the list. To apply negative training, the N most frequent terms are extracted from the negative examples and subtracted from the N most frequent terms of the positive examples. This may result in some terms with negative frequencies.

Local query refinement is mainly the adjustment of the weight vector mentioned in Section 3.2. Our agent learns from user feedback so that the weight vector increasingly matches the user's subjective emphasis on the features. The detailed algorithm is presented in an extended version of this paper.
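The global refinement step described above translates almost directly into code. The sketch below is self-contained; top_terms is a hypothetical helper implementing the top-N term extraction of Section 3.2, and whole-word counting is assumed.

```python
# Sketch of global query refinement (negative training): the top-N terms
# of the negative examples are subtracted from the top-N terms of the
# positive examples, possibly leaving some terms with negative frequencies.
from collections import Counter
from typing import List

def top_terms(texts: List[str], n: int = 50) -> Counter:
    """Top-n most frequent terms over a set of messages (Section 3.2)."""
    counts = Counter(" ".join(texts).lower().split())
    return Counter(dict(counts.most_common(n)))

def refine_query(pos_texts: List[str], neg_texts: List[str], n: int = 50) -> Counter:
    """Build the category query from positive examples, then subtract the
    term frequencies of the user's negative examples."""
    query = top_terms(pos_texts, n)
    for term, freq in top_terms(neg_texts, n).items():
        query[term] -= freq  # note: plain Counter '-' would drop negatives
    return query
```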
4. PERFORMANCE EVALUATION

In order to test the two main functions of our agent, email retrieval and email classification, we designed two corresponding experiments. Since the effectiveness of relevance categories on the purely semantic feature, i.e., our global information space, has been tested by Mock over the Reuters-21578 corpus [8], we concentrate only on the overall performance of our agent on the multi-feature basis. The test data are mainly the daily emails of the authors. The volume is not very large (about 1000 messages), but it represents a typical user's situation well.

4.1 Retrieval Accuracy

In this experiment, we randomly select a number of emails belonging to the same category as query (positive feedback) examples and perform email retrieval. The number is kept below 20, since a user usually does not have the patience to select more than 5 emails in each iteration or to go through more than 4 iterations. Since we use exactly 100 emails as the ground truth for each query and we also check only the first 100 retrieved emails, precision and recall take the same value: both equal the number of relevant emails among the 100 retrieved, divided by 100. Therefore, we use the term "accuracy" to refer to both. The results are shown in Figure 3, with the x-axis being the number of query (positive feedback) emails and the y-axis the average retrieval accuracy. As the figure shows, the average accuracy of email retrieval exceeds 50% when the number of query emails reaches 10.

Figure 3. Retrieval Accuracy (average accuracy versus the number of query emails)

4.2 Categorization Accuracy and Feature Abilities

The second experiment evaluates the performance of the categorizer in learning a user's mail sorting preferences from hand-sorted mail. The input data are six months of the first author's sorted mail. Table 1 shows the folders and the distribution of messages in the data set. These data pose an interesting challenge for a learning system. Not only is the distribution of messages in the folders highly non-uniform, but the selection of folders for messages is also strongly idiosyncratic. While the content of the folder "FROM HER" was exclusively determined by a single keyword match (sender = "Arendt"), other folders were not determined by a single keyword match on the "from" or "to" fields, but rather by the first author's subjective judgment of what folder would be the best mnemonic for later retrieval of the message based on its content, time, recipients, etc. For example, the "REMINDER" folder only maintains emails received within the recent week, while the "E-MAGAZINE" folder contains various HTML messages to which the first author subscribed on various websites. In this setting, the task of the agent is to learn a model of the user's email sorting preferences.

Table 1. Hand-archived emails in our experiments.

Folder Name         Email Count   Percentage
CS                   62             5.87%
E-MAGAZINE          317            30.0%
FROM HER             96             9.08%
MISCELLANEOUS        80            15.0%
PERSONAL            126            11.9%
PHILOSOPHY GROUP     30             2.84%
PROJECTS            247            23.4%
REMINDER             21             1.99%
SOCCER               78            14.8%
Total               1057           100%

Figure 4. (a) Categorization Accuracy and (b) Feature Discrimination Abilities

The results of this experiment are shown in Figures 4(a) and (b). Through learning, the agent achieves 82% test accuracy after 100 training examples and 87% after 200. The weights of the features begin to show the user's different emphases on them as the number of training examples increases. We show only three of the features in the figure, but the trends are clear, which shows that the agent is capable of learning a user's preferences through our query refinement strategy.

The strategy of our agent has many advantages. First, relevance categories are not "hard" folders; they are merely an add-on to existing categories and can be ignored or used exactly like normal categories without impacting performance; therefore, the errors made by our agent are more likely to be tolerated by users. Second, thanks to the simple similarity-computing algorithm, our agent can still manage emails in the presence of sparse data. Third, since both high-level and low-level features are extracted, the agent can handle diverse occasions well. In dealing with categories like "FROM HER" in the above experiment, our agent clearly surpasses traditional classifiers that focus only on text features.
Fourth, the incorporation of global and local information enables the agent to cope with the various user inboxes that are not well organized. Besides, the query refinement can be done fast and hence avoids the intensive computation that most classifiers require at the adjustment stage.

5. CONCLUSION AND FUTURE WORK

We present an intelligent agent which can learn from the user's interactions with the email system and hence can semi-automatically manage the user's emails. Four features distinguish our system from existing email retrieval and management approaches. First, different features of emails are extracted, with corresponding similarity assessment methods designed for each. The employment of both high-level semantic features and other low-level features enables our agent to perform ambidextrously. Second, the adoption of relevance categories for our UI sidesteps some of the common hurdles that peer systems normally face. Though the concept of relevance categories is really a step back from pure categorization, it allows multiple or overlapping categories and is more likely to be tolerated by users when classification errors occur. Third, a unique space model is established for each user folder based on both the global and the local information of its encompassing emails. This makes it possible for the agent to fit a user's email sorting habits, which may be extremely idiosyncratic. Fourth, an efficient query refinement strategy is presented to facilitate the learning process.

The next phase is to further refine our space models. For example, noun phrase extraction, better term selection, use of more terms, support for languages other than English and for mixed languages, variation of test parameters and assumptions, and different similarity metrics might significantly improve the categorization accuracy. Additional work is also required to quantify the performance of current classification algorithms with both test data and user studies. Besides, much work remains to be completed in code enhancements such as hooking into more Outlook events, database integration for classifiers, and MS.NET upgrades. Finally, new experiments that integrate classification and information retrieval techniques across email and into calendaring, notes, or other types of data may also be explored.

REFERENCES

1. Email overload: facts and figures: an e-mountain of e-mail. http://www.amikanow.com/corporate/email_overload.htm.
2. Whittaker S and Sidner C. Email overload: exploring personal information management of email. SIGCHI'96, pp. 276-283.
3. Pliskin N. Interacting with electronic mail can be a dream or a nightmare: a user's point of view. Interacting with Computers 1(3):259-272.
4. Bälter O. Strategies for organizing email messages. SIGCHI'97, pp. 21-38.
5. Bälter O. Keystroke level analysis of email message organization. SIGCHI'2000, pp. 105-112.
6. Segal R and Kephart J. Incremental learning in SwiftFile. ICML'2000.
7. Mock K. An experimental framework for email categorization and management. SIGIR'2001.
8. Mock K. Dynamic email organization via relevance categories. ICTAI'99.
9. Ingebrigtsen LM. Gnus network user services. http://www.gnus.org/.
10. Malone TW, Lai KY, and Fry C. Experiments with Oval: a radically tailorable tool for cooperative work. ACM TOIS 13(2):177-205.
11. Salton G. Automatic Text Processing. Addison-Wesley, 1989.