Decision Support Systems 43 (2007) 46 – 61 www.elsevier.com/locate/dss An information delivery system with automatic summarization for mobile commerce Christopher C. Yang a,*, Fu Lee Wang b a Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong, China b Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong, China Available online 14 July 2005 Abstract Wireless access with handheld devices is a promising addition to the WWW and traditional electronic business. Handheld devices provide convenience and portable access to the huge information space on the Internet without requiring users to be stationary with network connection. Many customer-centered m-services applications have been developed. The mobile computing, however, should be extended to decision support in an organization. There is a desire of accessing most update and accurate information on handheld devices for fast decision making in an organization. Unfortunately, loading and visualizing large documents on handheld devices are impossible due to their shortcomings. In this paper, we introduce the fractal summarization model for document summarization on handheld devices. Fractal summarization is developed based on the fractal theory. It generates a brief skeleton of summary at the first stage, and the details of the summary on different levels of the document are generated on demands of users. Such interactive summarization reduces the computation load in comparing with the generation of the entire summary in one batch by the traditional automatic summarization, which is ideal for wireless access. The three-tier architecture with the middle-tier conducting the major computation is also discussed. Visualization of summary on handheld devices is also investigated. The automatic summarization, the three-tier architecture, and the information visualization are potential solutions to the existing problems in information delivery to handheld devices for mobile commerce. D 2005 Elsevier B.V. All rights reserved. Keywords: Document summarization; Mobile commerce; Fractal summarization; Handheld devices; Financial news delivery 1. Introduction The advance of mobile network creates business opportunities and provides value-added services to users. Access to the Internet through mobile phones * Corresponding author. Tel.: +852 2609 8239; fax: +852 2603 5505. E-mail address: [email protected] (C.C. Yang). 0167-9236/$ - see front matter D 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.dss.2005.05.012 and other handheld devices is growing significantly in recent years. Many m-services applications have been developed for the handheld devices [5–8,38]. However, most current applications are customer-centered applications, for example, users can now surf the web, check e-mail, read news, and quote stock price, etc., using handheld devices. The mobile computing should not be limited to user-centered applications only. In this age of information, mobile applications C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 should be extended to decision support in an organization. With a fast paced economy, organizations need to make decisions as fast as possible, they must gain competitive advantage by having access to the most current and accurate information available. For instance, a huge amount of financial news is generated everyday, and access to update financial information is important during decision making. On the other hand, the executives of an organization need make decision when they are on the road. As a result, there is an urgent need of information access through handheld devices. There are many shortcomings associated with handheld devices although the development of handheld devices is fast in the recent years. The shortcomings include limited screen size with low resolution, low bandwidth, and low memory capacity. It is impossible to search and visualize the critical information on a small screen with an intolerable slow downloading speed using handheld devices. Automatic summarization summarizes a document for users to preview its major content. Users may determine if the information fits their needs by reading the summary instead of browsing the whole document one by one. The amount of information displayed and downloading time are significantly reduced. Active researches on automatic summarization have been carried out. Summarization techniques have been applied to delivery of information on handheld devices [3,42]. Traditional summarizations do not consider the hierarchical structure of document but consider the document as a sequence of sentences. Most traditional summarization systems extract sentences from the source document and concatenate together as summary. However, it is believed that the document summarization on handheld devices must make use of btree viewQ [7] or bhierarchical displayQ [32]. Similar techniques have been applied to web browsing, an outline processor organizes the web page in a tree structure and the user click the link to expand the subsection and view the detail [4]. Hierarchical display is suitable for navigation of a large document and it is ideal for small area display. Therefore, a new summarization model with hierarchical display is required for summarization on handheld devices. Summarization of web pages on handheld devices has been investigated [5–8]. However, a large document exhibits totally different characteris- 47 tics from web pages. A web page usually contains a small number of sentences that are organized into paragraphs, but a large document contains much more sentences that are organized into a more complex hierarchical structure [48,49]. Besides, the summarization on webpage is mainly based on thematic features only [6]. However, it has been proved that other document features play a role as important as the thematic feature [10,22]. Therefore, a more advance summarization model combined with other document features is required for browsing of large document and other information sources on handheld devices [49–51]. With powerful summarization tool, the ability of handheld devices will be greatly enhanced. Visualization of various information sources becomes feasible, for example, financial news delivery to handheld devices. It provides an information visualization tool for m-commerce. The information access through mobile devices has drawn attention from researchers in recent years. Mobile devices support instant information and data access that is capable to support decision making. For example, Mendelsohn [35] has presented how physicians may request information to answer questions that occur at the point-of-care through mobile devices. A number of researchers have worked on advanced text input entry for handheld device [29,30,39]. More specifically, there are researches that focus on navigation of documents on mobile devices. Lamming et al. [26] have presented the Satchel system that used tokens to represent documents on mobile device and developed a context-sensitive user interface. In this work, we focus on the delivery of information to handheld devices with the support of automatic summarization. Example of how the proposed system supports the delivery of Yahoo! financial information is presented. The paper is organized as follows. Section 2 discusses the shortcomings of handheld devices and information visualization on handheld devices. The three-tier architecture, which reduces the computing load of the handheld devices, is used. Section 3 proposes the fractal summarization model based on the statistical data and the hierarchical structure of documents. Thematic, location, heading and cue features are adopted. Experiments have been conducted and the results show that the fractal summarization outperforms the traditional summarization. Fractal C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 Web Server HTML XML Document Server 2. Document delivery architecture on handheld devices Traditionally, two-tier architecture is typically utilized for Internet access. The user’s PC connects to the Internet directly, and the content loaded will be fed to the web browser and present to the user as illustrated in Fig. 1. However the information available online has increased explosively, it results in a well-recognized problem of information overloading. Advance-searching techniques solve the problem by filtering most of the irrelevant information. However, the precision of most of the commercial search engines is not high. Users may find only a few relevant documents out of a large pool of searching result. Due to the huge volume of documents, it will take a lot of time for the users to browse the searching result one by one and identify the relevant information using desktop computers. Automatic summarization summarizes a document for users to preview its major content. Users may determine if the information fits their needs by reading their summary instead of browsing the whole document one by one. The amount of information displayed and downloading time are significantly reduced. An automatic summarizer is therefore introduced to summarize a document for the users to preview before presenting the whole document. As shown in Fig. 2, the content will be first fed to the summarizer after loading to the user’s PC. The summarizer connects to database server when necessary and generates a summary to display on the browser. The convenience of handheld devices allows information access without geometric limitation; however, SQL DB Server for Summarizer User’s PC there are other limitations of handheld devices that restrict their capability. Although the development of wireless handheld devices is fast in recent years, there are many shortcomings associated with these devices, such as screen size, bandwidth, and memory capacity. There are two major categories of wireless handheld devices, namely WAP-enabled mobile phones and wireless PDAs. At present, the typical display size of popular WAP-enabled handsets and PDAs are relatively small in comparison with a standard PC. The memory capacity of a handheld device greatly limits the amount of information that can be stored. A large document cannot be entirely downloaded to the handheld device and present to user directly. The current bandwidth available for WAP is relatively narrow. It is not comparable with the broadband internet connection for PC. The handheld devices impose other constraints that do not exist on desktop computers. The two-tier architecture cannot be applied on handheld devices since the computing power of handheld devices is insufficient to perform summarization and the network connection of mobile network does not provide sufficient bandwidth for navigation between the summarizer and other servers. The three-tier architecture as illustrated in Fig. 3 is proposed. A WAP gateway is setup to conduct the summarization. The WAP gateway connects to InterHTML Fig. 1. Document browsing on PC. SQ DB Server for Summarizer Browser Browser User’s PC Document Server Summarizer WML XML User User Fig. 2. Document browsing with summarizer on PC. Web Server Web Server Browser summarization is further extended to information browsing on handheld devices. Section 4 will demonstrate the financial news delivery on handheld devices. Summarizer 48 Wireless Handhel PDA Local Synchronization WAP Gateway Fig. 3. Document browsing with summarizer on WAP. User C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 net trough broadband network. The wireless handheld devices can conduct interactive navigation with the gateway through wireless network to retrieve the summary piece by piece. Alternatively, if the PDA is equipped with more memory, the complete summary can be downloaded to PDA through local synchronization. 3. Automatic summarization 3.1. Traditional summarization Traditional automatic text summarization is the selection of sentences from the source document based on their significance to the document [10,28]. The selection of sentences is conducted based on the salient features of the document. The thematic, location, heading, and cue features are the most widely used summarization features. ! The thematic feature is first identified by Luhn [28]. Edmundson proposed to assign the thematic weight to keyword based on term frequency, and the sentence thematic score as the sum of thematic weight of constituent keywords [10]. In information retrieval, absolute term frequency by itself is considered as less useful than term frequency normalized to the document length and term frequency in the collection [18]. As a result, the tfidf (Term Frequency, Inverse Document Frequency) method is proposed to calculate the thematic weight of keyword [41]. ! The significance of sentence is indicated by its location [2] based on the hypotheses that topic sentences tend to occur at the beginning or in the end of documents or paragraphs [10]. Edmundson proposed to assign positive weights to sentences as sentence location score according to their ordinal position in the document, i.e., the sentences in the first and last paragraphs and the first and last sentences of the paragraphs. There are several functions proposed to calculate the location weight of sentences. Alternatively, the preference of sentence location can be stored in a list called Optimum Position Policy, and the sentences will be selected base on their order in the list [27]. 49 ! The heading feature is proposed based on the hypothesis that the author conceives the heading as circumscribing the subject matter of the document. When the author partitions the document into major sections, he summarizes them by choosing appropriate headings [10]. The formulation of heading weight is very similar to the thematic feature. A heading glossary is a list consisting of all the words in headings and subheadings. Positive weights are assigned to the heading glossary, where the heading words will be assigned a weight relatively prime to the subheading words. The sentence heading score of sentence is calculated by the sum of heading weight of its constituent words. ! The cue phrase feature is proposed by Edmundson [10] based on the hypothesis that the probable relevance of a sentence is affected by the presence of pragmatic words such as bsignificantQ, bimpossibleQ, and bhardlyQ. A pre-stored cue dictionary is used to identify the cue phases, which comprises of three sub-dictionaries: (i) bonus words, that are positively relevant; (ii) stigma words, that are negatively relevant; and (iii) null words, that are irrelevant. The sentence cue score of sentence is calculated by the sum of cue weight of its constituent words. Typical summarization systems select a combination of summarization features [10,25,27], the total sentence significance score (SSS) is calculated as, SSS ¼ a1 SSthematic þ a2 SSlocation þ a3 SSheading þ a4 SScue ð1Þ where SSthematic, SSlocation, SSheading and SScue are sentence scores based on thematic feature, location feature, heading feature and cue phrase feature, respectively; and a 1, a 2, a 3, and a 4 are positive integers to adjust the weighting of four summarization features. The sentences with sentence significant score higher than a threshold are selected as part of the summary. It has been proved that the weighting of different summarization features does not have any substantial effect on the average precision [25]. In our experiment, the maximum score of each feature is normalized to one, and the sentence significant score is calculated as the sum of scores of all summarization features without weighting. 50 C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 Fig. 4. Koch curve at different abstractions level. 3.2. Fractal theory and fractal view for controlling information displayed Fractals are mathematical objects that have high degree of redundancy [31]. These objects are made of transformed copies of themselves or part of themselves (Fig. 4). Mandelbrot was the first researcher who investigated the fractal geometry and developed the fractal theory [31]. In his well-known example, the length of the British coastline depends on measurement scale. The larger the scale is, the smaller value of the length of the coastline is and the higher the abstraction level is. The British coastline includes bays and peninsulas. Bays include subbays and peninsulas include sub-peninsulas. Using fractals to represent these structures, abstraction of the British coastline can be generated with different abstraction degrees. Fractal theory is grounded in geometry and dimension theory. Fractals are independent of scale and appear equally detailed at any level of magnification. Such property is known as self-similarity. Any portion of a self-similar fractal curve appears identical to the whole curve. If we shrink or enlarge a fractal pattern, its appearance remains unchanged. Fractal view is a fractal-based method for controlling information displayed [23]. Fractal view provides an approximation mechanism for the observer to adjust the abstraction level and therefore control the amount of information displayed. At a lower abstraction level, more details of the fractal object can be viewed. A physical tree is one of the classical examples of fractal objects. A tree is made of a lot of sub-trees; each of them is also a tree. By changing the scale, the different levels of abstraction views are obtained (Fig. 5). The idea of fractal tree can be extended to any logical tree. The degree of importance of each node is represented by its fractal value. The fractal value of root is 1, it propagated to other nodes with the following expression: ( Fvroot ¼ 1 ð2Þ Fvchild mode of x ¼ C Fv1=Dx Nx where Fvx is the fractal value of node x; C is a constant between 0 and 1 to control rate of decade; N x is the number of child nodes of node x; and D is the fractal dimension. Mandelbrot had shown that the fractal dimension of a tree is 1, because its total length is finite and positive [31]. To simplify our discussion, both C and D are considered to be 1. A threshold value is chosen to control the amount of information displayed, the nodes with a fractal value less than the threshold value will be hidden (Fig. 6). By changing the threshold value, the user can adjust the amount of information displayed. 3.3. Fractal summarization Advance summarization techniques take the document structure into consideration to compute the probability of a sentence to be included in the summary. Many studies of human abstraction process had shown that the human abstractors extract the topic Fig. 5. Fractal view for logical tree at different abstraction level. C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 51 generation of the entire summary in one batch by the traditional automatic summarization, which is ideal for m-commerce. Fractal summarization is developed based on the fractal theory. In fractal summarization, the important information is captured from the source text by exploring the hierarchical structure and salient features of the document. A condensed version of the document that is informatively close to the original document is produced iteratively using the contractive transformation in the fractal theory. Similar to the fractal geometry applying on the British coastline where the coastline includes bays, peninsulas, sub-bays, and sub-peninsulas, large document has a hierarchical structure with several levels, chapters, sections, subsections, paragraphs, sentences, and terms. A document is considered as prefractal that are fractal structures in their early stage with finite recursion only [12]. A document can be represented by a hierarchical structure as shown in Fig. 7. A document consists of chapters. A chapter consists of sections. A section may consist of subsections. A section or subsection consists of paragraphs. A paragraph consists of sentences. A sentence consists of terms. A term consists of words. A word consists of characters. A document structure can be considered as a fractal structure. At the lower abstraction level of a document, more specific information can be obtained. Although a document is not a true mathematical fractal object since a document cannot be viewed in an infinite abstraction level, we may consider a document as a prefractal. The smallest unit in a document is character; Fig. 6. An example of the propagation of fractal values. sentences according to the document structure from the top level to the low level until sufficient information has been extracted [11,15]. However, most traditional automatic summarization models consider the source document as a sequence of sentences but ignoring the structure of document. Some summarization systems may calculate sentence weight partially based on the document structure, but they extract sentences in a linear space. None of the current summarization model is developed on the foundation of the hierarchical structure of documents. Fractal Summarization Model is proposed here to generate summary based on document structure. Fractal summarization generates a brief skeleton of summary at the first stage, and the details of the summary on different levels of the document are generated on demands of users. Such interactive summarization reduces the computation load in comparing with the Document Chapter Section Subsection Paragraph Sentence Character Subsection ... ... Term Word Word Character ... ... Paragraph Sentence Term Section ... Fig. 7. Prefractal structure of document. ... Chapter ... 52 C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 however, neither a character nor a word will convey any meaningful information concerning the overall content of a document. The lowest abstraction level in our consideration is a term. The Fractal Summarization Model applies techniques from fractal view and fractal image compression [1,21]. In fractal image compression, an image is regularly segmented into sets of non-overlapping square blocks, called range-blocks, and then each range-block is sub-divided into subrange-blocks, until a contractive mapping can be found to represent this subrange-block. The Fractal Summarization Model generates the summary by a recursive deterministic algorithm based on the iterated representation of a document. The original document is represented as fractal tree structure according to its hierarchical document structure. Then, the system calculates the fractal value of each node and allocates the sentence quota base on the fractal value. The calculation of fractal value will be discussed later in this section. Given a document, a user can specify compression ratio to indicate the amount of information displayed. The summarization system calculates the number of sentences to be extracted as summary accordingly and the system assigns the number of sentences to the root of document tree as the quota of sentences. The quota of sentences is allocated to child nodes by propagation, i.e., the quota of parent node is shared by its child nodes directly proportional to the fractal value of the child nodes. The quota is then iteratively allocated to child nodes of child nodes until the quota allocated is less than a threshold value and the range-block can be transformed to some key sentences by traditional summarization methods (Fig. 8). The detail of the Fractal Summarization Model is shown as the following algorithm: Fractal Summarization Model 1. Choose a Compression Ratio. 2. Choose a Threshold Value. 3. Calculate the Sentence Number Quota of the summary. 4. Divide the document into range-blocks. 5. Transform the document into fractal tree. 6. Set the current node to the root of the fractal tree. 7. Repeat 7.1 For each child node under current node, Calculate the fractal value of child node. 7.2 Allocate Quota to child nodes in proportion to fractal values. 7.3 For each child nodes, If the quota is less than threshold value Select the sentences in the range-block by extraction Else Set the current node to the child node Repeat Step 7.1, 7.2, 7.3 8. Until all the child nodes under current node are processed The compression ratio of summarization is defined as the ratio of number of sentences in the summary to the number of sentences in the source document. It was chosen as 25% in most literatures because it has been proved that extraction of 20% sentences can be as informative as the full text of the source document [36], those summarization systems can achieve up to a 96% precision [10,22,44]. However, Teufel pointed out the high-compression ratio abstracting is more useful, and 49.6% of precision is reported at 4% compression ratio [43,44]. In order to minimize the bandwidth requirement and reduce the pressure on computing power of handheld devices, the default value of compression ratio is chosen as 4% for fractal summarization on handheld devices. On the other hand, a threshold value is the maximum number of sentences can be extracted from a range-block that is a node in the document tree. If the quota allocated is larger than the threshold value, the range-block must be divided into subrange-block. Document summarization is different from image compression, more than one attractor can be chosen in one range-block. The summarization by extraction of fixed number of sentences is proven; the optimal length of summary is 3 to 5 sentences [16]. The default value of threshold is chosen as 5 in our system. The fractal value Fv of range-block r is computed based on Range-block Significance Score (RBSS), C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 53 Document Fractal: 1 Quota: 40 Chapter 2 Fractal: 0.5 Quota: 20 Chapter 1 Fractal: 0.3 Quota: 12 Section 1.1 Fractal: 0.1 Quota: 4 Section 1.2 Fractal: 0.15 Quota: 6 Section 1.3 Fractal: 0.05 Quota: 2 Section 2.1 Fractal: 0.1 Quota: 3 Paragraphs ... Chapter 3 Fractal: 0.2 Quota: 8 Section 2.2 Fractal: 0.25 Quota: 10 Section 2.3 Fractal: 0.15 Quota: 7 Paragraphs... Paragraphs... Section 3.1 Fractal: 0.12 Quota: 5 Section 3.2 Fractal: 0.8 Quota: 3 Fig. 8. An example of fractal summarization model. where the RBSS is computed as the sum of the normalized weights of the sentence scores based on the four salient features as introduced in Section 3.1. FvðrÞ ¼ 8 1 > > > < 0 1 D1 C B C Fvð parent of rÞ @ P RBSSðrÞ A > > RBSSð xÞ > : x a sibling of r if r is root otherwise ð3Þ The system utilizes the fractal value to compute the sentence quota of each range-block by sharing the quota of parent node directly proportional to the fractal value of the child nodes, until the sentence quota allocated is less than the threshold value, and the system can extract sentences directly from the range-block. During extraction of sentences in the range-block, the system will extract the sentences according to the fractalized sentence weight of the sentences. 3.4. Experimental result It is believed that a full-length text document contains a set of subtopics [20] and a good quality summary should cover as many subtopics as possible, experiments showed that the fractal summarization model produces a summary with a wider coverage of information subtopic than traditional summarization model. Experiment of fractal summarization and traditional summarization with four unweighted features described in Section 3.1 was conducted on Hong Kong Annual Report 2000. In the experiment, it is found that the traditional summarization model extracts most of sentences from few chapters. As shown in Table 1, the traditional summarization extracts 29 sentences from one chapter when the sentence quota is 80 sentences, and 53 sentences are extracted from top 3 chapters out of total 23 chapters, no sentence is extracted from 8 chapters. However, the fractal summarization extracts the sentences evenly from each chapter. It extracts maximum 8 sentences and minimum 1 sentence from each chapter (Table 1). The standard deviation of sentence number extracted from chapters is 2.11 sentences in fractal summarization against 6.55 sentences in traditional summarization. Researchers believed that a good summary should find diverse topic areas in the text and reduce the redundancy of information contents in the summary [37]. Fractal summarization extracts the sentences distributively from the document, therefore it finds diverse topic areas and reduces the redundancy of information at the same time. A user evaluation is conducted. Ten subjects were asked to evaluate the quality of summaries of 23 documents generated by fractal summarization and traditional summarization. Both summaries of all documents are assigned to each subject in random order without telling the generation methods of the summaries. The results show that all subjects consider the summary generated by fractal summarization method as a better summary. In order to compare the result in more great detail, we calculate the precision as the number of relevant sentences in the summary accepted by the user divided by the number of sentences in the summary (Table 2). The fractal summarization can achieve up to 91.25% precision and 87.16% on average, while the traditional summarization can achieve up to a maximum of 77.50% precision and 67.00% on average. One-tailed T-test has shown that the precision of fractal summarization model outperforms the traditional summarization significantly at 99% confidence level. Experiment of fractal summarization and traditional summarization with four unweighted features described in Section 3.1 was conducted on Hong Kong Annual Report 2000. In the experiment, it is found that the traditional 54 C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 Table 1 Number of sentences extracted by two summarization models from Hong Kong Annual Report 2000 Chapter ID Chapter title Number of sentences extracted in fractal summarization model 1 Hong Kong: Asia’s World City Constitution and Administration The Legal System The Economy Financial and Monetary affairs Commerce and Industry Employment Primary Production Education Health Social Welfare Housing Land, Public Works and Utilities Transport Infrastructure The Environment Travel and Tourism Public Order Communications, the Media and Information Technology Religion and Custom Recreation, Sport and the Arts Population and Immigration History 6 3 4 1 2 5 8 0 14 29 6 10 2 1 2 1 1 1 4 2 0 1 0 0 0 0 5 1 4 1 5 6 3 0 1 1 2 6 2 0 5 3 3 1 Table 2 Precision of summaries of Hong Kong Annual Report 2000 for two summarization models 5 3 User ID Fractal summarization model (%) Traditional summarization model (%) User User User User User User User User User User 81.25 85.00 80.00 85.00 88.75 81.25 91.25 86.25 85.00 87.50 71.25 67.50 56.25 63.75 77.50 61.25 76.25 58.75 65.00 72.50 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Number of sentences extracted in traditional summarization model tence from each chapter (Table 1). The standard deviation of sentence number extracted from chapters is 2.11 sentences in fractal summarization against 6.55 sentences in traditional summarization. Researchers believed that a good summary should find diverse topic areas in the text and reduce the redundancy of information contents in the summary [37]. Fractal summarization extracts the sentences evenly from the document; therefore it finds diverse topic areas and reduces the redundancy of information at the same time. We have also conducted another experiment using the tested TIPSTER Text Summarization Evaluation (SUMMAC) data. The TIPSTER Text Summarization Evaluation (SUMMAC) is the first large-scale, developer-independent evaluation of automatic text summarization systems [33]. The documents for the TIPSTER evaluation are drawn from Text Retrieval (TREC) [19] CDs 4 and 5. The categorization task in SUMMAC focuses on generic summaries. The task sought to find out whether a generic summary could effectively present sufficient information to allow an analyst to quickly and correctly categorize a document. In order to compare the performance with other summarization systems, we have conducted the categorization task to evaluate the performance of the fractal summarization model. We have followed the standard TIPSTER setting to conduct the classification task using fractal summarization. 100 documents are selected from the TIPSTER corpus, and 10% of sentences are extracted from each document by fractal summarization system. 15 subjects are involved in the summarization model extracts most of sentences from few chapters. As shown in Table 1, the traditional summarization extracts 29 sentences from one chapter when the sentence quota is 80 sentences, and 53 sentences are extracted from top 3 chapters out of total 23 chapters, no sentence is extracted from 8 chapters. However, the fractal summarization extracts the sentences evenly from each chapter. It extracts a maximum of 8 sentences and a minimum of 1 sen- 1 2 3 4 5 6 7 8 9 10 C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 experiment. The subjects need to classify the documents based on the sentences extracted for each document. The experiment result shows that the fractal summarization system has a similar precision as other summarization systems, and it has a better recall. The fractal summarization achieves an Fscore [40] of 0.63 but other summarization systems achieve a mean F-score of 0.42. Therefore, the fractal summarization system outperforms other summarization systems. As the documents used are relatively shorter in their length, most of them contain less than 100 sentences. 3.5. Visualization of fractal summarization on handheld devices The summary generated by Fractal Summarization Model is represented in a hierarchical tree structure. The hierarchical structure of summary is suitable for visualization of information on handheld devices. However, a summary displayed in a small area of handheld devices without visualization effect is still difficult to read, it can be further enhanced by displaying the sentences in different font sizes according to their importance to help user to focus on important information and search for information easily. WML is the markup language supported by wireless handheld devices. The basic unit of a WML file is 55 a deck; each deck must contain one or more cards. The card element defines the content displayed to users, and the card cannot be nested. Each card links to another card within or across decks. Nodes on the fractal tree of fractal summarization model are converted into cards, and anchor links are utilized to implement the tree structure. Given a card of a summary node, there may be a lot of sentences or child nodes. A large number of sentences in a small display area make it difficult to read. Fisheye View is a visualization technique to enlarge the focus of interest and diminish the information that is less important [13]. When a user look at an object, the objects nearby are shown in a larger visual size; and the visual size of other objects is decreased inversely proportional to their distances to the focus point. In our system, we have modified the fisheye view. The size of an object does not depend on its distance from the focus point, but depends on the significance of the object. The sentences are displayed in different font sizes according to their importance. We have implemented the fisheye view with 3-scale font mode available for WML. The sentences or child nodes are sorted by their sentence weights or fractal value and separated evenly into three groups. The group with highest value is displayed in bLargeQ font size, and the group with middle value and the group with lowest value are displayed in bNormalQ and bSmallQ font size, respectively. Fig. 9. Screen capture of WAP summarization system. (a) Hong Kong Annual Report 2000; (b) Chapter 19 of Hong Kong Annual Report 2000, dCommunication, the Media and Information TechnologyT. 56 C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 The prototype system using Nokia Handset Simulator is presented in Fig. 9. The document presenting is the Hong Kong Annual Report 2000. There are totally 23 chapters in the annual report, 8 of them are in large font, which means that they are more important, and the rest are in normal font or small font according to their importance to the report (Fig. 9a). The number inside the parentheses indicates the number of sentences under the node that are extracted as part of the summary. The main screen of the Hong Kong Annual Report 2000 gives user a general idea of overall information contents and the importance of each chapter. If the user wants to explore a particular node, user can click the anchor link, and the handheld device sends the request to the WAP gateway, and the gateway then decides whether to deliver another menu or the summary of the node to the user depends on its fractal value and quota allocated. Fig. 9b shows the summary of Chapter 19 of Hong Kong Annual Report 2002, bCommunication, the Media and Information TechnologyQ. A handheld PDA is usually equipped with more memory, and the complete summary can be downloaded as a single WML file to the PDA through local synchronization. To read the summary, the PDA is required to install a standard WML file reader, i.e., KWML for Palm [24]. 4. Summarization for financial news delivery on handheld devices Fractal summarization model summarizes the documents based on hierarchical document structure. In addition to large text document, a lot of other information sources also exhibit hierarchical document structure, such as web-site and newspaper. Due to the huge information available, it is difficult to browse these information sources on handheld devices. Automatic summarization is a possible solution. Theoretically, the fractal summarization is capable to summarize to all these information sources as long as the calculation of fractal value is well formulated. As financial news is critical in decision making, we shall modify the fractal value formula of generic fractal summarization in order to summaries the financial news and we shall demonstrate financial news delivery with fractal summarization on handheld devices. 4.1. Fractal summarization of financial news Fractal summarization model performs the summarization based on hierarchical document structure. In addition to large text documents, a lot of other documents also exhibit hierarchical tree document structure, such as web-site, newspaper, etc. The fractal summarization model is capable to summarize these documents based on their structure and their relationships in categorization; therefore, it is a powerful tool in providing m-services of real time information delivery. As present, a lot of electronic news delivery services have been provided. An example of the fractal summarization model being used to summarize the financial news available from Internet is presented in this section. Newspaper is one of the documents that exhibit the well-defined hierarchical document structure. At present, there is a lot of electronic news delivery services provided for PC, and most of them provide summarization tools to help user to search information, such as Lycos Financial Feed System with summarization system from Diyatech [9], YellowBrix with Inxight’s Summarizer [52] and Columbia’s Newsblaster [34]. However, summarizers for PC platform are not adaptable to mobile devices directly. Moreover, the existing commercial summarizers are indeed extracting the first few sentences from the document or using the primitive summarization model without considering the hierarchical structure of documents or the organization of information. Yahoo!News [47] is one of the most popular online content providers. There are 21 categories in the Yahoo!News. Moreover, each of the categories will be sub-divided into subcategories. Take the Business category as an example (Fig. 10), this category contains financial news and it is sub-divided into six sub-categories, namely, Economy, Stock Markets, Earnings, Personal Finance, Industries and Commentary. Each sub-category contains around 10 news articles. Each news article is a tree structure by itself. For some longer news article, there may exist more than one section, and each section contains few paragraphs, and paragraph contains sentences. The Fractal Summarization of Yahoo!News is very similar to the fractal summarization of large text document, only some minor modifications are C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 57 Yahoo! News - Business Weight: 1 Quota: 40 Commentary Weight: 0.25 Quota: 10 Free Flight Weight: 0.0075 Quota: 3 News ... Earning Weight: 0.25 Quota: 10 Economy Weight: 0.2 Quota: 8 News ... News ... Industries Weights: 0.1 Quota: 4 Personal Finance Weight: 0.15 Quota: 6 Stock Market Weight: 0.05 Quota: 2 News ... Fig. 10. Fractal summarization of yahoo! news-dbusinessT category. required to demonstrate the characteristic of the Yahoo!News. ! Firstly, the headings of categories and subcategories do not have a direct impact on the content of news under the branch; it serves for classification purpose only. As a result, the heading method will consider the headings of news articles only. In addition, the headings of categories can be used for personalization of news delivery, the user can set his preference of each category in advance and the system will adjust the weights accordingly. Alternatively, the preference can be constructed by autolearning of machine in the middle-tier. The WAP gateway can analyze the reading behavior of user and predict the user’s preference. ! The location feature in traditional summarization assumes that the text unit in the beginning or ending is more important. The news articles inside a subcategory are sorted in chronological order. The most recent news is usually considered as more important. Therefore, we propose calculating the location weight of a news article by its chronological position in the subcategory or the time-lag between the news event and browsing time. However, when the system traces the summarization tree down to a node inside a news article, the generic location method in fractal summarization will be adopted. ! In order to provide a glimpse of every article, each news article will receive a sentence quota with at least one sentence. 4.2. Financial news delivery to handheld devices In order to minimize the bandwidth requirement and reduce the pressure on computing power of handheld devices, the summarization of Yahoo!News will be conducted in two levels. As high-compression ratio abstracting is more useful [43] and it can save the network bandwidth, the fractal summarization system generates a brief skeleton of summary with compression ratio equals to 4% at the first stage. The details of the summary at different levels of the news tree are generated on demands of users. When handheld device retrieves the financial news from Yahoo! News-Business, the system will first show a card containing with 6 subcategories of dBusinessT category (Fig. 11). In the figure, three subcategories of Business category are displayed in large font, which means that they are more important; and the rest are in normal font or small font according to their importance. The skeleton of news gives user a general idea how the news articles are organized, and the user can decide which subcategory to go into details. When the user clicks the anchor link of subcategory, the WAP gateway will deliver a card depends on the quota allocated. If a large quota is allocated to the subcategory, the system will show another card containing of index of news article. However, if the quota is less than a threshold value of 5 sentences, the system will show a card with the summary of all news articles in the subcategory. In the summary page, when the user clicks the anchor link dMoreT at end of sentences, the system will generate the summary for the corresponding news articles with compression ratio 20%, because it has been proved that extraction of 20% sentences can be as informative as the full text of the source document [36]. On the other hand, the user can click the anchor link dFullT to view the full text of the news articles. Such interactive summarization reduces the computation load in comparing with the generation of the entire summary in one batch by the traditional automatic summarization, which is ideal for m-services. 4.3. Future work In current stage, the fractal summarization is capable to process textual information only. However, there is a 58 C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 Fig. 11. Financial news delivery system on mobile devices. lot of information available in multimedia format on the Web. Information delivery of multimedia document will be one of the key research topics in the near future. As the multimedia documents require a much higher bandwidth than textual documents, this problem cannot be resolved solely by the current steaming technology. Summarization of multimedia documents is required for information delivery to mobile devices. The research work of spoken document was initiated in the spoken language track of TREC 1997 [14]. The research of spoken document summarization starts in 2000 [53,54]. Summarization of video has also been investigated [45,46]. It would be a great challenge to move the proposed model to multimedia documents. The summarization of multimedia document is a complementary to the proposed model. Nowadays, most of the mobile devices are speech-based. With the summarization of spoken documents, the information can be easily delivered to speech-based mobile devices. This will certainly increase the popularity of the proposed model. Moreover, it can provide information access for the blind as well [17]. 5. Conclusion Mobile commerce is a promising addition to the electronic commerce by the adoption of portable hand- C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 held devices. However, the mobile computing should not be limited to user-centered m-services applications only, it should be extended to decision making in an organization. With a fast paced economy, organization need to make a decision as fast as possible, access to large text documents or other information sources is important during decision making. Unfortunately, there are many shortcomings of the handheld devices, such as limited resolution and narrow bandwidth. In order to overcome the shortcomings, fractal summarization and information visualization are proposed in this paper, which are critical in decision support in an m-organization. The fractal summarization creates a summary in hierarchical tree structure and presents the summary to the handheld devices through cards in WML. The adoption of keyword feature, location feature, heading feature, and cue feature are discussed. Users may browse the selected summary by clicking the anchor links from the highest abstraction level to the lowest abstraction level. Fractal views are utilized to filter the less important nodes in the document structure, sentences are displayed in different font size to enlarge the focus of interest and diminish the less significant sentences. Such visualization effect draws users’ attention on the important content. The three-tier architecture is presented to reduce the computing load of the handheld devices. The proposed system creates an information visualization environment to avoid the existing shortcomings of handheld devices for mobile commerce. References [1] M.F. Barnsley, A.E. Jacquin, Application of recurrent iterated function systems to images, Proceedings of SPIE Visual Communications and Image Processing (VCIP’88), Cambridge, MA, USA, vol. 1001, SPIE, 1988 (Nov.), pp. 122 – 131. [2] P.B. Baxendale, Machine-made index for technical literature— an experiment, IBM Journal of Research and Development 2 (4) (1958 (Oct.)) 354 – 361. [3] B. Boguraev, R. Bellamy, C. Swart, Summarization miniaturization: delivery of news to handhelds, Proceedings of Workshop on Automatic Summarization 2001, pp. 99–110, in conjunction with The Second Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL 2001), Association for Computational Linguistics, Pittsburgh, PA, USA, 2001 (Jun.). [4] M.H. Brown, W.E. Weihl, Zippers: a Focus+Context display of web pages, Proceedings of World Conference of the Web Society (WebNetT96), San Francisco, CA, USA 1996 (Oct.), DEC SRC Technical Report 140, (1996 (May)). 59 [5] O. Buyukkokten, H. Garcia-Molina, A. Paepcke, T. Winograd, Power browser: efficient web browsing for PDAs, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2000), ACM Press, Hague, Netherlands, 2000 (Apr.), pp. 430 – 437. [6] O. Buyukkokten, H. Garcia-Molina, A. Paepcke, Seeing the whole in parts: text summarization for web browsing on handheld devices, Proceedings of the 10th International Conference on World Wide Web (WWW10), ACM Press, Hong Kong, China, 2000 (May), pp. 652 – 662. [7] O. Buyukkokten, H. Garcia-Molina, A. Paepcke, Accordion summarization for end-game browsing on PDAs and cellular phones, Proceedings of the SIGCHI Conference on Human Factors in Computing System (CHI 2001), ACM Press, Seattle, WA, USA, 2001 (Mar.), pp. 213 – 220. [8] O. Buyukkokten, H. Garcia-Molina, A. Paepcke, Text summarization of web pages on handheld devices, Proceedings of Workshop on Automatic Summarization 2001, in conjunction with The Second Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL 2001), Association for Computational Linguistics, Pittsburgh, PA, USA, 2001 (Jun.). [9] Diyatech Homepage, http://www.diyatech.com/clycos.htm. [10] H.P. Edmundson, New method in automatic extraction, Journal of the ACM 16 (2) (1969 (Apr.)) 264 – 285. [11] B. Endres-Niggemeyer, E. Maier, A. Sigel, How to implement a naturalistic model of abstracting: four core working steps of an expert abstractor, Information Processing and Management 31 (5) (1995 (Sep.)) 631 – 674. [12] J. Feder, Fractals, Plenum, New York, 1988. [13] G.W. Furnas, Generalized fisheye views, Proceedings of the SIGCHI Conference on Human Factors in Computing System (CHI ’86), ACM SIGCHI Bulletin, vol. 17 (4), 1986 (Apr.), pp. 16 – 23. [14] J.S. Garofolo, E.M. Voorhess, V.M. Stanford, K.S. Jones, TERC-6 1997 spoken document retrieval track overview and results, Proceeding of the Sixth Text REtrieval Conference (TREC 6), National Institute of Standards and Technology (NIST), Gaithersburg, MD, USA, 1997 (Nov.), pp. 83 – 91. [15] B.G. Glaser, A.L. Strauss, The Discovery of Grounded Theory; Strategies for Qualitative Research, Aldine de Gruyter, New York, 1967. [16] J. Goldstein, M. Kantrowitz, V. Mittal, J. Carbonell, Summarizing text documents: sentence selection and evaluation metrics, Proceedings of the Twenty-Second Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, (SIGIR’99), Berkeley, California, USA, ACM Press, 1999 (Aug.), pp. 121 – 128. [17] G. Grefenstette, Producing intelligent telegraphic text reduction to provide an audio scanning service for the blind, Working Notes of the American Association for Artificial Intelligence 1998 Spring Symposium (AAAI’98) on Intelligent Text Summarization, Stanford, California, AAAI Press, 1998 (Mar.), pp. 111 – 117. [18] D.K. Harman, Ranking algorithms, in: W.B. Frakes, R. BaezaYates (Eds.), Information Retrieval: Data Structures and Algorithms, Prentice-Hall, 1992, pp. 363 – 392 (Ch. 14). 60 C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 [19] D.K. Harman, E.M. Voorhees, The Fifth Text REtrieval Conference (TREC-5), NIST Special Publication, vol. 500-238, National Institute of Standards and Technology, Gaithersburg, MD, USA, 1996 (Nov.). [20] M.A. Hearst, Subtopic structuring for full-length document access, Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’93), Pittsburgh, Pennsylvania, USA, 1993 (Jun.), pp. 56 – 68. [21] A.E. Jacquin, Fractal image coding: a review, Proceedings of the IEEE, vol. 81 (10), 1993 (Oct.), pp. 1451 – 1465. [22] J. Kepiec, J. Pedersen, F. Chen, A trainable document summarizer, Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’95), Seattle, Washington, USA, 1995 (Jul.), pp. 68 – 73. [23] H. Koike, Fractal views: a fractal-based method for controlling information display, ACM Transactions on Information Systems (TOIS) 13 (3) (1995 (Jul.)) 305 – 323. [24] KWML. KWML-KVM WML (WAP) Browser on Palm. PALM Inc., http://www.jshape.com/kwml/index.html, (2002). [25] M. Lam-Adesina, G.J.F. Jones, Applying summarization techniques for term selection in relevance Feedback, Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’01), New Orleans, Louisiana, USA, ACM Press, 2001 (Sep.), pp. 1 – 9. [26] M. Lamming, M. Eldridge, M. Flynn, C. Jones, D. Pendlebury, Satchel: providing access to any document, any time, anywhere, ACM Transactions on Computer-Human Interaction (TOCHI), special issue on human–computer interaction with mobile systems 7 (3) (2000 (Sep.)) 322 – 352. [27] Y. Lin, E.H. Hovy, Identifying topics by position, Proceedings of the Workshop of Intelligent Scalable Text Summarization, in conjunction of the Fifth Conference on Applied Natural Language Processing (ANLP-97), Washington, DC, Association for Computational Linguistics, 1997 (Mar.), pp. 283 – 290. [28] H.P. Luhn, The automatic creation of literature abstracts, IBM Journal of Research and Development 2 (2) (1958 (Apr.)) 159 – 165. [29] I.S. MacKenzie, R. Soukoreff, Text entry for mobile computing: models and methods, theory and practice, Human–Computer Interaction 17 (2 and 3) (2002) 147 – 198. [30] I.S. MacKenzie, Mobile text entry using three keys, Proceedings of the Second Nordic Conference on Human Computer Interaction. (NordiCHI 2002), ACM Press, Aarhus, Denmark, 2002 (Oct.), pp. 27 – 34. [31] B. Mandelbrot, The Fractal Geometry of Nature, W.H. Freeman, New York, 1983. [32] I. Mani, Recent development in text summarization, The Proceedings of the tenth International Conference on Information and Knowledge Management (CIKM’01), ACM Press, Atlanta, GA, USA, 2001 (Nov.), pp. 529 – 531. [33] I. Mani, D. House, G. Klein, L. Hirschman, T. Firmin, B. Sundheim, The TIPSTER SUMMAC text summarization evaluation, Proceedings of the Ninth Conference on European [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] Chapter of the Association for Computational Linguistics (EACL’99), University of Bergen, Bergen, Norway, 1999 (Jun.), pp. 77 – 85. K. McKeown, R. Barzilay, J. Chen, D. Elson, D. Evans, J. Klavans, A. Nenkova, B. Schiffman, S. Sigelman, Columbia’s newsblaster: new features and future directions, Proceedings of the 2003 Human Language Technology conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Demonstrations, Association for Computational Linguistics, Edmonton, Canada, 2003 (May), pp. 15 – 16. T. Mendelsohn, Content at the point-of-care: in the palm of a doctor’s hand, Proceedings of Online Information, London, U.K., 2001 (Dec.), pp. 169 – 174. A.H. Morris, G.M. Kasper, D.A. Adams, The effects and limitations of automated text condensing on reading comprehension performance, Information Systems Research 3 (1) (1992 (Mar.)) 17 – 35. T. Nomoto, Y. Matsumoto, A new approach to unsupervised text summarization, Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’01), ACM Press, New Orleans, LA, USA, 2001 (Sep.), pp. 26 – 34. PALM. PALM: Providing Fluid Connectivity in a Wireless World. White Paper of Palm Inc., http://www.palm.com/ wireless/ProvidingFluidConnectivity.pdf, (2002). K. Perlin, Quikwriting: continuous stylus-based text entry, Proceedings of the 11th Annual ACM Symposium on User Interface Software and Technology (UIST’98), ACM Press, San Francisco, CA, USA, 1998 (Nov.), pp. 215 – 216. C.J. van Rijsbergen, Information Retrieval, Butterworths, London, 1979. G. Salton, C. Buckley, Term-weighting approaches in automatic text retrieval, Information Processing and Management 24 (5) (1998 (May)) 513 – 523. Y. Seki, K. Eguchi, N. Kando, Compact Summarization for Mobile Phones, Proceedings of Mobile HCI 2003 International Workshop, Udine, Italy, 2003 (Sep.), In: F. Crestani, M. Dunlop, S. Mizzaro, (Eds.), Mobile and Ubiquitous Information Access, Lecture Notes in Computer Science, 2954, pp. 172–196, Springer, 2004. S. Teufel, M. Moens, Sentence extraction and rhetorical classification for flexible abstracts, Proceedings of the 1998 AAAI Spring Symposium on Intelligent Text Summarization, AAAI Press, Palo Alto, CA, USA, 1998 (Mar.), pp. 16 – 25. S. Teufel, M. Moens, Sentence extraction as a classification task, Proceedings of the ACL’97/EACL’97 Workshop on Intelligent Scalable Text Summarization, Universidad Nacional de Educación a Distancia (UNED), Madrid, Spain. Morgan Kaufmann Publishers, Madrid, Spain, 1997 (Jul.), pp. 58 – 68. N. Vasconcelos, A. Lippman, Bayesian modeling of video editing and structure: semantic features for video summarization and browsing, Proceedings of 1998 IEEE International Conference on Image Processing (ICIP’98), vol. 3, IEEE Computer Society, Chicago, IL, USA, 1998 (Oct.), pp. 153 – 157. C.C. Yang, F.L. Wang / Decision Support Systems 43 (2007) 46–61 [46] N. Vasconcelos, A. Lippman, A spatiotemporal motion model for video summarization, Proceedings of the IEEE Computer Society Conference on computer Vision and Pattern Recognition (CVPR’98), IEEE Computer Society, Santa Barbara, CA, USA, 1998 (Jun.), pp. 361 – 366. [47] Yahoo!News, 2003, Yahoo!News homepage, http://news. yahoo.com. [48] C.C. Yang, F.L. Wang, Fractal summarization: summarization based on fractal theory, Proceedings of the 26th Annual International ACM SIGIR Conference: Research and Development in Information Retrieval (SIGIR 2003), ACM Press, Toronto, Canada, 2003 (Jul.). [49] C.C. Yang, F.L. Wang, Fractal summarization for mobile devices to access large documents on the web, Proceedings of the Twelfth International Conference on World Wide Web (WWW 2003), ACM Press, Budapest, Hungary, 2003 (May), pp. 215 – 224. [50] C.C. Yang, F.L. Wang, Automatic summarization for financial news delivery on mobile devices, Proceedings of the Twelfth International Conference on World Wide Web (WWW 2003), ACM Press, Budapest, Hungary, 2003 (May), pp. 391 – 392. [51] C.C. Yang, F.L. Wang, Document summarization on handheld device: an information visualization tool for mobile commerce, Proceeding of The First Workshop on e-Business (WEB2002) in International Conference on Information Systems (ICIS 2002), Association for Information Systems, Barcelona, Spain, 2002 (Dec.). [52] YellowBrix, 2003, YellowBrix Homepage, http://www. yellowbrix.com/. [53] K. Zechner, A. Waibel, DiaSumm: flexible summarization of spontaneous dialogues in unrestricted domains, Proceedings of the 18th International Conference on Computational Linguistics (COLING-2000), Saarbruecken, Germany, Morgan Kaufmann Publishers, Universität des Saarlandes, Saarbrücken, Germany, 2000 (Jul.), pp. 968 – 974. [54] K. Zechner, A. Waibel, Minimizing word error rate in textual summaries of spoken language, Proceedings of the 1st Conference of the North American Chapter of the Association for computational Linguistics and the 6th Conference on Applied Natural Language Processing (NAACL/ANLP 2000), Association for Computational Linguistics, Seattle, WA, USA, 2000 (Apr.), pp. 186 – 193. 61 Christopher C. Yang is an associate professor in the Department of Systems Engineering and Engineering Management at the Chinese University of Hong Kong. He received his BS, MS, and PhD in Electrical and Computer Engineering from the University of Arizona. Before he joined the Chinese University of Hong Kong, he was an assistant professor in the Department of Computer Science and Information Systems and the associate director of the Authorized Academic JavaSM CampusSM at the University of Hong Kong. He has also been a research scientist in the Artificial Intelligence Laboratory in the Department of Management Information Systems at the University of Arizona. His recent research interests include cross-lingual information retrieval, multimedia information retrieval, digital library, information visualization, Internet searching, automatic summarization, information behavior, and electronic commerce. He has published over 100 refereed journal and conference papers in Journal of the American Society for Information Science and Technology, IEEE Transactions on Image Processing, IEEE Transactions on Robotics and Automation, IEEE Computer, Information Processing and Management, Decision Support Systems, Graphical Models and Image Processing, Optical Engineering, Pattern Recognition, International Journal of Electronic Commerce, Applied Artificial Intelligence, SIGIR, ICIS, WWW, and more. He was the chairman of the Association for Computing Machinery Hong Kong Chapter, the program co-chair of the First International Conference on Asia Digital Library, and a program committee and organizing committee member for several international conferences in digital library, information systems, electronic commerce, computer vision, image processing, and information retrieval. He has frequently served as an invited panelist in the NSF Digital Library Initiative Review Panel and the NSF Information Technology Research Review Panel in US. Fu Lee Wang is an Instructor in the Department of Computer Science, City University of Hong Kong. He received B.Eng. (Hon) degree in Computer Engineering and M.Phil. degree in Computer Science and Information Systems from the University of Hong Kong, and PhD degree in Systems Engineering and Engineering Management from Chinese University of Hong Kong. His current research interests focus on document summarization, digital library, and information retrieval. His research work has been appeared in such journals as Journal of the American Society for Information Science and Technology, and Information Processing Letters.
© Copyright 2026 Paperzz