Calculating Emotional Score of Words for User Emotion Detection in Messenger Logs

Lun-Wei Ku and Cheng-Wei Sun
National Yunlin University of Science and Technology, Taiwan

IEEE IRI 2012, August 8-10, 2012, Las Vegas, Nevada, USA. 978-1-4673-2284-3/12/$31.00 ©2012 IEEE

Abstract

This paper utilized sentences containing emoticons from articles in Yahoo! blogs to automatically detect users' emotions from messenger logs. Four approaches were proposed: the topical approach, the emotional approach, the retrieval approach, and the lexicon approach. Forty emoticon classes found in Yahoo! blog articles were used for experiments. Two experiments were performed. The first experiment classified sentences into 40 emoticon classes by calculating emotional scores of words. The second experiment took the Yahoo! and MSN messenger logs collected from users as the experimental materials, classified them into 40 emoticon classes by the proposed approaches, and mapped the 40 emoticon classes to 6 emotion classes to tell the user's emotion. The best performance for user emotion detection was achieved by the topical approach, and its micro-average precision of 0.48 was satisfactory.

Keywords: Sentiment analysis, Emoticon generation, Emotion detection, Messenger log

1. Introduction

Emotion analysis has become one of the major research topics in subjective information processing. Related approaches as well as applications have been proposed [1][2]. There are many ways to categorize emotions. Different emotion states were used for experiments in previous research [3]. To find suitable categories of emotions, we adopted the three-layered emotion hierarchy proposed by Parrott [4]¹. In this hierarchy, 6 emotions are in the first layer: love, joy, surprise, anger, sadness and fear. The second layer includes 25 emotions, and the third layer includes 135 emotions. Using this hierarchical classification of emotions, we can categorize them from rough to fine granularities and degrade to the upper level when the experimental materials are insufficient. With the hierarchy, mapping categories in other research to ours becomes clearer and easier, and when generating the testing instances, annotators have more information on how to label the current emotion. In this paper, the six emotion classes in the first layer were used in experiments.

¹ http://changingminds.org/explanations/emotions/basic%20emotions.htm

Subjective information was easily found in web-based chatting environments such as blog space, Twitter, and instant message programs. Among them, the webpage-based information, like blog articles, was more public and easily reachable, while the program-based information, like messenger logs, was difficult to access due to privacy issues. However, private messages could tell more about the emotion of users. Therefore, detecting and analyzing emotions from them is indispensable for further applications.

In the past, emotion analysis was usually performed from the reader's perspective. Therefore, the commonly seen process to acquire the experimental corpus was first collecting materials and then asking annotators to label the emotions on them. However, we aim to detect emotions from the author's perspective in this paper; therefore, a different process to collect experimental materials is proposed. As far as we know, there are no materials containing MSN logs and the corresponding user's emotions so far. We collected sentences containing emoticons from blogs, and viewed the emoticon within the sentence as the blogger's emotion, to construct the experimental materials from which to learn how to detect emotion automatically. Sentences containing emoticons were called emoticon sentences hereafter. Then we utilized the learned information to detect users' emotions from messenger logs. To evaluate the performance, we recorded the messenger logs and asked annotators to label their emotions each hour.

In this paper, we proposed four approaches to calculate the emotion score of a word for each emotion class. The experiments were designed from the author's perspective in both the material preparation and evaluation phases. The emotional scores were used to detect the user's emotion from messenger logs. Using the proposed approaches, the emotional scores could be generated even when emotion categories were different.

2. Experimental materials

Emoticon sentences and messenger logs were adopted for experiments in this paper. Emoticon sentences were used to learn the emotional scores for detecting emoticons in sentences and testing the performance. The learned emotional scores were then utilized to predict the emoticon in the messenger logs and further to find the emotion. Though the emoticon of sentences in messenger logs was predicted, messenger logs were used only to test the performance of emotion detection.

2.1. Emoticon sentences from Yahoo! blogs

Because multiple emoticons in one sentence would increase the degree of confusion, sentences containing exactly one emoticon were collected for experiments. A total of 1,540,163 sentences were collected from articles in Yahoo! blog in the period from July 2006 to June 2007. The forty emoticons in the Yahoo! blog platform, their meanings, and the number of the corresponding collected sentences were listed in Table 1.

1. smile 75,650 11. surprise 58,175 2. sad 41,118 3. blink 42,334 4. happy 114,969 5. blink2 44,328 6. bother 43,413 12. angry 13. smug 14. cool 15. worried 16. evil 46,618 22. stubborn 13,050 20,796 28,957 44,390 23. same 24. sleepy 25. eh… 6,938 33. let me 31. yeh? 32. slobber think 21,039 37,101 14,149 10,011 15,461 21.angel 6,617 34. is it? 28,560 35. clap 17. cry 19,873 26. sick 36. pray 44,342 8. shy 48,406 9. tongue 71,402 18. laugh 19. honest 29. clowd 5,855 131,148 28. won’t tell 32,563 37. sign 38. humph 39. flower 19,783 55,859 92,635 27. secret 17,577 52,880 7. love 106,274 17,480 10. kiss 29,414 20. so stuck 17,170 30. don’t be stupid 35,024 11,710 3,256 40. pig 13,838

Table 1. Statistics of emoticons and emoticon sentences

2.2. Messenger logs

Messenger logs were used as the materials to detect emotions. They were originally used for creating an intelligent ambience according to the emotions of users [5]. We collected texts from Yahoo! Messenger and MSN Messenger logs of 8 annotators. Once an hour, whenever there was at least one new message, the collecting program popped up a menu and asked them to annotate their current emotion. A total of 150 records were annotated for experiments, and the statistics were shown in Table 2. The quantity was not large because we needed to wait for the annotators' chat, and at most one record would be generated each hour by one annotator.

| Emo   | 1  | 2  | 3 | 4  | 5  | 6 |
|-------|----|----|---|----|----|---|
| Log # | 11 | 80 | 1 | 15 | 39 | 4 |

Table 2. Annotated messenger logs (Emo: Emotions, 1=Love, 2=Joy, 3=Surprise, 4=Angry, 5=Sad, 6=Fear; Log #: number of messenger logs)

3. Learning emotional scores of words

The major aim of the proposed approaches was to detect users' emotions in messenger logs. As mentioned, the emoticon sentences were treated as the learning materials, and from them the emotional score of each word was calculated. The learned emotional scores of the words in the messenger log were accumulated to determine the emotion class of the log. Four approaches were proposed: the topical approach, the emotional approach, the retrieval approach and the lexicon approach. The topical approach and the retrieval approach considered the importance of words in the query sentence and the emoticon sentences, while the emotional approach calculated the “emoticonal” tendency of the words in the query sentence. The lexicon approach looked up words of different emoticon classes in an emotion dictionary.

3.1. Topical Approach

In this approach, the concept of tf-idf was used to calculate the emotional score of words in the query sentence. Each emoticon sentence was treated as a document, and then the idf of each word was first calculated by formula (1):

$$\mathrm{idf}(w_i) = \log\left(\frac{N}{N_{w_i}}\right) \tag{1}$$

where tf denoted term frequency in one document, idf inverse document frequency, $w_i$ the current word, $N$ the total number of documents (here, emoticon sentences), and $N_{w_i}$ the number of documents containing $w_i$. Then the idf score of the word was distributed to 40 emoticon classes by the probability of observing the emoticon sentences of the emoticon class $c_j$ containing $w_i$ over all emoticon sentences. Each word $w_i$ has forty emoticon scores, and each score, denoted as $\mathrm{emoticon}(w_i, c_j)$, corresponded to an emoticon class $c_j$. Formulae (2) and (3) show how to calculate these scores.

$$\mathrm{emoticon}(w_i, c_j) = \mathrm{prob}(s_{w_i,c_j}) \times \mathrm{idf}(w_i) \tag{2}$$

$$\mathrm{prob}(s_{w_i,c_j}) = \frac{N(s_{w_i,c_j})}{N} \tag{3}$$

where $s_{w_i,c_j}$ denoted those sentences of emoticon class $c_j$ containing word $w_i$.

After that, the emotional score of $w_i$ in emotion class $e_k$ was calculated by summing up the scores of the emoticons mapped to $e_k$, as shown in formula (4). The mapping of the emoticon classes and the emotion class was listed in Table 3. The emotion class emo_class of the query sentence $s_q$ was determined by formula (5).

$$\mathrm{emotion}(w_i, e_k) = \sum_{c_j \in e_k} \mathrm{emoticon}(w_i, c_j) \tag{4}$$

$$\mathrm{emo\_class}(s_q) = \arg\max_{e_k} \sum_{w_i \in s_q} \mathrm{emotion}(w_i, e_k) \tag{5}$$

Because an emoticon sentence was treated as a document, if a word appeared multiple times in one emoticon sentence, its idf would be accumulated for each observation. Therefore, the term frequency (tf) was implicitly calculated in formula (5), following the definition of tf-idf that tf was the frequency of a term in the current document.

| $e_k$ (emotion) | $c_j$ (emoticon)                           |
|-----------------|--------------------------------------------|
| Love            | 7(love), 8(shy), 10(kiss)                  |
| Joy             | 1(smile), 4(happy), 13(smug), 18(laugh)    |
| Surprise        | 11(surprise)                               |
| Angry           | 12(angry)                                  |
| Sadness         | 17(cry), 37(sign)                          |
| Fear            | 15(worried)                                |

Table 3. The mapping of emoticon classes and the emotion class

3.2. Emotional Approach

The concept of using tf-idf is to find words observed more often (tf) in fewer documents (idf). These terms can be viewed as the representatives of a document. In the emotional approach, we tried to find the representative words for each emoticon class and calculate the score to denote the degree of representation. Therefore, in the emotional approach, the emoticon sentences of the same emoticon class were concatenated into one document. A total of 40 documents were generated, and the corresponding tf-idf scores were calculated for each word to be the $\mathrm{emoticon}(w_i, c_j)$. Then $\mathrm{emo\_class}(s_q)$ was determined by formulae (4) and (5), too.

3.3. Retrieval Approach

The retrieval approach treated the emoticon detection problem as an information retrieval (IR) problem. The query sentence $s_q$ was the query in the IR system, and the emoticon sentences which contained the most important words in $s_q$ were retrieved to vote for its emoticon class. To evaluate IR approaches, P@10 is a common measure which calculates the precision of the ten most highly ranked results. P@10 was used because users usually cared about the highly ranked documents, and hence most IR systems tried to improve the precision of retrieving them. Following the concept of information retrieval and P@10, the 10 most highly ranked emoticon sentences were retrieved, and the majority of their emoticons was treated as the emoticon class of the query sentence. The retrieving of the emoticon sentences was based on the tf-idf model, too. The ranking score $\mathrm{rank\_score}(s_n)$ of the emoticon sentence $s_n$ was calculated by formula (6), and then the emotional class of $s_q$ was determined by formula (7).

$$\mathrm{rank\_score}(s_n) = \sum_{w_i \in s_n \cap s_q} \mathrm{idf}(w_i) \tag{6}$$

$$\mathrm{emo\_class}(s_q) = \arg\max_{e_k} \left|\{\, s_{e_k} : \mathrm{rank}(s_{e_k}) \le 10 \,\}\right| \tag{7}$$

3.4. Lexicon Approach

The lexicon approach was performed by looking up words in an emotion dictionary and calculating their emotional scores. The emoticon sentences were not used in this approach; instead, the Chinese emotion dictionary [6] was adopted. In this dictionary, lexicons were categorized into eight emotion types: awesome, heartwarming, surprising, sad, useful, happy, boring, and angry. These eight emotion types appeared in Yahoo! News Taiwan in the year 2008, but as we can see, not all of them were general emotion states. Therefore, we tried to find the emotion categories of Lin et al. in Parrott's emotion hierarchy before using their dictionary. Those that could not be found were categorized into the Other class. Ku and Chen's approach [7] for calculating sentiment scores $\mathrm{senti\_score}(w_i)$ was then adopted to give scores to these lexicons, as the emoticon sentences were no longer used for learning scores. The scores of lexicons of the same emotional class were summed up, and the emotion category with the highest total score was selected as the detected emotion, as shown in formula (8).

$$\mathrm{emo\_class}(s_q) = \arg\max_{e_k} \sum_{w_i \in s_q,\, lex_k} \mathrm{senti\_score}(w_i, e_k) \tag{8}$$

where $lex_k$ denoted the lexicon set which belonged to emotion class $e_k$.

4. Experimental results and discussions

To evaluate the performance of the emoticon detection in emoticon sentences and the emotion detection in messenger logs, 10-fold experiments were performed. The best results of emoticon detection among the four approaches, generated by the topical approach, were listed in Table 4, and the results of the four approaches for emotion detection were listed in Table 5.

| Class | Precision | Class | Precision | Class | Precision | Class | Precision |
|-------|-----------|-------|-----------|-------|-----------|-------|-----------|
| 1     | 0.042     | 11    | 0.108     | 21    | 0.002     | 31    | 0.034     |
| 2     | 0.007     | 12    | 0.247     | 22    | 0.001     | 32    | 0.167     |
| 3     | 0.007     | 13    | 0.003     | 23    | 0.007     | 33    | 0.014     |
| 4     | 0.153     | 14    | 0.010     | 24    | 0.181     | 34    | 0.001     |
| 5     | 0.008     | 15    | 0.025     | 25    | 0.000     | 35    | 0.153     |
| 6     | 0.142     | 16    | 0.008     | 26    | 0.031     | 36    | 0.255     |
| 7     | 0.377     | 17    | 0.454     | 27    | 0.004     | 37    | 0.002     |
| 8     | 0.078     | 18    | 0.646     | 28    | 0.005     | 38    | 0.003     |
| 9     | 0.033     | 19    | 0.001     | 29    | 0.001     | 39    | 0.063     |
| 10    | 0.033     | 20    | 0.001     | 30    | 0.025     | 40    | 0.011     |

Table 4. Precision of emoticon detection (topical approach)

Table 4 showed that the performance of emoticon detection varied across emoticon classes. The best performances were found in three classes: Laugh (class 18, 0.646), Cry (class 17, 0.454) and Love (class 7, 0.377), and Table 1 showed that the number of the emoticon sentences of these classes overwhelmed the other classes. Figure 1 further showed the relation between the number of emoticon sentences and the performance of emoticon detection. From Figure 1 we could see that the curve of the number conformed to the curve of the performances of three approaches. That is, collecting more emoticon sentences would help improve the performance.

Table 5 showed that, by micro-average precision, the best performance of the user emotion detection was achieved by using the emotion scores generated by the topical approach, while the emotional approach performed the worst. After looking over the emotional scores, we found that the unsatisfactory performance was caused by the concatenation of the emoticon sentences of the same class. This process made forty very large documents, where the term frequency might become the dominant factor and deteriorate the performance. The retrieval approach was better than the emotional approach, but worse than the topical approach. Both the topical approach and the retrieval approach determined the results according to the tf-idf score. However, the topical approach distributed it to emotion classes, while the retrieval approach utilized it to rank sentences for voting the result. As a result, we can say that considering the important words of the query sentence to find the emotion class performs better than letting similar sentences determine it.

The lexicon approach was different from the other three in that it did not calculate scores based on emoticon sentences. Its performance was the second among all. The advantage of using lexicons was that we could find words not appearing in the emoticon sentences and hence would still be able to know the emoticon class of sentences, even though there were no previously seen words in them. The performance of the emotion class Surprise (1.000, 58,175 emoticon sentences) in Table 5 showed this phenomenon. However, having a fixed lexicon set was also its disadvantage. When there were many emoticon sentences, so that scores of various words were learned in the topical approach, the lexicon approach would suffer from limited lexicons when determining the emotion class.

Figure 1. Number of emoticon sentences and the performances of 40 emoticon classes

As to the performance of user emotion detection shown in Table 5, all approaches tended to perform unsatisfactorily for the emotion classes Love, Angry and Fear. For Angry and Fear, Figure 1 has shown that the insufficiency of emoticon sentences was one cause of the low performance. Moreover, Table 2 showed that these emotion classes were seldom selected by annotators. Logs of these classes might be related to specific events represented by special word compositions instead of certain subjective words. Table 6 showed the confusion matrix of user emotion detection. It showed that messenger logs of Sadness were often classified as Joy. Two major characteristics were found in Sadness logs: a lack of words, or only commonly used words. The former characteristic made it difficult for us to determine the emotion class because there was no information; for the latter, we would need additional context, using n-grams instead of individual words, to get more information from the sentence.
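To make the scoring behind these results concrete, the topical approach of formulae (1) to (5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy sentences, the natural logarithm in formula (1), and the pre-tokenized input are all assumptions, and only three emoticon classes from Table 3 are used. The sketch also shows why a log made only of unseen words gives no evidence: every emotion score stays at zero.

```python
import math
from collections import Counter, defaultdict

# Emoticon-to-emotion mapping for a few classes, taken from Table 3.
EMOTICON_TO_EMOTION = {"happy": "Joy", "laugh": "Joy", "cry": "Sadness"}
EMOTIONS = ["Love", "Joy", "Surprise", "Angry", "Sadness", "Fear"]


def learn_emoticon_scores(sentences):
    """Return {word: {emoticon_class: score}} following formulae (1)-(3).

    `sentences` is a list of (tokens, emoticon_class) pairs; tokenization is
    assumed to have been done already (hypothetical input format).
    """
    n_total = len(sentences)              # N
    doc_freq = Counter()                  # N_{w_i}
    class_freq = defaultdict(Counter)     # N(s_{w_i, c_j})
    for tokens, emoticon in sentences:
        for word in set(tokens):          # count each sentence once per word
            doc_freq[word] += 1
            class_freq[word][emoticon] += 1

    scores = {}
    for word, n_w in doc_freq.items():
        idf = math.log(n_total / n_w)     # formula (1); natural log assumed
        scores[word] = {
            emoticon: (n_wc / n_total) * idf   # formulae (3) and (2)
            for emoticon, n_wc in class_freq[word].items()
        }
    return scores


def emotion_totals(tokens, scores):
    """Accumulate emotion scores over the words of a query (formulae (4), (5))."""
    totals = {emotion: 0.0 for emotion in EMOTIONS}
    for word in tokens:                   # repeated words add up: implicit tf
        for emoticon, score in scores.get(word, {}).items():
            emotion = EMOTICON_TO_EMOTION.get(emoticon)
            if emotion is not None:
                totals[emotion] += score  # formula (4)
    return totals


def classify(tokens, scores):
    """Return the arg-max emotion class, or None when no word carries a score."""
    totals = emotion_totals(tokens, scores)
    if not any(totals.values()):
        return None                       # no evidence at all
    return max(totals, key=totals.get)    # formula (5)


# Toy emoticon sentences (invented for illustration only).
toy_sentences = [
    (["great", "day"], "happy"),
    (["great", "fun"], "laugh"),
    (["lost", "keys"], "cry"),
    (["lost", "again"], "cry"),
]
toy_scores = learn_emoticon_scores(toy_sentences)

print(classify(["great", "fun"], toy_scores))
print(classify(["lost", "again"], toy_scores))
print(classify(["hello"], toy_scores))
```

Returning `None` for a query without any scored word is a design choice of this sketch; the paper does not state how such logs were handled.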
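The macro- and micro-averages in Table 5 can be checked with a short calculation. The sketch below takes the per-class values from Table 5 and the log counts from Table 2, with the Love row read as 0.000 for all four approaches. Weighting each class by its number of annotated logs is an assumption about how the micro-average was obtained; the paper does not spell this out, but the weighting reproduces the reported figures after rounding.

```python
# Per-class values from Table 5, in the order
# Love, Joy, Surprise, Angry, Sadness, Fear.
PER_CLASS = {
    "Topical":   [0.000, 0.850, 0.000, 0.000, 0.103, 0.000],
    "Emotional": [0.000, 0.238, 0.000, 0.000, 0.000, 0.000],
    "Retrieval": [0.000, 0.438, 0.000, 0.000, 0.103, 0.000],
    "Lexicon":   [0.000, 0.325, 1.000, 0.000, 0.026, 0.000],
}

# Number of annotated messenger logs per emotion class (Table 2).
LOG_COUNTS = [11, 80, 1, 15, 39, 4]


def macro_average(values):
    """Unweighted mean over classes: every class counts equally."""
    return sum(values) / len(values)


def micro_average(values, counts):
    """Mean weighted by class size (assumed weighting): every log counts equally."""
    weighted = sum(v * n for v, n in zip(values, counts))
    return weighted / sum(counts)


averages = {
    name: (macro_average(values), micro_average(values, LOG_COUNTS))
    for name, values in PER_CLASS.items()
}

for name, (macro, micro) in averages.items():
    print(f"{name:<10} macro={macro:.3f} micro={micro:.3f}")
```

Because Joy accounts for 80 of the 150 logs, the micro-average is dominated by the Joy row, while the single Surprise log lifts the lexicon approach's macro-average without much effect on its micro-average.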
Bellegarda reported that his best f-measure was 0.340, also for 6 categories [3]. Notice that his work analyzed emotions from the reader's perspective, while our work analyzed them from the author's perspective. Emotion analysis from the author's perspective was generally considered more difficult than from the reader's perspective, as what a user felt might not be consistent with what he/she wrote in messengers. Therefore, though Bellegarda's experiments and the experiments in this paper were done on different datasets and evaluated by different metrics, the micro-average precision of the topical approach, 0.480, reported by this paper was considered satisfactory.

| Class     | Topical | Emotional | Retrieval | Lexicon |
|-----------|---------|-----------|-----------|---------|
| Love      | 0.000   | 0.000     | 0.000     | 0.000   |
| Joy       | 0.850   | 0.238     | 0.438     | 0.325   |
| Surprise  | 0.000   | 0.000     | 0.000     | 1.000   |
| Angry     | 0.000   | 0.000     | 0.000     | 0.000   |
| Sadness   | 0.103   | 0.000     | 0.103     | 0.026   |
| Fear      | 0.000   | 0.000     | 0.000     | 0.000   |
| Macro-Avg | 0.159   | 0.040     | 0.090     | 0.225   |
| Micro-Avg | 0.480   | 0.127     | 0.260     | 0.187   |

Table 5. Precision of user emotion detection in messenger logs

| S \ A | 1 | 2  | 3 | 4  | 5  | 6 |
|-------|---|----|---|----|----|---|
| 1     | 0 | 3  | 0 | 0  | 0  | 0 |
| 2     | 9 | 68 | 1 | 13 | 33 | 4 |
| 3     | 1 | 0  | 0 | 0  | 0  | 0 |
| 4     | 0 | 1  | 0 | 1  | 0  | 0 |
| 5     | 0 | 4  | 0 | 1  | 4  | 0 |
| 6     | 0 | 0  | 0 | 0  | 0  | 0 |

Table 6. Confusion matrix of emotion detection (topical approach; S: system; A: answer)

5. Related work

The experiments in this paper were related to several research problems, including emoticon generation, emotional lexicon collection, and emotion detection.

The emoticon detection experiments in this paper can be viewed as an emoticon generation process. Our approaches predicted whether an emoticon should be found in the current sentence, in other words, the suitable emoticon for the current sentence. Suzuki and Tsuda [8] analyzed texts and generated emoticons for sentences automatically for cell-phone messages, which was similar to what we did. Emoticons have been used to reduce dependency in machine learning techniques for sentiment classification [9]. Therefore, emoticon sentences were also used for subjective sentence classification. However, sentences in that research were classified into only positive and negative classes [10], or an additional neutral class, instead of forty emoticon classes, which was different from our experiments.

Various emotion dictionaries were proposed. Some of them include emotions but are not limited to emotions, like General Inquirer² and SentiWordNet [11]. Some researchers created dictionaries or lexicons specifically for emotion analysis [12]. Researchers utilized the existing emotion dictionaries to classify emotions in texts [13]. However, Osherenko and André did research on the necessity of affect dictionaries and found that they did not provide much additional information [14], which conformed to our results.

² http://www.wjh.harvard.edu/~inquirer/

Knowing emotions is a good way to predict the further actions or demands of users. Many researchers found experimental materials on the Internet chatting platforms [15]. Matsumoto, Fuji, and Kuroiwa developed a system utilizing sentence structures to estimate the emotion from a text paragraph [16], which looked like the system we would like to build. However, they analyzed the texts from the reader's perspective, which was different from our aim and future applications.

6. Conclusion and future work

This paper aimed to detect users' emotions from messenger logs, which is an emotion analysis research problem from the author's perspective. Forty emoticon classes and six emotion classes were adopted for the experiments. The emoticon sentences from Yahoo! blogs were utilized to learn the emotional score of words. Four approaches, including the topical approach, the emotional approach, the retrieval approach, and the lexicon approach, were proposed. The topical approach performed the best, while the lexicon approach did better in minority classes. The micro-average precision of the topical approach achieved 0.48, which was satisfactory compared to other research.

Several improvements can be implemented in the future. To improve the topical approach, different scoring functions will be tested. To improve the emotional approach, a sentence classification can be performed and sentences of the same class can be concatenated to generate short documents for each emoticon class. To improve the retrieval approach, real information retrieval systems such as Lucene³ will be adopted. To improve the lexicon approach, we will try to use more emotional dictionaries in the experiments. In this paper, we suffered from the unbalanced materials among classes. We plan to collect more materials for small classes to further examine the effect of the quantity factor.

³ http://lucene.apache.org/core/

Acknowledgements

Research of this paper was partially supported by National Science Council, Taiwan, under the contract NSC100-2218-E-224-013-.

References

[1] Dipankar Das, “Analysis and Tracking of Emotions in English and Bengali Texts: A Computational Approach”, Proceedings of the International World Wide Web Conference (WWW 2011), 2011, pp. 343-347.

[2] Elena Frantova, and Sabine Bergler, “Automatic Emotion Annotation of Dream Diaries”, K-CAP Workshop on Analyzing Social Media to Represent Collective Knowledge, California, USA, 2009.

[3] Jerome R. Bellegarda, “Emotion Analysis Using Latent Affective Folding and Embedding”, Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, Los Angeles, 2010, pp. 1-9.

[4] W. Parrott, “Emotions in Social Psychology”, Psychology Press, Philadelphia, 2001.

[5] Lun-Wei Ku, Cheng-Wei Sun, and Ya-Hsin Hsueh, “Demonstration of IlluMe: Creating Ambient According to Instant Message Logs”, Proceedings of 50th Annual Meeting of Association for Computational Linguistics, demo paper, Jeju, Republic of Korea, July 8-14, 2012.

[6] Kevin Hsin-Yih Lin, Changhua Yang, and Hsin-Hsi Chen, “Emotion Classification of Online News Articles from the Reader’s Perspective”, Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence, 2008, pp. 220-226.

[7] Lun-Wei Ku and Hsin-Hsi Chen, “Mining Opinions from the Web: Beyond Relevance Retrieval”, Journal of American Society for Information Science and Technology, Special Issue on Mining Web Resources for Enhancing Information Retrieval, 58(12), 2007, pp. 1838-1850.

[8] Nobuo Suzuki, and Kazuhiko Tsuda, “Automatic Emoticons Generation Method For Web Community”, IADIS International Conference on Web Based Communities, San Sebastian, Spain, 2006, pp. 331-334.

[9] Jonathon Read, “Using emoticons to reduce dependency in machine learning techniques for sentiment classification”, Proceedings of the ACL Student Research Workshop, 2005, pp. 43-48.

[10] Ying-Tse Sun, Chien-Liang Chen, Chun-Chieh Liu, Chao-Lin Liu, and Von-Wun Soo, “Sentiment Classification of Short Chinese Sentences”, Proceedings of the 22nd Conference on Computational Linguistics and Speech Processing (ROCLING 2010), Nantou, Taiwan, 2010, pp. 184-198.

[11] Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani, “SENTIWORDNET 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining”, Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), Malta, May 17-23, 2010, pp. 2200-2204.

[12] Ge Xu, Xinfan Meng, and Houfeng Wang, “Build Chinese Emotion Lexicons Using A Graph-based Algorithm and Multiple Resources”, Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), Beijing, 2010, pp. 1209-1217.

[13] Futoshi Sugimoto, and Yoneyama Masahide, “A Method for Classifying Emotion of Text Based on Emotional Dictionaries for Emotional Reading”, Second International Symposium on Communications, Control, and Signal Processing, Marrakech, Morocco, 2006.

[14] Alexander Osherenko and Elisabeth André, “Lexical Affect Sensing: Are Affect Dictionaries Necessary to Analyze Affect?”, Affective Computing and Intelligent Interaction (ACII2007), LNCS 4738, Lisbon, Portugal, 2007, pp. 232-243.

[15] Shashank and Pushpak Bhattacharyya, “Emotion Analysis of Internet Chat”, Proceedings of ICON-2008: 6th International Conference on Natural Language Processing, Macmillan Publishers, India.

[16] Kazuyuki Matsumoto, Ren Fuji, and Shingo Kuroiwa, “Emotion Estimation System Based on Emotion Occurrence Sentence Pattern”, International Conference on Intelligent Computing (ICIC 2006), LNAI 4114, 2006, pp. 902-911.