Knowledge-Based Approaches to Concept-Level Sentiment Analysis

Enhanced SenticNet with Affective Labels for Concept-Based Opinion Mining

Soujanya Poria, Jadavpur University
Alexander Gelbukh, National Polytechnic Institute
Amir Hussain, University of Stirling
Newton Howard, Massachusetts Institute of Technology
Dipankar Das, National Institute of Technology
Sivaji Bandyopadhyay, Jadavpur University

SenticNet 1.0 is one of the most widely used, publicly available resources for concept-based opinion mining. The presented methodology enriches SenticNet concepts with affective information by assigning an emotion label.

The Internet contains important information on its users' opinions and sentiments. The extraction of such information from unstructured Web data is known as opinion mining and sentiment analysis, a recent and explosively growing research field widely used for marketing, customer service, financial market prediction, and other purposes. Researchers have developed several lexical resources, including WordNet-Affect (WNA)1 and SentiWordNet,2 for different opinion-mining tasks, but most are incomplete and noisy (see the "Lexical Resources" sidebar). In particular, SenticNet3 is a concept-based resource containing 5,732 single- or multiword concepts along with a quantitative polarity score in the range from −1 to +1; example concepts and scores include "aggravation" (−0.925), "accomplish goal" (+0.967), and "December" (+0.111).

However, it's often desirable to have a more complete resource containing affective labels for the concepts as well as polarity scores. Currently, the main lexical resource for emotion detection in natural language texts is WNA, which assigns emotion labels to words, for example, "wrath" (anger), "nausea" (disgust), and "triumph" (joy). WNA, however, has a small number of words and gives only qualitative affective information (an emotion label) but not quantitative information on the emotion's intensity.

Our goal was to create a resource resulting from automatically merging SenticNet and WNA; that is, our resource would contain both SenticNet polarity scores and WNA emotion categories for the concepts. Because WNA's vocabulary is almost a subset of SenticNet's, our task was to automatically extend the emotion labels from WNA to the rest of SenticNet's vocabulary. The obtained resource is an extended version of WNA, and users can apply our methodology to extend similar resources that provide emotion labels for concepts.

Here, we describe a methodology to automatically assign emotion labels to SenticNet concepts. We trained a classifier on the subset of SenticNet concepts present in WNA. As classification features, we used several concept similarity measures, as well as various psychological features available in an emotion-related corpus. Of the various classifiers we tried, a Support Vector Machine (SVM) gave the highest accuracy. This work lays the foundation for a new concept-level opinion-mining resource and presents new features and measures for generating such resources.

Sidebar: Lexical Resources

All known approaches to opinion mining crucially depend on the availability of adequate lexical resources that provide emotion-related information. Several methods that employ semiautomatic building have been suggested, both for English1 and other languages.2,3 SentiWordNet and WordNet-Affect (WNA) are the most widely used resources. They contain a rather small number of words and are mostly limited to affective information for single words.

Recent research4 shows that concept-based sentiment analysis and opinion mining outperform word-based methods. This concept-focused approach relies on polarity and affective information for commonsense knowledge concepts (for example, to accomplish a goal, experience a bad feeling, and celebrate a special occasion), which are often used to express viewpoints and affect. SenticNet is a publicly available, affective, commonsense resource for sentic computing,5 a new paradigm that exploits artificial intelligence, the Semantic Web, and affective computing techniques to better recognize, interpret, and process natural language opinions over the Web. SenticNet was developed through the ensemble application of graph-mining and dimensionality-reduction techniques over multiple commonsense knowledge bases.6 It has been exploited to develop applications in fields such as social media marketing, human-computer interaction, and e-health. Although SenticNet is much larger than WNA, it doesn't provide specific emotion labels for the concepts. In our work, we fill this gap by extending WNA's labels to other SenticNet concepts.

Sidebar References

1. S. Mohammad and P.D. Turney, "Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon," Proc. NAACL-HLT Workshop Computational Approaches to Analysis and Generation of Emotion in Text, Association for Computational Linguistics, 2010, pp. 26–34.
2. P. Arora, A. Bakliwal, and V. Varma, "Hindi Subjective Lexicon Generation Using WordNet Graph Traversal," Int'l J. Computational Linguistics and Applications, vol. 3, no. 1, 2012, pp. 25–39.
3. A. Wawer, "Extracting Emotive Patterns for Languages with Rich Morphology," Int'l J. Computational Linguistics and Applications, vol. 3, no. 1, 2012, pp. 11–24.
4. E. Cambria and A. Hussain, Sentic Computing: Techniques, Tools, and Applications, Springer, 2012.
5. R. Picard, Affective Computing, MIT Press, 1997.
6. E. Cambria, D. Olsher, and K. Kwok, "Sentic Activation: A Two-Level Affective Common Sense Reasoning Framework," Proc. Assoc. Advancement of Artificial Intelligence (AAAI), AAAI, 2012, pp. 186–192.

Our Method

There are many uses for WNA, ranging from social, economic, and commercial uses to health care, political, and governmental uses. For example, companies need to know what emotions customers express in product reviews so that they can improve their products, which translates into better income for the businesses and better quality of life for the consumers. A political party or government needs to know the emotions prevailing in the press related to its actions. The improvements in decision making that result from such analysis represent a more efficient democracy; they would help provide democracy in real time, in contrast to using elections as a means of punishing bad government or giving credit to a party's promises. In all such applications, a larger resource will give more precise and reliable results, because more words in the analyzed texts will contribute more emotion labels to the statistics.
We find it interesting to view our obtained resource as SenticNet augmented with affective labels. Such a resource gives rise to a range of novel applications combining the polarity information from SenticNet with the affective information that we added. In this way, a company could obtain information on which products or features customers like and don't like (the polarity) and on the specific emotions customers feel relating to the products (the affect): for example, are they surprised, angered, or joyful?

Further, we can use polarity information as a degree measure of the corresponding emotion. Although both "sulk" and "offend" have the same emotion label of anger in WNA, "sulk" has polarity −0.303 in SenticNet and "offend" has −0.990. Therefore, "offend" is stronger in terms of the anger affect than "sulk." This information is important for weighted emotion detection. For example, consider the following review of a mobile phone: "the keyboard is comfortable [+0.120, joy] and the interface is amicable [+0.214, joy], but the color is queasy [−0.464, disgust]." Although the review contains two joy words and only one disgust word, weighting the labels by the polarity score indicates that the customer's main emotion was disgust (a short code sketch below makes this concrete).

Lexical Resources

Our work's primary aim is to assign WNA emotion labels to SenticNet's concepts. For this, we use a supervised machine-learning approach. The intuition behind this approach is that words that have similar features are similar in their use and meaning and, in particular, are related to similar emotions. This lets us extend WNA's emotion labels from known seed words to words absent from WNA that share features with the seed words. To achieve this, we must select linguistically relevant features, which we extracted from various relevant text corpora and dictionaries. In this section, we describe the corresponding lexical resources. In addition, we used standard resources such as WordNet 3.0.

SenticNet

As a target lexicon and source of polarity information for our polarity-based concept-similarity measure, we used SenticNet 1.0 (http://sentic.net/senticnet-1.0.zip), which contains 5,732 concepts, of which 2,690 are multiword concepts (such as "animate flesh dead person," "ban Harry Potter," and "why happen"). Of the 5,732 SenticNet concepts, 3,303 exist in WordNet 3.0 and 2,429 do not. Of the latter set, most are multiword concepts such as "access Internet" or "make mistake," except for 68 single-word concepts such as "against" or "telemarketer."

WordNet-Affect Emotion Lists

As a training corpus, we used the WNA lists (www.cse.unt.edu/~rada/affectivetext/data/WordNetAffectEmotionLists.tar.gz) provided as part of the SensEval 2007 data. This dataset consists of six word lists corresponding to Ekman's six basic emotions: anger, disgust, fear, joy, sadness, and surprise.4 Other work provides an overview of different sets of basic emotions proposed in the literature.5 The dataset contains 606 synsets, of which all but two are assigned exactly one label each. If the synsets are broken down into individual concepts, the dataset contains 1,536 concepts. Only 63 concepts are multiword expressions, for example, "with hostility" or "jump for joy." All but 72 concepts (93 percent) are present in the SenticNet vocabulary.
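To make the earlier weighted emotion detection example concrete, here is a minimal, self-contained Python sketch of the polarity-weighted voting. The three entries reproduce the concepts quoted in the phone-review example; representing the lexicon as a plain dictionary is our simplification, not SenticNet's actual distribution format.

    # A minimal sketch of polarity-weighted emotion detection. The lexicon
    # entries are the three concepts from the phone-review example above.
    from collections import defaultdict

    LEXICON = {  # concept -> (polarity score, emotion label)
        "comfortable": (+0.120, "joy"),
        "amicable":    (+0.214, "joy"),
        "queasy":      (-0.464, "disgust"),
    }

    def dominant_emotion(concepts):
        """Weight each label by the absolute polarity of its concepts,
        so one strong concept can outweigh several weak ones."""
        weights = defaultdict(float)
        for concept in concepts:
            if concept in LEXICON:
                score, label = LEXICON[concept]
                weights[label] += abs(score)
        return max(weights, key=weights.get) if weights else None

    # Two joy concepts vs. one disgust concept, yet disgust wins:
    # 0.464 > 0.120 + 0.214 = 0.334.
    print(dominant_emotion(["comfortable", "amicable", "queasy"]))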
ISEAR Dataset

As a source of various features and corpus-based similarity measures between concepts, we used the International Survey of Emotion Antecedents and Reactions (ISEAR) dataset (see www.affective-sciences.org/system/files/page/2636/ISEAR.zip and www.affective-sciences.org/researchmaterial).6 The survey, conducted in the 1990s across 37 countries, consists of short texts called statements, obtained from approximately 3,000 respondents who were instructed to describe a situation in which they felt a particular emotion. A statement contains 2.37 sentences on average. The dataset contains 7,666 statements, 18,146 sentences, and 449,060 running words.

For each statement, the dataset provides 40 numeric or categorical parameters that give various types of information, such as respondent age, the emotion felt in the situation described, and the emotion's intensity. The majority of these parameters are numerical scores with a few discrete values (usually three or four) expressing different parameter degrees.

Preprocessing and Tokenizing

Many of the features we used were based on occurrences of the concepts in the ISEAR dataset. For locating concepts in the text, we used preprocessing tools from RapidMiner's text plug-in (http://rapid-i.com/content/view/181/190), and for lemmatizing we used the WordNet lemmatizer. Of the 5,732 concepts contained in SenticNet, we found 2,729 at least once in ISEAR, either directly or after lemmatizing.

Features Used for Classification

For each SenticNet concept, we extracted from ISEAR statistical features of its occurrences and cooccurrences with other SenticNet concepts in ISEAR statements (we treated multiword concepts as single tokens). We used two feature types: those based on the ISEAR dataset parameters and those based on various similarity measures between concepts.

Features Based on ISEAR Parameters

Some of the 40 parameters provided for each ISEAR statement, such as the respondent's ID, aren't informative for our goals. We used the following parameters:

• background data related to the respondent (age, gender, religion, father's occupation, mother's occupation, and country);
• general data related to the emotion felt in the situation described in the statement (intensity, timing, and longevity);
• physiological data (ergotropic arousals, trophotropic arousals, and felt change in temperature);
• expressive behavior data (movement, nonverbal activity, and paralinguistic activity); and
• the emotion felt in the situation described in the statement.

Even for values expressing a parameter's degree or intensity, we used each degree value as an independent feature, with the frequency of occurrences of a given concept under a given parameter value as the feature value. Namely, for each occurrence of each concept in an ISEAR statement, we extracted the corresponding parameters from the dataset. Then we aggregated the data for multiple occurrences of the same concept over the whole corpus into a feature vector of frequencies for that concept, as sketched below.
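The following sketch illustrates this aggregation under simplified assumptions: each statement is represented as a dictionary holding its text and categorical parameter values, and concepts are matched as single lowercase tokens. The parameter names and values here are illustrative, not the actual ISEAR field names.

    # A sketch of parameter-based feature aggregation: for each concept,
    # count how often it occurs in a statement annotated with each
    # (parameter, value) pair; the counts form its feature vector.
    from collections import Counter, defaultdict

    statements = [
        {"text": "I felt joy when I passed the exam",
         "EMOTION": "joy", "INTENSITY": "3"},
        {"text": "I was angry after the argument",
         "EMOTION": "anger", "INTENSITY": "2"},
    ]
    concepts = {"exam", "argument"}
    parameters = ["EMOTION", "INTENSITY"]

    features = defaultdict(Counter)
    for statement in statements:
        tokens = set(statement["text"].lower().split())
        for concept in concepts & tokens:
            for p in parameters:
                features[concept][(p, statement[p])] += 1

    print(features["exam"])
    # Counter({('EMOTION', 'joy'): 1, ('INTENSITY', '3'): 1})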
Features Based on Similarity Measures

Another type of classification feature we used was the similarity values between concepts. We used two types of similarity measures: those based on lexical resources, such as SenticNet and WordNet, and those based on concept cooccurrence. The intuition behind similarity-based features is that if the distances from two data points in Euclidean space to a number of other points are similar, then it's probable that the two points are close to each other. Given a concept, we treated the value of each similarity measure, obtained by comparing the given concept with each other concept in the vocabulary, as an independent dimension of its feature vector.

SenticNet score-based similarity. We define the distance between two SenticNet concepts a and b as D_SN(a, b) = |p(a) − p(b)|, where p(·) is the polarity specified for these concepts in SenticNet. The similarity is the inverse of the distance: Sim_SN(a, b) = 1 / D_SN(a, b).

WordNet distance-based similarity. We used English WordNet 3.0 to measure the semantic distance between two words. WordNet::Similarity (www.d.umn.edu/tpederse/similarity.html) is an open source package developed at the University of Minnesota for calculating nine different lexical similarity measures between pairs of word senses. This gave us nine different similarity measures. Because WordNet similarity is defined for specific word senses, for every two concepts found in WordNet, we defined the corresponding similarity as the maximum similarity over all senses of the first concept and all senses of the second concept. For the concepts not found in WordNet, we set the similarity between such a concept and any other concept to a random value. An alternative would be to use 0 or some other not-found category, but this deteriorated the results by making all not-found concepts appear similar to each other because of the numerous coinciding values in their feature vectors.

Point-wise mutual information. Point-wise mutual information (PMI) is a similarity measure based on cooccurrences of two concepts within the same sentence. For concepts a and b, it is defined as

    Sim_PMI(a, b) = log [ p(a, b) / (p(a) p(b)) ],    (1)

where p(x) is the probability for a sentence in the corpus to contain the concept x (the number of sentences where x occurs, normalized by the total number of sentences in the corpus), and p(x, y) is the probability for a sentence to contain both x and y (the normalized number of sentences that contain both).

Emotional affinity. We define the emotional affinity between two concepts a and b in the same way as PMI but at the level of complete ISEAR statements; that is, p(x) in Equation 1 is defined as the corresponding number of statements instead of sentences, normalized by the total number of statements. PMI often reflects syntactic relatedness of the words: for example, it's high for a verb and its typical object, or for parts of a frequently found multiword expression. Emotional affinity, in contrast, incorporates a wider notion of relatedness within the same real-world situation, as well as the notions of synonymy and rephrasing. Because ISEAR statements have strong emotional content and each is related to one emotion, it's probable that when two words co-occur in the same ISEAR statement, the emotion with which one of them is related is the same as the emotion with which the other is related.

ISEAR text-distance similarity. We used positional information of the concept tokens in the ISEAR statements to measure the similarity between concepts. For this, we calculated the average minimum distance between the pairs of tokens of SenticNet concepts that co-occurred in the ISEAR statements. We defined the similarity as the inverse of this distance; if two concepts didn't co-occur in any statement, we considered the similarity between them to be 0.
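As an illustration of the co-occurrence-based measures defined above, the following sketch computes both sentence-level PMI (Equation 1) and statement-level emotional affinity with the same function; only the units passed in differ. The toy corpus is hypothetical; each unit is the set of tokens of one sentence or one statement.

    # PMI (Equation 1) over sentences, and emotional affinity over
    # whole statements, computed by the same count-based function.
    import math

    def pmi(a, b, units):
        n = len(units)
        p_a = sum(1 for u in units if a in u) / n
        p_b = sum(1 for u in units if b in u) / n
        p_ab = sum(1 for u in units if a in u and b in u) / n
        if p_ab == 0:
            return float("-inf")  # the concepts never co-occur
        return math.log(p_ab / (p_a * p_b))

    sentences = [{"make", "mistake"}, {"mistake", "regret"}, {"celebrate"}]
    statements = [{"make", "mistake", "regret"}, {"celebrate"}]

    print(pmi("make", "mistake", sentences))  # sentence-level PMI
    print(pmi("make", "regret", statements))  # statement-level affinity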
Classification Procedure

We cast the task as a six-way categorization task, in which we assigned exactly one of the six WNA emotion labels to each considered concept. We conducted two sets of experiments: in one, we took into account the features that relied on ISEAR; in the other, we didn't use those features.

In the experiments that relied on features based on occurrences or cooccurrences of concepts in the ISEAR statements, we used only the 2,729 SenticNet concepts found in the ISEAR data (which thus had valid ISEAR-based features) in further processing, and hence assigned emotion labels only to them. In the experiments without ISEAR-based features, we assigned labels to all 5,732 SenticNet concepts. Therefore, we constructed feature vectors for 2,729 and 5,732 SenticNet concepts, respectively. As training and test data, we used the intersection between the corresponding set of concepts and the WNA vocabulary (for which we had the gold-standard emotion labels); this intersection consisted of 1,202 and 1,436 concepts, respectively.

To evaluate, we randomly divided the corresponding set of available labeled data into training and test data, using 66.7 percent of the set for training and 33.3 percent for testing. For constructing the final resource, we used 100 percent of the available labeled data for training. Finally, we used various machine-learning algorithms trained on the training data to derive the labels for the test and unlabeled data.

Evaluation and Discussion

As gold-standard data for evaluation, we used the WNA concepts. We considered a label to be assigned correctly if the WNA data assigned the same label to the same concept (in the few cases when WNA assigned two labels to a concept, we considered our assignment correct if our label was one of the two).

Table 1 summarizes our experiments with different feature sets, datasets, and classifiers.

Table 1. Accuracy obtained with different feature sets, datasets, and classifiers.

Feature set                   Total dataset       Total size   Labeled dataset           Labeled size   Naive Bayes (%)   Multilayer perceptron (%)   SVM (%)
SenticNet + WordNet + ISEAR   SenticNet ∩ ISEAR   2,729        SenticNet ∩ ISEAR ∩ WNA   1,202          71.20             74.12                       88.64
SenticNet + WordNet           SenticNet           5,732        SenticNet ∩ ISEAR ∩ WNA   1,202          51.23             55.54                       57.77
SenticNet + WordNet           SenticNet           5,732        SenticNet ∩ WNA           1,436          53.21             52.12                       59.23

In the experiments using features derived from SenticNet, WordNet, and ISEAR, we included only the 2,729 concepts in the intersection of SenticNet and ISEAR, because only they had valid ISEAR-based features. When using feature vectors without ISEAR-based features (that is, consisting only of the SenticNet similarity and the nine WordNet::Similarity measures between the given concept and all other concepts in the dataset), all 5,732 SenticNet concepts were processed. Accordingly, as a labeled dataset we used the intersection of the total dataset with WNA: for ISEAR-based experiments, this set contained 1,202 labeled concepts, and for experiments not relying on ISEAR, it contained 1,436 concepts. The SVM classifier showed the best performance, obtaining an accuracy of 88.64 percent with the ISEAR-based features.
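For concreteness, the following sketch shows the split-train-evaluate procedure with scikit-learn. The feature matrix here is random placeholder data (so the printed accuracy is meaningless), and the article does not specify the SVM kernel or parameters, so library defaults are used.

    # A rough sketch of the classification setup with placeholder features.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.random((1202, 50))  # 1,202 labeled concepts, 50 dummy features
    y = rng.choice(["anger", "disgust", "fear", "joy", "sadness", "surprise"],
                   size=1202)

    # 66.7 percent for training, 33.3 percent for testing, as in the article.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=1/3,
                                              random_state=0)
    clf = SVC().fit(X_tr, y_tr)
    print("accuracy:", clf.score(X_te, y_te))

    # For the released resource, the classifier would be retrained on all
    # labeled data and applied to the unlabeled SenticNet concepts:
    #   final = SVC().fit(X, y); labels = final.predict(X_unlabeled)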
Table 2 shows the confusion matrix for this best-performing configuration.

Table 2. Confusion matrix on the test set of 396 labeled concepts. Rows are actual labels; columns are predicted labels.

Actual     Surprise   Joy   Sadness   Anger   Fear   Disgust   Precision (%)   Recall (%)
Surprise      39        4      -        -       -       1           85             89
Joy            3       91      2        3       1       1           91             90
Sadness        -        3     52        1       -       -           85             93
Anger          2        2      3       72       4       2           90             85
Fear           -        -      3        1      63       2           91             91
Disgust        2        -      1        3       1      34           85             85

As Table 1 shows, ISEAR-based features are important for accurate classification in our task. Also, increasing the training dataset improves accuracy. We thus focus on our results for the corpus of 2,729 concepts obtained with the ISEAR features in the discussion that follows.

Although we experimented with different subsets of features, we observed that similarity-based features performed better than the ISEAR parameter-based features.7 Indeed, similarity measures identify whether two concepts have similar properties and thus should be placed in the same category. SenticNet-based similarity had a positive impact. The use of all available features gave the best results, both when we used only similarity-based features and when we took into account ISEAR parameter-based features.7 This is probably because of the balancing of two somewhat contradictory and complementary information sources: ISEAR parameter-based features provided affective information that helped choose the label for those concepts for which this information was available, and similarity measures indicated whether different concepts should be assigned the same or different labels.

Polarity and Label Agreement

Even when the emotion label assigned by the algorithm doesn't coincide exactly with the one present in the gold-standard data, it can share important properties with the correct label and thus can be considered correct in a relaxed sense. Considering joy and surprise as positive emotions and the rest as negative, and counting as correct any confusion between two labels of the same sign, we observe a 95.71 percent agreement with the gold-standard WNA data on whether the emotion is positive or negative. Table 3 gives examples of agreement and disagreement in label polarity; two labels agree if both are positive or both are negative. When we used the entire set of 1,202 labeled examples as both training and test data, agreement was 96.8 percent. Table 4 shows the agreement between our results and the gold standard in affective polarity sign (negative or positive emotion) per emotion label.

Agreement with the Hourglass Model

The Hourglass of Emotions8 reinterprets Plutchik's model by organizing primary emotions around four dimensions whose different levels of activation make up the total emotional state of the mind (see Table 5). Based upon the Hourglass model, the newer SenticNet 2 resource9 aims to assign one primary emotion and one secondary emotion to each SenticNet concept. If we map the six emotion labels used in our research (marked with asterisks in Table 5) to the affective dimensions of the Hourglass model, then joy and sadness are mapped to the same dimension, as are anger and fear. Ignoring these differences (that is, counting joy–sadness and anger–fear confusions as correct), we have a 91.16 percent agreement with the gold standard in identifying the affective dimensions, or 93.5 percent when we used all labeled data as both the training and test sets.
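A minimal sketch of the two relaxed agreement measures just described follows. The label-to-sign and label-to-dimension mappings are taken from the article; the example predictions are hypothetical.

    # Relaxed agreement: sign-level (joy/surprise positive, rest negative)
    # and Hourglass dimension-level (joy/sadness and anger/fear collapse).
    SIGN = {"joy": "+", "surprise": "+",
            "anger": "-", "disgust": "-", "fear": "-", "sadness": "-"}
    DIMENSION = {"joy": "pleasantness", "sadness": "pleasantness",
                 "surprise": "attention",
                 "anger": "sensitivity", "fear": "sensitivity",
                 "disgust": "aptitude"}

    def agreement(predicted, gold, mapping):
        same = sum(mapping[p] == mapping[g] for p, g in zip(predicted, gold))
        return 100.0 * same / len(gold)

    predicted = ["fear", "anger", "surprise", "anger"]
    gold = ["anger", "disgust", "joy", "joy"]
    print(agreement(predicted, gold, SIGN))       # sign-level agreement
    print(agreement(predicted, gold, DIMENSION))  # dimension-level agreement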
Table 3. Examples of agreement and disagreement in label polarity.

Concept       Gold standard (polarity)   Our label (polarity)   Agreement
Frustration   Anger (negative)           Fear (negative)        Yes
Offensive     Disgust (negative)         Anger (negative)       Yes
Triumph       Joy (positive)             Surprise (positive)    Yes
Favor         Joy (positive)             Anger (negative)       No
Bored         Sadness (negative)         Fear (negative)        Yes
Wonder        Surprise (positive)        Joy (positive)         Yes

Table 4. Our results vs. the gold standard (WNA) in affective polarity sign (negative or positive) per emotion label.

Polarity   Emotion label   Agree   Disagree   Total
Positive   Joy               422        8       430
           Surprise           97        8       105
Negative   Anger             198        3       201
           Disgust           148       10       158
           Fear              123        8       131
           Sadness           175        2       177
Total                      1,163       39     1,202

Table 5. Primary emotions used in The Hourglass of Emotions.8 Asterisks mark the six emotion labels used in our research.

Sentic level   Pleasantness   Attention      Sensitivity    Aptitude
1              Ecstasy        Vigilance      Rage           Admiration
2              Joy*           Anticipation   Anger*         Trust
3              Serenity       Interest       Annoyance      Acceptance
4              Pensiveness    Distraction    Apprehension   Boredom
5              Sadness*       Surprise*      Fear*          Disgust*
6              Grief          Amazement      Terror         Loathing

Agreement with the SenticNet Polarity Score

A possible way to indirectly evaluate our algorithm on non-WNA data is to compare the polarity of our assigned labels (considering joy and surprise as positive and the other labels as negative) with the sign of the polarity score given in SenticNet. Table 6 gives some examples.

Table 6. The polarity of our assigned labels vs. the sign of SenticNet's polarity score.

Concept               SenticNet score   Our label (polarity)   Agreement
Better grade          −0.185            Joy (positive)         No
Collect information   −0.309            Joy (positive)         No
Commonsense           +0.588            Surprise (positive)    Yes
Dislike               −0.917            Disgust (negative)     Yes
Efficiency            −0.136            Surprise (positive)    No
Rescue                +0.963            Joy (positive)         Yes

Overall, we obtained a 95.30 percent agreement. As in the cases of the concepts "better grade" or "efficiency," the disagreement can sometimes be attributed to a possible problem with the SenticNet score rather than with our algorithm.

Developed Resource Statistics

We obtained two resources: one with the use of the ISEAR data and one without it. The former is smaller (2,729 versus all 5,732 SenticNet concepts) but more accurate. Table 1 shows the lower bounds on the accuracy of each resource: 88.64 percent and 59.23 percent, respectively. We believe that the resource obtained with the ISEAR dataset is accurate enough, and still large enough, for practical applications.

Table 7 shows examples from the 2,729 SenticNet concepts found in ISEAR, along with their polarity scores and emotion labels.

Table 7. Examples of SenticNet concept polarities and labels.

Concept        Polarity   Label
Annoyed        −0.755     Anger
Birthday       +0.309     Joy
December       +0.111     Joy
Feel guilty    −0.540     Sadness
Make mistake   −0.439     Sadness
Weekend        +0.234     Joy

The label on "annoyed" coincides with the one given in WNA; the other concepts aren't present in WNA, so we can't directly validate their labels. However, they appear quite plausible: "birthday" fits under joy, "weekend" for most people is joy, as is "December" (because of Christmas and other holidays). All three concepts also have positive polarity scores in SenticNet. The two concepts with negative polarity, "feel guilty" and "make mistake," were also plausibly associated with sadness.
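Such per-label and per-sign counts, the subject of the discussion that follows, can be computed directly from the finished resource. Below is a small sketch; representing the resource as a dictionary from concepts to (polarity, label) pairs is our assumption, and the entries are the Table 7 examples.

    # Distribution statistics over the combined resource.
    from collections import Counter

    resource = {
        "annoyed":      (-0.755, "anger"),
        "birthday":     (+0.309, "joy"),
        "december":     (+0.111, "joy"),
        "feel guilty":  (-0.540, "sadness"),
        "make mistake": (-0.439, "sadness"),
        "weekend":      (+0.234, "joy"),
    }

    per_label = Counter(label for _, label in resource.values())
    positive = sum(1 for score, _ in resource.values() if score > 0)
    print(per_label)                           # concepts per emotion label
    print(positive, len(resource) - positive)  # positive vs. negative concepts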
Table 8 shows the distribution of the concepts per emotion label.

Table 8. Distribution of SenticNet concepts per emotion label.

Label      Concepts
Anger        421
Disgust      334
Fear         303
Joy          718
Sadness      426
Surprise     527
Total      2,729

Note that the most numerous classes are the two positive emotions, joy and surprise. This can probably be attributed to the selection of concepts in the SenticNet vocabulary: SenticNet contains 3,576 concepts with positive polarity and only 2,221 concepts with negative polarity, so it seems natural that the two positive emotions are more numerous in this vocabulary, with joy being more general and thus more frequent than surprise. We should also point out that ISEAR contains nearly the same number of statements (1,093 to 1,096) for each of the seven emotions it considered, of which only one (joy) is positive; thus there are many more emotionally negative statements in the ISEAR dataset than emotionally positive ones.

We developed the world's largest freely available dictionary for opinion mining and sentiment analysis that contains both quantitative polarity scores and qualitative affective labels. Namely, we automatically and accurately supplied SenticNet with affective, WNA-compatible labels using a machine-learning algorithm. Our resource, along with associated programs and other relevant data, is available for academic use from www.gelbukh.com/resources/emo-senticnet.

Our work opens up multiple directions for future research, such as using other types of monolingual or multilingual corpora10 as a source of features to improve accuracy or to label more concepts, as well as using more elaborate classification techniques in combination with syntactic and psychological clues. It will also be interesting to see how ISEAR information on gender, country, and so on affects the results.

Acknowledgments

The work was partially supported by the governments of India, Mexico, and the UK, with grants from CONACYT-DST India 122030 ("Answer Validation through Textual Entailment"); Mexico's CONACYT 50206-H, ICYT PICCO10-120, and SIP-IPN 20131702; and the UK's Engineering and Physical Sciences Research Council (EPSRC) grant EP/G501750/1 ("Common Sense Computing to Enable the Development of Next-Generation Semantic Web Applications").

References

1. C. Strapparava and A. Valitutti, "WordNet-Affect: An Affective Extension of WordNet," Proc. 4th Int'l Conf. Language Resources and Evaluation, European Language Resources Association, 2004, pp. 1083–1086.
2. A. Esuli and F. Sebastiani, "SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining," Proc. Int'l Conf. Language Resources and Evaluation, European Language Resources Association, 2006, pp. 417–422.
3. E. Cambria et al., "SenticNet: A Publicly Available Semantic Resource for Opinion Mining," Proc. AAAI Common Sense Knowledge, Assoc. Advancement of Artificial Intelligence (AAAI), 2010, pp. 14–18.
4. P. Ekman, "Facial Expression and Emotion," Am. Psychologist, vol. 48, no. 4, 1993, pp. 384–392.
5. A. Ortony and T.J. Turner, "What's Basic About Basic Emotions?" Psychological Rev., vol. 97, no. 3, 1990, pp. 315–331.
6. K.R. Scherer, "What Are Emotions? And How Can They Be Measured?" Social Science Information, vol. 44, no. 4, 2005, pp. 693–727.
7. S. Poria et al., "Merging SenticNet and WordNet-Affect Emotion Lists for Sentiment Analysis," Proc. 11th Int'l Conf. Signal Processing, IEEE, 2012, pp. 1251–1255.
8. E. Cambria, A. Livingstone, and A. Hussain, "The Hourglass of Emotions," LNCS 7403, Springer, 2012, pp. 144–157.
9. E. Cambria, C. Havasi, and A. Hussain, "SenticNet 2: A Semantic and Affective Resource for Opinion Mining and Sentiment Analysis," Proc. Florida Artificial Intelligence Research Soc. Conf. (FLAIRS), AAAI, 2012, pp. 202–207.
10. G. Sidorov et al., "A New Combined Lexical and Statistical Based Sentence Level Alignment Algorithm for Parallel Texts," Int'l J. Computational Linguistics and Applications, vol. 2, nos. 1–2, 2011, pp. 257–263.

The Authors

Soujanya Poria is a graduate student at the Department of Computer Science and Engineering of Jadavpur University, India. His research interests include natural language processing and machine learning. Poria has a BE in computer science and engineering from Jadavpur University. Find his contact details at http://soujanyaporia.in.

Alexander Gelbukh is a research professor at the Computing Research Center of the National Polytechnic Institute, Mexico. His research interests include natural language processing. Gelbukh has a PhD in computer science from the All-Russian Institute of Scientific and Technical Information. Find his contact details at http://nlp.cic.ipn.mx/~gelbukh.

Amir Hussain is a professor of computing science at the University of Stirling, UK. His research interests include computational intelligence and cognitive computing technology and applications. Hussain has a PhD in electronic and electrical engineering from the University of Strathclyde in Glasgow. Find his contact details at www.cs.stir.ac.uk/~ahu.

Newton Howard is one of the directors of the Synthetic Intelligence Project, a resident scientist at the Massachusetts Institute of Technology, and a national security advisor to several US government organizations. Howard has a PhD in cognitive informatics and mathematics from La Sorbonne. Find his contact details at http://web.mit.edu/nhmit/www/index.html.

Dipankar Das is an assistant professor at the Department of Computer Science and Engineering, National Institute of Technology (NIT), Meghalaya, India. His research focuses on natural language processing. Das has a PhD in computer science from Jadavpur University. Find his contact details at http://dasdipankar.com.

Sivaji Bandyopadhyay is a professor at the Department of Computer Science and Engineering of Jadavpur University, India. His research interests include natural language processing and machine learning. Bandyopadhyay has a PhD in computer science from Jadavpur University. Find his contact details at www.sivajibandyopadhyay.com.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.