An Unsupervised Aspect-Sentiment Model for Online Reviews Samuel Brody Department of Biomedical Informatics Columbia University [email protected] May 2010 Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 1 / 42 Outline 1 Introduction 2 Aspect Prev. Approaches Methodology Experiments 3 Sentiment Prev. Approaches Methodology Experiments 4 Conclusions Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 2 / 42 Outline 1 Introduction 2 Aspect Prev. Approaches Methodology Experiments 3 Sentiment Prev. Approaches Methodology Experiments 4 Conclusions Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 3 / 42 Unsupervised Learning and NLP Unsupervised Methods + don’t require annotation + fit themselves to the task and data + suitable for plentiful data / low knowledge scenarios + can harness knowledge sources + intuitive and cognitively plausible Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 4 / 42 An Unsupervised Aspect-Sentiment Model for Online Reviews Brody & Elhadad (NAACL ’10) Detecting Salient Aspects in Online Reviews of Health Providers Brody & Elhadad (AMIA ’10 - under review) Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 5 / 42 Online Reviews What we have: overall score details in free-form text What we need: the relevant information in the text, for summary comparison pro/con lists Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 6 / 42 Online Reviews What we have: overall score details in free-form text What we need: the relevant information in the text, for summary comparison pro/con lists Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 6 / 42 Why Unsupervised? manual annotation may not be feasible relevant information is unpredictable varying ways of expressing similar meaning spelling errors and typos Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 7 / 42 “The Pitch” Aspect-Sentiment Model simple and elegant unsupervised - no labeled training data effective flexible across domains Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 8 / 42 Outline 1 Introduction 2 Aspect Prev. Approaches Methodology Experiments 3 Sentiment Prev. Approaches Methodology Experiments 4 Conclusions Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 9 / 42 Previous Approaches Keyword Based manual annotation / ask users IE techniques (TF-IDF) use special lexicons plus adapt across domains keyword expansion and clustering Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 10 / 42 Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 11 / 42 Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 11 / 42 Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 12 / 42 Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 12 / 42 Latent Dirichlet Allocation (LDA) - Blei et al. (2003) The William Randolph Hearst Foundation will give $1.25 million to Lincoln Center , Metropolitan Opera Co. , New York Philharmonic and Juilliard School . “Our board felt that we had a real opportunity to make a mark on the future of the performing arts with these grants an act every bit as important as our traditional areas of support in health, medical research , education and the social services ,” Hearst Foundation President Randolph A. Hearst said Monday in announcing the grants . Lincoln Center’s share will be $200,000 for its new building , which will house young artists and provide new public facilities . The Metropolitan Opera Co. and New York Philharmonic will receive $400,000 each. The Juilliard School , where music and the performing arts are taught , will get $250,000 . The Hearst Foundation , a leading supporter of the Lincoln Center Consolidated Corporate Fund , will make its usual annual $100,000 donation, too. “Arts” Sam Brody “Budgets” “Children” “Education” NEW MILLION CHILDREN SCHOOL FILM TAX WOMEN STUDENTS SHOW PROGRAM PEOPLE SCHOOLS MUSIC BUDGET CHILD EDUCATION MOVIE BILLION YEARS TEACHERS PLAY FEDERAL FAMILIES HIGH MUSICAL YEAR WORK PUBLIC BEST SPENDING PARENTS TEACHER ACTOR NEW SAYS BENNETT FIRST An Unsupervised STATE Aspect-Sentiment FAMILY Model for Online MANIGAT Reviews May 2010 13 / 42 Latent Dirichlet Allocation (LDA) - Blei et al. (2003) The William Randolph Hearst Foundation will give $1.25 million to Lincoln Center , Metropolitan Opera Co. , New York Philharmonic and Juilliard School . “Our board felt that we had a real opportunity to make a mark on the future of the performing arts with these grants an act every bit as important as our traditional areas of support in health, medical research , education and the social services ,” Hearst Foundation President Randolph A. Hearst said Monday in announcing the grants . Lincoln Center’s share will be $200,000 for its new building , which will house young artists and provide new public facilities . The Metropolitan Opera Co. and New York Philharmonic will receive $400,000 each. The Juilliard School , where music and the performing arts are taught , will get $250,000 . The Hearst Foundation , a leading supporter of the Lincoln Center Consolidated Corporate Fund , will make its usual annual $100,000 donation, too. “Arts” Sam Brody “Budgets” “Children” “Education” NEW MILLION CHILDREN SCHOOL FILM TAX WOMEN STUDENTS SHOW PROGRAM PEOPLE SCHOOLS MUSIC BUDGET CHILD EDUCATION MOVIE BILLION YEARS TEACHERS PLAY FEDERAL FAMILIES HIGH MUSICAL YEAR WORK PUBLIC BEST SPENDING PARENTS TEACHER ACTOR NEW SAYS BENNETT FIRST An Unsupervised STATE Aspect-Sentiment FAMILY Model for Online MANIGAT Reviews May 2010 13 / 42 LDA as a Generative Model Generating a document: sample a topic distribution θd from Dirichlet(α) repeatedly generate words wi by sampling a topic ti ∼ Multinomial(θd ) sampling a word wi |ti ∼ Multinomial(φti ) Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 14 / 42 Intuition LDA Assumptions: Documents are composed of a distribution of topics. The distributions are skewed towards few topics (Dirichlet prior). ⇒ Words in same document are more likely from the same topic. Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 15 / 42 Inference via Sampling - Griffiths and Steyvers (2004) Inference Procedure 1 random initialization 2 calculate word-topic distribution 3 calculate topic distribution for documents, based on the words 4 re-assign topics to word instances, based on document distrib. 5 repeat to convergence Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 16 / 42 Using LDA for Aspects Global vs. Local Topics - MG-LDA (Titov and McDonald, 2008a) "London, Pound, Tube" vs. "Paris, Euro, Metro" Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 17 / 42 Using LDA for Aspects Global vs. Local Topics - MG-LDA (Titov and McDonald, 2008a) "London, Pound, Tube" vs. "Paris, Euro, Metro" Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 17 / 42 Using LDA for Aspects Global vs. Local Topics - MG-LDA (Titov and McDonald, 2008a) "London, Pound, Tube" vs. "Paris, Euro, Metro" Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 17 / 42 Using LDA for Aspects Global vs. Local Topics - MG-LDA (Titov and McDonald, 2008a) "London, Pound, Tube" vs. "Paris, Euro, Metro" Mapping Topics to Aspects - MAS (Titov and McDonald, 2008b) 100 topics → few rateable aspects Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 17 / 42 Using LDA for Aspects Global vs. Local Topics - MG-LDA (Titov and McDonald, 2008a) "London, Pound, Tube" vs. "Paris, Euro, Metro" Mapping Topics to Aspects - MAS (Titov and McDonald, 2008b) 100 topics → few rateable aspects Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 17 / 42 Topics ⇒ Rateable Aspects Problem: global vs. local topics Solution: use sentences instead of documents Problem: mapping topics to aspects Solution: use few topics corresponding directly to aspects Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 18 / 42 Topics ⇒ Rateable Aspects Problem: global vs. local topics Solution: use sentences instead of documents Problem: mapping topics to aspects Solution: use few topics corresponding directly to aspects Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 18 / 42 Topics ⇒ Rateable Aspects Problem: global vs. local topics Solution: use sentences instead of documents Problem: mapping topics to aspects Solution: use few topics corresponding directly to aspects Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 18 / 42 Model Order - Number of Aspects P F (C, Ĉ) = 1{Ci,j = Ĉi,j = 1, di , dj ∈ D̂} P i,j 1{Ci,j = 1, di , dj ∈ D̂} i,j Validation Procedure (for a given k ) : 1 For dataset D: Run LDA with k topics on D to produce assignment matrix Ck . Create random assignment matrix Rk (uniform assignment) 2 Sample random subset D i of size δ|D| from D. Run LDA on D i to produce assignment matrix Cki . Create random comparison matrix Rki (uniform assignments) Calculate scorei (k ) = F (Cki , Ck ) − F (Rki , Rk ) 3 4 Repeat 2nd step q times. Return the average score over q iterations. Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 19 / 42 Outline 1 Introduction 2 Aspect Prev. Approaches Methodology Experiments 3 Sentiment Prev. Approaches Methodology Experiments 4 Conclusions Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 20 / 42 Data Restaurants - 50,000 restaurant reviews from Citysearch NY (http://newyork.citysearch.com/) 3,400 sentences annotated by Ganu et al. (2009) Products - from Amazon (http://www.amazon.com/) 1,086 reviews for four leading netbooks 586 reviews for watches Medical Professionals - 33,654 reviews of 12,898 practitioners in NY state, from Rate MDs (http://www.ratemds.com) Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 21 / 42 Results in the Restaurant Domain Inferred Aspect Representative Words Manual Aspect Food - General Wine & Drinks Dishes Bakery menu, fresh, sushi, fish, chef, cuisine wine, list, glass, drinks, beer, bottle chicken, sauce, rice, cheese, spicy, salad, hot, delicious, dessert, bagels, bread, chocolate Food & Drink Ambiance / Mood Physical Atmosphere great, atmosphere, wonderful, music, experience bar, room, outside, seating, tables, cozy, loud Atmosphere Staff Service service, staff, friendly, attentive, busy, slow table, order, wait, minutes, reservation, forgot Staff Value portions, quality, worth, size, cheap Price Anec. - experience Anec. - location dinner, night, group, friends, date, family out, back, definitely, around, walk, block Anecdotes Recommendation Location Misc. Description best, top, favorite, city, NYC restaurant, found, Paris, (New) York, location place, eat, enjoy, big, often, stuff Misc. Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 22 / 42 Results in the Restaurant Domain Inferred Aspect Representative Words Manual Aspect Food - General Wine & Drinks Dishes Bakery menu, fresh, sushi, fish, chef, cuisine wine, list, glass, drinks, beer, bottle chicken, sauce, rice, cheese, spicy, salad, hot, delicious, dessert, bagels, bread, chocolate Food & Drink Ambiance / Mood Physical Atmosphere great, atmosphere, wonderful, music, experience bar, room, outside, seating, tables, cozy, loud Atmosphere Staff Service service, staff, friendly, attentive, busy, slow table, order, wait, minutes, reservation, forgot Staff Value portions, quality, worth, size, cheap Price Anec. - experience Anec. - location dinner, night, group, friends, date, family out, back, definitely, around, walk, block Anecdotes Recommendation Location Misc. Description best, top, favorite, city, NYC restaurant, found, Paris, (New) York, location place, eat, enjoy, big, often, stuff Misc. Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 22 / 42 Comparison with MAS - Titov and McDonald (2008b) Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 23 / 42 Products Domain Netbooks Aspect Representative Words Performance Hardware Memory Software Usability Battery Size power, performance, mode, fan, quiet drive, wireless, bluetooth, usb, speakers, webcam ram, 2GB, upgrade, extra, 1GB, speed using, office, software, installed, works, programs internet, video, web, movies, music, email, play battery, life, hours, time, cell, last screen, keyboard, size, small, enough, big ..., Portability, Comparison, Mouse, General, Purchase, Looks, OS Watches Aspect Representative Words General Appearance Display Performance buy, perfect, husband, gift, beautiful, deal looks, looking, look, nice, titanium, quality read, little, date, display, digital, set atomic, day, accurate, battery, solar, adjust Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 24 / 42 Medical Domain - Shared and Unique Aspects Aspect Recommendation Manner Anecdotal Attention Scheduling Special GP X X X X – Dent. X X X – – OBGYN X X X X X Psyc. X X X X X Perscrip. & Tests Cost Pregnancy – Recommendation: years, best, family, recommend, practice “My mother has been a patient here for approximately 2 years .. I have been very impressed ...” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 25 / 42 Medical Domain - Shared and Unique Aspects Aspect Recommendation Manner Anecdotal Attention Scheduling Special GP X X X X – Dent. X X X – – OBGYN X X X X X Psyc. X X X X X Perscrip. & Tests Cost Pregnancy – Recommendation: years, best, family, recommend, practice “My mother has been a patient here for approximately 2 years .. I have been very impressed ...” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 25 / 42 Medical Domain - Shared and Unique Aspects Aspect Recommendation Manner Anecdotal Attention Scheduling Special GP X X X X – Dent. X X X – – OBGYN X X X X X Psyc. X X X X X Perscrip. & Tests Cost Pregnancy – Manner: staff, great, caring, knowledgeable, helpful “Terrific doctor, excellent bed-side manner, staff needs to improve ...” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 25 / 42 Medical Domain - Shared and Unique Aspects Aspect Recommendation Manner Anecdotal Attention Scheduling Special GP X X X X – Dent. X X X – – OBGYN X X X X X Psyc. X X X X X Perscrip. & Tests Cost Pregnancy – Anecdotal: visit, problem, insurance, treatment, husband, symptoms “ A week later I felt sick & when I called him, he then sent me to have a sonogram” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 25 / 42 Medical Domain - Shared and Unique Aspects Aspect Recommendation Manner Anecdotal Attention Scheduling Special GP X X X X – Dent. X X X – – OBGYN X X X X X Psyc. X X X X X Perscrip. & Tests Cost Pregnancy – Attention: time, questions, listens, appointment, talk “prompt, fast, efficient .. he takes time for your questions ...” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 25 / 42 Medical Domain - Shared and Unique Aspects Aspect Recommendation Manner Anecdotal Attention Scheduling Special GP X X X X – Dent. X X X – – OBGYN X X X X X Psyc. X X X X X Perscrip. & Tests Cost Pregnancy – Schedule: time, office, out, appointment, minutes, phone, late “You can call before you leave .... and they will let you know how long of a wait you will have ...” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 25 / 42 Medical Domain - Shared and Unique Aspects Aspect Recommendation Manner Anecdotal Attention Scheduling Special GP X X X X – Dent. X X X – – OBGYN X X X X X Psyc. X X X X X Perscrip. & Tests Cost Pregnancy – Pregnancy : first, pregnancy, delivered, baby, hospital “ My last pregnancy was high risk, and he stayed on top of everything ...” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 25 / 42 Summary automatically infer relevant aspects aspects missed by annotators Food & Drink, Staff & Service, Wireless, difference btw. doctors flexible with regard to domain leisure, products, professional services handle non-keyword aspects Food, Atmosphere, Bedside Manner, Scheduling, Portability handle misspellings and domain specific words desert, decour/decore, anti-pasta, creme-brule, sandwhich, omlette six common misspellings of restaurant Korma, Edamame, Dosa, Pho Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 26 / 42 Summary automatically infer relevant aspects aspects missed by annotators Food & Drink, Staff & Service, Wireless, difference btw. doctors flexible with regard to domain leisure, products, professional services handle non-keyword aspects Food, Atmosphere, Bedside Manner, Scheduling, Portability handle misspellings and domain specific words desert, decour/decore, anti-pasta, creme-brule, sandwhich, omlette six common misspellings of restaurant Korma, Edamame, Dosa, Pho Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 26 / 42 Summary automatically infer relevant aspects aspects missed by annotators Food & Drink, Staff & Service, Wireless, difference btw. doctors flexible with regard to domain leisure, products, professional services handle non-keyword aspects Food, Atmosphere, Bedside Manner, Scheduling, Portability handle misspellings and domain specific words desert, decour/decore, anti-pasta, creme-brule, sandwhich, omlette six common misspellings of restaurant Korma, Edamame, Dosa, Pho Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 26 / 42 Summary automatically infer relevant aspects aspects missed by annotators Food & Drink, Staff & Service, Wireless, difference btw. doctors flexible with regard to domain leisure, products, professional services handle non-keyword aspects Food, Atmosphere, Bedside Manner, Scheduling, Portability handle misspellings and domain specific words desert, decour/decore, anti-pasta, creme-brule, sandwhich, omlette six common misspellings of restaurant Korma, Edamame, Dosa, Pho Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 26 / 42 Summary automatically infer relevant aspects aspects missed by annotators Food & Drink, Staff & Service, Wireless, difference btw. doctors flexible with regard to domain leisure, products, professional services handle non-keyword aspects Food, Atmosphere, Bedside Manner, Scheduling, Portability handle misspellings and domain specific words desert, decour/decore, anti-pasta, creme-brule, sandwhich, omlette six common misspellings of restaurant Korma, Edamame, Dosa, Pho Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 26 / 42 Outline 1 Introduction 2 Aspect Prev. Approaches Methodology Experiments 3 Sentiment Prev. Approaches Methodology Experiments 4 Conclusions Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 27 / 42 Previous Approaches Dictionary/Lexicon Based Unsupervised (Graph, Constraints) Hatzivassiloglou and McKeown (1997); Popescu and Etzioni (2005) Combined Turney (2002); Fahrni and Klenner (2008); Jijkoun and Hofmann (2009) Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 28 / 42 Methodology Issues: Dictionary/Lexicon Based – lack of coverage – misspellings and out-of-lexicon words – require manual annotation Unsupervised (Graph, Constraints) – lack domain and aspect specificity Combined + better coverage – problems of both Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 29 / 42 Methodology Design Principles: start with a seed and propagate allow for supervised or unsupervised seeds keep it aspect-specific Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 30 / 42 Graph Representation “The food was tasty and hot, but our waiter was not friendly.” Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 31 / 42 Graph Representation “The food was tasty and hot, but our waiter was not friendly.” (tasty, food), (hot, food), (not-friendly, waiter) Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 31 / 42 Graph Representation “The food was tasty and hot, but our waiter was not friendly.” (tasty, food), (hot, food), (not-friendly, waiter) weight{(tasty,hot)}++; Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 31 / 42 Negation-Based Seed Used: explicit negation: - friendly vs. not friendly morphological indicators: - (dis)courteous, (un)interesting, (in)expensive Discarded: disjunctions - “convenient but noisy location” - “cheap but tasty food”, “dainty but strong necklace” antonyms from dictionary (WordNet) - round vs. square, cool vs. warm Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 32 / 42 Negation-Based Seed Used: explicit negation: - friendly vs. not friendly morphological indicators: - (dis)courteous, (un)interesting, (in)expensive Discarded: disjunctions - “convenient but noisy location” - “cheap but tasty food”, “dainty but strong necklace” antonyms from dictionary (WordNet) - round vs. square, cool vs. warm Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 32 / 42 Negation-Based Seed Used: explicit negation: - friendly vs. not friendly morphological indicators: - (dis)courteous, (un)interesting, (in)expensive Discarded: disjunctions - “convenient but noisy location” - “cheap but tasty food”, “dainty but strong necklace” antonyms from dictionary (WordNet) - round vs. square, cool vs. warm Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 32 / 42 Negation-Based Seed Used: explicit negation: - friendly vs. not friendly morphological indicators: - (dis)courteous, (un)interesting, (in)expensive Discarded: disjunctions - “convenient but noisy location” - “cheap but tasty food”, “dainty but strong necklace” antonyms from dictionary (WordNet) - round vs. square, cool vs. warm Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 32 / 42 Propagation Method Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 33 / 42 Propagation Method Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 33 / 42 Outline 1 Introduction 2 Aspect Prev. Approaches Methodology Experiments 3 Sentiment Prev. Approaches Methodology Experiments 4 Conclusions Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 34 / 42 Human Gold Standard Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 35 / 42 Evaluation Measures Kendall’s tau coefficient (τk ) and Kendall’s distance (Dk ) G = {(a, b) : a, b ∈ Gold ∧ a ≺ b} τk = |Same|−|Reverse| |G| Dk = |Reverse|+p·|Tied| |G| −1 ≤ τk ≤ +1 0 ≤ Dk ≤ 1 correlation: more is better distance: less is better Note: only pairs that are ordered in the gold standard are used! Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 36 / 42 Evaluation Results Aspect Mood Staff Main Dishes Physical Atmo. Bakery Food - General Wine & Drinks Service Average Sam Brody Auto. τk ↑ Dk ↓ 0.53 0.23 0.57 0.22 0.19 0.40 0.34 0.33 0.33 0.33 0.19 0.41 0.32 0.34 0.41 0.30 0.36 0.32 Lexicon τk ↑ Dk ↓ 0.56 0.22 0.60 0.20 0.38 0.31 0.25 0.37 0.35 0.33 0.41 0.30 0.52 0.24 0.54 0.23 0.45 0.27 An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 37 / 42 Summary no manual annotation sentiment indicators missed by annotators “neutral” adjectives convey sentiment in a certain domain sentiment depends on aspect warm, cheap, busy handle misspellings and domain specific words exelent, tastey New-Yorky, orgasmic Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 38 / 42 Summary no manual annotation sentiment indicators missed by annotators “neutral” adjectives convey sentiment in a certain domain sentiment depends on aspect warm, cheap, busy handle misspellings and domain specific words exelent, tastey New-Yorky, orgasmic Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 38 / 42 Summary no manual annotation sentiment indicators missed by annotators “neutral” adjectives convey sentiment in a certain domain sentiment depends on aspect warm, cheap, busy handle misspellings and domain specific words exelent, tastey New-Yorky, orgasmic Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 38 / 42 Summary no manual annotation sentiment indicators missed by annotators “neutral” adjectives convey sentiment in a certain domain sentiment depends on aspect warm, cheap, busy handle misspellings and domain specific words exelent, tastey New-Yorky, orgasmic Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 38 / 42 Outline 1 Introduction 2 Aspect Prev. Approaches Methodology Experiments 3 Sentiment Prev. Approaches Methodology Experiments 4 Conclusions Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 39 / 42 Summary Aspect-Sentiment Model infers relevant information flexible across domains no annotation / supervision robust to noise and error Future Directions aspect: closer integration of aspect and sentiment cross-sentence interaction sentiment: other sentiment indicators other graph methods Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 40 / 42 Ask me about ... Dependency Parsing ? = Machine Translation ? = Combining semantic and distributional similarity for: – automatic labeling – text simplification Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 41 / 42 Thank You! Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 42 / 42 Bibliography Blei, David M., Andrew Y. Ng, and Michael I. Jordan. 2003. Latent dirichlet allocation. Journal of Machine Learning Research 3:993–1022. Fahrni, Angela and Manfred Klenner. 2008. Old Wine or Warm Beer: Target-Specific Sentiment Analysis of Adjectives. In Proc.of the Symposium on Affective Language in Human and Machine, AISB 2008 Convention. pages 60 – 63. Ganu, Gayatree, Noemie Elhadad, and Amelie Marian. 2009. Beyond the stars: Improving rating predictions using review text content. In WebDB. Griffiths, Thomas L. and Mark Steyvers. 2004. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America 101(Suppl 1):5228–5235. Hatzivassiloglou, Vasileios and Kathleen R. McKeown. 1997. Predicting the semantic orientation of adjectives. In Proc. of the 35th Annual Meeting of the Association for Computational Linguistics. ACL, Madrid, Spain, pages 174–181. Jijkoun, Valentin and Katja Hofmann. 2009. Generating a non-english subjectivity lexicon: Relations that matter. In Proc. of the 12th Conference of the European Chapter of the ACL (EACL 2009). ACL, Athens, Greece, pages 398–405. Popescu, Ana-Maria and Oren Etzioni. 2005. Extracting product features and opinions from reviews. In HLT ’05: Proc. of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. ACL, Morristown, NJ, USA, pages 339–346. Titov, Ivan and Ryan McDonald. 2008a. Modeling online reviews with multi-grain topic models. In WWW ’08: Proc. of the 17th international conference on World Wide Web. ACM, New York, NY, pages 111–120. Titov, Ivan and Ryan McDonald. 2008b. A joint model of text and aspect ratings for sentiment summarization. In Proc. of ACL-08: HLT . ACL, Columbus, Ohio, pages 308–316. Turney, Peter. 2002. Thumbs up or thumbs down? semantic orientation applied to unsupervised classification of reviews. In Proc. of 40th Annual Meeting of the Association for Computational Linguistics. ACL, Philadelphia, Pennsylvania, USA, pages 417–424. Sam Brody An Unsupervised Aspect-Sentiment Model for Online Reviews May 2010 42 / 42
© Copyright 2026 Paperzz