Deriving adjectival scales from continuous space word representations
Joo-Kyung Kim and Marie-Catherine de Marneffe
The Ohio State University
EMNLP 2013

Overview

Q: Was the movie good?
A: It was excellent.
Yes, the movie was good.

Q: Was the movie good?
A: It was okay.
No, the movie was not good.

okay < good < excellent

[Mikolov et al. 2013] syntactic and semantic regularities of continuous word representations from a Recurrent Neural Network Language Model (RNNLM)
[Mikolov et al. 2010] off-the-shelf continuous space word representations
http://www.fit.vutbr.cz/~imikolov/rnnlm/word_projections-1600.txt.gz

Can we derive adjectival scales from such representations?

Continuous word representations from RNNLM [Mikolov et al., 2010]

Input (N nodes): w(t), 1-out-of-N encoding
Recurrence (M nodes): s(t-1)
Hidden (M nodes): $s(t) = \mathrm{sigmoid}(U w(t) + W s(t-1))$
Output (N nodes): $y(t) = \mathrm{softmax}(V s(t))$

From the matrix U (M × N), the M-dimensional column vector for each word is used as the continuous space word representation (M = 1,600 in our experiments).
Both the current word and the previous words (contexts) influence training.

Syntactic/semantic regularities of continuous word representations in RNNLM [Mikolov et al. 2013]

[Figure panels: Gender transformation; Number transformation]

Deriving adjectival scales from continuous word representations

We assume that intermediate vectors between two word vectors in the continuous space represent some "middle" form or meaning.

[Figure: intermediate points x1, x2, x3 on the line from a = furious to b = calm; nearby words: angry, tense, quiet]

Intermediate words between base forms and superlative forms

[Figure: line from <base form> to <superlative>, with the intermediate point "?" labeled <comparative>]

base   superlative   words with highest cosine similarities to the center
good   best          better: .738     strong: .644     normal: .619     less: .609
bad    worst         terrible: .726   great: .678      horrible: .674   worse: .665
slow   slowest       slower: .637     sluggish: .614   steady: .558     brisk: .543
fast   fastest       faster: .645     slower: .602     quicker: .542    harder: .518

Intermediate words between two semantically related adjectives

Each cell gives the word with the highest cosine similarity to that intermediate point.

1st input word   1st quarter      half               3rd quarter       2nd input word
furious          angry: .615      tense: .465        quiet: .560       calm
furious          angry: .632      unhappy: .640      pleased: .516     happy
terrible         horrible: .783   incredible: .714   wonderful: .772   terrific
cold             mild: .348       warm: .517         sticky: .424      hot
ugly             nasty: .672      wacky: .645        lovely: .715      gorgeous

Evaluation: corpus of indirect answers to yes/no questions [de Marneffe et al., 2010]

125 indirect question-answer pairs (IQAPs) where both the question and the answer contain an adjective

Q: Is Obama qualified?
A: I think he's young.

Each pair is annotated via Mechanical Turk for whether the answer conveys
- yes
- no
- uncertain

Classifying the IQAPs

Q: Is Obama qualified?
A: I think he's young.

1. Get the antonym of the question word from WordNet.
2. Draw a line connecting the question word and the antonym.
3. The perpendicular hyperplane passing through the center between the question word and the antonym is the decision boundary.
4. Check which side of the boundary the answer word falls on.

[Figure: unqualified, qualified, young]

Choosing the antonym

Q: Was the movie good?
A: It was excellent.

• A word can have multiple antonyms with different senses.
• We need to choose the antonym that is most related to both the question and the answer.
• We choose the antonym that is most collinear with the question and the answer in the continuous word space:
  $\arg\max_{anti} \cos(w_q - w_a,\; w_q - w_{anti})$

[Figure: evil, good, excellent, bad]

Deriving adjectival scales by [de Marneffe et al., 2010]

Movie reviews come with ratings.
$ER(w) = \sum_{r \in R} r \Pr(r \mid w)$ is used as the scale of the word $w$.
• Requires data with numerical labels (e.g., movie ratings).

Deriving adjectival scales by [Mohtarami et al., 2011]

[Figure: SVD, dimension reduction, reconstruction. Courtesy of Wang Houfeng, 2010]

$U_k \times D_k$ gives the $k$-dimensional continuous representations for the terms. Relative positions in the latent semantic space can then be used to derive the scales.
• Singular Value Decomposition (SVD) assumes normally distributed data.
• It is a bag-of-words model (word sequences in a document are ignored).

Deriving adjectival scales by [de Melo & Bansal, 2013]

Find intensity scales using regular expressions on web documents.

Weak-Strong Patterns              Strong-Weak Patterns
* (,) but not *                   not * (,) (but) just *
* (,) if not *                    not * (,) (but|although|though) still *
* (,) (al)though not *            * (,) or very *
* (,) (and|or) (even|almost) *
not (only|just) * but *

e.g., "good but not great" → good < great

Globally optimize the scales using Mixed Integer Linear Programming (MILP).
• Can find only a limited number of adjectival scales.

Evaluation on the 125 IQAPs

                             Accuracy   Precision   Recall   F1 score
de Marneffe et al. (2010)    60.00      59.72       59.40    59.56
Mohtarami et al. (2011)      -          62.23       60.88    61.55
The RNN-based model          72.80      69.78       71.39    70.58

• Precision, recall, and F1 score are macro-averaged over yes and no.
• Our model scores statistically significantly higher than [de Marneffe et al., 2010] on these metrics.
• [Mohtarami et al., 2011] also showed better results by using synonyms. However, that approach does not learn scales.

Visualization of questions, antonyms and answers in 2D space by multidimensional scaling (MDS)

A: Do you think she'd be happy with this book?
B: I think she'd be delighted by it.

A: Do you think that's a good idea?
B: It's a terrible idea.

A: The president is promising support for Americans who have suffered from this hurricane. Are you confident you are going to be getting that?
B: I'm not so sure about my insurance company.

[Figure: MDS plot with axes dim 1 and dim 2 showing the words bad, good, sure, terrible, confident, happy, delighted, unhappy, young, diffident, qualified, unqualified]

Conclusion

We give further evidence that relationships in the RNNLM continuous vector space are interpretable.
We successfully learn adjectival scales, as shown by a large improvement on the IQAP corpus.

Future work

Adjectives with modifying adverbs (so sure, quite good, quite a few, etc.)

What the British say   What the British mean       What foreigners understand
That's not bad         That's good                 That's poor
Quite good             A bit disappointing         Quite good
Very interesting       This is clearly nonsense    They are impressed
I almost agree         I don't agree at all        He's not far from agreement

From http://www.telegraph.co.uk/news/newstopics/howaboutthat/10280244/Translationtable-explaining-the-truth-behind-British-politeness-becomes-internet-hit.html.

Thank you!

We also thank Eric Fosler-Lussier and the anonymous reviewers for their helpful comments.

Shifting the decision boundary

No accuracy gain from shifting the decision boundary

Center: 72.8
+1%: 71.2
-1%: 70.4
-10%: 69.6
+10%: 64.0
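
Sketch: intermediate points between two adjectives

The idea of reading a scale off the points between two word vectors (the furious/calm example) can be sketched as follows. This is a minimal illustration, not the authors' code: the 2D vectors in TOY_VECTORS are invented for the example (the slides use 1,600-dimensional RNNLM vectors), the function names are hypothetical, and excluding the two input words from the candidate set is an assumption of this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def interpolate(a, b, fraction):
    """Point at the given fraction of the way from vector a to vector b."""
    return [x + fraction * (y - x) for x, y in zip(a, b)]


def nearest_word(point, vectors, exclude=()):
    """Word whose vector has the highest cosine similarity to the point."""
    candidates = [w for w in vectors if w not in exclude]
    return max(candidates, key=lambda w: cosine(point, vectors[w]))


def scale_between(a_word, b_word, vectors, fractions=(0.25, 0.5, 0.75)):
    """Nearest word at the 1st quarter, half, and 3rd quarter points."""
    a, b = vectors[a_word], vectors[b_word]
    # Assumption: the two input words themselves are not candidates.
    return [
        nearest_word(interpolate(a, b, f), vectors, exclude=(a_word, b_word))
        for f in fractions
    ]


# Invented toy vectors, chosen only so the example is easy to follow.
TOY_VECTORS = {
    "furious": [1.0, 0.1],
    "angry": [0.8, 0.35],
    "tense": [0.5, 0.5],
    "quiet": [0.3, 0.8],
    "calm": [0.1, 1.0],
}

print(scale_between("furious", "calm", TOY_VECTORS))
```

With real embeddings, TOY_VECTORS would be replaced by a dictionary loaded from the word-projection file; the loading step is omitted here because the file format is not described in the slides.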
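
Sketch: choosing the antonym and checking the side of the hyperplane

The classification steps ("Classifying the IQAPs" and "Choosing the antonym") can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the 2D vectors are invented, the function names are hypothetical, candidate antonyms are passed in directly instead of being looked up in WordNet, and collinearity is measured with the absolute cosine (the absolute value is an assumption of this sketch; the slide's formula shows a plain cosine). The sketch returns only "yes" or "no" and treats a point exactly on the boundary as "no".

```python
import math


def sub(u, v):
    """Element-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def choose_antonym(question, answer, antonyms):
    """Name of the antonym most collinear with the question-answer line.

    `antonyms` maps candidate antonym words to their vectors.
    """
    q_to_a = sub(question, answer)
    return max(
        antonyms,
        key=lambda name: abs(cosine(q_to_a, sub(question, antonyms[name]))),
    )


def classify(question, answer, antonym):
    """'yes' if the answer lies on the question's side of the hyperplane.

    The hyperplane passes through the midpoint of question and antonym
    and is perpendicular to the line connecting them.
    """
    normal = sub(question, antonym)  # points from antonym toward question
    center = [(q + n) / 2 for q, n in zip(question, antonym)]
    return "yes" if dot(sub(answer, center), normal) > 0 else "no"


def answer_polarity(question_word, answer_word, antonym_words, vectors):
    """Pick the best antonym, then classify the answer word."""
    q, a = vectors[question_word], vectors[answer_word]
    candidates = {w: vectors[w] for w in antonym_words}
    best = choose_antonym(q, a, candidates)
    return classify(q, a, vectors[best])


# Invented toy vectors for the "Was the movie good?" example.
TOY = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "evil": [-0.2, 1.0],
    "excellent": [2.0, 0.2],
    "okay": [-0.1, 0.1],
}

print(answer_polarity("good", "excellent", ["bad", "evil"], TOY))
print(answer_polarity("good", "okay", ["bad", "evil"], TOY))
```

In this toy space "bad" is selected over "evil" for both answers, "excellent" falls on the "good" side of the boundary, and "okay" falls on the "bad" side.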