Taaltheorie en Taalverwerking
BSc Artificial Intelligence
Raquel Fernández
Institute for Logic, Language, and Computation
Winter 2012, lecture 5a
Raquel Fernández TtTv 2012 - lecture 5a

Plan for this Week
Topic: lexical semantics
Today:
• Word senses
• Lexical relations between senses
• Word Sense Disambiguation (WSD)
Next lecture:
• Word similarity
• Distributional semantics models

Compositional vs. Lexical Semantics
Last week we looked into compositional semantics: how the meaning of sentences (expressed as FOL formulas) can be compositionally built up from the meanings of their constituents.
N (Vincent): λu.(u@vincent)
V (loves): λv.[λx.[v@λy.love(x, y)]]
N (Mia): λu.(u@mia)
VP: λv.[λx.[v@λy.love(x, y)]] @ λu.(u@mia)
  ⇒ λx.[λu.(u@mia) @ λy.love(x, y)]
  ⇒ λx.[λy.love(x, y) @ mia]
  ⇒ λx.love(x, mia)
S: λu.(u@vincent) @ λx.love(x, mia)
  ⇒ λx.love(x, mia) @ vincent
  ⇒ love(vincent, mia)

The compositional approach does not focus on the meaning of words. Words are considered basic expressions associated with an entity, a property, or a relation in the world (a FOL model).
  Polder is expensive: Expensive(Polder)
  Every pineapple is sweet: ∀x[Pineapple(x) → Sweet(x)]
  Mia loves Vincent: Love(Mia, Vincent)
This is a rather crude representation of word meaning: what do words like expensive or pineapple actually mean?
• Lexical semantics is the sub-field of linguistics that deals with word meanings.
• It is related to lexicography: a discipline dedicated to the design and compilation of dictionaries.

Word Forms and Word Senses
The main aspect that makes lexical semantics a challenging problem is that the relation between word form and word meaning is not one-to-one:
• Several words can have the same meaning → synonymy
  ∗ ‘buy’ / ‘purchase’
  ∗ ‘car’ / ‘automobile’
• One word can mean different things → lexical ambiguity
  ∗ ‘bank’1: the slope of land adjoining a body of water
  ∗ ‘bank’2: a business establishment in which money is kept
Note that when we talk about word forms, we refer to lemmas (stems or roots).
Word senses are the meanings associated with lemmas.

Lexical Ambiguity: One Form, Several Senses
Homonymy: accidental ambiguity between unrelated senses
(1) a. Mary walked along the bank of the river.
    b. ABN-AMRO is the richest bank in the city.
(2) a. Nadia’s plane taxied to the terminal.
    b. The central data storage device is served by multiple terminals.
    c. He disliked the angular planes of his cheeks and jaw.
Polysemy: ambiguity between semantically related senses
(3) a. The bank raised its interest rates yesterday.
    b. The store is next to the newly constructed bank.
(4) a. John crawled through the window.
    b. The window is closed.
(5) a. The lamb is running in the field.
    b. John ate lamb for dinner.
(6) a. John spilled coffee on the newspaper.
    b. The newspaper fired its editor.

Polysemy vs. Homonymy
In dictionaries, it is common to group polysemous senses within one lexical entry and to include a different lexical entry for each homonymous sense or group of senses.
http://www.dictionary.com/
The distinction between homonymy and polysemy is one of degree: there is no hard threshold for how related two senses must be to be considered polysemous.

Homophones & Homographs
Two other types of lexical ambiguity that cause problems when dealing with spoken language:
Homophones: one pronunciation, several forms and several senses.
  break / brake
  to / too / two
  knows / nose
  waste / waist
They pose problems for any application that requires speech recognition.
Homographs: one form, several senses and several pronunciations.
  All candidates are present today.
  The boss will present the award at 10:00.
They pose problems for any application that requires speech synthesis.

Relations between Senses: Synonymy & Antonymy
Besides ambiguity, lexical semantic theories are also interested in accounting for semantic relations that hold between senses.
• Synonymy: a relation of semantic identity (or near identity) between senses.
  aurora/dawn/sunrise, whore/prostitute, big/large
• Antonymy: a relation of semantic oppositeness between senses.
  tall/short, dead/alive, up/down
• Note that antonyms have opposite but very similar meanings: automatically distinguishing synonyms from antonyms can be difficult.
• Synonymy and antonymy are symmetric relations: if A is a synonym/antonym of B, then B is a synonym/antonym of A.

Relations between Senses: Hyponymy & Hypernymy
Hyponymy and Hypernymy: a relation of semantic inclusion that holds between a more general term (such as ‘bird’) and a more specific term (such as ‘robin’).
• Hyponymy and hypernymy are not symmetric; each is the converse of the other: if A is a hyponym of B, B is a hypernym of A.
• Both relations are transitive: if A is a hyponym of B and B is a hyponym of C, then A is a hyponym of C.
• The term superordinate is sometimes used in place of hypernym.
The class-inclusion relation defined by hyponymy gives rise to a taxonomy, which can be represented with a treelike structure:
  ...
    fruit
      apple
        golden
        elstar
        ...
      pear
      ...
    vegetable
      ...

How to Represent Word Senses?
Two possible approaches (we will see a different one on Thursday):
• Relational approach: we can define a sense by how it relates to other senses.
  (the essence of dictionary definitions)
• Decompositional approach: we can define a sense in terms of some set of meaning primitives.
  Dolphin = [−human, +animate, ...]
  Woman = [+human, +female]
  Bachelor = [+human, −female]
Some of these primitives can be said to play a role in the selectional restrictions of words: e.g. the verb ‘speak’ selects subjects that are [+human] and the pronoun ‘he’ refers to entities that are [−female].
Section 19.5 of J&M introduces this approach. For more on selectional restrictions of verbs, see sec. 19.4 on event participants.
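
The decompositional idea above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the feature values are the ones listed on the slide (the elided "..." for Dolphin is left unfilled), while the dictionary names and the `satisfies` helper are hypothetical.

```python
# Feature bundles copied from the slide; True stands for '+', False for '-'.
# Features that the slide does not mention are simply left out.
LEXICON = {
    "dolphin": {"human": False, "animate": True},
    "woman": {"human": True, "female": True},
    "bachelor": {"human": True, "female": False},
}

# Selectional restrictions from the slide, written as required features.
RESTRICTIONS = {
    "speak": {"human": True},   # 'speak' selects [+human] subjects
    "he": {"female": False},    # 'he' refers to [-female] entities
}


def satisfies(word, selector):
    """Return True if `word` carries every feature value `selector` requires.

    A feature that is unspecified for `word` counts as not satisfied
    (an assumption of this sketch, not a claim from the lecture).
    """
    features = LEXICON[word]
    required = RESTRICTIONS[selector]
    return all(features.get(name) == value for name, value in required.items())


print(satisfies("woman", "speak"))    # subject of 'speak' must be [+human]
print(satisfies("dolphin", "speak"))
print(satisfies("bachelor", "he"))
```

Treating unspecified features as failures is one design choice among several; a more permissive sketch could treat them as unknown instead.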

The Relational Approach
Relational theories of lexical meaning attempt to capture how lexical items are logically related to each other. They characterise word senses in terms of the inferences they license.
  raven: ∀x.Raven(x) → Black(x) ≈ Raven ⊂ Black
  dolphin: ∀x.Dolphin(x) ↔ Mammal(x) ∧ Can(x, Swim(x)) ∧ ...
  seek: ∀x∀y.Seek(x, y) ↔ Try(x, Find(x, y))
  kill: ∀x∀y.Kill(x, y) ↔ Cause(x, Become(y, ¬Alive(y)))
• Under this view, the sense of an expression is considered to be the set of its lexical entailments.
• The lexical entailments of a word W in a sentence S are all the entailments of S that are exclusively due to W.
  X devours Y (e.g. they devoured the cake)
    → X eats Y
    → X acts quickly
    ...
  X eats Y
    → X does something
    → Y disappears
    → X causes Y to disappear
    ...
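
The view of a sense as a set of lexical entailments can be made concrete with a small Python sketch. It only contains the entailments spelled out for ‘devour’ and ‘eat’ above (the "..." items are not filled in), and it assumes that ‘devour’ inherits the entailments of ‘eat’ because entailment is transitive. The names `SENSES`, `extra_entailments` and `shared_entailments` are hypothetical.

```python
# Toy entailment sets for the two verbs in the example (not a real lexicon).
EAT = {"X does something", "Y disappears", "X causes Y to disappear"}

# Assumption: "X devours Y" entails "X eats Y", and entailment is transitive,
# so 'devour' also licenses everything that 'eat' licenses.
DEVOUR = {"X eats Y", "X acts quickly"} | EAT

SENSES = {"eat": EAT, "devour": DEVOUR}


def extra_entailments(word, other):
    """Entailments licensed by `word` but not by `other`."""
    return SENSES[word] - SENSES[other]


def shared_entailments(word, other):
    """Entailments licensed by both words."""
    return SENSES[word] & SENSES[other]


print(sorted(extra_entailments("devour", "eat")))
print(sorted(shared_entailments("devour", "eat")))
```

Comparing such sets is what makes the relational view usable for inference: the difference between two sets shows what one word says beyond the other.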

Semantic Relations & Lexical Entailment
Some semantic relations can be characterised in terms of lexical entailment.
• Synonymy (assuming there are true synonyms)
  ∗ Two expressions A and B are synonymous if and only if they have the same lexical entailments
  ∗ or ∀x[A(x) ↔ B(x)]
• Hyponymy and Hypernymy
  ∗ A is a hyponym of B iff the lexical entailments of B are a proper subset of the lexical entailments of A.
  ∗ So if A is a hyponym of B (and hence B is a hypernym of A) then ∀x[A(x) → B(x)]
  ∗ recall that hyponymy is transitive
  Hyponyms    Hypernyms
  car         vehicle
  devour      eat
  enormous    large

WordNet
WordNet is a lexical database created to deal with tasks that require knowledge of lexical semantics. It can be searched online at http://wordnetweb.princeton.edu/perl/webwn
You can search for word forms (lemmas).
Each entry includes:
• a set of senses (no distinction between homonymy and polysemy)
• a set of synonyms for each sense (called a synset)
• a dictionary-style definition (called a gloss)
• a set of usage examples
Each synset is related to its direct hypernym. Longer hypernymy chains can also be examined.
See the WordNet website at http://wordnet.princeton.edu/ and section 19.3 from J&M.

WordNet: An Example
(figure not preserved in this text version)

Word Sense Disambiguation
WSD: the task of detecting which sense of a word is being used in a given context.
  Nadia’s plane taxied to the terminal.
  The central data storage device is served by multiple terminals.
  He disliked the angular planes of his cheeks and jaw.
The compositional approach we looked at last week ignored this problem. But it is critical for many NLP applications.
Basic WSD algorithms:
• take as input a word in context (e.g. a sentence) and a list of possible word senses
• return as output the correct word sense for that use
Input and output vary across applications. For instance:
• machine translation: input word in language A and a list of possible translations in language B.
• speech synthesis: input word and list of possible pronunciations (homographs).

Dictionary Methods for WSD
When we evaluate WSD independently of a task, we can use the set of senses from a lexical resource like WordNet.
• The simplest algorithm: select the most frequent sense in WordNet.
• The Lesk algorithm: a family of algorithms that take into account not only the frequency of a sense, but also the amount of overlap between the context of the input word and the gloss and examples (the signature) of each potential sense. There are several options:
  ∗ Original Lesk: the signatures of the context words are compared to the signatures of the input word.
  ∗ Corpus Lesk: if a sense-tagged corpus is available, we can use as signature the words in each relevant corpus sentence.

The Simplified Lesk Algorithm
It chooses the sense whose signature shares the most words with the context of the input word. If there is no single best sense, because there is no overlap or there is a tie, it takes the most frequent sense.

The Simplified Lesk Algorithm: Example
Target sentence: the port they served us was deliciously sweet

Homework
• You can start working on homework #5 (due Friday midnight). There will be one more exercise, which will be available tomorrow.
  ∗ Always use the latest version of the homework that is on Blackboard.
• Recall that Thursday is the deadline to update your team website with an outline of your project.
  ∗ I will look at all your websites on Friday morning.
example with bank...
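
As a rough illustration of the Simplified Lesk Algorithm described above, here is a minimal Python sketch applied to the ‘port’ target sentence. Several things are assumptions rather than material from the slides: the two sense labels, their glosses (short paraphrases used as signatures, not actual WordNet entries), the frequency order of the senses, the stopword list, and the whitespace tokenisation. Ties are resolved by falling back to the most frequent sense, following the wording of the slide.

```python
# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "or", "and", "was", "they", "us",
             "in", "from", "where", "can", "at", "we"}

# Hypothetical sense inventory for 'port', ordered from most to least
# frequent (the order is an assumption). Each signature is a paraphrased gloss.
PORT_SENSES = [
    ("port_harbour", "a place where ships load and unload goods"),
    ("port_wine", "sweet dark-red dessert wine originally from Portugal"),
]


def content_words(text):
    """Lower-case, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(context, senses):
    """Pick the sense whose signature overlaps most with the context.

    `senses` is a list of (label, signature) pairs, most frequent first.
    With no overlap at all, or with a tie for the best overlap, the most
    frequent sense (the first in the list) is returned.
    """
    context_set = content_words(context)
    overlaps = [len(context_set & content_words(sig)) for _, sig in senses]
    best = max(overlaps)
    winners = [label for (label, _), n in zip(senses, overlaps) if n == best]
    if best == 0 or len(winners) > 1:
        return senses[0][0]
    return winners[0]


target = "the port they served us was deliciously sweet"
print(simplified_lesk(target, PORT_SENSES))
```

With these toy signatures, the only content word shared with the target sentence is ‘sweet’, which appears in the wine gloss; a context with no overlapping words falls back to the first sense in the list.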