Word Senses
Slides adapted from Dan Jurafsky and James Martin
Recap on words: lemma vs. wordform
• A lemma or citation form
  • Same stem, part of speech, rough semantics
• A wordform
  • The inflected word as it appears in text
Wordform   Lemma
banks      bank
sung       sing
duermes    dormir
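The wordform-to-lemma mapping above can be sketched as a simple lookup. This is a toy stand-in for a real morphological analyzer (in practice a lemmatizer such as NLTK's WordNetLemmatizer does this for English); the tiny table covers only the three slide examples.

```python
# Toy wordform -> lemma table built from the examples above.
# A real system would use a morphological analyzer; this dict
# is an illustration covering only the three slide examples.
LEMMAS = {
    "banks": "bank",     # English noun, plural inflection
    "sung": "sing",      # English verb, past participle
    "duermes": "dormir", # Spanish verb, 2nd person singular
}

def lemma_of(wordform: str) -> str:
    """Return the lemma (citation form) for a known wordform,
    falling back to the wordform itself when unknown."""
    return LEMMAS.get(wordform.lower(), wordform)

print(lemma_of("banks"))  # bank
```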
Lemmas have senses
• One lemma "bank" can have many meanings:
  • Sense 1: "…a bank1 can hold the investments in a custodial account…"
  • Sense 2: "…as agriculture burgeons on the east bank2 the river will shrink even more"
• Sense (or word sense)
  • A discrete representation of an aspect of a word's meaning.
• The lemma bank here has two senses
Homonymy
Homonyms: words that share a form but have unrelated, distinct meanings:
• bank1: financial institution; bank2: sloping land
• bat1: club for hitting a ball; bat2: nocturnal flying mammal
1. Homographs (bank/bank, bat/bat)
2. Homophones:
   1. write and right
   2. piece and peace
Homonymy causes problems for NLP
• Information retrieval
  • "bat care"
• Machine translation
  • bat: murciélago (animal) or bate (for baseball)
• Text-to-speech
  • bass (stringed instrument) vs. bass (fish)
Polysemy
• 1. The bank was constructed in 1875 out of local red brick.
• 2. I withdrew the money from the bank.
• Are those the same sense?
  • Sense 2: "A financial institution"
  • Sense 1: "The building belonging to a financial institution"
• A polysemous word has related meanings
• Most non-rare words have multiple meanings
Metonymy or systematic polysemy
• Lots of types of polysemy are systematic
  • school, university, hospital
  • All can mean the institution or the building.
• A systematic relationship:
  • Building ↔ Organization
• Other such kinds of systematic polysemy:
  • Author (Jane Austen wrote Emma) ↔ Works of Author (I love Jane Austen)
  • Tree (Plums have beautiful blossoms) ↔ Fruit (I ate a preserved plum)
How do we know if there is more than one sense?
• The "zeugma" test: two senses of serve?
  • Which flights serve breakfast?
  • Does Lufthansa serve Philadelphia?
  • ?Does Lufthansa serve breakfast and Philadelphia?
• Since this conjunction sounds weird,
  • we say that these are two different senses of "serve"
Quiz
• Which of the following pairs exemplify homonymy (as opposed to polysemy)?
1. mouse (animal) vs. mouse (electronic device)
2. bark (of a dog) vs. bark (of a tree)
3. rock (music) vs. rock (hard)
4. chair (for sitting) vs. chair (of a meeting)
Sense Relations
Synonyms
• Words that have the same meaning in some or all contexts:
  • filbert / hazelnut
  • couch / sofa
  • big / large
  • automobile / car
  • vomit / throw up
  • water / H2O
• Two lexemes are synonyms
  • if they can be substituted for each other in all situations
  • If so, they have the same propositional meaning
Synonyms
• But there are few (or no) examples of perfect synonymy.
  • Even if many aspects of meaning are identical
  • Still may not preserve acceptability, based on notions of politeness, slang, register, genre, etc.
• Examples:
  • water / H2O
  • big / large
  • brave / courageous
Synonymy is a relation between senses rather than words
• Consider the words big and large
• Are they synonyms?
  • How big is that plane?
  • Would I be flying on a large or small plane?
• How about here:
  • Miss Nelson became a kind of big sister to Benjamin.
  • ?Miss Nelson became a kind of large sister to Benjamin.
• Why?
  • big has a sense that means being older, or grown up
  • large lacks this sense
Antonyms
• Senses that are opposites with respect to one feature of meaning
• Otherwise, they are very similar!
  • dark/light, hot/cold, short/long, up/down, fast/slow, in/out, rise/fall
• Can define a binary opposition or be at opposite ends of a scale
  • alive/dead
  • fast/slow
• Scale can be context-sensitive:
  • a short basketball player can be a tall person
Hyponymy and Hypernymy
• One sense is a hyponym of another if the first sense is more specific, denoting a subclass of the other
  • car is a hyponym of vehicle
  • mango is a hyponym of fruit
• Conversely hypernym / superordinate ("hyper is super")
  • vehicle is a hypernym of car
  • fruit is a hypernym of mango
Hyponymy more formally
(Figure: a hierarchy with Fruit above Mango)
• Extensional:
  • The class denoted by the hypernym extensionally includes the class denoted by the hyponym
• Entailment:
  • A sense A is a hyponym of sense B if being an A entails being a B
• Another name: the IS-A hierarchy
  • A IS-A B (or A ISA B)
  • B subsumes A
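The entailment view of IS-A can be sketched directly: B subsumes A whenever B is reachable from A by following hypernym links upward. The tiny hierarchy below uses the slide's examples plus hypothetical upper nodes ("natural_object", "artifact") for illustration.

```python
# A tiny IS-A hierarchy. hypernym_of[x] = immediate superordinate
# of sense x. "natural_object" and "artifact" are hypothetical
# upper-level nodes added for illustration.
HYPERNYM_OF = {
    "mango": "fruit",
    "fruit": "natural_object",
    "car": "vehicle",
    "vehicle": "artifact",
}

def is_a(a: str, b: str) -> bool:
    """Entailment test: being an `a` entails being a `b` iff `b`
    is reachable from `a` by walking hypernym links upward."""
    while a in HYPERNYM_OF:
        a = HYPERNYM_OF[a]
        if a == b:
            return True
    return False

print(is_a("mango", "fruit"))           # True
print(is_a("mango", "natural_object"))  # True (IS-A is transitive)
print(is_a("fruit", "mango"))           # False (not symmetric)
```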
Hyponyms and Instances
• Hyponymy holds between classes
• Classes have specific instances.
  • An instance is an individual, a proper noun that is a unique entity
  • San Francisco is an instance of city
• But city is a class
  • city is a hyponym of municipality, …, location, …
Meronymy
• The part-whole relation
  • A leg is part of a chair; a wheel is part of a car.
  • Wheel is a meronym of car, and car is a holonym of wheel.
Quiz
• Which of the following pairs exemplify hyponymy/hypernymy?
1. dog – animal
2. dog – tail
3. dog – beagle
4. dog – Snoopy
WordNet
WordNet 3.0
• A hierarchically organized lexical database
• On-line thesaurus + aspects of a dictionary
• Some other languages available or under development
  • (Arabic, Finnish, German, Portuguese, …)

Category    Unique strings
Noun        117,798
Verb        11,529
Adjective   22,479
Adverb      4,481
Senses of "bass" in WordNet
How is "sense" defined in WordNet?
• The synset (synonym set), the set of near-synonyms, instantiates a sense or concept, with a gloss
• Example: chump as a noun, with the gloss: "a person who is gullible and easy to take advantage of"
• This sense of "chump" is shared by 9 words: chump1, fool2, gull1, mark9, patsy1, fall guy1, sucker1, soft touch1, mug2
• Each of these senses has this same gloss
  • (Not every sense; sense 2 of gull is the aquatic bird)
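A synset can be sketched as a set of (lemma, sense-number) pairs sharing one gloss. This is a minimal illustration, not WordNet's real data structure (which NLTK exposes via `nltk.corpus.wordnet`); the entries mirror the "chump" example above.

```python
from dataclasses import dataclass

# Minimal sketch of a WordNet synset: a set of (lemma, sense#)
# pairs that share one gloss. Toy structure for illustration.
@dataclass(frozen=True)
class Synset:
    lemmas: frozenset  # of (lemma, sense_number) pairs
    gloss: str

chump_synset = Synset(
    lemmas=frozenset({
        ("chump", 1), ("fool", 2), ("gull", 1), ("mark", 9),
        ("patsy", 1), ("fall guy", 1), ("sucker", 1),
        ("soft touch", 1), ("mug", 2),
    }),
    gloss="a person who is gullible and easy to take advantage of",
)

# Every member sense shares the same gloss...
assert ("fool", 2) in chump_synset.lemmas
# ...but other senses of the same lemma do not belong to it:
assert ("gull", 2) not in chump_synset.lemmas  # gull2 = the aquatic bird
print(len(chump_synset.lemmas))  # 9
```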
WordNet Hypernym Hierarchy for "bass"
WordNet Noun Relations

Relation                      Also Called     Definition                                 Example
Hypernym                      Superordinate   From concepts to superordinates            breakfast1 → meal1
Hyponym                       Subordinate     From concepts to subtypes                  meal1 → lunch1
Instance Hypernym             Instance        From instances to their concepts           Austen1 → author1
Instance Hyponym              Has-Instance    From concepts to concept instances         composer1 → Bach1
Member Meronym                Has-Member      From groups to their members               faculty2 → professor1
Member Holonym                Member-Of       From members to their groups               copilot1 → crew1
Part Meronym                  Has-Part        From wholes to parts                       table2 → leg3
Part Holonym                  Part-Of         From parts to wholes                       course7 → meal1
Substance Meronym                             From substances to their subparts          water1 → oxygen1
Substance Holonym                             From parts of substances to wholes         gin1 → martini1
Antonym                                       Semantic opposition between lemmas         leader1 ↔ follower1
Derivationally Related Form                   Lemmas w/ same morphological root          destruction1 ↔ destroy1

Figure 16.2: Noun relations in WordNet.
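The relations in the table come in inverse pairs (hypernym/hyponym, meronym/holonym). Given one direction stored as edges, the inverse can be derived by lookup, as this sketch shows with the table's part-meronym examples (the dict itself is a toy, not the real WordNet store):

```python
# Toy store of the "Has-Part" (part meronym) relation, using the
# table's examples. The inverse "Part-Of" (part holonym) relation
# is derived by searching the edges in reverse.
PART_MERONYM = {        # whole -> its parts ("Has-Part")
    "table": ["leg"],
    "car": ["wheel"],
}

def part_holonyms(part: str) -> list:
    """Invert the meronym relation: part -> wholes ("Part-Of")."""
    return [whole for whole, parts in PART_MERONYM.items()
            if part in parts]

print(part_holonyms("leg"))    # ['table']
print(part_holonyms("wheel"))  # ['car']
```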
WordNet Verb Relations

Relation                      Definition                                                     Example
Hypernym                      From events to superordinate events                            fly9 → travel5
Troponym                      From events to subordinate events (often via specific manner)  walk1 → stroll1
Entails                       From verbs (events) to the verbs (events) they entail          snore1 → sleep1
Antonym                       Semantic opposition between lemmas                             increase1 ↔ decrease1
Derivationally Related Form   Lemmas with same morphological root                            destroy1 ↔ destruction1

Figure 16.3: Verb relations in WordNet.
WordNet: Viewed as a graph
(Figure from "Word Sense Disambiguation: A Survey")
"Supersenses": the top-level hypernyms in the hierarchy

Table 1: Summary of noun and verb supersense categories (counts from Schneider and Smith 2013's Streusel corpus).

Noun supersenses (26), with token count and an example word:
GROUP 1469 (place)          PERSON 1202 (people)        ARTIFACT 971 (car)
COGNITION 771 (way)         FOOD 766 (food)             ACT 700 (service)
LOCATION 638 (area)         TIME 530 (day)              EVENT 431 (experience)
COMMUNICATION 417 (review)  POSSESSION 339 (price)      ATTRIBUTE 205 (quality)
QUANTITY 102 (amount)       ANIMAL 88 (dog)             BODY 87 (hair)
STATE 56 (pain)             NATURAL OBJECT 54 (flower)  RELATION 35 (portion)
SUBSTANCE 34 (oil)          FEELING 34 (discomfort)     PROCESS 28 (process)
MOTIVE 25 (reason)          PHENOMENON 23 (result)      SHAPE 6 (square)
PLANT 5 (tree)              OTHER 2 (stuff)
All 26 noun supersenses: 9,018 tokens

Verb supersenses (15), with token count and an example word:
STATIVE 2922 (is)           COGNITION 1093 (know)       COMMUNICATION 974 (recommend)
SOCIAL 944 (use)            MOTION 602 (go)             POSSESSION 309 (pay)
CHANGE 274 (fix)            EMOTION 249 (love)          PERCEPTION 143 (see)
CONSUMPTION 93 (have)       BODY 82 (get…done)          CREATION 64 (cook)
CONTACT 46 (put)            COMPETITION 11 (win)        WEATHER 0
All 15 verb supersenses: 7,806 tokens
Supersenses
• A word's supersense can be a useful coarse-grained representation of word meaning for NLP tasks.
• For example, given the POS-tagged sentence:
  I_PRP googled_VBD restaurants_NNS in_IN the_DT area_NN and_CC Fuji_NNP Sushi_NNP came_VBD up_RB and_CC reviews_NNS were_VBD great_JJ so_RB I_PRP made_VBD a_DT carry_VB out_RP order_NN
• the goal is to predict the representation:
  I googled_communication restaurants_GROUP in the area_LOCATION and Fuji_Sushi_GROUP came_up_communication and reviews_COMMUNICATION were_stative great so I made_ a carry_out_possession _order_communication
• where lowercase labels are verb supersenses, UPPERCASE labels are noun supersenses, and _ joins tokens within a multiword expression (carry_out_possession and made_order_communication are separate expressions).
• Systems are expected to predict both facets of the representation, though the manner in which they do this (e.g., pipeline vs. joint model) is up to them.
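Collapsing fine-grained senses onto supersenses can be sketched as a simple lookup from sense label to coarse category. The table below is a toy stand-in for WordNet's lexicographer files, with illustrative entries only:

```python
# Toy sense -> supersense table (a stand-in for WordNet's
# lexicographer files; entries are illustrative assumptions).
SUPERSENSE = {
    "bank1": "GROUP",            # financial institution (organization)
    "bank2": "NATURAL OBJECT",   # sloping land
    "google1": "communication",  # verb: lowercase = verb supersense
}

def coarse_tag(sense: str) -> str:
    """Map a fine-grained sense label to its supersense."""
    return SUPERSENSE[sense]

print(coarse_tag("bank1"))    # GROUP
print(coarse_tag("google1"))  # communication
```

Many fine senses map onto one coarse label, which is what makes supersense tagging a tractable sequence-labeling task.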
Word Sense Disambiguation
Word Sense Disambiguation (WSD)
• Task
  • Given a word in context and a fixed inventory of potential word senses,
  • decide which sense of the word this is
• Why?
  • Machine translation, QA, speech synthesis, …
• What set of senses?
  • English-to-Spanish MT: set of Spanish translations
  • Speech synthesis: homographs like bass and bow
  • In general: the senses in a thesaurus like WordNet
Two variants of the WSD task
• Lexical sample task
  • Small pre-selected set of target words (line, plant)
  • And an inventory of senses for each word
  • Supervised machine learning: train a classifier for each word
• All-words task
  • Every word in an entire text
  • A lexicon with senses for each word
  • Data sparseness: can't train word-specific classifiers
Supervised Machine Learning Approaches
• Supervised machine learning approach:
  • a training corpus of words tagged in context with their sense
  • used to train a classifier that can tag words in new text
• Summary of what we need:
  • the tag set ("sense inventory")
  • the training corpus
  • a set of features extracted from the training corpus
  • a classifier
Supervised WSD 1: WSD Tags
• What's a tag? A dictionary sense?
• For example, for WordNet an instance of bass in a text has 8 possible tags or labels (bass1 through bass8).
Inventory of sense tags for bass

WordNet Sense   Spanish Translation   Roget Category   Target Word in Context
bass4           lubina                FISH/INSECT      …fish as Pacific salmon and striped bass and…
bass4           lubina                FISH/INSECT      …produce filets of smoked bass or sturgeon…
bass7           bajo                  MUSIC            …exciting jazz bass player since Ray Brown…
bass7           bajo                  MUSIC            …play bass because he doesn't have to solo…

Figure 16.5: Possible definitions for the inventory of sense tags for bass.
Supervised WSD 2: Get a corpus
• Lexical sample task:
  • Line-hard-serve corpus: 4,000 examples of each word
  • Interest corpus: 2,369 sense-tagged examples
• All words:
  • Semantic concordance: a corpus in which each open-class word is labeled with a sense from a specific dictionary/thesaurus.
  • SemCor: 234,000 words from the Brown Corpus, manually tagged with WordNet senses
  • SENSEVAL-3 competition corpora: 2,081 tagged word tokens
SemCor
<wf pos=PRP>He</wf>
<wf pos=VB lemma=recognize wnsn=4 lexsn=2:31:00::>recognized</wf>
<wf pos=DT>the</wf>
<wf pos=NN lemma=gesture wnsn=1 lexsn=1:04:00::>gesture</wf>
<punc>.</punc>
Supervised WSD 3: Extract feature vectors
• Intuition from Warren Weaver (1955):
"If one examines the words in a book, one at a time as through an opaque mask with a hole in it one word wide, then it is obviously impossible to determine, one at a time, the meaning of the words…
But if one lengthens the slit in the opaque mask, until one can see not only the central word in question but also say N words on either side, then if N is large enough one can unambiguously decide the meaning of the central word…
The practical question is: 'What minimum value of N will, at least in a tolerable fraction of cases, lead to the correct choice of meaning for the central word?'"
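Weaver's "slit in the opaque mask" is just a context window: the target word plus N words on either side. A minimal sketch:

```python
# Weaver's slit: the target word plus N words of context on each
# side, clipped at the sentence boundaries.
def window(tokens, i, n):
    """Return the (up to) 2N+1-word slit centred on tokens[i]."""
    return tokens[max(0, i - n): i + n + 1]

sent = "an electric guitar and bass player stand off to one side".split()
print(window(sent, sent.index("bass"), 2))
# ['guitar', 'and', 'bass', 'player', 'stand']
```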
Feature vectors
• A simple representation of each target word instance
  • Vectors of sets of feature/value pairs
  • Represented as an ordered list of values
  • Representing, e.g., the window of words around the target
Two kinds of features in the vectors
• Collocational features and bag-of-words features
• Collocational
  • Features about words at specific positions near the target word
  • Often limited to just word identity and POS
• Bag-of-words
  • Features about words that occur anywhere in the window
  • Typically limited to frequency counts
Feature Example
• Example text (WSJ):
  "An electric guitar and bass player stand off to one side, not really part of the scene"
• Assume a window of ±2 from the target
Collocational features
• Position-specific information about the words and collocations in the window
• Example (16.17): "An electric guitar and bass player stand off to one side, not really part of the scene, just as a sort of nod to gringo expectations perhaps."
• A collocational feature vector, extracted from a window of two words to the right and left of the target word, made up of the words themselves, their respective parts-of-speech, and pairs of words, that is,
  [w_{i-2}, POS_{i-2}, w_{i-1}, POS_{i-1}, w_{i+1}, POS_{i+1}, w_{i+2}, POS_{i+2}, w_{i-2..i-1}, w_{i..i+1}]
• would yield the following vector:
  [guitar, NN, and, CC, player, NN, stand, VB, and guitar, player stand]
• Word 1-, 2-, and 3-grams in a window of ±3 are common: high-performing systems generally use POS tags and word collocations of length 1, 2, and 3 from a window of words 3 to the left and 3 to the right (Zhong and Ng, 2010).
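Extracting such a vector is a few lines of list indexing. A sketch for the ±2 scheme above; note that the exact choice of word-pair features varies by formulation, and this version uses the bigrams immediately left and right of the target:

```python
# Collocational feature vector for a +/-2 window: neighbouring
# words, their POS tags, and the adjacent word bigrams. The
# bigram choice (left pair, right pair) is one reasonable
# instantiation of the scheme, not the only one.
def collocational_features(tagged, i):
    """tagged: list of (word, POS) pairs; i: target word index."""
    (wm2, pm2), (wm1, pm1) = tagged[i - 2], tagged[i - 1]
    (wp1, pp1), (wp2, pp2) = tagged[i + 1], tagged[i + 2]
    return [wm2, pm2, wm1, pm1, wp1, pp1, wp2, pp2,
            f"{wm2} {wm1}", f"{wp1} {wp2}"]

tagged = [("an", "DT"), ("electric", "JJ"), ("guitar", "NN"),
          ("and", "CC"), ("bass", "NN"), ("player", "NN"),
          ("stand", "VB"), ("off", "RP")]
print(collocational_features(tagged, 4))
# ['guitar', 'NN', 'and', 'CC', 'player', 'NN', 'stand', 'VB',
#  'guitar and', 'player stand']
```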
Bag-of-words features
• An unordered set of words: position ignored
• Counts of words that occur within the window
• Choose a vocabulary
• Count how often each word occurs in a given window
• Sometimes just a binary "indicator": 1 or 0
Co-Occurrence Example
• Assume we've settled on a possible vocabulary of 12 words in "bass" sentences:
  [fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band]
• The vector for:
  "guitar and bass player stand"
  [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
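The binary indicator vector above can be built with one comprehension over the fixed vocabulary:

```python
# Binary bag-of-words vector over the fixed 12-word vocabulary
# from the slide: 1 if the word appears in the window, else 0.
VOCAB = ["fishing", "big", "sound", "player", "fly", "rod",
         "pound", "double", "runs", "playing", "guitar", "band"]

def bow_vector(window_words):
    present = set(window_words)
    return [1 if w in present else 0 for w in VOCAB]

print(bow_vector("guitar and bass player stand".split()))
# [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
```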
Supervised WSD 4: Classifier
• Input:
  • a word w in a text window d (which we'll call a "document")
  • a fixed set of classes (senses) C = {c1, c2, …, cJ}
  • a training set of m hand-labeled text windows, again called "documents", D = {(d1, c1), …, (dm, cm)}
• Output:
  • a learned classifier f(d) = c
Naïve Bayes classifier
• Probability of class/sense given document/context:
  P(c | d) = P(c) P(d | c) / P(d)
• Assume independence between context words:
  P(d | c) = ∏i P(wi | c)
• Find the most probable class/sense:
  f(d) = argmaxj P(cj) ∏i P(wi | cj)
Estimating the parameters (with add-1 smoothing):
  P̂(c) = Nc / N
  P̂(w | c) = (count(w, c) + 1) / (count(c) + |V|)

Worked example. V = {fish, smoked, line, haul, guitar, jazz}

          Doc   Class   Words
Training  1     f       fish smoked fish
          2     f       fish line
          3     f       fish haul smoked
          4     g       guitar jazz line
Test      5     ?       line guitar jazz jazz

Priors:
  P(f) = 3/4
  P(g) = 1/4
Conditional probabilities:
  P(line | f)   = (1+1)/(8+6) = 2/14
  P(guitar | f) = (0+1)/(8+6) = 1/14
  P(jazz | f)   = (0+1)/(8+6) = 1/14
  P(line | g)   = (1+1)/(3+6) = 2/9
  P(guitar | g) = (1+1)/(3+6) = 2/9
  P(jazz | g)   = (1+1)/(3+6) = 2/9

Choosing a class:
  P(f | d5) ∝ 3/4 × 2/14 × (1/14)² × 1/14 ≈ 0.00004
  P(g | d5) ∝ 1/4 × 2/9 × (2/9)² × 2/9 ≈ 0.0006

So sense g is chosen for document 5.
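The worked example can be reproduced in a few lines. This is a toy sketch: the class names f/g, the four training "documents", and the six-word vocabulary come from the slide; `score` computes the unnormalised P(c)·∏ P(wi|c) with add-1 smoothing.

```python
from collections import Counter
from math import prod

# Training "documents" (context windows) labelled with senses
# f (fish) and g (guitar), as in the worked example above.
train = [
    ("f", "fish smoked fish"),
    ("f", "fish line"),
    ("f", "fish haul smoked"),
    ("g", "guitar jazz line"),
]
V = {"fish", "smoked", "line", "haul", "guitar", "jazz"}

docs_per_class = Counter(c for c, _ in train)
words_per_class = {c: Counter() for c in docs_per_class}
for c, text in train:
    words_per_class[c].update(text.split())

def score(c, text):
    """Unnormalised P(c) * prod_i P(w_i | c), add-1 smoothed."""
    prior = docs_per_class[c] / len(train)
    total = sum(words_per_class[c].values())  # count(c): 8 for f, 3 for g
    return prior * prod(
        (words_per_class[c][w] + 1) / (total + len(V))
        for w in text.split())

d5 = "line guitar jazz jazz"
print(score("f", d5))  # ~0.000039
print(score("g", d5))  # ~0.00061
print(max(("f", "g"), key=lambda c: score(c, d5)))  # g
```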
WSD Evaluations and baselines
• Best evaluation: extrinsic (end-to-end, task-based) evaluation
  • Embed the WSD algorithm in a task and see if you can do the task better!
• What we often do for convenience: intrinsic evaluation
  • Exact-match sense accuracy
  • % of words tagged identically with the human-annotated sense tags
  • Usually evaluated using held-out data from the same labeled corpus
• Baselines
  • Random guessing
  • Most frequent sense
Most Frequent Sense
• WordNet senses are ordered in frequency order
• So most frequent sense in WordNet = take the first sense
• Sense frequencies come from the SemCor corpus
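The baseline itself is one line once per-sense frequencies are available. The counts below are invented for illustration; real systems would read them from SemCor (or simply take WordNet's first-listed sense).

```python
# Most-frequent-sense baseline. The per-sense counts here are
# invented for illustration, not real SemCor frequencies.
SENSE_COUNTS = {
    "bass": {"bass1": 20, "bass4": 5, "bass7": 9},
}

def most_frequent_sense(lemma: str) -> str:
    """Always predict the sense with the highest corpus count."""
    counts = SENSE_COUNTS[lemma]
    return max(counts, key=counts.get)

print(most_frequent_sense("bass"))  # bass1
```

Despite its simplicity, this baseline is notoriously hard to beat on all-words WSD.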
Ceiling
• Human inter-annotator agreement
  • Compare annotations of two humans
  • On same data
  • Given same tagging guidelines
• Human agreement on all-words corpora with WordNet-style senses
  • 75%–80%
WordNet 3.0
• Where it is:
  • http://wordnetweb.princeton.edu/perl/webwn
• Libraries
  • Python: WordNet from NLTK
    • http://www.nltk.org/Home
  • Java: JWNL, extJWNL on SourceForge