WordNet, PropBank, NomBank

Nathan Schulte
WordNet – A lexical database for English
A lexical database
● Expresses nouns, verbs, adjectives, and adverbs as related groups
● Each group is called a synset – “cognitive synonyms”
● Synsets are interlinked via semantic and lexical relations
● Creates a web-like network of words – WordNet
The project – WordNet ®
● Princeton University
● Free (gratis), both research and commercial use
  ● Must accept license
  ● Must provide proper citation
● Current release is version 3.0
  ● Windows: version 2.1
WordNet, PropBank, NomBank – Nathan Schulte
WordNet – some data
The database/network
● Over 150,000 unique words
● Nearly 120,000 synsets
● Resulting in nearly 210,000 captured word-senses
API bindings
● Microsoft .NET (C#), Java
● Perl, PHP, Python, Ruby
  ● Python via NLTK Project
● Prolog
Other interfaces
● Many web interfaces and bindings – REST/RESTful
● XML encodings
● SQL adaptations
● Many “viewer” applications
WordNet – relations
Synonyms - synsets
● e.g. big – large, buy – purchase, quickly – speedily
● Primary relation in WordNet
Hypernyms / hyponyms - “forest” relation
● e.g. furniture → bed → bunk bed, color → red → scarlet
● Primary relation among synsets
● Distinguishes proper nouns from common nouns (“instances” from “types”)
Meronyms - whole/part relation
● Parts are inherited from their parent categories (“superordinates”)
  ● e.g. chair – seat, leg, back
  ● A chair has legs, so an armchair also has legs
● Inheritance is one-way: parts are not inherited upward by superordinates
  ● Chairs have legs, as do “kinds of chairs”, but “kinds of furniture” do not always have legs
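The part-inheritance rule can be sketched in a few lines of Python. This is a toy model only, not WordNet's data format or any real API: the `HYPERNYM` and `PARTS` tables and both helper functions are hypothetical names, filled in with the furniture/chair examples from this slide.

```python
# Toy model of hypernym links and meronym (part) inheritance.
# All names and data are illustrative assumptions, not WordNet's API.

# child -> parent (hypernym) links
HYPERNYM = {
    "armchair": "chair",
    "chair": "furniture",
    "bunk bed": "bed",
    "bed": "furniture",
}

# parts (meronyms) recorded directly on a word
PARTS = {
    "chair": {"seat", "leg", "back"},
}


def hypernym_chain(word):
    """Return the word followed by its superordinates, most specific first."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def inherited_parts(word):
    """Collect parts recorded on the word and on every superordinate above it."""
    parts = set()
    for ancestor in hypernym_chain(word):
        parts |= PARTS.get(ancestor, set())
    return parts


if __name__ == "__main__":
    print(hypernym_chain("armchair"))
    print(sorted(inherited_parts("armchair")))   # parts come down from "chair"
    print(sorted(inherited_parts("furniture")))  # nothing is inherited upward
```

Because the lookup only walks upward from a word to its superordinates, "armchair" picks up the parts of "chair", while "furniture" and "bed" never receive them.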
WordNet – more relations; verbs, adjectives
Troponyms - “manner” relation
● e.g. communicate → talk → whisper
● Verbs are arranged into hierarchies of troponyms
● Verbs lower in a hierarchy are more specific; they express the manner of the action in more detail
“Entails” relations
● e.g. buy → pay, succeed → try, show → see
● entails: “events that necessarily and uni-directionally entail one another”
Antonyms
● e.g. wet – dry, young – old
● Only “direct” opposites are annotated, i.e. pairs of “polar adjectives”
Pertainyms - relational adjectives
● e.g. criminal → crime
PropBank – Semantic Proposition Bank
What is PropBank?
● A corpus annotated with basic semantic propositions
● Adds predicate-argument relations to the Penn Treebank syntax trees
● University of Colorado at Boulder
● Funded by ACE; continued with funding from NSF and DARPA
Wait... What is PropBank?
● Serves a similar function to FrameNet – “frames”
● Provides a database of framesets
● Each frameset frames a predicate
● Each frameset contains rolesets
● A roleset is essentially a specific sense of the frameset's verb
PropBank – it's just data
Some background
● Frequently compared to FrameNet, but much simpler in basis
● Originally intended as training data for machine learning-based SRL
● Focused on verbs; identifying their arguments
● Each word sense of a verb equates to a roleset, but only if the difference affects the arguments thereof; e.g.
  ● amount.01 - “sums to”
    ● <role descr="thing(s) being counted" n="1"/>
    ● <role descr="count, total" n="2"/>
  ● amount.02 - “is equivalent to”
    ● <role descr="focus" n="1"/>
    ● <role descr="ground, introduced by 'to'" n="2"/>
PropBank – an example: forego.xml
<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="forego">
    <roleset id="forego.01" name="to abstain, refrain from, or do without" vncls="">
      <roles>
        <role descr="abstainer" n="0"></role>
        <role descr="thing foregone" n="1"></role>
      </roles>
      <example name="Angel Gabriel">
        <text>
          What Gabriel was asking *T*-1 was that mankind forego all its parochial moral judgements.
        </text>
        <arg n="0">mankind</arg>
        <rel>forego</rel>
        <arg n="1">all its parochial moral judgements</arg>
      </example>
    </roleset>
    <note>frame by Perryn</note>
  </predicate>
</frameset>
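Since a frameset is plain XML, it can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is one possible way to pull the roles out of the forego example and attach them to the annotated arguments; `read_rolesets` and `label_example` are hypothetical helper names, and the embedded XML is a trimmed copy of the slide content, not the full file.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the forego.xml frameset (the <text> and <note> elements are omitted).
FRAMESET_XML = """<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
<predicate lemma="forego">
<roleset id="forego.01" name="to abstain, refrain from, or do without" vncls="">
<roles>
<role descr="abstainer" n="0"></role>
<role descr="thing foregone" n="1"></role>
</roles>
<example name="Angel Gabriel">
<arg n="0">mankind</arg>
<rel>forego</rel>
<arg n="1">all its parochial moral judgements</arg>
</example>
</roleset>
</predicate>
</frameset>"""


def read_rolesets(xml_text):
    """Map each roleset id to {argument number: role description}."""
    root = ET.fromstring(xml_text)
    rolesets = {}
    for roleset in root.iter("roleset"):
        roles = {role.get("n"): role.get("descr") for role in roleset.iter("role")}
        rolesets[roleset.get("id")] = roles
    return rolesets


def label_example(xml_text):
    """Pair every annotated <arg> in the examples with its role description."""
    root = ET.fromstring(xml_text)
    labelled = []
    for roleset in root.iter("roleset"):
        roles = {role.get("n"): role.get("descr") for role in roleset.iter("role")}
        for example in roleset.iter("example"):
            for arg in example.iter("arg"):
                labelled.append((roles.get(arg.get("n")), arg.text))
    return labelled


if __name__ == "__main__":
    print(read_rolesets(FRAMESET_XML))
    print(label_example(FRAMESET_XML))
```

The argument numbers are kept as strings because that is how they appear in the `n` attributes.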
PropBank – it's still just data
How do I use it?
● Provided as a collection of XML files, each of which is a self-contained frameset
● Other projects have converted PropBank into other data formats
  ● A combined resource, the “Unified Verb Index”, combines PropBank, FrameNet, and VerbNet
    ● http://verbs.colorado.edu/verb-index/index.php
● The project also provides an HTML rendition of the original XML annotations
  ● http://verbs.colorado.edu/propbank/framesets-english/
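Because each XML file is a self-contained frameset, a simple way to use the collection is to walk a directory and build an index of rolesets. The sketch below assumes the files sit in one directory and end in `.xml`; `index_framesets` and `demo` are hypothetical names, and the demo writes a tiny stand-in file to a temporary directory instead of reading the real distribution.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def index_framesets(directory):
    """Map every roleset id in the *.xml files under directory to (lemma, name)."""
    index = {}
    for path in sorted(Path(directory).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        for predicate in root.iter("predicate"):
            for roleset in predicate.iter("roleset"):
                index[roleset.get("id")] = (predicate.get("lemma"), roleset.get("name"))
    return index


def demo():
    """Write one small stand-in frame file to a temp directory and index it."""
    sample = (
        '<frameset><predicate lemma="forego">'
        '<roleset id="forego.01" name="to abstain, refrain from, or do without"/>'
        "</predicate></frameset>"
    )
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "forego.xml").write_text(sample, encoding="utf-8")
        return index_framesets(tmp)


if __name__ == "__main__":
    print(demo())
```

Pointing `index_framesets` at a local copy of the frame files should give a lookup table from roleset id to lemma and sense name, provided the real files follow the structure shown in the forego example.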
NomBank – Noun Annotation Bank
What is NomBank?
● NomBank – Nominal, noun, bank
● Similar to PropBank, except for nouns, not verbs
  ● Looks at arguments of nouns in the corpus
  ● “to mark the sets of arguments that cooccur with nouns just as PropBank records such information for verbs”
● New York University
● Based on a project called Nomlex
● Current release: version 1.0
NomBank – it's still just data
Three lexicons
● NOMLEX-PLUS
  ● The whole of the NOMLEX project, with semi-automatically generated additions
● PropBank-style frame files
  ● Same thing as PropBank, but for nouns
  ● Contains relations from NomBank noun senses to PropBank verb senses
● Morphology file
  ● A single file, read one line at a time
  ● Each line starts with the base form of a noun, followed by its possible morphological forms
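The morphology file's line-per-noun layout lends itself to a small lookup table from any listed form back to its base form. The sketch below assumes the forms on a line are separated by whitespace, which the slide does not state; `parse_morphology` is a hypothetical helper and the `SAMPLE` lines are made up for illustration, not copied from the NomBank file.

```python
def parse_morphology(lines):
    """Map each listed form (including the base form itself) to its base form.

    Assumes whitespace-separated tokens, base form first. This separator is
    an assumption for illustration, not a documented property of the file.
    """
    form_to_base = {}
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        base = tokens[0]
        for form in tokens:
            # keep the first base form seen for any given surface form
            form_to_base.setdefault(form, base)
    return form_to_base


# Made-up sample lines in the assumed layout.
SAMPLE = ["shard shards", "", "child children"]

if __name__ == "__main__":
    table = parse_morphology(SAMPLE)
    print(table["shards"])
    print(table["children"])
```

With a real file, the same function could be fed an open file object, since iterating over a file yields one line at a time.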
NomBank – an example
<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="shard">
    <roleset id="shard.01" name="partitive-part">
      <roles>
        <role descr="whole" n="1"></role>
      </roles>
      <example name="autogen1">
        <text>
          shards of metal
        </text>
        <rel>shards</rel>
        <arg n="1">of metal</arg>
      </example>
    </roleset>
  </predicate>
</frameset>
NomBank – another example: shopping.xml
<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="shopping">
    <roleset id="shopping.01" name="look for something to buy" source="verb-shop.01" vncls="35.2">
      <roles>
        <role descr="shopper" n="0">
          <vnrole vncls="35.2" vntheta="Agent"/></role>
        <role descr="thing sought" n="1">
          <vnrole vncls="35.2" vntheta="Theme"/></role>
        <role descr="source" n="2"></role>
        <role descr="beneficiary" n="4"></role>
      </roles>
      <example name="autogen1">
        <text>
          one-stop shopping for takeover finance
        </text>
        <arg n="2">one-stop</arg>
        <rel>shopping</rel>
        <arg n="1">for takeover finance</arg>
      </example>
    </roleset>
  </predicate>
</frameset>

<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="shop">
    <roleset id="shop.01" name="look for something to buy" vncls="35.2">
      <roles>
        <role descr="shopper" n="0">
          <vnrole vncls="35.2" vntheta="Agent"/></role>
        <role descr="thing sought" n="1">
          <vnrole vncls="35.2" vntheta="Theme"/></role>
        <role descr="benefactive" n="4"/>
      </roles>
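The two frames above are connected by the `source="verb-shop.01"` attribute on the noun roleset. A minimal sketch of following that link in Python is shown below. Reading the `verb-` prefix as "PropBank verb roleset id" is an interpretation of this one example; `verb_source` and `role_numbers` are hypothetical helpers; and both XML strings are cut down to the `<roleset>` and `<role>` elements so that each parses on its own.

```python
import xml.etree.ElementTree as ET

# Cut-down copies of the two rolesets (vnrole children and examples omitted).
NOUN_ROLESET = """<roleset id="shopping.01" name="look for something to buy" source="verb-shop.01" vncls="35.2">
<roles>
<role descr="shopper" n="0"/>
<role descr="thing sought" n="1"/>
<role descr="source" n="2"/>
<role descr="beneficiary" n="4"/>
</roles>
</roleset>"""

VERB_ROLESET = """<roleset id="shop.01" name="look for something to buy" vncls="35.2">
<roles>
<role descr="shopper" n="0"/>
<role descr="thing sought" n="1"/>
<role descr="benefactive" n="4"/>
</roles>
</roleset>"""


def verb_source(roleset):
    """Return the verb roleset id named by source="verb-...", or None if absent."""
    source = roleset.get("source", "")
    prefix = "verb-"
    return source[len(prefix):] if source.startswith(prefix) else None


def role_numbers(roleset):
    """Return the set of argument numbers declared in a roleset."""
    return {role.get("n") for role in roleset.iter("role")}


noun = ET.fromstring(NOUN_ROLESET)
verb = ET.fromstring(VERB_ROLESET)

if __name__ == "__main__":
    print(verb_source(noun) == verb.get("id"))
    # argument numbers the noun frame declares beyond the verb frame
    print(sorted(role_numbers(noun) - role_numbers(verb)))
```

In this pair the noun frame shares arguments 0, 1, and 4 with the verb frame and adds argument 2 ("source").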
Anything else...
PropBank and NomBank are corpus-based: both annotate the Penn Treebank (PTB) Wall Street Journal (WSJ) text
Processes involved in projects are general...
...
References
WordNet - http://wordnet.princeton.edu/wordnet/
● http://en.wikipedia.org/wiki/Wordnet
PropBank - http://verbs.colorado.edu/~mpalmer/projects/ace.html
● http://en.wikipedia.org/wiki/PropBank
● http://verbs.colorado.edu/propbank/
● http://en.wikipedia.org/wiki/VerbNet
● http://verbs.colorado.edu/verb-index/index.php
NomBank - http://nlp.cs.nyu.edu/meyers/NomBank.html
● http://nlp.cs.nyu.edu/meyers/nombank/nombank-specs-2007.pdf
● http://nlp.cs.nyu.edu/nomlex/index.html