WordNet, PropBank, NomBank
Nathan Schulte

WordNet – A lexical database for English

A lexical database
● Expresses nouns, verbs, adjectives, and adverbs as related groups
● Each group is called a synset – “cognitive synonyms”
● Synsets are interlinked via semantic and lexical relations
● Creates a web-like network of words – WordNet

The project – WordNet ®
● Princeton University
● Free (gratis), both research and commercial use
  ● Must accept license
  ● Must provide proper citation
● Current release is version 3.0
  ● Windows: version 2.1

WordNet, PropBank, NomBank – Nathan Schulte

WordNet – some data

The database/network
● Over 150,000 unique words
● Nearly 120,000 synsets
● Resulting in nearly 210,000 captured word-senses

API bindings
● Microsoft .NET (C#), Java
● Perl, PHP, Python, Ruby
  ● Python via NLTK Project
● Prolog

Other interfaces
● Many web interfaces and bindings – REST/RESTful
● XML encodings
● SQL adaptations
● Many “viewer” applications

WordNet – relations

Synonyms - synsets
● e.g. big – large, buy – purchase, quickly – speedily
● Primary relation in WordNet

Hypernyms / hyponyms - “forest” relation
● e.g. furniture → bed → bunkbed, scarlet → red → color
● Primary relation among synsets
● Distinguishes proper nouns from common nouns (“instances” from “types”)

Meronyms - whole/part relation
● e.g. chair – seat, leg, back
● Parts are inherited from their parent categories (“superordinates”)
  ● Chair has legs, so armchair also has legs
● Inheritance is one-way; superordinates do not inherit parts from their subordinates
  ● Chairs have legs, as do “kinds of chairs”, but “kinds of furniture” do not always have legs

WordNet – more relations; verbs, adjectives

Troponyms - “manner” relation
● e.g. communicate → talk → whisper
● Verbs are arranged into hierarchies of troponyms
● Children in a hierarchy are more troponymic; they express the manner of action in more detail

“Entails” relations
● e.g.
buy → pay, succeed → try, show → see
● Entails: “events that necessarily and uni-directionally entail one another”

Antonyms
● e.g. wet – dry, young – old
● “Direct” opposites are annotated; “polar adjectives” only

Pertainyms - relational adjectives
● e.g. criminal → crime

PropBank – Semantic Proposition Bank

What is PropBank?
● A corpus annotated with basic semantic propositions
● Adds predicate-argument relations to the Penn Treebank syntax trees
● University of Colorado at Boulder
● Funded by ACE, continuation by NSF and DARPA

Wait... What is PropBank?
● Has a function similar to FrameNet's – “frames”
● Provides a database of framesets
● Each frameset frames a predicate
● Each frameset contains rolesets
● A roleset is essentially a specific sense of the verb of the frameset

PropBank – it's just data

Some background
● Frequently compared to FrameNet, but much simpler in basis
● Originally intended as training data for machine learning-based semantic role labeling (SRL)
  ● Focused on verbs; identifying their arguments
● Each word sense of a verb equates to a roleset, but only if the difference affects the arguments thereof; e.g.
  ● amount.01 - “sums to”
    ● <role descr="thing(s) being counted" n="1"/>
    ● <role descr="count, total" n="2"/>
  ● amount.02 - “is equivalent to”
    ● <role descr="focus" n="1"/>
    ● <role descr="ground, introduced by 'to'" n="2"/>

PropBank – an example: forego.xml

<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="forego">
    <roleset id="forego.01" name="to abstain, refrain from, or do without" vncls="">
      <roles>
        <role descr="abstainer" n="0"></role>
        <role descr="thing foregone" n="1"></role>
      </roles>
      <example name="Angel Gabriel">
        <text>
          What Garbriel was asking *T*-1 was that mankind forego all its parochial moral judgements.
        </text>
        <arg n="0">mankind</arg>
        <rel>forego</rel>
        <arg n="1">all its parochial moral judgements</arg>
      </example>
    </roleset>
    <note>frame by Perryn</note>
  </predicate>
</frameset>

PropBank – it's still just data

How do I use it?
● Provided as a collection of XML files, each of which is a self-contained frameset
● Other projects have converted PropBank into other data formats
  ● A combined set, the “Unified Verb Index”, combines PropBank, FrameNet, and VerbNet
    ● http://verbs.colorado.edu/verb-index/index.php
● The project also provides an HTML rendition of the original XML annotations
  ● http://verbs.colorado.edu/propbank/framesets-english/

NomBank – Noun Annotation Bank

What is NomBank?
● NomBank – Nominal, noun, bank
● Similar to PropBank, except for nouns, not verbs
  ● Looks at arguments of nouns in the corpus
  ● “to mark the sets of arguments that cooccur with nouns just as PropBank records such information for verbs”
● New York University
● Based on a project called Nomlex
● Current release: version 1.0

NomBank – it's still just data

Three lexicons
● NOMLEX-PLUS
  ● The whole of the NOMLEX project, with semi-automatically generated additions
● PropBank frame files
  ● Same thing as PropBank, but for nouns
  ● Contains relations from NomBank noun senses to PropBank verb senses
● Morphology file
  ● A single file, read one line at a time
  ● Each line starts with the base form of a noun, followed by its possible morphological forms

NomBank – an example

<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="shard">
    <roleset id="shard.01" name="partitive-part">
      <roles>
        <role descr="whole" n="1"></role>
      </roles>
      <example name="autogen1">
        <text>
          shards of metal
        </text>
        <rel>shards</rel>
        <arg n="1">of metal</arg>
      </example>
    </roleset>
  </predicate>
</frameset>

NomBank – another example: shopping.xml

<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="shopping">
    <roleset id="shopping.01" name="look for something to buy" source="verb-shop.01" vncls="35.2">
      <roles>
        <role descr="shopper" n="0">
          <vnrole vncls="35.2" vntheta="Agent"/></role>
        <role descr="thing sought" n="1">
          <vnrole vncls="35.2" vntheta="Theme"/></role>
        <role descr="source" n="2"></role>
        <role descr="beneficiary" n="4"></role>
      </roles>
      <example name="autogen1">
        <text>
          one-stop shopping for takeover finance
        </text>
        <arg n="2">one-stop</arg>
        <rel>shopping</rel>
        <arg n="1">for takeover finance</arg>
      </example>
    </roleset>
  </predicate>
</frameset>

<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="shop">
    <roleset id="shop.01" name="look for something to buy" vncls="35.2">
      <roles>
        <role descr="shopper" n="0">
          <vnrole vncls="35.2" vntheta="Agent"/></role>
        <role descr="thing sought" n="1">
          <vnrole vncls="35.2" vntheta="Theme"/></role>
        <role descr="benefactive" n="4"/>
      </roles>

Anything else...

Projects are corpus-based; PTB WSJ (Penn Treebank, Wall Street Journal)
Processes involved in projects are general...
...

References

WordNet - http://wordnet.princeton.edu/wordnet/
● http://en.wikipedia.org/wiki/Wordnet

PropBank - http://verbs.colorado.edu/~mpalmer/projects/ace.html
● http://en.wikipedia.org/wiki/PropBank
● http://verbs.colorado.edu/propbank/
● http://en.wikipedia.org/wiki/VerbNet
● http://verbs.colorado.edu/verb-index/index.php

NomBank - http://nlp.cs.nyu.edu/meyers/NomBank.html
● http://nlp.cs.nyu.edu/meyers/nombank/nombank-specs-2007.pdf
● http://nlp.cs.nyu.edu/nomlex/index.html
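
Reading a frameset file

Since the PropBank and NomBank frame files shown on the earlier slides are plain XML, any XML parser can read them. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and an inline copy of the forego.xml frameset from the PropBank example slide. The function name list_rolesets and the shape of the returned dictionaries are hypothetical choices for illustration, not part of either project.

```python
import xml.etree.ElementTree as ET

# Inline copy of the forego.xml frameset shown on the PropBank example slide.
FOREGO_XML = """<!DOCTYPE frameset SYSTEM "frameset.dtd">
<frameset>
  <predicate lemma="forego">
    <roleset id="forego.01" name="to abstain, refrain from, or do without" vncls="">
      <roles>
        <role descr="abstainer" n="0"></role>
        <role descr="thing foregone" n="1"></role>
      </roles>
      <example name="Angel Gabriel">
        <text>What Garbriel was asking *T*-1 was that mankind forego all its parochial moral judgements.</text>
        <arg n="0">mankind</arg>
        <rel>forego</rel>
        <arg n="1">all its parochial moral judgements</arg>
      </example>
    </roleset>
    <note>frame by Perryn</note>
  </predicate>
</frameset>
"""


def list_rolesets(xml_text):
    """Return one dict per roleset (hypothetical helper, not a project API).

    Each dict holds the predicate lemma, the roleset id and gloss,
    the numbered roles, and the labeled arguments of each example.
    """
    root = ET.fromstring(xml_text)
    rolesets = []
    for predicate in root.iter("predicate"):
        for roleset in predicate.iter("roleset"):
            # Map argument number -> role description, e.g. "0" -> "abstainer".
            roles = {role.get("n"): role.get("descr") for role in roleset.iter("role")}
            examples = []
            for example in roleset.iter("example"):
                examples.append({
                    "rel": example.findtext("rel"),
                    "args": {arg.get("n"): arg.text for arg in example.iter("arg")},
                })
            rolesets.append({
                "lemma": predicate.get("lemma"),
                "id": roleset.get("id"),
                "name": roleset.get("name"),
                "roles": roles,
                "examples": examples,
            })
    return rolesets


if __name__ == "__main__":
    for rs in list_rolesets(FOREGO_XML):
        print(rs["id"], "-", rs["name"])
        for n, descr in sorted(rs["roles"].items()):
            print(f"  Arg{n}: {descr}")
        for ex in rs["examples"]:
            for n, text in sorted(ex["args"].items()):
                print(f"    Arg{n} = {text!r} (rel: {ex['rel']})")
```

The same function should also work on the NomBank frames above (shard, shopping), since they share the frameset layout. Attributes that appear only there, such as source and the nested vnrole elements, are simply ignored by this sketch.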