
InToEventS: An Interactive Toolkit for Discovering and Building Event Schemas
Germán Ferrero (1), Audi Primadhanty (2), Ariadna Quattoni (3)
(1) Universidad Nacional de Córdoba, Córdoba, Argentina
(2) Universitat Politècnica de Catalunya, Barcelona, Spain
(3) Xerox Research Centre Europe, Grenoble, France
Summary
• Goal: event schema induction
• Main challenge: no supervision
• Traditional approaches: require document-level supervision
• Problems:
  • No annotated data
  • User does not know in advance the event schemas that he/she is interested in
• Contributions:
  • Interactive event schema induction system that can be used by non-experts to explore a corpus and easily build event schemas and their corresponding extractors
Event Schema
Event Triggers
Set of atomic predicates associated with an event
• Literal (e.g. explode)
• Real-valued word vector representation
• Distance threshold (defines a ball around the literal in a word vector space representation)
Event Slots
Set of participating entities involved in the event
• Entity type (e.g. person, organization or object)
• A set of predicates:
  • Literal
  • Real-valued word vector representation
  • Distance threshold
  • Syntactic relation
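The schema components listed above can be pictured as a small data structure. The following is only a sketch under assumptions: the class names (Trigger, SlotPredicate, Slot, EventSchema), the Euclidean metric, the two-dimensional toy vectors and the threshold values are all hypothetical and are not taken from the toolkit itself.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Trigger:
    """An atomic predicate: a literal, its word vector, and a ball radius."""
    literal: str
    vector: Sequence[float]
    threshold: float

    def matches(self, word_vector: Sequence[float]) -> bool:
        # A word matches if it falls inside the ball around the literal.
        # Euclidean distance is an assumption; the poster does not name a metric.
        return math.dist(self.vector, word_vector) <= self.threshold


@dataclass
class SlotPredicate:
    """A trigger plus the syntactic relation the entity holds to it."""
    trigger: Trigger
    relation: str  # e.g. "subject" or "object"


@dataclass
class Slot:
    """A participating entity: an entity type and a set of predicates."""
    entity_type: str
    predicates: List[SlotPredicate]


@dataclass
class EventSchema:
    name: str
    triggers: List[Trigger]
    slots: Dict[str, Slot]


# Toy 2-d vectors, invented for illustration only.
explode = Trigger("explode", (1.0, 0.0), threshold=0.5)
die = Trigger("die", (0.8, 0.3), threshold=0.3)
attack = Trigger("attack", (0.7, 0.2), threshold=0.3)

bombing = EventSchema(
    name="bombing",
    triggers=[explode],
    slots={
        "victim": Slot(
            entity_type="PERSON",
            predicates=[SlotPredicate(die, "subject"),
                        SlotPredicate(attack, "object")],
        )
    },
)

near_explode = explode.matches((0.9, 0.1))   # inside the ball
far_from_explode = explode.matches((0.0, 1.0))  # outside the ball
```

Storing a threshold per trigger, rather than one global radius, mirrors the per-literal "ball" described in the Event Triggers panel.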
First Step: Event Induction
Second Step: Role Induction
Idea
Observations
• Literals that tend to appear nearby in a document usually play a role in the
same event description
• Literals with similar meaning are usually describing the same atomic
predicates
System
• Extract predicate literals: all unique verbs and all nouns with a corresponding synset in WordNet labeled as noun.event or noun.act
• Calculate distance between predicates:
  • Distance in corpus
  • Distance in word embedding vector space
• Agglomerative clustering: based on corpus distance
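The system-side steps above (pairwise corpus distance, then agglomerative clustering) could be sketched as follows. The poster does not say how corpus distance is computed or which linkage is used, so both are assumptions here: corpus distance is taken as the mean of the smallest token gap between two literals over the documents that contain both, and clustering uses average linkage. The function names and the toy documents are hypothetical.

```python
from itertools import combinations


def corpus_distance(docs, a, b, missing=1e6):
    """Mean smallest token gap between a and b over documents containing both.

    Returns `missing` (an arbitrary large value) if they never co-occur.
    """
    gaps = []
    for tokens in docs:
        pos_a = [i for i, t in enumerate(tokens) if t == a]
        pos_b = [i for i, t in enumerate(tokens) if t == b]
        if pos_a and pos_b:
            gaps.append(min(abs(i - j) for i in pos_a for j in pos_b))
    return sum(gaps) / len(gaps) if gaps else missing


def agglomerate(items, dist):
    """Average-linkage agglomerative clustering.

    Returns the merge history [(linkage distance, cluster, cluster), ...],
    which is the information a dendrogram displays.
    """
    def linkage(c1, c2):
        return sum(dist(x, y) for x in c1 for y in c2) / (len(c1) * len(c2))

    clusters = [frozenset([x]) for x in items]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((linkage(c1, c2), c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


# Toy tokenized documents, invented for illustration ("x" is filler).
docs = [
    ["explode", "x", "die"],
    ["elect", "vote"],
    ["explode", "die", "x", "x", "x", "vote"],
]
predicates = ["explode", "die", "elect", "vote"]
merges = agglomerate(predicates, lambda a, b: corpus_distance(docs, a, b))
```

On this toy corpus the first two merges pair "elect" with "vote" and "explode" with "die", which is the behaviour the first observation (nearby literals tend to describe the same event) leads one to expect.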
User
• Explore the resulting dendrogram:
  • Choose distance threshold
  • Choose initial partition of event triggers
  • Merge or split initial clusters
• Select and label the clusters
• Expand each event trigger set by adding predicate literals that are close in the word vector space
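The two user-driven operations above, cutting the dendrogram at a chosen threshold and expanding a trigger set in the word vector space, might look like the sketch below. The merge-history format, the function names cut and expand, the Euclidean metric, and all toy values are assumptions made for illustration.

```python
import math


def cut(items, merges, threshold):
    """Partition items by applying only merges at or below the threshold.

    `merges` is a list of (linkage distance, cluster, cluster) tuples,
    where each cluster is a frozenset of items.
    """
    cluster_of = {x: frozenset([x]) for x in items}
    for distance, c1, c2 in sorted(merges, key=lambda m: m[0]):
        if distance > threshold:
            break
        merged = c1 | c2
        for x in merged:
            cluster_of[x] = merged
    return set(cluster_of.values())


def expand(trigger_set, embeddings, radius):
    """Add every literal lying within `radius` of some trigger in the set."""
    expanded = set(trigger_set)
    for word, vec in embeddings.items():
        if any(math.dist(vec, embeddings[t]) <= radius for t in trigger_set):
            expanded.add(word)
    return expanded


# Toy merge history and 2-d embeddings, invented for illustration.
items = ["explode", "die", "elect", "vote"]
merges = [
    (1.0, frozenset({"elect"}), frozenset({"vote"})),
    (1.5, frozenset({"explode"}), frozenset({"die"})),
    (9.0, frozenset({"explode", "die"}), frozenset({"elect", "vote"})),
]
embeddings = {
    "explode": (1.0, 0.0),
    "die": (0.8, 0.3),
    "detonate": (0.95, 0.05),
    "elect": (0.1, 0.9),
    "vote": (0.0, 1.0),
}

partition = cut(items, merges, threshold=2.0)
bombing_triggers = expand({"explode", "die"}, embeddings, radius=0.2)
```

Raising the threshold passed to cut yields fewer, coarser clusters; lowering it splits them, which corresponds to the merge and split choices offered to the user.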
Second Step: Role Induction
Victim of a bombing:
“Someone who dies, is attacked or injured”,
that is:
“PERSON: subject of die, object of attack, object of injure”
System
• For each predicate in the event trigger set:
• Extract from the corpus all unique tuples:
< predicate, syntactic relation, entity type >
• Compute distance between tuples: based on average word embeddings
of the arguments observed in the corpus
• Agglomerative clustering
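The role-induction steps above can be sketched in a few lines. Several things here are assumptions rather than details from the poster: the distance between tuples is taken as the Euclidean distance between the averaged argument embeddings, and the agglomerative clustering is simplified to single linkage cut at one threshold (equivalent to grouping tuples connected by chains of distances at or below that threshold). The function names, toy observations and toy vectors are hypothetical.

```python
import math
from collections import defaultdict


def tuple_centroids(observations, embeddings):
    """Average the embeddings of the arguments observed with each tuple.

    Each observation is (predicate, syntactic relation, entity type, argument).
    """
    args = defaultdict(list)
    for predicate, relation, entity_type, argument in observations:
        args[(predicate, relation, entity_type)].append(embeddings[argument])
    return {
        t: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for t, vecs in args.items()
    }


def single_linkage_groups(centroids, threshold):
    """Group tuples whose centroids are chained by distances <= threshold."""
    groups = []
    for t, vec in centroids.items():
        near = [g for g in groups
                if any(math.dist(vec, centroids[o]) <= threshold for o in g)]
        merged = {t}.union(*near)
        groups = [g for g in groups if g not in near] + [merged]
    return groups


# Toy argument embeddings and observations, invented for illustration.
embeddings = {
    "civilian": (1.0, 0.0),
    "soldier": (0.9, 0.1),
    "rebels": (0.0, 1.0),
    "guerrilla": (0.1, 0.9),
}
observations = [
    ("die", "subject", "PERSON", "civilian"),
    ("die", "subject", "PERSON", "soldier"),
    ("attack", "object", "PERSON", "civilian"),
    ("injure", "object", "PERSON", "soldier"),
    ("attack", "subject", "PERSON", "rebels"),
    ("attack", "subject", "PERSON", "guerrilla"),
]

centroids = tuple_centroids(observations, embeddings)
slots = single_linkage_groups(centroids, threshold=0.3)
```

With these toy values the tuples for subject of die, object of attack and object of injure fall into one group, matching the "victim" example, while subject of attack stays separate.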
User
• Explore different cluster settings and store those that represent the slots that he/she is interested in