Poster

Global and Local Wikification (GLOW) in TAC KBP Entity Linking Shared Task 2011
Visit our demo: http://cogcomp.cs.illinois.edu/demo/wikify/
Lev Ratinov, Dan Roth
1) MENTION IDENTIFICATION
Vision: aggregate information about an entity from multiple documents
It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.
• Is a Macintosh font
• Has a distinctive N
• Used in Mac OS 7.6
• …
3) GLOW OUTPUT RECONCILIATION
Given a set of mentions linked to the query, we need to provide a single Wikipedia title. However, each mention can be assigned a different title.
We use the ranker scores and the linker scores to make the decision. The “with linker” strategy discards mentions assigned a negative linker score (which means the objective function increases if we map these mentions to NULL). The “no linker” strategy uses all mentions.
The decision on the single best-matching title is based on ranker scores. The “Max” strategy uses the single mention with the highest ranker score. The “Sum” strategy sums the ranker scores of all the mentions assigned to the same title.
In the figure on the left, we illustrate the four resulting strategies along with the mentions they use and the resulting ranker scores for each title. The hollow circles indicate the discarded mentions, while the full circles indicate mentions that contribute to the final title ranking scores.
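The four combinations described above (with/without linker filtering, Max/Sum aggregation) can be sketched roughly as follows. This is a minimal illustration, not the GLOW implementation: the names `LinkedMention` and `reconcile` are hypothetical, the example scores are made up, and returning `None` when every mention is discarded is an assumption of this sketch.

```python
from typing import Dict, List, NamedTuple, Optional


class LinkedMention(NamedTuple):
    """Hypothetical container for one disambiguated mention."""
    title: str      # Wikipedia title assigned to the mention
    ranker: float   # ranker score of that assignment
    linker: float   # linker score (negative: prefer mapping to NULL)


def reconcile(mentions: List[LinkedMention],
              use_linker: bool,
              aggregate: str) -> Optional[str]:
    """Pick a single title for the query from its linked mentions.

    use_linker=True  -> "with linker": drop mentions with negative linker score.
    use_linker=False -> "no linker": keep all mentions.
    aggregate="max"  -> title of the single best-scoring mention.
    aggregate="sum"  -> title with the largest summed ranker score.
    """
    if aggregate not in ("max", "sum"):
        raise ValueError("aggregate must be 'max' or 'sum'")
    kept = [m for m in mentions if not use_linker or m.linker >= 0]
    if not kept:
        return None  # assumption: no surviving mention means no title
    scores: Dict[str, float] = {}
    for m in kept:
        if aggregate == "sum":
            scores[m.title] = scores.get(m.title, 0.0) + m.ranker
        else:
            scores[m.title] = max(scores.get(m.title, float("-inf")), m.ranker)
    return max(scores, key=lambda t: scores[t])


# Made-up example: title A has one strong mention that the linker rejects,
# title B has two weaker mentions that the linker accepts.
example = [
    LinkedMention("A", ranker=0.5, linker=-0.2),
    LinkedMention("B", ranker=0.3, linker=0.4),
    LinkedMention("B", ranker=0.3, linker=0.1),
]
for use_linker in (False, True):
    for aggregate in ("max", "sum"):
        print(use_linker, aggregate, reconcile(example, use_linker, aggregate))
```

In this toy input only the "no linker" + "Max" combination selects title A; the other three select B, which shows how the policies can differ when mentions disagree.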
We have explored two mention identification strategies:
• Simple Query Identification (SIQI): mark the expressions in the text which
match the query form exactly.
• Named Entity Query Identification (NEQI): identify the named entities in the
text matching the query form approximately, normalize the spelling using
Wikipedia (this poster illustrates NEQI). This is similar to query expansion.
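The two strategies can be sketched as below. This is a simplified illustration under stated assumptions: the function names `siqi` and `neqi` are hypothetical, named entities are assumed to be supplied by an external recognizer, "approximate match" is reduced to case-insensitive token containment, and the Wikipedia-based spelling normalization is omitted.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, surface text)


def siqi(text: str, query_form: str) -> List[Span]:
    """SIQI sketch: mark every exact, whole-word occurrence of the query form."""
    pattern = re.compile(r"(?<!\w)" + re.escape(query_form) + r"(?!\w)")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]


def neqi(named_entities: List[Span], query_form: str) -> List[Span]:
    """NEQI sketch: keep named entities that approximately match the query form.

    Assumption: an entity matches when it contains every query token,
    ignoring case. The real system also normalizes spelling using Wikipedia.
    """
    query_tokens = set(query_form.lower().split())
    return [ne for ne in named_entities
            if query_tokens <= set(ne[2].lower().split())]


TEXT = "The Ford Presidential Library is named after President Gerald Ford"

# Entities as a hypothetical recognizer might return them.
ENTITIES: List[Span] = [
    (4, 8, "Ford"),
    (9, 29, "Presidential Library"),
    (55, 66, "Gerald Ford"),
]

print(siqi(TEXT, "Ford"))
print(neqi(ENTITIES, "Ford"))
```

On this input SIQI marks the two bare occurrences of "Ford", while the NEQI sketch keeps the full entity "Gerald Ford", which gives the disambiguation step a more informative surface form.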
2) GLOW DISAMBIGUATION
GLOW Problem Formulation: bipartite matching
Task methodology: map queries to a TAC entity database
TAC QUERY (ID=2012, Form=“Ford”, Text=“The Ford Presidential Library is named after President Gerald Ford”)
TAC QUERY (ID=2017, Form=“Michael”, Text=“This video shows Michael Jackson performing Billie Jean”)
KBP TAC Knowledgebase
…
Michael Jordan (basketball)
Michael Jackson (singer)
Gerald Ford (president)
…
Experiments, Results (TAC 2011 Test Data)
Our approach: use the GLOW “disambiguation to Wikipedia” system
Local and Global Algorithms for Disambiguation to Wikipedia. L. Ratinov, D. Downey, M. Anderson, and D. Roth (ACL 2011)
TAC QUERY
* ID=2012
* “Ford”
* “The Ford Presidential Library is named after President Gerald Ford”
1) MENTION IDENTIFICATION
“The [Ford]m1 Presidential Library is named after President [Gerald Ford]m2”
2) GLOW DISAMBIGUATION
(m1, http://en.wikipedia.org/wiki/Ford_Motor_Company, 0.1, -0.1)
(m2, http://en.wikipedia.org/wiki/President_Gerald_Ford, 0.2, 0.7)
3) GLOW OUTPUT RECONCILIATION
QUERY MAPPING
KBP TAC Knowledgebase
…
Michael Jordan (basketball)
Michael Jackson (singer)
Gerald Ford (president)
…
• Γ* is a solution to the problem: a set of mention-title pairs (m, t).
• Evaluate the local matching quality using Φ(m, t).
• Evaluate the global structure based on (a) pair-wise coherence scores Ψ(ti, tj) and (b) an approximate solution Γ’. Γ’ allows disambiguating the mentions independently while taking into account the global structure.
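One way to read this formulation is that each mention independently picks the title maximizing Φ(m, t) plus the summed coherence Ψ(t, t’) over the titles t’ in Γ’. The sketch below illustrates that reading only. The function names, the toy scores, and the choice of Γ’ as the local-only best title per mention are all assumptions made for illustration, not details taken from the poster.

```python
from typing import Callable, Dict, List

Candidates = Dict[str, List[str]]  # mention id -> candidate titles


def local_solution(candidates: Candidates,
                   phi: Callable[[str, str], float]) -> Dict[str, str]:
    """Hypothetical choice of Γ': the best title per mention by Φ alone."""
    return {m: max(titles, key=lambda t: phi(m, t))
            for m, titles in candidates.items()}


def disambiguate(candidates: Candidates,
                 phi: Callable[[str, str], float],
                 psi: Callable[[str, str], float],
                 gamma_prime: List[str]) -> Dict[str, str]:
    """Score each candidate by Φ(m, t) plus summed Ψ(t, t') over t' in Γ'."""
    return {
        m: max(titles,
               key=lambda t: phi(m, t) + sum(psi(t, tj) for tj in gamma_prime))
        for m, titles in candidates.items()
    }


# Toy scores, invented for this example.
FMC, GF, POTUS = "Ford Motor Company", "Gerald Ford", "President of the United States"
PHI = {("m1", FMC): 0.5, ("m1", GF): 0.4,
       ("m2", GF): 0.9, ("m2", FMC): 0.1,
       ("m3", POTUS): 0.8}
PSI = {frozenset((FMC, GF)): 0.1,
       frozenset((GF, POTUS)): 0.6,
       frozenset((FMC, POTUS)): 0.05}


def phi(m: str, t: str) -> float:
    return PHI[(m, t)]


def psi(a: str, b: str) -> float:
    # Symmetric lookup; a title's coherence with itself is taken as 0 here.
    return PSI.get(frozenset((a, b)), 0.0)


candidates = {"m1": [FMC, GF], "m2": [GF, FMC], "m3": [POTUS]}
gamma_prime = list(local_solution(candidates, phi).values())
print(local_solution(candidates, phi))
print(disambiguate(candidates, phi, psi, gamma_prime))
```

With these toy numbers, the local scores alone link m1 to "Ford Motor Company", while adding the coherence terms moves it to "Gerald Ford".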
Resulting knowledge base entry for query ID=2012: Gerald Ford (president)
Conclusions:
1) It is possible to apply a “disambiguation to Wikipedia” system directly to the TAC KBP Entity Linking task. We did not train our system on TAC data.
2) NEQI mention identification gains 4 B³ F1 points over SIQI.
3) All reasonable output reconciliation policies performed comparably.
This research is supported by the Defense Advanced Research Projects Agency (DARPA) Machine Reading Program under Air Force Research Laboratory (AFRL) prime contract no. FA8750-09-C-0181 and by the Army Research Laboratory (ARL) under agreement W911NF-09-2-0053. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the view of DARPA, AFRL, ARL, or the US government.