Global and Local Wikification (GLOW) in TAC KBP Entity Linking Shared Task 2011
Lev Ratinov, Dan Roth
Visit our demo: http://cogcomp.cs.illinois.edu/demo/wikify/

Vision: aggregate information about an entity from multiple documents

It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.
• Is a Macintosh font
• Has a distinctive N
• Used in Mac OS 7.6
• …

1) MENTION IDENTIFICATION

We have explored two strategies:
• Simple Query Identification (SIQI): mark the expressions in the text which match the query form exactly.
• Named Entity Query Identification (NEQI): identify the named entities in the text matching the query form approximately, and normalize the spelling using Wikipedia (this poster illustrates NEQI). This is similar to query expansion.

3) GLOW OUTPUT RECONCILIATION

Given a set of mentions linked to the query, we need to provide a single Wikipedia title. However, each mention can be assigned a different title. We use the ranker scores and the linker scores to make the decision.

The “with linker” strategy discards mentions assigned a negative linker score (which means the objective function increases if we map these mentions to NULL). The “no linker” strategy uses all mentions.

The decision on the single best-matching title is based on ranker scores. The “Max” strategy uses the single mention with the highest ranker score. The “Sum” strategy sums the ranker scores of all the mentions assigned to the same title.

In the figure on the left, we illustrate the 4 resulting strategies along with the mentions they use, and with the resulting ranker scores for each title. The hollow circles indicate the discarded mentions, while the full circles indicate mentions that contribute to the final title ranking scores.
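The four output reconciliation strategies described above (“with linker” or “no linker”, combined with “Max” or “Sum”) can be sketched in a few lines of Python. This is only an illustrative sketch, not the GLOW implementation: the names LinkedMention and reconcile are hypothetical, and we assume that each tuple in the poster's example reads (mention, title, ranker score, linker score), which the poster does not state explicitly.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional


class LinkedMention(NamedTuple):
    """One disambiguated mention (hypothetical container, not a GLOW type)."""
    mention: str
    title: str
    ranker_score: float
    linker_score: float


def reconcile(mentions: List[LinkedMention],
              use_linker: bool,
              aggregate: str) -> Optional[str]:
    """Pick a single title for the query from its linked mentions.

    use_linker=True drops mentions with a negative linker score.
    aggregate="max" takes the title of the single best-ranked mention;
    aggregate="sum" adds up ranker scores per title and takes the best title.
    Returns None (NULL) when no mention survives the filtering.
    """
    if aggregate not in ("max", "sum"):
        raise ValueError("aggregate must be 'max' or 'sum'")

    # "with linker": discard mentions whose linker score is negative.
    kept = [m for m in mentions if not use_linker or m.linker_score >= 0]
    if not kept:
        return None

    if aggregate == "max":
        return max(kept, key=lambda m: m.ranker_score).title

    # "sum": accumulate ranker scores over mentions sharing a title.
    totals = defaultdict(float)
    for m in kept:
        totals[m.title] += m.ranker_score
    return max(totals, key=totals.get)


# The two mentions from the poster's "Ford" example, under the assumed
# (mention, title, ranker score, linker score) reading of the tuples.
poster_mentions = [
    LinkedMention("m1", "http://en.wikipedia.org/wiki/Ford_Motor_Company", 0.1, -0.1),
    LinkedMention("m2", "http://en.wikipedia.org/wiki/President_Gerald_Ford", 0.2, 0.7),
]

for use_linker in (True, False):
    for aggregate in ("max", "sum"):
        print(use_linker, aggregate, reconcile(poster_mentions, use_linker, aggregate))
```

On this two-mention example all four strategies select the Gerald Ford title. The strategies can disagree when several lower-scored mentions share a title or when the linker discards the top-ranked mention.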
2) GLOW DISAMBIGUATION

GLOW Problem Formulation: bipartite matching

Γ* is a solution to the problem: a set of mention-title pairs (m, t).
• Evaluate the local matching quality using Φ(m, t).
• Evaluate the global structure based on (a) pair-wise coherence scores Ψ(ti, tj) and (b) an approximate solution Γ’. Γ’ allows disambiguating the mentions independently while taking into account the global structure.

Task methodology: map queries to a TAC entity database

TAC QUERY (ID=2012, Form=“Ford”, Text=“The Ford Presidential Library is named after President Gerald Ford”)
TAC QUERY (ID=2017, Form=“Michael”, Text=“This video shows Michael Jackson performing Billie Jean”)

KBP TAC Knowledgebase:
…
Michael Jordan (basketball)
Michael Jackson (singer)
Gerald Ford (president)
…

Our approach: use the GLOW “disambiguation to Wikipedia” system
L. Ratinov, D. Downey, M. Anderson and D. Roth. Local and Global Algorithms for Disambiguation to Wikipedia. ACL 2011.

Pipeline illustration for TAC QUERY ID=2012 (Form=“Ford”):
1) MENTION IDENTIFICATION and 2) GLOW DISAMBIGUATION:
“The [Ford]m1 Presidential Library is named after President [Gerald Ford]m2”
(m1, http://en.wikipedia.org/wiki/Ford_Motor_Company, 0.1, -0.1)
(m2, http://en.wikipedia.org/wiki/President_Gerald_Ford, 0.2, 0.7)
3) GLOW OUTPUT RECONCILIATION (query mapping):
Gerald Ford (president)

Experiments, Results (TAC 2011 Test Data)

Conclusions:
1) It is possible to apply a “disambiguation to Wikipedia” system directly to the TAC KBP Entity Linking task. We did not train our system on TAC data.
2) NEQI mention identification gains 4 B3 F1 points over SIQI.
3) All reasonable output reconciliation policies performed comparably.

This research is supported by the Defense Advanced Research Projects Agency (DARPA) Machine Reading Program under Air Force Research Laboratory (AFRL) prime contract no. FA8750-09-C-0181 and by the Army Research Laboratory (ARL) under agreement W911NF-09-2-0053. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the view of DARPA, AFRL, ARL or the US government.