Relation Alignment for Textual Entailment Recognition
Cognitive Computation Group, University of Illinois
Recognizing Textual Entailment
The task of Recognizing Textual Entailment (RTE) frames
natural language text understanding as recognizing
when two text spans express the same meaning. In
the example below, the text span ‘T’ contains the
meaning of the text span ‘H’, so a successful RTE
system would say that T entails H.
T: The Shanghai Co-operation Organization (SCO) is a fledgling association that binds Russia, China and four other nations.
H: China is a member of SCO.
Most successful systems share a basic assumption:
that semantics is largely compositional, meaning that
we can combine the results of local entailment
decisions to reach a global decision. Many systems
share the same basic architecture:
1. Preprocess the TE pair with a range of NLP tools
2. Determine some structure over each sentence in
the Entailment pair
3. Align some level of structure in the Hypothesis
with structure in the Text
4. Either directly compute the entailment result
from the alignment (online or in batch mode),
or extract features using the alignment (and
possibly other resources) and determine the label
of the TE pair from this feature representation.
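As a rough illustration of this shared architecture, the sketch below (in Python) implements the four steps with deliberately trivial stand-ins: bags of tokens as the "structure", exact lexical match as the aligner, and a coverage threshold in place of a learned classifier. All names and the threshold are our own assumptions, not components of any cited system.

# Minimal sketch of the four-step RTE architecture; every component is a
# trivial stand-in for real NLP tools, alignment metrics, and classifiers.
import re

def preprocess(span: str) -> list[str]:
    # Step 1 (and a degenerate Step 2): tokenize; a real system would run
    # parsers, NE taggers, etc., and build richer structure over each span
    return re.findall(r"[a-z0-9-]+", span.lower())

def align(h_tokens: list[str], t_tokens: list[str]) -> dict[str, str]:
    # Step 3: align Hypothesis tokens to Text tokens by exact match
    t_set = set(t_tokens)
    return {h: h for h in h_tokens if h in t_set}

def entails(text: str, hypothesis: str, threshold: float = 0.5) -> bool:
    # Step 4, first variant: decide directly from alignment coverage
    h_tokens = preprocess(hypothesis)
    alignment = align(h_tokens, preprocess(text))
    return len(alignment) / len(h_tokens) > threshold

print(entails(
    "The Shanghai Co-operation Organization (SCO) is a fledgling "
    "association that binds Russia, China and four other nations.",
    "China is a member of SCO."))  # -> True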
(Zanzotto and Moschitti 2006) take the first of these
approaches, computing the ‘best’ alignment for each
pair and then learning a classifier over all aligned
pairs in a corpus, thereby using alignment directly to
determine the entailment label.
Others, such as (Hickl et al. 2006; de Marneffe et al.
2007), use alignment as a filtering step to select
among possible feature sources. (Zanzotto and
Moschitti 2006) explain their alignment as capturing
valid and invalid syntactic transformations across
many entailment pairs. (de Marneffe et al. 2007) propose
an alignment task that is separate from the entailment
decision, in which elements in the Hypothesis are
paired with the most relevant elements of the Text.
We believe that Alignment is a valuable inference
framework in RTE, but found problems with existing
approaches when we tried to incorporate new
analysis and comparison resources. In the present
work, we share our insights about the Alignment
process and its relation to Textual Entailment
decisions.
The RATER System

The RATER system first annotates entailment pairs
with a suite of NLP analytics, generating a multi-view
representation mapping each analysis to the original
text. Resource-specific metrics are then used to
compare constituents in each (T,H) paired view (e.g.,
NE metrics are used to compare constituents in the T,
H Named Entity views) to build a match graph. An
Aligner then selects edges from these graphs (see
panel below). Features are then extracted over the
resulting set of alignments and used to train a
classifier, which then labels new examples.
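To make the match-graph step concrete, the sketch below uses a simplified data model of our own (the Constituent record, the view names, and the toy NE metric are illustrative assumptions, not RATER's actual interfaces):

# Sketch of per-view match-graph construction. The data model and the toy
# NE metric below are illustrative assumptions, not RATER's actual code.
from dataclasses import dataclass

@dataclass(frozen=True)
class Constituent:
    view: str       # which analysis produced this unit, e.g. "NE", "SRL"
    text: str       # surface form of the constituent
    position: int   # token offset in the source span

def match_graph(h_view, t_view, metric):
    # Score every (H constituent, T constituent) pair with the
    # resource-specific metric for this view; keep nonzero edges.
    edges = []
    for h in h_view:
        for t in t_view:
            score = metric(h, t)
            if score > 0.0:
                edges.append((h, t, score))
    return edges

# Toy Named-Entity metric: exact, case-insensitive string match. A real
# system would plug in NE-specific similarity resources here.
def ne_metric(h: Constituent, t: Constituent) -> float:
    return 1.0 if h.text.lower() == t.text.lower() else 0.0

h_ne = [Constituent("NE", "China", 0), Constituent("NE", "SCO", 5)]
t_ne = [Constituent("NE", "SCO", 4), Constituent("NE", "China", 12)]
print(match_graph(h_ne, t_ne, ne_metric))  # two matching NE edges

In the full system, one such graph is built per group of views, and the Aligner then selects edges from each graph as described in the panel below.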
Experimental Results

The RATER system was trained on the RTE5
Development corpus and evaluated on the RTE5
Test corpus.
We compare the system’s
performance against a ‘smart’ lexical baseline
that uses WordNet-based similarity resources.
In addition, we carried out an ablation study
with three versions of the system: without
WordNet-based resources (“No WN”); without
Named Entity resources (“No NE”); and with
simple Named Entity similarity (“Basic NE”).
After the submission deadline, we augmented
the shallow semantic predicates in the full
system using Coreference information to create
predicates spanning multiple sentences (“+Coref”).
Contributions
• Identify clear roles for Alignment in Textual Entailment systems: filter and decider
• Propose an alignment framework that leverages focused knowledge resources and avoids canonicalization
Figure 1: Architecture of the RATER system
Alignment over Multiple Views
In the alignment step, instead of aligning only a
single shallow or unified representation (as previous
alignment systems have done), RATER divides the
set of views into groups and computes a separate
alignment for each group (groups contain analysis
sources for which the comparison metrics share a
common output scale). Within each alignment,
RATER selects the edges that maximize match score
while minimizing the distance of mapped constituents
in the text from each other; the objective function is
given below. The selected constituents of H must
respect the constraint that each token in H may be
mapped to at most one token in T.
$$\max_{e}\; \sum_{i=1}^{m} \Big[\, m\big(e(H_i, T_j)\big) \;-\; \lambda \cdot \delta\big(e(H_i, T_j),\, e(H_{i+1}, T_k)\big) \Big] \quad \text{s.t.} \quad \sum_{j} I\big[e(H_i, T_j)\big] \le 1 \;\; \forall i$$

Figure 2: Objective function for Alignment
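Here m(·) is the match score of a proposed edge and δ(·,·) penalizes the distance in T between the targets of adjacent H constituents. The fragment below is a minimal greedy reading of this objective, under our own assumptions (the λ value, the edge encoding, and the greedy strategy are illustrative; the actual aligner may optimize the objective differently):

# Greedy sketch of the alignment objective: prefer high match scores,
# penalize T-side spread between the targets of adjacent H constituents,
# and let each H position be used at most once. LAMBDA is a hypothetical
# weight, not a value from the RATER system.
LAMBDA = 0.1

def select_alignment(edges):
    # edges: (h_pos, t_pos, match_score) triples from one match graph
    chosen = {}  # h_pos -> (t_pos, score)
    for h, t, score in sorted(edges, key=lambda e: -e[2]):
        if h in chosen:
            continue  # each token in H maps to at most one token in T
        # distance penalty: how far t lies from the T positions already
        # chosen for H's immediate neighbors
        penalty = sum(abs(t - t2) for h2, (t2, _) in chosen.items()
                      if abs(h - h2) == 1)
        if score - LAMBDA * penalty > 0:
            chosen[h] = (t, score)
    return chosen

edges = [(0, 12, 1.0), (5, 4, 1.0), (1, 11, 0.6), (1, 30, 0.6)]
print(select_alignment(edges))  # h=1 picks t=11, closer to its neighbor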
Table 1 shows the performance of these
variants of the system on the Development
corpus, while Table 2 shows the results on the
Test corpus. Performance is consistent with the
expected behavior of the system: as semantic
resources are removed, performance declines.
WordNet (Miller et al. 1990), Named Entity
(Ratinov and Roth, 2009), and Coreference
(Bengtson and Roth, 2008) resources each
make a significant contribution to overall
performance.
Figure 3: Example showing multiple alignments over different views of the entailment pair

RTE5 Development

System     All    QA     IE     IR
Baseline   0.628  0.641  0.557  0.683
Submtd*    0.648  0.647  0.552  0.744
No NE*     0.640  0.631  0.577  0.708
Basic NE   0.623  0.655  0.543  0.670
No WN      0.647  0.650  0.533  0.755
+Coref     0.663  0.665  0.559  0.765

Table 1: RTE5 2-way Task Results (Dev. Corpus)
RTE5 Test

System     All    QA     IE     IR
Baseline   0.600  0.550  0.500  0.750
Submtd*    0.644  0.580  0.576  0.775
No NE*     0.629  0.580  0.530  0.775
Basic NE   0.633  0.580  0.605  0.715
No WN      0.603  0.565  0.535  0.710
+Coref     0.666  0.596  0.615  0.785

Table 2: RTE5 2-way Task Results (Test Corpus)

Selected References

Marie-Catherine de Marneffe, Trond Grenager, Bill MacCartney, Daniel Cer, Daniel Ramage, Chloe Kiddon, and Christopher D. Manning: Aligning semantic graphs for textual inference and machine reading. In AAAI Spring Symposium at Stanford, 2007.

Fabio Massimo Zanzotto and Alessandro Moschitti: Automatic learning of textual entailments with cross-pair similarities. In Proceedings of the 21st Intl. Conf. on Computational Linguistics and 44th Annual Meeting of the ACL, 2006.

Andrew Hickl, John Williams, Jeremy Bensley, Kirk Roberts, Bryan Rink, and Ying Shi: Recognizing textual entailment with LCC's groundhog system. In Proc. of the 2nd PASCAL Challenges Workshop on Recognizing Textual Entailment, 2006.

L. Ratinov and D. Roth: Design challenges and misconceptions in named entity recognition. In Proc. of CoNLL 2009.

E. Bengtson and D. Roth: Understanding the value of features for coreference resolution. In EMNLP 2008.
Mark Sammons, V.G.Vinod Vydiswaran, Tim Vieira, Nikhil Johri, Ming-Wei Chang, Dan Goldwasser, Vivek
Srikumar, Gourab Kundu, Yuancheng Tu, Kevin Small, Joshua Rule, Quang Do, Dan Roth