
Representing and reasoning for explicit, implicit and implicated textual information
Moldovan D. ([email protected])
Human Language Technology Research Institute
University of Texas at Dallas
Explicit, implicit and implicated text meaning
It is useful to distinguish between explicit and implicit information, and between
implicit and implicated knowledge conveyed by natural language.
Explicit information is what the reader gathers only from the strict meaning
of words. For instance, the explicit information conveyed by The car is cheap (S1)
is that a particular automobile is inexpensive.
Implicit information, also called impliciture, is built up from the explicit content of the utterance by conceptual strengthening or “enrichment”, which yields what
would have been made fully explicit if lexical extensions had been included in the utterance. In our example, one may complete the sentence by saying that the car is cheap
compared to other cars.
Implicated information, called implicature, is completely separate from what is said and goes beyond it. Implicatures are additional propositions external to what is said, heavily dependent on the context of the situation. The speaker of S1 may want to make a derogatory remark about the driver of the car, or may praise its owner, who is rather wealthy, for driving an average car and not wasting unnecessary money
on a vehicle. Similarly, Mary has a boyfriend implicates that Mary has exactly one
boyfriend. However, many other implicatures are possible depending on the context;
for example: one should not ask Mary out, Mary is not a lesbian, Mary is getting a divorce, or she will be getting a divorce.
Most implicated information is conveyed during conversations. For example,
given the following exchange between two users of twitter.com:
A: Dinner's ready! prawns, grouper in some sauce, vegetables, rice and shark's fin melon soup! Still waiting for lotus root soup this week!
B: Eeeeeee lotus root?
A: so what you having for dinner?
several facts are stated explicitly (the dinner is ready, a list of dishes: prawns,
grouper with sauce, rice, soup made of shark’s fin and melon, lotus root soup for later
in the week, A's question about what B will have for dinner). Implicitly, A's first utterance discloses the existence of a meal which includes all the dishes it explicitly mentions. B's Eeeeeee points to disgust, a feeling of dislike. Last but not least, the implicatures conveyed by each utterance complete its understanding by tying together the explicit and implicit information it communicates: A has prepared the dinner, which includes the list of mentioned dishes; A is excited about having prepared this gourmet dinner; B dislikes lotus root and cannot believe that A would choose to eat it; A has a poor opinion of B's gastronomic knowledge.
We note that text understanding ranges from extracting explicit information
to more complex implicitures or entailments, all the way to implicatures that take
into account contextual information. Thus, implicitures or implications are implicit
in what is said, whereas implicatures are implied by what is said. Implicatures depend
largely on the context of the utterance, whereas implications depend less on context. Sentences imply certain things, but speakers implicate: a speaker utters a sentence and in fact implicates something else, which can be true or false.
Hierarchical knowledge representation
In order to represent the information conveyed by natural language texts (explicit, implicit or implicated), we proposed a hierarchical knowledge representation consisting of several layers (Figure 1). Each layer in the hierarchy builds upon the previous layer by bringing new information. The base is the lexico-syntactic layer, which
reveals the text’s syntactic parse trees. The semantic layer adds semantic relations
between the concepts identified by the previous layer. Next, the identified events bring
a new level of abstraction to the process of representing knowledge since events tend
to dominate the meaning of text. Last, relations between events are established with
the goal of determining the text coherence at the highest level.
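To make the layering concrete, a minimal sketch of such a representation as Python data structures is given below; the class and field names are our own invention for illustration, not the paper's implementation.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Concept:                   # layer 1: lexico-syntactic information
    var: str                     # logical variable, e.g. "x1" or "e1"
    lexeme: str                  # base form, e.g. "kidnapping"
    pos: str                     # part of speech, e.g. "NN"
    wn_sense: Optional[int] = None  # WordNet sense number, if disambiguated

@dataclass
class Relation:                  # a named relation between two variables
    name: str                    # e.g. "THEME_SR" or "CAUSE_ER"
    args: tuple                  # e.g. ("x1", "e1")

@dataclass
class TextRepresentation:
    concepts: list = field(default_factory=list)           # layer 1
    semantic_relations: list = field(default_factory=list) # layer 2
    events: list = field(default_factory=list)             # layer 3: event variables
    event_relations: list = field(default_factory=list)    # layer 4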
In order to derive the explicit information conveyed by natural language, which
is to be represented using the formalism described above and stored for later use
by more advanced applications, a library of robust NLP tools is needed to extract
lexical, syntactic and semantic information from natural language input. Our NLP
pipeline includes a text tokenization module, a part-of-speech tagging system, a sentence boundary detection tool, a collocation identification module, a named entity
recognizer, a chunk parser, a word sense disambiguation system[4], a full-syntactic
parser [1], a nominal and pronominal coreference resolution module, a semantic
parser [2],[9], and an event and event relation identification tool [3],[5].
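Schematically, these modules form a chain of annotators over a shared document, each stage enriching the previous one's output. The sketch below uses placeholder stages, since the actual tool interfaces are not described here.

from typing import Callable

Annotator = Callable[[dict], dict]

def annotator(name: str) -> Annotator:
    """Placeholder for a real NLP module: records that the stage ran."""
    def run(doc: dict) -> dict:
        doc.setdefault("stages", []).append(name)
        return doc
    return run

# The stage names mirror the pipeline enumerated above; a real system would
# plug in the actual tokenizer, POS tagger, WSD, parsers, and so on.
PIPELINE = [annotator(n) for n in (
    "tokenize", "sentence_split", "pos_tag", "collocations", "ner",
    "chunk", "wsd", "syntactic_parse", "coreference", "semantic_parse",
    "events_and_relations",
)]

def run_pipeline(text: str) -> dict:
    doc = {"text": text}
    for stage in PIPELINE:   # each module consumes and enriches the shared doc
        doc = stage(doc)
    return doc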
Logical representation of knowledge
In order to facilitate the consumption of derived knowledge by a natural language reasoner, the hierarchical information must be converted into a logical form.
For this purpose, a logical predicate is created for each textual concept that participates in a semantic relation. The name of each predicate derived from a concept
is a concatenation of the lexeme’s base form, its part of speech, and corresponding
WordNet sense number. For questions, a special predicate is introduced to capture the
target type, the answer type of the question. Furthermore, for each predicate associated with a named entity or event, a new predicate is added to the logical form of the
text that describes this property. Last, but not least, logical predicates denoting the
semantic relations and event relations identified during the NLU process are included
to complete the logical representation of the derived knowledge.
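For example, the predicate-naming convention (base form, part of speech, WordNet sense number) can be expressed as a small helper; the function name and the handling of named entities without a sense number are our assumptions.

def predicate_name(base_form: str, pos: str, wn_sense=None) -> str:
    """Build a predicate name such as 'kidnapping_NN_1' or 'fire_VB_1'.
    Multi-word concepts keep underscores; named entities may lack a sense."""
    name = base_form.lower().replace(" ", "_")
    return f"{name}_{pos}" + (f"_{wn_sense}" if wn_sense is not None else "")

assert predicate_name("kidnapping", "NN", 1) == "kidnapping_NN_1"
assert predicate_name("Gilda Flores", "NN") == "gilda_flores_NN"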
Given these simple text-to-logic transformation rules, an accurate multi-layered first order logical form of the knowledge conveyed by Gilda Flores's kidnapping occurred on January 13, 1990. A week before, he had fired the kidnappers can be derived
(Figure 1).
Input text: Gilda Flores's kidnapping occurred on January 13th, 1990. A week before, he had fired the kidnappers.

Concepts (lexical and syntactic layer):
gilda_flores_NN(x1) & _human_NE(x1) & kidnapping_NN_1(e1) & 1_13_1990_NE(x2) & _date_NE(x2) & 1_6_1990_NE(x3) & _date_NE(x3) & fire_VB_1(e2) & kidnapper_NN_1(x4)

Semantic relations layer:
THEME_SR(x1,e1) & AGENT_SR(x4,e1) & DURING_SR(e1,x2) & AGENT_SR(x1,e2) & THEME_SR(x4,e2) & DURING_SR(e2,x3)

Events layer:
_occurrence_EV(e1) & _occurrence_EV(e2)

Event relations layer:
CAUSE_ER(e2,e1) & BEFORE_ER(e2,e1)

Figure 1. Hierarchical knowledge representation
More specifically, all concepts mentioned in the text constitute the bottom
layer of the representation. Named entity classes (e. g., Gilda Flores = human) and
normalized values of temporal expressions (e. g., a week before January 13, 1990 =
01/06/1990) are captured within this layer of lexical information. The next level
of knowledge is dedicated to semantic relations identified between the text’s concepts
(e. g., theme(Gilda Flores, kidnapping)). Co-referring concepts are also stored in this
layer of representation (he = Gilda Flores). Events and their properties are identified
and represented as the third layer of information (e. g., kidnap, theme(Gilda Flores),
agent(kidnappers), during(01/13/1990)), with the final level of representation consisting of event-event relations (cause(fire, kidnapping), before(fire, kidnapping)). Finally, there is a causation relation: the firing (e2) caused the kidnapping (e1) and preceded it. The reason for the kidnapping is the firing of the employees, and the two sentences can be summarized as describing a revenge action.
A model for representing knowledge
In order to represent the meaning of natural language, we define a model that
captures not only the explicit information transmitted to the hearer, but also the implicit and implicated information conveyed by a speaker.
By characterizing a speaker S by his utterances (SU = {u1,..., um}, where ui is a speaker utterance, i. e., what the speaker says) and his intentions (SI = {i1,..., ik}, where ij is a speaker intention conveyed by SU), and the hearer H by his understanding of SU (HTM denotes the hearer's understanding of the explicit text meaning; HTE represents the hearer's understanding of the implicit text meaning, i. e., entailments; and HI symbolizes the implicatures H made from SU), the 7-tuple
{SU, SI, HTM, HTE, HI, C, K}
captures the complete meaning of an utterance. Our model’s C component represents the context and K denotes the common sense knowledge needed by H to derive
S’s implicatures.
Noting that each of these components is a hierarchical representation of the explicit, implicit, as well as implicated knowledge conveyed by a natural language text, the model establishes a standard of representation for natural language meaning.
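A direct transcription of the 7-tuple into a container type might look as follows; this is only a sketch, in which each component is simplified to a set of clause strings.

from dataclasses import dataclass, field

Clause = str  # a first-order clause, e.g. "dinner_NN_1(x1)"

@dataclass
class UtteranceMeaning:
    SU: set = field(default_factory=set)   # speaker utterances (what is said)
    SI: set = field(default_factory=set)   # speaker intentions
    HTM: set = field(default_factory=set)  # hearer: explicit text meaning
    HTE: set = field(default_factory=set)  # hearer: entailments (implicit meaning)
    HI: set = field(default_factory=set)   # hearer: implicatures
    C: set = field(default_factory=set)    # conversational context
    K: set = field(default_factory=set)    # common sense knowledge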
An instantiation of this model for A’s first utterance, part of the exchange shown
above, Dinner’s ready! prawns, grouper in some sauce, vegetables, rice and shark’s fin
melon soup! Still waiting for lotus root soup this week! is
SU/HTM:
dinner_NN_1(x1) & ready_JJ_1(x2) & value_SR(x2,x1) & _exclamatory_type(x1) & prawn_NN_1(x3) & grouper_NN_1(x4) & sauce_NN_1(x5) & part_whole_SR(x4,x5) & vegetable_NN_1(x6) & rice_NN_1(x7) & shark_NN_1(x8) & fin_NN_1(x9) & part_whole_SR(x9,x8) & melon_NN_1(x10) & soup_NN_1(x11) & part_whole_SR(x10,x11) & part_whole_SR(x9,x11) & still_RB_1(x12) & wait_VB_1(e1) & _occurrence_EV(e1) & manner_SR(x12,e1) & lotus_NN_1(x13) & root_NN_1(x14) & part_whole_SR(x14,x13) & soup_NN_1(x15) & part_whole_SR(x14,x15) & theme_SR(x15,e1) & week_NN_1(x16) & _date_NE(x16) & during_SR(e1,x16)
HTE:
dish_NN_1(x17) & part_whole_SR(x5,x17) & meal_NN_1(x18) & part_whole_SR(x17,x18) & part_whole_SR(x5,x18) & dish_NN_1(x19) & isa_SR(x11,x19) & meal_NN_1(x20) & part_whole_SR(x19,x20) & part_whole_SR(x11,x20)
SI/HI:
show_VB_1(e2) & _A_USER(x21) & agent_SR(x21,e2) & strong_JJ_1(x22) & feeling_NN_1(x23) & value_SR(x22,x23) & theme_SR(x23,e2) & =(x1,x18,x20) & cook_VB_1(e3) & theme_SR(x18,e3) & agent_SR(x21,e3)
C: ∅
Reasoning
The NLU methodologies mentioned above are the first step taken by automated
systems to determine the complete meaning of a natural language text. They derive
the explicit information conveyed by the text: the meaning of the words used by the
speaker (given by the named entity recognizer or the word sense disambiguation
modules) and their semantic connections (given by the coreference resolution module
and the semantic parser). However, in order to go beyond what is being said, to derive
the implicit and implicated information conveyed by the text, and, thus, to expand the
understanding of explicit information, reasoning mechanisms that exploit the derived
knowledge using various types of natural language axioms within an abductive reasoning framework must be implemented.
The natural language reasoner presented here processes the natural language of a conversation one turn at a time and derives the logical implications
and implicatures communicated by each speaker utterance, updating the conversation’s context after each individual analysis. Below, we show the architecture
of the system, highlighting the roles and interactions of our model’s components
({SU, SI, HTM, HTE, HI, C, K}).
Figure 2. Architecture of the reasoning system: for each turn of the input conversation, the speaker utterance SU undergoes logic form transformation (NLP+) and is passed to the default reasoner (Cogex), which draws on the world knowledge K (XWN lexical chains, Semantic Calculus axioms, common sense axioms, and Grice Maxims axioms) to produce SU/HTM, HTE, and SI/HI, and then updates the context C.
Cogex, a natural language reasoning engine
Cogex is a heavily modified version of the Otter theorem prover (http://www.cs.unm.edu/~mccune/otter/), which uses the Set of Support (SoS) strategy to prove by contradiction: a hypothesis is proved by showing that it is impossible for it to be false in the face of the provided evidence and background knowledge (BackgroundKnowledge & Evidence & ¬Hypothesis → ⊥).
Cogex has been successfully employed within a question answering engine to rerank the final list of candidate answer passages based on the degree of entailment
between each passage and the given question [16]. For the Recognizing Textual Entailment task, Cogex computes the extent to which a text snippet entails a hypothesis
as a normalized score between 0 and 1 and compares this value to a threshold learned
during training to determine whether the given hypothesis is entailed by the given
text [6],[7],[8].
Within these settings, the initial usable list contains various natural language
axioms, which can be used to infer new information (BackgroundKnowledge), while
the logical clauses corresponding to the candidate passage/text snippet (Evidence)
as well as the negated question/hypothesis (¬Hypothesis) are added to the SoS list.
Given Otter’s best-first clause selection mechanism, the heavily weighted question/
hypothesis clauses are the last clauses to be processed by the system. Thus, when
resolutions using these clauses are attempted, it is guaranteed that all other inferences ([BackgroundKnowledge & Evidence]+) have been made and are stored in the
usable list.
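To illustrate the refutation setup on a toy scale, the sketch below implements a propositional set-of-support prover with FIFO clause selection instead of Otter's weighted best-first search; it is not Cogex, merely the underlying SoS idea, and all names are ours.

def negate(lit: str) -> str:
    return lit[1:] if lit.startswith("-") else "-" + lit

def resolvents(c1: frozenset, c2: frozenset):
    # all binary resolvents of two clauses (sets of string literals)
    for lit in c1:
        if negate(lit) in c2:
            yield frozenset((c1 - {lit}) | (c2 - {negate(lit)}))

def refute(usable: set, sos: list) -> bool:
    """True if the empty clause is derivable; every resolution step uses a
    clause descended from the set of support, as in the SoS strategy."""
    seen = set(sos)
    while sos:
        given = sos.pop(0)          # FIFO pick; Otter uses weighted best-first
        for other in usable | seen:
            for r in resolvents(given, other):
                if not r:
                    return True     # empty clause: contradiction found
                if r not in seen:
                    seen.add(r)
                    sos.append(r)
        usable.add(given)
    return False

# K = {p -> q}, i.e. clause {-p, q}; Evidence = {p}; negated hypothesis = {-q}.
assert refute({frozenset({"-p", "q"})}, [frozenset({"p"}), frozenset({"-q"})])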
For the task of deriving implicit and implicated information conveyed by natural language, Cogex exploits the first order logical representations of the meaning
model components. It makes use of the SU, C, and K components to derive the values
of SI, HTM, HTE, and HI, which, in turn, will be used to update the context of future conversational turns (Figure 2). Given that our current assumptions include a noise-free communication channel, HTM = SU. Furthermore, SI = HI, since it is difficult
to determine whether the speaker intended all the implicatures SU conveyed at the
time of SU’s analysis. These values are revised if future conversational turns cancel
some of the implicatures derived during SU’s analysis.
When the analysis of a conversation begins, the context C is empty, the set of axioms described above forms the knowledge component (K), and the semantic representation of the speaker utterance makes up the model's SU component. Furthermore, the
first order clauses of the C and K model components form the usable list and the logical clauses of SU serve as the initial SoS. As the reasoning process unfolds, all SU predicates and the inferences they produce (entailments as well as implicatures) are moved
to the usable list, where they become part of the context C of future conversational
turns.
The reasoning process terminates when no more inferences can be made from
SU (i. e., SoS is empty). In this situation, the clauses inferred from a given utterance using non-default axioms and/or context clauses that were explicitly stated or entailed
by previously analyzed utterances (i. e., previous SU/HTM and HTE components) are marked as logical entailments, or implicitures (HTE). All clauses inferred from default
axioms as well as abductive rules and/or context clauses that were previously labeled
as implicatures (i. e., previous SI/HI components) are marked as the implicatures
conveyed by the current SU (SI,HI). This is the most expected outcome of an analysis
of a conversational turn.
However, if a refutation is found during the reasoning process, the clauses that
caused the inconsistency may indicate that (1) the speaker made a false statement
(the contradiction stems from SU clauses alone), which carries certain Quality-flouting
implicatures or (2) a previously identified implicature must be canceled (information
from current SU contradicts previous SI/HI), in which case, the implicature clauses are
removed and the reasoning process is restarted.
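The control flow of a turn's analysis, including implicature cancellation, can be summarized in the following sketch; `reason` and its result object stand in for the actual prover and are entirely hypothetical.

def analyze_turn(su: set, context: set, prior_implicatures: set,
                 axioms: set, reason):
    """One conversational turn. `reason(usable, sos)` is a placeholder for the
    prover; its result is assumed to expose the derived clauses, whether a
    refutation occurred, and which clauses caused it."""
    htm = set(su)                                  # noise-free channel: HTM = SU
    while True:
        result = reason(usable=context | axioms, sos=set(su))
        if result.refutation and result.culprits & prior_implicatures:
            # Case (2): cancel previously derived implicatures and restart.
            context -= result.culprits
            prior_implicatures -= result.culprits
            continue
        break
    hte = result.entailments    # via non-default axioms or stated/entailed context
    hi = result.implicatures    # via default/abductive axioms; SI = HI
    context |= htm | hte | hi   # the context C grows for future turns
    prior_implicatures |= hi
    return htm, hte, hi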
Natural language axioms
The axioms used during the reasoning process capture the world knowledge
people use to derive the information implicitly conveyed by natural language texts.
Various types of axioms are needed to encode this vast pool of information.
eXtended WordNet lexical chain axioms
The lexical chain axioms link WordNet concepts by exploiting the semantic relationships present in the eXtended WordNet (XWN, http://xwn.hlt.utdallas.edu/index.html). In addition to WordNet's semantic relations between synsets, this valuable semantic resource stores a semantically rich logical representation of WordNet's plain-text glosses, a rich source of world knowledge made available to automated systems by our NLP tools. Using the structured representation of WordNet's glosses provided by XWN, new semantic links are identified between WordNet's synsets. Examples include:
praise_VB_1(e1) & agent_SR(x1,e1) & theme_SR(x2,e1) → express_VB_1(e2) & isa_SR(e1,e2) & approval_NN_1(x3) & theme_SR(x3,e2) & agent_SR(x1,e2) & theme_SR(x2,x3)
— WordNet gloss for praise: “express approval of”
phone_call_NN_1(x1) → telephone_NN_1(x2) & isa_SR(x1,x2)
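Applied naively (ground clauses only, no variable unification), such an axiom extends a logical form by forward chaining; a toy sketch under those simplifications:

# phone_call_NN_1(x1) -> telephone_NN_1(x2) & isa_SR(x1,x2)
AXIOMS = {
    "phone_call_NN_1(x1)": {"telephone_NN_1(x2)", "isa_SR(x1,x2)"},
}

def extend(logical_form: set) -> set:
    """Forward-chain over ground axioms until no new predicates are added."""
    out = set(logical_form)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in AXIOMS.items():
            if lhs in out and not rhs <= out:
                out |= rhs
                changed = True
    return out

print(sorted(extend({"phone_call_NN_1(x1)"})))
# ['isa_SR(x1,x2)', 'phone_call_NN_1(x1)', 'telephone_NN_1(x2)']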
Semantic calculus axioms
We introduced the notion of Semantic Calculus to describe the various combinations of two semantic relations. The Semantic Calculus axioms describe the semantic relation R0 that holds between two concepts c1 and c2 linked by two semantic relations R1 and R2 (not necessarily distinct) that share a third concept c3 as a common argument. More formally, R1(c1,c3) & R2(c3,c2) → R0(c1,c2) [12],[13].
We note that not every pair of semantic relations can be combined. Furthermore, the accuracy of each Semantic Calculus axiom was measured as the correctness with which it predicted the semantic relation linking two concepts that are also connected by the semantic relations being combined (for each linguistic context where R1(c1,c3) and R2(c3,c2) hold, for any c1, c2, and c3, consider whether R0(c1,c2) also holds).
These axioms greatly increase the semantic connectivity between concepts. This
is particularly important when no immediate semantic link can be found between two
concepts of interest. Examples include:
isa_SR(x1,x2) & isa_SR(x2,x3) → isa_SR(x1,x3)
isa_SR(x1,x2) & location_SR(x2,x3) :→ location_SR(x1,x3)
quantity_SR(x1,x2) & instrument_SR(x3,x1) :→ quantity_SR(x1,x3)
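The composition step itself is mechanical once the (R1, R2) → R0 table is known; a sketch over relation triples, using two of the rules above:

# (R1, R2) -> R0 composition table
COMPOSITIONS = {
    ("isa_SR", "isa_SR"): "isa_SR",
    ("isa_SR", "location_SR"): "location_SR",
}

def compose(relations: set) -> set:
    """Derive R0(c1,c2) from R1(c1,c3) & R2(c3,c2); triples are (name, a, b)."""
    derived = set()
    for (r1, a, b1) in relations:
        for (r2, b2, c) in relations:
            if b1 == b2 and (r1, r2) in COMPOSITIONS:
                derived.add((COMPOSITIONS[(r1, r2)], a, c))
    return derived

rels = {("isa_SR", "dalmatian", "dog"), ("isa_SR", "dog", "animal")}
print(compose(rels))  # {('isa_SR', 'dalmatian', 'animal')}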
Common sense axioms
This type of axiom encodes the common sense knowledge required by an automated system to derive unstated implications. These axioms describe not only various
properties of concepts, but also how the concepts interact in the world and how people
speak about them. Most of the semantic rules belonging to this class are default axioms. They also have an abductive nature, producing best-guess outcomes. Fully
automatic methods for deriving these semantic axioms are yet to be proposed. Examples of world knowledge axioms include:
topic_SR(x1,x2) & purpose_SR(e1,x2) & communicate_VB_1(e1) → topic_SR(x1,e1)
— this axiom states that the topic of something used for communication becomes the topic of the communication
express_VB_1(e1) & approval_NN_1(x1) & theme_SR(x1,e1) & person_NN_1(x2) & theme_SR(x2,x1) & agent_SR(x2,e2) :→ theme_SR(e2,x1)
— this axiom states that if one expresses approval of a person who is the agent of an action, then one expresses approval of that action
Furthermore, valuable world knowledge can be identified by exploring the semantic relations identified in textual documents. Unlike the semantic relations stored
within machine readable dictionaries, such as WordNet, these relationships link concepts mentioned within a document’s content that belong to all part-of-speech classes.
These semantic relations are contextual, locally true, and may not seem useful for
a reasoning engine that operates outside of their context. However, most of the concepts they connect are defined in WordNet’s hierarchies and, thus, they can be abstracted to more general concepts. Using this process of generalization, we can axiomatically describe the selectional restrictions of certain concepts by noting regularities
in the types of arguments that they pair with.
For instance, the axiom shown below can be derived based on the high frequency
of agent semantic relations that link the verb communicate and various hyponyms
of person — we include here all _human named entities, most of which are not included in WordNet:
communicate_VB_1(e1) :→ _human_NE(x1) & agent_SR(x1,e1)
— this axiom states that people are usually the agents of communicating
We note that these semantic rules are default axioms. Each is associated with a weight that is proportional to the frequency of the defining semantic relation within a large corpus.
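Under that scheme, the weight of a default axiom such as the one above can be estimated as a relative frequency over corpus observations; a sketch with a hypothetical input format:

def default_axiom_weight(agent_pairs: list,
                         verb: str = "communicate_VB_1",
                         agent_class: str = "_human_NE") -> float:
    """Estimate the axiom's weight as the fraction of corpus occurrences of
    `verb` whose agent belongs to `agent_class`. Input: (verb, agent_class)
    pairs assumed to be extracted from agent_SR relations in a corpus."""
    occurrences = [a for v, a in agent_pairs if v == verb]
    if not occurrences:
        return 0.0
    return sum(a == agent_class for a in occurrences) / len(occurrences)

# e.g. 0.9 if 90% of the observed agents of 'communicate' are _human_NE entities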
Grice Maxims axioms
The implicit assumptions and default inferences that capture our intuitions
about a normal interpretation of a communication have been analyzed and formalized by language philosophers as principles of rational human communication behavior. For instance, Grice’s Cooperative Principle explains the link between utterances
and what is understood from them. Grice’s Maxims [14],[15], which are derived from
this principle, can be leveraged by an automated system aiming to derive the implicatures conveyed by natural language utterances.
The set of macro-axioms encoding the essence of Grice's Maxims makes use of the
components of the model we defined above. Before being used by an automated system for the analysis of a particular utterance, each Gricean maxim axiom must be instantiated with the values of the various components of the existing conversational
model. Examples include:
_relevance_GM → (predicatei(xj) ∈ LF(SU) & predicatei(xk) ∈ LF(C) :→ xj = xk)
— this axiom states that there must be at least one common predicate between the (enhanced) logical forms of SU (speaker utterance) and C (context), given that the speaker's utterance must be relevant to the established common ground. In the case of floutings of the Relevance Maxim, this axiom can be used to assume the unification of two identically named predicates from SU and C
_quality_GM & SU(x1) & _question(x1) → S(x2) & believe_VB_1(e1) & experiencer_SR(x2,e1) & H(x3) & know_VB_1(e2) & experiencer_SR(x3,e2) & theme_SR(e2,e1) & theme_SR(x1,e2)
— this axiom states that, for all speaker utterances that are questions, S (the speaker) believes that H (the hearer) knows the answer to the question
_manner_GM → ((_capitalized_chars(x1) & SU(x1)) :→ (S(x2) & excited_JJ_1(x3) & value_SR(x3,x2)))
— this axiom states that capitalized text within the speaker's utterance indicates the excitement of the speaker
_quantity_GM & SU:(P(xi) → P(xi)) → predicate0(x0) ∈ LF(C) & (property_SR(x0,xi) | value_SR(x0,xi)) & _always_TMP(x0) & -(predicatex(xx) → -(property_SR(x0,xi) | value_SR(x0,xi)))
— this axiom states that if the speaker's utterance is of the form “X is X”, then one of X's (essential) properties that is relevant to the existing common ground always holds of X and nothing can change that
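As one concrete illustration of how such maxim axioms could operate, the Relevance axiom's unification effect can be mimicked with a toy matcher; this handles unary predicates only, the real axioms run inside the prover, and all names below are ours.

import re

PRED = re.compile(r"(\w+)\(([^)]*)\)")

def relevance_unifications(su: set, context: set) -> list:
    """Propose xj = xk for every predicate name occurring in both the logical
    form of SU and the context C (toy version: unary predicates, no checks)."""
    def parse(clauses):
        return {(m[1], m[2]) for c in clauses for m in PRED.finditer(c)}
    return [(a1, a2)
            for (n1, a1) in parse(su) for (n2, a2) in parse(context)
            if n1 == n2 and a1 != a2]

print(relevance_unifications({"soup_NN_1(x15)"}, {"soup_NN_1(x7)"}))
# -> [('x15', 'x7')]: assume the two soups denote the same entity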
References
1. Elliot Glaysher and Dan Moldovan. 2006. Speeding Up Full Syntactic Parsing by Leveraging Partial Parsing Decisions. In Proceedings of COLING/ACL 2006, Sydney, Australia, July 2006.
2. Adriana Badulescu and Munirathnam Srikanth. 2007. LCC-SRN: LCC's SRN System for SemEval 2007 Task 4. In Proceedings of SemEval-2007, Prague, Czech Republic, June 23–24, 2007.
3. Congmin Min, Munirathnam Srikanth, and Abraham Fowler. 2007. LCC-TE: A Hybrid Approach to Temporal Relation Identification in News Text. In Proceedings of SemEval-2007, Prague, Czech Republic, June 23–24, 2007.
4. Adrian Novischi, Munirathnam Srikanth, and Andrew Bennett. 2007. LCC-WSD: System Description for English Coarse Grained All Words Task at SemEval 2007. In Proceedings of SemEval-2007, Prague, Czech Republic, June 23–24, 2007.
5. Marta Tatu and Munirathnam Srikanth. 2008. Experiments with Reasoning for Temporal Relations between Events. In Proceedings of COLING 2008, Manchester, UK, 18–22 August 2008.
6. Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi and Dan Moldovan. 2006. COGEX at the Second Recognizing Textual Entailment Challenge. In Proceedings of the Second PASCAL Challenges Workshop on Recognizing Textual Entailment, Venice, Italy, May 2006.
7. Marta Tatu and Dan Moldovan. 2006. A Logic-based Semantic Approach to Recognizing Textual Entailment. In Proceedings of COLING/ACL 2006, Sydney, Australia, July 2006.
8. Marta Tatu and Dan Moldovan. 2007. COGEX at RTE 3. In Proceedings of the ACL 2007 Workshop on Textual Entailment and Paraphrasing, Prague, Czech Republic, June 2007.
9. Dan Moldovan and Eduardo Blanco. 2012. Polaris: Lymba's Semantic Parser. In Proceedings of LREC 2012, Istanbul, Turkey, May 2012.
10. Dan Moldovan and Adriana Badulescu. 2005. A Semantic Scattering Model for the Automatic Interpretation of Genitives. In Proceedings of HLT-EMNLP 2005, Vancouver, British Columbia, Canada, October 2005.
11. Dan Moldovan and Vasile Rus. 2001. Logic Form Transformation of WordNet and its Applicability to Question Answering. In Proceedings of ACL 2001, Toulouse, France, July 2001.
12. Eduardo Blanco and Dan Moldovan. 2011. Unsupervised Learning of Semantic Relation Composition. In Proceedings of ACL-HLT 2011, Portland, OR, USA.
13. Eduardo Blanco and Dan Moldovan. 2011. A Model for Composing Semantic Relations. In Proceedings of IWCS 2011, Oxford, UK.
14. Grice H. P. 1969. Utterer's Meaning and Intentions. Philosophical Review 78: 147–177. [Reprinted in [15]]
15. Grice H. P. 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
16. Dan Moldovan, Marta Tatu, and Christine Clark. 2010. Role of Semantics in Question Answering. In Semantic Computing, pages 373–419. John Wiley & Sons, Inc.