Local Pragmatics

Local Pragmatics
Jerry R. Hobbs and Paul Martin
Artificial Intelligence Center
SRI International
Abstract
The outline of a unified theory of local pragmatics phenomena is presented, including an approach to the problems
of reference resolution, metonymy, and interpreting nominal compounds, and the T A C I T U S system embodying this
theory is described. The theory and system are based on
the use of a theorem prover to draw the appropriate inferences from a large knowledge base of commonsense and
technical knowledge. Issues of control and minimality are
discussed.
1
The Problems
In the messages about breakdowns in machinery that are
being processed by the T A C I T U S system at SRI International, we find the following sentence:
We disengaged the compressor after the lube oil
alarm.
This sentence, like virtually every sentence in natural language discourse, confronts us w i t h difficult problems of interpretation. First, there are the reference problems; what
do "the compressor" and "the lube oil alarm" refer to.
Then there is the problem of interpreting the implicit relation between the two nouns "lube o i l " (considered as a
multiword) and "alarm" in the nominal compound "lube
oil alarm". There is also a metonymy that needs to be expanded. An alarm is a physical object, but "after" requires
events for its arguments. We need to coerce "the lube oil
alarm" into "the sounding of the lube oil alarm". There
is the syntactic ambiguity problem of whether to attach
the prepositional phrase "after the lube oil alarm" to "the
compressor" or to "disengaged". A l l of these problems we
have come to call problems in "local pragmatics". They
seem to be specifically linguistic problems, but the traditional linguistic methods in syntax and semantics have not
yielded solutions of any generality.
The difficulty, as is well-known, is that to solve these
problems we need to use a great deal of arbitrarily detailed
general commonsense and domain-specific technical knowledge. A theory of local pragmatics phenomena must therefore be r. theory about how knowledge is used. The aim of
our research has been to develop a unified theory of local
520
KNOWLEDGE REPRESENTATION
pragmatics, encompassing reference resolution, metonymy,
the interpretation of nominal compounds and other implicit and vague predicates, and the resolution of syntactic, lexical, and quantifier scope ambiguities, based on the
drawing of appropriate inferences from a large knowledge
base.
The T A C I T U S system is intended to embody that theory. Its specific aim is the interpretation of casualty reports
(casreps), which are messages in free-flowing text about
breakdowns in mechanical devices. More broadly, however, our aim is to develop general procedures, together
with the underlying theory, for using commonsense and
technical knowledge in the interpretation of written (and
spoken) discourse regardless of domain.
The T A C I T U S system has four principal components.
First, a syntactic front-end, the D I A L O G I C system, translates sentences of a text into a logical form in first-order
predicate calculus, described in Section 3. Second, we are
building a knowledge base, specifying large portions of potentially relevant knowledge encoded as predicate calculus
axioms (Hobbs et al., 1986). T h i r d , the T A C I T U S system makes use of the K A D S theorem prover, developed
by Mark Stickel (Stickel, 1982). Finally, there is a pragmatics module, including a local pragmatics component,
which uses the theorem prover to draw appropriate inferences from the knowledge base, thereby constructing an
interpretation of the text.
To interpret a sentence, the local pragmatics component determines from the logical form of the sentence what
interpretation problems need to be solved. Logical expressions are constructed for each local pragmatics phenomenon the sentence exhibits, and the proofs of these
expressions constitute the interpretation of the sentence.
Where there is more than one interpretation, it is because
there is more than one proof for the expressions. Interpretation is thus viewed as crucially involving deduction.
In Section 2, we describe the three of the phenomena we
are addressing—reference, metonymy, and nominal compounds. For each, we describe the expression that needs
to be proved, and for the last two, we describe how current standard techniques can be seen as special cases of
our general approach.
2
2.1
Local Pragmatics Phenomena
Reference
Entities are referred to in discourse in many guises—proper
nouns, definite, indefinite, and bare noun phrases of varying specificity, pronouns, and omitted or implicit arguments. Moreover, verbs, adverbs, and adjectives can refer to events, conditions, or situations. The problem in
all of these cases is to determine what is being referred
to. Here we confine ourselves to definite noun phrases,
although T A C I T U S handles the other cases in a manner
described briefly in Section 4.
In the sentence
for some entity z and some relation q. The predication
q(z, a) functions as a kind of buffer, or impedence match,
between the explicit predications w i t h their possibly inconsistent constraints. In many cases, of course, there is no
inconsistency. The argument satisfies the selectional constraints imposed by the predicate. In this case, z is a and
q is identity. This is the first possibility tried in the implemented system. Where this fails, however, the problem
is to find what z and q refer to, subject to the constraint,
imposed by the predicate after, that z is an event.
T A C I T U S thus modifies the logical form to
The alarm sounded.
the noun phrase "the alarm" is definite, and the hearer
is therefore expected to be able to identify a unique entity that the speaker intends to refer to. Restating this in
theorem-proving terminology, the natural language system
should be able to prove constructively the expression
It must find an x which is an alarm in the domain model.
If it succeeds, it has solved the reference problem. 1
2.2
Metonymy
Metonymy is very common in discourse; few sentences lack
examples. Certain functions frequently provide the required coercions. Wholes are used for parts, tokens for
types, and objects for their names. Nunberg (1978), however, has shown that there is no finite set of possible coercion functions. The relation between the explicit and
implicit referents can be virtually anything.
Every morpheme in a sentence conveys information that
corresponds to a predication, and every predicate imposes
selectional constraints on its arguments. Since entities in
the text are generally the arguments of more than one
predicate, there could well be inconsistent constraints imposed on them. To eliminate this inconsistency, we interpose, as a matter of course, another entity and another
relation between any two predications. Thus, when we
encounter in the logical form of a sentence
1
In this paper we ignore the problem of the uniqueness of the entity
referred to. A hint of our approach is this: If the search for a proof
is heuristically ordered by salience, then the entity found will be the
uniquely most salient.
Find an event z bearing some relation q to the alarm.
The most common current method for dealing with
metonymy, e.g., in the T E A M system (Grosz et al., 1985),
is to specify a small set of possible coercion functions, such
as narne-of. This method can be captured in the present
framework by treating q not as a predicate variable, but as
a predicate constant, and expressing the possible coercions
in axioms like the following:
That is, if x is the name of y, then y can be coerced to
x. This is the method we have implemented in our initial
version of the T A C I T U S system.
2.3
Nominal Compounds
To interpret a nominal compound, like "lube oil alarm",
it is necessay to discover the implicit relation between the
nouns. 2 Some relations occur quite frequently in nominal
compounds—part-of', location, purpose. Moreover, when
the head noun is relational, the modifier noun is often one
of the arguments of the relation. Levi (1978) argued that
these two cases encompassed virtually all nominal compounds. However, Downing (1977) and others have shown
that virtually any relation can occur. A lube oil alarm, for
example, is an alarm that sounds when the pressure of the
lube oil drops too low.
To discover the implicit relation, one must prove constructively from the knowledge base the existence of some
possible relation, which we may call nn, between the entities referred to by the nouns:
In our framework, we can implement the approach that
hypothesiszes a small set of possible relations, by taking
nn to be not a predicate variable but a predicate constant,
and encoding the possibilities in axioms like
2
Some nominal compounds can of course be treated as single lexical
items. This case is not interesting and is not considered here.
Hobbs and Martin
521
522
KNOWLEDGE REPRESENTATION
shorter coercions are favored over longer ones. These ideas
at least give us a start on the very difficult problem of
choosing the best interpretation.
Acknowledgements
The authors have profited from discussions with Mark
Stickel, Doug Edwards, Bill Croft, Fernando Pereira, Ray
Perrault, Stu Shieber, and Mabry Tyson about this work.
The research was funded by the Defense Advanced Research Projects Agency under Office of Naval Research
contract N00014-85-C-0013, as part of the Strategic Computing Program.
4
O t h e r Issues
Syntactic Ambiguity: Within this framework we
have also implemented a treatment of the most common
syntactic ambiguities. Attachment ambiguity problems
are translated into constrained reference problems by introducing an existentially quantified variable representing,
say, the subject of the prepositional phrase, constrained by
a disjunction listing the possible attachment sites.
Given and New: A problem that arises in the casreps
in a severe form, since determiners are often omitted, is
the problem of determining whether an entity referred to
is definite or indefinite, that is, given or new. In
Metal particles in oil sample and strainer.
"oil sample" is indefinite (or new) while "oil strainer" is
definite (or given). This problem occurs in general discourse as well, especially for nonnominal reference. Our
approach is to assert the existence of each entity, but with
a cost determined by the type of reference—in increasing
order, indefinite, nonnominal, bare, or definite. If this assertion (called a referential implicature) is required in the
proof that constitutes the interpretation, the entity is new;
if not, it is given.
M i n i m a l i t y : Axioms can be assigned a cost, depending
upon their salience. High salience, low cost axioms would
then be tried first. Short proofs are naturally tried before
long proofs. Thus, a cost depending on salience and length
is associated with each proof, and hence with each interpretation. Where, as usually happens, there is more than
one possible interpretation, the better interpretations are
supported by less expensive proofs. A second criterion for
good interpretations is to favor the minimal solution in the
sense that the fewest new entities and relations needed to
be hypothesized. For example, the argument-relation pattern in nominal compounds, as in "oil sample", is minimal
in that no new implicit relation need be hypothesized; the
one already given by the head noun will do. In metonymy,
the identity coercion is favored for the same reason, and
References
[1] Downing, Pamela, 1977. "On the Creation and Use of
English Compound Nouns", Language, vol. 53, no. 4,
pp. 810-842.
[2] Finin, Timothy, 1980. "The Semantic Interpretation of
Nominal Compounds", Report T-96, Coordinated Science Laboratory, University of Illinois, Urbana, Illinois,
June 1980.
[3] Crosz, Barbara J., Douglas E. Appelt, Paul Martin,
Fernando C. N. Pereira and Lorna Shinkle, 1985. "The
TEAM Natural-Language Interface System", Final Report, Project 4865, Artificial Intelligence Center, SRI
International, Menlo Park, California.
[4] Hobbs, Jerry R., 1983. "An Improper Treatment of
Quantification in Ordinary English", Proceedings, 21st
Annual Meeting of the Association for Computational
Linguistics, Cambridge, Massachusetts, pp. 57-63.
[5] Hobbs, Jerry R., William Croft, Todd Davies, Douglas Edwards, and Kenneth Laws, 1986. "Commonsense
Metaphysics and Lexical Semantics", Proceedings, 24th
Annual Meeting of the Association for Computational
Linguistics, New York, June 1986., pp. 231-240.
[6] Levi, Judith, 1978. The Syntax and Semantics of Complex Nominals, Academic Press, New York.
[7] Nunberg, Geoffery, 1978. "The Pragmatics of Reference", Ph. D. thesis, City University of New York, New
York.
[8] Stickel, Mark E., 1982. "A Nonclausal ConnectionGraph Theorem-Proving Program", Proceedings, AAAI82 National Conference on Artificial Intelligence, Pittsburgh, Pennsylvania, pp. 229-233.
4
This technique is due to Mark Stickel.
Hobbs and Martin
523