A Conceptual Parser for Natural Language.

A CONCEPTUAL PARSER FOR NATURAL LANGUAGE*
Roger C. Schank and Lawrence G. T e s l e r
Computer Science Department
Stanford University
Stanford, California
This paper describes an operable automatic parser
f o r n a t u r a l language.
I t i s a conceptual p a r s e r ,
concerned w i t h d e t e r m i n i n g the u n d e r l y i n g meaning
of the i n p u t u t i l i z i n g a network of concepts exp l i c a t i n g the b e l i e f s i n h e r e n t i n a piece o f d i s course.
1.
system is capable of making sense of a piece of
language c o n t a i n i n g o n l y a few words t h a t it knows
since i t s conceptual framework i s capable o f making
predictions.
Thus, i t can understand w h i l e u s i n g
o n l y a few r e a l i z a t i o n r u l e s , whereas it would need
a g r e a t many more to map the same s t r u c t u r e back
i n t o language.
This phenomenon i s s i m i l a r t o t h a t
observed in a man a t t e m p t i n g to l e a r n a f o r e i g n
language.
Introduction
The parser described in t h i s paper is a conceptual parser.
I t s p r i m a r y concern i s t o e x p l i cate the u n d e r l y i n g meaning and conceptual r e l a t i o n s h i p s present in a piece of discourse in any
n a t u r a l language.
I t s output is a language-free
network c o n s i s t i n g of unambiguous concepts and
t h e i r r e l a t i o n s t o o t h e r concepts.
Pieces o f d i s course w i t h i d e n t i c a l meanings, whether i n the
same or d i f f e r e n t languages, parse i n t o the same
conceptual network.
The parser is not a s y n t a c t i c parser in t h a t
i t s o u t p u t i s n o t concerned w i t h t h e syntax o f the
i n p u t language.
I t bears some s i m i l a r i t y t o c e r t a i n
deep s t r u c t u r e p a r s e r s 4 , 7 , 1 2 , 1 3 o n l y i n s o f a r a s a l l
these p a r s e r s are concerned to an e x t e n t w i t h the
meaning of the piece of discourse being operated
upon.
However, the conceptual parser i s not l i m i t e d
by the problems i n h e r e n t in t r a n s f o r m a t i o n a l grammar (such a s t h e d i f f i c u l t y i n r e v e r s i n g t r a n s f o r m a t i o n a l r u l e s and the n o t i o n t h a t semantics i s
something t h a t 'operates' on s y n t a c t i c o u t p u t ) .
A l s o , the parser does not have as a g o a l the t e s t i n g of a p r e v i o u s l y f o r m u l a t e d grammar7,13 so t h a t
the u n d e r l y i n g t h e o r y has been able to be changed
as was warranted by o b s t a c l e s t h a t we encountered.
The i n t e n t i o n o f t h i s work i s t o handle n a t u r a l
language u t i l i z i n g a semantics-based system, and
thus our paper bears some s i m i l a r i t y to the work of
Quillian".
However, the conceptual dependency
framework, though semantics-based, i s intended t o
f u n c t i o n as a more complete l i n g u i s t i c system. Thus,
a grammar of a language is employed.
The grammar of the system is b i p a r t i t e .
The
f i r s t p a r t i s a u n i v e r s a l grammar e x e m p l i f i e d b y the
conceptual r u l e s employed by t h e system.
The second
p a r t is l a n g u a g e - s p e c i f i c and is made up of r e a l i z a t i o n r u l e s i n t e n d e d t o map pieces o f the conceptual
network i n t o l i n g u i s t i c i t e m s .
The r e a l i z a t i o n
r u l e s may be used f o r both p a r s i n g and g e n e r a t i n g .
However, i t i s not necessary t o use a l l the r e a l i z a t i o n r u l e s i n order t o p a r s e .
That i s , the
* T h i s research is supported by Grant PHS MH 06645-0?
from t h e N a t i o n a l I n s t i t u t e o f Mental H e a l t h , and
( i n p a r t ) by t h e Advanced Research P r o j e c t s Agency
o f the O f f i c e o f the S e c r e t a r y o f Defense ( S D - I 8 3 ) .
The network produced by the parser contains
conceptualizations.
A c o n c e p t u a l i z a t i o n is a
statement about a s i n g l e conceptual s u b j e c t .
The
s u b j e c t may be a b s t r a c t or c o n c r e t e ; it may be one
t h i n g or a combination of t h i n g s .
The statement
may t e l l what the s u b j e c t i s , what i t does, what i s
done t o i t , e t c .
Furthermore, the e n t i r e concept u a l i z a t i o n may be q u a l i f i e d as to time and place
of occurrence, reasons, causes, consequences, and
explanations.
The numerous c o n c e p t u a l i z a t i o n s in a
discourse are i n t e r r e l a t e d not o n l y b y c a s u a l , l o g i c a l , s p a t i a l and temporal o r d e r i n g , but a l s o b y
anaphoric r e f e r e n c e s and by m u l t i p l e mention of the
same concepts.
Several c o n c e p t u a l i z a t i o n s may occur in a
s i n g l e sentence, and words from s e v e r a l sentences
may be r e q u i r e d to complete one c o n c e p t u a l i z a t i o n .
Furthermore, i n f o r m a t i o n from some c o n c e p t u a l i z a t i o n s in a discourse may serve to disambiguate the
i n t e r p r e t a t i o n of other conceptualizations.
Cons e q u e n t l y , the parser i s not i n h e r e n t l y sentencebound.
2.
Domain and C a p a b i l i t i e s
The p a r s e r is being used to understand n a t u r a l
language statements i n C o l b y ' s o n - l i n e dialogue
program f o r p s y c h i a t r i c i n t e r v i e w i n g , but i s not
r e s t r i c t e d to t h i s context.
In interviewing programs l i k e C o l b y ' s , as w e l l as in question-answeri n g programs, a d i s c o u r s e - g e n e r a t i n g a l g o r i t h m
must be i n c o r p o r a t e d to reverse t h e f u n c t i o n of the
parser.
The conceptual parser is based on a l i n g u i s t i c t h e o r y t h a t uses the same r u l e s f o r b o t h
p a r s i n g and g e n e r a t i n g , thus f a c i l i t a t i n g manmachine d i a l o g u e s .
In an i n t e r v i e w i n g program, the i n p u t may cont a i n words t h a t the program has never encountered,
o r which i t has encountered o n l y i n d i f f e r e n t e n v i ronments.
The i n p u t may d e a l w i t h a conceptual
s t r u c t u r e t h a t i s o u t s i d e the range o f experience
of the program, or even use a s y n t a c t i c combination
t h a t is unknown.
The program is designed to l e a r n
new words and word-senses, new semantic p o s s i b i l i t i e s , and new r u l e s of syntax both by encountering
new examples d u r i n g the dialogue and by r e c e i v i n g
explicit instruction.
,569-
3•
the dependent concept cannot make sense without
i t s governor.
U n d e r l y i n g Theory
The p a r s e r is based on t h e Conceptual Dependency model of language9, in t h i s model, a conc e p t u a l i z a t i o n i s a network o f l i n g u i s t i c conc e p t s 1 0 t h a t f a l l i n t o the f o l l o w i n g conceptual
categories;
Governing c a t e g o r i e s
PP
An a c t o r or o b j e c t ; corresponds r o u g h l y
to a s y n t a c t i c noun or pronoun.
ACT
I n E n g l i s h , corresponds s y n t a c t i c a l l y t o
v e r b s , v e r b a l nouns ( e . g . , gerunds) and
c e r t a i n a b s t r a c t nouns.
LOC
A noun d e n o t i n g t h e l o c a t i o n of a conceptualization.
T
Denotes t h e t i m e of a c o n c e p t u a l i z a t i o n .
Assisting categories
PA
P P - a s s i s t e r ; corresponds r o u g h l y to an
adjective.
AA
A C T - a s s i s t e r ; an adverb.
Most words or i d i o m a t i c phrases in a piece of
d i s c o u r s e r e p r e s e n t one or p o s s i b l y s e v e r a l concepts i n t h e n e t w o r k , and t h e r e f o r e f a l l i n t o the
c a t e g o r i e s above.
However, connectives such as c o n j u n c t i o n s ,
p r e p o s i t i o n s , p u n c t u a t o r s , a u x i l i a r i e s , and d e t e r miners are i n t e n d e d l i n g u i s t i c a l l y t o r e p r e s e n t
t h e s t r u c t u r e r a t h e r t h a n the content o f t h e n e t work.
They serve in a p a r s e r to disambiguate t h e
parse.
In a g e n e r a t o r , they are generated so t h a t
the hearer w i l l have clues t o disambiguate the
discourse.
He t h e r e b y can understand what meaning
was i n t e n d e d by t h e speaker.
The Conceptual Dependency model is s t r a t i d e a t i o n a l i n s o f a r as it i n v o l v e s a mapping from one
l e v e l t o another ( e . g . , see Lamb6). I t s h i g h e s t
l e v e l i s a n i n t e r l i n g u a c o n s i s t i n g o f a network o f
l a n g u a g e - f r e e dependent concepts.
(The dependency
c o n s i d e r e d here i s p a r t i a l l y d e r i v e d from the not i o n s of Hays3 and K l e i n 5 , however t h e dependenc i e s are n o t a t a l l r e s t r i c t e d t o any s y n t a c t i c
c r i t e r i o n . ) The l i n g u i s t i c process can be thought
o f , in Conceptual Dependency terms, as a, mapping
i n t o and out of some mental r e p r e s e n t a t i o n .
This
m e n t a l r e p r e s e n t a t i o n c o n s i s t s o f concepts r e l a t e d
to each o t h e r by v a r i o u s meaning-contingent dependency l i n k s .
Each concept in the network may be
a s s o c i a t e d w i t h some word t h a t i s i t s r e a l i z a t e o n
a sentential level.
The conceptual c a t e g o r i e s and r e l a t i o n s h i p s
of t h e system are a r r i v e d at by s t u d y i n g a one-byone a n a l y s i s , i d e n t i c a l to t h e way in which a p e r son hears a sentence. According to the t h e o r y , as
each word i s i n p u t , i t s r e p r e s e n t a t i o n i s d e t e r mined and t h e n i s s t o r e d u n t i l i t can b e a t t a c h e d
t o t h e concept which d i r e c t l y governs i t .
The
r u l e - o f - t h u m b in r e p r e s e n t i n g concepts as dependent o n o t h e r concepts i s t o see i f t h e dependent
concept w i l l f u r t h e r e x p l a i n i t s governor and i f
-570-
For example, in the sentence, "The big man
steals the red book," the analysis is as follows:
'The' is input and stored for possible use in
connecting sentences in paragraph, i . e . , in t h i s
case, 'the' specifies that 'man' has been referred
to previously. Next, ' b i g ' is input. But ' b i g '
cannot stand alone conceptually, and it is stored
u n t i l i t s governor appears. 'Man' can stand alone
and is modified conceptually by ' b i g ' , so it is
stored as a governor with i t s dependent.
'Steals' denotes an action that is dependent
on the concept that is doing the acting. A conceptualization cannot be complete without a concept acting (or an a t t r i b u t e statement), so a twoway dependency l i n k may be said to exist between
'man' and ' s t e a l ' . That i s , they are dependent on
each other and govern each other. Every concept u a l i z a t i o n must have a two-way dependency l i n k .
Next, ' t h e ' is stored as before as is ' r e d ' .
When 'book' is input, 'red' is attached to it as
before, and the whole e n t i t y is stored as dependent
on ' s t e a l s ' as the object of the action (represented by a horizontal single arrow).
The entire structure is as follows:
The categories assigned to these concepts are as
follows:
This system, with more symbols and categories
added, has been made to effect a set of rules
which can account f o r a l l possible conceptualizat i o n s . The relations inherent in these conceptualizations are intended to provide a system for
representing the meaning of a sentence in any
language, in language-free terms. (Language
enters i n t o the conceptual representation only in
a naming capacity.) A l i s t of the allowable conceptual dependencies is presented in the Appendix.
In addition to dependencies, other relations are
allowed by the theory, e . g . , conjunctions, d i s junctions and comparatives.
This conceptual framework can be used to
write r e a l i z a t i o n rules f o r any language and rules
for English have been w r i t t e n 9 . The r e a l i z a t i o n
rules are responsible for placing the e n t i t i e s
realized from the conceptual l e v e l in the correct
grammatical order to form a sentence. These rules
are then the parsing rules for a particular language.
The system f o r analyzing a sentence i n t o i t s
conceptual representation works backwards through
the r e a l i z a t i o n rules of a language. In places
where either of two rules could apply, the system
can b u i l d upon both of them, aborting if one analysis leads to an impossible conceptual structure,
or producing multiple analyses for ambiguous sentences. A l l conceptualizations are checked
against a l i s t of experiences to see if that particular part of the construction has occurred before. If the construction has not occurred, or
has occurred only in some peculiar context, t h i s
is noted. Thus, in the construction 'ideas <=>
sleep 1 , it is discovered that t h i s connection has
never been made before, and is therefore meaningless to the system. If the user says that t h i s
construction is a l l r i g h t , it is added to the memory; otherwise the construction is looked up in a
metaphor l i s t or aborted. The system thus employs
a record of what it has heard before in order to
analyze what it is presently hearing.
In order for the system to choose between two
analyses of a sentence both of which are feasible
with respect to the conceptual rules (see Appendix)
a conceptual semantics is incorporated. The conceptual semantics is a data base which l i m i t s the
possible conceptual dependencies to statements consonant with the system's knowledge of the real
world. The d e f i n i t i o n of each concept is composed
of records organized by type of dependency and by
conceptual category of the dependent. For each
type of dependency, semantic categories (such as
animate object, human i n s t i t u t i o n , animal motion)
are delimited with respect to the conceptual category of a given concept, and defining characterist i c s are inserted when they are known. For
example, concepts in the semantic category 'physical object' a l l have the characteristic 'shape'.
Sometimes t h i s information is i n t r i n s i c to the
particular concept involved, for example, 'balls
are round'.
The semantic categories are organized into
hierarchical structures in which l i m i t a t i o n s on
any category are assumed to apply as well to a l l
categories subordinate to i t . The system of
semantic categories and a method of constructing
semantic f i l e s is discussed more f u l l y in a previous paper9.
Conceptual
If a permissible conceptual governor has been
found for a queued concept, the queued concept is
placed in the network with the appropriate dependency. A governor is not placed d i r e c t l y in the
network unless it s a t i s f i e s the conceptual semant i c s with respect to i t s dependencies.
Ambiguous interpretations of a piece of discourse are b u i l t up if more than one conceptual
rule may apply. If the conceptual semantics d i s allows an i n t e r p r e t a t i o n , the dependency is
aborted. If more than one network remains after
the conceptual semantics have been checked m u l t i ple analyses are b u i l t up. Semantic ambiguity
(that i s , multiple meanings for a word) is also
handled by the conceptual semantics. If a word
connects to more than one concept, each concept
is checked for the appropriate p o s s i b i l i t i e s of
dependence. The concept that f i t s according to
the semantics is the word-sense chosen. If more
than one concept f i t s , more than one network is
b u i l t up. The semantics are checked as each dependent is added, so it is possible to abort a
multiple network at a l a t e r poin in the parse.
As an example of the strategy employed in the
theory it is illuminating to follow the parse of
an example sentence. Consider:
'The t a l l boy went to the park with a g i r l 1 .
The machine t r i e s to simulate the behavior of a
human in perceiving t h i s sentence. Thus, it is
continually operating and making hypotheses as
each word enters the system.
When 'The' enters, it is held in waiting as
it may be a l i n k to a previous mention of the
next PP which w i l l enter the system. That i s ,
'the' would be replaced by ' a ' in an ordinary
dialogue if i t s specific referent were previously
unmentioned or unknown.
' T a l l ' is marked as a PA. It is therefore
queued u n t i l a rule that uses PA as a dependent
can apply. Since 'boy' is a PP, it is placed
d i r e c t l y in the network. Previous networks generated by the user are searched for an instance
of 'boy' to which the ' t h e ' refers and a l i n k is
made if one is found. The conceptual rule 'PP <PA' is keyed by the r e a l i z a t i o n rule 'PA PP: 2;
In the present system, the f i l e s are constructed by incorporating information derived
from rules presented as English sentences. The
program parses each of these sentences, observes
which dependencies are new, and then adds them to
the f i l e s .
4.
ceptual rule (see Appendix) and where the dependency effected by the use of that rule is allowed
by the conceptual semantics. 11 (An example of the
conceptual semantics is given in the Appendix.)
Analysis
In the Conceptual Dependency theory on which
our program is based, the parsing procedure begins
by looking up the conceptual category of a word.
If the conceptual category is either PP or ACT,
the concept evoked by the word being considered is
placed d i r e c t l y i n t o the conceptual network.
Otherwise the concept is queued u n t i l a permissible
conceptual governor enters the system. Preposit i o n s , conjunctions, and determiners are s i m i l a r l y
queued. A permissible conceptual governor is one
whose category is on the l e f t hand side of a con-571-
the numbers are place markers representing r e l a t i v e position in the piece of discourse. The PA
is below the l i n e to indicate that t h i s dependent
is an a t t r i b u t e of i t s governor. The 'PP <- PA'
semantics for 'boy' are checked to see if the conjunction 'boy
can e x i s t . Since the system knows that any animal can have height the
connection is allowed and our network looks as
follows:
(The check w i t h t h e semantics is made at every
connection b u t from now o n w e w i l l o n l y mention i t
when i t i s n e c e s s a r y . )
When ' w e n t ' i s operated o n i t i s transformed
i n t o * g o - p ' (p means p a s t ) and since 'go' is an
ACT t h e r e a l i z a t i o n r u l e t h a t a p p l i e s w i l l connect
it to a p r e v i o u s PP by a two-way l i n k .
The ' p '
m o d i f i e s t h e two-way l i n k and i s moved over i t .
We now have t h e f o l l o w i n g :
' T o ' , t h e next word i n t o the system, i s
queued since it may be a p r e p o s i t i o n or p a r t of an
infinitive.
I f t h e next governor encountered i s
n o t a n ACT, ' t o ' i s a p r e p o s i t i o n and i s t r a n s l a t e d i n t o a 'to' l i n k .
4=
cy.
When t h e l i n k i s w r i t t e n h o r i z o n t a l l y , i t
r e p r e s e n t s a dependency between an ACT and a PP
where t h e dependent is n o t t h e o b j e c t of a d i r e c t
action.
V e r t i c a l p r e p o s i t i o n a l dependency
s p e c i f i e s a d d i t i o n a l i n f o r m a t i o n about a concept
t h a t i s o n l y i n d i r e c t l y a n a t t r i b u t e o f t h a t concept ( e . g . , a l o c a t i o n r a t h e r than a p h y s i c a l
attribute).
P r e p o s i t i o n a l l i n k s may have many
d i f f e r e n t f o r m s , each r e p r e s e n t e d b y a t a g ( e . g . ,
" t o " , " o f " ) w r i t t e n over the l i n k .
Prepositional
dependency i s d i f f e r e n t from simple dependency i n
c e r t a i n s p e c i f i e d ways, one b e i n g t h a t s t r i n g s o f
p r e p o s i t i o n a l l y dependent PP's may e x i s t whereas
t h i s cannot e x i s t w i t h simple dependency.
When ' t h e ' i s encountered i t i s t r e a t e d a s
before.
' P a r k ' is marked as b o t h a LOC and a
However, t h e system w i l l demand the PP
PP.
LOC
i n t e r p r e t a t i o n since ' t o LOC' i s not a l l o w e d ; b y
d e f i n i t i o n , LOC's can o n l y modify two-way l i n k s .
' P a r k ' i s then p l a c e d i n the network g i v i n g :
'With' is held in waiting as a
l i n k unt i l the P P t o which i t connects i s encountered.
' A ' i s i g n o r e d and the c o n s t r u c t i o n ' w i t h
g i r l ' i s made. We are are now faced w i t h the p r o b lem o f where t o a t t a c h t h i s c o n s t r u c t .
A problem
e x i s t s s i n c e a t l e a s t two r e a l i z a t i o n r u l e s may
apply:
The problem is r e s o l v e d by t h e conceptual semantics.
The semantics f o r ' g o ' c o n t a i n s a l i s t o f concept u a l prepositions.
Under ' w i t h ' i s l i s t e d ' a n y
movable p h y s i c a l o b j e c t ' and since a g i r l is a
p h y s i c a l o b j e c t the dependency is a l l o w e d .
The
semantics f o r ' p a r k ' are a l s o checked.
Under
' w i t h ' f o r ' p a r k ' are l i s t e d the v a r i o u s items
t h a t parks are known t o c o n t a i n , e . g . , s t a t u e s ,
j u n g l e gyms, e t c .
' G i r l ' i s not found s o t h e n e t work ( l ) i s allowed w h i l e (2) i s a b o r t e d .
Although ' *=
g i r l ' is dependent on 'go' it
i s dependent t h r o u g h ' p a r k ' .
That i s , these are
not i s o l a t e d dependencies since we would want to
be able to answer t h e q u e s t i o n ' D i d the g i r l go
to the park?' a f f i r m a t i v e l y .
I n (2) the b e l o w - t h e l i n e n o t a t i o n i n d i c a t e s t h a t i t i s the ' p a r k w i t h
The conceptual semantics f u n c t i o n s as an e x p e r i ence f i l e i n t h a t i t l i m i t s c o n c e p t u a l i z a t i o n t o
ones consonant w i t h the system's past experience.
Since i t has never encountered ' p a r k s w i t h g i r l s '
i t w i l l assume t h a t t h i s i s not the meaning i n tended.
It is possible, as it is in an ordinary
c o n v e r s a t i o n , f o r the user t o c o r r e c t the system
i f a n e r r o r was made.
That i s , i f (2) were the
i n t e n d e d network i t might become apparent t o the
user t h a t t h e system had misunderstood and a
c o r r e c t i o n c o u l d e a s i l y be made.
The system would
then l e a r n the new p e r m i s s i b l e c o n s t r u c t and would
add it to i t s semantics.
The system can always
l e a r n from t h e u s e r 1 0 and i n f a c t the semantics
were o r i g i n a l l y i n p u t i n t h i s way, b y n o t i c i n g
occurrences in sample s e n t e n c e s . *
Thus, t h e system p u r p o r t s to be a n a l y z i n g a
sentence in a way analogous to the human method.
I t handles i n p u t one word a t a time a s i t i s enc o u n t e r e d , checks p o t e n t i a l l i n k i n g s w i t h i t s own
knowledge of t h e w o r l d and past e x p e r i e n c e , and
places i t s o u t p u t i n t o a l a n g u a g e - f r e e f o r m u l a t i o n
t h a t can be operated o n , r e a l i z e d in a paraphrase,
or translated.
5.
The p a r s e r i s p r e s e n t l y o p e r a t i n g i n a l i m i t e d
form.
I t i s coded i n MLISP f o r t h e FDP-10 and can
be adapted to o t h e r LISP processors w i t h minor
revisions.
The a l g o r i t h m used d i f f e r s from t h e
t h e o r e t i c a l analysis given in the previous section
because a computer program must d e a l w i t h machine
l i m i t a t i o n s and must cope w i t h s p e c i a l cases t h a t
may be encountered in t h e i n p u t .
572-
Rather than b u i l d i n g up the network s t r u c t u r e
a l i t t l e b i t a t a time d u r i n g the p a r s e , the p r o gram determines a l l the dependencies present i n
the network and then assembles the e n t i r e network
at the end.
Thus, t h e sentence 'The b i g boy gives
apples t o the p i g . ' i s parsed i n t o :
a
The i n p u t sentence is processed word-by-word.
A f t e r " h e a r i n g " each word, the program attempts
to determine as much as it can about the sentence
before " l i s t e n i n g " f o r more. To t h i s end, the
network is b u i l t up a l i t t l e at a time as each
word i s processed. Furthermore, the program a n t i c i p a t e s what k i n d s of concepts and s t r u c t u r e s may
b e expected l a t e r i n the sentence.
I f what i t
hears does not conform w i t h i t s a n t i c i p a t i o n , i t
may be " c o n f u s e d " , " s u r p r i s e d " , or even "amused".
I n case o f semantic o r s y n t a c t i c a m b i g u i t y ,
the program should determine which of s e v e r a l
p o s s i b l e i n t e r p r e t a t i o n s was intended by the
"speaker".
I t f i r s t s e l e c t s one i n t e r p r e t a t i o n
by means of miscellaneous h e u r i s t i c s and stacks
the r e s t .
I n case l a t e r t e s t s and f u r t h e r i n p u t
r e f u t e o r cast doubt upon the i n i t i a l guess, t h a t
guess is discarded or s h e l v e d , and a d i f f e r e n t i n t e r p r e t a t i o n is removed from the stack to be p r o cessed.
To process an i n t e r p r e t a t i o n , it may be
necessary to back up the scan to an e a r l i e r p o i n t
in the sentence and rescan s e v e r a l words. To
a v o i d r e p e t i t i o u s work d u r i n g rescans, any i n f o r mation l e a r n e d about t h e words of the sentence is
kept in core memory.
The parse i n v o l v e s f i v e s t e p s :
the d i c t i o n a r y l o o k u p , the a p p l i c a t i o n o f r e a l i z a t i o n r u l e s ,
t h e e l i m i n a t i o n o f i d i o m s , the r e w r i t i n g o f abs t r a c t s , a n d t h e check a g a i n s t the conceptual semantics.
The d i c t i o n a r y of words is kept m o s t l y on the
d i s k , b u t the most f r e q u e n t l y encountered words
remain in core memory to minimize processing t i m e .
Under each word are l i s t e d a l l i t s senses.
"Senses" are d e f i n e d p r a g m a t i c a l l y as i n t e r p r e t a t i o n s o f the word t h a t can l e a d t o d i f f e r e n t n e t work s t r u c t u r e s o r t h a t denote d i f f e r e n t concepts.
For example, some of t h e senses of " f l y " a r e :
f l y . - ( i n t r a n s i t i v e ACT):
does in an a i r p l a n e
I f t h e r e are s e v e r a l senses from which t o
choose, the program sees whether it was a n t i c i p a t i n g a concept or connective from some s p e c i f i c
category.
Recent c o n t e x t u a l usage of some sense
a l s o can serve to p r e f e r one i n t e r p r e t a t i o n over
another. To choose among s e v e r a l senses w i t h
otherwise equal l i k e l i h o o d s , the sense w i t h lowest
s u b s c r i p t i s chosen f i r s t .
Thus, b y o r d e r i n g
senses i n the d i c t i o n a r y according t o t h e i r empiri c a l frequency o f occurrence, the system can t r y
t o improve i t s guessing a b i l i t y .
The r e a l i z a t i o n r u l e s t h a t apply to each word
sense are r e f e r e n c e d in the d i c t i o n a r y under each
sense.
Most o f these r u l e s f a l l i n t o categories
t h a t cover l a r g e conceptual classes and are r e f erenced by many concepts. Such c a t e g o r i e s are PP,
PA, AA, PPLOC, PPT> LOC, T, simply t r a n s i t i v e ACT,
i n t r a n s i t i v e ACT, ACT t h a t can take an e n t i r e
c o n c e p t u a l i z a t i o n as d i r e c t o b j e c t ( " s t a t e ACT"),
and ACT t h a t can take an i n d i r e c t o b j e c t w i t h o u t a
p r e p o s i t i o n ( " t r a n s p o r t ACT").
In contrast to
most concepts, each connective ( e . g . , an a u x i l i a r y ,
p r e p o s i t i o n , or d e t e r m i n e r ) tends to have i t s own
r u l e s o r t o share i t s r u l e s w i t h a few other words.
A r e a l i z a t i o n r u l e c o n s i s t s of two p a r t s :
a
r e c o g n i z e r and a dependency c h a r t .
The recognizer
determines whether the r u l e a p p l i e s and the
dependency c h a r t shows t h e dependencies t h a t e x i s t
when i t does.
I n the recognizer are s p e c i f i e d the
o r d e r i n g , c a t e g o r i e s , and i n f l e c t i o n o f the concepts and connectives t h a t n o r m a l l y would appear
i n a sentence i f the r u l e a p p l i e s .
I f c e r t a i n concepts o r connectives are o m i s s i b l e i n the i n p u t ,
the r u l e can s p e c i f y what to assume when they are
m i s s i n g . Agreement of i n f l e c t e d words can be
s p e c i f i e d i n a n absolute ( e . g . , " p l u r a l " ) o r a
r e l a t i v e manner ( e . g . , "same t e n s e " ) .
Rules f o r a
language l i k e E n g l i s h have a preponderance of word
order s p e c i f i c a t i o n s w h i l e r u l e s f o r a more h i g h l y
i n f l e c t e d language would have a preponderance of
inflection specifications.
R e a l i z a t i o n r u l e s are used b o t h t o f i t concepts i n t o the network as t h e y are encountered and
t o a n t i c i p a t e f u r t h e r concepts and t h e i r p o t e n t i a l
r e a l i z a t e s i n t h e networks.
When a r u l e i s s e l ected f o r t h e c u r r e n t word sense, i t i s compared
w i t h the r u l e s o f p r e c e d i n g word senses t o f i n d
one t h a t " f i t s " .
For example, i f " v e r y h o t " i s
h e a r d , one r e a l i z a t i o n r u l e f o r " v e r y " i s :
what a passenger
-573-
8.
Q u i l l i a n , R., "The Teachable Language Comprehender: A S i m u l a t i o n Program and Theory
of Language", B o l t Beranek and Newman, Camb r i d g e , Massachusetts, 1969*
9.
Schank, R., "A Conceptual Dependency Repres e n t a t i o n f o r a Computer-Oriented Semantics",
Ph.D. Thesis U n i v e r s i t y of Texas, A u s t i n 1969
(Also a v a i l a b l e as S t a n f o r d AI Memo 8 3 ,
Computer Science Department, S t a n f o r d U n i v e r s i t y , S t a n f o r d , C a l i f o r n i a , March 1969)-
10.
Schank, R., "A N o t i o n of L i n g u i s t i c Concept:
A Prelude to Mechanical T r a n s l a t i o n " , S t a n f o r d AI Memo 75.
Computer Science Department, S t a n f o r d U n i v e r s i t y , S t a n f o r d , C a l i f o r n i a , December 1969*
11.
Schank, R., " O u t l i n e of a Conceptual Semant i c s f o r Generation o f Coherent D i s c o u r s e " ,
TRACQR-68-U62-U, Tracor I n c . , A u s t i n , Texas
March 1968.
12.
Thorne, J . , B r a t l e y , P . , and Dewar, H . , "The
S y n t a c t i c A n a l y s i s o f E n g l i s h b y Machine"
i n Machine I n t e l l i g e n c e I I I , U n i v e r s i t y o f
E d i n b u r g h , 196b.
13*
Walker, D . ,
L . , "Recent
t i c Analysis
Mass., June
Chapin, P . , G e i s , M., and Gross,
Developments in t h e METRE SyntacP r o c e d u r e " , MITRE C o r p . , B e d f o r d ,
1966.
-578-