A CONCEPTUAL PARSER FOR NATURAL LANGUAGE*

Roger C. Schank and Lawrence G. Tesler
Computer Science Department
Stanford University
Stanford, California

This paper describes an operable automatic parser for natural language. It is a conceptual parser, concerned with determining the underlying meaning of the input utilizing a network of concepts explicating the beliefs inherent in a piece of discourse.

* This research is supported by Grant PHS MH 06645-0? from the National Institute of Mental Health, and (in part) by the Advanced Research Projects Agency of the Office of the Secretary of Defense (SD-183).

1. Introduction

The parser described in this paper is a conceptual parser. Its primary concern is to explicate the underlying meaning and conceptual relationships present in a piece of discourse in any natural language. Its output is a language-free network consisting of unambiguous concepts and their relations to other concepts. Pieces of discourse with identical meanings, whether in the same or different languages, parse into the same conceptual network.

The parser is not a syntactic parser in that its output is not concerned with the syntax of the input language. It bears some similarity to certain deep structure parsers [4, 7, 12, 13] only insofar as all these parsers are concerned to an extent with the meaning of the piece of discourse being operated upon. However, the conceptual parser is not limited by the problems inherent in transformational grammar (such as the difficulty in reversing transformational rules and the notion that semantics is something that 'operates' on syntactic output). Also, the parser does not have as a goal the testing of a previously formulated grammar [7, 13], so that the underlying theory has been able to be changed as was warranted by obstacles that we encountered.

The intention of this work is to handle natural language utilizing a semantics-based system, and thus our paper bears some similarity to the work of Quillian [8]. However, the conceptual dependency framework, though semantics-based, is intended to function as a more complete linguistic system. Thus, a grammar of a language is employed.

The grammar of the system is bipartite. The first part is a universal grammar exemplified by the conceptual rules employed by the system. The second part is language-specific and is made up of realization rules intended to map pieces of the conceptual network into linguistic items. The realization rules may be used for both parsing and generating. However, it is not necessary to use all the realization rules in order to parse. That is, the system is capable of making sense of a piece of language containing only a few words that it knows, since its conceptual framework is capable of making predictions. Thus, it can understand while using only a few realization rules, whereas it would need a great many more to map the same structure back into language. This phenomenon is similar to that observed in a man attempting to learn a foreign language.

The network produced by the parser contains conceptualizations.
A conceptualization is a statement about a single conceptual subject. The subject may be abstract or concrete; it may be one thing or a combination of things. The statement may tell what the subject is, what it does, what is done to it, etc. Furthermore, the entire conceptualization may be qualified as to time and place of occurrence, reasons, causes, consequences, and explanations.

The numerous conceptualizations in a discourse are interrelated not only by causal, logical, spatial and temporal ordering, but also by anaphoric references and by multiple mention of the same concepts. Several conceptualizations may occur in a single sentence, and words from several sentences may be required to complete one conceptualization. Furthermore, information from some conceptualizations in a discourse may serve to disambiguate the interpretation of other conceptualizations. Consequently, the parser is not inherently sentence-bound.

2. Domain and Capabilities

The parser is being used to understand natural language statements in Colby's on-line dialogue program for psychiatric interviewing, but is not restricted to this context. In interviewing programs like Colby's, as well as in question-answering programs, a discourse-generating algorithm must be incorporated to reverse the function of the parser. The conceptual parser is based on a linguistic theory that uses the same rules for both parsing and generating, thus facilitating man-machine dialogues.
In an interviewing program, the input may contain words that the program has never encountered, or which it has encountered only in different environments. The input may deal with a conceptual structure that is outside the range of experience of the program, or even use a syntactic combination that is unknown. The program is designed to learn new words and word-senses, new semantic possibilities, and new rules of syntax both by encountering new examples during the dialogue and by receiving explicit instruction.

3. Underlying Theory

The parser is based on the Conceptual Dependency model of language [9]. In this model, a conceptualization is a network of linguistic concepts [10] that fall into the following conceptual categories:

Governing categories
  PP    An actor or object; corresponds roughly to a syntactic noun or pronoun.
  ACT   In English, corresponds syntactically to verbs, verbal nouns (e.g., gerunds) and certain abstract nouns.
  LOC   A noun denoting the location of a conceptualization.
  T     Denotes the time of a conceptualization.

Assisting categories
  PA    PP-assister; corresponds roughly to an adjective.
  AA    ACT-assister; an adverb.

Most words or idiomatic phrases in a piece of discourse represent one or possibly several concepts in the network, and therefore fall into the categories above. However, connectives such as conjunctions, prepositions, punctuators, auxiliaries, and determiners are intended linguistically to represent the structure rather than the content of the network. They serve in a parser to disambiguate the parse. In a generator, they are generated so that the hearer will have clues to disambiguate the discourse. He thereby can understand what meaning was intended by the speaker.

The Conceptual Dependency model is stratificational insofar as it involves a mapping from one level to another (e.g., see Lamb [6]). Its highest level is an interlingua consisting of a network of language-free dependent concepts. (The dependency considered here is partially derived from the notions of Hays [3] and Klein [5]; however, the dependencies are not at all restricted to any syntactic criterion.) The linguistic process can be thought of, in Conceptual Dependency terms, as a mapping into and out of some mental representation. This mental representation consists of concepts related to each other by various meaning-contingent dependency links. Each concept in the network may be associated with some word that is its realizate on a sentential level.

The conceptual categories and relationships of the system are arrived at by studying a one-by-one analysis, identical to the way in which a person hears a sentence. According to the theory, as each word is input, its representation is determined and then is stored until it can be attached to the concept which directly governs it. The rule-of-thumb in representing concepts as dependent on other concepts is to see if the dependent concept will further explain its governor and if the dependent concept cannot make sense without its governor.

For example, in the sentence, "The big man steals the red book," the analysis is as follows: 'The' is input and stored for possible use in connecting sentences in a paragraph; i.e., in this case, 'the' specifies that 'man' has been referred to previously. Next, 'big' is input. But 'big' cannot stand alone conceptually, and it is stored until its governor appears. 'Man' can stand alone and is modified conceptually by 'big', so it is stored as a governor with its dependent. 'Steals' denotes an action that is dependent on the concept that is doing the acting. A conceptualization cannot be complete without a concept acting (or an attribute statement), so a two-way dependency link may be said to exist between 'man' and 'steal'. That is, they are dependent on each other and govern each other. Every conceptualization must have a two-way dependency link. Next, 'the' is stored as before, as is 'red'. When 'book' is input, 'red' is attached to it as before, and the whole entity is stored as dependent on 'steals' as the object of the action (represented by a horizontal single arrow). The entire structure is as follows:

[Diagram: 'man' <=> 'steals' <- 'book', with 'big' below 'man' and 'red' below 'book'.]

The categories assigned to these concepts are as follows:

[Diagram: PP <=> ACT <- PP, with a PA below each PP.]

This system, with more symbols and categories added, has been made to effect a set of rules which can account for all possible conceptualizations.
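The word-by-word analysis of "The big man steals the red book" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the toy lexicon `LEXICON`, the functions `can_stand_alone` and `analyze`, and the link labels "attribute", "two-way", and "object" are hypothetical names introduced here, and the sketch handles only the constructions that occur in this one sentence.

```python
# Toy lexicon (an assumption for illustration): word -> conceptual category.
# "DET" marks determiners, which are connectives rather than concepts.
LEXICON = {
    "the": "DET",
    "big": "PA",
    "red": "PA",
    "man": "PP",
    "book": "PP",
    "steals": "ACT",
}

GOVERNING = {"PP", "ACT", "LOC", "T"}  # governing categories from the text


def can_stand_alone(category):
    """A concept can stand alone only if its category is a governing one."""
    return category in GOVERNING


def analyze(sentence):
    """Return dependency links as (governor, link, dependent) triples."""
    links = []
    waiting = []   # PAs stored until their governing PP appears
    actor = None   # the PP that takes part in the two-way link
    act = None     # the ACT, once it has been heard
    for word in sentence.lower().strip(".").split():
        category = LEXICON[word]  # unknown words raise KeyError in this sketch
        if category == "DET":
            continue  # would be stored for discourse reference; unused here
        if category == "PA":
            waiting.append(word)
        elif category == "PP":
            for attribute in waiting:  # attach the queued attributes
                links.append((word, "attribute", attribute))
            waiting.clear()
            if act is None:
                actor = word
            else:
                links.append((act, "object", word))
        elif category == "ACT":
            if actor is None:
                raise ValueError("ACT heard before any PP")
            act = word
            # Actor and action are dependent on, and govern, each other.
            links.append((actor, "two-way", act))
    return links
```

Calling `analyze("The big man steals the red book.")` returns the links in the order in which they are established: 'big' attached to 'man', the two-way link between 'man' and 'steals', 'red' attached to 'book', and 'book' as the object of 'steals'.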
The relations inherent in these conceptualizations are intended to provide a system for representing the meaning of a sentence in any language, in language-free terms. (Language enters into the conceptual representation only in a naming capacity.) A list of the allowable conceptual dependencies is presented in the Appendix. In addition to dependencies, other relations are allowed by the theory, e.g., conjunctions, disjunctions and comparatives.

This conceptual framework can be used to write realization rules for any language, and rules for English have been written [9]. The realization rules are responsible for placing the entities realized from the conceptual level in the correct grammatical order to form a sentence. These rules are then the parsing rules for a particular language.

The system for analyzing a sentence into its conceptual representation works backwards through the realization rules of a language. In places where either of two rules could apply, the system can build upon both of them, aborting if one analysis leads to an impossible conceptual structure, or producing multiple analyses for ambiguous sentences.

All conceptualizations are checked against a list of experiences to see if that particular part of the construction has occurred before. If the construction has not occurred, or has occurred only in some peculiar context, this is noted. Thus, in the construction 'ideas <=> sleep', it is discovered that this connection has never been made before, and is therefore meaningless to the system. If the user says that this construction is all right, it is added to the memory; otherwise the construction is looked up in a metaphor list or aborted. The system thus employs a record of what it has heard before in order to analyze what it is presently hearing.
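The experience check just described can be sketched as follows. This is a minimal sketch under assumptions: the sets `EXPERIENCES` and `METAPHORS`, their contents, and the function `check_construction` are hypothetical, and the user's approval is modeled as a boolean argument rather than as a dialogue.

```python
# Two-way links the system has already encountered (illustrative contents).
EXPERIENCES = {("man", "steal"), ("boy", "go")}
# Constructions accepted only as metaphors (illustrative contents).
METAPHORS = {("time", "fly")}


def check_construction(subject, act, user_approves=False):
    """Classify a 'subject <=> act' construction against past experience."""
    pair = (subject, act)
    if pair in EXPERIENCES:
        return "known"
    if user_approves:
        EXPERIENCES.add(pair)  # remember the newly sanctioned construction
        return "learned"
    if pair in METAPHORS:
        return "metaphor"
    return "aborted"
```

Under these assumptions, `check_construction("ideas", "sleep")` is aborted the first time; once the user approves it, the same construction is reported as known on later calls.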
In order for the system to choose between two analyses of a sentence, both of which are feasible with respect to the conceptual rules (see Appendix), a conceptual semantics is incorporated. The conceptual semantics is a data base which limits the possible conceptual dependencies to statements consonant with the system's knowledge of the real world. The definition of each concept is composed of records organized by type of dependency and by conceptual category of the dependent. For each type of dependency, semantic categories (such as animate object, human institution, animal motion) are delimited with respect to the conceptual category of a given concept, and defining characteristics are inserted when they are known. For example, concepts in the semantic category 'physical object' all have the characteristic 'shape'. Sometimes this information is intrinsic to the particular concept involved, for example, 'balls are round'. The semantic categories are organized into hierarchical structures in which limitations on any category are assumed to apply as well to all categories subordinate to it. The system of semantic categories and a method of constructing semantic files is discussed more fully in a previous paper [9, 11]. (An example of the conceptual semantics is given in the Appendix.)

* In the present system, the files are constructed by incorporating information derived from rules presented as English sentences. The program parses each of these sentences, observes which dependencies are new, and then adds them to the files.

4. Conceptual Analysis

In the Conceptual Dependency theory on which our program is based, the parsing procedure begins by looking up the conceptual category of a word. If the conceptual category is either PP or ACT, the concept evoked by the word being considered is placed directly into the conceptual network. Otherwise the concept is queued until a permissible conceptual governor enters the system. Prepositions, conjunctions, and determiners are similarly queued. A permissible conceptual governor is one whose category is on the left hand side of a conceptual rule (see Appendix) and where the dependency effected by the use of that rule is allowed by the conceptual semantics.

If a permissible conceptual governor has been found for a queued concept, the queued concept is placed in the network with the appropriate dependency. A governor is not placed directly in the network unless it satisfies the conceptual semantics with respect to its dependencies. Ambiguous interpretations of a piece of discourse are built up if more than one conceptual rule may apply. If the conceptual semantics disallows an interpretation, the dependency is aborted. If more than one network remains after the conceptual semantics have been checked, multiple analyses are built up.

Semantic ambiguity (that is, multiple meanings for a word) is also handled by the conceptual semantics. If a word connects to more than one concept, each concept is checked for the appropriate possibilities of dependence. The concept that fits according to the semantics is the word-sense chosen. If more than one concept fits, more than one network is built up. The semantics are checked as each dependent is added, so it is possible to abort a multiple network at a later point in the parse.

As an example of the strategy employed in the theory it is illuminating to follow the parse of an example sentence. Consider: 'The tall boy went to the park with a girl'. The machine tries to simulate the behavior of a human in perceiving this sentence. Thus, it is continually operating and making hypotheses as each word enters the system.

When 'The' enters, it is held in waiting as it may be a link to a previous mention of the next PP which will enter the system. That is, 'the' would be replaced by 'a' in an ordinary dialogue if its specific referent were previously unmentioned or unknown. 'Tall' is marked as a PA. It is therefore queued until a rule that uses PA as a dependent can apply. Since 'boy' is a PP, it is placed directly in the network. Previous networks generated by the user are searched for an instance of 'boy' to which the 'the' refers, and a link is made if one is found. The conceptual rule 'PP <- PA' is keyed by the realization rule 'PA PP: 2; [...]'. The numbers are place markers representing relative position in the piece of discourse. The PA is below the line to indicate that this dependent is an attribute of its governor. The 'PP <- PA' semantics for 'boy' are checked to see if the conjunction of 'boy' and 'tall' can exist. Since the system knows that any animal can have height, the connection is allowed and our network looks as follows:

[Diagram: 'boy' with 'tall' below it.]

(The check with the semantics is made at every connection, but from now on we will only mention it when it is necessary.) When 'went' is operated on, it is transformed into 'go-p' (p means past), and since 'go' is an ACT, the realization rule that applies will connect it to a previous PP by a two-way link. The 'p' modifies the two-way link and is moved over it. We now have the following:

[Diagram: 'boy' <=> 'go', with 'p' over the two-way link and 'tall' below 'boy'.]

'To', the next word into the system, is queued since it may be a preposition or part of an infinitive. If the next governor encountered is not an ACT, 'to' is a preposition and is translated into a 'to' link. [...]
When the link is written horizontally, it represents a dependency between an ACT and a PP where the dependent is not the object of a direct action. Vertical prepositional dependency specifies additional information about a concept that is only indirectly an attribute of that concept (e.g., a location rather than a physical attribute). Prepositional links may have many different forms, each represented by a tag (e.g., "to", "of") written over the link. Prepositional dependency is different from simple dependency in certain specified ways, one being that strings of prepositionally dependent PP's may exist whereas this cannot exist with simple dependency.

When 'the' is encountered it is treated as before. 'Park' is marked as both a LOC and a PP. However, the system will demand the PP interpretation since 'to LOC' is not allowed; by definition, LOC's can only modify two-way links. 'Park' is then placed in the network giving:

[Diagram: 'boy' <=> 'go', with 'park' attached to 'go' by a 'to' link.]

'With' is held in waiting as a link until the PP to which it connects is encountered. 'A' is ignored and the construction 'with girl' is made. We are now faced with the problem of where to attach this construct. A problem exists since at least two realization rules may apply:

[Diagrams (1) and (2): in (1) 'with girl' depends on 'go' through 'park'; in (2) 'with girl' is written below the line as a dependent of 'park'.]

The problem is resolved by the conceptual semantics. The semantics for 'go' contains a list of conceptual prepositions. Under 'with' is listed 'any movable physical object', and since a girl is a physical object the dependency is allowed.
The semantics for 'park' are also checked. Under 'with' for 'park' are listed the various items that parks are known to contain, e.g., statues, jungle gyms, etc. 'Girl' is not found, so the network (1) is allowed while (2) is aborted. Although '<= girl' is dependent on 'go', it is dependent through 'park'. That is, these are not isolated dependencies, since we would want to be able to answer the question 'Did the girl go to the park?' affirmatively. In (2) the below-the-line notation indicates that it is the 'park with [...]

The conceptual semantics functions as an experience file in that it limits conceptualization to ones consonant with the system's past experience. Since it has never encountered 'parks with girls' it will assume that this is not the meaning intended. It is possible, as it is in an ordinary conversation, for the user to correct the system if an error was made. That is, if (2) were the intended network, it might become apparent to the user that the system had misunderstood and a correction could easily be made. The system would then learn the new permissible construct and would add it to its semantics. The system can always learn from the user [10], and in fact the semantics were originally input in this way, by noticing occurrences in sample sentences.*

Thus, the system purports to be analyzing a sentence in a way analogous to the human method.
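The attachment decision for 'with girl' can be sketched as a lookup in a small semantic file. This is a hedged illustration rather than the program's actual data structures: the hierarchy in `PARENT`, the table `PREP_SEMANTICS`, and the functions `ancestors`, `allowed`, `attach`, and `learn` are hypothetical, and the placement of 'girl' under 'movable physical object' is an assumption made so that limitations on a category apply to its subordinates.

```python
# Semantic hierarchy (assumed for illustration): concept -> parent category.
PARENT = {
    "girl": "human",
    "human": "animal",
    "animal": "movable physical object",
    "movable physical object": "physical object",
    "statue": "physical object",
}

# Conceptual semantics for prepositional dependents:
# (governor, preposition) -> categories or concepts allowed as dependents.
PREP_SEMANTICS = {
    ("go", "with"): {"movable physical object"},
    ("park", "with"): {"statue", "jungle gym"},
}


def ancestors(concept):
    """Yield the concept itself and every category above it."""
    while concept is not None:
        yield concept
        concept = PARENT.get(concept)


def allowed(governor, preposition, dependent):
    """True if the semantics of the governor permit this dependent."""
    permitted = PREP_SEMANTICS.get((governor, preposition), set())
    return any(category in permitted for category in ancestors(dependent))


def attach(preposition, dependent, candidates):
    """Return the candidate governors that survive the semantic check."""
    return [g for g in candidates if allowed(g, preposition, dependent)]


def learn(governor, preposition, dependent):
    """Record a construct that the user has declared permissible."""
    PREP_SEMANTICS.setdefault((governor, preposition), set()).add(dependent)
```

With these tables, `attach("with", "girl", ["go", "park"])` keeps only 'go', which corresponds to network (1). After `learn("park", "with", "girl")`, both governors are accepted and two analyses would be built up.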
The system handles input one word at a time as it is encountered, checks potential linkings with its own knowledge of the world and past experience, and places its output into a language-free formulation that can be operated on, realized in a paraphrase, or translated.

5.

The parser is presently operating in a limited form. It is coded in MLISP for the PDP-10 and can be adapted to other LISP processors with minor revisions. The algorithm used differs from the theoretical analysis given in the previous section because a computer program must deal with machine limitations and must cope with special cases that may be encountered in the input.

Rather than building up the network structure a little bit at a time during the parse, the program determines all the dependencies present in the network and then assembles the entire network at the end. Thus, the sentence 'The big boy gives apples to the pig.' is parsed into:

[Diagram: conceptual network for 'The big boy gives apples to the pig.']

The input sentence is processed word-by-word. After "hearing" each word, the program attempts to determine as much as it can about the sentence before "listening" for more. To this end, the network is built up a little at a time as each word is processed. Furthermore, the program anticipates what kinds of concepts and structures may be expected later in the sentence. If what it hears does not conform with its anticipation, it may be "confused", "surprised", or even "amused".

In case of semantic or syntactic ambiguity, the program should determine which of several possible interpretations was intended by the "speaker".
It first selects one interpretation by means of miscellaneous heuristics and stacks the rest. In case later tests and further input refute or cast doubt upon the initial guess, that guess is discarded or shelved, and a different interpretation is removed from the stack to be processed. To process an interpretation, it may be necessary to back up the scan to an earlier point in the sentence and rescan several words. To avoid repetitious work during rescans, any information learned about the words of the sentence is kept in core memory.

The parse involves five steps: the dictionary lookup, the application of realization rules, the elimination of idioms, the rewriting of abstracts, and the check against the conceptual semantics.

The dictionary of words is kept mostly on the disk, but the most frequently encountered words remain in core memory to minimize processing time. Under each word are listed all its senses. "Senses" are defined pragmatically as interpretations of the word that can lead to different network structures or that denote different concepts. For example, some of the senses of "fly" are:

  fly (intransitive ACT): what a passenger does in an airplane
  [...]

If there are several senses from which to choose, the program sees whether it was anticipating a concept or connective from some specific category. Recent contextual usage of some sense also can serve to prefer one interpretation over another.
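The choice among senses can be sketched as a sort over the dictionary entry. This is a hedged sketch, not the program's heuristics: the `Sense` record, the function `rank_senses`, and the relative priority of anticipation over recent usage are assumptions. Only the gloss of the first sense of "fly" comes from the text; its subscript and the other two senses are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    word: str
    subscript: int  # position of the sense in the dictionary entry
    category: str
    gloss: str


# Only the first gloss comes from the text; the rest are invented examples.
FLY = [
    Sense("fly", 1, "intransitive ACT", "what a passenger does in an airplane"),
    Sense("fly", 2, "transitive ACT", "what a pilot does to an airplane"),
    Sense("fly", 3, "PP", "a winged insect"),
]


def rank_senses(senses, anticipated=None, recent=()):
    """Order senses: anticipated category, then recent use, then subscript."""
    def key(sense):
        return (
            sense.category != anticipated,  # False sorts first
            sense not in recent,            # recently used senses next
            sense.subscript,                # lowest subscript breaks ties
        )
    return sorted(senses, key=key)
```

The first element of the result is the interpretation to try; the remaining elements play the role of the stacked alternatives. For example, `rank_senses(FLY, anticipated="PP")` puts the PP sense first, while a call with no hints falls back to dictionary order.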
To choose among several senses with otherwise equal likelihoods, the sense with lowest subscript is chosen first. Thus, by ordering senses in the dictionary according to their empirical frequency of occurrence, the system can try to improve its guessing ability.

The realization rules that apply to each word sense are referenced in the dictionary under each sense. Most of these rules fall into categories that cover large conceptual classes and are referenced by many concepts. Such categories are PP, PA, AA, PP_LOC, PP_T, LOC, T, simply transitive ACT, intransitive ACT, ACT that can take an entire conceptualization as direct object ("state ACT"), and ACT that can take an indirect object without a preposition ("transport ACT"). In contrast to most concepts, each connective (e.g., an auxiliary, preposition, or determiner) tends to have its own rules or to share its rules with a few other words.

A realization rule consists of two parts: a recognizer and a dependency chart. The recognizer determines whether the rule applies and the dependency chart shows the dependencies that exist when it does. In the recognizer are specified the ordering, categories, and inflection of the concepts and connectives that normally would appear in a sentence if the rule applies. If certain concepts or connectives are omissible in the input, the rule can specify what to assume when they are missing. Agreement of inflected words can be specified in an absolute (e.g., "plural") or a relative manner (e.g., "same tense"). Rules for a language like English have a preponderance of word order specifications while rules for a more highly inflected language would have a preponderance of inflection specifications.

Realization rules are used both to fit concepts into the network as they are encountered and to anticipate further concepts and their potential realizates in the networks. When a rule is selected for the current word sense, it is compared with the rules of preceding word senses to find one that "fits". For example, if "very hot" is heard, one realization rule for "very" is:

[...]

8. Quillian, R., "The Teachable Language Comprehender: A Simulation Program and Theory of Language", Bolt Beranek and Newman, Cambridge, Massachusetts, 1969.
9. Schank, R., "A Conceptual Dependency Representation for a Computer-Oriented Semantics", Ph.D. Thesis, University of Texas, Austin, 1969. (Also available as Stanford AI Memo 83, Computer Science Department, Stanford University, Stanford, California, March 1969.)
10. Schank, R., "A Notion of Linguistic Concept: A Prelude to Mechanical Translation", Stanford AI Memo 75, Computer Science Department, Stanford University, Stanford, California, December 1969.
11. Schank, R., "Outline of a Conceptual Semantics for Generation of Coherent Discourse", TRACOR-68-462-U, Tracor Inc., Austin, Texas, March 1968.
12. Thorne, J., Bratley, P., and Dewar, H., "The Syntactic Analysis of English by Machine", in Machine Intelligence III, University of Edinburgh, 1968.
13. Walker, D., Chapin, P., Geis, M., and Gross, L., "Recent Developments in the MITRE Syntactic Analysis Procedure", MITRE Corp., Bedford, Mass., June 1966.