COATIS, an NLP System to Locate Expressions of Actions Connected by Causality Links

Daniela Garcia

Université de Paris-Sorbonne, CAMS-LaLIC, 96, boulevard Raspail, 75006 Paris, France
and EDF-DER, IMA-TIEM, 1, avenue du Général-de-Gaulle, 92141 Clamart Cedex, France

Abstract. COATIS is an automatic tool designed to locate certain actions expressed in texts. Rules of contextual exploration, activated by the presence of linguistic indicators of causality in sentences, enable COATIS to locate expressions that denote field actions and that are linked by causal relations. COATIS processes technical texts of any domain, in the French language. It is therefore particularly suitable for use in causal knowledge acquisition from texts.

1 Introduction

The notions of transfer, entity and movement, as expressed by natural languages, have been extensively studied, for instance by Talmy [18, 19], Langacker [14], Jackendoff [11] and Pustejovsky [16], but the systematic study of the encoding of causality by natural languages is still in its early stages. Several systematic descriptions of vocabulary have been conducted with the aim of acquiring knowledge from texts without any information about the field described in the processed text: M. Abraham analyzes verbs of movement by means of schemas [1]; J.-P. Desclés studied semantic transitivity, aspectuality schemas and diathesis schemas [5]; and particular semantic domains such as relations of localisation and whole-part relations were recently studied and presented in [13] and [12]. In this paper, we present results concerning the notions of action and of causal relation between actions, as expressed by French verbs. The model we built is coupled with the Strategy of the Contextual Exploration to obtain the COATIS computer system. This system aims to index the processed text by the actions expressed in it, organized by the causal links between them.
We start by explaining (Section 2) how we organized the French verbs that express causal links between actions. We then describe (Section 3) the COATIS system.

2 Semantic Organization of Causality as it is Expressed in French

A classic distinction between efficient causality and the causality that can be described by formal representations has long been established, and we take it into account. We distinguish between efficient causality, where one action provokes a different action that comes later in time ("Massive deforestation of the planet leads to global cooling"), and the causality that replaces the notions of cause and effect with regularities observed between actions ("Energy is proportional to mass"). We extend this distinction with original work on the organization of efficient causal relations.

2.1 Efficient Causality as Expressed by French Verbs

The idea of efficient causality as an oriented relation between actions can be expressed by French verbs. French verbs such as provoquer (to provoke), gêner (to disturb), résulter (to result), or conduire à (to lead to) are called indicator verbs of causality (or indicators for short). The indicators that express efficient causality relations can (i) clarify the nature of the produced effect (-disturbing-¹, -letting-, -modification-, -creation-, etc.), or (ii) clarify the intervention of the causal action (-contribution-, -collaboration-). Figure 1 presents an extract of the model, which comprises twenty-three specific relations of causality (nineteen relations of efficient causality and four relations of formal causality).

Fig. 1. Semantic organization of the relations of efficient causality (extract).

The model presented below comes primarily from the manual classification of indicator verbs found in technical texts. Some of the classes we describe were first brought forward by the American linguist Leonard Talmy [20].
¹ We note each specific causal relation between two hyphens.

2.2 Efficient Causalities Clarifying the Produced Effect

One causal action can block, impede, influence, create or keep another action. The two relations -blocking- and -keeping- respectively extend the relations -impediment- and -creation-, which account for two extreme notions. Between these two notions, -creation- and -impediment-, non-categorical influences of one action on another are possible. One causal action can facilitate, let or disturb another action. These three relations can be ordered according to a qualitative feature associated with the relation of -influence-: the relation of -facilitation- is a more or less positive modification; the relation of -letting- is a nil modification; the relation of -disturbing- is a more or less negative modification.

As described above, the organization of efficient causalities clarifying the nature of the produced effect justifies the existence of a continuum between -blocking- and -keeping-. Indeed, the -facilitation- relation, taken to extremes, becomes a -creation-. And the relation of -disturbing-, as its intensity varies, merges into the relation of -impediment- when it goes beyond the extreme limit.

Other authors are currently engaged in research on verb classification. Note in particular the work of Patrick Saint-Dizier described in [17], as well as the results presented by Beth Levin in [15]. Jacques François [8] is interested in the notion of causality as a criterion for classifying the causation verbs of change. Our model is based on an analysis of texts with frequent occurrences of 253 verbs carrying the notion of efficient causality.

3 COATIS

COATIS applies the Strategy of the Contextual Exploration [7] to detect expressions of action. The method is applied sentence by sentence to the whole text. The presence of an indicator (e.g., one of the verbs in Fig.
1) activates a contextual exploration of the sentence that seeks to detect the clues necessary to:

- decide whether the indicator is likely to express a causal relation in the text or, on the contrary, show that an interpretation of causality is impossible;
- identify the arguments of the relation.

This is the general organization of a system that processes texts by the Strategy of the Contextual Exploration. This method has also been implemented and used to solve problems such as knowledge modelling from texts (the SEEK system is presented in [13]), automatic abstracting [2], and tense and aspect analysis [6]. These systems are implemented as knowledge-based systems in which the only knowledge used is linguistic. Figure 2 illustrates an example of a sentence processed by COATIS.

COATIS is implemented in SMECI². The system applies to texts that have first been processed by LEXTER [3], a terminology extraction software that performs a morpho-syntactic analysis of a corpus of French texts on any technical domain and yields a network of noun phrases.

² SMECI™, from ILOG™, is a Knowledge-Based System generator.

Fig. 2. An example of a sentence processed by COATIS. The noun phrases between brackets are the noun phrases detected by LEXTER. The sentence processed is "Putting different kinds of equipment in parallel generally hinders the operation of the network". The result of the process is that the two expressions of action "Putting different kinds of equipment in parallel" and "operation of the network" are in a causal relation of -disturbing-.

Linguistic indicators are not sufficient to detect the presence of a causal relation in a sentence. Each occurrence must be considered in its context.
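The two tasks of the exploration (accepting or rejecting the causal reading, then identifying the arguments) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the SMECI implementation: the one-entry lexicon, the function name `explore` and the use of square brackets to mark LEXTER noun phrases are hypothetical, and the English gloss of the Fig. 2 sentence stands in for the French input that COATIS actually processes.

```python
import re

# Hypothetical one-entry indicator lexicon. The English verb stands in for a
# French indicator; the relation label follows the Fig. 2 example.
INDICATORS = {"hinders": "disturbing"}

# Assumption: noun phrases arrive pre-bracketed, as if detected by LEXTER.
NOUN_PHRASE = re.compile(r"\[([^\]]+)\]")


def explore(sentence):
    """Return (cause, relation, effect), or None if no causal reading holds."""
    for verb, relation in INDICATORS.items():
        match = re.search(rf"\b{re.escape(verb)}\b", sentence)
        if match is None:
            continue  # no indicator: the exploration is not activated
        before = NOUN_PHRASE.findall(sentence[:match.start()])
        after = NOUN_PHRASE.findall(sentence[match.end():])
        # Clue check: one noun phrase is needed on each side of the indicator;
        # otherwise the causal interpretation is rejected.
        if before and after:
            return before[-1], relation, after[0]
    return None


result = explore(
    "[Putting different kinds of equipment in parallel] generally hinders "
    "the [operation of the network]"
)
print(result)
```

The sketch keeps the nearest noun phrase on each side of the indicator; the actual rules rely on richer positional, morphosyntactic and morphological clues.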
In order to keep only those utterances that express causality, it is necessary to examine the context of the indicators detected, in order to locate relevant clues. The rules of contextual exploration analyze different kinds of information, e.g. morphosyntactic (detecting, for example, the occurrence of an infinitive verb preceding or following the occurrence of the indicator) or morphological (for example, the presence, in a part of the indicator context to be determined by the system, of a French linguistic unit ending in -ment, -ion, -ure, -ise, -age, -aison, -yse, etc.).

COATIS criteria for selecting LEXTER units are essentially connected to the relative position of different units in the sentence: LEXTER units, prepositions chosen according to the indicator, verbs with a morphological feature (infinitive, past or present participle, conjugated) and punctuation. The noun phrase selected (one LEXTER unit) is then explored to search for additional clues that would confirm the first causal interpretation of the indicator (e.g. the presence of a French noun ending in -ment, -ure, -age, etc.). This second exploration is necessary because the morphological information is only relevant when it is present in a certain part of the sentence. The organization of indicator verbs also makes it possible to specify the causal value concerned (-creation-, -disturbing-, -collaboration-, etc.).

One of the main features of COATIS is its independence with regard to the subject field processed. Indicators and clues are independent of any specific field. COATIS does not need a particular domain dictionary to operate. It is therefore a useful tool for causal knowledge acquisition from texts, since no preliminary domain knowledge is required.

At present, there are few computerized systems with the same operational design, and it seems difficult to compare COATIS with other systems. We can cite the work of Gary C.
Borchardt [4]: a system that locates causal relations in texts written according to directives provided by the builder of the system. In contrast, COATIS processes any technical text, without requiring it to be written specifically for its use. For the time being, COATIS produces little noise (about 15% for the "Guide for Regional Networks Planning"); however, it is much more difficult to quantify what the system does not find, so we have to process many other texts before we can give a verdict. Nevertheless, the results already obtained from technical texts are encouraging, as is the feedback from cogniticians using the results.

4 Using Results of COATIS

COATIS can be used for different applications. COATIS results are helpful in constituting an action terminology for a given domain. They can also be helpful in building a causal model of a domain: COATIS provides (i) a number of expressions of action and (ii) a basic organizational structure based on causal relations. Moreover, we have listed a number of additional contributions to the modelling process:

- When resolving problems of inference, if we have detected that "A -letting- B" and if we moreover know that A is carried out, then B becomes possible; if we have detected that "A -cause- B" and that A is carried out, then B is carried out or is going to be carried out at a certain point.
- When building a causal network, if "A -cause- B" and "B -cause- C", we should ask the expert if we can add the relation "A -cause- C".
- When testing the coherence of the knowledge gathered, if "A -cause- B" and "B -hinder- C", and if the relation "A -cause- C" exists in the network, we have to resolve this apparent paradox by treating it as a particular case or an exception, or even by leaving it out altogether.
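The three uses listed above can be sketched over a small set of causal triples. This is only an illustration: the triple representation, the function names and the sample actions A to E are assumptions and are not part of COATIS or of the HYPERPLAN project.

```python
# Hypothetical network: (source action, relation label, target action).
EDGES = [
    ("A", "cause", "B"),
    ("B", "cause", "C"),
    ("B", "hinder", "D"),
    ("A", "cause", "D"),
    ("A", "letting", "E"),
]


def pairs(edges, relation):
    """Set of (source, target) pairs linked by the given relation."""
    return {(a, b) for a, rel, b in edges if rel == relation}


def consequences(edges, done):
    """Inference: status of actions reachable in one step from done actions."""
    status = {}
    for a, rel, b in edges:
        if a not in done:
            continue
        if rel == "cause":
            status[b] = "carried out, now or later"
        elif rel == "letting":
            status.setdefault(b, "possible")  # never overrides a -cause-
    return status


def transitive_candidates(edges):
    """Network building: pairs (A, C) to submit to the expert."""
    causes = pairs(edges, "cause")
    return sorted(
        (a, c)
        for (a, b) in causes
        for (b2, c) in causes
        if b == b2 and a != c and (a, c) not in causes
    )


def apparent_paradoxes(edges):
    """Coherence test: A -cause- B, B -hinder- C, yet A -cause- C exists."""
    causes = pairs(edges, "cause")
    return sorted(
        (a, b, c)
        for (a, b) in causes
        for (b2, c) in pairs(edges, "hinder")
        if b == b2 and (a, c) in causes
    )


print(consequences(EDGES, {"A"}))
print(transitive_candidates(EDGES))
print(apparent_paradoxes(EDGES))
```

The functions only propose candidates and flag conflicts. As in the text, adding "A -cause- C" or settling a paradox remains a decision for the expert.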
The results provided by COATIS are being used in the HYPERPLAN [10] project, which concerns the building of a Technical Documentation Consulting System (TDCS) for the "Guide for Regional Networks Planning" (150,000 words): part of the text indexing takes into account the causal information identified by COATIS [9].

5 Conclusion

The approach adopted in building COATIS aims to find an operational method for constructing representations from a text, taking into account the notion of causality. While our university research team is engaged in theoretical research, we are implementing a number of automatic tools to process texts. The independence of these systems with regard to a particular field and their ability to adapt to many different texts are made possible by a strong theoretical base: a linguistic model under construction that explores and seeks to model the general notions expressed by natural languages, e.g. membership, localisation, whole-part relation, movement, transfer and... causality.

References

1. Abraham, M.: Analyse sémantico-cognitive des verbes de mouvement et d'activité : Contribution méthodologique à la constitution d'un dictionnaire informatique des verbes. PhD, EHESS, Paris (1995)
2. Berri, J., Le Roux, D., Malrieu, D., Minel, J.-L.: SERAPHIN main sentences automatic extraction system. Proceedings of the Second Language Engineering Convention, London, October (1995)
3. Bourigault, D.: LEXTER, a Natural Language Processing Tool for Terminology Extraction. Proceedings of the 7th EURALEX International Congress, Göteborg (1996)
4. Borchardt, G. C.: Thinking between the Lines: Computers and the Comprehension of Causal Descriptions. MIT Press, Cambridge, Massachusetts (1994)
5. Desclés, J.-P.: Langages applicatifs, Langues naturelles et Cognition. Hermès, Paris (1990)
6.
Desclés, J.-P., Jouis, C., Oh, H.-G., Reppert, D.: Exploration contextuelle et sémantique : Un système expert qui trouve les valeurs sémantiques des temps de l'indicatif dans un texte. Knowledge Modelling and Expertise Transfer. D. Herin-Aime, R. Dieng, J.-P. Regourd, J.-P. Angoujard (eds.) IOS Press, Amsterdam, Washington DC, Tokyo (1991) 371-400
7. Desclés, J.-P., Minel, J.-L.: L'exploration contextuelle. In: Le résumé par exploration contextuelle. Communications to the Cogniscience-Est Meeting, Nancy, November 1994. Technical Report CAMS 95(1) (1995) 3-17
8. François, J.: Changement, causation, action. Librairie Droz, Genève-Paris (1989)
9. Garcia, D., Aussenac-Gilles, N., Courcelle, A.: Exploitation, pour la modélisation, des connaissances causales détectées par COATIS dans les textes. Proceedings of the 7th Journées d'Acquisition des Connaissances. Sète, France (1996)
10. Gros, C., Assadi, H., Aussenac-Gilles, N., Courcelle, A.: Task Models for Technical Documentation Accessing. Proceedings of the 10th European Knowledge Acquisition Workshop. Nottingham, UK (1996)
11. Jackendoff, R.: Semantics and Cognition. MIT Press, Cambridge, Mass. (1983)
12. Jackiewicz, A.: Expression lexicale de la relation d'ingrédience. Faits de Langues 7 (1996)
13. Jouis, C., Mustafa-Elhadi, W.: Conceptual Modeling of Database-Schema using linguistic knowledge. Application to Terminological databases. Proceedings of the First Workshop on Application of Natural Language to Databases (NLDB'95). AFCET, Versailles, France (1995) 103-118
14. Langacker, R.: Foundations of Cognitive Grammar, vol. 1. Stanford Univ. Press (1987)
15. Levin, B.: English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press (1993)
16. Pustejovsky, J.: The Generative Lexicon. MIT Press (1995)
17. Saint-Dizier, P.: Verb semantic Classes for French: Construction and Semantic Representation. Proceedings of IFIP, Conference on verb semantic classes. Univ. of Pennsylvania (1995)
18.
Talmy, L.: Semantics and Syntax of Motion. Syntax and Semantics 4. Academic Press, New York (1975) 181-238
19. Talmy, L.: How Language Structures Space. In: Spatial Orientation: Theory, Research and Application. H. Pick, L. Acredolo (eds.) Plenum Press (1983)
20. Talmy, L.: Force dynamics in language and cognition. Cognitive Science 12 (1988) 49-100