Architecture of NLG-systems
Hercules Dalianis
E-mail: [email protected]
Tel: +46 70 568 13 59
http://www.dsv.su.se/~hercules

[Figure: architecture diagram with the components User model, Style, Pragmatics, Control, Deep generator, Text Planner, Text Plan Library, Sentence Planner, Aggregation, Reference, Sentence Delimitation, Lexical Choice, Theme Control, Surface generator, Surface Grammar, Base Lexicon, Domain Lexicon, Knowledge Representation, Semantics, Ontology and Natural Language text]

Deep generation
• A knowledge representation is the input data to the text generator
• A user model describes what type of user is going to read the text (query analysis)
• The user model assists the content selection in the deep generation.
• The deep generator plans the content and organizes what to say
• A text planner plans the discourse so that the text becomes coherent.
  – The sentences in the text will become coherent
• A sentence planner plans each sentence,
  – Sentence length, aggregation, pronouns and tense.
  – Schemas can also be used

Surface generation
• The surface generator realizes the message.
• Generation grammar (language)
  – Syntactically correct text
  – In the correct selected language
• Lexicon
  – Base lexicon
  – Domain lexicon

Generation grammars
• DCG, Definite Clause Grammar, Prolog based
• Systemic grammars (Halliday)
  – Can interact between the syntactic, semantic and pragmatic parts of the system.
  – Each part of the grammar narrows the choices until a correct sentence structure is reached.
• Functional Unification Grammar (Martin Kay)
  – Each part of the grammar is modular and can be changed
• Full control of generation

Query analysis and user modelling
• Has to analyze the query
• Find out what the user already knows
• Use some sort of user model that the system can relate to.

Deep generation: What should be said?
• Content selection
  – Selects from its abundant knowledge base what should be said
• Text planning (Discourse planning)
  – Makes a text plan that contains a number of sentence plans. The text plan has to be coherent.
• Sentence planning
  – Decides the form of the sentences: active or passive form, pronoun use, aggregation, sentence length.

Sentence generation: How should it be said?
• Selects language and grammar
• Lexical choices (words)
  – Base dictionary
  – Domain dictionary
• Post processing

Deep generation
• Aggregation, coordination, ellipsis
• Removes redundant information without changing the content
• ASTROGEN - Aggregated deep and Surface naTuRal language GENerator
• http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html

Götaland gets drizzle on Monday
Götaland gets rain on Tuesday
Götaland gets hail on Wednesday
Götaland gets snow on Thursday
Götaland gets gale on Friday
Götaland gets storm on Saturday
Götaland gets hurricane on Sunday
Svealand gets drizzle on Monday
Svealand gets rain on Tuesday
Svealand gets hail on Wednesday
Svealand gets snow on Thursday
Svealand gets gale on Friday
Svealand gets storm on Saturday
Svealand gets hurricane on Sunday

Subject and predicate aggregation
  Götaland gets drizzle on Monday, rain on Tuesday, hail on Wednesday, snow on Thursday, gale on Friday, storm on Saturday, hurricane on Sunday
  Svealand gets drizzle on Monday, rain on Tuesday, hail on Wednesday, snow on Thursday, gale on Friday, storm on Saturday, hurricane on Sunday

Predicate and direct object aggregation
  Götaland and Svealand get drizzle on Monday
  Götaland and Svealand get rain on Tuesday
  Götaland and Svealand get hail on Wednesday
  Götaland and Svealand get snow on Thursday
  Götaland and Svealand get gale on Friday
  Götaland and Svealand get storm on Saturday
  Götaland and Svealand get hurricane on Sunday

Subject and predicate aggregation together with predicate and direct object aggregation
  Götaland and Svealand
    get drizzle on Monday
    get rain on Tuesday
    get hail on Wednesday
    get snow on Thursday
    get gale on Friday
    get storm on Saturday
    get hurricane on Sunday

Unbounded lexical aggregation
  Götaland and Svealand get unsettled weather on Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday
Bounded lexical aggregation
  Götaland and Svealand get unsettled weather the whole week
Introduction of cue words
  Both Götaland and Svealand get unsettled weather the whole week

The same result is reached along the other path:

Unbounded lexical aggregation
  Götaland gets unsettled weather on Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday.
  Svealand gets unsettled weather on Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday.
Bounded lexical aggregation
  Götaland gets unsettled weather the whole week
  Svealand gets unsettled weather the whole week
Predicate and direct object aggregation
  Götaland and Svealand get unsettled weather the whole week
Introduction of cue words
  Both Götaland and Svealand get unsettled weather the whole week

What can one use text generation for?
• Paraphrase parsed questions
  – Parsing in a query interface can give one or more interpretations of a query.
  – Paraphrase these interpretations into NL
  – Usually generation of a single sentence
  – One can use a bi-directional grammar for both parsing and generation (for example DCG, Definite Clause Grammar, in Prolog)

What can one use text generation for?
• Generation of the target text in machine translation
• Generation of abstracts
• Generation of weather or stock reports directly from raw data.
• In several languages
• Generation of handbooks in several languages at the same time.

Generation of one or more languages
• Weather reports: COGENTEX, Canada
• Stock exchange
• Car manuals: TechDoc, Honda Germany
• Support systems
  – Scarrie - Scania (via "Scania Swedish" and directly)

Paraphrasing
• Validation of formal specifications by paraphrasing them into NL
• Explanation of complex medical, legal or technical systems, for example MYCIN

HSQL (Help system for Structured Query Language)
The prototype is implemented on a database that contains information about different hospitals and their operation.
Example of generation of SQL from natural language (NL) and paraphrasing from SQL back to NL:

Question: vem behandlar Adler. (Who treats Adler?)
SQL:
  SELECT DISTINCT T3.name, T1.name, T2.empl_no, T3.reg_no
  FROM DOCTOR T1, ATTDOCTOR T2, PATIENT T3
  WHERE (T1.name = 'Adler J.') AND (T2.empl_no = T1.empl_no) AND (T3.reg_no = T2.reg_no)
Paraphrase of the SQL in natural language:
  Vilka patienter behandlar doktor Adler J. (Which patients are treated by doctor Adler J.?)

Question: vilken diagnos har Amster ? (What diagnosis does Amster have?)
SQL:
  SELECT DISTINCT T2.other_info, T1.name, T2.reg_no
  FROM PATIENT T1, DIAGNOSIS T2
  WHERE (T1.name = 'Amster K.') AND (T2.reg_no = T1.reg_no)
Paraphrase of the SQL in natural language:
  vilka diagnoser har en patient som heter Amster K. (Which diagnoses does a patient with the name Amster K. have?)

Question: vem behandlar Hansson. (Who treats Hansson?)
SQL:
  SELECT DISTINCT T1.name, T2.empl_no, T3.reg_no, T3.name
  FROM DOCTOR T1, ATTDOCTOR T2, PATIENT T3
  WHERE (T2.empl_no = T1.empl_no) AND (T3.reg_no = T2.reg_no) AND (T3.name = 'Hansson A.')
Paraphrase of the SQL in natural language:
  Vilka doktorer behandlar patienten Hansson A. (Which doctors treat the patient Hansson A.?)

Generation from raw data

Input to GOSSIP:
  ee(martin,ttyp0,login,[],8:20:03,_,_,0).
  ee(martin,ttyp0,editor,[f1],8:30:00,9:10:32,0:40:32,240).
  ee(martin,ttyp1,editor,[f2],8:42:21,9:13:14,0:30:53,183).
  ee(martin,ttyp0,logout,[],9:21:05,_,1:01:02,0).
  ee(martin,ttyp0,login,[],10:17:32,_,_,0).
  ee(martin,ttyp0,editor,[f1],10:20:58,12:15:27,1:54:29,1200).
  ee(martin,ttyp1,editor,[f2],11:00:39,11:32:48,0:32:09,185).
  ee(jessie,ttyd0,login,[],11:03:46,_,_,0).
  ee(jessie,ttyd0,editor,[f5],11:12:45,12:48:22,1:35:37,573).
  ee(jessie,ttyd1,compiler,[f4],11:23:32,11:31:01,0:07:29,300).
  ee(jessie,ttyd1,editor,[f3],11:32:25,11:45:56,0:13:31,70).
  ee(jessie,ttyd1,editor,[f4],11:47:34,11:59:09,0:11:35,65).
  ee(jessie,ttyd1,compiler,[f4],12:04:47,12:08:32,0:03:45,186).
  ee(jessie,ttyd1,editor,[f3],12:09:57,12:16:34,0:06:37,15).
  ee(jessie,ttyd1,editor,[f4],12:18:43,12:39:24,0:20:41,154).
  ee(martin,ttyp0,logout,[],12:20:21,_,2:02:49,0).
  ee(jessie,ttyd1,editor,[f7],12:56:01,13:15:02,0:19:01,143).
  ee(jessie,ttyd0,editor,[f6],12:59:56,13:20:43,0:20:47,187).

Output from GOSSIP:
  The system was used for 7 hours 32 minutes and 12 seconds. The users of the system ran editors and compilers during this time. Compilers were run six times (the cpu-time equal to 46% of the total cpu-time). Editors were run twelve times (the cpu-time equal to 53% of the total cpu-time). Two users, Martin and Jessie, logged on to the system. Jessie used the system for 63% of the time in use.

• CLARE-Ericsson
[Figure label: network[1]]
Subscribers are part of a network. Mobile subscribers are subscribers. Subscribers can either be in the state idle or busy. The state busy can either be in the substates ringtone, ringsignal, busytone or dialtone.
When one subscriber is calling another subscriber, the first subscriber has ringtone and the other subscriber has ringsignal.

Validation of formal specifications by paraphrasing them to NL
• Ellemtel-Ericsson
• LOXY is a predicate logic language for describing telephone services
• Clare = LOXY + conceptual model
• Validates LOXY and Clare by translating them to NL

ASTROGEN-Validates
[Figure: conceptual model diagram with the labels subscriber[0..1000], #phonenumber, XOR idle busy, XOR ringtone ringsignal busytone dialtone, calling ([1] ringtone, [1] ringsignal), and isA mobile_subscriber]

INSTANTIATION example OF SPECIFICATION network
ENTITIES
  John:subscriber phonenumber = 100; phonenumber = 101
  IS STATES idle;
  Mary:subscriber phonenumber = 200
  IS STATES idle;
  Tom:subscriber phonenumber = 300
  IS STATES idle;
END;

(text plan)
paraphrase(f(pres,isa,john,subscriber) &
  f(pres,state,john,idle) &
  f(pres,poss,john,f(pres,attr,phonenumber,100)) &
  f(pres,poss,john,f(pres,attr,phonenumber,101)) &
  f(pres,isa,mary,subscriber) &
  f(pres,state,mary,idle) &
  f(pres,poss,mary,f(pres,attr,phonenumber,200)) &
  f(pres,isa,tom,subscriber) &
  f(pres,state,tom,idle) &
  f(pres,poss,tom,f(pres,attr,phonenumber,300)) ).

john, mary and tom are subscribers
they are idle
john has phonenumbers 100 and 101
mary has a phonenumber 200
tom has a phonenumber 300.
yes

STEP/EXPRESS VOLVEX
• The STEP/EXPRESS standard
• The manufacturing industry: car, boat, airplane, process industry, power, oil, etc.
• AP: Application Protocol
• AP 214: 500 concepts
• ASTROGEN: hybrid system with both text generation and canned text.

ASTROGEN
• ASTROGEN - Aggregated deep and Surface naTuRal language GENerator
• http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html

STEP-VOLVEX-ASTROGEN
[Figure: pipeline diagram with the labels STEP AP, EXPRESS to Prolog f-structures, lexical lexicon building tool, ASTROGEN, Natural Language, Validation]

ENTITY fillet
  SUBTYPE OF (transition_feature);
END_ENTITY;

5.2.3.1.75 Fillet
A Fillet is a concave circular arc transition between two intersecting Face (see 4.2.167) objects without any constraints concerning changes of the radius along the Fillet.

ENTITY constant_radius_fillet
  SUBTYPE OF (fillet);
  radius : feature_parameter;
  first_offset : OPTIONAL feature_parameter;
  second_offset : OPTIONAL feature_parameter;
END_ENTITY;

?- question(fillet).
A constant_radius_fillet is a subtype of a fillet.
A fillet is an entity.
It is a subtype of a transition_feature. (Pron.)
A Fillet is a concave circular arc transition between two intersecting Face (see 4.2.167) objects without any constraints concerning changes of the radius along the Fillet. (Canned text)
yes

ENTITY project_relationship; (*UOF:S5*)
  related : project;
  relating : project;
  relation_type : undefined_object;
  description : OPTIONAL string_select;
END_ENTITY;

4.2.357 Project_relationship
A Project_relationship is a relationship between two Project (see 4.2.356) objects.
(Example text) EXAMPLE 174 -- For the development of a new car, a project is set up that is responsible for the development decisions as well as for the accounting of the costs.

ENTITY project; (*UOF:S5*)
  id : undefined_object;
  name : string_select;
  description : OPTIONAL string_select;
  actual_start_date : OPTIONAL date_time;
  actual_end_date : OPTIONAL date_time;
  planned_start_date : OPTIONAL event_or_date_select;
  affected_product_class : SET[0:?] OF product_class;
  work_program : SET[0:?]
  OF activity;
  planned_end_date : OPTIONAL period_or_date_select;
END_ENTITY;

4.2.359 Project
A Project is a unique process with a time limit, with a defined goal, with a defined budget, and with defined resources. A Project is a type of organization representing a work programme that consists of a set of assigned actions. See ARM definition for Project in paragraph 4.2.356 for more information.

?- question(project & project_relationship).
a project is an entity and
a project has an undefined_object id and
a project has a string_select name and
a project has a string_select description and
a project has a date_time actual_start_date and
a project has a date_time actual_end_date and
a project has an event_or_date_select planned_start_date and
a project has a product_class affected_product_class and
a project has an activity work_program and
a project has a period_or_date_select planned_end_date and
a project_relationship is an entity and
a project_relationship has a project related and
a project_relationship has a project relating and
a project_relationship has an undefined_object relation_type and
a project_relationship has a string_select description.
yes

?- sort(1,2,3),canned_text, canned_example,clause_comma, pronoun, predicate_do.
?- question(project & project_relationship).
A project and a project_relationship are entities. (Agg.)
They have a string_select description. (Pron.)
A project has a date_time actual_end_date.
It has a date_time actual_start_date. (Pron.)
It has a product_class affected_product_class. "
It has an undefined_object id.
It has a string_select name.
It has a period_or_date_select planned_end_date. "
It has an event_or_date_select planned_start_date. "
It has an activity work_program.
A project_relationship has a project related.
It has a project relating. (Pron.)
It has an undefined_object relation_type. "
A Project is a unique process with a time limit, with a defined goal, with a defined budget, and with defined resources. (Canned text)
A Project_relationship is a relationship between two Project (see 4.2.356) objects.
(Example text) EXAMPLE 174 -- For the development of a new car, a project is set up that is responsible for the development decisions as well as for the accounting of the costs.

?- sort(1,2,3),subject_pred,predicate_do,subject,predicate,sym_rel,canned_text, canned_example.
?- question(project & project_relationship).
A project, a project_relationship are entities and
A project, a project_relationship have a string_select description and
A project has a date_time actual_end_date, a date_time actual_start_date, a product_class affected_product_class, an undefined_object id, a string_select name, a period_or_date_select planned_end_date, an event_or_date_select planned_start_date, an activity work_program and
A project_relationship has a project related, a project relating and an undefined_object relation_type.
A Project is a unique process with a time limit, with a defined goal, with a defined budget, and with defined resources.
A Project_relationship is a relationship between two Project (see 4.2.356) objects.
EXAMPLE 174 -- For the development of a new car, a project is set up that is responsible for the development decisions as well as for the accounting of the costs.
yes

?- sort(1,2,3),subject_pred,predicate_do,subject,predicate,sym_rel,canned_text, canned_example, clause_comma.
?- question(project & project_relationship).
A project and a project_relationship are entities.
A project and a project_relationship have a string_select description.
A project has a date_time actual_end_date, a date_time actual_start_date, a product_class affected_product_class, an undefined_object id, a string_select name, a period_or_date_select planned_end_date, an event_or_date_select planned_start_date and an activity work_program.
A project_relationship has a project related, a project relating and an undefined_object relation_type.
A Project is a unique process with a time limit, with a defined goal, with a defined budget, and with defined resources.
A Project_relationship is a relationship between two Project (see 4.2.356) objects.
EXAMPLE 174 -- For the development of a new car, a project is set up that is responsible for the development decisions as well as for the accounting of the costs.
yes
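
The two aggregation steps visible in the transcripts above (coordinating subjects that share a predicate and object, then coordinating objects that share a subject and predicate) can be illustrated with a small Python sketch. This is not ASTROGEN's Prolog implementation: the clause format (subject, verb, object), the function names and the tiny PLURAL table are assumptions made only for this illustration, and only the verb "has" is handled.

```python
# Illustrative sketch only; all names here are hypothetical, not ASTROGEN's.
PLURAL = {"has": "have"}  # assumed verb agreement table for this example


def coordinate(items):
    """Join items as 'a', 'a and b' or 'a, b and c'."""
    items = list(items)
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def aggregate_subjects(clauses):
    """Predicate and direct object aggregation:
    clauses sharing verb and object get a coordinated subject."""
    groups = {}
    for subj, verb, obj in clauses:
        groups.setdefault((verb, obj), []).append(subj)
    return [(tuple(subjs), verb, obj) for (verb, obj), subjs in groups.items()]


def aggregate_objects(clauses):
    """Subject and predicate aggregation:
    clauses sharing subject(s) and verb get a coordinated object."""
    groups = {}
    for subjs, verb, obj in clauses:
        groups.setdefault((subjs, verb), []).append(obj)
    return [(subjs, verb, tuple(objs)) for (subjs, verb), objs in groups.items()]


def realize(subjs, verb, objs):
    """Turn one aggregated clause into a sentence string."""
    if len(subjs) > 1:
        verb = PLURAL.get(verb, verb)
    text = f"{coordinate(subjs)} {verb} {coordinate(objs)}."
    return text[0].upper() + text[1:]


def generate(clauses):
    """Apply both aggregation steps, then realize each clause."""
    merged = aggregate_objects(aggregate_subjects(clauses))
    return [realize(s, v, o) for s, v, o in merged]


if __name__ == "__main__":
    example = [
        ("a project", "has", "a string_select description"),
        ("a project", "has", "an undefined_object id"),
        ("a project", "has", "a string_select name"),
        ("a project_relationship", "has", "a string_select description"),
        ("a project_relationship", "has", "a project related"),
        ("a project_relationship", "has", "a project relating"),
    ]
    for sentence in generate(example):
        print(sentence)
```

Run on the six example clauses, the sketch merges the shared description clause into one sentence with a coordinated subject and collapses the remaining attributes of each entity into one sentence per subject. Sorting, pronominalization and canned text, which the transcripts also show, are left out.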