Laugh to me: exploring computational humour

André Filipe Rodrigues de Carvalho

Dissertation submitted to obtain the Master Degree in Information Systems and Computer Engineering

Jury
Chairman: Prof. João Paulo Marques da Silva
Supervisor: Prof. Ana Maria Severino de Almeida e Paiva
Member: Prof. David Manuel Martins de Matos

May 2012

Acknowledgements

I would like to start by thanking my supervisor, Professor Ana Paiva, for giving me the chance to tackle this unconventional theme. Thanks also to all those at GAIPS, especially to António Brisson, whose input and motivation made it possible for me to deliver this dissertation.

I would also like to thank the Jovens da Ramada group (it would be hard to mention all the names), for all they taught me over the years, and for the words that stuck with me whenever I needed courage: “tracei um projecto, por ele vou lutar” (“I set out a plan, and I will fight for it”). Special thanks to Alexandre Oliveira, for all the motivation and for making me believe in myself, and to Nuno “Formiga” Filipe, for his input and the late-night debugging session.

Thanks to all those amazing people I met in my journey at IST. Special thanks to those who shared my destiny, Henrique Campos, Henrique Reis and Joana Almeida, for making me feel less alone, and for all the solidarity and occasional madness (madness I tell you!).

Thanks to all my friends, Bruno, Pires, André, Calado, Magui and others, for their support and for helping me keep my sanity, the little I had left.

Thanks to all the comedians who have made my life better over the years by keeping me laughing.

And last but not least (quite the contrary), a big thank you to my family, who asked me way too many times how the thesis was going but gave me all the support and love I needed.

Porto Salvo, May 14th 2012
André Filipe Rodrigues de Carvalho

To Margarida, who taught me the special relativity theory of life. I miss you.
Abstract

The Artificial Intelligence community has shown an increasing interest in Interactive Storytelling. However, most of the efforts have been in the drama genre. Attempts at designing comedy systems acted by autonomous agents have been scarce, even though humour plays an important role in areas such as entertainment. This thesis aims to contribute to the creation of Interactive Comedy systems.
We start by providing a background in humour theories from several fields, such as linguistics, philosophy and psychology, as well as the basics of comedy writing. Using concepts from this background as well as notions of Affective Storytelling, we propose a conceptual model of a sketch to be acted by autonomous agents. In this model we discuss how to author comic agents and scenarios. We consider agents not only as characters, but also as actors that can actively select actions based on their emotional output in order to contribute to an Emotional Escalation and thus provide a proper buildup for a punchline. Additionally, by controlling how the Emotional Escalation develops using parameterized guidelines, the agent can also set the pace of the sketch. The conditions for the punchline to be triggered emerge from the interaction between characters. To test the feasibility of our model we implemented a prototype that is capable of showing a structured comic sketch that builds up to the punchline.

Keywords

Emotional Escalation
Emotional Guidelines
Computational Humour
Affective Storytelling
Autonomous actor agents

Contents

1 Introduction
  1.1 Motivation
  1.2 Problem
  1.3 Document Outline
2 Background On Humour
  2.1 From Ancient Greece to the Renaissance
  2.2 The Superiority Theories
  2.3 Incongruity Theories
  2.4 Raskin and Attardo’s linguistic theories of humour
  2.5 Relief Theories
  2.6 Reversal Theory
  2.7 From Theory To Practice
    2.7.1 Comedy Writing Concepts
    2.7.2 Sketch Writing
3 Related Work
  3.1 Natural Language and Computational Humour
  3.2 Interactive Storytelling and Computational Humour
    3.2.1 Plan based IS systems
    3.2.2 Interactive Comedy
  3.3 Agents with Emotions and Affective Storytelling
    3.3.1 Affective Reasoner
    3.3.2 FAtiMA
    3.3.3 Madame Bovary
    3.3.4 Façade
4 Humour sketch model
  4.1 Incongruence
    4.1.1 Action Incongruence
    4.1.2 Emotional Incongruence
    4.1.3 Context Incongruity
  4.2 Comic characters
  4.3 Sketch Beginning And End
  4.4 Emotional Escalation
  4.5 Comic agent model
  4.6 Heuristic function
5 FAtiMA and ION Frameworks
  5.1 FAtiMA
    5.1.1 OCC model of emotions
    5.1.2 Appraisal
    5.1.3 Affect Derivation
    5.1.4 Agent behaviour
    5.1.5 Theory of Mind component
    5.1.6 Authoring files
  5.2 ION and Unity3D
6 Laugh To Me prototype
  6.1 Architecture Overview
  6.2 Emotional Goals and Guidelines
  6.3 Heuristic implementation
  6.4 Scenario Outline
  6.5 Scenario Authoring
    6.5.1 Action and goal authoring
    6.5.2 Character authoring
    6.5.3 Emotional goals authoring
  6.6 Prototype Output
    6.6.1 Seller with Distress guideline only
    6.6.2 Seller with Distress and Gloating guidelines
    6.6.3 Client with Joy and Pride guidelines
  6.7 Animation system
    6.7.1 Architecture Overview
    6.7.2 Expressing emotions
7 Evaluation
  7.1 Test structure
  7.2 Seller behaviour
  7.3 Client behaviour
  7.4 Sketch overview
8 Conclusion
Bibliography
A Seller Appraisal Rules
  A.1 Selected in prototype
  A.2 Not selected in prototype
B Client Appraisal Rules
  B.1 Selected in prototype
  B.2 Not selected in prototype
C Script (Without Gloating)
D Script (With Gloating)
E Questionnaire

List of Figures

3.1 STANDUP interface
3.2 Pink Panther Interactive Comedy system
3.3 Madame Bovary Interactive Storytelling system
4.1 Conceptual model
5.1 Affect derivation from appraisal
6.1 Agents architecture overview
6.2 Emotional goal DispleaseClient with Distress only
6.3 Emotional goal DispleaseClient with Distress and Gloating
6.4 Emotional goal KeepPride
6.5 Animated system screenshot
6.6 Animation system architecture overview
6.7 Example sprite for the eyes
7.1 Seller’s feeling evolution throughout the sketch
7.2 Box plots for Seller’s behaviour
7.3 Client’s feeling evolution throughout the sketch
7.4 Box plots for Client’s behaviour
7.5 Box plots for Sketch
7.6 Humour compared to length
7.7 Client’s feeling evolution throughout the sketch

List of Tables

2.1 Summary of different theories of humour, according to their main claims, arguments and criticisms.

Chapter 1 Introduction

The dictionary defines humour as “the quality of being amusing or comic” [22], and defines both amusing and comic as the quality of what causes laughter or is meant to do so. We can thus think of humour as a process of evoking laughter. It is not, however, the only one: tickling someone, for example, can hardly be considered an example of humour, and laughter itself has other functions besides being a response to humour [55]. Rod A. Martin proposes a definition of humour from the psychological point of view comprising four elements: a “social context”, a “cognitive-perceptual process”, an “emotional response” and the “vocal-behavioral expression of laughter” [44].
From these points we focus especially on the cognitive-perceptual process. This is related to the production of humour and to how and why we perceive some things as humorous. The mental process we use to process information (ideas, words, actions), and what makes that information funny, is still a topic of debate and has been so for millennia. This thesis focuses on how humour works and how actions are sequenced in a humour sketch, in order to extract a model that autonomous agents can use to act out a sketch.

The field of Interactive Storytelling concerns systems that allow the user to interact with and influence the storyline. Interactive Storytelling has been a topic of growing interest in the Artificial Intelligence community, and there have been a number of works in the area, such as Façade [45] or FearNot! [7]. The majority of these, however, are interactive dramas. Very few are comedies, and the question of how humour can be modeled in the context of Interactive Storytelling has only been marginally approached. The history of Computational Humour (the subfield of AI that concerns the production of humour) as a whole is somewhat richer, but deals primarily with Natural Language systems.

In this thesis we propose establishing a connection between a sketch and its emotional content. We think of the agents as both characters and active actors who play the roles of those characters and contribute actively to the emotional content of the scene. As characters, we distinguish between comic and regular characters according to their personality and emotional reactions. As actors, we introduce the concept of Emotional Escalation, a mechanism that agents use to select their actions during the sketch, providing a more interesting ground on which to reveal the punchline. By doing this we attempt to build agents that are able to control the pacing and timing of the sketch. We also implement a prototype of the model as a proof of concept.
1.1 Motivation

Humour does not merely happen during social interactions; it is also relevant to those interactions, occurring in situations ranging from everyday conversations to business meetings. It is a big part of who we are as humans.

It has been suggested that the role humour plays in human-human interactions can be adapted to improve human-computer interactions as well [47], [61]. Indeed, there have been studies indicating that it may improve human-computer interfaces, making the experience more likeable without increasing task completion time [46]. This also suggests it may play an important role in building user trust, a crucial aspect of e-commerce applications, for instance [47]. Humour also helps people cope with frustration, a property which Kim Binsted has suggested is useful in compensating for the natural shortcomings of every program and interface (Binsted specifically approaches the issue in terms of Natural Language interfaces) [9]. Even in criticism, humorous or ironic messages are perceived as less offensive than a direct message with the same meaning [44], a property that may be useful in tutoring systems.

Another quality of humour is how it relates to memory. When both humorous and non-humorous material is present, the humorous parts are more easily remembered. This property may also be of interest in advertising. Stock and Strapparava consider advertising to have a “great potential for the adoption of computational humour” [63].

The main application of Interactive Storytelling is in games, including serious games (games in which the main purpose is not entertainment), such as FearNot! [7], designed to help children develop strategies against bullying, or IN-TALE [58], which deals with a military-leader training scenario. Serious games can benefit from the aforementioned properties of humour, such as building trust and helping cope with frustration.
Humour may also be important for agent believability, as suggested by Dufresne and Hudon [24], due to its vital role in social interactions.

1.2 Problem

Interactive Comedy works have proposed some mechanisms that lead to sequences of actions that may be perceived as humorous. No attempt has been made, however, to build a computational model of a comedy form. We therefore propose to tackle the task of modeling a humour sketch, a comedy form that consists of a short scene that develops from a certain initial premise.

Stock and Strapparava describe humour as AI-complete, meaning that the “modeling of humour in all of its facets” is as hard to solve as the most difficult Artificial Intelligence problem [63]. In the same way, we cannot consider the problem of modeling a sketch in all its aspects. Rather, we focus on the structure of the sketch and on how actions should be sequenced in order to maximize its comic value. Thus, we propose to answer the question: “How can we endow autonomous agents with the capacity to create a comic situation or sketch, especially concerning its pace and timing?”

Our hypothesis is that we can handle this problem by analyzing how emotions evolve during a comedy sketch. Agents could be actors that deliberately choose actions to produce an emotional output that mimics that of a comedy sketch. Thus our research hypothesis is: “Agents can control the pacing and timing of a sketch by using emotions to define the evolution of the story and contributing actively with their actions to produce an emotional escalation.”

The goal of this thesis is to make a specific contribution to modeling some of the constraints of humour, particularly those related to delivery in the context of a comedy sketch.

1.3 Document Outline

We start with an overview of humour theories and of how thought on the subject has evolved, in chapter 2. We also take into consideration some practical advice on comedy writing.
We focus especially on how this relates to sketch comedy writing. In discussing sketches we take the perspective of the character, with a view to building a sketch model centered on agents.

We proceed to consider the state of the art in Computational Humour in chapter 3. We discuss the Natural Language works that have been the main focus of Computational Humour, as well as the Interactive Comedy systems that have been proposed. We also relate the contributions of these works to the background we provided on humour. Finally, we present works in Affective Storytelling in which the agents contribute actively to the development of emotions in the scene, or which use either emotions or a concept of story tension to control how the story proceeds.

We discuss our model in chapter 4. We present a conceptual model of a sketch and of elements of humour. We also provide a generic model of a comic agent and of how it can be represented so as to interact with other agents and develop a sketch from that interaction. We introduce a categorization of incongruities, which we use as a basis to explain how a comic character differs from a regular character and to establish a comic premise and a punchline. We also define Emotional Escalation and propose the concepts of Emotional Goals and Guidelines, which relate story time to the emotions the agent wants to bring about in the sketch. We define a heuristic function that the agent can use to select actions according to these Emotional Guidelines.

We follow this by discussing, in chapter 6, how we implemented the proposed model. In doing so we refer to the frameworks we used to create the agents, which are analyzed in chapter 5. Our implementation is an extension of FAtiMA, an architecture for agents with emotions. We also discuss how we connected the agents’ FAtiMA mind with an animation system capable of expressing the agents’ emotions.
We used the animations produced by our prototype to evaluate how our implementation relates to our model and to study some aspects of that model; we discuss this in chapter 7. Finally, in chapter 8, we summarize the contributions of this thesis and discuss future work.

Chapter 2 Background On Humour

There has been plenty of interest in the topic of humour in fields such as philosophy, psychology [44], linguistics [5] and literary criticism [64], among others. An impressive number of theories have been devised. However, they are not concrete enough to be directly translated into a computational system. Nonetheless, these theories still help contextualize some processes found in Computational Humour systems.

We start by presenting theories on humour and comedy developed up to the Renaissance. We divide the theories developed thereafter into four categories: superiority theories, incongruity theories (including the linguistic theories of Raskin and Attardo [5]), relief theories (we chose to include the arousal theories in this group) and reversal theory (which is a play theory of humour). We also discuss two beginners’ books on comedy, Comedy Writing Step By Step by Gene Perret [50] and The Comic Toolbox by John Vorhaus [68], which present somewhat more practical advice that is common knowledge in the comedy writing community.

2.1 From Ancient Greece to the Renaissance

In the literature, Plato’s (c. 429-349 BC) remarks on comedy and humour are usually regarded as the first ever written [64]. Plato’s view was that excessive laughter was inappropriate for a citizen in a republic. Aristotle (384-322 BC) showed much the same view in his Poetics (c. 330 BC) [4]. In considering Poetics, one must note that a reported second book of that work, mainly devoted to comedy, was never found. In what is known of Poetics, Aristotle refers to comedy in opposition to tragedy, with the purpose of better explaining the latter.
Aristotle presents comedy as a “species of the ugly”, giving the example of the comic mask, which was “ugly and distorted”. Thus, for Aristotle, “Comedy aims at representing men as worse, Tragedy as better than in actual life.” Aristotle’s view of humour and laughter, however, was slightly more positive than Plato’s. While he also condemns the excessive use of humour, Aristotle is able to see its possibilities as a tool to captivate the listener. Another important work by Aristotle concerning humour is his Rhetoric (c. 322 BC), where he discusses several types of jokes. In Rhetoric, while he considers irony suitable for use by the speaker, he finds buffoonery inappropriate and to be avoided.

The book that Aristotle wrote about comedy, if there ever was one, was never found, as mentioned before. However, some researchers claim that the contents of the lost book are summarized in a Greek text known as the Tractatus Coislinianus. These and other works of the sort were a big influence on Cicero (106-42 BC), who approaches the topic of humour in De Oratore (55 BC). Both the Tractatus Coislinianus and De Oratore make a distinction between two types of humour that can be referred to as verbal and referential humour [5]. Both are verbally expressed, in the sense that they are usually conveyed through language (in contrast with physical humour). The verbal type of humour relies solely on wordplay. Referential humour, on the other hand, uses language only to present the joke, and the humour resides in the situation or observation presented. This distinction has often been used in the literature, although the exact terminology sometimes varies. According to Cicero, what separates these two joke types is the fact that in verbal humour the phrasing cannot be altered. Thus, verbal humour is much harder (or even impossible) to translate.

The Middle Ages (5th-15th centuries) were not very productive concerning texts dealing with humour itself [64], [5].
Nevertheless, this was the time when the definition of comedy started to broaden [64]. While in the Classical Era it meant a very specific dramatic form, relying on somewhat formulaic plots and stereotypical characters, during the medieval period the word comedy started to apply to various works in both verse and prose.

The Renaissance marked a revival of the classical sources already mentioned. One example is the treatise on humour by the Italian philosopher Vincenzo Madius (1498-1564), De Ridiculis, where he builds upon the work of Aristotle and Cicero [5]. One of the most interesting original remarks by Madius concerns what could be translated as the “surprise” element of humour. Madius argues that “the cause of laughter does not reside only in ugliness”, as suggested by Aristotle, “but it is also the work of surprise” (as quoted by [5]). The dramatists Giangiorgio Trissino (1478-1550) and Bernardino Pino da Cagli (c. 1525-1601) also commented on the thoughts of Aristotle, Cicero and Horace (another Latin thinker who was very influential on the Renaissance view of humour), interpreting the “ugliness” that classical authors associated with comedy as a “lack of proportion”.

2.2 The Superiority Theories

The superiority theories (also called disparagement, aggression or degradation theories) may be rooted in the ideas of Aristotle and Plato. Plato described the relation between aggressive feelings and humour in Philebus, stating that “when we laugh at the folly of our friends, pleasure, in mingling with envy [...]; and so we envy and laugh at the same instant” [53]. But the paradigmatic quotation behind this group of theories is that of Thomas Hobbes (1588-1679) in Human Nature, the first part of his The Elements of Law, Natural and Politic. He argues that laughter “is nothing else but a sudden glory arising from sudden conception of some eminency in ourselves, by comparison with the infirmities of others, or with our own formerly” (as quoted by [64]).
Thus, according to Hobbes, humour resides in a feeling of superiority over someone else or over our own past selves.

More recently, Charles Gruner, a professor of speech communication at the University of Georgia, developed his own superiority theory of humour [34], [44]. According to Gruner, humour is a form of aggression that arises in a playful context. It is unlike “real” aggression in the sense that there is no physical harm involved, and it works more in a game-like way. As such, Gruner argues that there is always a winner and a loser in every joke, and that we enjoy humour in much the same way we enjoy winning.

A usual remark by researchers who disagree with this aggression theory is that it does not seem to apply to several simple riddles and puns. Gruner, however, argues that riddles and puns are used to display one’s facility with words. Also, according to Gruner, when joking about ourselves we are showing the superiority of our current self over our past self. Thus, we can only joke about being lazy if we are not feeling lazy at the moment of the joke, and we can only laugh at our mistakes when, and because, we feel they are behind us. In Gruner’s perspective, even the most seemingly innocent jokes display some sort of aggression, even those that at first appear to rely solely on absurdity and nonsense.

Finally, though there seems to be evidence that aggressive elements play an important role in several forms of humour, the superiority theory does not seem to cover all kinds of humour. Gruner’s theory relies on such a broad definition of aggression that it is able to encompass all human behaviour, and not just humour. Such a concept of aggression becomes irrelevant in actually explaining why a certain joke is funny. In fact, this view of humour exclusively as aggression is nowadays considered outdated. However, it is generally agreed that humour can be used as a form of aggression.
Indeed, recent research shows that humour can be both aggressive and pro-social at the same time.

2.3 Incongruity Theories

The Italian Renaissance philosopher Vincenzo Madius was already quoted in section 2.1 as saying that an element of surprise is essential to humour, something Aristotle had already suggested in Rhetoric [3]. Perhaps the best description of what surprise and incongruity mean in humour is the one by Immanuel Kant (1724-1804), who describes laughter as “an affection arising from the sudden transformation of a strained expectation into nothing” [37]. Arthur Schopenhauer (1788-1860) further defined incongruity as the incompatibility between a general concept and a particular case to which that concept does not apply. Arthur Koestler builds on this idea, suggesting the concept of bisociation [38]. Bisociation occurs when a given concept or event can be seen, at the same time, as pertaining to two normally incompatible frames of reference that are, however, each totally consistent with that same concept or event.

Koestler, like many other supporters of incongruity theories, considers incongruity necessary, but not sufficient, for humour to occur. According to the incongruity-resolution theories, for a joke to work the incongruity must make sense, that is, it must be resolved ([44], pg. 64). Thomas Shultz, at McGill University, developed one such theory, in which the punchline creates an incongruity that contrasts with what was suggested by the set-up of the joke ([44], pg. 64). One must go back, search for an ambiguity in the set-up and interpret it in a different way in order to get the joke. Another two-stage theory was proposed by Jerry Suls from the University of New York at Albany, who viewed the perception of humour as a problem-solving process ([44], pg. 64-66). According to this theory, the listener predicts the likely outcome during the joke set-up.
As the punchline does not match the listener’s expectation, he must find a cognitive rule that solves the problem. If he fails to find such a rule, he will be confused rather than amused by the joke.

Some cognitive researchers, however, think incongruity alone may be enough to make a joke work and that resolution is not necessary at all. The Swedish psychologist Göran Nerhardt developed an experiment which he called the “weight judgement paradigm” ([44], pg. 68-69). In his methodology, participants, unaware of the true goal of the study, are told to compare identical-looking weights against a standard reference weight. At first all the weights evaluated are very similar. Then participants are given a weight that is much lighter or heavier than the standard. The surprise frequently led to humorous reactions among the participants, ranging from smiles to laughing out loud. Nerhardt and others have argued that the findings of these experiments show that incongruity alone suffices for humour.

Yet empirical evidence does not seem to support the view, defended by Shultz and Suls, that the funniest jokes result from the most surprising endings. On the contrary, it seems that predictability is favoured over surprise in matters of humour ([44], pg. 71).

Finally, another frequent criticism of incongruity theories, whether they include resolution or not, is that they focus solely on the cognitive aspects of humour. Social and emotional aspects of humour are almost completely disregarded by these theories, which thus fail to explain some aspects of jokes. A reversal theory (see section 2.6) would explain the “weight judgement” phenomenon as a shift from a serious context (a scientific experiment) to a non-serious one (“ridiculously” different weights). Thus, this type of humour may not be related only to the cognitive process that seems to be the scope of incongruity theories.
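The two-stage account attributed to Suls in this section can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not part of the thesis model: the names Joke, Rule, DOUBLE_MEANINGS, double_meaning_rule and appraise are hypothetical, the "cognitive rule" is reduced to a lookup in a toy lexicon of ambiguous phrases, and the listener's prediction is supplied as a plain function.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Joke:
    setup: str
    punchline: str


# Hypothetical stand-in for a "cognitive rule": given a joke, return a reading
# that reconciles the punchline with the set-up, or None if the rule does not apply.
Rule = Callable[[Joke], Optional[str]]

# Toy lexicon of ambiguous phrases (illustrative data, not taken from the thesis).
DOUBLE_MEANINGS = {
    "outstanding in his field": "excellent at his job / standing out in a field",
}


def double_meaning_rule(joke: Joke) -> Optional[str]:
    """Resolve the incongruity if the punchline contains a known ambiguous phrase."""
    for phrase, readings in DOUBLE_MEANINGS.items():
        if phrase in joke.punchline.lower():
            return readings
    return None


def appraise(joke: Joke, predict: Callable[[str], str], rules: Sequence[Rule]) -> str:
    """Two-stage appraisal: detect the incongruity, then try to resolve it."""
    # Stage 1: the listener predicts an outcome while hearing the set-up.
    expected = predict(joke.setup)
    if joke.punchline == expected:
        return "no surprise"  # the expectation was met, so nothing is incongruous
    # Stage 2: search for a rule that makes the punchline follow from the set-up.
    for rule in rules:
        if rule(joke) is not None:
            return "amused"
    return "confused"  # incongruity detected but never resolved


joke = Joke(
    setup="Why did the scarecrow win an award?",
    punchline="Because he was outstanding in his field.",
)


def literal_listener(setup: str) -> str:
    """A listener who expects a literal, non-humorous answer."""
    return "Because he scared the most crows."


print(appraise(joke, literal_listener, [double_meaning_rule]))  # amused
print(appraise(joke, literal_listener, []))  # confused
```

With a rule that resolves the ambiguity the listener ends up amused; with no applicable rule the same joke only confuses, which is the outcome the theory predicts for an unresolved incongruity.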
2.4 Raskin and Attardo’s linguistic theories of humour

According to cognitive science, our mind organizes information into knowledge structures called schemas ([44], pg. 85-86). A schema describes general characteristics of objects or events, containing several slots that can describe specific types or instances of that object. For example, a bird would have slots for the beak, wings, feet, etc. Slots may also be filled with default values that represent the typical object. If, for example, we hear a story about a bird, the typical bird will be the one to come to mind. Scripts, as introduced in AI by Roger Schank and Robert Abelson, are considered specific types of schemas that deal with routine activities. An example given by Schank and Abelson is a restaurant script, which comprises the events involved in going to a restaurant: sitting at a table, ordering from the menu, and so on. In hearing a narrative about someone going to a restaurant, this script is triggered, allowing the listener to fill in the blanks. The script also accounts for what is appropriate and relevant in going to a restaurant. A story that defies these expectations is perceived as incongruous.

Victor Raskin’s Semantic Script Theory of Humour (SSTH) explains humour by using a notion of incongruity based on scripts ([44], pg. 89-90). According to Raskin, the resolution of the incongruity between the punchline and the set-up of the joke happens when shifting from the original script, conveyed in the set-up, to another that makes sense of the punchline. The script opposition can be manifested by pairs such as good versus bad or life versus death.

The SSTH was later revised by Raskin and Salvatore Attardo, creating a new, more generic theory called the General Theory of Verbal Humour (GTVH) ([44], pg. 91-92). This theory represents jokes as a hierarchical group of six Knowledge Resources (KRs), from the most abstract to the most concrete.
These KRs are thought to be related to the cognitive representation of humorous texts, thus going beyond script opposition as the only relevant factor for humour cognition. The most abstract KR is precisely Script Oppositions. Logical Mechanisms refers to the joke techniques used to activate the alternate script. Situations refers to the people, objects and activities involved in the joke. Targets, which are not necessarily present in all jokes, concerns the subject of the joke. Narrative Strategies refers to the format of the joke, for example, a riddle. Finally, the most concrete KR is Language, which refers to the actual wording of the joke.

2.5 Relief Theories

Theories of relief draw inspiration from Herbert Spencer (1820-1903), who described laughter as a release of excess nervous energy ([44], pg. 33-36). One of the most important and influential relief theories of humour is that of Sigmund Freud (1856-1939), whose psychoanalytic theories were dominant in psychological research in the first half of the 20th century. He saw laughter as the release of an excess of psychic energy. The Freudian theory distinguishes between three types of laughter-evoking situations: wit, humour and comic. Each of these was associated with a mechanism that allowed some psychic energy to be released in the form of laughter. With wit, which Freud related mainly to canned jokes, the mechanism was a cognitive trick, for example, a wordplay, that would distract our superego, or “conscience”, normally responsible for inhibiting our actions according to some sort of moral code. In this case, it is the energy normally involved in the inhibition of sexual and aggressive impulses that is released as laughter. By humour Freud meant the ability to be amused by the incongruous elements found in situations that would usually lead to negative or painful emotions.
Thus, the energy released is, in this case, the energy that would usually be associated with such emotions, and that became redundant as a result of this altered perspective of the situation. The comic would happen in situations where you predict an outcome, and your expectations are defeated. In this case, mainly sourced in non-verbal situations, it is the energy you invested in anticipating what would happen that leads to laughter. Freud’s theories, however, found somewhat limited support in empirical investigation ([44], pg. 41-42). Yet they cover a number of very important aspects, such as the importance of humour as a defence mechanism in coping with stress and in enabling the discussion of topics usually deemed inappropriate.

Several theories relating humour and arousal appeared during the 1960s and 1970s. One such theory is that of Daniel Berlyne ([44], pg. 58), from the University of Toronto, based on a known inverted-U relationship between subjective pleasure and arousal, that is: too much or too little arousal is bad, the optimal level being something in between. According to Berlyne there are two arousal-related mechanisms in humour. The arousal boost mechanism occurs during the telling of a joke, when arousal is elevated to the optimal level. The arousal jag mechanism happens when the arousal has been elevated past the optimal level. The joke punchline takes the level of arousal back to a pleasurable level. Berlyne saw these changes of arousal as the cause behind laughter, in contrast with the relief of some kind of stored energy, as suggested by Spencer and Freud. Experimental results have not confirmed Berlyne’s theory completely but yielded some interesting results ([44], pg. 59). The relation between arousal level and humour seems to be linear rather than an inverted U, that is, more arousal is directly related to more enjoyment. According to Rod Martin “humour itself is an emotional response that is accompanied by increases in arousal” ([44], pg.
62).

2.6 Reversal Theory

The American writer Max Eastman argued, in his book Enjoyment of Laughter [25], that most theories fail to explain humour properly because they fail to take humour as a type of play. He wrote that “no definition of humor, no theory of wit, no explanation of comic laughter, will ever stand up, which is not based upon the distinction between playful and serious.” More recently, this view has been taken by several humour theorists. As discussed in section 2.2, Gruner, though a supporter of aggressive theories, recognized humour differed from other forms of aggression due to its playful nature. Berlyne, mentioned in section 2.5, saw changes in emotional arousal as the main cause of humour, but also accounted for the importance of a context of play.

Anglo-American psychologist Michael Apter has proposed the reversal theory, a theory of humour which revolves around the concept that humour is play ([44], pg. 75-77). Apter defines play as a special mental disposition to see the world and our own actions regarding that world. Apter refers to this playful state of mind as the paratelic state, as opposed to the telic state, in which we undertake more serious activities. According to Apter, we reverse from one state to the other several times over the course of a day. The two states are distinct in several aspects. In the telic state the goals are more important than the means, while in the paratelic state one enjoys the activities for their own sake. Contrasting with arousal theories such as Berlyne’s, Apter defends that we have different reactions to arousal in the serious and non-serious states. When in the telic state, high arousal means anxiety, and is unpleasant, and low arousal means relaxation. In the paratelic state one seeks excitement, or high arousal, over boredom, or low arousal.
Apter also addresses both the aggression and Freudian theories of humour, claiming that aggressive and sexual themes increase arousal, which is seen as pleasurable in a paratelic state. The cognitive aspects of humour, on which the incongruity theories are based, are also addressed by the reversal theory. Apter’s view is that the cognitive process exposes contradictory ideas or concepts of the same object, and that this is regarded as good in the paratelic state because they supposedly increase arousal. However, he disagrees that there is some kind of resolution to this incongruity, arguing instead that two opposite ideas are simultaneously recognized, and that the emphasis is on the creation of the incongruity (the “synergy”, as Apter calls it) rather than on resolving it.

By addressing some of the issues of other theories of humour, reversal theory seems to account for many of their flaws, and most experimental studies seem to support this theory ([44], pg. 80-81). It also provides an explanation for the role of humour in coping with stress and for some of the everyday humour that exists in conversations, not just canned jokes. However, this theory is more recent and less well known among humour researchers than the other theories discussed in this section.
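The way reversal theory ties the experience of arousal to the current mental state can be made concrete with a small sketch. The code below is only an illustration of the four cases described above (anxiety, relaxation, excitement, boredom); the numeric arousal scale, the `threshold` parameter and the function names are assumptions introduced here and are not part of Apter's formulation.

```python
from enum import Enum


class State(Enum):
    TELIC = "telic"          # serious, goal-oriented state
    PARATELIC = "paratelic"  # playful state, activities enjoyed for their own sake


def felt_emotion(state: State, arousal: float, threshold: float = 0.5) -> str:
    """Label the experience of a given arousal level in a given state.

    The 0..1 arousal scale and the threshold are hypothetical choices
    made for this sketch only.
    """
    high = arousal >= threshold
    if state is State.TELIC:
        return "anxiety" if high else "relaxation"
    return "excitement" if high else "boredom"


def is_pleasant(state: State, arousal: float, threshold: float = 0.5) -> bool:
    """High arousal is pleasant only in the paratelic state, low only in the telic one."""
    return felt_emotion(state, arousal, threshold) in {"relaxation", "excitement"}
```

Under these assumptions the same high arousal value is labelled unpleasant in the telic state and pleasant in the paratelic state, which is the contrast with Berlyne's single optimal level that the text describes.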
| Theory | Claims | Main Arguments | Main Criticisms |
| --- | --- | --- | --- |
| Superiority | Humour is a form of aggression | Many jokes have aggressive themes | To encompass all jokes, the definition of aggression becomes so broad it pertains to almost everything |
| Incongruity-Resolution | Two-step model of humour: need to resolve an incongruity | Explains well the cognitive process involved in jokes | Empirical results show some predictability may actually benefit humour |
| Incongruity | Incongruity alone is enough to create humour | Some experiments seem to show resolution is optional in some cases | Some jokes are better explained by a two-phase model |
| Tension-Relief | Humour is the release of accumulated psychic energy | Many jokes have sexual themes | Limited evidence |
| Arousal | Humour is due to changes in the arousal level | There seems to be a correlation between arousal and humour | Arousal increases linearly with humour |
| Reversal | Humour is a type of play and we change between playful and serious mental states during the day | Experimental results back up the theory, which incorporates some processes described by other theories | It’s still not very well known |
| SSTH | Linguistic theory based on Script Opposition | Explains certain types of linguistic humour | It’s very specific, since it considers only Script Opposition |
| GTVH | Linguistic theory based on Knowledge Representation | It’s more generic than SSTH | Like SSTH, some details are not precise enough to translate into an implementation |

Table 2.1: Summary of different theories of humour, according to their main claims, arguments and criticisms.

2.7 From Theory To Practice

Considering the theories presented (summarized in table 2.1), most are not concrete and precise enough to translate into a concrete implementation. Beginner’s books on comedy offer less formal but often more practical advice. The validity of these books is sometimes questioned by those in the comedy writing business.
For example, Portuguese comedian Nuno Markl said in an interview that “it always struck me as odd that people write books on how to make humour” [1] (footnote 1). Gene Perret, who was head writer for the American comedian Bob Hope for many years and authored several books on comedy writing, argues that the existence of comedy writers contradicts their own statement. As he puts it “they weren’t born comedy writers, so they must have learned somehow” ([50], pg. 10). We take into account two such books, Comedy Writing Step By Step by Gene Perret [50] and The Comic Toolbox by John Vorhaus [68]. We start by discussing how some of the comedy writing concepts show up in both of these books, and then consider their discussion of sketch writing. Taking an overview of other books on comedy, interviews with comedians, and the similarities between the two books presented here, one can conclude that much of the advice provided by Perret’s and Vorhaus’s books is part of the common knowledge in the comedy writing community, though the actual nomenclature of the terms used may vary.

2.7.1 Comedy Writing Concepts

Incongruity is a concept that seems to be necessary to humour. Perret states that normally “a joke comprises two different ideas that come together to form one” [50] (echoing Koestler’s concept of bisociation). We consider two main uses of incongruities in comedy writing: as comic premises and as punchlines. Comic premises are the initial idea behind a comic scene or a joke, such as a man interviewing a dictator about his love of botany. The punchline is the ending part of a joke or a scene, which resolves the incongruity and provides the humour.

John Vorhaus, whose perspective on comedy writing is strongly character-centric, introduces the concept of strong comic perspective in relation to comic characters (pg. 42). A strong comic perspective is a point of view held by a character that is unique and related to specific traits of his or her personality. This point of view is unlike that of a normal person.
Vorhaus also points out that comedy often arises when two characters that are what he calls comic opposites are joined together. In this case, joining these characters provides the comic premise. He gives as an example “a college nerd and a party beast” (pg. 52) who are roommates. He also points out, however, that not all comic opposites generate interesting situations.

In introducing the strong comic perspective concept, Vorhaus stresses the word strong. In comedy everything should be exaggerated, another point that is agreed upon by Perret, who says “humorous exaggeration sometimes (...) [states] a case more powerfully than brilliant oratory” (pg. 104). We can see exaggeration as a method to achieve incongruity, because, as we exaggerate features of the character or of the situation they find themselves in, the gap between the comic reality and normal reality grows.

An important factor in comedy is building up the tension (set-up) before delivering the joke (pay-off). Perret, whose book is mainly directed at stand-up comics, also makes the point that separate one-liners can work better if sequenced and grouped in a routine (pg. 85). According to Perret, this also helps to cover for weaker jokes, which are laughed at, and seem funnier, because the spectator gains momentum from previous lines. This is consistent with Provine’s observations that laughter is contagious and people deem the jokes they laugh at funnier than those they do not [55].

Comic stories such as sketches have the same purpose of providing a structure for jokes. As Perret puts it “jokes are the building blocks of comedy” (pg. 50). Vorhaus hints at another role stories play in comedy writing, which is establishing an emotional connection with the audience, for instance by giving characters certain qualities that make the viewer care for them.

Another word common to the books of Vorhaus (pg. 2) and Perret (pg. 103) is truth. They both state that “comedy is truth”, though Vorhaus adds “and pain” (which can be considered in relation to superiority theories). This implies precisely that a joke must tell something significant, something the viewers care about. As such, a joke must be given an appropriate context, which is one of the purposes of both a comic story and a stand-up routine.

Both Vorhaus and Perret are cautious about the importance of puns, since random puns by themselves do not have any significance for the viewer. Vorhaus (pg. 128) admits the use of a pun as a way to better a joke, as long as the pun itself is not considered to be a joke. Perret is more favourable to puns, saying they are not funny if they only depend on the wordplay, but that they can make a good joke when they create a good mental image, joining two contrasting ideas (pg. 64-65) (which, as stated, is what Perret views as the basis of any joke).

Another central concept in humour writing is attention to detail. Perret says jokes should have succinct set-ups and punchlines (pg. 122-124). In a long build-up, he notes one should have other jokes in between. Vorhaus suggests details add to comic value (pg. 129), raising interest. This should not be seen as contradictory. Perret also sees the ability to evoke mental images as crucial to comedy (pg. 19). In the example Vorhaus gives to illustrate how to add detail to a story, he makes several smaller jokes in the process. There is a fine balance between detail and verbosity.

A good reason to add detail is to build the tension of the scene. Both Perret and Vorhaus agree that comedy writing is the process of building up tension until a final release. Vorhaus stresses this fact often throughout the book, stating that the easiest time to joke is when characters find themselves in the most dire situations. Perret seems to agree as he remarks “a tense situation can give you the greatest straight lines” (pg. 131).

Footnote 1: “Sempre me fez alguma confusão que se escrevam livros sobre como fazer humor” in the Portuguese original.
2.7.2 Sketch Writing

Sketches are short, isolated scenes that develop a certain comic premise (a comic premise is the incongruity that composes the initial idea of a comic story). British comedians Mitchell and Webb consider sketches a good format to start on comedy writing [36]. While they present a lot of challenges specific to comedy, the fact they are short means one can worry less about intricate details of the characters, for example. As such, sketches seem a good format to use when considering a first prototype of Interactive Comedy.

Both Gene Perret’s Comedy Writing Step By Step ([50], pp. 153-193) and John Vorhaus’s The Comic Toolbox ([68], pp. 154-161) dedicate a chapter to sketch comedy. While Perret sees jokes as the building blocks of comedy, he remarks that a sketch is not just a collection of jokes, much like a house is not just a collection of bricks. Perret considers a good sketch must have “a premise; some complications; an ending, or, in other words, a beginning, a middle and an end” (pg. 154). Vorhaus presents 9 points to consider when writing a sketch, which we will further discuss and map onto the three parts of the sketch defined by Perret.

Both Perret and Vorhaus provide a similar account of what a premise should have. Vorhaus considers that the first step is to find a comic character, and then to find him a force of opposition (pg. 154-155). This force of opposition may be another comic character or a regular character. Then we need to set the context of the sketch and create a union, a relationship between characters. This lets us create a conflict while the characters still have some reason to stay together throughout the sketch. Conflicts may start as something simple, such as one character asking another for the time and the other refusing to say it (example by Vorhaus, pg. 155).

The second phase, what Perret describes as some complications, is the middle section of the sketch. In this phase the conflict escalates.
The characters will get more and more tense as the sketch progresses. The next step, according to Vorhaus, is to raise the stakes (pg. 156). The reason for the conflict now is something greater. According to Vorhaus, the greater the jeopardy the characters are in (the greater the tension), the better received the punchline (the relief) will be. Vorhaus insists on creating this tension, by saying one should “push the limits” (pg. 156). If a character is mad he turns completely insane. We should use exaggeration at this point, without being too afraid of logic gaps. This phase ends when the tension reaches its limit. Vorhaus describes this as seeking “an emotional peak” (pg. 156), when the tension gets unbearably high.

The third and last phase is what Perret simply dubs the ending, which he compares to the punchline of a joke. As Perret puts it “a sketch is like an elongated or acted-out story joke” (pg. 158). Vorhaus’s eighth of nine steps is to find a winner (pg. 157), which at first sight resembles Gruner’s view of humour as a game (see section 2.2). A “winner”, in this case, is someone who achieves his or her goal (e.g.: the character finally gets told the time). However, Vorhaus says that it does not matter if someone wins, if someone loses or if they both win or lose, in which he differs from Gruner. The only important thing at this point is that the sketch does end. The ninth and last of Vorhaus’s steps suggests changing the frame of reference (pg. 157). This concept once again traces back to Koestler’s concept of bisociation (see section 2.3) and Raskin’s concept of Script Opposition (see section 2.4). In this last step, we find the sketch was actually about something else (e.g.: the character was actually a foreigner who thought that “What time is it?” meant “How are you?”). Perret stresses the importance of the ending, and that it should consist of the best joke of the sketch (pg. 158). Perret also notes the ending should be compatible with the body of the sketch (pg.
160). This echoes the role of the resolution concept in incongruity-resolution (see section 2.3).

Chapter 3 Related Work

Computational Humour (the subfield of Artificial Intelligence concerned with the production of humour) has been mostly connected with works in Natural Language. We present these works here to give some context of how this area has developed over the years. We also give an account of the existing works in Interactive Comedy, which have been mostly based on superiority theories of humour and on comic situations in which an agent fails to perform an action. Finally we take a look at some works in Affective Storytelling that relate emotions with story progression.

3.1 Natural Language and Computational Humour

Language is arguably the preferred medium in humour, perhaps because it is both very expressive and very ambiguous at times. Unsurprisingly, most of the research in Computational Humour was done in the sub-field of Artificial Intelligence concerning Natural Language. The first working prototype of a Computational Humour system was probably Lessard and Levison’s 1992 Tom Swifties generator [39], which used the VINCI Natural Language generator. Tom Swifties are puns in which a character’s (Tom) quoted sentence is described by an adverb that bears relation to that sentence. For example: “‘This matter is opaque’, said Tom obscurely.” Following this work, Lessard and Levison presented another prototype for making riddles [40], which also used VINCI and worked in a similar manner.

LIBJOG was a system developed by Attardo and Raskin, which followed soon after [57]. It generated jokes simply by replacing the blanks in a light-bulb joke template. For example, the input (Poles ((activity1 hold the light bulb) (numberX 1)) ((activity2 turn the table he is standing on) (numberY 4))) would result in the joke “How many Poles does it take to screw in a light bulb? Five. One to hold the light bulb and four to turn the table he’s standing on.” (example taken from [10]).
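The template-filling approach attributed to LIBJOG can be sketched in a few lines. This is not LIBJOG's actual code, which is not reproduced in the sources cited here; it is a minimal illustration that assumes the slots named in the input above (a group, two activities and two counts). The function name, the `NUMBER_WORDS` table and the example group are hypothetical.

```python
# Words for the small counts a light-bulb joke typically needs (hypothetical table).
NUMBER_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
}


def light_bulb_joke(group: str, activity1: str, number_x: int,
                    activity2: str, number_y: int) -> str:
    """Fill the blanks of a fixed light-bulb joke template.

    The answer to the question is simply the sum of the two counts;
    no linguistic or humour knowledge is involved.
    """
    total = NUMBER_WORDS[number_x + number_y].capitalize()
    first = NUMBER_WORDS[number_x].capitalize()
    second = NUMBER_WORDS[number_y]
    return (
        f"How many {group} does it take to screw in a light bulb? "
        f"{total}. {first} to {activity1} and {second} to {activity2}."
    )


# Hypothetical input, mirroring the slot structure of the example in the text.
joke = light_bulb_joke(
    "comedians", "hold the light bulb", 1,
    "turn the table they are standing on", 4,
)
```

The sketch makes the limitation plain: all of the humour resides in the hand-written slot fillers, and the program only concatenates strings.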
The system JAPE, developed by Binsted and Ritchie at the University of Edinburgh [11], [10], was a more ambitious attempt, featuring WordNet [70], a much bigger lexicon than those used by previous pun generators. JAPE was able to produce certain types of simple punning riddles, of the form “What’s the difference between A and B?” and “How is A like B?”, for example. Each lexicon entry was associated with a lexeme (a unique symbol for the entry), its written form and other pieces of information, such as the words it rhymes and alliterates with, as well as other semantic and syntactic relations. In JAPE a lexeme was a basic abstract entity that conveyed the meaning of a word, or of a family of words. The lexeme also held information on which actions are associated with its object. The fact that certain actions were not related to that lexeme also needed to be stated explicitly. The joke is then generated from this information, according to a number of rules concerning the formation of simple punning riddles. Taking into account that an egg can be broken and boiled and a potato can be baked and broiled, and also the alliterations and rhymes present between these words, yields the joke “What’s the difference between a potato and an egg? One you bake and broil, the other you break and boil.” [10].

Figure 3.1: The STANDUP system interface: a) choosing the type of joke b) the joke itself c) the pun explained

JAPE was later used to develop STANDUP [43], a system aimed at helping children develop better language skills. STANDUP was based on a second version of JAPE, but it had some specific concerns related to its interface (see figure 3.1). The user was able to ask for a joke to be generated by choosing the type of riddle or the topic, for instance. The system could explain the pun if asked to, in order to improve the linguistic knowledge of the user.
STANDUP also introduced some improvements to the joke creation, such as avoiding repeating the same word in the question and in the answer of the riddle. STANDUP was tested with nine children with cerebral palsy, showing encouraging results [69]. The results should, however, be taken with care, because of how small the test group was.

Another important work in computational humour was Stock and Strapparava’s HAHAcronym, a simple prototype that sought to produce ironic acronyms [62]. HAHAcronym also used WordNet, extended with hierarchical domain information about the terms. These domains are used by HAHAcronym to explore the incongruity between groups of concepts, for example, Sex vs. Religion, drawing inspiration from the GTVH’s concept of Script Opposition (see section 2.4). Domains also allow simple ontological reasoning, such as knowing a country pertains to a certain continent. HAHAcronym makes use of the relations of antonymy and synonymy between words provided by WordNet. Once again, information about the phonetics of the words was added.

HAHAcronym has two modes of use: one is the generation of new acronyms, the other the reanalysis of existing acronyms in an ironic fashion. As far as the reanalysis of existing acronyms goes, several strategies may be adopted. For example, the reanalyzer may propose Catholic Ray Tube instead of Cathodic Ray Tube. In this case it chose to keep most of the original acronym (the main heads of the parse tree), changing only the adjective. The adjective was changed towards the domain of Religion, because the system was configured to produce choices in that domain. It also prefers words that rhyme with the original ones (Catholic/Cathodic) so that the acronyms will still sound similar. When creating new acronyms from scratch, the system accepts as input a main concept and an attribute.
For instance, main concept “processor” and attribute “fast” may result in the acronym “TORPID - Traitorously Outstandingly Rusty Processor for Inadvertent Data processing”. Note the use of the word TORPID in the acronym, an antonym of the property “fast”, while the object of the acronym is still the processor.

In considering computational humour works on natural language it is useful to refer to the distinction between verbal and referential humour (see section 2.1). As of yet, most systems still concern only verbal humour. According to Hempelmann et al. [35], JAPE suffers from the same problem as LIBJOG, as it only fills the blanks in a template. To solve this problem, they propose to build a system closer to the GTVH (see section 2.4) using ontological semantics. Though JAPE can actually make a pun, by using certain properties of a database of words, its puns do not have a context, and usually do not go beyond simple wordplay. As discussed in section 2.7, puns are usually considered unfunny if devoid of context. The successor of JAPE, STANDUP, puts the wordplay capabilities of JAPE to good use, and makes some improvements upon it, but it does not add much in terms of humour. Yet it is useful to see, in practice, potential applications of Computational Humour.

HAHAcronym can already be considered a cross between referential and verbal humour [59]. The acronyms can be generated according to pre-defined attributes, and will usually convey ironic expressions regarding those attributes (for example TORPID vs. the attribute “fast”). Nevertheless, context remains one of the main problems to solve in Computational Humour. An approach based on Interactive Storytelling, which we will discuss next, can supply this context through the inherent meaning of the story, and thereby addresses this problem.

3.2 Interactive Storytelling and Computational Humour

Interactive Storytelling systems are concerned with creating stories whose unfolding depends upon the user.
A common approach to Interactive Storytelling in AI has been to use autonomous agents to play the role of the characters. In some approaches agents are more actors than characters, that is, they make decisions based on information that is only available at a higher level of reasoning than that of a character.

3.2.1 Plan based IS systems

Planning plays a big part in Interactive Storytelling, either to formulate the plot or to formulate each character’s course of action. The main approach in existing Interactive Comedy works has been to have agents failing their plans. The comicality of this situation can be seen from various perspectives: according to the incongruity theory, it would be the gap between what the character wants and what he actually gets; according to superiority theories, this would be a situation of superiority over the characters who fail their intents.

Hierarchical Task Network and Heuristic Search Planning

Many different planning techniques have been proposed, two of which have been prominent in IS approaches to Computational Humour: Hierarchical Task Network (HTN) planning and Heuristic Search Planning (HSP). A comparison between these planning formalisms can be found in [19].

In HTN planning there is a hierarchical model of actions, from the most generic to primitive actions. The root node coincides with the plan goal, or main task, and the leaf nodes correspond to primitive actions. The main task is successively decomposed into subtasks, until these subtasks can be described by the primitive actions. A common representation for HTN is AND/OR graphs. The fact that OR nodes stand for different possible courses of action provides a potential for story variability. Concerning the AND nodes, there are three possibilities: assuming task decomposability, total ordering, or both. Task decomposability means that all the tasks are totally independent from each other.
That is to say, one action of the decomposition will not affect the others, neither by undoing what a different task accomplished nor by facilitating another task. Total ordering means that the tasks will be done in a fixed order. The total ordering of tasks is a more structured approach, which can be useful to separate different scenes or, for instance, to keep a well structured plot; however, it may also be too restrictive, reducing the fluidity of the story itself. If a certain plan fails at a certain node, HTN re-planning usually starts at the level that is immediately superior to that node. Imagine the goal is to make money, for which there are three intermediate OR nodes, working, borrowing and begging, and for each of these nodes there are various possibilities - say the node working may lead to work as a waiter, or as a banker or as a shopman. If the character failed to get a job as a waiter he would try to get another job, and never try the alternative approaches, borrowing or begging. This leaves little room for drastic changes in the plot. Another disadvantage of HTN is that it can easily become hard to maintain; an HTN plan should not be much deeper than seven levels [15].

An alternative planning mechanism that has been implemented in Interactive Comedy is Heuristic Search Planning (HSP). HSP can be divided into three main parts: a domain model, a search algorithm and a heuristic function [14]. The domain formulation is usually based on STRIPS operators that correspond to the actions an agent can take, each with its pre-conditions, an add-list and a delete-list. The search algorithm is used to search the space of states from the initial state. The heuristic function estimates, for each newly generated state (while searching), the prospect of reaching the goal state. This heuristic works on a relaxed version of the problem, ignoring interactions between sub-goals.
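The three parts of HSP named above (domain model, search algorithm, heuristic function) can be illustrated with a compact sketch. It is not the planner used in [14]: the greedy best-first search, the additive heuristic that ignores delete-lists and sums the sub-goal costs independently, and the two-operator bathroom domain are all assumptions chosen here to keep the example small.

```python
from dataclasses import dataclass
import heapq
import itertools
import math


@dataclass(frozen=True)
class Operator:
    """A STRIPS operator: pre-conditions, add-list and delete-list."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset

    def applicable(self, state: frozenset) -> bool:
        return self.pre <= state

    def apply(self, state: frozenset) -> frozenset:
        return (state - self.delete) | self.add


def h_add(state, goal, operators):
    """Relaxed estimate: delete-lists are ignored and the costs of the
    sub-goals are summed as if they did not interact."""
    cost = {atom: 0 for atom in state}
    changed = True
    while changed:  # costs only ever decrease, so this reaches a fixpoint
        changed = False
        for op in operators:
            if all(p in cost for p in op.pre):
                c = 1 + sum(cost[p] for p in op.pre)
                for atom in op.add:
                    if c < cost.get(atom, math.inf):
                        cost[atom] = c
                        changed = True
    return sum(cost.get(g, math.inf) for g in goal)


def plan(initial, goal, operators):
    """Greedy best-first search over states, guided by h_add."""
    tie = itertools.count()  # tie-breaker so states are never compared
    frontier = [(h_add(initial, goal, operators), next(tie), initial, [])]
    seen = {initial}
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if goal <= state:
            return path
        for op in operators:  # every operator is considered at each step
            if op.applicable(state):
                nxt = op.apply(state)
                if nxt not in seen:
                    seen.add(nxt)
                    h = h_add(nxt, goal, operators)
                    heapq.heappush(frontier, (h, next(tie), nxt, path + [op.name]))
    return None


# Hypothetical two-operator domain.
OPERATORS = [
    Operator("go_to_bathroom", frozenset({"in_bedroom"}),
             frozenset({"in_bathroom"}), frozenset({"in_bedroom"})),
    Operator("take_shower", frozenset({"in_bathroom"}),
             frozenset({"clean"}), frozenset({"dusty"})),
]
INITIAL = frozenset({"in_bedroom", "dusty"})
GOAL = frozenset({"clean"})
```

With this domain, `plan(INITIAL, GOAL, OPERATORS)` returns the two-step plan of walking to the bathroom and showering. Because the search reconsiders every operator from whatever state it finds itself in, re-planning after a failure is not confined to one branch of a task hierarchy, which is the source of the greater variability attributed to HSP.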
It seems easier to describe and maintain coherence in a story using HTN, since the description of a character's role is more logical and structured. In HSP the narrative consistency is only assured by the existence of a single narrative drive, which is the initial state and goal of the character. However, HSP offers greater variability: at each step every operator is considered, which means the re-planning can be more dramatic, and the shifts in the story more drastic.

3.2.2 Interactive Comedy

Comedy as a genre has been pursued in various works of Interactive Storytelling. One prototype by Cavazza et al. at the University of Teesside [20, 15] was based on the sitcom Friends. However, the sitcom genre was chosen more because the storyline is relatively simple and fitted the character-centred paradigm than as an explicit attempt at making humour. Yet, the prototype did result in some funny situations, which emerged from the failure of a character's plans. Cavazza reckons that the situation of two characters that have different conflicting goals is “likely to result in a series of comic situations and quiproquos.”

The group at Teesside further explored this idea in another prototype based on the Pink Panther cartoon [16]. In the Pink Panther, similarly to other cartoons, the comic situations are usually related to the inability of the main character both to achieve his or her goal and to learn from past mistakes. Different theories explain the humour in this situation in different manners: Incongruity-Resolution theory would say it arises from the gap between what the character wants and what he actually accomplishes; Superiority theory would explain it by the inferior position the Pink Panther puts himself in.

Figure 3.2: The Pink Panther prototype: example results (left) and action selection (right) [16]

The planning mechanism used was Heuristic Search Planning (HSP) extended with executability conditions, a mechanism proposed by Cavazza et al. to allow the character to attempt actions that fail the original goal [16, 14]. Executability conditions are pre-conditions that are taken for granted by the agent. They are needed for success, but they are not taken into account in action selection. For example, a character goes to take a shower assuming there is water, not knowing the water was cut off [16]. Cavazza et al. suggest that these dependencies can lead to the failure of other actions and have the potential to generate comic situations. However, plan failure is actually relevant whatever the genre contemplated by the interactive storytelling system, considering stories would be pretty short if everything went the hero's way. Executability conditions also allow long-range dependencies between actions, which may lead to other comic failures; thus they allow the agent to pursue flawed plans.

In this Pink Panther prototype, the potential comic situation results from a plan failure derived from both the executability conditions and the long-range dependencies between actions. An example of these dependencies in the Pink Panther prototype is shown in figure 3.2. In this example, the Pink Panther goes to sleep (A), and hears a noise from a leaking tap (B). Because sleeping has the executability condition “quiet”, the action fails. He tries to fix the tap (C) but fails (D). He goes down to the basement (E) and cuts off the water supply (F). Because he is dusty (G) he decides to take a shower (H), but it fails because he has just cut off the water supply.

Thawonmas et al. proposed a similar model [65], noting that there should be some control over how the plan failure occurs. They used Hierarchical Task Network (HTN) planning, but also extended it with executability conditions. This means there were several possible candidate actions for achieving the same primitive task, with different executability conditions. The choice between the candidates was made by a heuristic.
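The distinction between pre-conditions and executability conditions can be made concrete with a small sketch. It is an illustration under assumptions, not the implementation of [16]: the `Action` type, the fact names and the simplification that the agent's beliefs coincide with the world state are all hypothetical. The agent selects an action by looking only at its pre-conditions, while the world also checks the executability conditions when the action is carried out.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Action:
    name: str
    pre: FrozenSet[str]                        # checked by the agent at selection time
    executability: FrozenSet[str]              # taken for granted; checked by the world only
    add: FrozenSet[str] = frozenset()
    delete: FrozenSet[str] = frozenset()


def selectable(action: Action, beliefs: FrozenSet[str]) -> bool:
    """The agent ignores executability conditions when choosing what to do."""
    return action.pre <= beliefs


def execute(action: Action, world: FrozenSet[str]) -> Tuple[bool, FrozenSet[str]]:
    """Return (succeeded, new_world); the action fails if any condition is missing."""
    required = action.pre | action.executability
    if not required <= world:
        return False, world
    return True, (world - action.delete) | action.add


# Hypothetical encoding of the end of the Pink Panther example (steps F to H).
CUT_WATER = Action("cut_water_supply", frozenset({"in_basement"}), frozenset(),
                   delete=frozenset({"water_on"}))
SHOWER = Action("take_shower", frozenset({"dusty"}), frozenset({"water_on"}),
                add=frozenset({"clean"}), delete=frozenset({"dusty"}))

world = frozenset({"in_basement", "dusty", "water_on"})
_, world = execute(CUT_WATER, world)
shower_selected = selectable(SHOWER, world)    # the agent still considers the shower
shower_ok, world = execute(SHOWER, world)      # but the water is gone

if __name__ == "__main__":
    print(shower_selected, shower_ok)
```

The long-range dependency is visible here: the earlier `cut_water_supply` action removes a fact that only matters several steps later, when the shower is attempted.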
In case an action failed, the cost of the primitive task would accumulate with the heuristic cost of this failed operator. If the total cost went over a given threshold, the system would backtrack and try a new action. They implemented a prototype based on Mr. Bean, and gave as an example the actions pick and pick-secretly. If Mr. Bean used pick-secretly, nobody could be aware that he was picking up the object.

Another important work, by David Olsen and Michael Mateas, further developed this method in ACME, a prototype system set in the world of the Coyote and Road-Runner cartoons [48]. The Coyote and the Road-Runner are two autonomous agents. The behaviour of the Road-Runner is fairly straightforward: he is either standing still or running away from the Coyote. The Coyote also has two states: chasing the Road-Runner and planning the use of an item in order to try to catch the Road-Runner. The system itself has the story goal of frustrating the Coyote's plan through the occurrence of some gag, such as an anvil falling on the Coyote.

Every item the Coyote encounters has a type, which refers to the way the Coyote may use it to catch the Road-Runner, and attributes, which are associated with the different gags that can be generated. For example, a rocket may be of the type is-moveable, meaning it can be used to move faster, and have the attributes is-fast, which is associated with the crashing gag, and is-explosive, which is associated with the explosion gag. The Coyote's plan is basically to set up and use items. The use of every item requires a certain number of steps. A value of anticipation is calculated, which grows with the number of steps the Coyote has gone through. This value is associated with the probability of a gag occurring, such that the further the Coyote is in the set-up process the more likely it is that the gag will interrupt the Coyote's plan. When the Coyote is close enough to an item, it starts planning its use.
At every step of the Coyote's plan, the system uses the anticipation heuristic to decide when to unfold the gag. The comic situation results from the intertwining of the Coyote's and the system's plans. The choice of the gag to be executed also depends on which step the Coyote is in. For example, the gag in which the rocket explodes is not available until the Coyote starts to light the rocket. If two gags are available (in the case of the rocket, for example, crashing and exploding) the system decides randomly which one to apply. The details of how the gag is applied are also related to the world state. For example, if the Coyote throws a dynamite stick at the Road-Runner, it must come back to the Coyote in some way (for example by reversing its trajectory in mid-air) before exploding. If, however, the Coyote is already holding the dynamite, the explosion can happen immediately.

Another system by Thawonmas et al. [66] tried a different approach to comedy production. The main difference is in the interactivity the user had with the system. In all systems described in this section until now, the viewer is only able to control certain aspects such as the location of objects in the scene. Thawonmas et al. proposed a system in which the user controls a character whose actions are mimicked by an agent-controlled character also inspired by Mr. Bean. The agent uses HSP planning, and the goal is set by the user interaction. Whatever the user does becomes the goal of the character agent, who attempts to mimic the viewer's actions. Thawonmas et al. suggest humour could arise from the conflict between the goals of the Mr. Bean agent, who acts out his character while at the same time attempting to mimic the user. However, in the prototype described, there was still no specific strategy for humour production, and no humour value was included in the heuristic function used.
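The anticipation mechanism of ACME described earlier in this section can be illustrated with a short sketch. The details are assumptions rather than the behaviour of the actual system [48]: the step names, the linear growth of anticipation, the use of anticipation directly as a probability and the first step at which the crash gag becomes available are all hypothetical. Only the availability of the explosion gag from the lighting step onwards follows the text.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical set-up steps for using a rocket.
STEPS = ["pick_up", "aim", "light", "ride"]

# First step index at which each gag becomes available (crash value is assumed).
GAG_FIRST_STEP = {"explosion": 2, "crash": 3}


def anticipation(step_index: int, total_steps: int) -> float:
    """Grows with the number of steps completed; reaches 1.0 at the last step."""
    return (step_index + 1) / total_steps


def available_gags(step_index: int) -> List[str]:
    return [gag for gag, first in GAG_FIRST_STEP.items() if step_index >= first]


def run_plan(rng: random.Random) -> Optional[Tuple[str, str]]:
    """Return (interrupted_step, gag), or None if the plan was never interrupted."""
    for i, step in enumerate(STEPS):
        gags = available_gags(i)
        # Anticipation is used as the probability of a gag occurring at this step.
        if gags and rng.random() < anticipation(i, len(STEPS)):
            return step, rng.choice(gags)  # random choice among available gags
    return None


if __name__ == "__main__":
    print(run_plan(random.Random()))
```

With these assumed values the anticipation reaches 1.0 at the final step, so the gag always fires by then at the latest and the Coyote never completes his plan.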
3.3 Agents with Emotions and Affective Storytelling

There have been many works in Interactive Storytelling concerned with agents' emotions. This area is a cross between Affective Computing and Interactive Storytelling, sometimes called Affective Storytelling. According to Bates “emotion is one of the primary means to achieve believability” [8] in agents. As discussed in section 2.7, the perspective, personality and emotions of the character are very important to comedy. Clark Elliott has proposed describing certain humorous situations as a sequence of emotions [28]. This thesis is concerned with how agents can impact a scene emotionally, and how emotions can steer a scene. Though not all the works presented in this section are directly connected with computational humour, they exemplify how a character's action selection can be directly related to the impact it wants to have on the emotional content of the scene.

3.3.1 Affective Reasoner

Clark Elliott's Affective Reasoner [29] is an appraisal system that relates emotions to story variability. The premise is that two stories that are, for the most part, identical in terms of events are perceived as different because of the characters' appraisal of the actions. In the Affective Reasoner, a character can have several distinct tags for each event in the story, each tag comprising a set of emotions that the character feels towards that situation. These tags are mutually exclusive, meaning that a character may only have one of those attitudes in a given story instantiation. The different combinations of tags that can be chosen lead to different story variations, which Elliott calls story-morphs.

Elliott suggested the use of his Affective Reasoner paradigm in the context of Computational Humour, exemplifying with a very specific type of humorous situation [28].
The situation is that of an authority figure who defends a certain principle, and a victim or a group of victims whom the authority figure accuses of violating that very principle. Elliott gives the example of a school dean who, in faculty meetings, criticized those who were late, saying that, as the meetings were scheduled in advance, they had no reason not to show up on time. Everyone in the meeting thought it was funny when the dean himself was late to the following meeting. Elliott proceeds to analyse this situation in terms of the Affective Reasoner. He notices that the way characters feel about the situation is the same in many different but similar scenarios:

1. the authority figure is angry at the victims over one principle;
2. the victims feel ashamed;
3. the authority figure breaks that very principle;
4. the authority figure feels ashamed;
5. the victims gloat.

Elliott notices that one may even be both the authority figure and the embarrassed victim in the same story. The story he uses to exemplify this is that of Elliott himself and his friend Richard who, in order not to be considered nerds, decided to learn to ride a motorcycle. Thus, they themselves are authority figures who are self-critical over the fact that they are nerds. At the same time, they are the victims of the authority figure's anger, and feel ashamed. However, they fail in their mission, and Richard has a spectacular accident when trying a manoeuvre he thought would be cool. The authority figure in themselves, who does not want to relax and accept the fact that they are nerds, as Elliott puts it, is ultimately defeated by the victim figure of their “true” nerd selves. In the end, the victims gloat over the authority figure once again.

These examples fit the type of situation that inspired superiority theories of humour (section 2.2), as Elliott points out. However, the situation can also be seen as an incongruity between the perceived authority over the principle, before, and the actual authority, after.
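The five-step pattern above lends itself to a simple representation. The sketch below is an illustration only and does not reproduce the Affective Reasoner: the role and emotion labels and the idea of matching the pattern as an ordered subsequence of a story trace are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Step = Tuple[str, str]  # (role, emotion or event), labels are hypothetical

PATTERN: List[Step] = [
    ("authority", "anger"),               # 1. angry at the victims over a principle
    ("victim", "shame"),                  # 2. the victims feel ashamed
    ("authority", "violates_principle"),  # 3. an event rather than an emotion
    ("authority", "shame"),               # 4. the authority figure feels ashamed
    ("victim", "gloating"),               # 5. the victims gloat
]


def matches(trace: Iterable[Step], pattern: List[Step] = PATTERN) -> bool:
    """True if the pattern occurs in the trace in order, not necessarily contiguously."""
    remaining = iter(trace)
    # `step in remaining` consumes the iterator up to the first match,
    # so later steps can only match later positions in the trace.
    return all(step in remaining for step in pattern)


# The dean story, with one unrelated event mixed in.
DEAN_STORY: List[Step] = [
    ("authority", "anger"),
    ("victim", "shame"),
    ("victim", "boredom"),
    ("authority", "violates_principle"),
    ("authority", "shame"),
    ("victim", "gloating"),
]

if __name__ == "__main__":
    print(matches(DEAN_STORY))
```

In the motorcycle story the same characters would appear under both roles, which this representation allows since roles are just labels attached to each step.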
Elliott's proposal is an example of how humorous stories may be described in terms of their emotional content. Elliott also suggests that the mirth of a story can be tweaked by changing the degree and nature of emotions. He argues that, if we care about the authority figure (for example, because his failure may cause him to get divorced, or get fired), his failure may be taken as sad. Reversal theory (section 2.6), for instance, would explain this by the sudden serious context of the situation.

3.3.2 FAtiMA

FAtiMA stands for FearNot! Affective Mind Architecture and it is an architecture for agents with emotions. There have been several scenarios based on FAtiMA, such as FearNot! [7], an IS system in which the user controls a victim of bullying and that aims at teaching children to cope with such situations, or ORIENT [6], a work that promotes intercultural empathy. FAtiMA has also been extended with several components, such as a model of empathy [6] or a Theory of Mind (ToM). It has been suggested that FAtiMA agents could support a double appraisal mechanism, in which the agent reappraises a selected action according to its emotional impact on others [42]. This could make agents behave more like actors, who evaluate the dramatic interest of an action, and less like characters. Indeed, double appraisal mechanisms have been shown to create more interesting narratives [41].

Another extension of FAtiMA that has been proposed and is currently under implementation aims at making emotionally intelligent agents [21]. An agent with emotional intelligence should be able to:

• use emotions in decision making;
• understand and reason about emotions (for example, Anger indicates a possible intention of harming another);
• manage emotions, that is, try to elicit certain emotions in oneself and in others.

In FAtiMA, characters can translate events to emotions according to the OCC appraisal rules (see section) defined for each character.
This component would work by having these appraisal rules modeled as planning operators. For example, the rule that states that an undesirable event causes distress would be mapped as an operator that has as precondition that the event is undesirable, and as effect the distress caused. FAtiMA's reactive rules, or Action Tendencies, would also be mapped in such a way. These Action Tendencies are triggered by a specific emotional state, which would be represented as the precondition of the operator. For example, an Action Tendency of crying at a high level of distress would be mapped as an operator with the level of distress as precondition and crying as an effect. If a character had the goal of making another character cry, it would try to make a plan to cause the level of distress that is represented in the precondition of the Action Tendency. To evaluate an action according to its emotional output it would make use of the operators developed from the characters.

Since we chose to use FAtiMA in our project (both because of its availability and because of features such as the OCC appraisal mechanism or ToM), we will discuss it in more detail (specifically the components used in, or altered for, the Laugh To Me prototype) in chapter 6.

3.3.3 Madame Bovary

An Interactive Storytelling work in which emotions were a central focus was the one by Pizzi et al., based on Gustave Flaubert's novel Madame Bovary [52, 51, 17]. They remarked that this was a “psychological novel” and, as such, characters' feelings and relationships with other characters had a greater role than the actual action outcome. They proposed describing narrative in terms of the agents' feelings. These feelings, rather than being based on an appraisal theory (such as OCC, which we describe in section 5.1.1), were inspired by Flaubert's own preparatory works for the novel; for example, anger-towards(x,y) establishes that x feels anger towards y.
Each of these feelings can have three values, Low, Medium, or High, considering how much they relate to the character at one point in time. In this system, each character plans its actions using an HSP planner (see section 3.2.1). To make it possible to direct characters' behaviours according to their psychological states, the planning operators reflect in their preconditions the emotional state needed for the character to select a certain action, and in their effects how the action will make the character feel. For example, one operator is related to Emma (the protagonist of Madame Bovary, married to Charles) asking Rodolphe (potential lover of Emma) to run away with her. Preconditions state that this can only happen if Emma has high affinity with Rodolphe, and accepts the risk of adultery towards Charles; the effects of this action are raising Emma's feelings of womanhood and excitement to High.

Besides these action operators, which relate to physical actions and communication, the characters also have interpretation operators. These work in a similar manner to appraisal mechanisms (such as those found in FAtiMA), updating the character's feelings according to the events. The contribution of each action towards achieving the character's desired emotional state is given by the heuristic function used by the HSP planner, in which a low value means that the character is closer to a desired emotional state and a high value means it is more distant. Pizzi et al. suggested the use of the values of this function to measure narrative progression from the point of view of the character.

Figure 3.3: Emma Bovary decides to sing to fight boredom (taken from [51])

The character can thus be aware of how its situation evolves, and have specific feelings about it. For example, if the heuristic value fails to decrease, the character is finding several obstacles that separate it from its desired emotional state. As an effect it may feel boredom. This may affect the character's actions.
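The idea of using the heuristic value as a measure of narrative progression can be sketched as follows. This is a simplified illustration, not the system of Pizzi et al.: the feeling names, the distance-based heuristic, the dictionary encoding of operators and the window used to detect boredom are all assumptions.

```python
from typing import Dict, List

LEVELS = {"Low": 0, "Medium": 1, "High": 2}

Feelings = Dict[str, str]  # feeling name -> "Low" | "Medium" | "High"


def distance(feelings: Feelings, desired: Feelings) -> int:
    """Heuristic value: lower means closer to the desired emotional state."""
    return sum(abs(LEVELS[feelings.get(name, "Low")] - LEVELS[level])
               for name, level in desired.items())


def applicable(op: dict, feelings: Feelings) -> bool:
    """Preconditions describe the emotional state needed to select the action."""
    return all(feelings.get(name, "Low") == level for name, level in op["pre"].items())


def apply_op(op: dict, feelings: Feelings) -> Feelings:
    """Effects describe how the action makes the character feel."""
    return {**feelings, **op["effects"]}


def is_bored(history: List[int], window: int = 3) -> bool:
    """True when the heuristic has failed to decrease over the last `window` steps."""
    if len(history) <= window:
        return False
    recent = history[-(window + 1):]
    return min(recent[1:]) >= recent[0]


# Hypothetical encoding of Emma asking Rodolphe to run away with her.
ASK_TO_RUN_AWAY = {
    "pre": {"affinity-with(Emma,Rodolphe)": "High"},
    "effects": {"womanhood(Emma)": "High", "excitement(Emma)": "High"},
}
emma = {"affinity-with(Emma,Rodolphe)": "High",
        "womanhood(Emma)": "Low", "excitement(Emma)": "Medium"}
desired = {"womanhood(Emma)": "High", "excitement(Emma)": "High"}

if __name__ == "__main__":
    print(distance(emma, desired), distance(apply_op(ASK_TO_RUN_AWAY, emma), desired))
```

A history of heuristic values that stays flat, such as `[3, 3, 3, 3]`, would make `is_bored` return `True`, which a character could then treat as a feeling in its own right.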
In one example, Cavazza et al. show a bored Emma deciding to sing, in order to reduce her boredom level (see figure 3.3). Another example given is an initial decrease of the heuristic, which gives a character hope, followed by a prolonged increase. This can, according to Pizzi et al., “correspond (...) to the narrative notion of ‘shattered hopes’” [51].

3.3.4 Façade

Façade is a system developed by Michael Mateas and Andrew Stern [45] that depicts a dinner with a couple, Grace and Trip, to which the user has been invited. The viewer is dragged into a marital discussion between Grace and Trip, and invited to interact (and possibly try to remedy the situation and bring the couple back together). Façade is different from the other systems mentioned until now because story progression is not purely character based, but also plot based. This means that the action does not evolve simply because of the actions of each character, but also as the result of a plot that is managed independently at a higher level.

The user's interaction is integrated with Façade as part of abstract social games, which vary depending on which part of the narrative the user is in. In the first part there are two social games: the affinity game, in which Trip and Grace try to tell which side the user takes according to his or her input, and the hot button game, in which the user may introduce polemic topics, such as sex or divorce, to try to provoke reactions from Trip and Grace and get to know the characters and their backstory. The second part of the story is tied to the therapy game, in which the user contributes to each character's degree of self-realization about their own problems. The system also keeps track of the tension level, which is affected by player interaction in the various social games, as well as by the interaction between Trip and Grace. Façade's story is organized in several small elements that are then sequenced in various orders.
Joint Dialogue Behaviours (JDBs) are small coordinated dialogues between Grace and Trip about a certain topic. These JDBs are then grouped and sequenced within a structure called a beat. The beats sequence JDBs in order to incorporate the user input and to explore certain topics that may make the social game in action progress. The beats are then sequenced by a higher level drama manager that handles high-level plot decisions and that is only active when a beat is finished or aborted. This drama manager tries to sequence the beats in order to make the story evolve according to an author-defined tension arc.

Blom and Beckhaus proposed an extension of Façade that consisted precisely of using a tension value to control the story progression [12]. They proposed modeling not only the story's tension, but also the user's expected tension in relation to the story. They used the term Emotional Storytelling to refer to the concept of explicitly controlling the emotional content of the story to create an engaging user experience.

Chapter 4 Humour sketch model

As we saw in chapter 1, the problem we are trying to solve is defined by the question: “How can we endow autonomous agents with the capacity of creating a comic situation or sketch?” We start by proposing a model of a sketch that can be acted by such agent actors or characters. As we have seen in chapter 2, there is neither a consensual theory of humour nor a theory with a description precise enough to be almost directly translated to a computational system (outside of Raskin and Attardo's GTVH for linguistic humour). Our model results mainly from the general ideas presented by incongruity-resolution and tension-relief theories (sections 2.3 and 2.5 respectively), taking into account the more practical knowledge described in section 2.7.
First, we present a more concrete definition of incongruity, establishing several categories and using a character-centric perspective, since the model is to be implemented by autonomous agents. We use incongruities for creating both the comic premise and the punchline (if there is any). We also propose a defined structure with beginning, middle and end, as hinted by Vorhaus and Perret (see section 2.7). We present the end as the resolution of the sketch, but we take the concept in a more generic way than that defined by incongruity-resolution theories. Note that not all sketches end by resolving an incongruity. Monty Python's famous Dead Parrot sketch [56], for instance, about a pet shop seller who refuses to accept there is anything wrong with the dead parrot he previously sold to the client, ends with a colonel entering the scene and ordering the show to “get on with it”. This is not a punchline and it does not present a solution to the incongruous premise between the seller and the client. Nevertheless it “resolves” the sketch in the sense that it provides a concrete ending to wrap it up.

We also want to model the pacing and timing of the sketch itself. We do this by establishing a connection between the evolution of the characters' actions and their emotional state. Thus we model the middle section as a tension-relief mechanism, taking into account the sketch models presented by Perret and Vorhaus. However, rather than providing a model of tension only, as used by Façade for instance (see section 3.3.4), we handle the full emotional range, providing the context in which to trigger the resolution. We call this buildup mechanism emotional escalation. We create agents that actively contribute to this emotional escalation through their action selection. Thus our agents are not mere characters but actors, like those described in the double appraisal work discussed in section 3.3.2.
In describing our model we give examples of actual sketches and how we can see them in terms of our model.

4.1 Incongruence

We define incongruence as a gap or inconsistency which can be exploited for comic purposes. In this sense, the comic premise, which Vorhaus defines as “the gap between comic reality and real reality”, can be described as the incongruence that establishes the high concept of the sketch. A comic reality is like a caricature of real reality, a distorted vision in which some features are emphasized, downplayed, or contrasted with their opposite. For the purposes of our model we consider incongruences as the result of the contrast of actions, context or emotions, or a combination of these three. We will pay special attention to emotional incongruences.

4.1.1 Action Incongruence

An action incongruence is one in which the actions of the character conflict with the character's ultimate goal or even with simple common sense about how he should act. This has been the type of incongruence most often explored in Interactive Comedy (see section 3.2.2). It is common in slapstick comedy, a type of comedy characterized by physical humour. A famous scene of City Lights shows the inauguration of a new statue. When unveiled, it reveals Charlie Chaplin sleeping on it [18]. He wakes up to see a crowd of people calling for him to get off the statue. While trying to do that he gets stuck on a sword wielded by the statue, a situation from which he fails to get away. When the anthem plays he tries to stand up in a respectful position, holding his hat over his chest. After the anthem stops the crowd resumes waving him off, and Chaplin keeps trying to get off the statue. When he gets off the sword he lands his bottom on the statue's face. The incongruence here is the constant gap between the intention behind the action and its actual effect.

An incongruence also ensues when a character's actions are inconsistent with what is expected of the character.
In the sketch comedy show Little Britain, one of the running gags consists of the poor work ethic of the character Carol. Carol works in customer service (originally at a bank) and, when asked something by a customer, takes her time and ends up replying “computer says no”. She always fails to be empathic with the customers, and her response is always void of emotion, which makes this character an example of emotional incongruity as well.

4.1.2 Emotional Incongruence

An emotional incongruence can be defined as a contradiction between the character's expected and actual appraisal of events. A good example is a sketch of the British comedy show A Bit Of Fry And Laurie, in which a client enters a store to buy a Jason Donovan single [32]. Jason Donovan is an Australian pop singer. The client feels overwhelming shame when asking for Jason Donovan's music, so he tries to cover it up by asking for condoms as well. Buying condoms is usually not something people boast about or feel very comfortable doing. However, the client speaks loud and clear when asking for the condom packets, and in a low, hesitant voice when requesting Donovan's single. The actions of asking for condom packets and asking for a Jason Donovan single have unexpected and incongruent appraisals. The sketch evolves with the seller repeatedly asking the client to clarify his request for the Jason Donovan single, until the client's growing shame and distress lead him to leave the store without accomplishing his initial goal.

In this example of emotional incongruity, the disparity results from the overreaction of one of the characters. The incongruity can also result from a gross devaluation of the event. In a Monty Python sketch [56], a man enters a cheese shop and asks, one by one, for all sorts of cheese: Roquefort, Camembert, Gorgonzola, etc. The seller always says he is out of that particular kind, until the client finally asks the vendor if he has any cheese at all, and the seller replies “Yes, sir”.
The client breathes in and proceeds to say “Now, I'm gonna ask you that question once more... and if you say no, I'm going to shoot you through the head. Now, do you have any cheese at all?” This time the seller replies no, and the client shoots him, showing hardly any reaction. He then remarks “what a senseless waste of human life” with very slight contempt, and puts on a cowboy hat. This is an example of deadpan. Deadpan delivery happens when situations are acted with no perceivable change in emotion.

4.1.3 Context Incongruity

A context incongruity happens when the character acts as if he or she were in a different context from the one he or she is actually in. In the Friends episode The One With Ross's New Girlfriend [31], there is a scene in which Phoebe cuts Monica's hair. Monica asks for a haircut resembling Drew Barrymore's, but Phoebe misunderstands and cuts her hair like that of Dudley Moore. This would be an example of action incongruity. The scene that follows Monica's bad haircut, however, is handled as if it were a surgical procedure. Phoebe comes out and explains the situation like a clinical matter, using sentences resembling those you would hear in a hospital, such as “I'm not going to lie to you. It's bad.” and “we managed to stop the curling”.

4.2 Comic characters

We define a comic character as a character that acts or reacts in an incongruent manner in relation to what is expected of him, given the situation. Since our model is basically a character-centred approach, its main source of incongruence is a character's personality. We consider the same definition of personality behind FAtiMA agents, which consists of a set of goals, appraisal rules and emotional reactions. All of these define the character and can be tweaked to create a comic perspective (see section 2.7), that is, a unique vision that is incongruent with expectations, but coherent with the character itself.
In a sketch of A Bit Of Fry And Laurie [32], an outraged father comes to complain to the school principal about what his son was taught in school. The boy had told his mother “sexual intercourse can often bring about pregnancy in the adult female”. The father's distorted view of sexuality and sexual education made this sentence feel very obscene, and as such he showed reproach towards the teacher and the school. As the sketch develops, his outrage becomes more and more pronounced. At the same time the father views his son as a good that can be traded, and has no actual emotional connection to him. He asks the principal to exchange his son, and accepts with joy a couple of locusts from the lab in return. This is a rather uncommon (to say the least) appraisal of the world, which makes this father a comic character. The principal, on the other hand, reacts as expected, except for the fact that he is more willing than usual to accept and negotiate with the father. Thus the principal is a regular character.

Whether or not the scene should feature two comic characters in opposition depends on what it takes to enhance the one character's comic perspective and develop the conflict. Sometimes a sketch features several comic characters that have the same comic perspective and can still conflict with each other. In the four Yorkshiremen sketch, Monty Python play four rich men complaining (or rather, bragging) about who came from the poorest background in their youth, and each one tries to outdo the next. One of them says “there were 150 of us, living in a shoe box, in the middle of the road.” Another one asks “cardboard box?” and, after a positive response, remarks “you were lucky” and tells his own tale of disgrace [56].

Sketch comedy characters are also uncomplicated. Since they will act during a small period of time, there are usually only one or two traits of the characters in focus at a time.
For the father in the Fry and Laurie sketch, these would be his odd praiseworthiness appraisal of simple biology lessons and of exchanging his son as a commercial good. For the Yorkshiremen, it would be the desirability of bragging over more and more extreme situations of poverty.

4.3 Sketch Beginning And End

Sketches are small scenes that usually follow a straightforward structure, with beginning, middle and end. Since the middle section of the sketch is the main focus of this thesis, we will address the beginning and ending of sketches in this section, and leave a lengthier look at the buildup for section 4.4.

The beginning of the sketch is where the characters meet and their conflict is established. This conflict is usually the comic premise (see section 2.7.1) or a direct result of it. This first part needs to set up the context of the scene and provide a reason for the two characters to interact despite their conflict, that is, a relationship between the characters. These can be salesman and customer, doctor and patient, teacher and student, or any other relationship. Besides the particularities of each character and of the situation, this relationship and a subject are what define the conflict. In the cheese shop sketch, the context is that of a cheese shop, with a seller and a customer, and the subject of conflict is the cheese itself. In the A Bit Of Fry And Laurie sketch where the parent meets the principal to talk about his kid's biology class, the context is the principal's office, and the subject is his kid's lesson on pregnancy, which he considers foul language.

The sketch always goes towards a resolution, a closure. This is where the conflict ends. This ending is usually marked with a punchline, though, as we mentioned before, that is not always the case. A punchline is a shift of context, a new piece of information that makes us rethink what we have just seen. In incongruity-resolution theory, this is the resolution element.
Usually the punchline follows from hints given during the sketch, until it reaches a conclusion. In the sex education sketch the parent trades his son for a couple of locusts after explaining his thought process, comparing this exchange to trading in a flawed product at a store.

A common type of sketch punchline is changing the frame of reference altogether, as Vorhaus suggests (see section 2.7.2). In one A Bit Of Fry And Laurie [32] sketch a man is shown a series of holiday photos by another man, and vocally expresses his growing boredom and distress, until he finally bursts out screaming and crying. The other man then replies “right, you touch my daughter again, and it will be a slideshow, you understand?” This example also shows how important it is that, before a punchline, the viewer has all the information needed to understand it. It is just as important that he has no more than that. Every bit that is not trimmed out has the potential to dampen the surprise and the resulting effect of the punchline. Surprise is a central element in incongruity-resolution theory (see section 2.3) and comedy writers do stress the need for attention to detail in comedy writing (see section 2.7.1).

Not all sketches end with such a context shift. In the Jason Donovan sketch (see section 4.1.2) the customer simply gives up on buying Donovan’s single. Note, however, that the conflict was resolved nonetheless (in the sense that it had a closure).

4.4 Emotional Escalation

Sketches only reach their full potential when the sequence of actions is properly defined, paced and timed. Between the beginning and the end, characters explore the comic premise by adding tension to the scene. The characters build on their original conflict until an emotional peak is reached. A buildup, however, needs to be gradual. Take the cheese shop sketch (see section 4.1.2). Each cheese the customer asks for adds to the tension.
At one point in the sketch, the seller starts to say which types of cheese he does have, but the customer replies “don’t tell me, I’m keen to guess.” In the holiday photo sketch, the man watching the photos is not crying all along; he gets there by watching photo after photo.

One such progressive set of actions leading to an emotional peak is shown in a sketch by the Portuguese comedy group Gato Fedorento. It features two policemen and a criminal doing a “good cop, extremely good cop” routine, instead of the traditional “good cop, bad cop”, during an interrogation [30]. The second cop to enter the scene offers the suspect tea and cookies, assuming he is hungry because of the time spent in the interrogation. He then puts out the suspect’s cigarette, explaining that it is time for the suspect to eat something, and that he cannot go so long without eating. The criminal complains about having his cigarette put out. The officer replies that he has bought more cigarette packs for him to have after eating, that he even asked a friend of the suspect what his favourite brand was, and that he is saving a joint for him, even though it is illegal (this incongruity is noted in a side joke, in which the officer says he rolls joints for his friends, but then he busts them). He proceeds to try to please the criminal by offering to open a window, assuming the suspect feels hot, and then a coat in case he is cold; the suspect says he is neither hot nor cold. The officer goes on to ask what the suspect wants for dinner, proposing a lasagna; the suspect says he is not hungry. The officer offers other options, and the suspect ends up saying the lasagna is fine. The policeman then offers to squeeze a blackhead on the suspect’s face. Then he offers him a foot massage.

The policeman’s actions obviously become more and more exaggerated as the sketch evolves. The suspect’s distress and annoyance grow as a direct result of the interrogator’s actions.
The policeman feels he has upset the criminal and starts apologizing profusely. In the process of apologizing, he tries to calm the suspect down by offering a foot massage again, which he thinks would have a calming effect. This attitude once again annoys the suspect, which triggers the resolution of the sketch: the suspect gives in and agrees to sign a confession in order to end his “torture.”

We call this progressive rising of emotions towards a peak an Emotional Escalation.

4.5 Comic agent model

As we have seen, a sketch’s Emotional Escalation is the direct result of the actions of the characters. We propose that each agent, instead of representing only a passive character that acts according to its traits, can also act to arouse, in itself and others, emotions that can lead to a comic resolution. Thus we propose that the agents have an action selection mechanism that takes into account the emotional impact of the actions in the scene. We present our conceptual model for a comic agent in figure 4.1.

Figure 4.1: Comic agent conceptual model.

In figure 4.1 we can see that each agent has an action selection mechanism that takes into account the Emotional Goals of the agent. This mechanism must comply with our progressive emotional escalation model. This means it should depend not only on what emotions the agent wants to evoke, but also on how it wants to evoke them over time. We propose that the agent’s selection mechanism takes into account Emotional Guidelines. An Emotional Guideline is a monotonically increasing function that maps the scene time (which may be represented by the number of the character’s actions) into the desired emotional output for a given emotion (we assume in our model that the emotion model used provides some sort of measurement of the intensity of emotions at a given time; in FAtiMA it could be the emotion potential).
This defines the pacing of the sketch, because we can make the Emotional Escalation grow faster or slower by simply changing the growth parameters of the Guidelines.

The action selection mechanism would work by simulating the possible actions and comparing the expected emotional output with the desired emotional output value. This value is provided by an Emotional Guideline, taking into consideration the moment of the scene at which the action selection happens. The action selection will try to find the best match for the set of Emotional Guidelines, according to two factors:

• the action selected should have an emotional output higher than that of the last action selected;
• the action selected should have a simulated emotional output that is as close to the guideline as possible.

As can be seen in figure 4.1, we group Emotional Guidelines into Emotional Goals. Each Emotional Goal should be activated by a number of preconditions, so that there is better control over when and which Emotional Guidelines are active at a time. This could allow the agent to shift Emotional Guidelines during the sketch, by taking into consideration changes in the world, for example reactions from other agents.

If the character is comic, he should have a personality that is emotionally incongruent with what would be expected of him. The rules that define his appraisal of the world should be incompatible with what one would normally expect. That allows him to choose comic actions and create a conflict with the other character. As an example consider the Gato Fedorento sketch presented earlier. An agent representing the character of the nice cop would have as an Emotional Goal to make the suspect feel Joy. However, his perception of what causes Joy is inconsistent with that of the suspect, which is an emotional incongruity defined by his personality. How hard he would try to make the other character feel Joy would depend on the growth rate of the Emotional Guideline function.
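As a rough illustration of the two selection factors listed above, the following Python sketch picks one action against a single guideline. All names (`linear_guideline`, `Candidate`, `select_action`) are hypothetical and are not part of FAtiMA. Treating the first factor as a filter with a fallback is our simplifying assumption; the heuristic in section 4.6 uses a penalty instead.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A guideline maps scene time (here: number of actions so far) to the
# desired emotional output. The type alias is illustrative only.
Guideline = Callable[[int], float]


def linear_guideline(growth: float, start: float) -> Guideline:
    """Hypothetical monotonically increasing guideline: start + growth * step."""
    if growth < 0:
        raise ValueError("a guideline must not decrease over time")
    return lambda step: start + growth * step


@dataclass(frozen=True)
class Candidate:
    name: str
    simulated_output: float  # emotional output predicted by simulating the action


def select_action(candidates: Sequence[Candidate], guideline: Guideline,
                  step: int, last_output: float) -> Optional[Candidate]:
    """Pick the candidate that best matches the guideline at this step."""
    desired = guideline(step)
    # Factor 1: prefer actions stronger than the last selected action.
    escalating = [c for c in candidates if c.simulated_output > last_output]
    # Assumption: if nothing escalates, fall back to all candidates.
    pool = escalating or list(candidates)
    if not pool:
        return None
    # Factor 2: simulated output as close to the guideline as possible.
    return min(pool, key=lambda c: abs(c.simulated_output - desired))
```

A larger `growth` value makes the desired output rise faster, which is how the pacing of the escalation would be tuned in this sketch.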
We propose to divide a sketch into three distinct periods: the beginning, the middle and the punchline or resolution. The beginning is scripted, and establishes the conflict. The resolution, or resolutions, are also scripted, but have a set of preconditions that determines when they are activated. A punchline can only be activated when the viewer has sufficient knowledge of the scene to understand it, and when the emotional context is right. In between, the actions are oriented through emotional goals (which, as stated, consist of a set of emotional guidelines), which try to provoke an emotional escalation. These emotional goals are also subject to preconditions, which means that the agents may pursue the emotional escalation in different manners at different times in the sketch. These emotional goals and guidelines, together with the emotional preconditions defined for the punchline, are what define the timing and pace of the sketch.

4.6 Heuristic function

To influence action selection so that it complies with a given Emotional Guideline, i.e., to try to select actions for which the expected emotional output is as close as possible to what is defined by the guidelines, we propose a specific heuristic. We assume the agent is capable of appraising actions and events to generate emotions which can be measured; we call this value the emotional output. The emotional output defined by a given guideline at a certain point is the desired emotional output. We call our heuristic cost the emotion weight. The agent tries to select a plan that minimizes the value of the emotion weight.

The emotion weight defines how much an action contributes to making the scene comply with the Emotional Guidelines.
In considering this there are some aspects to keep in mind:

• the worst case scenario is when an action does not generate any of the emotions defined by the various guidelines;
• the same action can contribute to the emotion defined by one guideline and not to that of another guideline that is also active;
• an action may contribute to a guideline but not increase the current emotional output;
• we need to consider how far the expected output is from the desired emotional output at a given time;
• the best case scenario is an action that increases or maintains the current emotional output and has the minimum difference to the desired emotional output.

To know whether an action contributes to a given Emotional Guideline or not, we simulate the emotional state. This process is similar to that of the double appraisal work described in 3.3.2. The agent appraises the event that would be generated should he execute that action, and gets a simulated emotional state. This simulated emotional state must be sufficient to know whether the action generates the emotion defined by each guideline and, if so, what the expected output of that emotion is.

When calculating the heuristic, we need access to all of the currently active Emotional Guidelines (that is, guidelines that pertain to active Emotional Goals). We calculate the emotion weight for each guideline separately, taking into account special cases, such as when an action does not contribute to any active guideline.

When evaluating the contribution of the action to each active Emotional Guideline, there are several cases to consider. If an action generates the emotion defined by the guideline, we consider the absolute value of the difference between the expected emotional output and the desired emotional output. It does not matter whether the expected emotional output is above or below the desired emotional output. We want the growth to be as similar as possible to that of the emotional guideline, and that is why we consider the absolute difference.
What is important to take into account, though, is whether the last action selected by the character was stronger than this one or not (since Emotional Escalation should be monotonically increasing). When calculating the emotion weight of an action, we consider whether its expected emotional output is greater or smaller than that of the last action selected. If it is smaller we add a penalty to the emotion weight value. This penalty needs to have some importance, but not so much that it rejects the action regardless of how well it fits all the other guidelines. Thus the formula for the emotion weight of each guideline for a certain action (that generates the desired emotion) is:

$$\mathit{emotionWeight} = \mathit{emotionPenalty} + 10 \cdot |\mathit{expectedEmotOutput} - \mathit{desiredEmotOutput}|$$

If an action does not generate the emotion defined by the guideline, we calculate the emotion weight taking the expected emotional output as 0, and add the expected emotional output of the last selected action. This rule is as penalizing as the last action was strong. This allows the agent to, at first, consider actions that only contribute to a certain emotional guideline and, as the action develops, be stricter about which actions he chooses. Thus, the emotion weight of each guideline for a certain action (that does not generate the desired emotion) is:

$$\mathit{emotionWeight} = \mathit{lastActionExpectedEmotOutput} + 10 \cdot \mathit{desiredEmotOutput}$$

We also decided that the contribution to each guideline should be relative to its importance to the scene. Since we want our emotions to escalate, guidelines that currently have the biggest desired emotional outputs are defined as more important. As such, the contribution of each guideline is multiplied by a factor, between 0 and 1, that reflects how an Emotional Guideline currently compares to others.
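The two per-guideline rules above, and the relative-importance factor, can be sketched in Python as follows. This is a minimal illustration rather than the prototype's code: the function names are ours, the default penalty of 5.0 is an arbitrary placeholder (the text does not fix its value), and the factor is read as the guideline's share of the summed desired outputs.

```python
from typing import Optional, Sequence

DISTANCE_FACTOR = 10.0  # the constant 10 that appears in both formulas


def emotion_weight(expected: Optional[float], desired: float,
                   last_expected: float, penalty: float = 5.0) -> float:
    """Cost of one action against one guideline; lower is better.

    `expected` is None when the simulated action does not generate the
    guideline's emotion. The default `penalty` is an assumed placeholder.
    """
    if expected is None:
        # Expected output counts as 0, plus the last action's expected output.
        return last_expected + DISTANCE_FACTOR * desired
    weight = DISTANCE_FACTOR * abs(expected - desired)
    if expected < last_expected:
        weight += penalty  # weaker than the previous action: not escalating
    return weight


def importance_factor(desired: float, all_desired: Sequence[float]) -> float:
    """Share of this guideline in the summed desired outputs (between 0 and 1)."""
    total = sum(all_desired)
    return desired / total if total > 0 else 0.0
```

For instance, with a desired output of 4.5 and a last expected output of 3.0, an action expected to produce 4.0 costs 5.0, while an action that does not produce the emotion at all costs 48.0.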
This factor is given by the fraction of the desired emotional output of the guideline currently being tested relative to the total sum of the desired emotional outputs of all the guidelines. Translating this into a formula we have, for n guidelines:

$$\mathit{finalEmotionWeight} = \mathit{emotionWeight} \cdot \frac{\mathit{desiredEmotOutput}}{\sum_{i=1}^{n} \mathit{desiredEmotOutput}_i}$$

Finally, when an action does not contribute to any of the guidelines, it should be penalized more than it is by this last rule. In this case the emotion weight is a constant value that is equal to the maximum emotion weight that could be defined by the previous two rules. This emotion weight is multiplied by the number of active guidelines plus one, to ensure the action (or the plan containing it) is always rejected if there are alternatives.

Chapter 5 FAtiMA and ION Frameworks

In this chapter we give an overview of the frameworks and architectures that we used (and extended) to implement our model. The purpose of this chapter is to summarize the properties of these frameworks that are relevant to our implementation, and we refer back to this chapter throughout chapter 6.

5.1 FAtiMA

FAtiMA [21] is an agent architecture in which the agent’s behaviour is particularly influenced by its emotions and personality. FAtiMA has been used in a variety of scenarios (see 3.3.2). For each prototype, FAtiMA was expanded with new features. Because of this, FAtiMA has been refactored to have a modular, component-based architecture. In this prototype we make use of the core, reactive, deliberative, OCC affect derivation and theory of mind components.

The main cycle of the Core layer iterates through the components. It is in this cycle that the agent appraises the events. The events are changes in the world. When an event concerns an action, it defines the subject, which is the agent who performs the action, and the target, which is the entity the action is directed at.
Events can be divided into three types, according to how recent they are:

• Past event – refers to any event that is part of the agent’s memory.
• Recent event – refers only to events that are part of the current episode. Since our prototype only deals with one episode, this and the past event have the same significance.
• New event – refers to events that have just happened, within a very short time interval.

The output of the core component depends on the components that are added, as the core component itself does not suffice for the agent to act. Thus the specifics of how a character appraises an event or how it selects an action depend fully on the components added to the Core layer. FAtiMA itself is independent of the appraisal system used. We used the OCC model of emotions (see section 5.1.1), which was already implemented and is used frequently in Interactive Storytelling [13], [29], [2].

5.1.1 OCC model of emotions

OCC stands for Ortony, Clore and Collins [49], who proposed an appraisal model of emotions that has been widely used to make agents with emotions in Interactive Storytelling. Appraisal refers to the process by which we map the world to our emotions. Seeing someone we like, for example, is something we may find desirable: we are appraising the event of seeing the person we like.

OCC encompasses in total 22 valenced emotion types. Valenced means that these emotions always have a negative or a positive charge: for example Joy has a positive valence while Distress has a negative valence. OCC proposes a hierarchical organization of these emotion types according to how we appraise them. The several groups are well-being, attribution, fortune of others, attraction and prospect-based emotions.

The OCC model considers several appraisal variables. The ones we make use of in our implementation are praiseworthiness, desirability, and desirability for other. Praiseworthiness refers to the moral aspect of an action, whether it is right or wrong.
Desirability defines how pleasing an event is, in general. A desirable event is so for every agent. Desirability for other concerns only how pleasing an event is for agents other than the agent itself.

We can give each OCC emotion a numeric value. The value calculated directly after the appraisal is called the emotion potential. However, each emotion also has an emotional threshold that defines how impervious a person is to that emotion. If the appraised potential of an emotion is below the threshold, the emotion is not felt. The strength with which we feel an emotion is called intensity, and is defined as the difference between the emotion potential and the threshold. Each OCC emotion also has a decay rate that defines how fast it fades, that is, how much the intensity of the emotion decreases over time. A bigger decay rate means the emotion fades away faster. The only way a certain emotion ceases to be felt according to the OCC model is by this process of decaying over time.

5.1.2 Appraisal

The appraisal process is the process by which the character evaluates the events and consequently derives emotions. FAtiMA stores the intermediate appraisal values in an Appraisal Frame. An Appraisal Frame stores the appraisal variables in relation to an event, and is modified by an agent’s appraisal components. The agent’s Affect Derivation components are then responsible for taking the Appraisal Frame and updating the agent’s emotions accordingly (we will discuss this in section 5.1.3).

These appraisal variables are tied to events through emotional reactions declared by specific authoring (see section 6.5). Each appraisal variable can have a value between -10 and 10 for each event. The Reactive component is responsible for writing these appraisal variables into the Appraisal Frame. The emotional reaction rules are organized in a tree by the Reactive component.
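To make the quantities above concrete, here is a small Python sketch of an appraisal frame with the stated value range, and of the potential, threshold and intensity relation from section 5.1.1. The class `AppraisalFrameSketch` and its method are illustrative stand-ins, not FAtiMA's actual classes; rejecting out-of-range values (rather than clamping them) is our assumption.

```python
from dataclasses import dataclass, field
from typing import Dict

APPRAISAL_MIN, APPRAISAL_MAX = -10.0, 10.0  # range stated in the text


@dataclass
class AppraisalFrameSketch:
    """Illustrative stand-in for an Appraisal Frame; not FAtiMA's own class."""
    event: str
    variables: Dict[str, float] = field(default_factory=dict)

    def set_variable(self, name: str, value: float) -> None:
        # Assumption: values outside the stated range are rejected.
        if not APPRAISAL_MIN <= value <= APPRAISAL_MAX:
            raise ValueError(f"{name} must lie in [-10, 10], got {value}")
        self.variables[name] = value


def intensity(potential: float, threshold: float) -> float:
    """Intensity is potential minus threshold; at or below it nothing is felt."""
    return max(0.0, potential - threshold)
```

A character with a threshold of 4 for some emotion would thus feel a potential of 6 with intensity 2, and would not feel a potential of 3 at all.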
When an event is matched, the Reactive component tries to choose the reaction that is most specifically defined. This can be useful for creating different reactions when an action is done by the agent and when it is done by others. For the latter case, a default rule would just have the event name, while for the former, one could define a specific rule in which the subject is the agent itself.

5.1.3 Affect Derivation

After the appraisal of the events, the OCC Affect Derivation component derives the emotions, according to its implementation of the OCC model (see section 5.1.1). As mentioned, OCC emotions are divided hierarchically into several groups, depending on which appraisal variables influence their derivation. The emotions are derived with a potential between 0 and 10 (remember they are only felt if the potential is bigger than the threshold, and we call the difference between the potential and the threshold the intensity). We give an overview of how the emotions considered in our implementation are generated, depending on whether the appraisal variables are positive or negative and according to the action subject (other or self), in figure 5.1.

Figure 5.1: Affect derivation from appraisal depending on the action subject (self or other relative to the agent appraising) and whether the appraisal variable is positive or negative.

Well-being emotions depend only on the desirability appraisal variable. A positive desirability value creates Joy, while a negative value outputs Distress. The base potential is given by the absolute value of the desirability variable:

$$\mathit{BasePotential} = |\mathit{Desirability}|$$

Attribution emotions depend on the praiseworthiness appraisal variable and on the subject of the action. Positive praiseworthiness can lead to Pride or Admiration, depending on whether the subject of the event is the agent itself or another agent. Negative praiseworthiness outputs Shame and Reproach.
The base potential is calculated in a similar fashion to that of the well-being emotions, taking into account the absolute value of the associated praiseworthiness variable:

$$\mathit{BasePotential} = |\mathit{Praiseworthiness}|$$

The “fortune of others” emotions are the ones resulting from the “desirability” and “desirability for other” appraisal variables. A desirable action can generate the Happy-For emotion if it is also desirable for others, or the Gloating emotion if it is not. An undesirable action generates either Resentment or Pity. The base potential of these emotions is the average of the absolute values of both appraisal variables:

$$\mathit{BasePotential} = \frac{|\mathit{Desirability}| + |\mathit{DesirabilityForOther}|}{2}$$

There are also composed emotions, which derive from other emotions. Distress associated with Shame results in Remorse, while when associated with Reproach it generates Anger. On the other hand, Joy can associate with Admiration to generate Gratitude or with Pride to elicit Gratification.

Note that in our prototype we took into account neither attraction nor prospect-based emotions. Attraction emotions are related to the relationships between the characters. Handling these relationships would represent unnecessary overhead for our purposes. There are only two characters and the focus is the interaction between them, meaning their emotional state already represents how they feel towards each other. The decision not to take into account prospect-based emotions results from how these emotions are generated and from our unusual use of Active Pursuit goals (see section 6.5.1).

Besides emotions, agents also have a mood. This mood has an impact on emotions and vice versa. This impact takes into account the valence of each emotion. A positive mood increases the potential of positive emotions and decreases that of negative emotions. A negative mood works the opposite way. The mood is also changed by the potential of the emotions.
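The derivation rules given above for well-being, attribution and composed emotions, together with the fortune-of-others base potential, can be sketched in Python as below. The function names are hypothetical and this is not the OCC Affect Derivation component itself. Mapping Shame to the agent's own actions and Reproach to those of others follows the parallel with Pride and Admiration, and returning no emotion for a zero-valued variable is our assumption.

```python
from typing import Optional, Set, Tuple

Emotion = Tuple[str, float]  # (emotion name, base potential)


def well_being(desirability: float) -> Optional[Emotion]:
    """Joy for positive desirability, Distress for negative."""
    if desirability == 0:
        return None  # assumption: a neutral appraisal derives no emotion
    return ("Joy" if desirability > 0 else "Distress", abs(desirability))


def attribution(praiseworthiness: float, subject_is_self: bool) -> Optional[Emotion]:
    """Pride/Shame for the agent's own actions, Admiration/Reproach for others'."""
    if praiseworthiness == 0:
        return None  # same assumption as above
    if praiseworthiness > 0:
        name = "Pride" if subject_is_self else "Admiration"
    else:
        name = "Shame" if subject_is_self else "Reproach"
    return (name, abs(praiseworthiness))


def fortune_of_others_potential(desirability: float,
                                desirability_for_other: float) -> float:
    """Average of the absolute values of both appraisal variables."""
    return (abs(desirability) + abs(desirability_for_other)) / 2


# Pairs of emotions that combine into a composed emotion, as listed in the text.
COMPOSED = {
    frozenset({"Distress", "Shame"}): "Remorse",
    frozenset({"Distress", "Reproach"}): "Anger",
    frozenset({"Joy", "Admiration"}): "Gratitude",
    frozenset({"Joy", "Pride"}): "Gratification",
}


def composed_emotions(felt: Set[str]) -> Set[str]:
    """Composed emotions whose two component emotions are both present."""
    return {name for pair, name in COMPOSED.items() if pair <= felt}
```

In the Gato Fedorento example, a suspect who appraises the foot massage offer as undesirable and blameworthy would, in this sketch, derive Distress and Reproach, which compose into Anger.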
It is affected negatively by the potential of negative emotions, and positively by the potential of positively valenced emotions. It was determined that the contribution of each emotion to the mood is 10% of its potential.

The emotions output by the OCC Affect Derivation component also take into account the decay rate, defined in the emotional threshold rules. For each emotion an agent defines a certain decay rate, from 0 to 10. This establishes how fast or slowly an emotion fades through time.

5.1.4 Agent behaviour

In the previous sections we saw how the agent perceives the world. The behaviour components define how the agent acts in the world. The agent has both reactive and deliberative components that ultimately define its behaviour. While the Deliberative component chooses actions by constructing plans derived from the active goals, the Reactive component simply chooses whether or not to activate an action based on the emotional state of the agent.

The Deliberative component builds a plan according to an Active Pursuit goal. An Active Pursuit goal is defined by a name, a set of preconditions, and success and failure conditions. The preconditions consist of predicates about properties, events or emotions that, when true, activate the goal. All of the preconditions need to be true in order for a goal to be activated. Success conditions are a set of predicates the agent tries to achieve. Failure conditions define when a character has failed to reach a certain goal.

Each character defines the Active Pursuit goal together with the importance of success and failure. In each cycle, the Deliberative component selects as the current intention an Active Pursuit goal whose preconditions are true, if there is any. FAtiMA’s Deliberative component uses a partial-order, STRIPS-like [60] planner. The planner takes the goal’s success conditions as open preconditions, that is, the conditions that the planner will try to accomplish.
These open preconditions are matched with the effects of the available actions, to build a first set of plans. It then runs a heuristic function (section 6.3) to determine which of these plans is the best so far. If the plan chosen by this process has no more open preconditions (including the preconditions of the actions in the plan, not only of the goal), the first action of the plan is selected to be executed. If that is not the case, the agent repeats its cycle, and the Deliberative component repeats the process: first, it tries to match the open preconditions of the current best plan, thus replacing that original plan; then, it evaluates all the plans again; finally, it checks whether or not there is an action ready for execution.

There are two other types of goals besides Active Pursuit: Interest goals and Emotional goals. Emotional goals are a part of our implementation and, as such, will be discussed in section 6.2. Interest goals define a number of properties the agent wishes to preserve. The agent does not actively try to achieve these properties. However, the planner takes into account the inter-goal threats, that is, whether the plan puts at risk the conditions an Interest goal is set to preserve.

The reactive behaviour of the character is defined by a set of Action Tendencies. An Action Tendency is characterized by its name, a set of preconditions and an eliciting emotion. The set of preconditions determines when the specific Action Tendency applies. The eliciting emotion is the emotion that triggers the Action Tendency. When the agent experiences the eliciting emotion with more than a certain minimum intensity (defined in authoring), it activates the Action Tendency. When a character has an ambiguous choice between Action Tendencies, he chooses the one associated with the strongest emotion.

5.1.5 Theory of Mind component

Theory of Mind (ToM) is a term coined by Premack and Woodruff [54] that refers to the ability of forming a mental model of others.
It has two main functions. One is to represent the other’s knowledge, beliefs, intentions, etc. The second is to allow reasoning about this knowledge, predicting others’ behaviour or explaining their actions.

FAtiMA’s ToM component allows agents to gather and store knowledge about other agents. This component works by creating a simplified model of the self that is associated with others and updated according to the perceived events. As the agent perceives events, the ToM component evaluates these events from the perspective of other agents, using this model. This includes appraising and altering the emotional state as well as updating the memory. We can also simulate events on other characters to predict how others would react.

Though agents initially model others according to their own view of the world, these models can be updated, since agents have different behaviours defined (different emotional thresholds or action tendencies, for example). As such, the ToM component also attempts to read the emotions of the characters following a given event.

5.1.6 Authoring files

Authoring is done via XML files that define several parameters. There are four types of authoring files in FAtiMA: one for the scenario, one for the goal library, one for the action library and one for the character’s personality.

The scenario file defines what objects and agents are part of the world. It also establishes properties of the agents, for example, establishing that the agent is a person. It is also where the connection to the sockets that allow the communication of the agents with the world simulation is defined (see 5.2).

The goal library file describes what goals are available to the agents, while the action library file gathers the actions the agent can use to achieve a goal. As mentioned before, goals consist of preconditions, success conditions and failure conditions, while actions have preconditions and effects (as well as a probability that is used in planning).
We use basically three types of conditions:

• Emotion condition – it is true when a given emotion has a higher intensity than a given minimum value;
• Event condition – it defines whether a certain event took place or not. There are three types of event conditions, for new, recent and past events;
• Property condition – it checks a certain property of an agent or an object, relating a name and a value, and verifying whether they are equal or not.

The preconditions of goals and actions, as well as a goal’s success and failure conditions, can be of any of these types. The effects of an action consist of property conditions that become true after the action has been executed.

All the actions for which the preconditions are verified can be used in planning by all the agents. Whether a goal can be activated by a character or not, however, is defined in its personality file. The personality file defines how the character appraises and reacts to the events, as well as how it acts in the world, by defining its goals and its reactive behaviour.

When a character’s possible goals are set, the importance of their success or failure is also defined by numerical parameters. As mentioned in section 5.1.4, these are used to choose between conflicting goals, when several happen to be activated at the same time. They are also relevant in generating prospect-based emotions.

A big part of the personality of the character is defined by the thresholds and decay rates of its emotions. As defined by the OCC model, the threshold defines the level of potential of an emotion above which this emotion is actually felt by the agent. The intensity of an emotion equals the potential minus the threshold. As for the decay rate, it defines how fast or how slowly an emotion fades through time. The emotion thresholds and decay rates thus describe how sensitive or impervious a character is to a certain emotion.
For example, introverts are usually more emotional, which means that they would have lower emotion thresholds [23]. Neurotic characters feel most negative emotions in a stronger manner than most, thus their attributed thresholds and decay rates for emotions such as Distress, Fear, Disappointment or Remorse would be very low.

A personality file also defines the emotional reactions of the character, i.e., how it appraises each event. Each emotional reaction is connected to an event, but one can define different appraisal rules for when the subject of the event is the agent itself or anyone else. These emotional reactions are characterized by the appraisal variables (Desirability, DesirabilityForOther and Praiseworthiness). The character file also specifies the Action Tendencies of the character.

5.2 ION and Unity3D

To produce graphical output for our prototype we used Unity3D, a game engine with a drag-and-drop interface that allows scripting in C#. The core element of a Unity application (or game) is the Game Object. Game Objects are composed of several properties and scripts that extend their behaviour. Initially they only have a transform, defining their position in the world, but it is possible to add models, textures, physical properties, etc.

To bridge the FAtiMA minds with the graphical system we used the ION framework [67], which provides a simulation for dynamic multiagent environments. The ION framework communicates with the FAtiMA minds through sockets, and considers four basic elements: Entities, Properties, Actions and Events. ION follows the Observer pattern [33], and uses the Event mechanism to manage the other three elements. Entities register to Events that are raised when a Property or an Action state changes. The ION framework provides several scripts that are added to the Game Objects defined in Unity to connect them to the FAtiMA agents and the world simulation.
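The Observer-style registration described above, in which interested parties register for an Event raised when a Property changes, can be illustrated with the generic Python sketch below. It shows the pattern only: `ObservableProperty`, `subscribe` and `set` are invented names and do not reflect ION's API, which is used from C# scripts in Unity. Skipping the notification when the value is unchanged is also our assumption.

```python
from typing import Callable, List

# Handler signature (assumed): property name, old value, new value.
Handler = Callable[[str, object, object], None]


class ObservableProperty:
    """Generic observer-pattern property; a stand-in, not an ION class."""

    def __init__(self, name: str, value: object) -> None:
        self.name = name
        self._value = value
        self._handlers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        """Register a handler to be called whenever the value changes."""
        self._handlers.append(handler)

    @property
    def value(self) -> object:
        return self._value

    def set(self, new_value: object) -> None:
        """Update the value and notify every registered handler."""
        old_value = self._value
        if new_value == old_value:
            return  # assumption: no event is raised when nothing changed
        self._value = new_value
        for handler in list(self._handlers):
            handler(self.name, old_value, new_value)
```

An animation script could, for example, subscribe to a character's mood property and switch facial expression whenever the handler is called.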
Each Property defines an event handler for when it is changed, while each Action provides three event handlers: Start, raised when the Action begins; Step, raised each time the Action is updated; and Stop, raised when the Action finishes.

Chapter 6
Laugh To Me prototype

In this chapter we detail how we implemented a proof-of-concept prototype of our model. We used FAtiMA (see section 5.1) to create the agents’ minds, and extended it to support Emotional Goals and Guidelines and to comply with our model defined in chapter 4. We also discuss in some detail the authoring of the two characters and how it relates to our model, as well as the resulting sketch and how the action selection mechanism worked in these cases. Finally, we describe how we implemented an animation system capable of showing the agents’ emotions.

6.1 Architecture Overview

Our implementation consists of two autonomous FAtiMA agents extended with Emotional Goals and Guidelines, connected to a 2D animation system developed in Unity3D through the ION framework. Figure 6.1 shows an overview of the architecture with an emphasis on the agents. For now we abstract from the specificities of the animation system and its connection to ION and FAtiMA. We detail the implementation of the animation system in section 6.7.

As can be seen in figure 6.1, we used FAtiMA’s Reactive, Deliberative and ToM components. We extended the FAtiMA framework to account for Emotional Goals and Emotional Guidelines. Emotional Goals are groups of Goal Emotions. Goal Emotions are in turn defined by the type of emotion we want to alter and the Emotional Guideline that expresses how we want to alter it. An Emotional Guideline is a function that defines the evolution of the emotion potential as the agent acts. To avoid confusion we will, for the remainder of this chapter, use the term Emotional Guideline to mean both the Goal Emotion and the Emotional Guideline itself.
A Joy Linear Emotional Guideline would thus refer to a Goal Emotion of Joy that evolves according to a linear function. Note that an Emotional Goal can be composed of more than one Guideline regarding the same emotion.

6.2 Emotional Goals and Guidelines

As defined by our model, the action selection mechanism of our agents is influenced by their Emotional Goals. An Emotional Goal is simply a set of preconditions and Emotional Guidelines. The preconditions define when the Emotional Goal is activated. After an Emotional Goal is activated, the agent will consider, in its action selection, the rules defined by the Emotional Goal. These rules are described by the Emotional Guidelines.

Figure 6.1: Agents architecture overview.

An Emotional Guideline is essentially a function that returns the desired potential of an emotion according to the number of actions the agent has executed since the Emotional Goal was activated. Emotional Guidelines should be monotonically increasing, to comply with the definition of Emotional Escalation we provided in chapter 4. However, Emotional Guidelines should also define how that increase of potential happens. We implemented three types of Emotional Guidelines: Linear, Quadratic and Sigmoid. We chose these three types because of their different behaviours. A Linear guideline grows to infinity with the number of actions at a constant rate. A Quadratic guideline also grows to infinity with the number of actions, but at a linearly increasing rate. A Sigmoid guideline approaches a horizontal asymptote as the number of actions grows to infinity.

Each of the three types of Emotional Guideline can be parameterized through authoring. We call the output of these guidelines at a given point the desired emotional potential. The Linear guideline has two parameters, a growth rate m and an initial value c.
It is defined by the function:

\[ L(t, m, c) = m \cdot t + c \]

In this function t stands for the number of actions the agent has executed since the Emotional Goal was activated, while m and c are constants that define the growth rate and the initial value of the guideline respectively. We refer to m as the growth rate to use the same terminology we use in describing the other types of guideline. In the case of the Linear guideline, m defines the slope.

The Quadratic guideline defines an accelerating growth of an emotion’s potential. It is defined by the function:

\[ Q(t, m, c) = \frac{m}{1000} \cdot t^{2} + c \]

In this guideline t also represents the number of actions, while c and m stand for the initial value and the growth rate. The growth rate m is divided by a constant value (we set this value at 1000) to facilitate authoring and avoid the need to use very small decimal numbers to represent the growth of the guideline. The growth rate m determines how fast the function grows.

Because the Linear and Quadratic guidelines tend to infinity with the number of actions, we set a limit on the value of these guidelines: their output, the desired emotional potential, never exceeds 10, which is the maximum value of an emotion’s potential.

The Sigmoid guideline has one more parameter than the Linear and Quadratic guidelines. Besides the initial value and the growth rate, it is also defined by the interval between the initial value and the horizontal asymptote (which defines the maximum value of the Sigmoid guideline). As such, a Sigmoid guideline is defined by the function:

\[ S(t, m, c, i) = \frac{i \cdot t}{m + |t|} + c \]

In this case t, m and c have the same meanings as before, while i represents the interval between the minimum value and the maximum value (that is, the horizontal asymptote). Thus, the values of the Sigmoid function lie in the interval [c, i + c].
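The three guideline types can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the prototype's code: the function and constant names are hypothetical, the Sigmoid is read as S(t, m, c, i) = i·t / (m + |t|) + c, and m is assumed to be positive.

```python
MAX_POTENTIAL = 10.0      # maximum value of an emotion's potential (from the text)
QUADRATIC_SCALE = 1000.0  # authoring constant dividing the quadratic growth rate


def linear(t: int, m: float, c: float) -> float:
    """Desired potential after t actions, growing with constant slope m."""
    return min(m * t + c, MAX_POTENTIAL)


def quadratic(t: int, m: float, c: float) -> float:
    """Desired potential after t actions, with accelerating growth."""
    return min((m / QUADRATIC_SCALE) * t ** 2 + c, MAX_POTENTIAL)


def sigmoid(t: int, m: float, c: float, i: float) -> float:
    """Desired potential after t actions, rising from c towards c + i.

    ASSUMPTION: m > 0; t must be non-negative for the curve to stay above c.
    """
    if t < 0:
        raise ValueError("guidelines are only evaluated for t >= 0")
    return i * t / (m + abs(t)) + c
```

For example, `linear(100, 2.0, 1.0)` is capped at 10, while `sigmoid` needs no cap because it cannot exceed c + i.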
Note that the Sigmoid function is only defined for t ≥ 0; otherwise the minimum value of the function would be c − i. It is also convenient to note that the growth rate m is equal to the number of steps needed for the function to reach c + i/2.

6.3 Heuristic implementation

Our model (see chapter 4) defines a heuristic function that allows actions to be selected according to the defined guidelines. In our model we have the generic concept of emotional output. We decided to represent this emotional output by an emotion’s potential (see section 5.1.1). As mentioned in section 6.2, the emotion potential each guideline defines at a certain point is called the desired emotional potential.

There are two particularities to consider in the implementation of the simulation described in section 4.6. One is that the agent does not consider its current emotions, so the simulated emotional state only contains the new emotions that result from appraising the simulated event. The other is that the simulation does not take into account the agent’s emotional thresholds. As such, when simulating the appraisal, we consider all emotion thresholds to be 1. This allows us to consider a larger set of actions, independently of the character’s personality.

The reason behind this is that actions should be chosen for how they contribute to the scene, not only for how they fit the agent’s character. A comic character often has exaggerated traits. For example, an extremely calm character would have a very high threshold for Distress. If the agent wants to maximize the other’s Distress (and because the initial model of other we have is based on that of the subject itself, see section 5.1.5) the existence of this threshold would result in discarding most actions. This is because only a very highly negative Desirability would cause a Distress emotion with enough potential to be felt by the character, that is, an action for which the potential is bigger than the Distress threshold.
If the opposing character is very sensitive to Distress, he would feel these weaker emotions (and, theoretically, the reaction could even be read to update the model of other we have of him). From the simulated emotional state that results from this process we can gather which emotions the action generates and the expected potential of each of those emotions, as required by our model.

When calculating the heuristic, we have access to all of the currently active Emotional Guidelines and we iterate through them to calculate the emotion weight for each one. If an action generates the emotion defined by the guideline, we consider the absolute value of the difference between the expected emotional potential and the desired emotional potential. Our model also states it is important to take into account whether or not the last action selected by the character was stronger than the one we are evaluating. To know this, each Emotional Guideline is notified when an action is selected, and stores the expected emotional potential associated with that action. When calculating the emotion weight of an action, we then consider whether its expected emotional potential is greater or smaller than that of the last action selected.

FAtiMA deliberative agents already select their actions through heuristics. The heuristic cost function takes into account several properties, such as the number of steps of the plan and possible threats (to the Interest Goals). We now calculate the emotion weight as defined by our model and take it into account as well when planning.

6.4 Scenario Outline

Our scenario required two characters, at least one of whom should be a comic character; an object of conflict between these two characters; and a reason to keep the characters together throughout the sketch.
We set our sketch in a pastry shop, involving one Client and one Seller. The Client is a regular character, while the Seller is the comic character who refuses to sell the cake (thus the cake is the object of conflict, and their client/seller relationship is what binds them together). The Client is obese, and the attitude of the Seller ranges from being plainly insulting to stressing the fact that the Client is overweight as a reason not to sell the cake. The punchline would be the Seller trying to sell something else. We defined that the Seller would want to make the Client angry, and as such it seemed fitting to have him try to sell antidepressants when the Client reached that state. After the Client refuses to buy the pills, the Seller, failing to see the inconsistency of his own actions, blames the crisis for the fact that he did not sell the other product.

6.5 Scenario Authoring

Authoring is the process by which we define the scenario in FAtiMA. It allows one to specify the personalities of the agents, the actions that are available to them, the objects that are part of the world, and predicates about the agents and the objects. It is through this parametrization that different FAtiMA agents are defined.

6.5.1 Action and goal authoring

Since our model is based on a structured plot, we defined a state object, which holds the plot information, such as who has the turn to talk and what the moment of the sketch is. The sketch moment property refers to the current moment of the sketch, as defined by the model: Beginning, Middle and Punchline (throughout this section, punchline and resolution are used synonymously). When the sketch moment is equal to End, it means the sketch has finished.

We use Active Pursuit Goals to define what the agents should do as actors and not as characters: that the agents should interact with each other, and when they should activate a punchline.
InteractWith is a generic goal both agents have, which states that, when they have the turn to talk, they should perform an action that requests an answer from the other agent. We define this goal the following way:

GoalInteractWith([target]):
    PreConditions: [target != self] [sketchMoment != End] [sketchTurn = self]
    SuccessConditions: [sketchTurn = [target]]

As such, the agents will always try to interact with each other, by acting and then giving the turn to the other agent. Because it is an Active Pursuit goal (see section 5.1.4), it will arouse prospect-based emotions. Whenever the agent activates this goal, it feels Hope, and if it succeeds it feels Satisfaction. However, while interacting is a goal of the agent as an actor, not as the character it is playing, the actual emotions are considered a part of the character itself, that is, of the role the actor agent is playing. Because of this contradiction all prospect-based emotions were disregarded in the authoring of the character.

The sketch moment property is used to filter which actions are available to the characters as the plot develops. The beginning and punchlines are scripted, meaning that there is only one course of action defined for the characters (though, theoretically, an agent could choose to activate different resolutions, once a punchline is activated it has a strict sequence of actions to follow). This sequence of actions is established by the preconditions. Each action has two specific preconditions: that the event of the action that should have happened before did happen, and that this action was not yet used by the character.

The sketch begins with the characters greeting each other. Then, the Seller offers his help to the Client, and the Client answers by asking for a cake. This marks the transition to the middle part of the sketch, as the subject of conflict between the characters is already determined: the purchase of the cake.
Most actions are available to the characters in this middle section, since how the characters choose actions in this section to shape the emotional content of the scene is the main focus of this thesis. These actions are mostly defined in the same way. What makes them different from each other is the appraisal the characters make of them, since it is that appraisal they will consider in the action selection. Occasionally, however, an action needs to be given a certain context. For example, an agent can only give cake if it actually has the cake. More interesting contexts are provided by the emotion preconditions: a character can only try to calm another one down if he perceives that the target of his action has a certain level of Distress; likewise, a character can only admit to an eating problem if he is self-conscious about it, that is, if he feels Shame and recognizes himself to be fat.

The triggering of the punchline is done by a specific goal. The context in which to activate the punchline is given by the goal’s preconditions. Whether or not a punchline is activated depends on the Emotional Guidelines and whether they lead to this context or not. We only defined a punchline that matched the guidelines we authored. Had the scenario allowed for the characters to follow different Emotional Guidelines, the need would arise for more punchlines that took advantage of the different reactions those guidelines would evoke.

Our punchline is that when the Client gets angry at the Seller (who has acted in an incongruent manner), the Seller offers to sell antidepressants instead of the cake the Client asked for. As such, this punchline is only activated when the Client shows anger at the Seller. This anger is shown by activating an Action Tendency (see section 5.1.4).
This is defined as the following Active Pursuit goal:

GoalActivateAntiDepressantPunchline():
    PreConditions: [Client showed anger] [sketchMoment = Middle] [sketchTurn = self]
    SuccessConditions: [sketchMoment = AntiDepressantPunchline]

Once the sketch moment is set to a specific punchline, the sequence of actions is also predefined, as in the beginning of the sketch. Usually the end of the punchline will coincide with the end of the sketch, and such is the case in our prototype.

6.5.2 Character authoring

The authoring of the character defines its personality. In the scope of this work, the personality is defined as the set of emotional rules and reactions, action tendencies and goals of a character. Following our model, we distinguish between two types of characters: comic and regular characters. A comic character appraises the real world in a way that is inconsistent with what would normally be expected of him, in order to generate action and emotional incongruities, while a regular character appraises actions without a strong deviation from what is expected of him. In our scenario the Seller is a comic character, while the Client is a regular character. When a client asks for a cake, the normal appraisal of selling this cake would be one of high desirability. Instead, and because he is a comic character, the Seller appraises the action GiveCake very negatively.

Seller character

As a comic character the Seller has a comic perspective; in our scenario the Seller is a very unpleasant person, who takes pleasure in actions that are undesirable for others. The Insult, MakeFatPeopleJoke and MakeSarcasticRemark actions are appraised taking this into account. When the subject is the Seller, he appraises these events as desirable for him but not for others. When the action subject is someone else, however, the Seller appraises these events as undesirable. Thus, this character likes to insult other characters but does not like to be insulted.
Note that the stereotypical relation between a seller and a client (“the client is always right”) is incongruent with this appraisal.

Though the aforementioned actions all follow the same pattern (high desirability for self, low desirability for others), they must have different gradations of those variables. Since we want our emotional escalation to be gradual, we need the available actions to be appraised with varying degrees of intensity. We set MakeSarcasticRemark to be a weaker action than Insult, which in turn is weaker than MakeFatPeopleJoke. MakeFatPeopleJoke was chosen to be stronger than a simple Insult because, while the latter is generic, the former is quite personal, and an overweight person would more easily relate to it and feel offended by it. The agent also appraises these actions as blameworthy.

Some actions are appraised by the Seller as undesirable both for him and for others, such as RaiseMoralIssues or the Reason actions. We chose these rules because, though these actions go against the desires of the Client, they are somehow justifiable and can be respected by other characters. This is contrary to the comic persona of the Seller, who would not want respect. However, these actions are appraised as praiseworthy. All the appraisal rules used by the Seller are described in appendix A.

The Seller is quite insensitive to most emotions, with a threshold of 8 out of 10 for Shame or Remorse, for example. Anger and Distress have medium thresholds of 4 and 5 respectively. However, the Seller gloats and feels Joy easily, with a threshold of only 2 for both emotions.

Client character

As a regular character the Client has appraisal rules more in line with what is expected of him. The Client appraises actions such as Insult, MakeSarcasticRemark and MakeFatPeopleJoke as blameworthy and undesirable both to others and to himself.
While he also finds it undesirable for the Seller to RaiseMoralIssues in order not to sell him the cake, he finds it justified and, as such, praiseworthy. Since his patience would be put to the test by the Seller, we focused on the actions the character could use to recover pride after the Seller’s negative actions. Actions such as TryCalmDown, DemandRespect and FormallyComplain are thus appraised as praiseworthy. While these last two actions have the same high praiseworthiness value, their desirability is different, as the Client finds the DemandRespect action more desirable than FormallyComplain. Once again, we stress the importance of differentiating the appraisal of different actions in order to understand their relevance for the emotional guidelines and the subsequent emotional escalation.

The actions that would lead the Client to his ultimate goal of getting the cake are also appraised as desirable. As such, the action AskForCakeOrCandy is appraised with high desirability. There is little else, besides this and the actions already referred to as highly praiseworthy, that the Client character finds desirable. The full list of appraisal rules of the Client is found in appendix B.

As for his emotional reactions, the Client is slightly more insensitive to Anger (with a threshold of 4 out of 10) than to Distress (threshold of 3). However, he is more sensitive to Reproach (threshold 2), the other emotion involved in generating Anger (see section 5.1.1), than to Distress.

Action Tendencies

The Action Tendencies of the characters allow them to show what they are feeling.
We defined some simple Action Tendencies:

• BlushSlightly and BlushHeavily to show varying degrees of shame;
• ShowSlightAnger and ShowGreatAnger to show varying degrees of anger;
• FrownSlightly and FrownHeavily to show varying degrees of distress;
• SmileSlightly, SmileOpenly and SmileWidely to show varying degrees of joy.

These Action Tendencies are used to animate expressions (see section 6.7) as well as to activate our punchline. Since the punchline context is that the Client is angry, the precondition for activation is that the Client shows anger through the ShowGreatAnger Action Tendency.

6.5.3 Emotional goals authoring

If the defined Active Pursuit Goals were all we used to define the behaviour of the characters, they would always choose the same action in order to achieve their generic goal of interacting with one another. However, as established by our model, the behaviour of the characters is mainly directed by their emotional goals and guidelines.

The goal of the Seller as a comic character is to annoy the Client. As such, two emotions that the Seller will likely want to arouse are Distress in the Client and Gloating in himself. Note that Distress is the result of a negative desirability, while Gloating is caused by appraising an event as desirable, but not desirable for others. This seems contradictory: the same event would have to be appraised as having both positive and negative Desirability. As such, an event will only lead to both Gloating in the Seller and Distress in the Client if we consider two factors. The first is that the Seller appraises desirability differently when the subject of the action is himself and when it is not. The second is the use of a ToM when trying to cause Distress in the Client. Note that this ToM is initially based on the model of the Seller himself. Since the Seller appraises the event differently when he is the target, he will appraise the action as not desirable for the Client.
This is due to the fact that, through his ToM, the Seller is able to see the perspective of the Client, and understand that the Client sees himself as the target of that action.

We show in figures 6.2 and 6.3 two different representations we used to prototype the Seller’s emotional goal. The difference between these two representations is that while A only takes advantage of the Distress emotion, B also has an emotional guideline for Gloating. In section 6.6 we will see the impact this has on the character’s action selection.

The emotional goal DispleaseClient-A consists of an initial Sigmoid curve of the Client’s Distress that is followed by a Quadratic growth. This makes for a change of pace in the sketch, which starts slowly but escalates fast afterwards. Emotional goal DispleaseClient-B is an extension of this emotional goal. In this case, however, we added Quadratic guidelines to represent the growth of the Seller’s Gloating emotion. There are two Quadratic guidelines for each of Distress and Gloating, to make their relative weight more relevant than that of the Sigmoid curve as they grow (see section 4.6).

The Client himself, being a regular character, could perhaps have been purely reactive, as his actions are mainly reactions to the Seller’s inappropriate behaviour. However, a reactive FAtiMA agent has a far more unpredictable output, which could not be controlled by our interaction goal, which only allows the characters to speak in turns. As such we also defined the Client’s behaviour through an Emotional Goal. The initial aim of the Client is to get the cake; we define this as a Joy emotional guideline. However, as the Client is provoked, he will need to react by performing actions that recover his wounded pride. As such the Client also has two Pride emotional guidelines.
6.6 Prototype Output

We discuss the sketches obtained as a result of the described authoring before analyzing the animation system, because these results are independent of it and the purpose of the graphical output is rather to enable testing users’ reactions to the work. It should be taken into account, though, that the timing of the animations also has an influence on this output. Since emotions and the emotional state evolve through time (because of the decay rate), so does the mood, which ultimately has a certain influence on the expected potential of each action (see section 5.1.3). The outputs described in this section were the result of running the sketch with the graphical system, and are the ones that were presented to viewers.

The output scripts of the final sketches that resulted from the implementations described here, with the two different Seller Emotional Goals, DispleaseClient-A and DispleaseClient-B, are found in appendices C and C respectively. We can see that the Emotional Escalation is mainly visible in the Client, the regular character, who starts by smiling slightly, then gets distressed, and finally angry (showing two different levels of anger), as the Seller is more insensitive to most actions, with high thresholds for his emotional reactions.

In figures 6.2, 6.3 and 6.4, we show the various Emotional Goals represented by their different guidelines and the actions chosen as a consequence of those guidelines, which makes it easier to understand the description we present in this section. Note that the potential represented is the actual potential of the emotion felt by the character. As such, if the potential is below the threshold, it shows on the plot as if it had zero value.

6.6.1 Seller with Distress guideline only

Figure 6.2: Emotional Goal DispleaseClient-A.
Letters represent actions: A-RaiseMoralIssues, B-Reason, C-WarnHinderAppetite, D-MakeSarcasticRemark, E-MakeFatPeopleJoke

Figure 6.2 shows a graphical representation of the Emotional Goal DispleaseClient-A. We can see both Distress Emotional Guidelines that are part of this Emotional Goal, a Sigmoid we dub S1 and a Quadratic we call Q1. Before explaining how each action was selected, recall that Distress is the result of negative Desirability. Note as well that, since the desired effect is to distress the Client, the emotional output shown in figure 6.2 refers to the model of other the Seller has of the Client.

At the beginning, the most influential guideline (remember that the higher the emotional potential defined by the guideline, the more importance it is given by the heuristic) is the Sigmoid, S1. This Sigmoid has a slow growth rate, and this is why the initial action chosen, RaiseMoralIssues, repeats several times, as the Quadratic guideline grows to catch up with it. The sketch only proceeds when the Quadratic guideline’s value is bigger than the Distress potential of action RaiseMoralIssues, at which point a more undesirable action is chosen, which is Reason. From here on, the Quadratic guideline gains preponderance, making the sketch evolve at a faster pace. The Seller selects the action WarnHinderAppetite, followed by MakeSarcasticRemark and finally MakeFatPeopleJoke. These actions are more and more undesirable, leading to a growing Distress of the Client. MakeFatPeopleJoke is appraised by the Client as especially undesirable (and also as undesirable for the target of the action), which angers the Client above the limit imposed by the ShowGreatAnger Action Tendency.

Throughout the sketch, the Client reacts to the Seller’s actions. The Client starts by simply showing Distress. This is due to two reasons. One is that the Client’s Distress threshold is lower than the Anger threshold.
The other is that actions such as RaiseMoralIssues are appraised by the Client as undesirable but praiseworthy (remember that Anger is the product of negative Desirability and negative Praiseworthiness in relation to the actions of others, see section 5.1.1). Only the sarcastic remark and the fat people joke are appraised in such a way that they trigger the Anger Action Tendencies of the Client.

6.6.2 Seller with Distress and Gloating guidelines

Figure 6.3: Emotional Goal DispleaseClient-B. Top: the Gloating guidelines; bottom: the Distress guidelines. Letters represent actions: A-RaiseMoralIssues, B-WarnHealthRisks, C-MakeSarcasticRemark, D-Insult, E-MakeFatPeopleJoke

The Emotional Goals DispleaseClient-A and DispleaseClient-B are very similar. The differences, which lead to a slightly different sketch, are that the latter considers Gloating and that its growth rates are lower in general, changing the timing and pacing of the sketch. We can see this slower growth in figure 6.3.

The beginning of the sketch is still mainly oriented by the Distress guidelines (which have higher initial values), making the Seller choose the action RaiseMoralIssues, as he did in the sketch without Gloating. However, the action that follows is different: WarnHealthRisks. The Seller could choose neither Reason nor WarnHinderAppetite (the actions he chose in the sketch without Gloating) because the Gloating guidelines have now gained relevance, and neither of those actions produces Gloating, since they have positive Praiseworthiness. The Seller only defines one appraisal rule for Reason, which means he considers it undesirable regardless of the perspective (the Client’s or the Seller’s). As for WarnHinderAppetite, the Seller considers it desirable for the other, producing a Happy-For rather than a Gloating emotion.

The other action that is different in this sketch is Insult. This case is different because, though it also generates Gloating, the Seller views it as an action as undesirable as MakeFatPeopleJoke.
As such, the Insult action could also be selected by the previous Emotional Goal, which only considered Distress. In this case, there is a specific incentive to move on from Insult to MakeFatPeopleJoke, which is the contribution of Gloating, whereas when using the other Emotional Goal the actions can only be differentiated by other factors, such as the order in which they appeared in authoring.

The Client does not share the Seller’s appraisal of the action Insult. He is less angered by an insult than by a joke. As explained in section 6.5.2, we made it so because a fat people joke would be more concrete, more personal. If a sketch had to rely on the Insult action to reach its end, it would probably never end, since Insult would never make the Client sufficiently angry to trigger the Action Tendency that activates the punchline. One other action that is likely to end a sketch is MakeSarcasticRemark, as the Client also appraises it as highly undesirable. Though less so than MakeFatPeopleJoke, repeated sarcastic remarks would make the Client angry enough to trigger the ShowGreatAnger Action Tendency (depending on the time between actions and the consequent effects of the decay rate).

It should also be noted that Insult causes less Gloating than MakeSarcasticRemark. However, the sum of the two Quadratic Distress guidelines is, at the point the Insult action is selected, bigger than the sum of the Gloating guidelines, making the Distress guidelines more relevant. MakeSarcasticRemark is not chosen because it causes much less Distress.

An important thing to notice is the difference in the pacing of the sketch. The fact that all the Emotional Guidelines have smaller growth rates means the Emotional Escalation is not as fast, and the Client does not get angry as quickly. Each character selects 3 more actions in this sketch than in the sketch without Gloating.
6.6.3 Client with Joy and Pride guidelines
As already noted, the Client, as a regular character, could theoretically do without Emotional Goals. The Client has the specific aim of getting the cake he desires. His other actions should be related to the situation he finds himself in because of the unexpected behaviour of the Seller. As such he seeks to recover his pride. We can see this represented in the Client’s Emotional Goal in figure 6.4. The simplicity of the Emotional Goal attributed to the Client is reflected in his simple behaviour: he selects only two different actions, AskCakeOrCandy and FormallyComplain. In the beginning the most relevant guideline of the Client is the sigmoid Joy guideline. Though this guideline is not shown in figure 6.4, the AskCakeOrCandy action generates a level of Joy that is just below the guideline value. It also produces Pride, though the potential of the emotion that results from this action is below the threshold level of the Client, which means the Client himself does not feel that emotion as a result of AskCakeOrCandy. The second action, FormallyComplain, is chosen because of the rapid growth of the Pride guideline. There are several other praiseworthy actions the Client could perform, for example Reason and DemandRespect. However, the Client appraises reasoning as undesirable, while DemandRespect has an emotional pre-condition that requires the subject of the action to feel Reproach towards the target.
Figure 6.4: Emotional Goal KeepPride. Pride guidelines only, Joy guideline not shown. Letters represent actions: A-AskCakeOrCandy, B-FormallyComplain
6.7 Animation system
Our prototype includes an animation system that showcases the actions of our agents. The animation system should:
• Provide a comic, non-serious context;
• Allow the characters to communicate through textual utterances;
• Be able to showcase the characters’ emotions.
To comply with the first requirement we decided the environment should have a cartoonish look-and-feel, inspired by comic animated series. We decided that actions should be able to pop up speech bubbles with text, as used in graphic novels, which solved the second problem. The third property was a bigger challenge. We developed an animation system that emphasizes facial expressions as a vehicle of emotion, a topic extensively researched by Paul Ekman [26], [27]. To do this we separately animated several features of the face, namely the skin color, the eyes, the eyebrows and the mouth. We can see the final result in figure 6.5, which shows the Client displaying Anger.
Figure 6.5: Screenshot of our prototype running in Unity. The Client is showing Anger.
6.7.1 Architecture Overview
Each part of the character we want to control is animated independently. As such there is a controller for the mouth, eyes, eyebrows and face, as well as for the arm, as shown in figure 6.6. We call these partial expression controllers. This way we can create more expressions by combining different expressions of the different parts of the character’s face. These expressions are controlled by a global expression controller, which dictates the behaviour of each partial expression controller to express a given emotion.
The animation itself is a sprite animation. Sprites are sequences of frames that are stored in one single image (see example in figure 6.7). We apply these sprites, as textures, to planes. The sprites are resized so that only one frame appears at a time. The horizontal offset applied to the texture varies over time, changing the frame that is viewed, thus animating the character. These sprites may also have different rows to represent different animations, so a vertical offset is also added to select the right animation. This resizing and animation is handled by an Animation Controller.
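The resizing and offsetting performed by the Animation Controller can be illustrated with a minimal sketch. It assumes a sprite sheet laid out as a regular grid, with one animation per row and texture coordinates normalised to the range 0 to 1; the class and function names are hypothetical and do not correspond to the prototype’s code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpriteSheet:
    """A sprite image laid out as a regular grid (assumed layout)."""
    columns: int  # frames per row
    rows: int     # one animation per row


def texture_scale(sheet):
    """Scale that makes exactly one frame cover the whole plane."""
    return (1.0 / sheet.columns, 1.0 / sheet.rows)


def frame_offset(sheet, frame, row):
    """Horizontal and vertical texture offset that selects one frame."""
    if not (0 <= frame < sheet.columns and 0 <= row < sheet.rows):
        raise ValueError("frame or row outside the sprite sheet")
    scale_x, scale_y = texture_scale(sheet)
    return (frame * scale_x, row * scale_y)


def frame_at(elapsed_seconds, frames_per_second, start, end, loop=True):
    """Frame index shown after some elapsed time, between start and end."""
    length = end - start + 1
    step = int(elapsed_seconds * frames_per_second)
    if loop:
        return start + step % length
    return min(start + step, end)


if __name__ == "__main__":
    eyes = SpriteSheet(columns=4, rows=2)
    frame = frame_at(0.5, 10, start=0, end=3)
    print(texture_scale(eyes), frame_offset(eyes, frame, row=1))
```

Whether row 0 is the top or the bottom row of the image depends on the texture coordinate convention of the rendering engine, which this sketch leaves open.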
Each Partial Expression Controller is a separate state machine with its own Animation Controller, and each different animation is represented by a state. The parameters of an animation state are the starting frame, the ending frame, the sprite row and the kind of loop (whether it should keep looping until explicitly told to stop or whether it should stop at the beginning, at the end or at any specified frame in between). A Partial Expression Controller may queue animations in a FIFO (First In First Out) list. When an animation ends, the Partial Expression Controller pops the next animation state from the FIFO. The current animation state is then responsible for animating the transition to the next state. For example, when we want to blink a character’s eyes we push StateCloseEyes followed by StateOpenEyes. The current state animates the eyes closing and sets the current animation state to StateCloseEyes. When the closing animation finishes, StateOpenEyes is popped, and StateCloseEyes triggers the animation for opening the eyes and sets the current animation state to StateOpenEyes. The Global Expression Controller can trigger several expressions by pushing specific states to the animation FIFO of each Partial Expression Controller. See section 6.7.2 for a description of how these states attempt to represent the emotional behaviour of the character.
Figure 6.6: Architecture of the animation system
As previously mentioned, speech is shown in a comics-like speech bubble. The time the speech should last is set per text character (for example, at half a second per character, saying “Hello!” lasts for 3 seconds, since its six characters include the exclamation mark). Each action that is verbally expressed has its own Speech Act. These Speech Acts consist of a list of utterances with the same meaning but different phrasing. The purpose is that each action can be selected more than once without always being phrased in exactly the same way.
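The FIFO behaviour of a Partial Expression Controller described above can be sketched as a small queue-driven state machine. This is a simplification under stated assumptions: each state simply plays its own frame range, whereas in our description the outgoing state animates the transition; one call to tick stands for one displayed frame; and the class names, the idle state and the frame numbers are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class AnimationState:
    """Parameters of one animation (hypothetical field names)."""
    name: str
    start_frame: int
    end_frame: int
    row: int
    loop: bool = False


class PartialExpressionController:
    """Plays queued animation states in first-in, first-out order."""

    def __init__(self, initial):
        self.current = initial
        self.frame = initial.start_frame
        self.queue = deque()

    def push(self, state):
        """Queue a state to be played after the ones already waiting."""
        self.queue.append(state)

    def tick(self):
        """Advance by one frame and return the name of the active state."""
        if self.frame < self.current.end_frame:
            self.frame += 1
        elif self.queue:
            # The current animation has ended: pop the next queued state.
            self.current = self.queue.popleft()
            self.frame = self.current.start_frame
        elif self.current.loop:
            self.frame = self.current.start_frame
        return self.current.name


if __name__ == "__main__":
    eyes = PartialExpressionController(AnimationState("StateIdle", 0, 0, 0, loop=True))
    # A blink: close the eyes, then open them again.
    eyes.push(AnimationState("StateCloseEyes", 0, 2, 0))
    eyes.push(AnimationState("StateOpenEyes", 3, 5, 0))
    for _ in range(7):
        print(eyes.tick(), eyes.frame)
```

In this sketch a looping state is interrupted as soon as another state is queued and the loop reaches its last frame, which is one possible reading of looping until explicitly told to stop.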
Since an utterance may be too long to show all at the same time, it can be divided into several units, called Sentences, that are united in a larger unit, called Paragraph. While saying each of these Sentences the character is talking. An action that is associated with a Speech Act only stops after the character stops talking.
Figure 6.7: Example sprite for the Seller’s eyes.
6.7.2 Expressing emotions
Several animation systems have been developed that focus on showcasing emotions, using Paul Ekman’s basic emotion category system. Ekman himself focused mainly on facial expressions [27]. Ekman’s emotion theory originally comprises six basic emotions: anger, disgust, fear, happiness, sadness and surprise [26]. The emotional output of a FAtiMA agent is its reactive behaviour, composed of its Action Tendencies. As such, the main emotions we focused on showing were Joy, Anger and Distress (since these were the emotions related to the Action Tendencies that were used). Mapping from the 22 OCC emotions to the 6 basic categories proposed by Ekman is not a trivial task. While Anger and Joy seem to relate directly to Ekman’s anger and happiness, Distress is less clear. Since Distress is the opposite of Joy in the OCC model, we interpreted it as sadness. Extreme distress could be portrayed as extreme sadness or agony. To show happiness, we used the smile, pulling up the corners of the mouth. There are a number of subtleties to smiles, but most would be lost on our cartoon-like characters. To show sadness we lowered the corners of the mouth and pulled the inner part of the eyebrows up. We allow these features to be more or less accentuated in order to show varying degrees of sadness. To show anger we reddened the face and lowered the eyebrows, making the inner corners of the eyebrows go toward the nose. The actual mouth expression for anger is to press the lips together tightly and tense them [27], but we did not make a different mouth expression only for anger.
Since OCC describes Anger as the combination of Distress and Reproach, we used the same mouth expressions we used for showing Distress. Besides the expressions presented, we also used an eye roll animation to show contempt. We do not relate contempt directly to the display of an OCC emotion, but rather to the action MakeSarcasticRemark, since sarcasm is precisely “the use of irony to show contempt”.
Chapter 7 Evaluation
In evaluating our implementation, we check whether viewers acknowledge the emotional escalation and whether the prototyped sketch is recognized as humorous. There are many factors to account for in humour, which makes this task difficult. We try to account for some of these factors, such as the length of the sketch or the quality of the punchline, and see how they influence the perception of humour.
7.1 Test structure
The test consisted of a web-distributed questionnaire about a video showing the prototype described in chapter 6. The full questionnaire can be found in appendix E. Two versions were evaluated, one using DispleaseClient-A (with the Distress Emotional Guidelines only) and another using DispleaseClient-B (with both Distress and Gloating Emotional Guidelines). Questions were organized in three groups. The first two groups presented similar questions and had the purpose of analyzing the viewer’s perception of each agent: the first one was about the Seller and the second one about the Client. The first three questions of these groups evaluate the feelings of each character in the beginning, middle and end of the sketch, through a set of nominal values from which the viewer chose one. The following three questions were Likert scales of statements about the character: the first about how normal his behaviour was compared to what was expected, the second about the coherency of the character, and the third about the evolution of his aggressiveness.
The third group consisted of Likert scales of statements about the sketch as a whole, including:
• the perceived length of the sketch;
• the quality and comprehension of the ending or punchline;
• the perceived funniness of the sketch;
• whether the viewers perceived an evolution of the characters’ emotional state.
We registered 75 responses to the questionnaire using Distress and 30 to the one using Distress and Gloating. Each participant only answered one of the questionnaires. Out of the 75 answers to the inquiry without Gloating, 37 came from male and 38 from female participants. The questionnaire with Gloating had 14 female and 16 male participants. In this chapter, we discuss the results of both inquiries. We pay special attention to the test without Gloating and compare it to the test with Gloating when relevant.
7.2 Seller behaviour
Figure 7.1: Bar charts representing the perceived feeling evolution of the Seller throughout the sketch (without Gloating): (a) Seller’s feelings in the beginning; (b) Seller’s feelings in the middle; (c) Seller’s feelings in the end.
Figure 7.1 summarizes how the evolution of the Seller’s feelings was perceived in the prototype with the Distress guideline. The participants clearly identify Happiness as the initial feeling of the Seller (60% of the answers), while a significant number do not identify the feeling as any present in the list (25.3%). The middle section of the sketch is slightly less clear, with responses scattered across the answers Anger (6.7%), Disappointment (12%), Happiness (6.7%) and Sadness (9.3%). However, the prevalent perception is that the Seller gets worried as the sketch goes on (36%). 28% did not find in the choice list a word that could express the Seller’s feelings. As for the ending part of the sketch, opinions are divided between the answers Disappointment (42.7%) and Sadness (49.3%). The perceived emotions are thus consistent with both the actions and the expressions of the Seller character.
Initially, the Seller feels glad to see the Client, thus Happiness seems the most appropriate answer. As the sketch proceeds, the Seller’s smile fades to a neutral smile. Participants had some doubts on how to interpret this, but the prevalent reading was that the Seller was worried. In the ending part the Seller fails to sell the antidepressant pills and, as a result, he sports an extremely sad smile. Participants recognized his sadness and inferred, from the actions and subsequent reactions, that the Seller was disappointed at not selling the antidepressants.
The answers to the following questions, regarding the Seller’s behaviour, are summarized by the boxplots in figure 7.2. In question Q4, the overwhelming majority of the participants (76%) think the Seller character did not behave according to expectations. The cumulative percentage of those who answered between 1 and 3 (in which 1 means they totally disagree and 3 that they neither agree nor disagree that the Seller behaved according to expectations) is 93.3%. We can thus say the Seller was recognized as the incongruent character. Question Q5 refers to the coherency of the Seller character throughout the sketch. However, the answers do not show any kind of consensus, with both prototypes showing a quite even distribution of answers across the possible values, and a high standard deviation. Finally, question Q6 asks if there was a perceived evolution of the character’s aggression: 1 meant the participant thought the Seller did not become more aggressive as the sketch went on, while 5 meant the participant perceived a rise in aggression. In the prototype that does not feature Gloating it is very hard to reach a conclusion, as answers are distributed almost equally over all values.
Figure 7.2: Seller’s behavior (questions 10 through 13): Left (A) refers to prototype with Distress only; Right (B) refers to prototype with Distress and Gloating.
However, the sketch with Gloating seems to feature a clearer evolution of aggression, as the median is higher than in the test without Gloating, and 60% answered with either 4 or 5, meaning they agree the Seller became more aggressive. This may arise from the fact that the Emotional Goal with Gloating produces a slower development of the Seller’s actions: the Seller not only makes two sarcastic remarks but also insults the Client, two of the arguably more aggressive actions available. However, the data of this questionnaire is also not enough to reach a definite conclusion, since a significant percentage of the participants (23.3%) answered they did not agree with the statement. This lack of consensus is reflected by the high standard deviation (1.634).
7.3 Client behaviour
The Client’s emotional escalation was even more straightforward than the Seller’s (see figure 7.3). Being the regular character, most of the emotional escalation of the sketch was perceivable through him. The initial perception of the Client’s feelings is similar to the Seller’s, with Happiness gathering 57% of the answers. The evolution of the Client’s feelings is then perceived as a growth of Anger. 61.3% thought the Client was angry throughout the middle section of the sketch, growing to 70.7% in the ending part. The presence of Surprise should also be noted, perceived in the middle section with 12% of the answers, as well as in the ending section (6.7%). Surprise was not expressed by the animation (there were no raised eyebrows, for example), which may indicate that the viewers either identified Surprise in the character’s actions (the formal complaint) or identified with him, attributing to him some of their own surprise.
Figure 7.3: Bar charts representing the perceived feeling evolution of the Client throughout the sketch (Distress guideline): (a) Client’s feelings in the beginning; (b) Client’s feelings in the middle; (c) Client’s feelings in the end.
Figure 7.4: Client’s behavior (questions 10 through 13): Left (A) refers to prototype with Distress only; Right (B) refers to prototype with Distress and Gloating.
The next three questions provide more data about the Client than the same questions did about the Seller. Looking at the box plot for question 10 in figure 7.4 we can see the participants agree the Client behaved as expected. They also agree the Client was coherent. Since the Client only performed two actions throughout the whole sketch, this was to be expected. The answers about whether the participants perceived an evolution of the Client’s aggression yielded slightly less expected results. In the test without Gloating 44% totally agreed the Client became more aggressive and 25.3% somewhat agreed with that statement, with the test with Gloating featuring similar results (26.7% answered 4 and 46.7% answered 5). The Client only performs two actions: asking for the cake and asking for the complaint book. Although one can consider that he got a little more aggressive because he started to complain, most of the perceived aggression may arise from the emotions shown by the character, which the viewers identified mainly with anger, as mentioned before.
7.4 Sketch overview
Figure 7.5: Box plots summarizing the answers about the sketch as a whole (questions 13 through 18) for questionnaires without (left) and with (right) Gloating.
One thing we can take from the box plots presented in figure 7.5 is that participants do not think the characters felt the same way in the beginning as they did in the end of the sketch (Q17). It also seems people agree the sketch with Gloating was too long, more so than the sketch without. This seems reasonable, since that sketch is actually longer. Question 18, which asked the participants if they agreed the sketch was funny, shows a mean of 3.17, with a standard deviation of 1.319.
However, these results provide more interesting information if we relate the answers to question 18 with the answers to the other questions, trying to understand why participants thought the sketch was funny or unfunny. We present some correlations, considering the test without Gloating, which had the most participants. Though these do not establish a cause-effect relation, they do provide some hints on how the viewers evaluated the sketch.
Figure 7.6: Mean value of Likert scale about the sketch length (Q13) compared with funniness ratings (Q18).
Figure 7.6 seems to suggest a relation between the funniness of the sketch and the feeling that the sketch ran for too long. Indeed, a Spearman correlation test indicates a correlation coefficient (rho) of −0.366, significant at the 0.01 level, between those who agreed the sketch was too long and those who agreed the sketch was funny. This is consistent with the hypothesis that the less funny the sketch is perceived to be, the stronger the feeling that it is too long. This may make the case that pacing is indeed an important subject in Interactive Comedy. In figure 7.7 we can see the relations between the mean values of the answers to Q15 and Q16 and the perceived funniness of the sketch. Q15 refers to how good the ending was, in which 5 means the participants completely agree the ending was good and 1 that they completely disagree. Q16 asks if the ending should be better explained. A Spearman-rho correlation test for questions Q15 and Q18 shows a stronger, positive correlation between participants who thought the ending was good and those who found the sketch funny (rho of 0.597, significant at the 0.01 level). This stresses the importance of the punchline of the sketch. If we see it in relation to the participants who thought the sketch was too long, we can also consider that more jokes are needed, to be triggered in the intermediate part of the sketch and provide a better buildup.
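For readers unfamiliar with the Spearman test used here, the coefficient can be computed as the Pearson correlation of the ranks of the two answer lists, with tied answers (frequent on a 5-point Likert scale) receiving the mean of the ranks they span. The sketch below is a plain-Python illustration of that definition, not the statistical package used in our analysis, and it does not compute the significance level. The answer lists are made-up toy values, not questionnaire data.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks of the values; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant answers")
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical toy answers on a 1-5 Likert scale (NOT the study data).
    too_long = [5, 4, 4, 2, 1, 3, 5, 2]
    funny = [1, 2, 3, 4, 5, 3, 2, 4]
    print(round(spearman_rho(too_long, funny), 3))
```

A negative value, as reported for Q13 against Q18, means that higher agreement with one statement tends to go with lower agreement with the other.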
The relationship between the understanding of the ending and the funniness does not seem so evident. The Spearman-rho test shows a correlation, though not as strong, with a rho of 0.356, significant at the 0.01 level. Once again this emphasizes the importance of the punchline, as well as the need for a good relation between the buildup and the punchline. Our model accounts for this relation through the preconditions that are needed to trigger a certain punchline. However, the buildup could probably be improved by adding a bit more context to the actions each character selects.
Figure 7.7: Bar charts representing the perceived feeling evolution of the Client throughout the sketch (Distress guideline): (a) Relation between mean ending quality (Q15) and funniness (Q18); (b) Relation between mean unintelligibility of ending (Q16) and funniness (Q18).
Chapter 8 Conclusion
From our overview of the works in Computational Humour we observed that the field has mainly revolved around Natural Language systems that generate puns and other basic types of jokes. Humour in Interactive Storytelling (Interactive Comedy) has seen limited development since its recent inception. Most of the implementations proposed focused on plan formalizations that allow the agent to fail. We intended to provide a more generic model that could be adapted to a larger variety of situations. We focused on sketch comedy writing. A sketch consists of a short comic scene and is thus a good candidate for the type of scenario in which one could develop an Interactive Comedy system. In order to make a model of a sketch that could be implemented as a system acted by autonomous agents, we considered a background of humour theories and comedy writing. Our proposed model divides a sketch structurally into three parts. The first part introduces the information needed to understand the rest of the sketch and provides the source of conflict between the agents.
The middle part develops this conflict towards an emotional peak in which a resolution or punchline is activated. This resolution is also scripted; however, our model makes possible the existence of several different resolutions, which are activated depending on the context that results from the evolution of the sketch. We also propose a categorization of incongruity which we take into account when authoring the personality of our agents, in order to make a distinction between comic and regular characters. Our main contribution is in the intermediate part of the sketch. We propose that agents behave not only as characters but also as actors that play characters. As such they guide their actions in relation to an Emotional Guideline, which maps scene time to emotional output. The pacing of the sketch can be controlled by the shape of these guidelines, and by how fast or slowly they contribute to the Emotional Escalation. An Emotional Escalation is the evolution of emotions towards an emotional peak in which the sketch is resolved. We have implemented this model as a prototype built upon the FAtiMA agent architecture, and tied it to an animation system that is capable of expressing the agents’ emotions and thus portraying the emotional escalation. Evaluation of this prototype shows the viewers were able to identify how the agents’ emotions evolved throughout the sketch. The assessment of the comedic value of the resulting sketch is encouraging, albeit inconclusive. The relation between the perceived length of the sketch and its funniness suggests pacing should be a topic of interest in Interactive Comedy.
Our model contributes to the area of Interactive Comedy and to Interactive Storytelling in general. Being an initial effort, it also leaves room for improvement, which is left as future work. It would be useful to have a more comprehensive evaluation and validation of the model that assesses how different Emotional Escalations affect the comic value of the sketch.
Also, our model relies heavily on authoring, both for the characters’ personalities and for the Emotional Guidelines. Ideally some of this knowledge should be part of the agent itself, to reduce the authoring and improve the ability of the agent to change its behaviour according to its appraisal of the world and of its interaction with other agents. Another way to reduce authoring would be to give the agent a model of what a “regular person” is, and derive possible incongruities to explore from there. Our evaluation also suggests that the actions selected during the sketch should be more coherent and be given better context. The evaluation also suggests that, though the punchline is usually the high point of a sketch, the sketch could activate weaker jokes throughout, to make it more entertaining and feel less long. Finally, since humour is so connected to our social interactions, integrating the possibility of interacting with comic agents could probably also improve the comic value of the sketch.
Bibliography
[1] APAD. Revista associação portuguesa de argumentistas e dramaturgos, 2008.
[2] S. Araujo and L. Chaimowicz. A synthetic mind model for virtual actors in interactive storytelling systems. In AIIDE, 2009.
[3] Aristotle. The Art of Rhetoric. Penguin Classics, 1992.
[4] Aristotle. Poetics. Penguin Classics, 1997.
[5] S. Attardo. Linguistic Theories of Humor. Mouton de Gruyter, 1994.
[6] R. Aylett, N. Vannini, E. Andre, A. Paiva, S. Enz, and L. Hall. But that was in another country: agents and intercultural empathy. In Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, pages 329–336, 2009.
[7] R. S. Aylett, S. Louchart, J. Dias, A. Paiva, and M. Vala. Fearnot!: an experiment in emergent narrative. In Lecture Notes in Computer Science, pages 305–316. Springer-Verlag, London, UK, 2005.
[8] J. Bates. The role of emotion in believable agents. Commun. ACM, 37(7):122–125, 1994.
[9] K. Binsted.
Using humour to make natural language interfaces more friendly. In Proceedings of the AI, A-Life, and Entertainment Workshop, Intern. Joint Conf. on Artificial Intelligence, page 4, 1995.
[10] K. Binsted. Machine humour: An implemented model of puns. PhD thesis, The University of Edinburgh: College of Science and Engineering: The School of Informatics, 1996.
[11] K. Binsted and G. Ritchie. An implemented model of punning riddles. In AAAI Proceedings of the 12th National Conference on Artificial intelligence, volume 1, pages 633–638, 1994.
[12] K. Blom and S. Beckhaus. Emotional storytelling. In IEEE Virtual Reality 2005 Conference, Workshop “Virtuality Structure”, 2005.
[13] A. Brisson, J. Dias, and A. Paiva. From chinese shadows to interactive shadows: building a storytelling application with autonomous shadows. In Proceedings of the Workshop on Agent-Based Systems for Human Learning and Entertainment, 2007.
[14] M. Cavazza, F. Charles, and S. Mead. Intelligent virtual actors that plan... to fail. In Smart Graphics, pages 343–368. Springer Berlin / Heidelberg, 2003.
[15] M. Cavazza, F. Charles, and S. J. Mead. Planning characters’ behaviour in interactive storytelling. Journal of Visualization and Computer Animation, 13:121–131, 2002.
[16] M. Cavazza, F. Charles, and S. J. Mead. Generation of humorous situations in cartoons through plan-based formalisations. In Proceedings of CHI-2003 Workshop: Humor Modeling in the Interface, 2003.
[17] M. Cavazza, J.-L. Lugrin, D. Pizzi, and F. Charles. Madame bovary on the holodeck: immersive interactive storytelling. In Proceedings of the 15th international conference on Multimedia, pages 651–660, 2007.
[18] C. Chaplin. City lights (chaplin collection). Park Circus DVD, 2011.
[19] F. Charles, M. Lozano, S. J. Mead, A. F. Bisquerra, and M. Cavazza. Planning formalisms and authoring in interactive storytelling.
In Proceedings of the 1st International Conference on Technologies for Interactive Digital Storytelling and Entertainment, pages 216–225. Fraunhofer IRB Verlag, 2003.
[20] F. Charles, S. J. Mead, and M. Cavazza. Character-driven story generation in interactive storytelling. In Proceedings 7th International Conference In Virtual Systems And Multimedia, pages 609–615, 2001.
[21] J. Dias and A. Paiva. Agents with emotional intelligence for storytelling. In Affective Computing and Intelligent Interaction, volume 6974 of Lecture Notes in Computer Science, pages 77–86. Springer Berlin / Heidelberg, 2011.
[22] O. Dictionaries. http://oxforddictionaries.com/.
[23] T. Doce, J. Dias, R. Prada, and A. Paiva. Creating individual agents through personality traits. In Proceedings of the 10th international conference on Intelligent virtual agents, pages 257–264, 2010.
[24] A. Dufresne and M. Hudon. Modeling the learner preferences for embodied agents: Experimenting with the control of humor. In Proceedings of the Workshop on Individual and Group Modelling Methods that Help Learners Understand Themselves (International Conference on Intelligent Tutoring Systems), pages 43–51, 2002.
[25] M. Eastman. Enjoyment of Laughter. Transaction Publishers, 2009.
[26] P. Ekman. Emotion in the Human Face. Cambridge University Press, 1982.
[27] P. Ekman. Emotions Revealed: Recognizing Faces and Feelings to Improve Communication and Emotional Life. Holt Paperbacks, 2004.
[28] C. Elliott. Why boys like motorcycles: using emotion theory to find structure in humorous stories. In EBAA ’99 Workshop on Emotion-Based Agent Architectures, at the Autonomous Agents ’99 Conference, May 1999.
[29] C. Elliott, J. Brzezinski, S. Sheth, and R. Salvatoriello. Story-morphing in the affective reasoning paradigm: generating stories semi-automatically for use with “emotionally intelligent” multimedia agents. In Proceedings of the second international conference on Autonomous agents, pages 181–188, 1998.
[30] G.
Fedorento. Gato fedorento: Série meireles. DVD, 2004.
[31] Friends. Friends: The complete tenth season. Warner Home Video DVD, 2005.
[32] S. Fry and H. Laurie. A bit of fry and laurie - the complete collection. 2 Entertain Video DVD, 2006.
[33] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional, 1994.
[34] C. R. Gruner. The Game of Humor: A Comprehensive Theory of Why We Laugh. Transaction Publishers, 1999.
[35] C. F. Hempelmann, V. Raskin, and K. E. Triezenberg. Computer, tell me a joke ... but please make it funny: Computational humor with ontological semantics. In Proceedings of the 19th FLAIRS Conference, pages 746–751, 2006.
[36] R. Herring and D. Mitchell. Writing sketches. The Guardian, 22 September 2008.
[37] I. Kant. Critique of Judgment. Hafner Press, 1951.
[38] A. Koestler. The Act Of Creation. Arkana, 1989.
[39] G. Lessard and M. Levison. Computational modelling of linguistic humour: Tom swifties. In Proceedings of the 1992 ACH/ALLC Joint Annual Conference, pages 175–158, 1992.
[40] G. Lessard and M. Levison. Computational modelling of riddling strategies. In Proceedings of the 1993 ACH/ALLC Joint Annual Conference, pages 120–122, 1993.
[41] S. Louchart and R. Aylett. Evaluating synthetic actors, 2007.
[42] S. Louchart and R. Aylett. From synthetic characters to virtual actors. In Proceedings of the Third Artificial Intelligence and Interactive Digital Entertainment Conference, 2007.
[43] R. Manurung, G. Ritchie, H. Pain, A. Waller, D. O’Mara, and R. Black. The construction of a pun generator for language skills development. Applied Artificial Intelligence, 22(9):841–869, 2008.
[44] R. A. Martin. The Psychology of Humor: An Integrative Approach. Elsevier Academic Press, 2007.
[45] M. Mateas and A. Stern. Procedural authorship: A case-study of the interactive drama façade. In Digital Arts and Culture (DAC), 2005.
[46] J. Morkes, H. K. Kernal, and C. Nass.
Humor in task-oriented computer-mediated communication and human-computer interaction. In CHI 98 conference summary on Human factors in computing systems, pages 215–216, 1998.
[47] A. Nijholt. Embodied agents: A new impetus to humor research. In Proceedings Twente Workshop on Language Technology 20, pages 101–111, 2002.
[48] D. Olsen and M. Mateas. Beep! beep! boom!: towards a planning model of coyote and road runner cartoons. In Proceedings of the 4th International Conference on Foundations of Digital Games, pages 145–152, 2009.
[49] A. Ortony, G. L. Clore, and A. Collins. The Cognitive Structure of Emotions. Cambridge University Press, 1988.
[50] G. Perret. Comedy Writing Step By Step. Writer’s Digest Books, 1982.
[51] D. Pizzi, M. Cavazza, and J.-L. Lugrin. Extending character-based storytelling with awareness and feelings. In Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, pages 12:1–12:3, 2007.
[52] D. Pizzi, M. Cavazza, B. S. Magerko, and M. O. Riedl. Affective storytelling based on characters’ feelings. In Intelligent Narrative Technologies, Papers from the 2007 AAAI Fall Symposium, pages 111–118, 2007.
[53] Plato. Philebus. Penguin Classics, 1983.
[54] D. Premack and G. Woodruff. Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1(04):515–526, 1978.
[55] R. R. Provine. Laughter: A Scientific Investigation. Penguin Books, 2001.
[56] M. Python. Monty python’s flying circus - the complete boxset. Sony Pictures Home Entertainment DVD, 2008.
[57] V. Raskin and S. Attardo. Non-literalness and non-bona-fide in language: An approach to formal and computational treatments of humor. Pragmatics and Cognition, 2:31–69, 1994.
[58] M. O. Riedl and A. Stern. Believable agents and intelligent story adaptation for interactive storytelling. In Proceedings of the 3rd International Conference on Technologies for Interactive Digital Storytelling and Entertainment (TIDSE06), pages 1–12.
Springer, 2006. [59] G. Ritchie. Can computers create humor? AI Magazine, 30:71–81, 2009. [60] S. J. Russel and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 1995. [61] O. Stock. Password swordfish: Verbal humor in the interface. Humor International Journal of Humor Research, 16(3):281–295, 2003. [62] O. Stock and C. Strapparava. Hahacronym: a computational humor system. In Proceedings of the ACL 2005 on Interactive poster and demonstration sessions, pages 113–116, 2005. [63] O. Stock and C. Strapparava. Applied computational humor and prospects for advertising. In Proceedings of the 2006 conference on Rob Milne: A Tribute to a Pioneering AI Scientist, Entrepreneur and Mountaineer, pages 71–89, 2006. [64] A. Stott. Comedy. Routledge, 2005. [65] Thawonmas, Ruck, K. Tanaka, Hassaku, and Hiroki. Extended hierarchical task network planning for interactive comedy. In Intelligent Agents and Multi-Agent Systems, volume 2891, pages 205–213. Springer Berlin / Heidelberg, 2003. [66] R. Thawonmas, H. Hassaku, and K. Tanaka. Mimicry: Another approach for interactive comedy. In Proceedings of the 4th annual European GAME-ON Conference (GAME-ON 2003) on Simulation and AI in Computer Games, pages 47–52, 2003. [67] M. Vala, G. Raimundo, P. Sequeira, P. Cuba, R. Prada, C. Martinho, and A. Paiva. Ion framework – a simulation environment for worlds with virtual agents. In Proceedings of the 9th International Conference on Intelligent Virtual Agents, pages 418–424. Springer-Verlag, 2009. [68] J. Vorhaus. The Comic Toolbox: How To Be Funny Even If You’re Not. Silman-James Press, 1994. [69] A. Waller, R. Black, D. A. O’Mara, H. Pain, G. Ritchie, and R. Manurung. Evaluating the standup pun generating software with children with cerebral palsy. ACM Transactions on Accessible Computing, 1(3):16:1–16:27, 2009. [70] WordNet. http://wordnet.princeton.edu/. 
Appendix A Seller Appraisal Rules

A.1 Selected in prototype

<EmotionalReaction desirability="6" desirabilityForOther="2" praiseworthiness="0">
  <Event action="Greet" />
</EmotionalReaction>
<EmotionalReaction desirability="-1" desirabilityForOther="-4" praiseworthiness="0">
  <Event action="RaiseMoralIssues" />
</EmotionalReaction>
<EmotionalReaction desirability="-3" desirabilityForOther="2" praiseworthiness="0">
  <Event action="Reason" />
</EmotionalReaction>
<EmotionalReaction desirability="3" desirabilityForOther="2" praiseworthiness="0">
  <Event subject="[SELF]" action="WarnHinderAppetite" />
</EmotionalReaction>
<EmotionalReaction desirability="-4" desirabilityForOther="5" praiseworthiness="0">
  <Event action="WarnHinderAppetite" />
</EmotionalReaction>
<EmotionalReaction desirability="-3" desirabilityForOther="-3" praiseworthiness="0">
  <Event action="WarnHealthRisks" />
</EmotionalReaction>
<EmotionalReaction desirability="4" desirabilityForOther="-5" praiseworthiness="0">
  <Event subject="[SELF]" action="WarnHealthRisks" />
</EmotionalReaction>
<EmotionalReaction desirability="6" desirabilityForOther="-7" praiseworthiness="-3">
  <Event subject="[SELF]" action="MakeSarcasticRemark" />
</EmotionalReaction>
<EmotionalReaction desirability="-4" desirabilityForOther="-4" praiseworthiness="-3">
  <Event action="MakeSarcasticRemark" />
</EmotionalReaction>
<EmotionalReaction desirability="3" desirabilityForOther="-8" praiseworthiness="-5">
  <Event subject="[SELF]" action="Insult" />
</EmotionalReaction>
<EmotionalReaction desirability="-7" desirabilityForOther="-1" praiseworthiness="-5">
  <Event action="Insult" />
</EmotionalReaction>
<EmotionalReaction desirability="5" desirabilityForOther="-8" praiseworthiness="-5">
  <Event subject="[SELF]" action="MakeFatPeopleJoke" />
</EmotionalReaction>
<EmotionalReaction desirability="-7" desirabilityForOther="-4" praiseworthiness="-5">
  <Event action="MakeFatPeopleJoke" />
</EmotionalReaction>
<EmotionalReaction desirability="-9" desirabilityForOther="-4" praiseworthiness="0">
  <Event action="RefuseAntiDepressants" />
</EmotionalReaction>

A.2 Not selected in prototype

<EmotionalReaction desirability="-8" desirabilityForOther="-8" praiseworthiness="-8">
  <Event action="Threaten" />
</EmotionalReaction>
<EmotionalReaction desirability="0" desirabilityForOther="4" praiseworthiness="3">
  <Event action="MakeCompliment" />
</EmotionalReaction>
<EmotionalReaction desirability="-7" desirabilityForOther="0" praiseworthiness="-6">
  <Event action="RaiseQuestionCareerChoice" />
</EmotionalReaction>
<EmotionalReaction desirability="0" desirabilityForOther="0" praiseworthiness="-4">
  <Event action="MakeUpExcuses" />
</EmotionalReaction>
<EmotionalReaction desirability="-8" desirabilityForOther="2" praiseworthiness="2">
  <Event action="Apologize" />
</EmotionalReaction>
<EmotionalReaction desirability="4" desirabilityForOther="-4" praiseworthiness="-2">
  <Event subject="Client" action="AdmitEatingProblem" />
</EmotionalReaction>
<EmotionalReaction desirability="-8" desirabilityForOther="-2" praiseworthiness="-8">
  <Event action="Steal" />
</EmotionalReaction>
<EmotionalReaction desirability="4" desirabilityForOther="-8" praiseworthiness="0">
  <Event action="GiveCake" />
</EmotionalReaction>
<EmotionalReaction desirability="-8" desirabilityForOther="9" praiseworthiness="0">
  <Event action="GiveCake" subject="[SELF]" />
</EmotionalReaction>
<EmotionalReaction desirability="2" desirabilityForOther="-2" praiseworthiness="8">
  <Event action="DemandRespect" />
</EmotionalReaction>
<EmotionalReaction desirability="-2" desirabilityForOther="4" praiseworthiness="0">
  <Event action="TryCalmDown" />
</EmotionalReaction>

Appendix B Client Appraisal Rules

B.1 Selected in prototype

<EmotionalReaction desirability="2" desirabilityForOther="1" praiseworthiness="1">
  <Event action="Greet" />
</EmotionalReaction>
<EmotionalReaction desirability="2" desirabilityForOther="3" praiseworthiness="1">
  <Event action="AskCakeOrCandy" />
</EmotionalReaction>
<EmotionalReaction desirability="-2" desirabilityForOther="6" praiseworthiness="4">
  <Event action="Reason" />
</EmotionalReaction>
<EmotionalReaction desirability="2" desirabilityForOther="-4" praiseworthiness="8">
  <Event action="FormallyComplain" />
</EmotionalReaction>
<EmotionalReaction desirability="-9" desirabilityForOther="9" praiseworthiness="-9">
  <Event action="TrySellAntiDepressants" />
</EmotionalReaction>
<EmotionalReaction desirability="-2" desirabilityForOther="-4" praiseworthiness="-6">
  <Event action="Insult" />
</EmotionalReaction>
<EmotionalReaction desirability="-8" desirabilityForOther="-6" praiseworthiness="-6">
  <Event action="MakeFatPeopleJoke" />
</EmotionalReaction>
<EmotionalReaction desirability="-4" desirabilityForOther="-6" praiseworthiness="-4">
  <Event action="MakeSarcasticRemark" />
</EmotionalReaction>
<EmotionalReaction desirability="-3" desirabilityForOther="2" praiseworthiness="3">
  <Event action="WarnHealthRisks" />
</EmotionalReaction>
<EmotionalReaction desirability="-1" desirabilityForOther="-3" praiseworthiness="3">
  <Event action="WarnHinderAppetite" />
</EmotionalReaction>

B.2 Not selected in prototype

<EmotionalReaction desirability="6" desirabilityForOther="2" praiseworthiness="2">
  <Event action="MakeCompliment" />
</EmotionalReaction>
<EmotionalReaction desirability="0" desirabilityForOther="8" praiseworthiness="2">
  <Event subject="[SELF]" action="MakeCompliment" />
</EmotionalReaction>
<EmotionalReaction desirability="4" desirabilityForOther="-2" praiseworthiness="8">
  <Event action="DemandRespect" />
</EmotionalReaction>
<EmotionalReaction desirability="-3" desirabilityForOther="-2" praiseworthiness="4">
  <Event action="RaiseMoralIssues" />
</EmotionalReaction>
<EmotionalReaction desirability="4" desirabilityForOther="6" praiseworthiness="4">
  <Event action="TryCalmDown" />
</EmotionalReaction>
<EmotionalReaction desirability="-2" desirabilityForOther="-4" praiseworthiness="-6"> <Event
action="Threaten" /> </EmotionalReaction>
<EmotionalReaction desirability="-8" desirabilityForOther="5" praiseworthiness="9">
  <Event action="AdmitEatingProblem" />
</EmotionalReaction>
<EmotionalReaction desirability="0" desirabilityForOther="-2" praiseworthiness="0">
  <Event action="RaiseQuestionCareerChoice" />
</EmotionalReaction>
<EmotionalReaction desirability="-2" desirabilityForOther="-2" praiseworthiness="-4">
  <Event action="MakeUpExcuses" />
</EmotionalReaction>
<EmotionalReaction desirability="3" desirabilityForOther="6" praiseworthiness="2">
  <Event action="Apologize" />
</EmotionalReaction>
<EmotionalReaction desirability="-8" desirabilityForOther="-8" praiseworthiness="-6">
  <Event action="Steal" />
</EmotionalReaction>
<EmotionalReaction desirability="-7" desirabilityForOther="9" praiseworthiness="6">
  <Event action="GiveCake" />
</EmotionalReaction>

Appendix C Script (Without Gloating)

Table C.1
Character | Action | Speech Act | Sketch Moment
Seller | Greet | Howdie! | Beginning
Client | Greet | Hi there! | Beginning
Seller | OfferAssistance | May I help you? | Beginning
Client | AskForCake | I’d like a chocolate cake. | Beginning
Seller | WarnHinderAppetite | You will ruin your appetite. | Middle
Client | AskForCake | Can I have a chocolate cake? | Middle
Seller | RaiseMoralIssues | I can’t sell you cake, that’s against my moral code. | Middle
Client | Reason | Can’t we reach a common ground? An agreement? | Middle
Seller | RaiseMoralIssues | My religion opposes, you know, giving you cake. | Middle
Client | AskForCake | Can you please just give me a cake. | Middle
Seller | RaiseMoralIssues | Morally I’d have to say no. Otherwise... still a no. | Middle
Client | AskForCake | Chocolate. Cake. Can I get that. | Middle
Seller | RaiseMoralIssues | Selling cake would make for bad karma. | Middle
Client | AskForCake | I, myself, would fancy a cake. | Middle
Seller | Reason | Can’t we reach a common ground? An agreement? | Middle
Client | AskForCake | I’d really like some cake. | Middle
Seller | WarnHinderAppetite | You will ruin your appetite. | Middle
Client | FormallyComplain | This is inadmissible. I want to write down a complaint. | Middle
Seller | MakeSarcasticComment | Sure, your grandma thinks you’re thin too. | Middle

Expression column for the 19 rows above, in extracted order (row alignment not recoverable from the source): SmileSlightly, ShowSlightDistress, SmileSlightly, ShowSlightDistress, ShowSlightDistress, SmileSlightly, SmileSlightly, ShowSlightDistress, SmileSlightly, ShowSlightDistress, SmileOpenly, SmileSlightly, SmileOpenly, SmileSlightly, SmileOpenly, SmileSlightly, SmileOpenly, SmileSlightly, SmileOpenly

Table C.1 (continued)
Character | Action | Speech Act | Expression | Sketch Moment
Client | FormallyComplain | This is inadmissible. I want to write down a complaint. | ShowSlightAnger | Middle
Seller | MakeFatPeopleJoke | There’s only room here for one of us. Or ten of me. | SmileSlightly | Middle
Client | FormallyComplain | This is inadmissible. I want to write down a complaint. | ShowGreatAnger | Middle
Seller | TrySellAntiDepressants | You look a tad distressed. May I interest you in these amazing antidepressants? | SmileSlightly | Punchline
Client | RejectAntiDepressants | I don’t want those antidepressants, I want a chocolate cake! | ShowGreatAnger | Punchline
Seller | ComplainAboutCrisis | People buy less and less these days. Surely it must be the crisis. | ShowGreatDistress | Punchline

Appendix D Script (With Gloating)

Table D.1
Character | Action | Speech Act | Sketch Moment
Seller | Greet | Howdie! | Beginning
Client | Greet | Hi there! | Beginning
Seller | OfferAssistance | May I help you? | Beginning
Client | AskForCake | I’d like a chocolate cake. | Beginning
Seller | Reason | Can’t we reach a common ground? An agreement? | Middle
Client | AskForCake | Can I have a chocolate cake? | Middle
Seller | RaiseMoralIssues | I can’t sell you cake, that’s against my moral code. | Middle
Client | Reason | Can’t we reach a common ground? An agreement? | Middle
Seller | RaiseMoralIssues | My religion opposes, you know, giving you cake. | Middle
Client | AskForCake | Can you please just give me a cake. | Middle
Seller | RaiseMoralIssues | Morally I’d have to say no. Otherwise... still a no. | Middle
Client | AskForCake | Chocolate. Cake. Can I get that. | Middle
Seller | RaiseMoralIssues | Selling cake would make for bad karma. | Middle
Client | AskForCake | I, myself, would fancy a cake. | Middle
Seller | RaiseMoralIssues | Sell you the cake? I couldn’t. | Middle
Client | AskForCake | I’d really like some cake. | Middle
Seller | WarnHealthIssues | If you keep eating like that, you’re gonna get a heart attack. | Middle
Client | FormallyComplain | I’d like your complaint book. | Middle
Seller | WarnHealthIssues | Each cake you eat takes 10 years off your life. | Middle

Expression column for the 19 rows above, in extracted order (row alignment not recoverable from the source): ShowSlightDistress, SmileSlightly, ShowSlightDistress, SmileSlightly, ShowSlightDistress, SmileSlightly, SmileSlightly, ShowSlightDistress, SmileSlightly, ShowSlightDistress, SmileOpenly, SmileSlightly, SmileSlightly, SmileOpenly, SmileOpenly, SmileSlightly, SmileOpenly, SmileSlightly, SmileOpenly

Table D.1 (continued)
Character | Action | Speech Act | Expression | Sketch Moment
Client | FormallyComplain | This is inadmissible. I want to write down a complaint. | ShowSlightAnger | Middle
Seller | MakeSarcasticComment | Sure, your grandma thinks you’re thin too. | SmileOpenly | Middle
Client | FormallyComplain | Give me your complaint book. | ShowSlightAnger | Middle
Seller | MakeSarcasticComment | You really look low on sugar. | SmileOpenly | Middle
Client | FormallyComplain | I have a complaint! Let me write it down. | ShowSlightAnger | Middle
Seller | Insult | You are so dumb! | SmileOpenly | Middle
Client | FormallyComplain | I’d like your complaint book. | ShowSlightAnger | Middle
Seller | MakeFatPeopleJoke | There’s only room in here for one of us... Or ten of me. | SmileOpenly | Middle
Client | FormallyComplain | This is inadmissible. I want to write down a complaint. | ShowGreatAnger | Middle
Seller | TrySellAntiDepressants | You look a tad distressed. May I interest you in these amazing antidepressants? | SmileSlightly | Punchline
Client | RejectAntiDepressants | I don’t want those antidepressants, I want a chocolate cake! | ShowGreatAnger | Punchline
Seller | ComplainAboutCrisis | People buy less and less these days. Surely it must be the crisis. | ShowGreatDistress | Punchline

Appendix E Questionnaire

Thank you for taking the time to answer this inquiry. The goal of this questionnaire is to evaluate a prototype made in the context of a master's thesis at IST - Instituto Superior Técnico. The estimated completion time is 5-7 minutes.

The questions in this form are related to a video that tries to portray a sketch or a short scene. The sketch happens in a pastry shop and involves two characters: a seller and a client, represented by the images below. First watch the video, then proceed to answer the questions.

A - The following questions regard the seller.

For each question choose the answer which you think applies the most. The following questions refer to the evolution of the seller's feelings throughout the sketch.

1. How did the seller feel in the beginning of the sketch? *
   Happiness, Sadness, Worry, Surprise, Anger, Disappointment, None of the above
2. What were the seller's feelings in the middle of the sketch? *
   He got happy/happier, He got sad/sadder, He got (more) worried, He got (more) surprised, He got angry/angrier, He got (more) disappointed, None of the above
3. How did the seller feel in the end of the sketch? *
   Happiness, Sadness, Worry, Surprise, Anger, Disappointment, None of the above

For the following questions give an answer from 1 to 5, in which 1 means you disagree completely and 5 that you agree completely, according to the following scale.
1 = Disagree completely, 2 = Disagree somehow, 3 = Neither agree nor disagree, 4 = Agree somehow, 5 = Agree completely

4. The seller behaved as expected, given the situation. *
   (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
5. The seller was coherent throughout the sketch. *
   (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
6. The seller became more aggressive as the sketch went on. *
   (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely

B - The following questions regard the client.

For each question choose the answer which you think applies the most. The following questions refer to the evolution of the client's feelings throughout the sketch.

7. How did the client feel in the beginning of the sketch? *
   Happiness, Sadness, Worry, Surprise, Anger, Disappointment, None of the above
8. How did the client's feelings evolve throughout the sketch? *
   He got happy/happier, He got sad/sadder, He got (more) worried, He got (more) surprised, He got angry/angrier, He got (more) disappointed, None of the above
9. How did the client feel in the end of the sketch? *
   Happiness, Sadness, Worry, Surprise, Anger, Disappointment, None of the above

For the following questions give an answer from 1 to 5, in which 1 means you disagree completely and 5 that you agree completely, according to the following scale.
1 = Disagree completely, 2 = Disagree somehow, 3 = Neither agree nor disagree, 4 = Agree somehow, 5 = Agree completely

10. The client behaved as expected, given the situation. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
11. The client was coherent throughout the sketch. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
12. The client became more aggressive as the sketch went on. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely

C - The following questions regard the scene as a whole.

For the following questions give an answer from 1 to 5, in which 1 means you disagree completely and 5 that you agree completely, according to the following scale.
1 = Disagree completely, 2 = Disagree somehow, 3 = Neither agree nor disagree, 4 = Agree somehow, 5 = Agree completely

13. The sketch was too long. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
14. The sketch was humorous. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
15. The sketch had a good ending. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
16. The ending should be better explained. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
17. The characters felt the same way in the beginning as they did in the end of the sketch. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely
18. The sketch was funny. *
    (1) Disagree completely  (2)  (3) Neither agree nor disagree  (4)  (5) Agree completely

D - Generic Information

19. Age: *
    < 18, 19-25, 26-35, 36-45, 46-55, > 55
20. Gender: *
    Male, Female

* = Input is required
This form was created at www.formdesk.com