Impact of various types of smiles on perceived impression of virtual agents

Daniel Moscoviter
University of Twente
P.O. Box 217, 7500AE Enschede
The Netherlands
[email protected]

ABSTRACT
Smiling has a strong influence on our everyday social interaction. This article studies the impact of smiling when applied to a virtual agent that serves as a health advisor. Using experimental evidence from the literature, three different types of smiles are modeled on a virtual agent and compared in a fictional scenario. These scenarios were rated by test subjects in a practical experiment, and the results of this experiment have been analyzed. The results turned out to be inconclusive, owing to multiple factors, including an unrealistic virtual agent and a low number of test subjects.

Keywords
virtual agent, facial expression, smile, politeness, experiment, questionnaire

1. INTRODUCTION
Smiling is one of the most recognizable facial expressions, involving the flexing of muscles near the mouth and eyes. Research has shown that people can, both consciously and unconsciously, distinguish different types of smiles, even when applied to virtual agents [12]. These smiles often carry their own meaning, such as showing amusement or anxiety. The different types of smiles each have their own morphological and dynamic characteristics as well [10]. Morphological characteristics refer to which facial muscles are contracted or relaxed, while dynamic characteristics refer to the duration of a smile and the velocity at which the muscles are activated.

Virtual agents with emotional attributes are created with different goals in mind. While it is known that people can distinguish different types of smiles of virtual agents, and that inappropriate facial expressions can negatively influence interactions [9], it is not yet known whether these smiles have different effects on perceived impression when applied to virtual agents in specific scenarios. Studies on different types of smiles have been done by Ochs et al. [10], who list three types of smiles and their characteristics, based on a study in which participants created an amusement, a politeness, and an embarrassment smile through a web interface.

This study measures the impact of different types of smiles in a specific scenario by means of a practical experiment. Several research questions can be defined to aid in this study. The main research question is: What impact do different types of smiles have on the perceived impression of virtual agents? To help answer this question, some sub-questions are answered first:

• What are the most important types of smiles, and what are their characteristics?
• How can these characteristics be modeled on a virtual agent?
• What is a preferable scenario to test the impact of these different smiles using a practical experiment?
• What are some of the qualities that a virtual agent has to possess in this environment?
• What conclusions can be drawn from the results of the practical experiment?

Ochs et al.'s paper and its references are the main resources for answering the first sub-question. The types of smiles they defined are implemented using modeling tools provided by the University of Twente. The exact scenario in which the impact of the different types of smiles is tested is specified as follows. It involves a virtual agent trying to tell the test subject some good or bad news; the virtual agent smiles during the conversation. This scenario involves a total of three different types of messages: a positive one, a negative one, and a neutral one. An example of such a scenario is a virtual agent delivering health communication and even health behavior change interventions [3]. This virtual agent tries to deliver good news by complimenting the user on his physical fitness, and bad news by stimulating the user to increase his physical activity.

The way the virtual agent delivers the messages to the user requires special attention: Brown et al. [5] have shown that politeness has a strong impact on the motivational state of learners, and Johnson et al. [7] have described a way to apply their politeness theory to virtual agents. Which type of politeness to implement is a trade-off between the desire to communicate, the desire to be efficient, and the desire to maintain the user's face.

2. SMILES AND POLITENESS
2.1 Smiles
Ochs et al. have created a web application, E-smiles-creator, to allow users to create different smiles on a virtual agent [10]. Test subjects were tasked with creating three different types of smiles: one amused, one embarrassed and one polite smile. Users could manipulate the virtual agent's face by specifying characteristics using multiple-choice radio buttons, while a graphical virtual agent was updated with these choices simultaneously. These characteristics included whether the mouth should be big or small, open or closed, symmetrical or asymmetrical, whether the lips should be tense, and whether the cheeks should be raised or not. Additionally, users were asked to rate the smile they created on a Likert scale. A total of 348 users participated in their study.

An analysis of the error rates showed that the amused smile type was better classified than the embarrassed and polite smile types. The morphological characteristics of the smiles that were chosen most often are presented in Table 1. According to this information, a big, open mouth and raised cheeks are characteristics unique to the amused smile, while asymmetry and tense lips are unique to the embarrassed smile. We will use these characteristics in the experiment.

Table 1. Most common characteristics of different smile types.
                 Amused   Embarrassed   Polite
Big mouth        yes      no            no
Open mouth       yes      no            no
Asymmetry        no       yes           no
Tense lips       no       yes           no
Raised cheeks    yes      no            no

Smile characteristics can be implemented in FACS (Facial Action Coding System [6]), a system to categorize facial expressions. FACS can generate nearly every expression that the human anatomy can produce. It specifies a total of 32 AUs (Action Units) that each define the contraction or relaxation of a small number of facial muscles. For example, the raised cheeks in Table 1 can be represented with FACS' AU6.
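To make the link between Table 1 and FACS concrete, the sketch below encodes the three smile types as data that could drive an AU-based face model. Only AU6 (raised cheeks) is named in the text; the other Action Units (AU12 lip corner puller, AU23 lip tightener, AU25 lips part) are standard FACS units chosen here purely for illustration, and the intensity values are placeholders rather than the configuration used in the experiment.

```python
# Encoding of Table 1 plus an illustrative FACS mapping. Only AU6 (cheek
# raiser) is given in the text; AU12 (lip corner puller), AU23 (lip
# tightener) and AU25 (lips part) are assumptions, as are the intensities.
SMILE_TYPES = {
    "amused": {
        "big_mouth": True, "open_mouth": True, "asymmetry": False,
        "tense_lips": False, "raised_cheeks": True,
        "action_units": {6: 1.0, 12: 1.0, 25: 0.6},
    },
    "embarrassed": {
        "big_mouth": False, "open_mouth": False, "asymmetry": True,
        "tense_lips": True, "raised_cheeks": False,
        "action_units": {12: 0.4, 23: 0.7},  # weak smile applied on one side only
    },
    "polite": {
        "big_mouth": False, "open_mouth": False, "asymmetry": False,
        "tense_lips": False, "raised_cheeks": False,
        "action_units": {12: 0.5},  # small, closed, symmetric smile
    },
}

def active_aus(smile_type):
    """Return the sorted Action Unit numbers activated for a smile type."""
    return sorted(SMILE_TYPES[smile_type]["action_units"])

for name in SMILE_TYPES:
    print(name, active_aus(name))
```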
2.2 Politeness
Brown et al. have formulated politeness theory [5]. They defined two kinds of face: positive and negative. Positive face is the want to be liked, respected and approved of by others (self-esteem). Negative face is the want to not be imposed on by others (freedom to act). In social interaction, it is important to maintain both faces of all the participants. Sometimes, FTAs (face-threatening acts) cannot be avoided in communication. In these cases it is important for the speaker to choose the right strategy to deliver his message. Brown et al. defined four main politeness strategy types:

Bald on-record Does not try to minimize the threats to the hearer's face. Applying this strategy will probably embarrass the other person or make them feel uncomfortable. This strategy is commonly used among people who know each other well, such as family and friends.

Positive politeness Minimizes the face-threatening act by expressing friendliness and interest. This strategy is intended to make the hearer feel good about himself. It is usually used in situations where people know each other fairly well.

Negative politeness Recognizes the hearer's want to be respected, but assumes that the speaker is imposing on him. This strategy is used when there is awkwardness and social distance in a situation.

Off-record An indirect strategy intended to take pressure off of the speaker. It tries not to impose directly.

Which politeness strategy to choose depends on the want to communicate the face-threatening act, the want to be efficient, and the want to maintain the face of the hearer.

3. EXPERIMENT
The effects of different types of smiles have been tested in a user experiment. The types defined by Ochs et al. have been implemented in a modeling framework using the characteristics in Table 1, and combined with a fictional scenario. Videos that show this scenario for each of the three selected smile types (amused, embarrassed and polite) were created and presented to test subjects. The test subjects were asked to rate the videos based on a composed questionnaire.

3.1 Scenario
To create a believable and apt situation to present to the test subjects, a dialogue was constructed. This dialogue is based on the scenario of a health advisor telling a test subject about their current health status and ways to improve their healthiness. The dialogue is completely fictitious for two reasons:

• We want the experience to be the same for all test subjects, to be able to draw more accurate conclusions.
• Due to time constraints, both on the side of the researcher and of the test subjects, actually providing individual health status reports is not viable.

The dialogue was created such that the test subjects were presented with three different types of messages:

Positive Stating that the test subject has been in good shape in previous health inspections.

Neutral Stating that regular exercise is important to stay in good shape.

Negative Stating that the test subject's cholesterol level has risen recently and advising him to change his eating habits.

After a brief introduction by the virtual agent, the positive message is presented using positive politeness, by complimenting the test subject on his condition. The negative message, however, starts with negative politeness, by stating that the latest results are unfortunate. It then switches to a bald on-record strategy, telling the test subject that he should change his eating habits. The dialogue ends in positive politeness, with the virtual agent taking an exaggerated interest in the test subject; he is told that she looks forward to following his progress in future encounters.
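For clarity, the dialogue structure described above can be written down as a small script that pairs each message with the politeness strategy used to deliver it, while the smile type is a per-video parameter (each video shows one smile type throughout). The sketch below only illustrates that structure: the wording is placeholder text rather than the actual Dutch dialogue, and the strategy for the neutral message is not stated in the paper and is marked as such.

```python
# Illustrative representation of the fictional health-advisor dialogue.
# Message texts are placeholders; the neutral message's politeness
# strategy is not specified in the paper and is left as "unspecified".
DIALOGUE = [
    ("introduction", "positive politeness",
     "Hello, I am your virtual health advisor."),
    ("positive", "positive politeness",
     "Your previous health inspections show that you are in good shape."),
    ("neutral", "unspecified",
     "Regular exercise is important to stay in good shape."),
    ("negative", "negative politeness, then bald on-record",
     "Unfortunately, your cholesterol level has risen recently; "
     "you should change your eating habits."),
    ("closing", "positive politeness",
     "I look forward to following your progress in future encounters."),
]

def video_script(smile_type):
    """Annotate the shared dialogue with the smile condition of one video."""
    return [f"[{smile_type} smile] ({strategy}) {text}"
            for _, strategy, text in DIALOGUE]

for line in video_script("amused"):
    print(line)
```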
3.2 Questionnaire
A questionnaire was composed to evaluate the qualities of the different types of smiles. Rather than asking about the smiles directly, users were asked to rate various qualities that a virtual agent should possess. A large part of these questions was taken almost verbatim from Bailenson et al. [2], but translated to Dutch to make them easier to understand for the test subjects. The social presence and likeability parts of their questionnaire served as the main contributions to the selected questions. Their questions were designed to assess perception of co-presence and likeability and have been used successfully in their previous work [1].

The sensitivity of the data involved leads to a couple of major qualities that a virtual agent in this situation should possess. The user should feel at ease, so the likeability of the agent should be high. At the same time, the user should take the virtual agent and her advice seriously, so her ability to provide high social presence, sincerity, and realism are aspects to take into account [4]. The likeability quality is important for a virtual health advisor because the user should feel at ease when discussing personal information. This makes social presence a relevant scale for this experiment as well. The status and interest scales in their questionnaire are of much lesser importance in this scenario, so they were not included.

In addition to likeability and social presence, there are some other qualities the virtual agent should possess to be taken seriously. The following questions were therefore made specifically to test the scenarios in this experiment, bringing the total number of questions to 12:

• Q1: How well do you understand the conversation?
• Q2: To what degree do you feel addressed personally?
• Q3: Does the conversation flow naturally?
• Q4: Does the conversation feel sincere?
• Q5: Does the person make an effort to appear sincere?

Rather than Bailenson et al.'s 7-point scale ranging from -3 to 3, with 0 being the neutral option, a 5-point scale with 3 as the neutral option was applied, for consistency with the scale used for the other questions in this questionnaire. The majority of the questions was formulated in such a way that lower ratings implied a negative assessment. The ratings of the few questions that did not follow this format were inverted before analysis, as sketched below. The test subjects were also asked to enter their age for statistical purposes. No other personal information was requested or processed.
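As a concrete illustration of the scoring just described, the sketch below inverts reverse-phrased items on the 1-5 scale (a rating r becomes 6 - r) and averages items into scale scores. The item names and the choice of which items are reverse-phrased are hypothetical; the paper does not list them.

```python
# Minimal preprocessing sketch for the 5-point questionnaire: reverse-phrased
# items are inverted (1 <-> 5, 2 <-> 4) and items are averaged into scale
# scores. Item names and the set of reverse-coded items are made up.
REVERSED_ITEMS = {"likeability_3"}  # assumed reverse-phrased item
SCALES = {
    "social_presence": ["presence_1", "presence_2", "presence_3"],
    "likeability": ["likeability_1", "likeability_2", "likeability_3"],
}

def recode(item, rating):
    """Invert a 1-5 rating for reverse-phrased items, keep others as-is."""
    return 6 - rating if item in REVERSED_ITEMS else rating

def scale_scores(response):
    """Average the (recoded) item ratings of one respondent per scale."""
    return {
        scale: sum(recode(item, response[item]) for item in items) / len(items)
        for scale, items in SCALES.items()
    }

example = {"presence_1": 2, "presence_2": 1, "presence_3": 2,
           "likeability_1": 3, "likeability_2": 2, "likeability_3": 4}
print(scale_scores(example))
```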
3.3 Set-up
The scenario has been implemented using Elckerlyc, a behavior realizer for virtual humans developed by the Human Media Interaction group of the University of Twente [13]. Elckerlyc's HMI FaceEditor provided precise control over every facial AU defined by FACS via its FACS to MPEG-4 converter. Ochs et al.'s findings were used to accurately model the different types of smiles, which were subsequently saved as a FAP (Facial Animation Parameters [11]) file. The dialogue and smiles were implemented in the BML (Behavior Markup Language [8]) format by directly converting the FAP file to face elements with au attributes, which directly manipulate AUs on the virtual agent. The result was then opened in Elckerlyc's EnvironmentDemo. For this experiment, only the morphological characteristics were considered.

The sound of the dialogue was created by the cmu-slt-hsmm text-to-speech engine. By recording the playback of this BML file and converting it to an H.264/MPEG-4 Part 10 video file with AAC audio, a media file was created that can be played on most systems, unlike the more compact BML script format. Example frames of the resulting videos for the amused and embarrassed smiles can be seen in Figures 1 and 2.

Figure 1. Picture of the agent smiling while talking in the video of the amused smile.
Figure 2. Picture of the agent smiling while talking in the video of the embarrassed smile.

These videos were placed on a web page, accompanied by a short explanation and the questionnaire. The web page would randomly show visitors a video of one type of smile, ensuring a roughly equal distribution of test subjects per video type.

3.4 Procedure
The test subjects were asked to watch one of the three available videos on a computer of their choosing, after which they were asked to fill in the aforementioned questionnaire to assess their perception of the virtual agent. The test subjects had the option to watch their designated video multiple times, although very few subjects made use of this opportunity. After submitting their answers to the questionnaire, the subjects were thanked for their cooperation and assured that their results would be processed anonymously.

Many test subjects left comments about their experience with the virtual agent after the experiment finished, despite this not being part of the intended procedure. Due to their unscientific nature these comments will not be evaluated in the results, but they will be briefly mentioned in the discussion.

3.5 Subjects
A total of 22 people were found willing to lend their time to participate in the experiment. This group was selected to be as homogeneous as possible. It consisted solely of males aged between 17 and 30 years, with a mean of 21.5 and a standard deviation of 2.76 years. It was not possible to distribute the three different videos equally among the test subjects. The first video, which contained the amused smile, was tested by 8 test subjects; one more than the 7 test subjects that were assigned to each of the other two videos. None of the test subjects were aware of the specifics of the experiment, and while many expressed interest in knowing what exactly the experiment was testing, details on this were only supplied after they finished their participation. Accidental multiple submissions by one test subject were filtered out of the results prior to analysis.

4. RESULTS
The answers submitted by the test subjects have been evaluated on various points: first by testing the reliability of the scales that were used, and then by comparing the results of the different videos.

4.1 Reliability
The social presence and likeability parts of Bailenson et al.'s questionnaire were analyzed for reliability. The social presence scale resulted in Cronbach's α = 0.49. Removing the question about perceiving the other person in a virtual room improved this to α = 0.66, so this subset of questions will be referred to as social presence′ from here on, and the removed question as Q8.
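As a reference for the reliability analysis above, Cronbach's α for a scale can be computed from the item ratings with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the summed scores). The sketch below uses made-up ratings, not the collected data.

```python
from statistics import pvariance

def cronbach_alpha(item_ratings):
    """Cronbach's alpha for a scale.

    item_ratings[i][j] is the rating of respondent j on item i.
    Uses alpha = k/(k-1) * (1 - sum(item variances) / variance of sums).
    """
    k = len(item_ratings)
    item_vars = [pvariance(item) for item in item_ratings]
    totals = [sum(ratings) for ratings in zip(*item_ratings)]  # per-respondent sums
    return (k / (k - 1)) * (1 - sum(item_vars) / pvariance(totals))

# Made-up ratings for a 3-item scale and 5 respondents (not the study's data):
items = [
    [2, 1, 3, 2, 2],
    [2, 2, 3, 1, 2],
    [3, 1, 3, 2, 1],
]
print(round(cronbach_alpha(items), 2))
```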
The likeability scale resulted in Cronbach's α = 0.62. The remaining questions are Q1 to Q5, equivalent to the numbering in section 3.2.

4.2 Comparison
An ANOVA at the 95% confidence level has been performed to compare the question groups. The results are presented in Table 2. No significant differences have been found.

Table 2. Means and standard deviations of results.
                  Amused smile    Embarrassed smile    Polite smile
                  M      SD       M      SD            M      SD
Social presence′  1.41   0.38     1.25   0.43          1.64   0.78
Likeability       1.94   0.78     2.07   0.93          2.14   0.63
Q1                4.00   0.93     3.57   0.79          4.14   1.07
Q2                1.75   0.71     2.00   1.16          2.29   1.11
Q3                1.88   1.13     1.14   0.38          2.14   0.69
Q4                2.50   1.20     1.57   0.79          2.43   0.98
Q5                2.13   0.84     2.43   1.62          1.86   0.69
Q8                2.38   1.51     2.00   1.53          1.57   1.13

We can see that for some questions the means deviate by up to a full point, but all of them fall within each other's standard deviations. As an example of this, the results for the social presence′ scale have been graphed in Figure 3. The question closest to a statistically significant difference is Q3, with F(2, 19) = 2.85, p < 0.08; it is graphed in Figure 4. Even here the embarrassed and polite smiles are still tied, however. No clear correlation between the answers and the test subjects' age could be found.

Figure 3. Graph of the results of social presence′.
Figure 4. Graph of the results of Q3.
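The group comparison in section 4.2 can be reproduced with a one-way ANOVA per question or scale across the three smile conditions, for example with SciPy's f_oneway. The ratings below are made up to match the group sizes (8, 7 and 7) and are not the collected data.

```python
# One-way ANOVA across the three smile conditions for a single scale,
# mirroring the comparison in section 4.2. The ratings are illustrative.
from scipy.stats import f_oneway

amused      = [1.4, 1.8, 1.2, 1.6, 1.4, 1.0, 1.6, 1.3]  # 8 subjects
embarrassed = [1.2, 1.0, 1.8, 1.4, 1.1, 1.3, 1.0]       # 7 subjects
polite      = [1.8, 1.4, 2.2, 1.6, 1.2, 1.8, 1.5]       # 7 subjects

f_stat, p_value = f_oneway(amused, embarrassed, polite)
print(f"F(2, 19) = {f_stat:.2f}, p = {p_value:.3f}")
# A difference is considered significant here when p < 0.05 (95% confidence level).
```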
5. DISCUSSION
No clear conclusions about the smiles can be drawn from the results. The lack of variation between the ratings of the different types of smiles is a clear problem, as is the associated high standard deviation. It is likely that the low variation between the ratings of the different types of smiles is directly related to the overwhelmingly negative assessment of the virtual agent. The main criticisms voiced by the test subjects, both in the answers to the questionnaire and in comments made after completion of the experiment, can be attributed to the virtual agent's lack of liveliness, its unnatural movements, and its robotic voice.

Despite the robotic voice, test subjects generally had no problem following the conversation. Just one person rated the intelligibility as below average. The test subjects unambiguously indicated that the conversation was highly unnatural and insincere, though some noticed that the virtual agent at least tried to be sincere. Most test subjects never had the illusion that they were talking to a real person, or that the virtual agent was aware of their presence. The likeability scale ratings were probably dragged down by most test subjects finding the virtual agent highly unattractive.

Q8 was formulated as: "I perceive that I am in the presence of another person in the virtual room with me." It was excluded from the other questions about social presence because of issues with its reliability. Its high standard deviation can be explained by the ambiguity of the question: it asks about a virtual room, which is not a clear concept when looking straight at a computer screen. This question would probably be more relevant in a situation involving virtual reality.

The overall high standard deviation is partially due to the relatively low number of test subjects. Increasing this number would probably result in at least a few statistically significant distinctions. In the future, this experiment could be repeated with a more realistic virtual agent and more test subjects for more accurate results. Additionally, interaction between the virtual agent and the user could be implemented to increase the realism of the scenario. This could be done by allowing the user to answer the virtual agent's questions and taking the conversation in a particular direction based on these answers. A variant of this approach that is easier to implement would be to offer the user a select number of answers in the form of clickable buttons.

6. CONCLUSION
Several types of smiles can be distinguished. The most important ones are the amused, embarrassed and polite smiles. The amused smile is characterized by a big, open mouth with raised cheeks. The embarrassed smile's main characteristics are an asymmetrical smile and tense lips. The polite smile typically has a small, closed mouth.

The characteristics of these different types of smiles can be modeled on a virtual agent. In this article, this is done in the Elckerlyc software suite by using the FAP and BML file formats to define the facial expressions, and by supplying an appropriate dialogue.

The different types of smiles can be evaluated in the scenario of a health advisor discussing a patient's healthiness. Since these conversations involve the exchange of sensitive information, politeness as described by Brown et al. [5] is an important factor. Because the data involved is sensitive, the virtual agent should possess several qualities. Likeability is important for the user to feel at ease, but the virtual agent should be taken seriously as well, making social presence another important factor.

The results of the practical experiment did not show a statistically significant user preference for one type of smile over another. The main reasons for this are an overall lack of realism, which caused an unequivocally negative assessment of the virtual agent, and a low number of test subjects.

7. ACKNOWLEDGMENTS
The author would like to thank his supervisors, Betsy van Dijk and Dennis Reidsma, and his fellow students in the Intelligent Interaction track of the Bachelorreferaat course for their valuable feedback. Thanks also go out to all the test subjects who participated in the experiment.

8. REFERENCES
[1] J. Bailenson, J. Blascovich, A. Beall, and J. Loomis. Interpersonal distance in immersive virtual environments. Personality and Social Psychology Bulletin, 29(7):819, 2003.
[2] J. Bailenson, R. Guadagno, E. Aharoni, A. Dimov, A. Beall, and J. Blascovich. Comparing behavioral and self-report measures of embodied agents: Social presence in immersive virtual environments. In Proceedings of the 7th Annual International Workshop on PRESENCE, 2004.
[3] T. Bickmore, L. Caruso, and K. Clough-Gorr. Acceptance and usability of a relational agent interface by urban older adults. In CHI '05 Extended Abstracts on Human Factors in Computing Systems, pages 1212–1215. ACM, 2005.
[4] T. Bickmore and J. Cassell. Relational agents: A model and implementation of building user trust. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 396–403. ACM, 2001.
[5] P. Brown and S. Levinson. Politeness: Some Universals in Language Usage. Cambridge University Press, 1987.
[6] P. Ekman, W. Friesen, and J. Hager. Facial Action Coding System, volume 160. Consulting Psychologists Press, Palo Alto, CA, 1978.
[7] L. Johnson, P. Rizzo, W. Bosma, M. Ghijsen, and H. van Welbergen. Generating socially appropriate tutorial dialog. In ISCA Workshop on Affective Dialogue Systems, volume 3068 of LNAI, pages 254–264, Berlin Heidelberg New York, 2004. Springer Verlag.
[8] S. Kopp, B. Krenn, S. Marsella, A. Marshall, C. Pelachaud, H. Pirker, K. Thórisson, and H. Vilhjálmsson. Towards a common framework for multimodal generation: The Behavior Markup Language. In Intelligent Virtual Agents, volume 4133 of LNCS, pages 205–217. Springer, 2006.
[9] R. Niewiadomski and C. Pelachaud. Model of facial expressions management for an embodied conversational agent. In Proceedings of the 2nd International Conference on Affective Computing and Intelligent Interaction, pages 12–23. Springer-Verlag, 2007.
[10] M. Ochs, R. Niewiadomski, and C. Pelachaud. How a virtual agent should smile? In Intelligent Virtual Agents: 10th International Conference, IVA 2010, Philadelphia, PA, USA, Proceedings, page 427. Springer, 2010.
[11] I. Pandzic and R. Forchheimer. MPEG-4 Facial Animation: The Standard, Implementation and Applications. Wiley, 2002.
[12] M. Rehm and E. André. Catch me if you can: Exploring lying agents in social settings. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 937–944. ACM, 2005.
[13] H. van Welbergen, D. Reidsma, Z. Ruttkay, and J. Zwiers. Elckerlyc - A BML realizer for continuous, multimodal interaction with a virtual human. Journal on Multimodal User Interfaces, 3(4):271–284, 2010.