PAPERS

Emoacoustics: A Study of the Psychoacoustical and Psychological Dimensions of Emotional Sound Design

ERKIN ASUTAY,1 DANIEL VÄSTFJÄLL,1,2 ANA TAJADURA-JIMÉNEZ,3 ANDERS GENELL,4 PENNY BERGMAN,1 AND MENDEL KLEINER,1 AES Member

1 Division of Applied Acoustics, Chalmers University of Technology, Gothenburg, Sweden
2 Department of Behavioral Science and Learning, Linköping University, Linköping, Sweden
3 Department of Psychology, Royal Holloway University of London, Egham, Surrey, UK
4 Environment and Traffic Analysis, VTI, Gothenburg, Sweden

J. Audio Eng. Soc., Vol. 60, No. 1/2, 2012 January/February

Even though traditional psychoacoustics has provided indispensable knowledge about auditory perception, its narrow focus on signal characteristics has neglected listener and contextual characteristics. To demonstrate the influence of the meaning a listener attaches to a sound on the resulting sensations, we used Fourier-time-transform (FTT) processing to reduce the identifiability of 18 environmental sounds. In a listening experiment, 20 subjects listened to and rated their sensations in response to, first, all the processed stimuli and then all the original stimuli, without being aware of the relationship between the two groups. Another 20 subjects rated only the processed stimuli, which were primed by their original counterparts. This manipulation was used to examine how the resulting sensations differ when the subject can tell what the sound source is. In both tests subjects rated their emotional experience for each stimulus on the orthogonal dimensions of valence and arousal, as well as perceived annoyance and perceived loudness. They were also asked to identify the sound source. Processing was found to reduce correct identification substantially, while priming recovered most of the identification.
While original stimuli induced a wide range of emotional experience, reactions to processed stimuli were emotionally neutral. The priming manipulation reversed the effects of processing to some extent. Moreover, even though the 5th-percentile Zwicker loudness (N5) of most stimuli was reduced by processing, neither perceived loudness nor auditory-induced emotion changed accordingly. These results indicate the importance of considering factors other than physical sound characteristics in sound design.

0 INTRODUCTION

Psychoacoustical models provide methods to link the physical properties of sounds and the resulting sensations [1]. They are widely used in product sound quality and auditory display design applications, where the designer strives to convey auditory information to users. Hence, in auditory display design, the central concept is the listener. In order to achieve successful auditory displays, designers must possess an understanding of how people hear. Traditional psychoacoustical research has offered invaluable knowledge on how human audition works. However, it does not resolve other issues related to the specific psychological processes that may impact the auditory perception of the listener, who is in an arbitrary affective state and possesses idiosyncratic knowledge, experience, and expectations. Thus, we argue that physical characteristics are not able to fully capture psychological processes in human auditory perception and that focusing solely on form (i.e., signal) characteristics of sound would be to oversimplify auditory perception [2]. An alternative approach to psychoacoustics may consider emotional responses to sound and their influence on sound perception. Emotions play a central role in many perceptual mechanisms [3–6], and in our daily life we are constantly subjected to sounds, which often induce emotional reactions in their perceivers [7, 8].
Sensations evoked by, and emotional reactions to, sound may also influence quality judgments. Following the discussion by Blauert and Jekosch [9], emotions and feelings have positive and negative components; if a positive emotional experience is attributed to a specific sound, the sound may be judged to be of high quality. Therefore, understanding how sounds evoke emotions could improve our knowledge of human audition and contribute to the creation of more successful sound design applications (see emotional design, [10]). Hence, we have named this alternative approach to psychoacoustics emotional sound design, or emoacoustics. A substantial body of research on auditory-induced emotion has sought to link physical characteristics of sound to certain emotional responses, including annoyance (for a review see [11]). Much of this research has linked physical characteristics such as loudness [12], spectral bandwidth [13, 14], temporal variability [15, 16], or frequency content [14, 16] of simple noise-tone complexes to auditory-induced annoyance. However, unlike traditional psychoacoustics, the ecological approach to psychoacoustics [17] suggests that instead of hearing different physical properties of sound, such as intensities, frequencies, and durations, we hear “auditory events,” which occur in a specific context and are heard by a specific listener. Therefore, we argue that apart from physical properties, the listener and the context are also important factors in evoking emotional responses to auditory information [18]. In other words, one particular sound could evoke a certain emotional response for one listener-context combination, while it could induce a completely different set of emotional experiences and behaviors for another. This is partly due to the “meaning” the listener attributes to sounds.
The meaning ascribed to a sound stimulus is related to one’s own personal experience: evoked sensations and emotional responses are often learned or conditioned responses, linked to the associations made when the stimulus is encountered [18]. The principal aim of the present study is to show that auditory stimuli are capable of inducing emotional reactions in the listener and, further, that these responses do not depend solely on the physical properties of the sound. We argue that one needs to take into account the meaning ascribed to sound by the listener in order to understand how sounds induce emotions. In everyday listening people tend to first identify the sound source [19] and use this information to categorize the sound [20]. Therefore, in the present study we focus on the source identification of everyday sounds in order to investigate the importance of meaning in auditory-induced emotion. We attempted to reduce the identifiability of a sound source while keeping the physical form of the sound as similar to the original as possible by the use of an FTT (Fourier-time-transform) processing algorithm. A similar algorithm was proposed by Fastl [21] in order to investigate the influence of the meaning of a sound on noise annoyance, arguing that existing psychoacoustic models do not account for non-sensory influences. The proposed algorithm was based on executing an FTT, performing spectral broadening, and executing an inverse FTT back into the time domain. Later studies [22–24] investigated the effect of this procedure on loudness and annoyance judgments and on a number of semantic differential scales. As a result, it was found that the proposed algorithm caused a significant reduction in the identifiability of everyday sounds. Further, these studies found that the procedure did not cause a substantial change in perceived loudness and that annoyance ratings for the processed stimuli were, on average, higher than those for the original stimuli.
Even though there are similarities between the aforementioned studies and the present study, our aim is to experimentally manipulate the emotional experience induced by environmental sounds. Since emotional responses are often related to the associations made when the stimulus is encountered, we attempted to break these associations by reducing the identifiability of the sound source. We expected a change in auditory-induced emotions as a result of the reduction in identifiability. However, this change was not expected to be an overall increase in annoyance; rather, it was hypothesized to depend on the emotional content of the original stimuli: the stronger the emotional responses led by the associations made when a stimulus is encountered, the larger the expected change when those associations are no longer accessible. Therefore, changes in auditory-induced emotion due to processing were expected to be larger for emotionally loaded than for emotionally neutral stimuli.

1 EXPERIMENT

1.1 Methods

1.1.1 Participants

Forty subjects (15 females) took part in the test in two different settings (20 participants in each). Their ages varied between 22 and 44 years (mean 25.1). They were compensated for their participation after the test. The experiment was carried out in accordance with the ethical standards of the 1964 Declaration of Helsinki.

1.1.2 Stimuli

Eighteen auditory stimuli were selected from the International Affective Digitized Sounds (IADS) database [25], which consists of recordings of brief (approximately 6 s) everyday events. The selection criteria were that the chosen stimuli should cover a wide range of emotional responses; stimuli containing music or erotica, and those judged to be of poor quality, were avoided.
First, all stimuli in the IADS database were divided into three groups according to their normative mean valence ratings (on a 9-point scale from 1 to 9): pleasant (greater than 6), neutral (between 4 and 6), and unpleasant (smaller than 4). Then, six sounds were selected from each group (Table 1). All stimuli were sampled at 44.1 kHz and were six seconds long.

Table 1. Auditory stimuli used in the experiment and their normative mean valence and arousal ratings in the IADS database

Category      Sound                       Valence   Arousal
Unpleasant    Screaming woman             1.74      8.07
              Alarm                       2.38      7.96
              Bees                        2.58      7.01
              Growling dog                2.73      7.77
              Pigs                        3.56      6.38
              Jackhammer                  3.66      5.60
Neutral       Cuckoo clock                4.23      6.19
              Cat                         4.31      5.47
              Ticking clock               4.38      4.56
              Helicopter                  5.19      5.50
              Frogs by a lake             5.49      4.51
              Toilet flush                5.57      3.37
Pleasant      Cows                        6.09      4.27
              Carousel                    6.21      5.43
              Rollercoaster               6.90      7.36
              Beverage bottle opening     7.13      4.43
              Laughter                    7.32      6.37
              Applause                    7.69      5.57

1.1.3 Manipulation

The selected stimuli were manipulated using an FTT processing algorithm, which is based on executing an FTT, averaging the spectrum, and carrying out an inverse FTT [26]. The original sound is divided into time windows that are then transformed into the frequency domain. The amplitudes are averaged within each frequency band, and the manipulated signal is transformed back into the time domain without changing its phase. This algorithm distorts spectral and temporal information, thus reducing the possibility of identifying the sound source, while keeping physical characteristics such as general frequency content and temporal evolution as close to the original as possible. By careful selection of the length of the time window and the width of the frequency band, one can control the changes in the temporal and spectral domains and keep physical changes to a minimum.
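The windowing, band-averaging, and resynthesis steps just described can be sketched as follows. This is a minimal illustration, not the authors’ exact FTT implementation: it uses a plain short-time Fourier transform with Hanning windows and overlap-add resynthesis, and the band-edge construction (starting at a hypothetical 25 Hz) is an assumption.

```python
import numpy as np

def third_octave_band_edges(fs, n_fft, f_min=25.0):
    """FFT-bin indices of 1/3-octave band edges up to the Nyquist frequency.
    f_min is a hypothetical lowest band edge, not taken from the paper."""
    freqs = [f_min]
    while freqs[-1] * 2 ** (1 / 3) < fs / 2:
        freqs.append(freqs[-1] * 2 ** (1 / 3))
    freqs.append(fs / 2)
    return np.unique((np.asarray(freqs) * n_fft / fs).astype(int))

def ftt_like_smearing(x, fs, n_fft=4096):
    """Window the signal (Hanning, 75% overlap), average the FFT magnitudes
    within each 1/3-octave band while keeping the phase, then overlap-add."""
    hop = n_fft // 4                       # 75% overlap
    win = np.hanning(n_fft)
    edges = third_octave_band_edges(fs, n_fft)
    y = np.zeros(len(x))
    norm = np.zeros(len(x))
    for start in range(0, len(x) - n_fft + 1, hop):
        spec = np.fft.rfft(x[start:start + n_fft] * win)
        mag, phase = np.abs(spec), np.angle(spec)
        for lo, hi in zip(edges[:-1], edges[1:]):
            mag[lo:hi] = mag[lo:hi].mean()  # flatten spectral detail in band
        frame = np.fft.irfft(mag * np.exp(1j * phase), n_fft)
        y[start:start + n_fft] += frame * win
        norm[start:start + n_fft] += win ** 2
    return y / np.maximum(norm, 1e-12)      # normalize the overlap-add
```

Averaging the magnitudes while preserving the phase keeps the broad spectral envelope and temporal evolution, while fine spectral structure that supports source identification is smoothed away; longer windows additionally blur temporal detail.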
In the present study, spectral averaging was done in 1/3-octave bands, and 75%-overlapping Hanning windows of different lengths (4096, 8192, or 16384 samples) were used. The authors chose the processing level (i.e., the length of the time window) informally for each sound, the goal being to find the lowest level of manipulation needed to reduce the identifiability of the sound source. After this initial selection, a group of other individuals, unaware of the purpose of the study, were asked to identify the sound sources of the processed versions of the stimuli. The final level of processing for each sound was determined after these evaluations.

1.1.4 Measures

In the test, participants were asked to rate how they felt during each stimulus using the 9-point Self-Assessment Manikin (SAM) scales of valence and arousal [27]. Using these scales subjects rated, respectively, the pleasantness and the arousal level of their current feelings. Participants were also asked to rate how annoying each stimulus was on a 9-point unipolar scale from 1 (not at all annoying) to 9 (very much annoying). On a similar scale participants indicated the perceived loudness of each sound (1, not loud at all; 9, very loud). Finally, they were asked to identify the sound source in their own words.

1.1.5 Procedure

The experiment was carried out in a classroom. Sounds were reproduced through headphones (Sennheiser HD414), and participants rated the sounds on paper-and-pencil scales. In the first setting, participants rated all the processed stimuli and then rated all the original stimuli, presented in a different order (for simplicity these two stimulus sets will be referred to as processed and original sounds, respectively). Participants were not aware of the existence of two sets of stimuli.
They were told that they would listen to and rate 36 auditory stimuli recorded in everyday situations. In the second setting, participants rated only processed sounds, each primed by its original counterpart. Priming was done by presenting each original sound before its processed version. Participants were asked to rate the sounds in the same way as in the first setting (this stimulus set will be referred to as primed sounds). None of the participants took part in both settings. We expected that after processing, stimuli would be unidentifiable due to the changes inflicted on their spectral and temporal patterns. However, disrupting identification does not necessarily eliminate the meaning of a sound: even with incorrect identification, new associations can be made and new meanings might emerge. Moreover, despite the fact that we aimed for the lowest level of processing, original and processed sounds are not physically identical. Hence, with the priming manipulation we aimed to restore the original associations that processing had made inaccessible, and thereby to manipulate the emotional experience induced by the processed sounds. Priming the processed stimuli was therefore expected to reverse the effects of processing on auditory-induced emotion to some extent. Additionally, a third set of participants (22 subjects, mean age 25.5 ± 1.2) were asked to rate how they felt during each processed sound, using the aforementioned SAM scales of valence and arousal. Before hearing each sound, participants were presented with a text describing the sound they would hear. This last group (verbally primed) was introduced as an additional priming measure, since another priming technique was needed to control for the possibility that participants in the auditory priming group might think of the original sounds during their judgments, even though they were explicitly instructed to rate the processed stimuli.
1.2 Results

1.2.1 Identifiability

Results of the identification task showed that the processing caused identifiability of the source to decrease dramatically (t = 19.9, df = 17, p < .0001), from a mean of 91% (±3.4%, SE of the mean indicated) correct identification for original sounds to a mean of 11% (±3.4%) for processed stimuli. Moreover, priming the processed stimuli caused identifiability to increase significantly compared to the processed condition (t = 18.8, df = 17, p < .0001), to a mean of 78% (±2.7%) correct identification. Even though the effects of processing and priming on identifiability were not uniform across all sounds, because of differences in temporal and spectral properties, the changes for each individual sound were statistically significant (p < .01).

1.2.2 Affective Ratings

The aim of the study was to demonstrate that auditory-induced emotions depend not only on the physical characteristics of the sound but also on the meaning attributed to it, and to fulfill this aim we experimentally manipulated auditory-induced emotion. The reduction in identifiability was expected to cause larger changes in auditory-induced emotion for stimuli with high emotional content (pleasant or unpleasant) than for neutral stimuli. For each affective dimension (i.e., valence and arousal), a three-factor repeated-measures ANOVA was run, employing processing (original vs. processed), pleasantness of the original stimuli (pleasant, neutral, or unpleasant), and sound (6 stimuli in each condition) as within-subject factors. Results showed that, on average, ratings along the valence dimension were significantly higher for original than for processed stimuli (F(1,12) = 32.422, p < .001). Also, a significant interaction was found between processing and pleasantness (F(2,24) = 78.046, p < .0001), showing that the effect of processing on valence differed significantly according to the pleasantness of the original stimuli.
When the effect of processing was investigated separately for each pleasantness group, it was found that valence ratings of unpleasant sounds increased as a result of processing (F(1,18) = 20.546, p < .001), whereas for neutral and pleasant sounds valence ratings decreased (F(1,15) = 7.410, p < .03; and F(1,14) = 196.909, p < .0001 for neutral and pleasant stimuli, respectively). For the arousal dimension, the processing by pleasantness interaction was also statistically significant (F(2,20) = 30.007, p < .0001). Further analysis showed that while arousal ratings of unpleasant sounds significantly decreased due to processing (F(1,18) = 35.532, p < .0001), they increased for neutral and pleasant stimuli (F(1,14) = 6.567, p < .03; and F(1,13) = 6.844, p < .03, respectively). An additional repeated-measures ANOVA was run, employing pleasantness and sound as within-subject factors and priming (processed vs. primed) as a between-subjects factor. A significant interaction of the priming and pleasantness factors was found for both the valence (F(2,54) = 7.809, p < .01) and arousal (F(2,48) = 3.81, p < .05) dimensions, indicating that the emotional experience differed depending on the pleasantness of the original stimuli. Univariate analyses showed that valence ratings for primed stimuli that were originally pleasant were higher than for their processed counterparts (F(1,30) = 9.91, p < .01), whereas they were lower for originally unpleasant stimuli (F(1,35) = 7.05, p < .05). Moreover, arousal ratings for originally unpleasant primed stimuli were higher than for processed stimuli (F(1,35) = 7.99, p < .01). Mean valence and arousal judgments for original, processed, and primed stimuli can be seen in Fig. 1.

Fig. 1. Mean (±SE) valence and arousal judgments for original, processed, and primed stimuli.

Fig. 2. Effect of processing and priming on perceived annoyance.
Stimuli are sorted according to the valence ratings for the original stimuli. Inset: processing by pleasantness interaction.

Finally, possible differences between the two priming methods were investigated using a repeated-measures ANOVA with pleasantness and sound as within-subject factors and priming method (verbal vs. auditory) as a between-subjects factor. No significant main effect of priming method was found (F(1,35) = .06, p = .802; and F(1,34) = 1.08, p = .307 for the valence and arousal dimensions, respectively). Further, the priming method factor did not interact significantly with any other factor. This finding suggests that there was no substantial difference between the priming methods.

1.2.3 Annoyance

A three-factor repeated-measures ANOVA employing processing (original vs. processed), pleasantness of the original stimuli (unpleasant, neutral, or pleasant), and sound (6 sounds in each condition) as within-subject factors was run to investigate the impact of processing on auditory-induced annoyance (for mean levels see Fig. 2). Apart from the significant main effect of pleasantness (F(2,36) = 35.736, p < .0001), the processing by pleasantness interaction (F(2,36) = 46.855, p < .0001) was found to be significant. This suggests that even though processing did not have a significant main effect, its effect on auditory-induced annoyance differed significantly depending on the pleasantness of the original stimuli. When the effects were investigated within the separate pleasantness groups, significant effects were found for unpleasant and pleasant stimuli, in opposite directions (F(1,18) = 31.139, p < .0001; and F(1,19) = 37.387, p < .0001 for unpleasant and pleasant stimuli, respectively). In other words, due to FTT processing, auditory-induced annoyance decreased for unpleasant stimuli, while it increased for pleasant stimuli. For neutral stimuli there was no significant effect of processing. In order to investigate the differences between processed and primed stimuli, a separate ANOVA comparing the ratings of participants in the two settings was carried out, employing priming (primed vs. processed) as a between-subjects factor while keeping pleasantness and sound as within-subject factors. Although no significant main effect of priming was found, the priming by pleasantness interaction was statistically significant (F(2,76) = 7.725, p < .003), indicating that priming had a different effect on auditory-induced annoyance depending on the pleasantness of the original stimuli. Further univariate tests showed that after priming, auditory-induced annoyance increased for initially unpleasant stimuli, while it decreased for the rest (see inset in Fig. 2).

1.2.4 Loudness

Since previous research showed that loudness is one of the most important factors influencing auditory-induced annoyance, the change in perceived loudness as a result of processing was also investigated in the present study (Fig. 3). In a three-factor ANOVA employing processing (original vs. processed), pleasantness (unpleasant, neutral, or pleasant), and sound, there was a significant main effect of processing, indicating that perceived loudness for original sounds was higher on average than for processed sounds (F(1,19) = 6.470, p < .05).

Fig. 3. Effect of processing and priming on perceived loudness. Stimuli are sorted according to the valence ratings for the original stimuli. Inset: processing by pleasantness interaction.
Furthermore, there were significant interactions of sound by processing (F(5,95) = 4.058, p < .05) and pleasantness by processing (F(2,38) = 28.293, p < .0001), showing that the effect of processing was not uniform across stimuli. When the differences in loudness were investigated within the separate pleasantness groups, a significant effect of processing was found for unpleasant sounds (F(1,19) = 39.107, p < .0001), indicating that perceived loudness for the original stimuli was higher than for the processed stimuli. However, there was no significant effect of processing for either pleasant or neutral stimuli, which implies that processing caused perceived loudness to decrease for unpleasant stimuli only. A post-hoc analysis of changes in the 5th-percentile Zwicker loudness (N5) showed that N5 decreased for almost every stimulus due to processing. Therefore, in order to further investigate the dependence of perceived loudness on this psychoacoustic metric, an additional post-hoc regression analysis was done. The mean perceived loudness levels of the original and processed stimuli were taken as the dependent variable, and mean valence, arousal, and annoyance ratings and N5 were taken as independent variables. Primed stimuli were not included in this analysis, since participants’ loudness judgments might have been affected by the fact that they heard the original stimuli before rating the processed stimuli. The analysis showed that, among the above-mentioned independent variables, the best linear model explaining the variance in perceived loudness (Adj. R2 = .77, p < .0001) included auditory-induced arousal (standardized β = 0.69) and N5 (standardized β = 0.39).
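The standardized betas reported above come from an ordinary least-squares fit on z-scored variables. A minimal sketch of that computation follows; the data here are synthetic stand-ins with effect sizes chosen to mimic the reported pattern, not the study’s actual stimulus means.

```python
import numpy as np

def standardized_betas(X, y):
    """OLS on z-scored predictors and outcome: coefficients are standardized betas."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return beta

# Hypothetical stand-ins for the 36 stimulus means (arousal rating, N5 in sone).
rng = np.random.default_rng(0)
arousal = rng.uniform(1, 9, 36)
n5 = rng.uniform(5, 40, 36)
# Simulated perceived loudness: arousal weighted more heavily than N5, plus noise.
loudness = (0.7 * (arousal - arousal.mean()) / arousal.std()
            + 0.4 * (n5 - n5.mean()) / n5.std()
            + rng.normal(0, 0.3, 36))
betas = standardized_betas(np.column_stack([arousal, n5]), loudness)
print(betas)  # first (arousal) beta larger than second (N5) beta
```

Because both predictors and outcome are z-scored, the coefficients are directly comparable in size, which is what licenses the statement that arousal contributed more than N5 to perceived loudness.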
Since previous research has claimed a strong relationship between auditory-induced annoyance and loudness for meaningless sounds [28], we studied the correlation between the mean auditory-induced annoyance and mean perceived loudness for each stimulus. Auditory-induced annoyance and perceived loudness were positively correlated for all three stimulus groups (original, processed, and primed), although the correlation was weaker for processed stimuli. Pearson correlation coefficients were .77, .54, and .74 (N = 18, p < .05 for all three) for original, processed, and primed stimuli, respectively.

2 DISCUSSION AND CONCLUSIONS

The present study set out to show that the physical characteristics of acoustic stimuli are not able to fully capture auditory-induced emotion. Auditory emotions are often idiosyncratic and depend on the meaning an individual in a certain circumstance ascribes to sounds [18]. To investigate the meaning attached to sounds, we reduced the identifiability of a number of environmental sounds in order to disrupt the meaning-attribution process. The results of the identification task showed that the proposed FTT processing worked as intended: correct identification was reduced significantly for every stimulus. In order to experimentally vary the context, and force specific associations and meaning, we introduced a priming manipulation. Correct identification increased significantly for every processed stimulus in the priming condition. Although the change in correct identification due to processing was not the same for every stimulus, auditory-induced emotion varied systematically depending on the emotional content of the original stimuli. The influence of the implemented processing algorithm on auditory-induced valence and arousal was in opposite directions for pleasant and unpleasant stimuli.
Also, auditory-induced annoyance was affected only when the original stimulus was emotionally charged: auditory-induced annoyance increased for pleasant stimuli, whereas it decreased for unpleasant stimuli. These changes in emotional experience due to processing support the hypothesis that sounds with high emotional content become more neutral when the associations that lead to strong emotional responses are inaccessible. However, we maintain that it is likely that even with the processed sounds, subjects still perceived auditory events rather than physical properties, which is plausible here since subjects were instructed that both original and processed stimuli were recorded in everyday situations depicting everyday events. Responses to the identification task support this argument, since subjects identified the sources of processed sounds, or had associations, even if they were incorrect. Nevertheless, these associations were emotionally neutral. Furthermore, it was also hypothesized that once these associations were formed again by the use of priming, the effects of processing on auditory-induced emotion would be reversed, which is supported by the results. In other words, the priming manipulation caused processed stimuli to trigger emotionally loaded associations consistent with the initial exposure. As a result, the emotional experience changed significantly for those sounds that were initially emotional, i.e., pleasant or unpleasant. Earlier studies [22–24], which employed a similar paradigm, suggested that due to processing (1) the identifiability of the sounds was greatly reduced, and (2) annoyance judgments were on average higher compared to the original stimuli, whereas (3) loudness judgments were not affected by processing. In the present study, we found a reduction in correct identification due to processing. However, our results on auditory-induced annoyance were not totally in line with these previous studies.
The reason for this difference might be that in previous research the stimulus set comprised mainly non-stationary everyday noises and product sounds, most of which had emotionally neutral meaning, while the present study used sounds with a wide range of emotional content, including human and animal vocalizations. Our stimulus set contained highly unpleasant sounds (Table 1), whose processed versions induced less annoyance than their original counterparts. Moreover, our findings on loudness judgments were not in line with the previous research. However, direct comparison of loudness results between studies might not be relevant, since the implemented processing algorithm was not identical to the one used in the previous studies. In particular, our processing algorithm caused a decrease in Zwicker loudness (N5), while the processing in the previous studies kept the loudness-time function intact. Emotional responses and evoked sensations are often learned responses, linked to the associations made when the stimulus is encountered [18]. In the present research, we attempted to interrupt these associations by reducing the identifiability of the sound sources. Looking at participants’ valence, arousal, and annoyance ratings, we claim that the emotional experience induced by the stimuli became neutral when the original associations were made inaccessible. Thus, one can say that the processed stimuli were emotionally neutral. With the priming manipulation we tried to force specific, emotionally loaded associations onto these (processed) stimuli. As a result, priming the processed stimuli helped activate emotionally loaded associations, and thus, even though the sounds were physically identical, the primed stimuli induced a wider range of emotional experience (both pleasant and unpleasant) than the processed stimuli.
This finding supports our argument that the physical or psychoacoustical characteristics of auditory stimuli cannot fully account for evoked sensations and emotional reactions. Like much of previous research (e.g., [11, 12]), which linked the annoying quality of a sound to its loudness, the present study also found significant correlations between annoyance and loudness judgments. Furthermore, the implemented algorithm caused the 5th-percentile loudness (N5) to decrease for almost every stimulus. However, neither emotional experience nor loudness judgments followed this decrease. Looking more carefully at the perceived loudness results, we found that it decreased only for originally unpleasant stimuli, and that auditory-induced arousal and N5 could account for 77% of the variance in perceived loudness in a regression analysis, to which the contribution of auditory-induced arousal was substantially larger than that of the calculated loudness. These findings suggest that changes in psychoacoustic metrics might not fully capture the changes in resulting sensations in an everyday listening situation. Loudness judgments might have been affected by the emotional experience: unpleasant stimuli, which are more salient than neutral or positive stimuli [29], might be perceived as louder due to the negative affect they induce. However, this speculation has to be investigated further before any firm conclusions can be drawn. In conclusion, the present study demonstrates that sound induces emotional reactions in its perceiver and that focusing only on the physical properties of auditory stimuli is not sufficient to understand auditory-induced emotion. One also needs to consider the associations made by the listener. Therefore, we claim that auditory-induced emotion is an important component of auditory perception to consider in product sound quality and in sound design applications in general, where some form of auditory information is conveyed to the listener.
Hence, sound designers need to be aware of the different capacities of the physical, psychoacoustical, and psychological dimensions of auditory displays to induce emotional reactions in the listener, in order to succeed in designing effective auditory displays.

ACKNOWLEDGMENTS

A preprint version of this paper was presented at the 3rd International Workshop on Perceptual Quality of Systems, PQS, Bautzen, Germany, September 2010.

3 REFERENCES

[1] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models (Springer-Verlag, Berlin, 1990).
[2] U. Jekosch, "Meaning in the Context of Sound Quality Assessment," Acta Acustica, 85, pp. 681–684 (1999).
[3] A. Damasio, Descartes' Error (Putnam, NY, 1994).
[4] R. Zeelenberg, E. Wagenmakers, and M. Retteveel, "The Impact of Emotion on Perception: Bias or Enhanced Processing," Psychological Science, 17(4), pp. 287–291 (2006).
[5] E. A. Phelps, S. Ling, and M. Carrasco, "Emotion Facilitates Perception and Potentiates the Perceptual Benefits of Attention," Psychological Science, 17(4), pp. 292–299 (2006).
[6] T. Gan, N. Wang, Z. Zhang, H. Li, and Y. Lou, "Emotional Influences on Time Perception: Evidence from Event-Related Potentials," NeuroReport, 20, pp. 839–843 (2009).
[7] M. M. Bradley and P. J. Lang, "Affective Reactions to Acoustic Stimuli," Psychophysiology, 37(2), pp. 204–215 (2000).
[8] A. Tajadura-Jiménez and D. Västfjäll, "Auditory-Induced Emotion: A Neglected Channel for Communication in Human-Computer Interaction," in C. Peter and R. Beale [Eds.], Affect and Emotion in Human-Computer Interaction: From Theory to Applications, pp. 63–74 (Springer-Verlag, Berlin/Heidelberg, 2008).
[9] J. Blauert and U. Jekosch, "A Layer Model of Sound Quality," J. Audio Eng. Soc., this issue.
[10] D. Norman, Emotional Design: Why We Love (or Hate) Everyday Things (Basic Books, NY, 2004).
[11] D.
Västfjäll, "Affect as a Component of Perceived Sound and Vibration Quality in Aircraft," Ph.D. thesis (Chalmers TH, 2003).
[12] B. Berglund, U. Berglund, and T. Lindvall, "Scaling Loudness, Noisiness and Annoyance of Aircraft Noise," Journal of the Acoustical Society of America, 57, pp. 930–934 (1975).
[13] U. Ländström, E. Åkerlund, A. Kjellberg, and M. Tesarz, "Exposure Levels, Tonal Components, and Noise Annoyance in Working Environments," Environment International, 21(3), pp. 265–275 (1995).
[14] E. A. Björk, "Laboratory Annoyance and Skin Conductance Responses to Some Natural Sounds," Journal of Sound and Vibration, 109(2), pp. 339–345 (1986).
[15] K. Hiramatsu, K. Takagi, and T. Yamamoto, "Experimental Investigation on the Effect of Some Temporal Factors of Nonsteady Noise on Annoyance," Journal of the Acoustical Society of America, 74(6), pp. 1782–1793 (1983).
[16] E. A. Björk, "Startle, Annoyance and Psychophysiological Responses to Repeated Sound Bursts," Acta Acustica, 85, pp. 575–578 (1999).
[17] J. G. Neuhoff, Ecological Psychoacoustics (Elsevier Academic Press, San Diego, CA, 2004).
[18] P. Juslin and D. Västfjäll, "Emotional Responses to Music: The Need to Consider Underlying Mechanisms," Behavioral and Brain Sciences, 31(5), pp. 559–621 (2008).
[19] W. W. Gaver, "What Do We Hear in the World? An Ecological Approach to Auditory Event Perception," Ecological Psychology, 5(1), pp. 1–29 (1993).
[20] P. Bergman, A. Sköld, D. Västfjäll, and N. Fransson, "Perceptual and Emotional Categorization of Sound," Journal of the Acoustical Society of America, 126(6), pp. 3156–3167 (2009).
[21] H. Fastl, "Neutralizing the Meaning of Sound for Sound Quality Evaluations," Proceedings of the 17th ICA 2001, Rome, Italy, edited by Int. Commission of Acoust., vol. IV, CD-ROM (2001).
[22] W. Ellermeier, A. Zeitler, and H.
Fastl, "Impact of Source Identifiability on Perceived Loudness," Proceedings of the 18th ICA 2004, Kyoto, Japan, pp. 1491–1494 (2004).
[23] W. Ellermeier, A. Zeitler, and H. Fastl, "Predicting Annoyance Judgments from Psychoacoustic Metrics: Identifiable versus Neutralized Sounds," Proceedings of the 33rd Internoise 2004, Prague, Czech Republic, Book of Abstracts, no. 267 (2004).
[24] A. Zeitler, W. Ellermeier, and H. Fastl, "Significance of Meaning in Sound Quality Evaluation," Proceedings SFA/DAGA 2004, Strasbourg, France (2004).
[25] M. M. Bradley and P. J. Lang, "International Affective Digitized Sounds (IADS): Stimuli, Instruction Manual and Affective Ratings (Tech. Rep. No. B-2)," Gainesville, FL: The Center for Research in Psychophysiology, University of Florida (1999).
[26] A. Genell, "Perception of Sound and Vibration in Heavy Trucks," Ph.D. thesis (Chalmers TH, 2008).
[27] P. J. Lang, "Behavioral Treatment and Bio-Behavioral Assessment: Computer Applications," in J. B. Sidowski, J. H. Johnson, and T. A. Williams [Eds.], Technology in Mental Health Care Delivery Systems, pp. 119–137 (Ablex Publishing, Norwood, NJ, 1980).
[28] D. Västfjäll, "Affective Reactions to Meaningless Sounds," Cognition and Emotion, in press.
[29] A. Tajadura-Jiménez, A. Väljamäe, E. Asutay, and D. Västfjäll, "Embodied Auditory Perception: The Emotional Impact of Approaching and Receding Sound Sources," Emotion, 10(2), pp. 216–229 (2010).

THE AUTHORS

Erkin Asutay is a Ph.D. student at the Division of Applied Acoustics, Chalmers University of Technology. His research interests are how sound induces affective reactions in the listener and how emotions affect auditory perception. He also carries out research on the objective and subjective characterization of the mechanical key-action properties of pipe organs, where he studies how the haptic feedback the organist receives while playing affects the perception of the instrument.
Daniel Västfjäll has dual Ph.D.s in psychology and acoustics from Göteborg University and Chalmers University of Technology. He is currently full professor of cognitive psychology at Linköping University, a researcher at the Division of Applied Acoustics, Chalmers University of Technology, and a research scientist at Decision Research, Eugene, OR. His interests center on the role of emotion in auditory perception, emotional reactions to music, and emotion-cognition interactions. He has published work on emoacoustics in various leading scientific journals and has led several research projects on emotion and sound.

Ana Tajadura-Jiménez studied telecommunication engineering at Universidad Politécnica de Madrid, Spain. She received a Master's degree in digital communication systems and technology in 2002 and a Ph.D. degree in psychoacoustics in 2008, both from Chalmers University of Technology, Gothenburg, Sweden. Since 2009 she has been a post-doctoral researcher at Royal Holloway, University of London, UK. She has been a visiting researcher at Universitat de Barcelona, Spain (2006) and at NTT Communication Science Laboratories, Japan (2007, 2011). Dr. Tajadura-Jiménez has coauthored a number of papers on auditory and multisensory perception, emotional processing, and self-perception in everyday contexts and new media technologies.

Anders Genell, after obtaining his M.Sc. in sound and vibration at Chalmers University of Technology, continued as a doctoral student within the CRAG-MSA research group, focusing on emotional reactions to and quality aspects of interior vehicle sound and vibrations. In 2008 he presented his thesis, "Perception of Sound and Vibration in Heavy Trucks." Since then he has been employed at the Swedish National Road and Transport Research Institute (VTI), where he, among other things, has developed a real-time generated sound model for advanced moving-base driving simulators.
Penny Bergman studied at Gothenburg University. She received a Master's degree in management in 2005 and a B.Sc. in psychology in 2006. She is currently a doctoral student at the Division of Applied Acoustics at Chalmers University of Technology. Her research interests are mainly the role of affect in the interpretation and perception of sounds, multimodal integration, and the role of sound in restorative environments.

Mendel Kleiner received his M.Sc.E.E. degree from Chalmers University of Technology, Göteborg, Sweden, in 1969 and obtained a Ph.D. in building acoustics in 1978. He held a position as lecturer at Chalmers, teaching and doing research on active room acoustics and sound field simulation. In 1989, Dr. Kleiner was awarded the honorary title of 'D. He became a fellow of the Acoustical Society of America in 2000 and is a member of its technical committee on architectural acoustics. From 2003 to 2005, Prof. Kleiner was professor in the School of Architecture and director of the program in architectural acoustics at Rensselaer Polytechnic Institute, Troy, NY, USA. He is the author or coauthor of more than 130 scientific papers, holds two patents, and has written two books, Worship Space Acoustics and Acoustics and Audio Technology, both published by J. Ross, Ft. Lauderdale, FL. His current research interests are auditory presence, virtual acoustics and human behavior, sensory interaction, room acoustics and auralization (particularly spaces for music and worship), acoustic measurement, audio and electroacoustics, sound reproduction systems (microphones, loudspeakers, and arrays), sound quality (vehicles, machinery, technical and communication systems), and musical acoustics.