Topics in Cognitive Science 8 (2016) 425–434
Copyright © 2016 The Authors. Topics in Cognitive Science published by Wiley Periodicals, Inc. on behalf of Cognitive Science Society
ISSN: 1756-8757 print / 1756-8765 online
DOI: 10.1111/tops.12197

Foundations of Intonational Meaning: Anatomical and Physiological Factors

Carlos Gussenhoven
Department of Linguistics, Radboud University Nijmegen

Received 17 January 2014; received in revised form 17 July 2015; accepted 6 August 2015

Abstract

Like non-verbal communication, paralinguistic communication is rooted in anatomical and physiological factors. Paralinguistic form-meaning relations arise from the way these affect speech production, with some fine-tuning by the cultural and linguistic context. The effects have been classified as “biological codes,” following the terminological lead of John Ohala’s Frequency Code. Intonational morphemes, though arguably non-arbitrary in principle, are in fact heavily biased toward these paralinguistic meanings. Paralinguistic and linguistic meanings for four biological codes are illustrated. In addition to the Frequency Code, the Effort Code, and the Respiratory Code, the Sirenic Code is introduced here, which is based on the use of whispery phonation, widely seen as being responsible for the signaling and perception of feminine attractiveness and sometimes used to express interrogativity in language. In the context of the evolution of language, the relations between physiological conditions and the resulting paralinguistic and linguistic meanings will need to be clarified.

Keywords: Intonational meaning; Paralinguistics; Biological codes; Effort Code; Frequency Code; Respiratory Code; Sirenic Code

1. Introduction

Human vocal communication proceeds through two systems, language and paralanguage (Ladd, 1996). The simultaneous use of these systems is particularly evident in vocal fold vibration, and tone and intonation therefore provide a unique platform for seeing them in action.
This contribution has two goals. First, it argues that the form–meaning relations in expressive uses of vocal pitch are grounded in anatomical and physiological effects on vocal fold vibration and that the intonation systems of languages are biased toward the form–meaning relations based on these “biological codes.” Second, it introduces the Sirenic Code as an explanation for the use of breathy or whispery voice in questions.

Correspondence should be sent to Carlos Gussenhoven, Afdeling Taalwetenschap, Faculteit der Letteren, Radboud University Nijmegen, Postbus 9103, 6500HD Nijmegen, The Netherlands. E-mail: c.gussenhoven@let.ru.nl

2. Paralinguistics

Paralinguistics preceded the origin of language. Just as non-verbal gestures like the nose wrinkle for social distancing and the eyebrow flash for surprise or fear are “grounded in our common anatomical-emotional history” (Segerstråle & Molnar, 1997; in reference to a chapter by Wolf Schiefenhövel in their volume), so paralinguistic form–meaning relations are based on metaphorical interpretations of the effects of anatomical and physiological factors on vocal fold vibration. John Ohala pointed out that differences in larynx size and the resulting differences in vocal fold size, notably those between men and women, are responsible for the interpretation of high pitch as “small” meanings and low pitch as “big” meanings, a relation he termed the Frequency Code (Ohala, 1983, 1984, 1996). In retrospect, there are additional codes besides the size/frequency relation. Gussenhoven (2002, 2004) extended the idea to the Effort Code and the Respiratory Code. (The latter term is from Nolan [2006].) A fourth code to be discussed below is the Sirenic Code. Paralinguistic meanings can be divided into those that relate to the message (“informational” meanings) and those that relate to the speaker (“affective” meanings).
An example of an informational meaning is “interrogative,” while an example of an affective meaning is “cooperative.”

2.1. The Frequency Code

The association of “small” meanings with high pitch yields interpretations like submission, friendliness, uncertainty, and vulnerability, while “big” meanings are their opposites: dominance, aggressiveness, certainty, and protectiveness. Placing his account within the general complex of agonistic signals in mammals and birds (Morton, 1977), Ohala (1983) noted that the association of large larynxes with large creatures is pre-determined in mammalian species through sexual dimorphism. Puberty in human males is marked by a disproportionate growth of the larynx, increasing vocal fold mass, as well as a lowering of the larynx, increasing the length of the vocal tract, in addition to the onset of peripheral facial hair. These adjustments serve to create the impression of a large creature, which will discourage potential aggressors.

2.2. The Effort Code

The Effort Code is based on the effect that more careful pronunciation has on the pitch range and the precision in the execution of pitch movements. Wider pitch range and more precise realizations of pitch events are associated with greater significance. In addition to greater insistence, wider pitch range can signal cooperativeness, as in infant-directed speech. For some languages, it has been shown that pitch falls tend to be steeper and timed earlier when used to express focus (Smiljanic & Hualde, 2000). A less expected use of the Effort Code is pitch range compression toward mid pitch to express negation, intended to effect the retraction as opposed to the addition of information (Gussenhoven, 2004, p. 88). The prediction was tested by Reckling and Kügler (2011) for German. Speakers compressed their pitch to the mid range in negative utterances and listeners associated that adjustment with negation.

2.3.
The Respiratory Code

The Respiratory Code, earlier the “Production Code” (Gussenhoven, 2002), relates the declining f0 caused by diminishing subglottal air pressure to a phrasal profile beginning with high pitch and ending with low pitch. The physiological effect applies to the exhalation phase, and thus to speech produced during a breath group. However, even though breath intakes do not reliably coincide with phrase boundaries, the association is with the beginnings and ends of phonological units, like the intonational phrase and the utterance. In fact, declination is itself an exploitation of the Respiratory Code, being largely under the control of the speaker.¹ At the beginning of these phonological constituents, a high pitch signals new topics and low pitch the continuation of a topic. At the end, high pitch signals turn maintenance, while low pitch signals closure and thus potentially a turn shift. In English, for instance, higher and later peaks have been reported in topic-initiating sentences (Wichmann, House, & Rietveld, 1997).

2.4. The Sirenic Code

Breathy or husky voice is associated with feminine sexiness, anecdotal evidence for which is found in the British English term “bedroom voice” as well as in informal reports that breathy voice is associated with female prostitution in South Korea. Breathiness is more common in female speech, as concluded by van Bezooijen (1984, p. 25) on the basis of studies like Henton and Bladon (1988), Trittin and de Santos y Lleó (1995), and Mendoza, Valencia, Muñoz, and Trujillo (1996). From the perspective of regular phonation, breathy voice is typically created by slack vocal folds and weak adduction, so that no firm closure stage is achieved.
The escaping air that causes the glottal friction cannot fully contribute to the force that pushes the closed vocal folds up and apart, and because of the larger width of the glottis, the vocal folds are less affected by the lowered air pressure between them when air flow is increased (thus diminishing the Bernoulli effect).

Both low arousal and femininity have been associated with breathiness in perception research. First, breathy voice is associated with meanings like relaxed, intimate, friendly, and timid, as found on the basis of synthetic stimuli by Gobl and Ní Chasaide (2003). Second, voice quality, and breathy voice in particular, have different perceptual effects in male and female speech. Addington (1968) found that female breathy voice was perceived as feminine, small, slim, good-looking, immature, and humorous, while the only unique attribute for male breathy voice was artisticity. Xu, Lee, Wu, Liu, and Birkholz (2013) showed that male listeners preferred a female voice that signals a small body size, a voice with high pitch, wide formant dispersion, and breathy voice. Female listeners preferred a male voice with the opposite qualities for the first three factors; breathy voice was positively evaluated, which Xu et al. (2013) explain as a softening effect on the aggressiveness associated with a large body size, in agreement with the low arousal attribute referred to above.

An important question for our topic is why a breathy or husky female voice should have these attributes. Laver (1991) points out that voice quality characteristics may derive from hormonal conditions, “where, for example, these result in changes in the copiousness and consistency of the supply of lubricating mucus to the larynx, and in the characteristics of the mucous membrane covering the actual vocal cords.
Such changes occur in the pregnant and pre-menstrual states in women (Perello, 1962).” These conditions, Laver goes on, may cause “slight harshness and whispery or breathy voice.” Possibly, such states may also occur during sexual arousal as a result of increased hydration of the sexual organs at the expense of other organs.² Evidence that a woman’s voice is more attractive during the pre-menstrual period is provided by Pipitone and Gallup (2008).

3. “Biological codes” and language structure

Parallel to segmental structure, intonational structure consists of a set of tonal morphemes, a phonological grammar describing the way they are combined and integrated into the sentence, plus language-specific phonetic realization rules (Pierrehumbert, 1980). In many languages, single tones (i.e., H for high or L for low) or complexes of tones (HL, etc.) are morphemes located at the edges of prosodic constituents (“boundary tones”) or in locations inside these constituents (“pitch accents”). In this section, examples are given of intonational morphemes whose tonal composition reflects one of the four codes outlined above.

The non-arbitrariness of intonational morphemes is akin to segmental sound symbolism (Dingemanse, 2011; Ohala, 1996). For instance, the frequent occurrence of palatal consonants and front high vowels for diminutive morphemes is explained by the interpretation of a “small” meaning suggested by the forward position of the tongue body and the spread position of the lips, gestures that mimic a short forward section of the vocal tract (Ohala, 1996), acoustically inversely related to the second vowel formant. These discrete interpretations of gradient paralinguistic forms need not be present in all languages. Interestingly, while paralinguistic meanings of pitch shapes may relate to states of the speaker (affective meanings), this is rarely the case for intonational morphemes, which by and large have informational meanings.
(See Supplementary Information S1.)

3.1. The Frequency Code in grammars

In (1a), the British English Would you like some coffee? combines with the pitch accent H*L on coffee, the initial boundary tone %L, and the final boundary tone H%. (The * after a tone marks it as synchronizing with the accented syllable, while % indicates the boundary of the intonation phrase.) The phonetic realization is given above the sentences, where the bullets indicate the timing and pitch height of the pronunciation of each tone, with connecting lines forming the pitch contour. Structure (1a) is discretely permutable with other tone options. Instead of %L we could have %H, instead of H*L we could have L*H, and so on. Replacing H*L with H* will lead to the intonation contour in (1b) (where the mid pitch on cof- is directly followed by high pitch on -fee, without first dipping down), but there is no third contour with a semi-dip and an intermediate meaning. Structure (1a) is commonly used for questions in England, while (1b) is the usual form in the United States and Canada.

These rising contours are representative of interrogative intonations found world-wide. Although there are many languages that have falling question intonation (Bolinger, 1978), the number of languages with rising question intonations is well above chance. Grammaticalization of secondary uses of the Frequency Code (see Supplementary Information S1) is found in languages with pitch accents defining earlier peaks for statements and later peaks for questions, like Neapolitan Italian (D’Imperio & House, 1997). Increased final syllable length is found in West Greenlandic, where questions are formed by adding a mora (a phonological timing unit) to the end of the utterance (Rischel, 1974). As a result, the intonation contour gets shifted rightward, creating different shapes on the final two syllables.

3.2.
The Effort Code in grammars

The prosodic expression of focus is typically achieved with the help of intonational structures that enhance the prominence of the focus constituent. West Germanic languages have no pitch accents on words that come after the focus of the sentence, as in coffee in Would you like still MORE coffee? Other languages may contrast H-toned pitch accents for focused words with L-toned ones after the focus (Frota, 2000).

Withdrawal of information, that is, negation, through pitch range reduction can also be expressed discretely. In Engenni, a tone language, high tones are lowered and low tones are raised in negative sentences from the verb onwards. This feature may be the only structural way negation is expressed in the language (Thomas, 1978, p. 67).

3.3. The Respiratory Code in grammars

Many languages use a high final boundary tone (H%) as a marker of non-finality, often referred to as “comma-intonation.” An iterative lowering of H-tones in an utterance, a feature known as Downstep, is a second type of grammaticalization of the Respiratory Code. Its meaning (“there is no room for further discussion”; Gussenhoven, 2004, p. 107) is a version of the “finality” meaning of final low pitch. It explains the preference for the downstepped pronunciation of titles of stories in English, a point where the listener is expected not to interrupt the reader. Observe that this account interprets the fact that H% may mean both “non-finality” and “interrogativity” in the same language as accidental, since they derive from different codes. Central Swedish keeps them separate: L% is used in questions and statements, while H% predominantly occurs in non-final intonational phrases and more rarely in questions (Riad, 2014, p. 266).

3.4.
The Sirenic Code in grammars

If breathy voice signals femininity, as suggested in Section 2.4, it may be usable to signal interrogativity, the informational meaning of the Frequency Code based on “smallness” associated with the female larynx. It might moreover be expected to be a feature in languages with reduced availability of pitch cues, as in tone languages. “Lax voice” was reported to occur as a question cue in the last syllable of phrases in 36 of 78 African languages listed by Rialland (2007), many of which are tone languages. Rialland describes this type of question marker as a complex of lengthening, vowel lowering, and low pitch. Increased final lengthening is identified as a secondary feature of the Frequency Code in Supplementary Information S1, and low pitch is a natural accompaniment of breathy voice. The function of vowel lowering, which may increase the amplitude of the first harmonic, is not clear.

Breathy or whispered termination, a phrase-final [+spread glottis] feature, can be independent of the lexical tone value (low, high, falling) on the last syllable. In (2a) (from Ikaan, a Benue-Congo language spoken in Nigeria), a final whispered vowel co-occurs with H and in (2b) with L (Salffner, 2010). In addition, questions have expanded pitch range, here speculatively indicated by the initial %H boundary tone. (An acute accent indicates H, a grave accent L, while acute-grave and grave-acute indicate falling and rising tone, respectively.)

The Sirenic Code may dispense with the breathy voice and have low tone as a secondary cue. For instance, in Basaa (Benue-Congo, Cameroon), polar questions end in low-toned [e], which assimilates to a preceding adjacent vowel, causing the vowel in final open syllables to lengthen by one low-toned mora (Makasso, 2008, p. 54), as illustrated in (4a,b,c).
WH-words have a final floating H-tone that attracts an extra mora in phrase-final, though not in phrase-medial position (Hamlaoui & Makasso, 2011). These examples again show that secondary grammaticalizations (final syllable lengthening) may exist in the absence of primary grammaticalizations (H% to mark questions).

4. Conclusion

The phonological forms of a considerable proportion of the intonation contours in languages derive from paralinguistic form-meaning relations that have arisen through metaphorical interpretations of effects that anatomical and physiological conditions have on vocal fold vibration. One question here concerns the extent to which these observations provide a perspective on the possible continuity between phylogenetically older systems of communication and language. Ohala (1983), in fact, provides ample parallels between the paralinguistic meanings of the Frequency Code as used in human communication and agonistic signaling in animals. This connection exists independently of the degree to which animal signals encode referential meanings as opposed to involuntarily reflecting biological states, which provide information to other animals (cf. Rendall, Owren, & Ryan, 2009). A second question concerns the extent to which paralinguistic meanings are reflected in early development of human communication (cf. Oller, 2000). This issue is considered in Supplementary Information S2.

Another position defended here is that final lax voice to mark interrogativity, as attested in many African languages (Rialland, 2007), indirectly supports the conception behind the Frequency Code by Ohala (1983), which holds that vocal features that express femininity may be used to express interrogativity. The Sirenic Code was introduced here to account for the interrogativity meaning of breathy voice.
Varying hydration of the larynx as a function of menstrual and sexual activity and the resulting attractiveness of a breathy or husky voice were proposed as the basis for the metaphor in this case. Ohala (1983, 1984) derives the interrogativity meaning of the Frequency Code from the submissiveness and dependence signalled by high pitch. It could also be a referential transfer of the “uncertainty” meaning as relating to the speaker to one relating to the message (Gussenhoven, 2004). In either interpretation, the interrogativity meaning of whispery voice may have been mediated by the shared femininity attribute of high pitch and whispery phonation.

Acknowledgments

I am grateful to John Laver and Kim Oller for discussing the issues with me and amply providing me with references, and to Annie Rialland and two anonymous reviewers for commenting on a previous version.

Notes

1. The contribution of falling subglottal pressure is hard to establish (Ohala, 1990; Strik & Boves, 1995). Subglottal pressure variation is largely due to impedance effects of the articulation of speech sounds. An indication of the role of subglottal pressure is given by the finding by van Katwijk (1974) that the onset of speech coincides with a sudden increase in subglottal air pressure, while “at the endings of utterances a gradual shapeless diminishing of pressure occurred.”

2. Laver (1991) mentions sexual arousal along with the other reproductive functions, but he observes in a personal communication (2013) that he has found no confirmation in the medical literature. On the connection between hormonal states and singing performance, see Lã and Davidson (2005), who also give a survey of observations and research on the connections between the larynx and hormonal states.

References

Addington, D. (1968). The relationship of selected vocal characteristics to personality perception. Speech Monographs, 35, 492–503.
van Bezooijen, R. (1984).
Characteristics and recognizability of vocal expressions of emotion. Dordrecht, the Netherlands: Foris.
Bolinger, D. (1978). Intonation across languages. In J. Greenberg (Ed.), Universals of human language, Vol. 2 (Phonology) (pp. 471–524). Stanford, CA: Stanford University Press.
D’Imperio, M., & House, D. (1997). Perception of questions and statements in Neapolitan Italian. Proceedings of EUROSPEECH 1997, 1, 251–254.
Dingemanse, M. (2011). The meaning and use of ideophones in Siwu. Nijmegen, the Netherlands: Max Planck Institute for Psycholinguistics.
Frota, S. (2000). Prosody and focus in European Portuguese. New York: Garland.
Gobl, C., & Ní Chasaide, A. (2003). The role of voice quality in communicating emotion, mood and attitude. Speech Communication, 40, 189–212.
Gussenhoven, C. (2002). Intonation and interpretation: Phonetics and phonology. In Speech Prosody 2002 (pp. 47–57). Aix-en-Provence: ProSig and Université de Provence, Laboratoire de Parole et Langage.
Gussenhoven, C. (2004). The phonology of tone and intonation. Cambridge, UK: Cambridge University Press.
Hamlaoui, F., & Makasso, E. M. (2011). Basaa Wh-questions and prosodic structuring. ZAS Papers in Linguistics, 55, 47–63.
Henton, C., & Bladon, A. (1988). Creak as a socio-phonetic marker. In L. M. Hyman & C. N. Li (Eds.), Language, speech and mind: Studies in honor of Victoria A. Fromkin (pp. 3–29). London: Croom Helm.
van Katwijk, A. F. V. (1974). Accentuation in Dutch. Assen: van Gorcum.
Lã, F., & Davidson, J. W. (2005). Investigating the relationship between sexual hormones and female Western classical singing. Research Studies in Music Education, 24, 75–87.
Ladd, D. R. (2008 [1996]). Intonational phonology (2nd ed.). Cambridge, UK: Cambridge University Press.
Laver, J. (1991). Voice quality and indexical information. In J. Laver (Ed.), The gift of speech: Papers in the analysis of speech and voice (pp. 147–161).
Edinburgh: Edinburgh University Press.
Makasso, E.-M. (2008). Intonation et mélismes dans le discours oral spontané en Bàsàa. Ph.D. thesis, Aix-Marseille Université.
Mendoza, E., Valencia, N., Muñoz, J., & Trujillo, H. (1996). Differences in voice quality between men and women: Use of the Long-Term Average Spectrum (LTAS). Journal of Voice, 10, 59–66.
Morton, E. W. (1977). On the occurrence and significance of motivation-structural rules in some bird and mammal sounds. The American Naturalist, 111, 855–869.
Nolan, F. (2006). Intonation. In B. Aarts & A. McMahon (Eds.), Handbook of English linguistics (pp. 433–456). Oxford, UK: Blackwell.
Ohala, J. J. (1983). Cross-language use of pitch: An ethological view. Phonetica, 40, 1–18.
Ohala, J. J. (1984). An ethological perspective on common cross-language utilization of f0 in voice. Phonetica, 41, 1–16.
Ohala, J. J. (1990). Respiratory activity in speech. In W. J. Hardcastle & A. Marchal (Eds.), Speech production and speech modeling (pp. 23–53). Dordrecht, the Netherlands: Kluwer.
Ohala, J. J. (1996). The frequency code underlies the sound symbolic use of voice pitch. In L. Hinton, J. Nichols, & J. J. Ohala (Eds.), Sound symbolism (pp. 325–347). Cambridge, UK: Cambridge University Press.
Oller, D. K. (2000). The emergence of the speech capacity. Mahwah, NJ: Lawrence Erlbaum.
Perelló, J. (1962). La disfonia premenstruel. Acta Oto-rino-laringologica Ibero-Americana, 23, 561–563.
Pierrehumbert, J. B. (1980). The phonetics and phonology of English intonation. Ph.D. thesis, MIT. Distributed by Indiana University Linguistics Club.
Pipitone, R. N., & Gallup, G. G., Jr. (2008). Women’s voice attractiveness varies across the menstrual cycle. Evolution and Human Behavior, 29, 268–274.
Reckling, F., & Kügler, F. (2011). Pitch range in negative and positive connoted sentences in German.
Rendall, D., Owren, M. J., & Ryan, M. J. (2009). What do animal signals mean? Animal Behaviour, 78, 233–240.
Riad, T. (2014).
The phonology of Swedish. Oxford, UK: Oxford University Press.
Rialland, A. (2007). Question prosody: An African perspective. In T. Riad & C. Gussenhoven (Eds.), Tones and tunes. Volume I: Typological and comparative studies on tone and intonation (pp. 35–56). Berlin: Mouton de Gruyter.
Rischel, J. (1974). Topics in West Greenlandic phonology. Copenhagen: Akademisk Forlag.
Salffner, S. (2010). Intonation and phonation type as markers in Ikaan yes/no questions. Paper presented at the Fourth Conference on Tone and Intonation (TIE4), Stockholm.
Segerstråle, U., & Molnar, P. (1997). Non-verbal communication: Crossing the boundary between culture and nature. In U. Segerstråle & P. Molnar (Eds.), Non-verbal communication: Where nature meets culture (pp. 1–26). Mahwah, NJ: Lawrence Erlbaum.
Smiljanic, R., & Hualde, J. I. (2000). Lexical and pragmatic functions of tonal alignments in two Serbo-Croatian dialects. In A. Okrent & J. Boyle (Eds.), Proceedings from the Main Session of the 36th Regional Meeting of the Chicago Linguistic Society, Vol. 36–1 (pp. 469–482). Chicago: CLS.
Strik, H., & Boves, L. (1995). Downtrend in f0 and psb. Journal of Phonetics, 23, 203–220.
Thomas, E. (1978). A grammatical description of the Engenni language. Dallas, TX: Summer Institute of Linguistics.
Trittin, P. J. T., & de Santos y Lleó, A. (1995). Voice quality analysis of male and female Spanish speakers. Speech Communication, 16(4), 359–368.
Wichmann, A., House, J., & Rietveld, T. (1997). Peak displacement and topic structure. In A. Botinis, G. Kouroupetroglou, & G. Carayannis (Eds.), Intonation: Theory, models and applications. Proceedings of an ESCA workshop (pp. 329–332). Athens, Greece: ESCA and University of Athens, Department of Informatics.
Xu, Y., Lee, A., Wu, W.-L., Liu, X., & Birkholz, P. (2013). Human vocal attractiveness as signaled by body size projection. PLoS ONE, 8, 1–9.
doi:10.1371/journal.pone.0062397

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Supporting Information S1. Biological codes.
Supporting Information S2. On intonational meaning.
Supporting Information S3. Intonation in phylogeny and ontogeny.
Supporting Information S4. Sound files for examples (2) and (3), courtesy of Sophie Salffner, and (1).