Hearing, Balance and Communication, 2013; Early Online: 1–9

ORIGINAL ARTICLE

Language development in infants: What do humans hear in the first months of life?

ALAN LANGUS & MARINA NESPOR

International School for Advanced Studies (SISSA), Language, Cognition and Development Laboratory, Trieste, Italy

Abstract
In this article we discuss experimental work on language acquisition that we have carried out in recent years. We first describe the most common methods used in infant research; we then concentrate on linguistic rhythm and on the aspects of language that might be learned in the first year of life on the basis of signals contained in the speech stream. These include the basic level of rhythm, carried by the two most fundamental phonological categories, consonants and vowels, as well as rhythmic alternation at the phrase level and what it signals about syntax. Because linguistic rhythm is one of the first aspects of language that infants perceive and represent, the research discussed may help to diagnose hearing problems and lead to new ways of training individuals with speech impairments.

Key words: language acquisition, infant research, rhythm, rhythmic classes, Iambic-Trochaic Law

Introduction

Rhythm pervades the universe: natural as well as cultural phenomena are governed by rhythm. The waves of the sea move rhythmically. Day and night alternate rhythmically. Our hearts beat rhythmically, as do those of other animals. We breathe rhythmically and we walk rhythmically. Music in all cultures is governed by rhythm, and so is dance. But what is rhythm? To our knowledge, the most essential – and thus most general – definition of rhythm is Plato's: "Rhythm is order in movement" (The Laws, Book II, 93). While the elements that establish order may vary across natural and cultural phenomena, order is always established by alternation, that is, by the regular recurrence of a pattern.

What, then, is rhythm in language? Which elements establish order in the movement of speech? Linguistic rhythm, like rhythm in music, is hierarchical in nature: different elements establish rhythmic alternation – or order – at different levels of the rhythmic hierarchy (1,2). At the most basic level, linguistic rhythm is signalled by the proportion of the speech stream occupied by vowels (%V) and by the standard deviation of the duration of consonantal intervals (ΔC) (3). Higher in the prosodic hierarchy, at the phonological phrase level, rhythm is signalled by the systematic recurrence of prosodic cues such as pitch, intensity or duration on the prominent elements of phonological phrases. Because infants are sensitive to prosody, and because rhythmic alternation at different levels of the hierarchy signals different linguistic properties, rhythm offers an ideal opportunity for investigating the early stages of language acquisition. The study of rhythm perception in early infancy can therefore tell us not only whether infants can discriminate languages on the basis of rhythm, but also when they become sensitive to the specific cues that rhythm carries. This is particularly relevant, since rhythmic alternation has been shown to offer cues to the size of the syllabic repertoire (3), to the mean length of words (4) and to basic syntactic properties of language (5,6).
Since the sound pattern of a language, and in particular its prosody, gives cues to different grammatical properties, it is plausible that infants take their first steps into the acquisition of grammar at the pre-lexical stage, i.e. before knowing the meaning of words.

Correspondence: M. Nespor, ERC – PASCAL, SISSA, via Bonomea 265, 34136 Trieste, Italy. E-mail: [email protected] (Accepted 17 June 2013)
ISSN 2169-5717 print/ISSN 2169-5725 online © 2013 Informa Healthcare. DOI: 10.3109/21695717.2013.817133

To give the reader a flavour of infant research on language acquisition, we begin with a brief description of the various techniques used to test newborns and young infants. We then concentrate on what infants know about language when they are born and how they progressively tune in to the language of their environment. This will serve as a basis for discussing which aspects of linguistic rhythm infants know, how infants process linguistic rhythm at different levels of the rhythmic hierarchy, and what they learn about their language of exposure through rhythm during the first year of life.

How do we know what infants know?

Unlike adult participants, preverbal infants cannot be asked what they know or be told to cooperate in specific tasks. Neither can they be extensively trained, like laboratory animals, to perform tasks that they do not intuitively understand. It is important to bear in mind that infants' voluntary behaviour is highly limited: infants begin to intentionally grasp things around 5–6 months of age, utter their first words around their first birthday and begin to produce multi-word utterances around two years of age. As a consequence, to investigate what infants know before they can overtly tell us, researchers have to rely on ingenious ways of interpreting and using infants' naturally occurring responses to environmental stimuli. In the following, we briefly describe the arsenal of research methods, ranging from behavioural methods that rely on infants' behavioural responses, such as sucking and looking, to neural responses in the brain measured with imaging techniques such as EEG and fNIRS.

Behavioural methods

Sucking is one of the first behavioural instincts of infants. While nutritive sucking is essential for infants to acquire food, non-nutritive sucking is considered a natural reflex through which infants satisfy their need for contact and feel secure and relaxed. As infants tend to suck more when presented with novel environmental stimuli than with familiar ones (7), the reflex has been used extensively to determine infants' sensitivity to speech (8). In a typical non-nutritive sucking experiment, a pacifier is held in the infant's mouth. The pacifier is connected to two loudspeakers that react to the infant's sucking, so that sucking with enough energy causes the loudspeakers to emit a linguistic sound, for example 'ta'. In the first phase of the experiment, called habituation, each high-amplitude suck triggers the same sound. After a while the infants suck less, and when the sucking rate reaches the habituation criterion (usually 50%), half of the neonates – the experimental group – hear a sound that differs minimally from the habituation sound, for example 'da'. The other half – the control group – keep listening to the habituation sound.
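Operationally, a habituation criterion of this kind can be detected from per-minute counts of high-amplitude sucks. The following is a minimal sketch, not the procedure of any specific study: the function name, the two-minute baseline and the 50%-of-baseline rule are illustrative assumptions.

```python
# Minimal sketch of habituation detection in a non-nutritive sucking
# experiment. The baseline window and the 50% rule are illustrative
# assumptions: we declare habituation once the rate in the current
# minute falls to half the baseline rate (mean of the first minutes).

def habituated(sucks_per_minute, criterion=0.5, baseline_minutes=2):
    """Return the 1-based minute at which the habituation criterion
    is met, or None if it never is."""
    if len(sucks_per_minute) <= baseline_minutes:
        return None
    baseline = sum(sucks_per_minute[:baseline_minutes]) / baseline_minutes
    for minute, rate in enumerate(sucks_per_minute[baseline_minutes:],
                                  start=baseline_minutes + 1):
        if rate <= criterion * baseline:
            return minute
    return None

# Example: sucking declines over the session; the criterion is met in
# minute 6 (19 sucks <= 50% of the 40-sucks/minute baseline).
print(habituated([42, 38, 35, 30, 24, 19, 15]))  # -> 6
```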
If the new stimulus triggers high-amplitude sucking while the old one makes infants suck progressively less, it can be concluded that infants can distinguish, already at birth, two syllables that differ only in their consonantal onset.

The second group of behavioural methods used in infant research relies on infants' looking behaviour. While it may seem counterintuitive to use looking behaviour to determine what infants know about the sound of their mother tongue, these methods are among the most common for investigating infants' knowledge of spoken language. Looking-time experiments rely on the observation that infants tend to be more attentive to environmental stimuli that they have not seen or heard before than to familiar ones. As a consequence, they look longer at novel than at familiar stimuli (9). In a typical looking-time experiment, an infant sees a visual stimulus on a screen, for example a checkerboard, and simultaneously hears an auditory stimulus until the infant's looking decreases to a criterion (usually 50%). When the infant has been habituated, the auditory stimulus changes while the visual stimulus remains the same. If the infant discriminates the new sound from the familiar one, it is expected to regain its interest in the visual display. The difference between the looking times to novel and familiar auditory stimuli is taken as a measure of discrimination.

While simple looking-time measures have been used successfully even with newborns, the methodology requires the experimenter to code the looking behaviour manually offline. To reduce this manual work, to avoid subjective biases of the coders and to obtain a more precise measure of infants' looking behaviour, researchers today use eye-trackers. Eye-trackers emit infrared light that is reflected off the cornea of the eyeball. This corneal reflection is recorded by a sensor that then calculates the exact point of gaze of each eye up to 120 times per second. Because modern eye-trackers no longer need to be fixed to the participant's head, eye-tracking allows researchers to adapt classic looking-time paradigms and behavioural methods to measure infants' reactions to environmental stimuli with high precision in real time. However, because the human eye goes through considerable changes during the first weeks of life, eye movements can only be measured reliably after about three months of age.

Neuroimaging methods

Because human infants lack fine control and coordination of their motor behaviour, researchers have increasingly turned to imaging techniques that measure neural responses to environmental stimuli. As neuroimaging techniques do not require an overt behavioural response, they are ideally suited for studying cognitive and linguistic processing in very young infants. With the advancement of brain imaging techniques, it has become increasingly possible to investigate the biological foundations of language in very young infant populations, and even in newborns. In recent years researchers have therefore also started to study the specific brain areas dedicated to language perception in newborns and young infants. This line of research mainly uses two imaging techniques.
Electroencephalography (EEG) is the recording of electrical activity along the scalp, obtained by measuring the voltage fluctuations that result from ionic current flows within the neurons of the brain; it can detect changes on a millisecond scale. EEG and event-related potentials (ERPs) provide non-invasive techniques that allow researchers to examine the relation between brain and behaviour in infancy. Because EEG and ERPs can be used across the entire lifespan, they are two of the very few methodological tools that can directly measure and compare cognitive and linguistic processing at different ages. Even though EEG and ERPs both reflect the electrical activity of the brain, and both are collected in a similar manner, they represent slightly different aspects of brain function: EEG measures the brain's ongoing electrical activity, while ERPs show changes in electrical activity in response to a discrete stimulus or event. EEG provides information about the resting state of the brain, synchrony between regions, or spectral changes in response to a cognitive event. In contrast, deflections in the ERP reflect specific aspects of the sensory and cognitive processes associated with various stimuli. Due to their high temporal resolution, ERPs are well suited for studying the mental processing of environmental stimuli as it unfolds.

Infant research is, however, increasingly also interested in understanding brain organization during the first weeks of life. Functional near-infrared spectroscopy (fNIRS) is a non-invasive imaging method ideally suited to testing auditory competence in newborn infants. fNIRS measures infants' brain activity through the haemodynamic responses associated with neural activity. It relies on the fact that when monochromatic light travels through a medium, some of it is absorbed in the medium, some of it is scattered and some of it is transmitted. For example, familiar auditory stimuli cause a decrease in oxyhaemoglobin, and novel auditory stimuli an increase (11). The properties of the vascular response measured with fNIRS are therefore comparable to those described for the BOLD (blood oxygen level dependent) effect in fMRI (10). This makes the methodology highly relevant for studying brain organization and brain responses in very young infants, who cannot be tested with fMRI. fNIRS has successfully been used to show the cortical organization of the newborn brain (10), memory for words (11) and the cortical processing of linguistic structures (12). For a recent overview of the methodology, see Gervain et al. (13).
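Although the optics are described here only qualitatively, fNIRS analyses typically convert the measured light attenuation into haemoglobin concentration changes via the modified Beer-Lambert law. The sketch below illustrates that standard conversion under stated assumptions; the extinction coefficients, source-detector distance and differential pathlength factors are placeholder values, not calibrated constants.

```python
# A minimal sketch of the modified Beer-Lambert conversion used in
# fNIRS analysis. All numeric values are illustrative placeholders.
import numpy as np

def hb_changes(d_od, ext_coeffs, distance_cm, dpf):
    """Convert optical density changes into haemoglobin concentration
    changes.

    d_od:       (T, 2) optical density changes at two wavelengths
    ext_coeffs: (2, 2) extinction coefficients; rows = wavelengths,
                columns = (HbO2, HbR)
    distance_cm: source-detector separation
    dpf:        length-2 differential pathlength factor per wavelength
    """
    eps = np.asarray(ext_coeffs, dtype=float)
    # Divide each wavelength's signal by its effective path length.
    scaled = np.asarray(d_od, dtype=float) / (distance_cm * np.asarray(dpf))
    # Invert eps @ [dHbO2, dHbR] = scaled for every time point at once;
    # result is (T, 2): columns are dHbO2 and dHbR.
    return np.linalg.solve(eps, scaled.T).T

# Placeholder numbers: three time points measured at 690 nm and 830 nm.
d_od = np.array([[0.010, 0.012], [0.015, 0.020], [0.008, 0.011]])
ext = np.array([[0.35, 2.10],   # 690 nm: (HbO2, HbR) -- illustrative
                [1.00, 0.78]])  # 830 nm
print(hb_changes(d_od, ext, distance_cm=3.0, dpf=[6.0, 5.0]))
```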
Tuning into the language of the environment

Humans start acquiring language from birth (10,12), if not before (14,15). The human auditory system becomes functional around the 25th week of gestation, which enables the foetus to begin processing linguistic information before birth. The auditory information reaching the womb is, however, filtered: frequencies above 1000 Hz are attenuated (16). This suggests that foetuses have, at best, limited access to the acoustic information necessary for discriminating segmental information, especially consonants. Foetal hearing experience may thus affect some aspects of speech perception more than others. The sound that does reach foetuses nevertheless provides sufficient information for the perception of the suprasegmental aspects of speech. Importantly, foetuses also encode speech information into memory – especially suprasegmental information.

Both newborns and foetuses demonstrate the ability to discriminate their native language from a foreign language (17–20) and their mother's voice from another woman's voice (21,22). Because most fine-grained segmental information is filtered out before it reaches the foetus, foetuses can encode only some suprasegmental characteristics of speech. It is therefore interesting that human infants are prepared for language perception the day they are born. For example, Peña et al. (10) played newborn infants speech streams, the same speech streams played backward, and silence, while measuring the infants' brain activity with fNIRS. Backward speech is the best possible control for speech: playing speech backward preserves the physical characteristics of the signal, but results in an acoustic realization that the human vocal tract cannot produce. Importantly, newborn infants' brain responses showed increased cerebral activity in the left hemisphere, as in adults, when listening to speech compared to backward speech or silence. These results suggest that the ability to perceive speech does not require experience with spoken language after birth, and that the cortical areas responsible for the perception of spoken language are functional already at birth.

Newborn infants also show remarkable abilities in speech perception that they cannot have developed in utero. During the very first days of life, newborns distinguish any pair of phonemes attested in any of the languages of the world (8,23,24). This ability to discriminate phonemes matures around 28 weeks of gestation, as shown in pre-term infants (15). Furthermore, it has been shown that newborns can distinguish the location of primary word stress: they hear the difference between pápa and papá, or méta and metá (25). The ability to distinguish both phonemes and the phonological realization of lexical stress shows that at birth human infants are capable of distinguishing all the sounds used in natural languages. At birth, human infants are thus ready to learn any language to which they may be exposed.

The question is how infants tune in to the sounds of their language(s) of exposure. Adult humans are not particularly good at distinguishing sounds they are not familiar with: when perceiving sounds from foreign languages they tend to assimilate them to sounds that exist in their own language. For example, native speakers of French – a language with systematic word-final stress – lose the capability of distinguishing stress in other locations. In an experiment in which monolingual native speakers of French were asked to say whether a word they heard was wásuma or wasúma, they were unable to discriminate the two stress patterns (26,27). Similarly, monolingual speakers of Japanese – a language with very restricted possibilities for consonant clusters – performed at chance when asked whether they had heard ebzo or ebuzo (28). Infants slowly restrict their discrimination abilities to the speech sounds that are systematically present in their environment. Foreign vowels start being ignored towards the sixth month of life (29,30) and foreign consonants towards the ninth month (31). By the end of the first year of life, infants have lost the ability to discriminate foreign phonemes that are not present in their mother tongue.
This focusing on the sounds present in the environment has aptly been called 'learning by forgetting' (32): infants learn to ignore sounds that are not used to distinguish meaning in the language of their surroundings, or that are not used systematically. Interestingly, they ignore not only the sounds of foreign languages but also the individual variability of speakers. For example, if certain individuals speak Italian with a uvular r [ʁ] rather than an alveolar r [r], an infant successfully learns to ignore this difference.

While it is difficult to sketch the exact time-frame in which infants narrow down to specific aspects of spoken language, there is some evidence suggesting that the native language influences vocal production already at birth. For example, a recent study monitored the early sound production of French and German newborns and found that the melodic contours of the cries of the two groups differed significantly (33): the cries of the French newborns showed a rising melody, while those of the German group had a falling melody. These differences between newborns in the two linguistic environments suggest that the influence of the native language begins in utero. The tuning into the native language may therefore be at least partially present in newborns already, and it gradually shapes the perception of spoken language throughout the first year of life.

The basic rhythmic level signals syllabic structure

Having established that human infants are born with the ability to tell apart the different segments of the world's languages, we can now begin our journey into the perception of rhythm. The basic rhythmic level accounts for the different rhythmic classes attested in the languages of the world. The existence of rhythmic classes was first proposed by Lloyd James (34), who observed that English sounds very different from Spanish: the first resembles the sound of messages in Morse code, the second the sound of a machine gun. There have been several proposals as to which phonetic element is responsible for these different rhythms. It was first proposed that the elements establishing isochrony are interstress intervals in English and syllables in Spanish (35), and that all languages belong to one of these two rhythmic classes: they are either stress-timed or syllable-timed (36). A third class was later added to account for the rhythm of Japanese (37), where the phonological constituent proposed to establish isochrony is the mora. However, subsequent work by different phoneticians revealed that isochrony between these constituents is not actually present in the speech signal (38,39).

Which property of the signal is then responsible for the machine-gun vs. Morse-code effect, i.e. for the undeniable rhythmic difference between English, Dutch or Russian on the one hand, and Spanish, Greek or Italian on the other? These languages differ in their syllable structures: stress-timed languages have a greater variety of syllable types than syllable-timed languages and, as a result, they have heavier syllables. Furthermore, in stress-timed languages, unstressed syllables usually have a reduced vocalic system (sometimes reduced to just one vowel, schwa), and unstressed vowels are consistently shorter than stressed vowels, or even absent (40).
These features combine to give the impression that some syllables are far more salient than others in stress-timed languages, while all syllables tend to be equally salient in syllable-timed languages (41). It was therefore proposed that the measures responsible for this basic rhythmic structure are the proportion of the speech stream occupied by vowels (%V) and the standard deviation of the duration of consonantal intervals (ΔC) (3).

This proposal of a basic linguistic rhythm, in which languages fall into different rhythmic classes on the basis of the percentage of time occupied by vowels and of the regularity with which vowels recur – which in turn predicts the richness of the syllabic repertoire – has received considerable support from infant studies. For example, French newborns discriminate two languages such as English and Japanese that belong to different rhythmic classes, but not languages such as English and Dutch that belong to the same rhythmic class (20). It can thus be said that newborns identify the rhythmic class of their language of exposure. Infants fail to discriminate languages belonging to the same rhythmic class throughout the first months of life (42). Only four-month-old bilingual infants exposed to two languages from the same rhythmic class (Spanish and Catalan) can discriminate the two languages they are actually exposed to (43). There is thus strong evidence for the proposal that during the first months of life infants rely primarily on the basic rhythmic level to discriminate between languages; telling apart languages from the same rhythmic class appears to require experience with the specific languages.

What could infants learn through the identification of the rhythmic class of their native language? The ability to discriminate languages from different rhythmic classes shows that infants – in addition to being able to discriminate between different phonemes – can use quantitative information about consonants and vowels to tell apart languages they may encounter in their surroundings. These results furthermore suggest that newborn infants may have an idea of the richness of the syllabic repertoire of their language of exposure. A high %V implies that there are relatively few consonants, and that syllables must be quite simple; a low %V implies many consonants, and thus a complex syllabic structure (3). A simple syllabic structure is in turn predictive of word length, since the number of possible monosyllables that could serve as stand-alone words is restricted. This is the case, for example, in Japanese, where most words are polysyllabic. Languages with complex syllabic structure – such as English or Dutch – are instead rich in monosyllables. The identification of the rhythmic class could thus also provide a bias as to the mean length of common words, and thereby a valuable aid to speech segmentation (4).

The sensitivity to this basic level of linguistic rhythm in the early stages of cognitive development is also important because there is a partial division of labour between vowels and consonants: while the main role of consonants concerns the lexicon, the main role of vowels is to allow the identification of the rhythmic class, as well as the development of a bias as to the mean length of common words (44).
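The two rhythm measures can be made concrete with a short sketch. The following assumes that speech has already been segmented into consecutive vocalic and consonantal intervals (e.g. by annotation), which is how the measures of reference 3 are computed; the function and variable names are our own.

```python
# A minimal sketch of the rhythm metrics %V and deltaC (reference 3),
# computed from pre-segmented vocalic ('V') and consonantal ('C')
# intervals with durations in seconds. The segmentation step itself
# is outside the scope of this sketch.
import statistics

def rhythm_metrics(intervals):
    """intervals: list of (label, duration) pairs, label in {'V', 'C'}.
    Returns (%V, deltaC): the percentage of total duration that is
    vocalic, and the standard deviation of consonantal durations."""
    v_durs = [d for label, d in intervals if label == 'V']
    c_durs = [d for label, d in intervals if label == 'C']
    percent_v = 100 * sum(v_durs) / (sum(v_durs) + sum(c_durs))
    delta_c = statistics.pstdev(c_durs)
    return percent_v, delta_c

# Toy example: a CV-CV-CV stretch (syllable-timed-like) has uniform
# consonantal intervals, hence a low deltaC and a high %V.
simple = [('C', 0.08), ('V', 0.12), ('C', 0.08), ('V', 0.12),
          ('C', 0.08), ('V', 0.12)]
print(rhythm_metrics(simple))  # -> (60.0, 0.0)
```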
Evidence for this division of labour comes from artificial speech experiments, in which adult participants rely on consonants rather than vowels for word identification (45), and tend to use relationships among vowels to extract generalizations, while disregarding relationships among consonants for the same purpose (46). These asymmetries are not by-products of low-level acoustic differences between consonants and vowels (47,48), but result from the categorical representations of consonants and vowels themselves. This bias for using consonants for word learning and vowels for rule learning is found in infants by 12 months of age at the latest (49). The ability to perceive linguistic rhythm – and to discriminate, represent and use phonemes categorically – from the earliest stages of language acquisition may therefore be necessary for acquiring the words and rules of the mother tongue.

Phrasal rhythm signals word order

At a hierarchical level higher than that of the alternation between vowels and consonants, rhythm is determined by the alternation of stresses. It is important to note that the different levels of the prosodic hierarchy are organized so that lower levels are exhaustively contained within higher ones (50). This is best exemplified by the prosodic constituent most relevant to the present paper: the Phonological Phrase. The Phonological Phrase extends from the left edge of a phrase to the right edge of its head in head-complement languages, and from the left edge of a head to the left edge of its phrase in complement-head languages (1). While the number of Phonological Phrases contained in the Intonational Phrase that immediately governs them may vary, Phonological Phrases never straddle Intonational Phrase boundaries: they are exhaustively contained in Intonational Phrases.

The best-known use of Phonological Phrases is in speech segmentation. Because syntactic structure is automatically mapped onto prosodic structure during speech production (1), many prosodic cues signal syntactic and word boundaries. For example, among the most common cues to Phonological Phrase boundaries is final lengthening (51–54). Adult listeners use phonological phrase boundaries to identify where words begin or end (55). Similarly, 10-month-old infants can use phonological phrase boundaries to find words in continuous speech (56). Furthermore, because prosodic boundaries are necessary for infants to detect rule-like regularities in continuous speech (57), phonological phrase boundaries may also help infants extract rules at different levels of the prosodic hierarchy (58).

However, the systematic recurrence of Phonological Phrases is also important because the location, as well as the physical manifestation, of stress in Phonological Phrases gives a cue to the word order of a language. Of the six logically possible orders of Subject (S), Object (O) and Verb (V), the basic word order of the great majority of the world's languages (76%) is either SOV – as in Turkish, Japanese or Basque – or SVO – as in English, Hausa or Greek (59). Nespor et al. (6) proposed that stress in Phonological Phrases differs systematically between SVO and SOV languages. Languages with the SVO order have stress in final position (e.g. on the Noun in a Verb-Noun sequence), manifested mainly through duration, i.e. prominent syllables are longer than less prominent ones.
Languages with the SOV order have stress in initial position (e.g. on the Noun in a Noun-Verb sequence), manifested mainly through pitch and intensity, i.e. prominent syllables are more intense and have a higher pitch than less prominent ones. Given the uniformity of word order across constituents of different types within one language, could the specific word order of the language of exposure be inferred from the manifestation of phrasal rhythm in the pre-lexical period? To begin answering this question, French 6- and 12-week-old infants were exposed to French (SVO) and Turkish (SOV) sentences that differed solely in the location of relative prominence within phonological phrases (5). Turkish and French were chosen because their syllabic structure is similar in complexity and because both have word-final stress. To eliminate the possible effect of segments – e.g. French but not Turkish has a uvular r [ʁ], and Turkish but not French has a high unrounded back vowel [ɯ] – all sentences were resynthesized so as to contain the same segments for each category. The findings show that infants at this young age could discriminate Turkish and French sentences exclusively on the basis of phrasal rhythm, highlighting that phonological-phrase-level rhythm is perceived already during the first months of life.

If infants can discriminate two languages solely on the basis of their phrasal prominence, can they also hear whether, in a sequence of alternating weak and strong beats, the strong beat is initial or final in its group? A perceptual mechanism first proposed to account for grouping in music (60–62) – the Iambic-Trochaic Law (ITL) – was proposed to be responsible for the different acoustic manifestations of phrasal stress depending on whether it is initial or final in its phrase (6). The ITL states that a sequence of sounds alternating in duration is grouped iambically (weak-strong), while a sequence of sounds alternating in intensity – and, for speech at the phonological phrase level, in pitch – is grouped trochaically (strong-weak). Whether a language has the Verb-Object or the Object-Verb order is thus signalled by phrasal rhythm.

Adults and seven-month-old infants were tested for their memory of sequences of syllables alternating in either pitch or duration (63). Exposed to flat syllables in the test phase, adults were better at remembering pairs of syllables that during familiarization had short syllables preceding long syllables, or high-pitched syllables preceding low-pitched syllables. Infants familiarized with syllables alternating in pitch showed a preference for listening to pairs of syllables with high pitch on the first syllable; however, no preference was found when the familiarization stream alternated in duration. One possible explanation for this asymmetry is related to the syntactic difference between SVO and SOV languages. It is widely accepted that SVO is syntactically unmarked and that all languages are SVO at an underlying syntactic level; the surface SOV order would then be the result of movement (64,65). External confirmation for this theory comes from historical change: change in word order is unidirectional, from SOV to SVO (66,67). It is therefore possible that infants are sensitive early on only to the prosodic profile that indicates that the order of their language of exposure is not the unmarked one.
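To make the ITL concrete, the following sketch classifies an alternating sequence as iambically or trochaically grouped on the basis of its acoustic cues. The feature names, units and the particular decision rule are illustrative assumptions of ours; the ITL itself only states the direction of grouping associated with each cue.

```python
# A minimal sketch of the grouping predicted by the Iambic-Trochaic Law
# for a sequence of syllables with alternating prominence. The decision
# rule (comparing normalized cue contrasts) is an illustrative
# assumption, not a model from the article.

def itl_grouping(syllables):
    """syllables: list of dicts with 'duration' (s), 'pitch' (Hz) and
    'intensity' (dB) for an alternating weak/strong sequence.
    Returns 'iambic' (weak-strong) if prominence is mainly durational,
    'trochaic' (strong-weak) if it is mainly pitch/intensity based."""
    evens, odds = syllables[0::2], syllables[1::2]

    def contrast(key):
        # Mean absolute difference between alternating positions,
        # normalized by the overall mean so cues are comparable.
        a = sum(s[key] for s in evens) / len(evens)
        b = sum(s[key] for s in odds) / len(odds)
        return abs(a - b) / ((a + b) / 2)

    duration_cue = contrast('duration')
    melodic_cue = max(contrast('pitch'), contrast('intensity'))
    return 'iambic' if duration_cue > melodic_cue else 'trochaic'

# Short-long alternation with flat pitch/intensity -> iambic grouping.
seq = [{'duration': 0.12, 'pitch': 200, 'intensity': 60},
       {'duration': 0.24, 'pitch': 200, 'intensity': 60}] * 3
print(itl_grouping(seq))  # -> 'iambic'
```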
Taken together, there is strong evidence from language acquisition that linguistic rhythm is one of the first aspects of language that young infants perceive and process. Because rhythm is, at the different levels of the rhythmic hierarchy, carried by the core elements of the speech signal, it is helpful for a variety of tasks that young language learners face: from speech segmentation to learning one of the basic properties of the syntactic structure of the language to be acquired.

Sensitivity to speech sounds in cochlear implanted infants

While studies on language acquisition zoom in on the factors that influence speech perception and processing, it is often difficult to disentangle which abilities require early exposure to linguistic input. As newborns are able to discriminate all the sounds found in the world's languages, listening to speech in early infancy would appear to serve primarily to narrow infants' abilities and tune them to the mother tongue. However, because native language acquisition is constrained by critical periods – the window of opportunity during which native competence in a language can be acquired (68) – it is also likely that this narrowing effect of speech input is sensitive to the specific stage of language acquisition. One valuable source of information about how listening to speech modulates our cognitive abilities comes from individuals who are born deaf and gain hearing through cochlear implants.

Speech intelligibility in cochlear implanted individuals is directly related to the quality of the prosody they hear (69). There is evidence suggesting that cochlear implanted children perform significantly worse than normally hearing controls in the perception of the amplitude, pitch and temporal structure of spoken language (70). Studies of languages such as Chinese and Thai, where the lexical meaning of words is modulated by tones, show that the main problem for cochlear implanted children lies in the perception of pitch (71). This is also evident from studies showing that cochlear implanted children have problems interpreting intonation and its relation to the illocutionary force of a sentence, e.g. distinguishing a statement from a question. As a consequence, they also perform significantly worse than normally hearing individuals in the perception and production of emotional prosody (72).

It remains an open question whether prosodic processing can only emerge if speech input is present during the window of opportunity – the critical period during which language can be acquired, or at least acquired natively. However, because rhythm reflects the grammatical structure of sentences at different levels of the prosodic hierarchy, the inability to process prosody may cause serious difficulties in language acquisition for individuals who receive a cochlear implant after a certain age. On the one hand, the research described above may be useful for diagnosing hearing impairments in infants and young children. On the other hand, it may help to find ways to train the perception and production of prosody in cochlear implanted individuals, both children and adults. One possibility, albeit an unexplored one, may be to complement auditory training with training in the perception and production of rhythm in a non-acoustic modality, e.g. the visual modality.
For example, it has been shown that the Iambic-Trochaic Law also determines the grouping of sequences in vision, where adult participants group sequences of shapes according to visual frequency, intensity and duration, as in the auditory domain (73).

Conclusions

From the very beginning of life, humans are sensitive to all the consonantal and vocalic distinctions present in the different languages of the world. Their sensitivity to prosody – rhythm and intonation – appears to emerge already in the last weeks of gestation. Knowledge of vowels and consonants, which carry the basic level of linguistic rhythm, is necessary to learn the lexicon. There is evidence that consonants are privileged in learning words because of their relative stability. In contrast, vowels, because of their variability, provide more information about the grammatical structure of languages. Specifically, the percentage of time vowels occupy in the speech stream can give a cue to the size of the syllabic repertoire of the language of exposure, and thus to the average length of common words. Because vowels are the main carriers of prosody, they also give a cue to one of the basic syntactic properties of language – word order. Cochlear implanted children have been shown to be especially impaired in prosody in general, and in the perception – and thus the production – of tonal distinctions in particular. This may suggest that the prosodic processing of speech, especially of intonation, is a precarious ability that requires experience with speech from the earliest stages of development.

Declaration of interest: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper. The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement n° 269502 (PASCAL).

References

1. Nespor M, Vogel I. Prosodic Phonology. Dordrecht: Foris; 1986 (2nd edition 2007).
2. Nespor M. On the rhythm parameter in phonology. In: Roca I, editor. Logical Issues in Language Acquisition. Dordrecht: Foris; 1990. p. 1–19.
3. Ramus F, Nespor M, Mehler J. Correlates of linguistic rhythm in the speech signal. Cognition. 1999;73:265–92.
4. Nespor M, Shukla M, Mehler J. Stress-timed vs. syllable-timed languages. In: van Oostendorp M, Ewen CJ, Hume E, Rice K, editors. The Blackwell Companion to Phonology. Chichester: Wiley-Blackwell; 2011. p. 1147–59.
5. Nespor M, Guasti MT, Christophe A. Selecting word order: the Rhythmic Activation Principle. In: Kleinhenz U, editor. Interfaces in Phonology. Berlin: Akademie Verlag; 1996. p. 1–26.
6. Nespor M, Shukla M, van de Vijver R, Avesani C, Schraudolf H, Donati C. Different phrasal prominence realization in VO and OV languages. Lingue e Linguaggio. 2008;7:1–28.
7. Siqueland ER, DeLucia CA. Visual reinforcement of non-nutritive sucking in human infants. Science. 1969;165:1144–6.
8. Eimas PD, Siqueland ER, Jusczyk P, Vigorito J. Speech perception in infants. Science. 1971;171:303–6.
9. Gottlieb G, Krasnegor NA. Measurement of Audition and Vision in the First Year of Postnatal Life: A Methodological Overview. Connecticut: Ablex; 1985.
10. Peña M, Maki A, Kovačić D, Dehaene-Lambertz G, Koizumi H, Bouquet F, Mehler J.
Sounds and silence: an optical topography study of language recognition at birth. Proc Natl Acad Sci USA. 2003;100:11702–5.
11. Benavides-Varela S, Hochmann JR, Macagno F, Nespor M, Mehler J. Newborns' brain activity signals the origin of word memories. Proc Natl Acad Sci USA. 2012;109:17908–13.
12. Gervain J, Macagno F, Cogoi S, Peña M, Mehler J. The neonate brain detects speech structure. Proc Natl Acad Sci USA. 2008;105:14222–7.
13. Gervain J, Mehler J, Werker JF, Nelson CA, Csibra G, Lloyd-Fox S, et al. Near-infrared spectroscopy: a report from the McDonnell infant methodology consortium. Dev Cogn Neurosci. 2011;1:22–46.
14. Peña M, Pittaluga E, Mehler J. Language acquisition in premature and full-term infants. Proc Natl Acad Sci USA. 2010;107:3823–8.
15. Mahmoudzadeh M, Dehaene-Lambertz G, Fournier M, Kongolo G, Goudjil S, Dubois J, et al. Syllabic discrimination in premature human infants prior to complete formation of cortical layers. Proc Natl Acad Sci USA. 2013;110:4431–2.
16. Lecanuet JP, Granier-Deferre C, Jacquet AY, Capponi I, Ledru L. Prenatal discrimination of a male and female voice uttering the same sentence. Early Dev Parent. 1993;2:217–28.
17. Kisilevsky B, Hains S, Brown C, Lee C, Cowperthwaite B, Stutzman S, et al. Foetal sensitivity to properties of maternal speech and language. Infant Behav Dev. 2009;32:59–71.
18. Mehler J, Jusczyk P, Lambertz G, Halsted N, Bertoncini J, Amiel-Tison C. A precursor of language acquisition in young infants. Cognition. 1988;29:143–78.
19. Moon C, Panneton-Cooper R, Fifer WP. Two-day-olds prefer their native language. Infant Behav Dev. 1993;16:495–500.
20. Nazzi T, Bertoncini J, Mehler J. Language discrimination by newborns: toward an understanding of the role of rhythm. J Exp Psychol Hum Percept Perform. 1998;24:1–11.
21. DeCasper AJ, Fifer WP. Of human bonding: newborns prefer their mothers' voices. Science. 1980;208:1174–6.
22. Kisilevsky B, Hains S, Lee K, Xie X, Huang H, Zhang K, Wang Z. Effects of experience on foetal voice recognition. Psychol Sci. 2003;14:220–4.
23. Eimas PD, Miller JL. Contextual effects in infant speech perception. Science. 1980;209:1140–1.
24. Dehaene-Lambertz G, Peña M. Electrophysiological evidence for automatic phonetic processing in neonates. NeuroReport. 2001;12:3155–8.
25. Sansavini A, Bertoncini J, Giovanelli G. Newborns discriminate the rhythm of multisyllabic stressed words. Dev Psychol. 1997;33:3–11.
26. Cutler A, Mehler J, Norris D, Segui J. The syllable's differing role in the segmentation of French and English. J Mem Lang. 1986;25:385–400.
27. Cutler A, Mehler J, Norris D, Segui J. The monolingual nature of speech segmentation by bilinguals. Cogn Psychol. 1992;24:381–410.
28. Dupoux E, Kakehi K, Hirose Y, Pallier C, Mehler J. Epenthetic vowels in Japanese: a perceptual illusion? J Exp Psychol Hum Percept Perform. 1999;25:1568–78.
29. Kuhl PK. Human adults and human infants show a 'perceptual magnet effect' for the prototypes of speech categories, monkeys do not. Percept Psychophys. 1991;50:93–107.
30. Kuhl PK, Williams KA, Lacerda F, Stevens KN, Lindblom B. Linguistic experience alters phonetic perception in infants by six months of age. Science. 1992;255:606–8.
31. Werker JF, Tees RC. Cross-language speech perception: evidence for perceptual reorganization during the first year of life. Infant Behav Dev. 1984;7:49–63.
32. Mehler J, Dupoux E. Naître Humain. Paris: Odile Jacob; 1990.
33. Mampe B, Friederici A, Christophe A, Wermke K. Newborns' cry melody is shaped by their native language. Curr Biol. 2009;19:1994–7.
34. Lloyd James A. Speech Signals in Telephony.
London; 1940.
35. Pike KL. The Intonation of American English. Ann Arbor: University of Michigan Press; 1945.
36. Abercrombie D. Elements of General Phonetics. Chicago: Aldine; 1967.
37. Ladefoged P. A Course in Phonetics. New York: Harcourt Brace Jovanovich; 1975.
38. Borzone de Manrique AM, Signorini A. Segmental durations and rhythm in Spanish. J Phon. 1983;11:117–28.
39. Wenk B, Wiolland F. Is French really syllable-timed? J Phon. 1982;10:193–216.
40. Bertinetto PM. Strutture prosodiche dell'italiano. Accento, quantità, sillaba, giuntura, fondamenti metrici. Firenze: Accademia della Crusca; 1981.
41. Dauer R. Phonetic and phonological components of language rhythm. Paper presented at the XIth International Congress of Phonetic Sciences, Tallinn; 1987.
42. Christophe A, Morton J. Is Dutch native English? Linguistic analysis by two-month-olds. Dev Sci. 1998;1:215–9.
43. Bosch L, Sebastián-Gallés N. Evidence of early language discrimination abilities in infants from bilingual environments. Infancy. 2001;2:29–49.
44. Nespor M, Peña M, Mehler J. On the different roles of vowels and consonants in speech processing and language acquisition. Lingue e Linguaggio. 2003;2:203–31.
45. Bonatti L, Peña M, Nespor M, Mehler J. Linguistic constraints on statistical computations: the role of consonants and vowels in continuous speech processing. Psychol Sci. 2005;16:451–9.
46. Toro JM, Bonatti LL, Nespor M, Mehler J. Finding words and rules in a speech stream: functional differences between vowels and consonants. Psychol Sci. 2008;19:137–44.
47. Toro JM, Shukla M, Nespor M, Endress A. The quest for generalizations over consonants: asymmetries between consonants and vowels are not the by-product of acoustic differences. Percept Psychophys. 2008;70:1515–25.
48. New B, Nazzi T. The time-course of consonant and vowel processing during word recognition. Lang Cogn Process. 2012;1:1–15.
49. Hochmann JR, Benavides-Varela S, Nespor M, Mehler J. Consonants and vowels: different roles in early language acquisition. Dev Sci. 2011;14:1445–58.
50. Selkirk E. Phonology and Syntax: The Relation between Sound and Structure. Cambridge: MIT Press; 1984.
51. Beckman M, Pierrehumbert J. Intonational structure in Japanese and English. Phonology Yearbook. 1986;3:15–70.
52. Cooper WE, Paccia-Cooper J. Syntax and Speech. Cambridge: Harvard University Press; 1980.
53. Klatt DH. Linguistic uses of segmental duration in English: acoustic and perceptual evidence. J Acoust Soc Am. 1976;59:1208–21.
54. Wightman CW, Shattuck-Hufnagel S, Ostendorf M, Price PJ. Segmental durations in the vicinity of prosodic phrase boundaries. J Acoust Soc Am. 1992;91:1707–17.
55. Christophe A, Peperkamp S, Pallier C, Block E, Mehler J. Phonological phrase boundaries constrain lexical access: I. Adult data. J Mem Lang. 2004;51:523–47.
56. Gout A, Christophe A, Morgan J. Phonological phrase boundaries constrain lexical access: II. Infant data. J Mem Lang. 2004;51:547–67.
57. Peña M, Bonatti L, Nespor M, Mehler J. Signal-driven computations in speech processing. Science. 2002;298:604–7.
58. Langus A, Marchetto E, Bion RAH, Nespor M. Can prosody be used to discover hierarchical structure in continuous speech? J Mem Lang. 2012;66:285–306.
59. Dryer MS. The order of subject, object and verb. In: Haspelmath M, Dryer MS, Gil D, Comrie B, editors. The World Atlas of Language Structures.
Oxford: Oxford University Press; 2005. p. 330–3.
60. Bolton T. Rhythm. Am J Psychol. 1894;6:145–238.
61. Cooper G, Meyer L. The Rhythmic Structure of Music. Chicago: University of Chicago Press; 1960.
62. Woodrow H. Time perception. In: Stevens S, editor. Handbook of Experimental Psychology. New York: Wiley; 1951. p. 1224–36.
63. Bion RAH, Benavides-Varela S, Nespor M. Acoustic markers of prominence influence infants' and adults' segmentation of speech sequences. Lang Speech. 2011;54:123–40.
64. Kayne RS. The Antisymmetry of Syntax. Cambridge: MIT Press; 1994.
65. Chomsky N. The Minimalist Program. Cambridge: MIT Press; 1995.
66. Newmeyer FJ. On the reconstruction of 'Proto-world' word order. In: Knight C, Hurford JR, Studdert-Kennedy M, editors. The Evolutionary Emergence of Language: Social Function and the Origins of Linguistic Form. Cambridge: Cambridge University Press; 2000.
67. Gell-Mann M, Ruhlen M. The origin and evolution of word order. Proc Natl Acad Sci USA. 2011;108:17290–5.
68. Lenneberg EH. Biological Foundations of Language. New York: Wiley; 1967.
69. Chin SB, Bergeson TR, Phan J. Speech intelligibility and prosody production in children with cochlear implants. J Commun Disord. 2012;45:355–66.
70. Meister H, Tepeli D, Wagner P, Hess W, Walger M, von Wedel H, Lang-Roth R. Experiments on prosody perception with cochlear implants. HNO. 2007;55:264–70.
71. Ciocca V, Francis AL, Aisha R, Wong L. The perception of Cantonese lexical tones by early-deafened cochlear implantees. J Acoust Soc Am. 2002;111:2250–6.
72. Nakata T, Trehub SE, Kanda Y. Effect of cochlear implants on children's perception and production of speech prosody. J Acoust Soc Am. 2012;131:1307–14.
73. Peña M, Bion RAH, Nespor M. How modality specific is the Iambic-Trochaic Law? Evidence from vision. J Exp Psychol Learn Mem Cogn. 2011;37:1199–208.