Interactions of speaking rate and prosodic organization in non-native speech production
Tuuli Morrill, George Mason University
Melissa Baese-Berk, University of Oregon
Contact: [email protected]
The variety of prosodic structures across languages poses a challenge for non-native speakers:
prosodic characteristics are acquired relatively late compared to other phonological
structures. Studies investigating the perception and production of non-native prosody have
focused mainly on word-level prominence (e.g., stress patterns). However, there has recently
been work on two global suprasegmental properties of non-native speech: (1) speaking rate,
and (2) phrasal intonation. In the current study, we begin to address questions raised in both
of these areas by examining rate and intonation together in a corpus of non-native speech (the
Archive of L1 and L2 Scripted and Spontaneous Transcripts and recordings (ALLSSTAR)
available from https://oscaar.ci.northwestern.edu/index.html).
Overall, non-native speakers speak more slowly than native speakers (e.g., Guion, Flege,
Liu, & Yeni-Komshian, 2000). Recent work has shown that in addition to speaking more
slowly, non-native speakers of English exhibit greater variability in rate across utterances
when reading (Baese-Berk & Morrill, 2015); in other words, non-native speakers slow down
or speed up more than native speakers do. However, this pattern appears to be reversed in
spontaneous speech, where native English speakers exhibit greater variability than non-native
speakers (Morrill & Baese-Berk, 2015). Although it has been hypothesized that variability in
read speech could be due to processing difficulties, closer inspection of these results reveals
that non-native speakers exhibited approximately the same amount of variability in both
reading and spontaneous speech tasks (whereas native English speakers increased variability
in spontaneous speech). The question that arises, then, is what causes the rate
variability observed in non-native speakers, if it is not primarily driven by speech task. Here,
we suggest that non-native speaking rate may interact with the prosodic organization of
utterances, and that this interaction is a potential source for increased variability across
utterances, particularly for non-native speakers.
The current study examines utterances of read speech in the ALLSSTAR corpus. The
data consisted of nine utterances per speaker from native Korean (n = 9), Mandarin (n = 8),
and English speakers (n = 10); all were reading The Little Prince in English. We employed
Smoothing Spline ANOVA (SS ANOVA) to model pitch contours of entire utterances and
examine differences between language groups. SS ANOVA has recently been used to
examine pitch contours of syllables and words in tone languages (e.g., Moisik, Lin, & Esling,
2014), as well as entire phrases (Morrill, 2015). Pitch (F0) contours were extracted using the
Praat auto-correlation algorithm; each contour was divided into 1000 equally spaced time
points, and an F0 value was extracted at each point. F0 values were transformed to semitones
relative to 1 Hz and normalized by speaker. SS ANOVA was implemented with the “gss”
package in R (Gu, 2014) and F0 contours were modeled with 95% Bayesian confidence
intervals. Speaking rate was measured for each utterance (syllables/second).
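The preprocessing steps described above can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline: the function names are hypothetical, linear interpolation is assumed for sampling the equally spaced time points, and mean-centering is assumed for the speaker normalization, since the abstract specifies neither. Only the semitone conversion relative to 1 Hz and the syllables-per-second rate follow directly from the text.

```python
import math
from statistics import mean


def hz_to_semitones(f0_hz, ref_hz=1.0):
    """Convert an F0 value in Hz to semitones relative to ref_hz (1 Hz here)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def resample_contour(times, values, n_points=1000):
    """Sample a contour at n_points equally spaced times.

    Assumption: values between measured frames are obtained by linear
    interpolation, and `times` is strictly increasing.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    start, end = times[0], times[-1]
    step = (end - start) / (n_points - 1)
    out = []
    j = 0  # index of the left neighbour of the current sample time
    for i in range(n_points):
        t = min(start + i * step, end)  # guard against float overshoot
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out


def normalize_by_speaker(contours):
    """Center all of one speaker's semitone contours on that speaker's mean.

    Assumption: normalization is done by subtracting the speaker mean.
    """
    speaker_mean = mean(v for contour in contours for v in contour)
    return [[v - speaker_mean for v in contour] for contour in contours]


def speaking_rate(n_syllables, duration_s):
    """Speaking rate of one utterance in syllables per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_syllables / duration_s
```

In this sketch, an utterance's F0 track would first be converted with hz_to_semitones, then passed through resample_contour, and all contours from one speaker would be centered together with normalize_by_speaker before modeling.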
Analyses of pitch contours and speaking rate for native and non-native speakers revealed
several patterns. (1) In certain utterances (e.g., Sentence 5, Figure 1), non-native speakers
follow native intonation contours relatively closely, including the apparent placement of pitch
accents. (2) In other utterances (e.g., Sentence 14, Figure 2), non-native speakers diverge
from native intonation contours, only exhibiting overlap at the beginning and end of the
utterance. (3) As shown previously, non-native speakers are overall slower in rate than native
speakers (Figure 3). (4) Interestingly, the sentences in which non-native speakers are most
consistently slower than native speakers are those in which they most closely match the native
intonation contours (e.g., Figure 1); when non-native speakers approach faster rates, closer to
those of native speakers, they fail to match the native intonation contours (e.g., Figure 2).
This finding suggests a relationship between speaking rate and prosodic organization in
non-native speech. One possibility is that speakers’ realizations of intonation contours are
naturally more accurate when they are speaking more slowly (i.e., speaking faster leads to
inaccuracy). On the other hand, in the production of highly stylized intonation contours (e.g.,
the exclamation in Sentence 5, Figure 1), non-native speakers may slow their rate in order to
realize the intonation contour. The interaction between rate and intonation
contour may thus be a driving force behind the variability in speaking rate for non-native speakers.
These possibilities have implications for our understanding of prosodic organization and
production in both native and non-native speech.
Figure 1. Sentence 5 (S005) intonation contours modeled with 95% confidence intervals for native
Mandarin (CMN), English (ENG), and Korean (KOR) speakers.
Figure 2. Sentence 14 (S014) intonation contours modeled with 95% confidence intervals for native
Mandarin (CMN), English (ENG), and Korean (KOR) speakers.
Figure 3. Speaking rate in each utterance for native Mandarin (CMN), English (ENG), and
Korean (KOR) speakers.
References
Baese-Berk, M. M., & Morrill, T. H. (2015). Speaking rate consistency in native and non-native
speakers of English. The Journal of the Acoustical Society of America, 138(3), EL223–EL228.
Gu, C. (2014). Smoothing Spline ANOVA Models: R Package gss. Journal of Statistical Software,
58(5).
Guion, S. G., Flege, J. E., Liu, S. H., & Yeni-Komshian, G. H. (2000). Age of learning effects on the
duration of sentences produced in a second language. Applied Psycholinguistics, 21(2), 205–228.
Moisik, S. R., Lin, H., & Esling, J. H. (2014). A study of laryngeal gestures in Mandarin citation tones
using simultaneous laryngoscopy and laryngeal ultrasound (SLLUS). Journal of the
International Phonetic Association, 44(1), 21–58.
Morrill, T. (2015). The implementation of phrasal prosody by native and non-native speakers of
English: SS ANOVA for multi-syllabic intonation contours. In Proceedings of the 18th
International Congress of Phonetic Sciences. Glasgow, Scotland.
Morrill, T., & Baese-Berk, M. (2015). Speaking rate variability in spontaneous productions by
non-native speakers. The Journal of the Acoustical Society of America, 138(3), 1947–1947.