Laboratory Phonology 11
Goldstein, Nam, Kuthreshtha, Root & Best
Distribution of tongue tip articulations in Hindi versus English
and the acquisition of stop place categories
Louis Goldstein*#, Hosung Nam#, Manisha Kulthreshtha#, Leslie Root, and Catherine Best#+
*Department of Linguistics, University of Southern California; #Haskins Laboratories, New Haven, CT, USA
+MARCS Auditory Laboratories, University of Western Sydney
[email protected]; [email protected]; [email protected]; [email protected]
Explaining the decreased ability of infants around 8-10 months of age to distinguish at least some
phonological contrasts absent from their native language (Werker & Tees, 1984) has constituted a theoretical
challenge. Since infants of that age do not yet appear to have acquired the phonological system of the
ambient language, how is it that the structure of that phonology is nonetheless active in shaping their
perceptual behavior? A promising recent solution to this puzzle is provided by Maye et al. (2002), who have
argued that the distributional structure of the infant’s input (i.e., adult speech), combined with infants’
statistical sensitivities, could account for changes in early perception. Supporting this idea, they manipulated
the distributional properties of a VOT continuum (voiced to voiceless unaspirated), exposing infants to
unimodal vs. bimodal distributions in an experimental session. Bimodal distributions led infants to
perceptually discriminate items taken from the distinct modes, whereas they did not discriminate these items
when their experience was unimodal.
While this is a promising theoretical account, it raises the questions of how real-world input, which is
connected speech, is decomposed by infants into discrete items whose statistical properties can be
accumulated and what the relevant dimensions are along which statistics are kept. Potential answers to these
questions are provided by a different, though complementary, approach to modeling the infants’
developmental course in perception and production, namely, the articulatory organ (AO) hypothesis
(Goldstein & Fowler, 2003; Best & McRoberts, 2003). Under this hypothesis, infants can decompose the
oral-facial system into distinct organs (e.g., lips vs. tongue tip vs. tongue dorsum) from very early in life
(consistent with neonates’ ability to perform facial mimicry; Meltzoff & Moore, 1997), and they are sensitive
to the actions of these organs in producing constrictions within the vocal tract. This hypothesis predicts that
contrasts involving actions of distinct organs (e.g., /b/ vs. /d/) should be mastered relatively early, while
those involving quantitatively different actions of the same organ (e.g., /ð/ vs. /d/) should be acquired only
when the infant (or child) has attuned sufficiently to the distributional properties of the organ’s constrictions.
Results from children’s early word productions are consistent with this hypothesis (Goldstein, 2003), and
several perceptual findings can be explained with reference to it (Best & McRoberts, 2003).
Werker’s original experiments demonstrating the loss of ability to discriminate non-native contrasts are
clear examples of within-organ contrasts: Hindi dental and retroflex stops are tongue tip constrictions
(differing in the location of constriction), and Nthlakapmx velar and uvular stops are tongue dorsum
constrictions (also differing in location). Thus the loss of English-learning infants’ ability to discriminate
these stimuli could be explained as a result of their experience with a unimodal distribution of tongue tip or
tongue dorsum constrictions, while Hindi- and Nthlakapmx-learning infants would presumably experience
bimodal distributions.
Reasonable as this account seems, there are no data to support the hypothesis of bimodality of tongue tip
distributions in Hindi (or tongue dorsum distributions in Nthlakapmx). Even though a bimodal distribution
might be expected, it is possible that contextual variation in tongue tip positioning (or distributional
asymmetries within the language) might obscure an underlying contrast in constriction location, at least in
the surface articulation (and resulting sound). We therefore collected data on tongue tip constrictions in
running (adult) speech in Hindi, across a range of phonetic contexts, and compared them with data from English.
Method. A female Hindi speaker was recorded reading an approximately 6000-word story, while the
positions of her lips, tongue tip, tongue body and jaw were measured using electromagnetic midsagittal
articulometry (EMMA). Locations of coronal stops were identified from the acoustics, and the time during each
stop at which the tongue tip receiver was
closest to the palate was algorithmically determined. The horizontal position (advanced-retracted) of the
tongue tip (TTx) at that maximally constricted time for each stop was logged and the distribution of TTx
LabPhon11 abstracts
edited by Paul Warren
Wellington, New Zealand
30 June - 2 July 2008
Abstract accepted after review
values was plotted. For comparison, English data from the Wisconsin X-ray database was analyzed for 6
read paragraphs using the same procedures. Because there were fewer coronals in these paragraphs than in
the Hindi story, the analysis was carried out for 3 of the X-ray subjects, 2 male and 1 female. The data from
each subject (both Hindi and English) was normalized to the range 0 (most advanced) to 1 (most retracted).
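The measurement and normalization steps described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used: the function names, the data layout (one pair of sample sequences per stop), the assumption that raw TTx values increase with retraction, and the sample values themselves are all hypothetical.

```python
def max_constriction_index(palate_distance):
    """Index of the sample at which the tongue tip is closest to the palate."""
    return min(range(len(palate_distance)), key=lambda i: palate_distance[i])


def ttx_at_max_constriction(stops):
    """Return TTx at the moment of maximal constriction for each stop.

    `stops` is a list of (palate_distance, ttx) pairs; each member of a pair
    is a sequence of samples covering one stop (a hypothetical layout).
    """
    values = []
    for palate_distance, ttx in stops:
        values.append(ttx[max_constriction_index(palate_distance)])
    return values


def normalize(values):
    """Min-max scale one speaker's values to the range 0..1.

    Assumes larger raw values mean a more retracted tongue tip, so that
    0 is the most advanced and 1 the most retracted token.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]


# Made-up samples for three stops (illustrative only, not measured data).
stops = [
    ([3.0, 1.0, 2.0], [10.0, 12.0, 11.0]),
    ([2.0, 0.5, 0.4], [20.0, 21.0, 22.0]),
    ([1.0, 0.2, 0.9], [14.0, 16.0, 15.0]),
]
raw_ttx = ttx_at_max_constriction(stops)
normalized_ttx = normalize(raw_ttx)
```

Normalizing within each speaker, as here, removes differences in vocal tract size and receiver placement before distributions are compared or pooled.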
Results. As shown in Fig. 1a, the distribution of TTx was bimodal in Hindi. None of the 3 English subjects
showed a bimodal distribution. The results pooled across English subjects are shown in Fig. 1b.
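For illustration, pooling normalized values across subjects and binning them into a histogram might look like the sketch below. The function names, the bin count, and the toy values are assumptions made for the example; they are not taken from the study.

```python
def pool(subjects):
    """Concatenate per-subject lists of normalized TTx values."""
    return [value for subject in subjects for value in subject]


def histogram(values, n_bins=20):
    """Count values in n_bins equal-width bins spanning 0..1."""
    counts = [0] * n_bins
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError("values must be normalized to the range 0..1")
        # A value of exactly 1.0 is placed in the last bin.
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts


# Toy normalized values for two hypothetical subjects (not measured data).
subjects = [[0.0, 0.1], [0.5, 1.0]]
counts = histogram(pool(subjects), n_bins=4)
```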
[Figure 1, panels (a) Hindi and (b) English. Horizontal axis: tongue tip horizontal position (normalized), running from advanced to retracted.]
Figure 1: Histogram of tongue tip horizontal positions in (a) Hindi and (b) English
While the Hindi distribution appears bimodal, the distribution of retroflexes is fairly broad, and it is
important to know whether the overall distribution would afford the learning of two distinct constriction
categories. To test this, the observed normalized TTx values were input to a Hebbian learning model that has
been shown to result in self-organization of discrete phonetic categories from continuous input (Oudeyer,
2006; Nam et al., in press). With the Hindi data as input, the model converged on two sharply distinct
categories of neural units, while with the data from any of the English subjects, or with the pooled data, only
a single mode developed.
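The general idea of category formation from continuous input can be sketched with a much simplified stand-in, which is not the model of Oudeyer (2006) or Nam et al. (in press). Each unit has a preferred TTx value and a Gaussian tuning curve, and each input pulls every unit toward it in proportion to that unit's activation. The update rule, all parameter values, the gap-based category count, and the synthetic inputs are assumptions made for this illustration.

```python
import math
import random


def train_units(samples, n_units=50, sigma=0.15, rate=0.1, epochs=10, seed=0):
    """Move each unit's preferred value toward inputs that activate it."""
    rng = random.Random(seed)
    # Units start evenly spaced over the normalized range 0..1.
    prefs = [i / (n_units - 1) for i in range(n_units)]
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x in data:
            for i, p in enumerate(prefs):
                # Gaussian tuning: units near the input respond most strongly.
                activation = math.exp(-((x - p) ** 2) / (2 * sigma ** 2))
                prefs[i] = p + rate * activation * (x - p)
    return prefs


def count_categories(prefs, min_gap=0.1):
    """Count clusters of preferred values separated by more than min_gap."""
    ordered = sorted(prefs)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > min_gap)


def clipped_gauss(rng, mean, sd):
    """Draw a Gaussian value and clip it to the range 0..1."""
    return min(1.0, max(0.0, rng.gauss(mean, sd)))


# Synthetic inputs (NOT the Hindi or English data): two modes vs. one mode.
rng = random.Random(1)
bimodal = ([clipped_gauss(rng, 0.3, 0.03) for _ in range(200)]
           + [clipped_gauss(rng, 0.7, 0.03) for _ in range(200)])
unimodal = [clipped_gauss(rng, 0.5, 0.08) for _ in range(400)]

bimodal_categories = count_categories(train_units(bimodal))
unimodal_categories = count_categories(train_units(unimodal))
```

With modes as well separated as in the synthetic bimodal sample, the units are expected to collapse onto two preferred values, and onto one for the single-mode sample. How a broad mode behaves depends on the assumed tuning width `sigma`, which is why testing with the observed distribution matters.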
Conclusion. The adult data collected in this study suggest that if infants track distributional properties of
tongue tip constrictions within more naturalistic connected speech (however they manage to do that), they
will arrive at 2 categories of tongue tip behavior in stops in a Hindi environment, but a single category in an
English environment. This, in turn, provides evidence in support of distributional attunement as a possible
basis for changes in perceptual discrimination in the case of within-organ contrasts.
Acknowledgments. This research was supported by NIDCD grant DC-00403.
References
Best, C., & McRoberts, G. W. (2003). Infant perception of non-native consonant contrasts that adults assimilate in
different ways. Language & Speech, 46, 183–216.
Goldstein, L. (2003). Emergence of discrete gestures. In Solé, M.J., Recasens, D., & Romero, J. (Eds.), Proceedings of
the 15th International Congress of Phonetic Sciences (pp. 85–88). Rundle Mall: Causal Productions.
Goldstein, L., & Fowler, C. (2003). Articulatory phonology: a phonology for public language use. In Meyer, A. &
Schiller, N. (Eds.), Phonetics and Phonology in Language Comprehension and Production: Differences and
Similarities (pp. 159–207). New York: Mouton.
Maye, J., Werker, J., & Gerken, L. (2002). Infant sensitivity to distributional information can affect phonetic
discrimination. Cognition, 82, B101–B111.
Meltzoff, A., & Moore, M. (1997). Explaining facial imitation: a theoretical model. Early Development and Parenting,
6, 179–192.
Nam, H., Goldstein, L., & Saltzman, E. (in press). Self-organization of syllable structure: A coupled oscillator model.
In Chitoran, I., Coupé, C., Marsico, E., & Pellegrino, F. (Eds.), Approaches to Phonological Complexity. Berlin:
Mouton.
Oudeyer, P.Y. (2006). Self-Organization in the Evolution of Speech. Oxford: Oxford University Press.
Werker, J. & Tees, R. (1984). Cross-language speech perception: Evidence for perceptual reorganization during the
first year of life. Infant Behavior and Development, 7, 49–63.