A Rhythmic Bias in Preverbal Speech Segmentation

JOURNAL OF MEMORY AND LANGUAGE
ARTICLE NO.
35, 666–688 (1996)
0035
A Rhythmic Bias in Preverbal Speech Segmentation
JAMES L. MORGAN
Brown University
Four studies using a variant of the conditioned head turning procedure, in which response
latencies to extraneous noises occurring at different junctures within synthetic syllable strings
served as the dependent variable, investigated 6- and 9-month-old infants’ representations of
familiar and novel syllable pairs manifesting diverse rhythmic patterns. Familiar bisyllables
tended to be perceived similarly by both age groups: These bisyllables were perceived as being
cohesive, without regard to whether they manifested trochaic (longer–shorter) or iambic (shorter–
longer) rhythm. Novel bisyllables were perceived differently by the two age groups. Six-montholds appeared to perceive segmentally novel, but rhythmically familiar, bisyllables as cohesive,
whereas they failed to perceive rhythmically novel, but segmentally familiar bisyllables in the
same fashion. Similar patterns of results were obtained with trochaic and iambic bisyllables.
Nine-month-olds, however, were differentially sensitive to different rhythmic patterns. Older
infants appeared to perceive novel bisyllables as cohesive only when they manifested trochaic
rhythm, whether the bisyllables were segmentally or rhythmically novel. Nine-month-olds’
behavior is consistent with the possibility that they have adopted a metrical strategy for segmentation, as A. Cutler (1990 and elsewhere) has argued is the case for novice and expert English
speakers alike. The plausibility of such a strategy with respect to the nature of child-directed
speech is discussed, as are possible consequences for acquisition of such a strategy. q 1996
Academic Press, Inc.
To infants embarking on learning language,
input speech must initially be perceived as
a wordless babble. In speech, words are not
physically separated, as they are on the printed
page. Nor does any universal recipe for ‘word’
exist: Words may comprise one syllable or
many, one sequence of sounds or another, one
stress pattern or an alternate. To make matters
yet more complex, words are mutable: When
words come into contact with one another in
fluent speech, the sounds at their edges may
change (following language-specific patterns
of sandhi), and their rhythmic patterns may
be altered (Chomsky & Halle, 1968; Selkirk,
1984), to the degree that words excised from
This work was supported by NIMH Grant MH 48892. I
thank John Mertus for providing technical and programming
support and Nancy Allard, Shinina Butler, Yael Harlap, Eileen Hoff, Rachel Kessinger, Allison Klein, and Michelle
Rogers for assistance in testing subjects and coding video
tapes. Address correspondence and reprint requests to James
L. Morgan, Department of Cognitive and Linguistic Sciences, Brown University, Box 1978, Providence RI 02912.
E-mail: [email protected].
0749-596X/96 $18.00
fluent speech are often unintelligible (Bard &
Anderson, 1983; Pollack & Pickett, 1964).
Nevertheless, location and recognition of
words in fluent speech is critical to language
learning, for it is only as a result of these
processes that componential meanings of utterances can be understood or patterns of syntax can be discovered. Moreover, from the
advent of productive language and the burgeoning of language comprehension, it may be
surmised that infants have substantially solved
the segmentation problem and begun to build
a lexicon before the end of the first year. The
processes by which these occur are only beginning to be understood. This paper presents
a series of studies exploring whether preverbal, English-learning infants exhibit a bias toward language-predominant rhythmic patterns
in segmenting short sequences of syllables—
comparable to phonological words—from
larger strings of syllables.
The hypothesis that particular rhythmic patterns are exploited in speech segmentation by
novice and expert language users alike has
666
Copyright q 1996 by Academic Press, Inc.
All rights of reproduction in any form reserved.
AID
JML 2447
/
a004$$$$$1
09-25-96 11:50:42
jmla
AP: JML
667
PREVERBAL SPEECH SEGMENTATION
been championed in a series of papers by Cutler and her colleagues. Cutler (1990; Cutler &
Norris, 1988) has proposed that English
speakers may employ a Metrical Segmentation Strategy, according to which word boundaries may be posited preceding strong syllables (stressed syllables, with relatively long
durations and full vowels). Cutler and Carter
(1987) have argued that this strategy provides
a plausible first stage in segmentation, inasmuch as an overwhelming preponderance of
English content words in fluent speech—in
excess of 85%—begin with strong syllables.
Speakers of languages with rhythmic properties different than those of English may pursue
alternative segmentation strategies (Cutler,
Mehler, Norris, & Segui, 1986; Otake, Hatano, Cutler, & Mehler, 1993).
Cutler has adduced a variety of psycholinguistic evidence to show that adult English
speakers may employ such a metrical segmentation strategy. Cutler and Clifton (1984), for
example, found that mis-stressed words with
unreduced vowels were more easily recognized if they were mis-stressed to comport
with the English-predominant strong–weak
pattern (e.g., ráccoon) than if they were misstressed to fit the opposite pattern (e.g.,
rabbı́t). Cutler and Norris (1988) demonstrated that monosyllabic words embedded in
nonsense bisyllables are more readily detected
when the bisyllables manifest strong–weak
patterns than when the bisyllables manifest
strong–strong patterns. Analyses of ‘slips of
the ear’ by Cutler and Butterfield (1992)
showed that a large majority of errors can be
analyzed as resulting from postulation of
boundaries before strong syllables (for example, ‘‘it was illégal’’ may be misperceived as
‘‘it was an éagle’’).
Cutler (1996) has argued that an important
advantage of a metrical segmentation strategy,
in contrast to segmentation models that rely
exclusively on matching of input to existing
lexical representations (e.g., Cole & Jakimik,
1978), is that it provides a means for novice
language learners to begin to locate words in
fluent speech. Several studies of infant auditory and speech perception provide evidence
AID
JML 2447
/
a004$$$$$2
that infants possess capacities prerequisite for
application of a segmentation strategy based
on speech rhythm. Demany, MacKenzie, and
Vurpillot (1977) have shown that, by 2
months, infants can discriminate auditory patterns differing in rhythm, and Trehub and
Thorpe (1988) have shown that, by 6 months,
infants can categorize sequences on the basis
of rhythmic attributes. Spring and Dale (1977)
and Jusczyk and Thompson (1978) have
shown that infants can discriminate simple
syllabic sequences differing in stress patterns.
Jusczyk, Cutler, and Redanz (1993) have demonstrated that, by 9 months, English-exposed
infants prefer to listen to lists of words with
strong–weak patterns rather than to words
with weak–strong patterns. This last result
shows that infants are not only attending to
rhythmic patterns themselves, but also to their
relative frequencies of occurrence in input. In
addition, evidence from older children’s productions suggests that they are susceptible to
the same sorts of mis-segmentations as those
documented by Cutler and Butterfield (1992),
as the following exchange between the author
and his daughter (3;5) illustrates:
(in the supermarket)
Child: What is that flower?
Father: That’s called an azalea.
Child: Let’s get Mommy some zaleas.
Although this set of findings is compatible
with the possibility that English-exposed infants may employ a metrical strategy for segmentation, none of these results show that infants actually do so. Consider how this hypothesis might be tested in preverbal infants.
A corollary of the metrical segmentation strategy hypothesis is that weak syllables (unstressed syllables with relatively short durations and, often, reduced vowels) will tend to
be perceived as continuing, rather than beginning, words. From this, the prediction follows
that, other things being equal, strong–weak
(trochaic, or longer–shorter) sequences of syllables should be perceived as being more cohesive than weak–strong (iambic, or shorter–
longer) sequences of syllables.
Recent work by Morgan (1994; Morgan &
09-25-96 11:50:42
jmla
AP: JML
668
JAMES L. MORGAN
Saffran, 1995) investigating infants’ representations of synthetic syllable strings provides
evidence that speech rhythm is an important
factor influencing the perceived cohesiveness
of short sequences of syllables. This research
uses a NOISE DETECTION method wherein
head turning latencies to extraneous noises
(short buzzes) occurring in different junctures
in syllable strings are measured. The expectation underlying this method is that responses
to buzzes occurring between a pair of syllables
perceived as constituting a single unit will be
slower than will be responses to buzzes occurring between two syllables not perceived
as constituting a unit. This reflects the fact that
perceptual/representational units are cohesive,
the property that gives rise to the tendency of
such units to resist interruption.
Experiments by both Morgan (1994) and
Morgan and Saffran (1995) show that infants
between 6 and 9 months use rhythmic properties of stimulus strings as an initial means of
grouping pairs of syllables. With increased familiarization, 9-month-olds balance sequential
and rhythmic properties of the stimulus strings
in forming an integrated basis for grouping
syllables. In contrast, 6-month-olds continue
to rely on rhythmic characteristics of stimulus
strings in grouping particular syllables, even
when the sequential properties of the stimulus
strings are incompatible with grouping those
syllables.
However, one study in Morgan and Saffran
(1995) provides some evidence that fails to
support the metrical segmentation strategy hypothesis, in particular the prediction that trochaic sequences should be more cohesive than
iambic sequences. Two between-subjects
stimulus conditions from Morgan and Saffran,
Experiment 2, are relevant to present concerns. In both of these, rhythmic and sequential regularities across the set of stimulus
strings supported grouping of a particular pair
of adjacent syllables into a single unit. In one
condition, the rhythmic pattern across the pair
was trochaic (a longer syllable followed by a
shorter syllable); in the second condition, the
rhythmic pattern was iambic (a shorter syllable followed by a longer syllable). Morgan
AID
JML 2447
/
a004$$$$$2
FIG. 1. Nine-month-olds. Head turning latencies from
Morgan and Saffran, Experiment 2.
and Saffran found that familiarized trochaic
and iambic sequences were both perceived by
9 month old infants as being cohesive; as Fig.
1 shows, both conditions displayed significant
nonzero differences in latencies to buzzes occurring inside (‘‘within unit’’) and outside
(‘‘between units’’) these syllable pairs.
These results might provide some evidence
that 9-month-olds are not yet using a metrical
strategy for segmentation; 9-month-olds’ preferences for trochaic bisyllables (Jusczyk, Cutler, & Redane 1993) may not reflect particular
processing biases. However, these results are
subject to several alternative explanations.
First, Morgan and Saffran’s failure to observe
differences between trochaic and iambic sequences might be due to a failure to manipulate relevant stimulus properties. Rhythmic
patterns in their stimuli were produced by
varying the durations of individual syllables
only. In contrast, in spontaneous speech,
strong and weak syllables typically differ not
only in duration, but also in vowel quality,
pitch contour, and amplitude. These properties
may not all be required for distinguishing between strong and weak syllables. For example,
Ratner (1985) reports analyses of maternal
child-directed speech in which content word
(strong syllable) vowels and function word
(weak syllable) vowels were spectrally simi-
09-25-96 11:50:42
jmla
AP: JML
669
PREVERBAL SPEECH SEGMENTATION
lar; in these data, vowel duration was a powerful predictor of syllable type. Moreover, in
studies of adult segmentation, duration has
been shown to be a more powerful influence
than vowel quality, pitch contour, or amplitude (Nakatani & Schaffer, 1978). Nevertheless, differences in infants’ processing of trochaic and iambic stimuli may only appear
when they are presented with stimuli in which
these natural correlations are preserved.
Second, the infants in Morgan and Saffran’s
study had received considerable exposure to
the stimulus strings before testing. It is possible that any initial tendency toward perceiving
a trochaic sequence as more cohesive may be
neutralized by sufficient familiarization with
an iambic sequence. After all, words manifesting iambic patterns are possible in English;
for mature speakers, a metrical segmentation
strategy is a bias rather than an absolute constraint on processing. It would be counterproductive for language learners to adopt an absolute version of a metrical strategy for segmentation. Thus, a more relevant test of whether
infants employ a metrical strategy for segmentation might involve examination of the perceived cohesiveness of novel syllabic sequences. Note that the stimuli used by Jusczyk, Cutler, and Redanz (1993) included only
low frequency words, to which infants were
unlikely to have been previously exposed.
Thus, the preference they demonstrated was a
preference for novel trochaic sequences over
novel iambic sequences. The goal of the studies presented below is to further investigate
infants’ representations of novel syllabic sequences.
In the first two studies, 6- and 9-monthold infants were familiarized to a key pair of
syllables manifesting one particular rhythmic
pattern or another (either trochaic or iambic)
and were later tested on both the familiar pair
of syllables and a novel pair of syllables manifesting the trained rhythmic pattern. In the second two studies, 6- and 9-month-old infants
are trained in the same fashion as in the first
two experiments, but were later tested on the
same pair of syllables when they appeared in
AID
JML 2447
/
a004$$$$$2
both the trained and the opposing, untrained
rhythmic patterns.
To the extent that the metrical segmentation
strategy hypothesis correctly predicts that trochaic rhythm tends to enhance the perceived
cohesiveness of syllable sequences, asymmetries across stimulus conditions should result.
In particular, novel bisyllables should be perceived as being more cohesive when they
manifest trochaic rhythm, and familiar bisyllables manifesting a novel rhythmic pattern
should be perceived as being more cohesive
when the novel pattern is trochaic. Examination of infants at two ages may permit the
determination of whether enhanced cohesiveness of trochaic syllable sequences emerges
between 6 and 9 months, in tandem with the
emergence of preference for trochaic sequences documented by Jusczyk, Cutler, and
Redanz (1993).
EXPERIMENT 1
The noise detection method is a variant of
the conditioned head turning technique (Kuhl,
1985) that draws its inspiration from click detection methods formerly employed in investigating adult psycholinguistic representations
(see Fodor, Bever, & Garrett, 1974, for an
extensive historical review). Unlike the typical procedure of that time, which used subjects’ recollections of click locations as the
primary dependent measure, the noise detection method is based on measurements of response latencies to extraneous noises (short
buzzes) occurring in different junctures in syllable strings. (A handful of studies investigating adult sentence processing, including Abrams & Bever, 1969; Bond, 1972; Flores
d’Arcais, 1978; and Holmes & Forster, 1970,
used on-line click detection methods comparable to the method used here. Those studies
found differences in response latency as a
function of objective click location: Clicks
within syntactic units were responded to more
slowly than were clicks between units. These
results are consistent with the predictions
made here.) Under this method, infants are
conditioned to turn their heads in response to
short buzzes which are initially presented in
09-25-96 11:50:42
jmla
AP: JML
670
JAMES L. MORGAN
silent intervals between multisyllabic strings.
After training, buzzes are presented between
different pairs of syllables within the strings,
and differences in response latencies according to noise location are used to index
infants’ perceptual organization of the stimulus strings. Across stimulus conditions, prosodic and sequential properties of the multisyllabic strings are to be manipulated to
promote or discourage perceptions of cohesiveness between particular pairs of syllables.
Method
Subjects. Infants approximately 9 months
of age were recruited from the Providence
metropolitan area. Thirty-two infants, approximately half male and half female, served as
subjects; 16 infants were assigned at random
to each of the two stimulus conditions. Subjects had a mean age of 0;9.05 at the first
session (range 0;8.20 to 0;9.22). The two sessions required for each infant were scheduled
so that the time between the two sessions was
no more than 10 days.
Fifty-nine infants were tested in order to
attain the final sample; infants were excluded
from the study for the following reasons: Failure to complete initial shaping within 40 trials
or to meet the predetermined training criterion
within 30 trials in the initial session (14), interference from uncooperative siblings (1),
difficulties scheduling subsequent testing sessions within a 10 day period (4), and failure
to contribute at least two latencies to each cell
in the analysis (8).
Apparatus. Infants were tested in a soundproofed laboratory room. An experimenter in
an adjoining control room monitored the infant’s behavior via a Panasonic closed-circuit
television system. Trial duration, stimulus presentation, and delivery of reinforcement were
controlled by custom designed software running on an IBM PS/2 equipped with a Data
Translations DT2901 D–A board. Stimuli
were presented through a 5000 Hz TTE
411AFS low pass filter, a custom programmable attenuator, an Onkyo P-304 preamplifier,
and an Onkyo M-504 power amplifier that was
connected to an Electrovoice Sentry 100A
AID
JML 2447
/
a004$$$$$2
monitor loudspeaker located in the testing
room with the infant. Two smoked Plexiglas
boxes containing motorized animal figures
that provided reinforcement were located
above the loudspeaker.
Test sessions were recorded (video only)
with a Panasonic AG6040 time-lapse VCR,
which was interfaced to the computer. At the
beginning of each trial, the computer wrote
identifying information to the video tape
which was later used in coding the video tapes
for head turn latencies.
Stimuli. A set of stimulus syllables were
recorded and digitally edited for use in this
experiment. Original versions of the syllables
were spoken in isolation with a flat intonation
contour by a voice-trained adult female. Edited stimuli included two tokens each of [ko],
[ga], [pu], and [bc̀ ], and one token each of
[de] and [ti]. Desired durations were produced
by excising entire glottal periods, from zerocrossing to zero-crossing, so that no transients
were introduced. Shorter versions of [ko],
[ga], [pu], and [bc̀ ] (henceforth denoted by
[kò], [gà], [pù], and [bc̀ ]) were 350 ms,
whereas longer versions of [ko], [ga], [pu],
and [bć ] (henceforth denoted by [kó], [gá],
[pú], and [bć ]) were 500 ms in duration.
Shorter and longer versions of these syllables
did not differ in either amplitude or fundamental frequency contour. The tokens of [de] and
[ti] were edited to be 425 ms in duration. The
extraneous buzz was created by digitally concatenating clicks from a child’s toy cricket.
Subjects heard the syllables in trisyllabic
strings. Each string included one filler syllable, either [de] or [ti], which appeared in either
initial or final position. Each string also included one key pair of syllables which either
consisted of one token of [kò] or [kó] and one
token of [gà] or [gá], or else one token of [pù]
or [pú] and one token of [bc̀ ] or [bć ]. The
syllables constituting the key pair were always
adjacent to one another and always occurred
in the same relative order. Within the strings,
individual syllables were separated by 50 ms
silences; strings themselves were separated by
1000 ms silences.
Two between-subject stimulus conditions
09-25-96 11:50:42
jmla
AP: JML
671
PREVERBAL SPEECH SEGMENTATION
TABLE 1
TRISYLLABIC STRINGS INCLUDED
Condition
IN
STIMULUS CONDITIONS, EXPERIMENTS 1
Subcondition B
Iambic meter
Subcondition A
Subcondition B
2a
Stimulus stringsb
Phase
Trochaic meter
Subcondition Ac
AND
Training
Testing
gákòde
gákòde
Training
Testing
bć puD de
bć puD de
Training
Testing
gàkóde
gàkóde
Training
Testing
bc̀ púde
bc̀ púde
gákòti
gákòti
bć puD de
bć puD ti
bć puD ti
gákòde
degákò
degákò
debć puD
debć puD
debć puD
degákò
tigákò
tigákò
gàkóti
gàkóti
bc̀ púde
bc̀ púti
bc̀ púti
gàkóde
degàkó
degàkó
debc̀ pú
debc̀ pú
debc̀ pú
degàkó
tigàkó
tigàkó
tibć puD
tibć puD
tibc̀ pú
tibc̀ pú
a
Accute accents ( ´ ) indicate longer syllables (500 ms); grave accents ( ` ) indicate shorter syllables (350 ms);
syllables without accents are medium length (425 ms).
b
All four Training strings were used as background during Testing, but, for the conditions illustrated, only familiar
strings with [de] were used in test trials.
c
Subconditions C–D were similar to A–B, respectively, except that [ti], rather than [de] appeared in the novel test
items; subconditions E–H were similar to A–D, respectively, except that the order of syllables within key pairs was
reversed, e.g., [kógà] and puB bc̀ ] rather than [gákò] and bć pù].
were employed. Each condition included a
training set of four trisyllabic strings, across
which the key syllable pair consistently
manifested one of the two rhythmic patterns
used. This set of strings was used in initial
training, as background stimuli between test
trials, and as test stimuli in half of the test
trials (see Procedure for additional details).
Strings with key pairs comprising syllables
not included in the training strings provided
test stimuli for the remaining test trials.
Stimulus conditions are described below and
illustrated in Table 1.
In the Trochaic Meter condition, the rhythmic pattern over the key pair of syllables was
always trochaic: a longer syllable followed by
a shorter syllable. Across the training set of
strings, both this consistent rhythmic pattern
and distributional (or sequential) information
contributed evidence that the pair of syllables
constituted a unit. The distributional evidence
also signaled that [de] and [ti] each formed
independent units. Following training, infants
were tested on strings with either familiar or
AID
JML 2447
/
a004$$$$$3
novel key pairs of syllables. Eight subconditions were included, to each of which 2 infants
were assigned at random, and across which
key training syllables, key syllable order, and
use of [de] or [ti] in test items were counterbalanced. A representative subset of these subconditions are shown in Table 1. This condition tested the cohesiveness of a novel trochaically accented syllable sequence relative to a
familiar trochaically accented syllable sequence.
In the Iambic Meter condition, the rhythmic
pattern across the key pair of syllables was
always iambic: a shorter syllable followed by
a longer syllable. Again, both rhythmic and
distributional information provided evidence
that the pair of syllables constituted a unit,
and distributional evidence indicated that [de]
and [ti] each constituted independent units.
Following training, infants were tested on
strings with either familiar or novel key pairs
of syllables. Eight subconditions were included, some of which are shown in Table 1.
This condition tested the cohesiveness of a
09-25-96 11:50:42
jmla
AP: JML
672
JAMES L. MORGAN
novel iambically accented syllable sequence
relative to a familiar iambically accented syllable sequence.
As in Morgan and Saffran, trochaic and
iambic rhythmic patterns were produced here
by varying only the durations of individual
syllables. This provides a stringent test of the
metrical segmentation hypothesis. If differences in infants’ processing of trochaic and
iambic rhythms appear when they are presented with stimuli in which only syllable duration distinguishes strong and weak syllables,
such differences should appear a fortiori when
stimuli incorporate additional properties normally correlated with meter.
Procedure. Infants participated in one training session and one testing session. During
both sessions infants were seated on their parents’ laps at a small table. An assistant seated
directly across from the infant maintained the
infant’s attention at midline by displaying and
manipulating an assortment of toys. The experimenter in the control room observed the
infant on a video monitor. This experimenter
initiated trials upon observing the infant’s attention to be focused at the midline. Reinforcement was delivered contingent on the infant responding to a buzz with a headturn toward the loudspeaker, which was located 90
degrees from midline on the infant’s left. The
experimenter served as sole judge of whether
the infant turned its head during trials, depressing a button to signal to the computer
that a headturn had occurred. (The video camera was located directly above the loudspeaker, so the experimenter judged whether
the infant looked into the camera.) Reinforcement began when the experimenter signaled a
headturn, and lasted for 3 s. To preclude bias,
throughout all sessions, parent, assistant, and
experimenter all listened to music over
around-the-ear headphones (Sony MDR-CD
550).
During the Training Session, infants heard
the trisyllabic strings presented at a comfortable listening level (60–62 dB (B) SPL). Example stimulus sequences from each phase of
the training session are illustrated in Table 2.
The software randomly chose which of the
AID
JML 2447
/
a004$$$$$3
four training strings to present in each input
block; three repetitions of the string normally
occurred in each block. Upon judging the infant’s attention to be focused at midline, the
experimenter could request a trial; the trial
began at the earliest possible point following
an inter-string interval and included one block
of three repetitions of a string. In some instances, the trial began in the middle of a background stimulus block, in which case the
string in that block was repeated across the
trial. In other instances, the trial began at the
end of a background stimulus block, in which
case the software simply advanced to the next
stimulus block.
During the shaping phase, in each trial a
buzz was inserted in the middle of the first
1000 ms inter-string interval. Infants were allowed a window of 2.5 s from the onset of
the buzz in which to turn toward the loudspeaker in order to receive reinforcement;
head turns occurring after this period were not
reinforced. Initially, the buzz was 100 ms in
duration and 12 dB louder than the speech
stimuli; across shaping trials, after every two
consecutive correct trials, the buzz was shortened in 20 ms steps to 40 ms, and its intensity
was lowered in 4 dB steps until it was equal
to that of the speech. The shaping phase was
continued for 40 trials or until the infant
turned his or her head on two consecutive trials with the buzz at its minimum duration.
Following successful shaping, the criterion
phase began. In this phase of the training session, the computer randomly selected trials to
be either target (buzz) trials or control (nobuzz) trials. Infants were tested until they had
either attained the predetermined criterion by
responding correctly on seven of eight consecutive trials (by turning on target trials and not
turning on control trials) or completed 30 trials. Infants not attaining the criterion within
the specified number of trials were excluded
from further participation; infants who attained the criterion returned for a testing session.
The Testing Session began with a short review (requiring four head turns) recapitulating
the shaping phase of the Training Session—
09-25-96 11:50:42
jmla
AP: JML
673
PREVERBAL SPEECH SEGMENTATION
TABLE 2
EXAMPLES
OF
TRAINING TRIALS
IN
EXPERIMENTS 1–4a
Stimulus blocksb
Shaping phase
Criterion phase
Trial
type
Pre-trial
Trial
Post-trial
All
Target
Control
. . . gákòti3 gákòde3 degákò3
. . . tigàkò3 tigákò3 degákò1
. . . gákòde3 degákò3 tigákò2
gákòtif gákòti gákòti
degákòf degákò degákò
tigákò tigákò tigákò
gákòde3 tigákò3 degákò3 . . .
gákòde3 tigákò3 gákòti3 . . .
gákòti3 degákò3 gákòde3 . . .
a
Possible sequences are given here only for one subcondition in the Trochaic Meter (Trochaic Training) condition,
in which [gákò] served as the key training syllable pair.
b
Superscripted numbers indicate repetitions. f indicates the location of the buzz within a trial.
the buzz was presented in the first inter-string
interval, and the duration and amplitude of the
buzz were reduced following each head turn.
Following this, the testing phase began. In this
phase, between trials, infants heard the same
strings in the same fashion as before. Test
trials differed from training trials in that the
40 ms buzz occurred in one of the 50 ms intersyllable intervals within the first string, either
between the syllables constituting the key pair
(within unit) or between one of these syllables
and a filler syllable (between units). As in the
training trials, infants were allowed a period
of 2.5 s beginning with the onset of the buzz
in which to turn in order to receive reinforcement. The string presented in each of the trials
was predetermined by the software; hence, unlike the Training Session, interrupted stimulus
blocks were not extended through the trials.
The testing phase included 24 randomly ordered trials in total. Crossing of three binary
factors—Familiarity (familiar vs novel stimulus), Position (first inter-syllable interval vs
second inter-syllable interval buzz location),
and Unit Status (within-unit vs between-units
buzz location)—yields eight trial types, each
of which occurred three times. Example stimulus sequences illustrating each trial type are
provided in Table 3.
Note that across the set of stimulus strings, the
familiar syllables flanking within-unit buzzes—
the key pair of syllables—are invariant in segmental content, relative order, and rhythmic pattern. These properties should promote perception of this pair of syllables as a single unit;
AID
JML 2447
/
a004$$$$$3
hence, responses to extraneous noises occurring
between these syllables should be relatively
slow. In contrast, across the set of stimulus
strings, the familiar syllables flanking betweenunits buzzes vary in segmental content, relative
order, and syllable duration. These properties
should discourage perception of these pairs of
syllables as single units; hence responses to ex-
TABLE 3
EXAMPLES
OF
TESTING TRIALS
IN
EXPERIMENTS 1 & 2a
Trial Stimulib
Trial type
Familiar
Within-unit
first-interval
Within-unit
second-interval
Between-units
first-interval
Between-units
second-interval
Novel
Within-unit
first-interval
Within-unit
second-interval
Between-units
first-interval
Between-units
second-interval
gáfkòde gákòde gákòde
degáfkò degákò degákò
defgákò degákò degákò
gákòfde gákòde gákòde
bć fpúde gákòde gákòde
debć fpù degákò degákò
defbć pù degákò degákò
bć pùfde gákòde gákòde
a
Possible sequences are given here only for Subcondition A in the Trochaic Meter condition, in which [gákò]
served as the key syllable pair in training, and all strings
in test trials included [de] as the filler syllable.
b f
Indicates the location of the buzz within a trial.
09-25-96 11:50:42
jmla
AP: JML
674
JAMES L. MORGAN
traneous noises occurring between these syllables should be relatively fast. Differences in latencies to within-unit versus between-units
buzzes thus provide a measure of the degree to
which the key syllables are perceived as cohering with one another to form a unit.
All test trials were coded off-line from the
video tapes. Off-line coders were blind to the
on-line scoring of the experimenter, the nature
of each trial, and the conditions to which subjects were assigned. During coding, the VCR
was interfaced to the computer and controlled
by custom software. Using the identifying information written on the tape during the experiment, the computer advanced the videotape
to the beginning of the buzz in each trial. The
coder then played the tape forwards and backwards in slow motion (at speeds ranging down
to 1 field—1/60 s—per s) and identified the
point at which the infant began to rotate its
head in a smooth movement that culminated
in a complete head turn toward the camera.
The computer automatically calculated the
amount of time elapsed between this point and
the beginning of the buzz. Twenty-five percent
of the head turn latencies were coded a second
time to assess coder reliability.
Agreement between on-line experimenters
and off-line coders on whether head turns occurred exceeded 95%. Intra-coder reliability
on coding latencies was quite high: More than
90% of these latencies differed by no more
than two frames of the video tape (67 ms).
Root mean square averages were computed
for head turning latencies within each of the
eight Familiarity by Position by Unit Status trial
types. RTs of less than 100 ms were excluded.
Subjects were excluded if they did not have at
least two usable RTs in each cell (maximum
three). Difference scores were then computed
for each Familiarity by Position cell by subtracting the between-units mean from the
within-units mean. These difference scores provide measures of the relative cohesiveness of
the key syllable pair for the different Familiarity
by Position trial types.
Results and Discussion
A preliminary set of Stimulus Condition
by Familiarity by Position omnibus analyses
AID
JML 2447
/
a004$$$$$3
FIG. 2. Nine-month-olds. Means and standard errors of
differences in head turning latencies in Experiment 1 to
buzzes occurring inside versus outside key syllable pairs
for familiar and novel test stimuli. Large positive differences are consistent with perception of the syllable pairs
as cohesive units.
yielded no significant main effects or interaction terms involving the Position factor
(first inter-syllable interval vs second intersyllable interval buzz location). Therefore,
treatment means were collapsed across levels of Position, and this factor was not used
in subsequent analyses. Means and SEs of
difference scores for familiar and novel
stimuli for each of the stimulus conditions
are shown in Fig. 2.
Familiar stimuli. Thirteen of 16 subjects in
the Trochaic Meter condition responded more
slowly on average to ‘within unit’ buzzes than
to ‘between units’ buzzes; 12 of 16 subjects
in the Iambic Meter condition did likewise.
Overall, subjects required 569 ms (SD Å 108
ms) to initiate head turns on familiar withinunit trials, and 516 ms (SD Å 138 ms) to
initiate head turns on familiar between-units
trials. Both groups of subjects displayed significant positive difference scores: for Trochaic Meter, t(15) Å 2.75, p õ .05, and for
Iambic Meter, t(15) Å 2.15, p õ .05. Difference scores did not vary significantly across
the two stimulus conditions, t(30) Å 0.33, NS.
This pattern of results is similar to that previously obtained in Morgan and Saffran, Ex-
09-25-96 11:50:42
jmla
AP: JML
675
PREVERBAL SPEECH SEGMENTATION
periment 2 and suggests that no processing
advantage exists for familiar trochaic over familiar iambic sequences.
Novel stimuli. Once again, 13 of 16 subjects
in the Trochaic Meter condition responded
more slowly on average to within-unit buzzes
than to between-units buzzes. However, only
8 of 16 subjects in the Iambic Meter condition
did likewise. Subjects in the Trochaic Meter
condition required 665 ms (SD Å 96 ms) to
initiate head turns on novel within-unit trials
and 612 ms (SD Å 102 ms) to initiate head
turns on novel between-units trials. The mean
difference score for the Trochaic Meter condition was significantly greater than zero, t(15)
Å 2.82, p õ .05. Subjects in the Iambic Meter
condition required 623 ms (SD Å 88 ms) to
initiate head turns on novel within-unit trials
and 635 ms (SD Å 101 ms) to initiate head
turns on novel between-units trials; but the
mean difference score for the Iambic Meter
condition was not significantly different from
zero, t(15) Å 00.56, NS. Difference scores
varied significantly across the two conditions,
t(30) Å 2.58, p õ .05. The different patterning
of results across familiar and novel stimuli
was also evidenced by a significant Stimulus
Condition by Familiarity interaction in the
omnibus analysis of difference scores, F(1,30)
Å 4.85, p õ .05.
The results of this experiment replicate in
part those of Morgan and Saffran, Experiment
2: Infants appeared to perceive the familiarized key syllable pairs as equally cohesive regardless of whether they manifested trochaic
or iambic rhythm. However, infants’ responses to novel syllable pairs did vary depending on rhythmic pattern: Whereas novel
trochaic bisyllables appeared to be as cohesive
as familiar trochaic bisyllables, novel iambic
bisyllables were significantly less cohesive
than were familiar iambic bisyllables. The
comparable overall increases in head turning
latency on trials with novel stimuli (99 ms for
Trochaic Meter, t(15) Å 6.16, p õ .01; 83 ms
for Iambic Meter, t(15) Å 5.97, p õ .01) show
that infants in both conditions noticed the
change in stimulus. Therefore, the continued
relatively slow responses on novel within-unit
AID
JML 2447
/
a004$$$$$3
trials by infants in the Trochaic Meter condition cannot be attributed to their having ignored the segmental changes introduced in
these trials. Rather, the pattern of results in
this experiment is compatible with the possibility that 9-month-old, English-exposed infants are biased toward perceiving trochaic bisyllables as constituting cohesive portions of
the speech stream—a bias tantamount to use
of a metrical strategy for segmentation.
EXPERIMENT 2
Several recent studies have demonstrated
changes in speech perception occurring over
the second half of the first year, many of which
appear to reflect attunement to properties of
the native language. For example, Werker and
Tees (1984) showed that 6-month-olds can
discriminate non-native speech contrasts that
10-month-olds cannot, and Jusczyk, Friederici, Wessels, Svenkerud, and Jusczyk (1993)
showed that 9-month-olds prefer to listen to
words exemplifying native, rather than nonnative, sound sequences, whereas 6-montholds show no preference. Experiment 2 asks
whether comparable changes occur with regard to rhythmic biases in speech segmentation, testing 6-month-olds under the procedures of Experiment 1.
Two recent findings are particularly germane to making predictions in the present experiment. First, Morgan and Saffran (1995)
found that rhythmic regularities holding over
a set of stimulus strings were sufficient for 6month-olds to group pairs of syllables together. Thus, for familiar stimuli, 6-montholds may be expected to behave in a fashion
similar to that of the 9-month-olds in Experiment 1, providing evidence that familiar trochaic and iambic bisyllables are both perceived as cohesive. Second, Jusczyk, Cutler,
and Redanz (1993) found that 6-month-old,
English-exposed infants show no preference
for either trochaic or iambic novel bisyllables,
whereas 9-month-olds prefer trochaic novel
bisyllables. Thus, for novel stimuli, 6-montholds may be expected to behave differently
than did the 9-month-olds in Experiment 1,
failing to provide evidence that novel trochaic
09-25-96 11:50:42
jmla
AP: JML
676
JAMES L. MORGAN
and iambic bisyllables are perceived as differentially cohesive.
Method
Subjects. Infants approximately 6 months
of age were recruited from the Providence
metropolitan area. Thirty-two infants, approximately half male and half female, served as
subjects; 16 infants were assigned at random
to each of the two stimulus conditions. Subjects had a mean age of 0;6.13 at the first
session (range 0;6.1 to 0;7.0). The two sessions required for each infant were scheduled
so that the time between the two sessions was
no more than 10 days.
Sixty-six infants were tested in order to attain the final sample; infants were excluded
from the study for the following reasons: failure to complete initial shaping within 40 trials
or to meet the predetermined training criterion
within 30 trials in the initial session (19), difficulties scheduling subsequent testing sessions within a 10 day period (5), and failure
to contribute at least two latencies to each cell
in the analysis (10).
Apparatus, stimuli, and procedure. The apparatus, stimuli, and procedure used in this
study were identical to those used in Experiment 1.
Results and Discussion
As in Experiment 1, preliminary analyses
showed no significant effects involving the
Position factor, so treatment means were once
again collapsed across levels of Position, and
this factor was not used in subsequent analyses. Means and SEs of difference scores for
familiar and novel stimuli for each of the stimulus conditions are shown in Fig. 3.
Familiar stimuli. Ten of 16 subjects in the
Trochaic Meter condition, and 13 of 16 subjects in the Iambic Meter condition, responded
more slowly on average to within-unit buzzes
than to between-units buzzes. Overall, subjects required 599 ms (SD Å 113 ms) to initiate head turns on familiar within-unit trials,
and 547 ms (SD Å 105 ms) to initiate head
turns on familiar between-units trials. A significant positive difference score was found
AID
JML 2447
/
a004$$$$$4
FIG. 3. Six-month-olds. Means and standard errors of
differences in head turning latencies in Experiment 2 to
buzzes occurring inside versus outside key syllable pairs
for familiar and novel test stimuli. Large positive differences are consistent with perception of the syllable pairs
as cohesive units.
for the Iambic Meter condition, t(15) Å 2.44,
p õ .05, but not for the Trochaic Meter condition, t(15) Å 1.75, NS. Difference scores did
not vary significantly across the two stimulus
conditions, t(30) Å 00.52, NS.
Novel stimuli. Fourteen of 16 subjects in
the Trochaic Meter condition responded more
slowly on average to within-unit buzzes than
to between-units buzzes. Twelve of 16 subjects in the Iambic Meter condition did likewise. Overall, subjects required 623 ms (SD
Å 113 ms) to initiate head turns on novel
within-unit trials and 566 ms (SD Å 100 ms)
to initiate head turns on novel between-units
trials. Although only the mean difference
score for the Trochaic Meter condition was
significantly greater than zero, t(15) Å 2.56,
p õ .05, difference scores did not vary significantly across the two stimulus conditions,
t(30) Å 0.91, NS. The omnibus analysis of
difference scores did not yield a significant
Stimulus Condition by Familiarity interaction,
F(1,30) õ 1, NS (neither of the main effects
were significant in the omnibus analysis,
either).
Unlike 9-month-olds, 6-month-old Englishexposed infants do not appear to be biased
09-25-96 11:50:42
jmla
AP: JML
677
PREVERBAL SPEECH SEGMENTATION
toward perceiving trochaic bisyllables, whether
familiar or novel, as being more cohesive portions of the speech stream than iambic bisyllables. This comports with Jusczyk, Cutler,
and Redanz’s (1993) finding that 6-montholds showed no preference for listening to
trochaic, rather than iambic, bisyllables. In
contrast to Experiment 1, it is not clear that
introducing novel stimulus syllables was an
effective manipulation. First, in this experiment, infants’ patterns of responding were
very similar across familiar and novel syllables. Second, whereas 9-month-olds’ responses to novel syllables were on average
about 90 ms slower than their responses to
familiar syllables, 6-month-olds displayed a
much smaller difference. This raises the possibility that in the noise detection task, younger
infants may become ‘‘locked in’’ to certain
rhythmic patterns, to the point of failing to
notice or attend to segmental changes in the
stimuli. This comports with Morgan and Saffran’s previous finding that 6-month-olds tend
to focus on rhythmic stimulus properties in
grouping syllables, providing evidence that bisyllables with consistent rhythmic patterns are
perceived as cohesive, even when the segmental content of those bisyllables varies.
EXPERIMENT 3
Cutler and Clifton (1984) investigated English-speaking adults’ recognition of misstressed familiar words, finding that words
with unreduced vowels that were misstressed
to be trochaic were recognized faster than
were words mis-stressed to be iambic. In principle, this could come about for two reasons,
one having to do with lexical access, the other
having to do with segmentation per se.1 On
the first explanation, iambs mis-stressed as
trochees might be recognized more quickly
than trochees mis-stressed as iambs because
strong syllables are more likely to initiate lexical access than are weak syllables (Cutler &
Norris, 1988). Thus for trochees mis-stressed
as iambs, not only is there a mismatch be1
I thank one of the reviewers for pointing out the distinction between these possibilities.
AID
JML 2447
/
a004$$$$$4
tween expected and observed rhythmic patterns, but also lexical access may be mistakenly initiated at the second syllable. On the
second explanation, listeners may be biased
toward imposing trochaic segmentation (or
against imposing iambic segmentations) on
strings that they hear. Thus, iambs misstressed as trochees might be segmented as
single strong–weak units that correspond segmentally, if not rhythmically, to known units
(or words), whereas trochees mis-stressed as
iambs might be segmented as sequences of
two units (the first weak, the second strong),
neither corresponding segmentally to known
units. The pattern of reaction times observed
by Cutler and Clifton was more compatible
with the first explanation than the second.
The present study is a miniaturized version
of Cutler and Clifton (1984), using 9-monthold infants as subjects. Infants were first familiarized with a particular bisyllable embodying
either trochaic or iambic rhythm. They were
then tested to determine whether the familiar
version of the bisyllable and the mis-stressed
version of the bisyllable both are perceived as
relatively cohesive. In light of the results of
Experiment 1, it was expected that infants
would perceive familiarized bisyllables as being cohesive, regardless of the rhythmic pattern they manifested. However, because Experiment 1 suggests that 9-month-old infants
exposed to English are biased to perceive trochaic bisyllables as being particularly cohesive, it was also expected that infants would
perceive novel versions of the bisyllables as
being more cohesive when they were misstressed trochaically than when they are misstressed iambically.
The results of Experiment 1 suggest that
the most likely interpretation of the expected
pattern of results would appeal to infants’ biases for imposing trochaic segmentations on
input strings. Recall that in Experiment 1, segmentally novel trochaic bisyllables were perceived by 9-month-olds as being cohesive. Because these bisyllables were segmentally
novel, they could not have corresponded to
any existing ‘‘lexical’’ representation; hence
their cohesiveness was not related to processes
09-25-96 11:50:42
jmla
AP: JML
678
JAMES L. MORGAN
of lexical access. Thus, the explanation preferred by Cutler and Clifton for their adult
findings is not applicable to infants’ processing of syllable strings.
Method
Subjects. Infants approximately 9 months
of age were recruited from the Providence
metropolitan area. Thirty-two infants, approximately half male and half female, served as
subjects; 16 infants were assigned at random
to each of the two stimulus conditions. Subjects had a mean age of 0;9.17 at the first
session (range 0;9.9 to 0;10.4). The two sessions required for each infant were scheduled
so that the time between the two sessions was
no more than 10 days.
Forty-nine infants were tested in order to
attain the final sample; infants were excluded
from the study for the following reasons: failure to complete initial shaping within 40 trials
or to meet the predetermined training criterion
within 30 trials in the initial session (11), difficulties scheduling subsequent testing sessions within a 10 day period (2), and failure
to contribute at least two latencies to each cell
in the analysis (4).
Apparatus. The apparatus used in this study
was identical to that used in Experiments 1
and 2.
Stimuli. The stimuli used in this experiment included two digitally edited tokens
each of [ko] and [ga], and one token each of
[de] and [ti], originally spoken in isolation
with a flat intonation contour by a voicetrained adult female. Shorter versions of [ko]
and [ga] (henceforth denoted by [kò] and
[gà]) were 350 ms, whereas longer versions
of [ko] and [ga] (henceforth denoted by [kó]
and [gá]) were 500 ms in duration. Shorter
and longer versions of these syllables did
not differ in either amplitude or fundamental
frequency contour. The tokens of [de] and
[ti] were edited to be 425 ms in duration.
See Morgan (1994) for additional details
concerning stimulus properties.
Subjects heard the syllables in trisyllabic
strings. Each string included one token of [kò]
or [kó], one token of [gà] or [gá], and one
AID
JML 2447
/
a004$$$$$4
token of either [de] or [ti]. The tokens of [ko]
and [ga], which constituted the key pair, were
always adjacent to one another and always
occurred in the same relative order. The tokens of [de] and [ti] served as filler syllables
and always appeared in either initial or final
position. Within the strings, individual syllables were separated by 50 ms silences; strings
themselves were separated by 1000 ms silences.
Two between-subject stimulus conditions
were employed. Each condition included a
training set of four trisyllabic strings, across
which the key syllable pair consistently manifested one of the two rhythmic patterns used.
This set of strings was used in initial training,
as background stimuli between test trials, and
as test stimuli in half of the test trials (see
‘‘Procedure’’ for additional details). The
training set for the opposite condition provided test stimuli for the remaining test trials.
Stimulus conditions are described below and
illustrated in Table 4.
In the Iambic Training condition, the rhythmic pattern across the key pair of syllables in
the training set of trisyllabic strings was iambic: a shorter syllable followed by a longer
syllable. Across this set of strings, both this
consistent rhythmic pattern and distributional
(or sequential) information contributed evidence that the pair of syllables constituted a
unit. The distributional evidence also signaled
that [de] and [ti] each formed independent
units. Following training, infants were tested
on strings with key pairs manifesting either
iambic or trochaic rhythms. Four subconditions were included, to each of which four
infants were assigned at random, and across
which key syllable order and use of [de] or
[ti] in test items were counterbalanced. These
are illustrated in Table 4. This condition tested
the cohesiveness of a trochaically accented
syllable sequence relative to a familiar, iambically accented syllable sequence.
In the Trochaic Training condition, the
rhythmic pattern over the key pair of syllables
in the training set of strings was trochaic: a
longer syllable followed by a shorter syllable.
Again, both rhythmic and distributional infor-
09-25-96 11:50:42
jmla
AP: JML
679
PREVERBAL SPEECH SEGMENTATION
TABLE 4
Trisyllabic Strings Included in Stimulus Conditions, Experiments 3 and 4a
Condition
Phase
Iambic training
Subcondition Ab
Subcondition B
Trochaic training
Subcondition A
Subcondition B
Stimulus strings
Training
Testing
gàkóde
gàkóde
Training
Testing
kògáde
kògáde
Training
Testing
gákòde
gákòde
Training
Testing
kógàde
kógàde
gàkóti
gàkóti
gákòde
kògáti
kògáti
kógàde
degàkó
degàkó
degákò
dekògá
dekògá
dekógà
gákòti
gákòti
gákòde
kógàti
kógàti
kògáde
degákò
degákò
degàkó
dekógà
dekógà
dekògá
tigàkó
tigàkó
tikògá
tikògá
tigákò
tigákò
tikógà
tikógà
a
Acute accents ( ´ ) indicate longer syllables (500 ms); grave accents ( ` ) indicate shorter syllables (350 ms); syllables
without accents are medium length (425 ms).
b
Subconditions C–D were similar to A–B, respectively, except that [ti], rather than [de], appeared in the novel
test items.
mation provided evidence that the pair of syllables constituted a unit, and distributional evidence indicated that [de] and [ti] each constituted independent units. Following training,
infants were tested on strings with key pairs
manifesting either trochaic or iambic rhythms.
Four subconditions were included, as shown
in Table 4. This condition tested the cohesiveness of an iambically accented syllable sequence relative to a familiar, trochaically accented syllable sequence.
Procedure. The procedure in the Training
Session was identical to that followed in Experiments 1 and 2. See Table 2 for illustrations
of example stimulus sequences from each
phase of this session. The procedure followed
in the Testing Session was similar to that described earlier, except that the buzz appeared
within the second string in each trial. This
modification was adopted because infants
would have no means of discerning the rhythmic pattern across the key pair of syllables at
the point the buzz occurred if the buzz were
presented within the first string. In this experiment, the rhythmic pattern manifested by the
first test string served to prime infants’ expec-
AID
JML 2447
/
a004$$$$$4
tations for the rhythmic pattern manifested by
the second test string. Example stimulus sequences illustrating each trial type in the testing phase are provided in Table 5.
Results and Discussion
As in Experiments 1 and 2, a preliminary
set of Stimulus Condition by Familiarity by
Position omnibus analyses were conducted on
the difference scores. In this experiment, however, Position did interact with other factors;
specifically, the Stimulus Condition by Position interaction was significant for novel stimuli, F(1,30) Å 6.74, p õ .05. In light of this,
subsequent analyses of responses to novel
stimuli were carried out separately for first
and second inter-syllable interval trials.
Means and SEs of difference scores for familiar stimuli collapsed across levels of Position,
and for novel stimuli within each level of Position are shown in Fig. 4.
Familiar stimuli. Eleven of 16 subjects in
the Iambic Training condition responded more
slowly on average to within-unit buzzes than
to between-units buzzes; 14 of 16 subjects in
the Trochaic Training condition did likewise.
09-25-96 11:50:42
jmla
AP: JML
680
JAMES L. MORGAN
TABLE 5
EAMPLES
OF
TESTING TRIALS
IN
EXPERIMENTS 3
AND
a
4
Trial stimulib
Trial type
Familiar
Within-unit
first-interval
Within-unit
second-interval
Between-units
first-interval
Between-units
second-interval
gákòde gáfkòde gákòde
degákò degáfkò degákò
degákò defgákò degákò
gákòde gákòfde gákòde
Novel
Within-unit
first-interval
Within-unit
second-interval
Between-units
first-interval
Between-units
second-interval
gàkóde gàfkóde gákòde
degàkó degàfkó degákò
degàkó defgàkó degákò
gàkóde gàkófde gákòde
a
Possible sequences are given here only for Subcondition A of the Trochaic Training condition, in which [gákò]
served as the key syllable pair in training, and all strings
in test trials included [de] as the filler syllable.
b f
Indicates the location of the buzz within a trial.
Overall, subjects required 588 ms (SD Å 105
ms) to initiate head turns on familiar withinunit trials, and 526 ms (SD Å 93 ms) to initiate
head turns on familiar between-units trials.
Both groups of subjects displayed significant
positive difference scores: for Iambic Training, t(15) Å 2.39, p õ .05, and for Trochaic
Training, t(15) Å 2.74, p õ .05. Difference
scores did not vary significantly across the two
stimulus conditions, t(30) Å 0.43, NS. This
pattern of results is similar to those of Experiments 1 and 2, and again suggests that no
processing advantage exists for familiar trochaic over familiar iambic sequences.
First position novel stimuli. Twelve of 16
subjects in the Iambic Training condition (for
whom novel stimuli were trochaic) responded
more slowly on average to first interval
within-unit buzzes than to first interval between-units buzzes. However, only 6 of 16
subjects in the Trochaic Training condition
(for whom novel stimuli were iambic) did
AID
JML 2447
/
a004$$$$$4
likewise. Subjects in the Iambic Training condition required 604 ms (SD Å 112 ms) to initiate head turns on within-unit trials, and 545
ms (SD Å 89 ms) to initiate head turns on
between-units trials. Subjects in the Trochaic
Training condition required 574 ms (SD Å
102 ms) to initiate head turns on within-unit
trials, and 598 ms (SD Å 78 ms) to initiate
head turns on between-units trials. Neither
mean difference score was significantly
greater than zero: for Iambic Training, t(15)
Å 1.85, p Å .08, and for Trochaic Training,
t(15) Å 01.11, NS. However, difference
scores did vary significantly across the two
conditions, t(30) Å 2.16, p õ .05. Further evidence for different patterning of results across
conditions for familiar stimuli versus first-position novel stimuli was provided by a significant Stimulus Condition by Familiarity interaction in the omnibus analysis of difference
scores, F(1,30) Å 4.76, p õ .05. Responses
to first-position novel stimuli fit the prediction
that infants would perceive mis-stressed versions of the bisyllables to be more cohesive
when they were mis-stressed trochaically than
when they were mis-stressed iambically.
FIG. 4. Nine-month-olds. Means and standard errors of
differences in head turning latencies in Experiment 3 to
buzzes occurring inside versus outside key syllable pairs
for familiar and novel test stimuli. Large positive differences are consistent with perception of the syllable pairs
as cohesive units.
09-25-96 11:50:42
jmla
AP: JML
681
PREVERBAL SPEECH SEGMENTATION
Second position novel stimuli. Only 7 of
16 subjects in each condition responded more
slowly on average to second interval withinunit buzzes than to second interval betweenunits buzzes. Overall, subjects required 521
ms (SD Å 99 ms) to initiate head turns on
within-unit trials, and 526 ms (SD Å 86 ms)
to initiate head turns on between-units trials.
Neither mean difference score was significantly greater than zero: for both conditions,
t(15) õ 0, NS, nor did difference scores vary
significantly across the two conditions, t(30)
Å 0.56, NS.
Responses on second-position novel stimuli
clearly do not fit the prediction that misstressed trochaic stimuli would be perceived
as being more cohesive that mis-stressed iambic stimuli. A likely explanation for this
hinges on the design variation that was introduced in this experiment, namely, that buzzes
occurred within the second string in each trial.
On novel trials, the key syllable pair in the
first string manifested the unfamiliar rhythmic
pattern. This was done to prime infants to expect the unfamiliar rhythmic pattern in the second string, but it may well have also cued
infants that a trial was about to occur (the
change in rhythm was a perfect cue in this
regard). In corroboration of this, note that infants’ latencies were very short in novel trials
in this experiment, particularly in second-interval trials. Possibly, anticipatory responding
may have washed out effects of stimulus
rhythm in these trials. Including a number of
no-buzz catch trials might have reduced such
anticipatory responding; unfortunately, such
trials would also have reduced the novelty of
the unfamiliar rhythmic pattern.
Results from both familiar trials and firstinterval novel trials, however, do fit the predicted patterns. Data from the familiar trials
provide additional evidence that familiar trochaic and iambic sequences are processed
similarly by 9-month-olds—neither is perceived as being more or less cohesive than the
other. Data from the first-interval novel trials
comport with the findings of Experiment 1:
For 9-month-old English-exposed infants,
novel trochaic sequences (whether novel by
AID
JML 2447
/
a004$$$$$4
virtue of a rhythmic change or a segmental
change) are perceived as being more cohesive
than are comparable novel iambic sequences.
EXPERIMENT 4
The results of Experiment 2 suggested that,
in tasks comparable to that used here, 6month-old infants may get locked on to rhythmic properties of speech stimuli to the extent
that they fail to attend to segmental changes
in those stimuli. On this view, 6-month-olds
in Experiment 2 perceived segmentally novel
syllable pairs (manifesting a familiar rhythm)
to be as cohesive as segmentally familiar pairs
because they overlooked their novelty. When
rhythmic changes are introduced, however, a
different pattern of results should emerge: 6month-olds may be likely to perceive segmentally familiar syllable pairs as lacking cohesiveness when they appear in a novel rhythm.
If 6-month-olds process trochaic and iambic
rhythms in similar fashion, as previous results
from Jusczyk, Cutler, and Redanz (1993a), as
well as results from Experiment 2, suggest,
then changes from trochaic to iambic, or iambic to trochaic, should yield similar patterns
of responses.
Method
Subjects. Infants approximately 6 months
of age were recruited from the Providence
metropolitan area. Thirty-two infants, approximately half male and half female, served as
subjects; 16 infants were assigned at random
to each of the two stimulus conditions. Subjects had a mean age of 0;6.22 at the first
session (range 0;6.10 to 0;7.6). The two sessions required for each infant were scheduled
so that the time between the two sessions was
no more than 10 days.
Seventy-one infants were tested in order to
attain the final sample; infants were excluded
from the study for the following reasons: failure to complete initial shaping within 40 trials
or to meet the predetermined training criterion
within 30 trials in the initial session (18), difficulties scheduling subsequent testing sessions within a 10 day period (8), and failure
09-25-96 11:50:42
jmla
AP: JML
682
JAMES L. MORGAN
FIG. 5. Six-month-olds. Means and standard errors of
differences in head turning latencies in Experiment 4 to
buzzes occurring inside versus outside key syllable pairs
for familiar and novel test stimuli. Large positive differences are consistent with perception of the syllable pairs
as cohesive units.
to contribute at least two latencies to each cell
in the analysis (13).
Apparatus, stimuli, and procedure. The apparatus, stimuli, and procedure used in this
study were identical to those used in Experiment 3.
Results and Discussion
As in Experiments 1 and 2, preliminary
analyses showed no significant effects involving the Position factor. Thus, treatment means
were collapsed across levels of Position, and
this factor was not used in subsequent analyses. Means and SEs of difference scores for
familiar and novel stimuli for each of the stimulus conditions are shown in Fig. 5.
Familiar stimuli. Ten of 16 subjects in the
Iambic Training condition, and 11 of 16 subjects in the Trochaic Training condition, responded more slowly on average to withinunit buzzes than to between-units buzzes.
Overall, subjects required 627 ms (SD Å 90
ms) to initiate head turns on familiar withinunit trials, and 560 ms (SD Å 97 ms) to initiate
head turns on familiar between-units trials. A
significant positive difference score was found
for the Trochaic Training condition, t(15) Å
AID
JML 2447
/
a004$$$$$4
2.33, p õ .05, but not for the Iambic Training
condition, t(15) Å 1.87, p Å .08. Difference
scores did not vary significantly across the two
stimulus conditions, t(30) Å 0.42, NS.
Novel stimuli. Only 6 of 16 subjects in the
Iambic Training condition and 7 of 16 subjects
in the Trochaic Training condition responded
more slowly on average to within-unit buzzes
than to between-units buzzes. Overall, subjects required 695 ms (SD Å 94 ms) to initiate
head turns on novel within-unit trials, and 682
ms (SD Å 77 ms) to initiate head turns on
novel between-units trials. Mean difference
scores were not significantly different than
zero for either condition; in both cases, t(15)
õ 1, NS.
Six-month-old English-exposed infants appear to be equally affected by any change in
rhythm, whether it is from trochaic to iambic,
or from iambic to trochaic. Regardless of the
direction of change, 6-month-olds apparently
fail to perceive novelly accented bisyllables
as being cohesive, even though the segmental
content of the bisyllable is familiar. In contrast
to Experiment 2, it is clear that introducing
novel stimulus rhythms was an effective manipulation, as evidenced both by infants’ different patterns of responding to familiar and
novel rhythms and by infants’ relatively long
response latencies to stimuli with novel
rhythms. The results of this study are consistent with both the results of Experiment 2 and
those of Jusczyk, Cutler, and Redanz (1993a),
showing that at 6 months trochaic rhythm is
not privileged in speech segmentation.
GENERAL DISCUSSION
The results of the present set of studies support three conclusions. First, any initial biases
6- and 9-month-old infants may have for preferentially extracting or processing trochaic
versus iambic syllable sequences are neutralized by sufficient exposure to those sequences.
In all four experiments, infants provided
equivalent evidence of having represented familiar trochaic or iambic bisyllables as cohesive units. Repeated exposure to the bisyllables used in training may have induced infants to form a rudimentary sort of ‘‘lexical
09-25-96 11:50:42
jmla
AP: JML
683
PREVERBAL SPEECH SEGMENTATION
entry’’ for the familiar bisyllable; on any account, segmentation must rely at least in part
on matching of input to existing lexical representations. On this view, it is not surprising
to find that such matching may override initial
processing biases.
Second, at 6 months, infants do not appear
to process novel trochaic bisyllables differently than novel iambic bisyllables. This result
is consistent with Jusczyk, Cutler, and Redanz’s (1993) failure to find any rhythmic
preference in 6-month-olds. At 6 months (at
least in the sort of task used here), infants
do appear to be biased to attend to rhythmic
properties, rather than segmental properties, of
speech stimuli. Thus, 6-month-olds provided
evidence of having represented both novel trochaic and novel iambic bisyllables as cohesive
units, when the bisyllables were novel by virtue of segmental changes. However, when the
bisyllables were novel by virtue of rhythmic
changes, 6-month-olds provided evidence that
they represented neither novel trochaic nor
novel iambic bisyllables as cohesive units.
Third, at 9 months, infants are biased toward perceiving novel trochaic bisyllables as
forming more cohesive units than novel iambic bisyllables. This bias is evident whether
the bisyllables are novel by virtue of segmental changes (as in Experiment 1) or by virtue
of rhythmic changes (as in Experiment 3).
These results are consistent with Jusczyk, Cutler, and Redanz’s (1993) finding of a preference for trochaic bisyllables at 9 months and
suggest that English-learning infants may
adopt a metrical strategy for segmentation before the end of the first year.
How might infants come to adopt a trochaic
bias? One possibility is that this bias reflects
a universal stage of language acquisition. Such
a hypothesis has been advanced with regard
to children’s early speech production by Allen
and Hawkins (1980) and recently echoed, in
part, by Wijnen, Krikhaar, and den Os (1994).
However, analyses of early production across
a broad range of languages provide evidence
that such productions are conditioned by metrical properties of specific input languages
(Demuth, 1996). No evidence yet exists to
AID
JML 2447
/
a004$$$$$5
conclusively rule out the possibility of an
early, universal perceptual bias in processing
of trochaic sequences. Such evidence might
be obtained by testing infants exposed to languages in which initial stress in multisyllabic
words is exceptional (such as French or Hebrew) under the procedures used here. If the
present results are indeed due to 9-montholds’ exposure to English, then 9-month-old,
Hebrew-exposed infants ought to yield complementary patterns of responses, perceiving
novel iambic sequences as constituting more
cohesive units than novel trochaic sequences.
In the absence of such evidence, it still
seems most plausible that the bias observed
has its roots in infants’ linguistic experience.
Certainly, it would be a most curious coincidence if an innate bias for processing metrical
sequences would manifest itself over just the
same period that changes in perception of segmental properties of speech begin to reflect
linguistic experience. Evidence for such
changes has been provided by Best (1993);
Best, McRoberts, and Sithole (1988); Jusczyk,
Friederici, Wessels, Svenkerud, and Jusczyk
(1993); Jusczyk, Luce, and Charles-Luce
(1994); Kuhl, Williams, Lacerda, Stevens, and
Lindblom (1992); and Werker and Tees
(1984). The assumption that the trochaic bias
is not innate, however, requires an account of
how experience (e.g., with English) might
give rise to this bias.
Perhaps the simplest account is that infants
might generalize the properties of primitive
perceptual units to word-like units. Utterances—stretches of speech bounded by silences—are suitable candidates for primitive
units; note that boundaries of utterances necessarily coincide with word boundaries (except in instances of dysfluencies, which are
quite rare in child-directed speech). On this
explanation, if utterances typically begin with
strong syllables and end with weak syllables,
it would be reasonable for learners to hypothesize that smaller units within utterances begin
and end with comparable types of syllables.
Strategies that involve scanning from one
edge of a prosodic unit to the other, such as
that proposed in Dresher and Kaye (1990) for
09-25-96 11:50:42
jmla
AP: JML
684
JAMES L. MORGAN
TABLE 6
METRICAL PROPERTIES
OF
PERIPHERAL SYLLABLES
IN
MATERNAL UTTERANCES
All utterances
Two-syllable utterances
Child
N
Strong–weak
Weak–strong
N
Strong–Weak
Weak–strong
Adam
Eve
18148
9230
11.7%
15.3%
32.8%
31.3%
3767
1755
43.3%
27.7%
30.4%
47.0%
setting of the Foot Rhythm parameter, will
yield comparable results. If such parameters
are to be set before an extensive lexicon has
been established, the unit that is scanned will
generally not be equivalent to a word, but will
more likely be an utterance.
However, the syntactic structure of English
militates against input utterances manifesting
the metrical pattern desired under this class of
explanation: Phrases (and hence utterances)
often begin with function-word specifiers
(such as determiners, which are typically weak
syllables) and often end with content-word
heads. A computational analysis of maternal
input to Adam and Eve in the Brown corpora
(Brown, 1973; MacWhinney & Snow, 1985)
shows that 64% of maternal utterances began
with a function word and 66% of maternal
utterances ended with a content word, of
which the overwhelming preponderance were
monosyllabic. As shown in Table 6, further
analyses of accent patterns reveal that input
utterances do not generally begin and end with
strong and weak syllables, respectively,
whether all utterances, or only bisyllabic utterances, are considered.2 Thus, if children attempted to extrapolate from properties of ut2
These analyses were based on orthographic transcriptions and therefore provide only approximate characterizations of speech children hear. For purposes of these
analyses, all monosyllabic function words were considered to be weak syllables, all other monosyllabic words
were considered to be strong syllables, and all multisyllabic words were considered to be stressed in accordance
with their citation forms. Function words in utterancefinal position are often realized as strong syllables: What’s
thát or Put it dówn. Therefore, the ratio of SW to WS
utterances is probably somewhat smaller than indicated
by the data in Table 6.
AID
JML 2447
/
a004$$$$$5
terances to properties of words, they might in
some cases be inclined to adopt an iambic
bias, which apparently does not occur.
An alternative possibility is that the trochaic
bias in infants learning English emerges from
the initial stages of word-learning. On this
view, infants begin with a general expectation
that rhythm will play a role in defining linguistic units (and perhaps some notion that the
binary stress foot may be a significant rhythmic unit). Evidence of early attention to rhythmic properties of speech (e.g., Mehler, Jusczyk, Lambertz, Halsted, Bertoncini, & AmielTison, 1988) supports this contention. An
appropriate rhythmic bias (for trochaic rather
than iambic stress feet) then arises as a result
of early, limited distributional analyses of input. Infants may occasionally extract bisyllables, some iambic, some trochaic, from input
speech to entertain as possible words. If the
two syllables in question occur together with
some frequency in subsequent input, and if
neither of the two syllables occurs in combination with large numbers of different syllables,
then the putative unit will be retained; otherwise, it will be dropped. Trochaic bisyllables
that an infant acquiring English might extract
are likely to occur again in the input, inasmuch
as English bisyllabic words (or word-stems)
tend to be trochaic. Some iambic bisyllables
that an infant might extract, such as thedog,
will also occur repeatedly in the input. However, the first syllables of these will often be
function words and will thus typically occur in
combination with very many different second
syllables, a pattern supporting their analysis
as independent linguistic elements.
An analysis of bisyllabic utterances that re-
09-25-96 11:50:42
jmla
AP: JML
685
PREVERBAL SPEECH SEGMENTATION
curred in maternal speech to Adam and Eve
(either in isolation or within larger utterances)
supports the plausibility of a distributional
strategy.3 Virtually all recurring two-word bisyllabic utterances included at least one function word; sufficient evidence existed for these
to be analyzed as two-element sequences. In
recurring one-word bisyllabic utterances, trochaic types outnumbered iambic types by ratios of 5 to 1 for Adam and 9 to 1 for Eve.
In essence, this strategy harnesses the information available in the input lexicon. For example, in maternal utterances to both Adam
and Eve, 85% of bisyllabic word tokens, and
92% of bisyllabic word types, manifested
strong–weak rhythms. Note that use of babytalk forms in English, such as doggy or kitty,
in which monosyllabic nouns are converted to
canonical strong–weak bisyllables, causes the
strong–weak bias found in adult-directed
speech to be exaggerated in child-directed
speech. Ferguson (1964) observed that comparable affixes are found in baby talk registers
of a variety of languages. In other languages,
such as French, monosyllables are reduplicated to produce baby-talk forms exhibiting
canonical metrical patterns. In short, a distributional analysis of English input proceeding
from initial bisyllabic segmentations will tend
to reinforce a trochaic bias and discourage an
iambic bias, whereas a comparable distributional analysis of, say, Hebrew, would have
the opposite result.
Changes in perception of segmental properties of speech reported by Werker and Tees
(1984), Jusczyk, Friederici, Wessels, Svenkerud, and Jusczyk (1993), and others are also
traceable to infants’ distributional analyses of
input. On this view, the appearance of a trochaic bias appears contemporaneously with
these other changes in speech perception because they all result from the same underlying
process. Morgan and Saffran (1995) argue that
distributional analyses rest on the ability to
integrate disparate forms of information; thus,
3
This analysis included bisyllabic utterances that recurred within the same recording session and covered
transcripts 1–10 for each child.
AID
JML 2447
/
a004$$$$$5
the appearance of these changes in speech perception at 9 or 10 months may ultimately rest
on broader changes in infants’ cognition occurring at this epoch.
Regardless of how it may originate, infants’
use of a trochaic strategy for segmentation
ought to have implications for how acquisition
proceeds. An obvious prediction, one that is
strongly confirmed for English (Gerken, 1994)
and Dutch (Wijnen, Krikhaar, & den Os,
1994), is that infants’ early words should predominately conform to the strong–(weak)
rhythmic pattern. These data are ambiguous,
however; this pattern is equally well explained
by appeals to the nature of the lexicons of the
languages in question or to biases in speech
production. Somewhat more subtly, use of a
metrical strategy for segmentation may influence children’s acquisition of grammatical
morphemes: Frequent appearance in metrical
environments from which they are easily segmented should facilitate acquisition of particular morphemes.
Consider, for example, the relationship between parental yes–no questions and children’s acquisition of auxiliary verbs. Several
studies have found a significant positive relationship for children learning English (Furrow, Nelson, & Benedict, 1979; Gleitman,
Newport, & Gleitman, 1984; Newport, Gleitman, & Gleitman, 1977; Scarborough & Wycoff, 1986); indeed, this is the single most
robust finding in the literature on input and
acquisition. Newport, Gleitman, and Gleitman
advanced one explanation for this relationship, suggesting that normally unstressed auxiliaries become salient in initial position in
yes–no questions because they are stressed.
However, no acoustic measurements of utterance-initial auxiliaries in child-directed
speech have supported this claim.
The metrical segmentation strategy hypothesis provides an alternative explanation for the
relation between parental yes–no questions
and children’s acquisition of auxiliaries: Utterance-initial auxiliaries contribute to acquisition because they are easily segmentable
(more generally, functors consisting of single
weak syllables will be most easily segmented
09-25-96 11:50:42
jmla
AP: JML
686
JAMES L. MORGAN
when they appear in utterance-initial position). The leading edge of an utterance-initial
auxiliary will be readily identifiable because
it will typically follow a pause (Broen, 1972).
The second word of the sentence may begin
with a strong syllable or a weak syllable. If it
begins with a strong syllable, under the metrical segmentation strategy, this syllable will be
taken as signaling a word boundary, and the
initial auxiliary will fall out as residue. If the
second word begins with a weak syllable, the
metrical segmentation strategy will provide no
impetus to group it with the utterance-initial
(weak) auxiliary, and the initial auxiliary may
again fall out as residue. Contrast this with
the segmentability of utterance-medial auxiliaries: Under the metrical segmentation strategy, such auxiliaries, as weak syllables, will
tend to be aggregated with the most closely
preceding strong syllable and will not be
readily identifiable as individual words. Note
that this explanation predicts that frequent appearance in utterance-initial position may facilitate acquisition of other normally unstressed grammatical morphemes.
A trochaic strategy, of course, cannot provide a complete account of early speech segmentation. Not all trochees are words, nor are
all words trochaic. As noted above, matching
of input sequences to existing (lexical) representations must also play a role; clearly, recognition of sequences of speech sounds as exemplars of lexical entries (which after all is the
goal which segmentation serves) must involve
such matching. In establishing a lexicon, or
learning new words, infants might be able to
apply a trochaic strategy as a means of initially
isolating sequences of speech sounds that have
a high probability of constituting English
words. But whether a sequence of sounds is
actually a word also depends on its distributional regularities: The sequence can constitute a word only if it occurs in a predictable
form in a variety of contexts. On this view,
application of a trochaic strategy can help to
maximize the efficiency (by limiting the
scope) of the required distributional/sequential analysis ultimately required to determine
AID
JML 2447
/
a004$$$$$5
whether the isolated sequences of sounds are
words.
Recent work in our laboratory suggests that
additional biases, relating to phonotactic properties of the native language, may appear before the end of the first year: At 10 months,
infants perceive bisyllables with common syllable-to-syllable transitions (e.g., ‘‘monkey’’)
as more cohesive than bisyllables with infrequent syllable-to-syllable transitions (e.g.,
‘‘reptile’’). Whether and how a trochaic segmentation strategy might interact with other
segmentation biases is unknown at present.
Understanding how early speech segmentation proceeds remains an important problem,
because it is the representations resulting from
segmentation that provide the data for children’s discovery of the structure and meaning
of their native language.
REFERENCES
ABRAMS, K., & BEVER, T. (1969). Syntactic structure
modifies attention during speech perception and recognition. Quarterly Journal of Experimental Psychology, 21, 280–290.
ALLEN, G., & HAWKINS, S. (1980). Phonological rhythm:
Definition and development. In G. Yeni-Komshian,
J. Kavanagh, & C. Ferguson (Eds.), Child Phonology, Vol. 1 (pp. 227–256). New York: Academic
Press.
BARD, E., & ANDERSON, A. H. (1983). The unintelligibility of speech to children. Journal of Child Language,
10, 265–292.
BEST, C. T. (1993). Emergence of language-specific constraints in perception of non-native speech: A window on early phonological development. In B. de
Boysson-Bardies, S. de Schoen, P. Jusczyk, P.
McNeilage & J. Morton (Eds.), Developmental neurocognition: Speech and face processing during the
first year of life (pp. 289–304). Dordrecht: Kluwer.
BEST, C. T., MCROBERTS, G. W., & SITHOLE, N. M.
(1988). Examination of the perceptual reorganization
for speech contrasts: Zulu click discrimination by
English-speaking adults and infants. Journal of Experimental Psychology: Human Perception and Performance, 14, 345–360.
BOND, Z. S. (1972). Phonological units in sentence perception. Phonetica, 25, 129–139.
BROEN, P. (1972). The verbal environment of the language-learning child. Monograph of the American
Speech and Hearing Association (No. 17).
BROWN, R. (1973). A first language. Cambridge, MA:
Harvard Univ. Press.
CHOMSKY, N., & HALLE, M. (1968). The sound pattern
of English. New York: Harper & Row.
09-25-96 11:50:42
jmla
AP: JML
687
PREVERBAL SPEECH SEGMENTATION
COLE, R. A., & JAKIMIK, J. (1978). Understanding speech:
How words are heard. In G. Underwood (Ed.), Strategies of information processing (pp. 67–116). New
York: Academic Press.
CUTLER, A. (1990). Exploiting prosodic probabilities in
speech segmentation. In G. T. M. Altman (Ed.), Cognitive models of speech processing (pp. 105–121).
Cambridge, MA: MIT Press.
CUTLER, A. (1996). Prosody and the word boundary problem. In J. L. Morgan & K. Demuth (Eds.), Signal to
syntax (pp. 87–99). Mahwah, NJ: Erlbaum.
CUTLER, A., & BUTTERFIELD, S. (1992). Rhythmic cues
to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31,
218–236.
CUTLER, A., & CARTER, D. M. (1987). The predominance
of strong initial syllables in the English vocabulary.
Computer Speech and Language, 2, 133–142.
CUTLER, A., & CLIFTON, C. (1984). The use of prosodic
information in word recognition. In H. Bouma & D.
G. Bouwhuis (Eds.), Attention and performance X
(pp. 183–196). Hillsdale, NJ: Erlbaum.
CUTLER, A., MEHLER, J., NORRIS, D., & SEGUI, J. (1986).
The syllable’s differing role in the segmentation of
French and English. Journal of Memory and Language, 25, 385–400.
CUTLER, A., & NORRIS, D. G. (1988). The role of strong
syllables in segmentation for lexical access. Journal
of Experimental Psychology: Human Perception and
Performance, 14, 113–121.
DEMANY, L., MACKENZIE, B., & VURPILLOT, E. (1977).
Rhythmic perception in early infancy. Nature, 266,
718–719.
DEMUTH, K. (1996). The prosodic structure of early
words. In J. L. Morgan & K. Demuth (Eds.), Signal
to syntax (pp. 171–184). Mahwah, NJ: Erlbaum.
DRESHER, B. E., & KAYE, J. D. (1990). A computational
learning model for metrical phonology. Cognition,
34, 137–195.
FERGUSON, C. (1964). Baby talk in six languages. American Anthropologist, 66, 103–114.
FLORES D’ARCAIS, G. B. (1978). The perception of complex sentences. In W. J. M. Levelt & G. B. Flores
d’Arcais (Eds.), Studies in the perception of language. New York: Wiley.
FODOR, J. A., BEVER, T. G., & GARRETT, M. F. (1974).
The psychology of language. New York: McGraw–
Hill.
FURROW, D., NELSON, K., & BENEDICT, H. (1979). Mothers’ speech to children and syntactic development:
Some simple relationships. Journal of Child Language, 6, 423–442.
GERKEN, L. A. (1994). Young children’s representation
of prosodic phonology: evidence from Englishspeakers’ weak syllable omissions. Journal of Memory and Language, 33, 19–38.
GLEITMAN, L. R., NEWPORT, E. L., & GLEITMAN, H.
AID
JML 2447
/
a004$$$$$6
(1984). The current status of the motherese hypothesis. Journal of Child Language, 11, 43–79.
HOLMES, V. M., & FORSTER, K. I. (1970). Detection of
extraneous signals during sentence recognition. Perception and Psychophysics, 7, 297–301.
JUSCZYK, P. W., CUTLER, A., & REDANZ, L. (1993). Infants’ sensitivity to predominant stress patterns in
English. Child Development, 64, 675–687.
JUSCZYK, P. W., FRIEDERICI, A. D., WESSELS, J. M.,
SVENKERUD, V. Y., & JUSCZYK, A. M. (1993). Infants’ sensitivity to the sound patterns of native language words. Journal of Memory and Language, 32,
402–420.
JUSCZYK, P. W., LUCE, P., & CHARLES-LUCE, J. (1994).
Infants’ sensitivity to phonotactic patterns in the native language. Journal of Memory and Language, 33,
630–645.
JUSCZYK, P. W., & THOMPSON, E. (1978). Perception of
a phonetic contrast in multisyllabic utterances by 2month-old infants. Perception and Psychophysics,
23, 105–109.
KUHL, P. K. (1985). Methods in the study of infant speech
perception. In G. Gottleib & N. Krasnegor (Eds.),
Measurement of audition and vision in the first year
of postnatal life: A methodological overview. Norwood, NJ: Ablex.
KUHL, P. K., WILLIAMS, K. A., LACERDA, F., STEVENS,
K. N., & LINDBLOM, B. (1992). Linguistic experience
alters phonetic perception in infants by 6 months of
age. Science, 255, 606–608.
MACWHINNEY, B., & SNOW, C. E. (1985). The child language data exchange system. Journal of Child Language, 12, 271–296.
MEHLER, J., JUSCZYK, P., LAMBERTZ, G., HALSTED, N.,
BERTONCINI, J., & AMIEL-TISON, C. (1988). A precursor of language acquisition in young infants. Cognition, 29, 143–178.
MORGAN, J. L. (1994). Converging measures of speech
segmentation in preverbal infants. Infant Behavior
and Development, 17, 389–403.
MORGAN, J. L., & SAFFRAN, J. R. (1995). Emerging integration of sequential and suprasegmental information
in preverbal speech segmentation. Child Development, 66, 911–936.
NAKATANI, L. H., & SCHAFFER, J. A. (1978). Hearing
‘‘words’’ without words: Prosodic cues for word perception. Journal of the Acoustical Society of
America, 63, 234–245.
NEWPORT, E. L., GLEITMAN, H., & GLEITMAN, L. R.
(1977). Mother, I’d rather do it myself: Some effects
and non effects of maternal speech style. In C. E.
Snow & C. A. Ferguson (Eds.), Talking to children:
Language input and acquisition. Cambridge: Cambridge Univ. Press.
OTAKE, T., HATANO, G., CUTLER, A., & MEHLER, J.
(1993). Mora or syllable? Speech segmentation in
Japanese. Journal of Memory & Language, 32, 358–
378.
09-25-96 11:50:42
jmla
AP: JML
688
JAMES L. MORGAN
POLLACK, I., & PICKETT, J. M. (1964). The unintelligibility of excerpts from conversations. Language &
Speech, 6, 165–171.
RATNER, N. B. (1985). Dissociations between vowel durations and formant frequency characteristics. Journal
of Speech and Hearing Research, 28, 255–264.
SCARBOROUGH, H., & WYCOFF, J. (1986). Mother, I’d
still rather do it myself: some further noneffects of
‘motherese.’ Journal of Child Language, 13, 431–
438.
SELKIRK, E. O. (1984). Phonology and syntax. Cambridge, MA: MIT Press.
SPRING, D. R., & DALE, P. S. (1977). Discrimination of
linguistic stress in early infancy. Journal of Speech &
Hearing Research, 20, 224–232.
AID
JML 2447
/
a004$$$$$6
TREHUB, S. E., & THORPE, L. A. (1989). Infants’ perception of rhythm: Categorization of auditory sequences
by temporal structure. Canadian Journal of Psychology, 43, 217–229.
WERKER, J., & TEES, R. C. (1984). Cross-language speech
perception: Evidence for perceptual reorganization
during the first year of life. Infant Behavior and Development, 7, 49–63.
WIJNEN, F., KRIKHAAR, E., & DEN OS, E. (1994). The
(non)realization of unstressed elements in children’s
utterances: evidence for a rhythmic constraint. Journal of Child Language, 21, 59–83.
(Received July 15, 1994)
(Revision received August 11, 1995)
09-25-96 11:50:42
jmla
AP: JML