Laboratory Phonology 11
Kabak, Maniwa & Kazanina
Listeners Use Vowel Harmony and Word-Final Stress to Spot
Nonsense Words: A Study of Turkish and French
Barış Kabak*, Kazumi Maniwa* & Nina Kazanina#
*Department of Linguistics, University of Konstanz, Germany;
[email protected], [email protected]
#Department of Experimental Psychology, University of Bristol, UK; [email protected]
Speakers’ knowledge of sound distributions and rhythmic alternations that systematically characterize
wordhood in individual languages is known to aid word segmentation. Vowel harmony is one such regularity:
it dictates a set of co-occurrence restrictions on vowel features within a word. In Finnish and Turkish, for example,
all vowels within a word must agree on the front-back dimension. In these languages, opposite values of the
front/back feature on adjacent vowels automatically signal a word boundary, as disharmony is not expected
within single words. Accordingly, Finnish speakers detect target words faster when the preceding syllable
contains a vowel that differs on the front-back dimension from the vowels in the target (Suomi et al., 1997).
Likewise, the culminative nature of accent, which requires that every lexical word have one primary stress, is
also known to aid speech segmentation. When primary stress is fixed to a particular position that
demarcates word boundaries, as in languages with word-initial or word-final stress, it may provide the
language user with invaluable cues to word boundaries. This idea found support in a previous study
which reported a facilitatory effect of word-initial stress in Finnish (Vroomen et al., 1998). Since primary
stress overlaps with the beginning of words in this language, it is difficult to know whether facilitation
effects are due to (i) the demarcative function of stress per se, which prompts a word boundary before the
stressed syllable, or (ii) the well-known primacy of word onsets in general. Instead, we test Turkish and
French, where stress typically falls on the word-final syllable, and thus separate the demarcative function of
stress from the primacy of word onsets. We demonstrate that listeners employ word-final stress cues to
progressively postulate an upcoming word boundary. Furthermore, we show that detection of a vowel
harmony mismatch, which unlike word-final stress constitutes a regressively operating cue for a word
boundary, is robustly exploited only by Turkish listeners. This finds a straightforward explanation since
Turkish, but not French, has front-back vowel harmony. Thus, we show that listeners can exploit abstract
phonological regularities in their native language to segment even nonsense words.
We conducted a target-detection task that employed a 2x2x2 design with the factors language
(Turkish/French), stress (stress2/stress3) and harmony (match/mismatch). Participants heard a 5-syllable
CVCVCVCVCV auditory string that consisted of a trisyllabic pre-target string and a disyllabic target (Table
1). The pre-target string and the target were both nonwords in Turkish and French, and both were harmonic, i.e.
each contained only front or only back vowels. In half of the cases the pre-target and the target
matched on the frontness/backness dimension, so that their concatenation contained only front vowels or only back
vowels (the harmony-match conditions). In the remaining cases, the pre-target contained front vowels and
the target contained back vowels or vice versa (harmony-mismatch). Furthermore, the location of stress in
the pre-target was manipulated so that it fell on either the 2nd or the 3rd syllable (stress2 vs. stress3 conditions). On
each trial, the participants were prompted with a visual target, e.g. pαvo, which was then followed by an
auditory 5-syllable nonsense string, e.g. golushopαvo. The task was to determine as quickly and accurately as possible
whether the auditory string contained the visual prompt (the correct response was always ‘Yes’ for
experimental items).
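The harmony manipulation can be illustrated with a minimal sketch. It is not the authors' stimulus-construction procedure. It assumes the standard orthographic classes of Turkish vowels (front: e, i, ö, ü; back: a, ı, o, u), and the function names are hypothetical. The back-vowel item is the sample item above (with the target spelled pavo); the front-vowel pre-target used in the usage note is invented purely for illustration.

```python
# Illustrative sketch only; names and structure are assumptions.
FRONT_VOWELS = set("eiöü")  # Turkish front vowels
BACK_VOWELS = set("aıou")   # Turkish back vowels


def backness(syllables):
    """Return 'front', 'back', or 'mixed' for a list of syllables."""
    vowels = [ch for syl in syllables for ch in syl
              if ch in FRONT_VOWELS or ch in BACK_VOWELS]
    classes = {"front" if v in FRONT_VOWELS else "back" for v in vowels}
    if len(classes) == 1:
        return classes.pop()
    return "mixed"


def harmony_condition(pre_target, target):
    """Classify a pre-target + target pair as 'match' or 'mismatch'.

    Each part must be internally harmonic, as in the design above.
    """
    pre, tgt = backness(pre_target), backness(target)
    if "mixed" in (pre, tgt):
        raise ValueError("each part must be internally harmonic")
    return "match" if pre == tgt else "mismatch"


def first_disharmony(syllables):
    """Index of the first syllable whose vowel class differs from that
    of the preceding syllable, or None if the string is harmonic."""
    for i in range(1, len(syllables)):
        if backness(syllables[i]) != backness(syllables[i - 1]):
            return i
    return None
```

For example, harmony_condition(["go", "lu", "sho"], ["pa", "vo"]) classifies the sample item as a match, while the invented front-vowel pre-target ["ge", "li", "she"] combined with the same target is a mismatch, and first_disharmony points at the first syllable of the target. The disharmony becomes detectable only once the target's first vowel has been heard, which is the sense in which harmony is a regressively operating cue.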
Table 1. A sample set of conditions for the target pavo. The stressed syllables (in bold) are longer than the unstressed
syllables (240 vs. 160 ms) and differ from them in F0 range and F0 contour. Front vowels are in grey and back vowels are in black.
                   stress 2       stress 3
harmony-match      golushopαvo    golushopαvo
harmony-mismatch   golushopαvo    golushopαvo

LabPhon11 abstracts, edited by Paul Warren
Wellington, New Zealand, 30 June - 2 July 2008
Abstract accepted after review
Response times (RTs) were measured from the onset of the target in each auditory string. Thirty-two sets of
experimental materials were distributed across 4 presentation lists following a Latin Square design. Each list
also contained 224 filler items to ensure an equal proportion of ‘Yes/No’ responses across all items, an equal
number of harmonic/disharmonic targets and an equal probability of a target word occurring in different
positions within an auditory string. Given that stress in both Turkish and French signals a word boundary
immediately after the stressed syllable, identifying the target nonword should be easier in the stress3
conditions than in the stress2 conditions in both languages. In addition, targets should be detected faster
and/or more accurately in the harmony-mismatch conditions than in the harmony-match conditions in
Turkish, but not in French. Mean accuracy rates and RTs for experimental items based on 40 Turkish and 40
French speakers are summarized in Table 2. RTs below 300 ms and those that exceeded a threshold of 2.5
standard deviations above a participant’s mean RT for experimental items were replaced by the
threshold value; trials with incorrect responses were excluded from the RT analyses. Consequently, some
conditions in some sets were left with no data points, and the corresponding sets had to be excluded in
order to preserve the validity of the items analysis (1 set excluded in French, 6 sets in Turkish).
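The RT trimming step can be sketched as follows. This is one possible reading of the procedure, not the authors' analysis script. It makes three assumptions the text does not spell out: RTs below 300 ms are raised to 300 ms, the mean and standard deviation are computed per participant over the correct-response RTs before any replacement, and the sample standard deviation is used. The function name trim_rts and its parameters are hypothetical.

```python
from statistics import mean, stdev


def trim_rts(rts, floor=300.0, n_sd=2.5):
    """Replace extreme RTs (in ms) for one participant by threshold values.

    Assumptions: values below `floor` are raised to `floor`; values above
    mean + n_sd * SD are lowered to that upper threshold; the mean and
    the (sample) SD are computed on the untrimmed RTs.
    Requires at least two RTs.
    """
    upper = mean(rts) + n_sd * stdev(rts)
    return [min(max(rt, floor), upper) for rt in rts]
```

Replacing outliers by the threshold, rather than discarding them, keeps the number of observations per condition unchanged, so any empty cells in the items analysis come from the exclusion of incorrect trials rather than from trimming.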
Table 2
                     Turkish (n=40)               French (n=40)
                     % correct    RT (st.err.)    % correct    RT (st.err.)
stress 2, match      69.7 (2.6)   895 (25)        84.1 (2.1)   950 (20)
stress 2, mismatch   88.1 (1.8)   872 (21)        89.1 (1.7)   948 (21)
stress 3, match      72.2 (2.5)   772 (19)        87.8 (1.8)   914 (23)
stress 3, mismatch   94.4 (1.3)   733 (19)        90.6 (1.6)   831 (21)
Accuracy: There was no difference in accuracy rates for filler items between the Turkish and the French
groups (Turkish = 85.5%, French = 86.6%). In 2x2x2 ANOVAs on experimental items, the main effects of
language, stress, and harmony were all significant and, critically, the language x harmony interaction was
significant. 2x2 ANOVAs within each language group revealed a robust significant effect of harmony in the
Turkish group due to higher accuracy rates in the harmony-mismatch conditions than in the harmony-match
conditions (91.2 vs. 70.9%). No similarly robust effect of harmony was found in the French group (89.8 vs.
85.9%).
RTs: 2x2x2 ANOVAs showed a marginally significant language x stress interaction and a language x
stress x harmony interaction. 2x2 ANOVAs within each language group revealed a significant main effect of
stress in both language groups. In French, the stress x harmony interaction was also significant. Post-hoc
analyses (with Bonferroni correction) showed that the effect of stress was significant in both the harmony-match
and the harmony-mismatch conditions in Turkish, whereas in French it was significant only in the
harmony-mismatch conditions.
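As a descriptive cross-check, the marginal accuracy rates quoted above and the size of the stress effect on RTs can be recomputed from the cell means in Table 2. The sketch below is only arithmetic on the published means; it is not a statistical test and does not reproduce the ANOVAs. Averaging cell means matches the reported marginals only up to rounding, and the dictionary layout and function names are hypothetical.

```python
# Cell means copied from Table 2, keyed by (stress, harmony).
ACCURACY = {  # % correct
    "Turkish": {("stress2", "match"): 69.7, ("stress2", "mismatch"): 88.1,
                ("stress3", "match"): 72.2, ("stress3", "mismatch"): 94.4},
    "French": {("stress2", "match"): 84.1, ("stress2", "mismatch"): 89.1,
               ("stress3", "match"): 87.8, ("stress3", "mismatch"): 90.6},
}
RT = {  # mean RT in ms
    "Turkish": {("stress2", "match"): 895, ("stress2", "mismatch"): 872,
                ("stress3", "match"): 772, ("stress3", "mismatch"): 733},
    "French": {("stress2", "match"): 950, ("stress2", "mismatch"): 948,
               ("stress3", "match"): 914, ("stress3", "mismatch"): 831},
}


def harmony_marginal(language, harmony):
    """Mean % correct for one harmony level, averaged over stress."""
    cells = [value for (_, h), value in ACCURACY[language].items()
             if h == harmony]
    return sum(cells) / len(cells)


def stress_effect(language, harmony):
    """RT advantage (ms) of stress3 over stress2 at one harmony level."""
    cells = RT[language]
    return cells[("stress2", harmony)] - cells[("stress3", harmony)]
```

On these means, the stress3 advantage is 123 ms (match) and 139 ms (mismatch) in Turkish, against 36 ms (match) and 117 ms (mismatch) in French, which is consistent with the post-hoc pattern reported above.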
Manipulation of the position of stress yielded a significant effect on RTs in both languages. By contrast,
harmony had a robust effect only on accuracy rates in Turkish. These results support the claim that stress and
vowel harmony regularities that bear demarcative functions can facilitate speech segmentation, and that this also
applies to nonwords. They also suggest that while speakers of languages with fixed stress may be stress-deaf,
i.e. unable to robustly identify the location of stress in a word or even discriminate two words on
the basis of a differential location of stress (Dupoux et al., 1997), they can successfully use the same cue for
word segmentation. This is on a par with allophonic and durational regularities which have been shown to be
exploited by speakers in word recognition tasks but are not substantially and consistently operationalized for
identification or discrimination purposes in speech perception tasks (e.g., Whalen, Best & Irwin, 1997).
References
Dupoux, E., Pallier, C., Sebastián-Gallés, N., & Mehler, J. (1997). A destressing deafness in French. Journal of
Memory and Language, 36, 399-421.
Suomi, K., McQueen, J. M., & Cutler, A. (1997). Vowel harmony and speech segmentation in Finnish. Journal of
Memory and Language, 36, 422-444.
Vroomen, J., Tuomainen, J., & de Gelder, B. (1998). The roles of word stress and vowel harmony in speech
segmentation. Journal of Memory and Language, 38, 133-149.
Whalen, D. H., Best, C. T., & Irwin, J. (1997). Lexical effects in the perception and production of American English /p/
allophones. Journal of Phonetics, 25, 501-528.